
Right now, the gap between AI coding tool hype and reality is wide enough to drive a truck through. LinkedIn would have you believe these tools are already navigating complex legacy architecture and shipping production-ready code autonomously. They’re not.
Meanwhile, most engineering leaders are dealing with more practical questions: where do these things actually help, and where do they just create different problems?
I’ve been working closely with engineering organizations as they navigate this, and the ones seeing success don’t treat AI tools as a productivity panacea.
Instead, they tend to move through four distinct stages: experimenting, adopting intentionally, measuring where it makes sense, and finally optimizing for cost. Without that deliberate approach (and a solid engineering foundation), these tools can create as many problems as they solve.
Let me walk through what each of those four stages looks like in practice.
Experimentation here means exposure, not necessarily rigorous A/B testing with control groups. Get your hands on these tools, use them on actual work, and talk about what happens. Some people prefer editors with inline suggestions; others find success coding on the command line with tools like Claude Code, describing what they want in plain language.
What’s interesting is how these tools can change your workflow in ways you don’t expect. The traditional advice about limiting work in progress starts to look different when you’re working with AI assistants. When Claude is working on something in one branch, you can start something else — though whether this actually reduces cognitive load seems to vary. Some engineers find juggling multiple parallel streams exhausting, while others find it helps them stay productive.
This doesn’t mean work-in-progress limits stop mattering. It just means you need to be thoughtful about what you’re working on. One branch for documentation, another for model updates? That works. Three agents all editing similar code, and you’re asking for merge conflicts.
The point is: experiment widely and pay attention to what works for you and your team.
At Swarmia, when these tools became available, we started using them, paying for them, and talking about what it was like to use them. That last part, the conversation, has proved especially important: sharing what works, what doesn’t, and which tasks benefit most from AI assistance. Engineers have strong preferences about their tools and environments, but creating space for those conversations helps everyone learn faster.
What to do in this stage:
What not to do:
Once you understand where these tools help, make adoption easy and intentional across your organization. This means removing friction — make signup simple, clearly communicate what tools are available, and set expectations around usage.
Some organizations mandate AI tool use at this stage, with varying levels of success. At Swarmia, we don’t have a mandate, but we ask that everyone at least tries the tools. We’re not demanding teams use them for everything, just that they set them up and understand how they work so the barrier to incremental adoption stays low.
Adoption isn’t only about changing habits. You also need to invest in systems and developer experience.
DORA’s research on AI adoption shows where to focus that investment: it identifies seven capabilities, and organizations that excel in them see better outcomes from AI tools, while those that don’t struggle regardless of which tools they choose.
Most of these capabilities were already important before AI (version control, small batches, quality platforms), but they’re much more important now.
The same practices that help human developers also help AI tools: modular code with clear responsibilities, comprehensive documentation, robust testing. If your organization is in the adoption phase, now’s a good time to get these capabilities squared away.
Now for what can be an uncomfortable topic: junior developers and AI tools. Some leaders worry about letting juniors use these tools. “They need fundamentals first. What if they submit poor quality code for review?”
And my answer to that is: yes, junior engineers submitting poor quality code (whether AI-assisted or hand-written) has real costs. But the solution isn’t restricting tool access. If your system can’t prevent low-quality code from reaching production regardless of how it was written, that’s a system problem. Not a people problem.
Strengthen your review process, invest in automated testing, and improve your deployment safeguards. These investments help your team regardless of which tools people use to write code — and they’re probably overdue anyway.
Tracking AI adoption gives you visibility into where barriers exist. If one team hits 80% adoption and another sits at 20%, it’s a signal to investigate.
Talk to the low-adoption team first. Maybe they’re working in a legacy codebase where AI tools struggle. Maybe they missed your communications. Maybe they tried the tools and found they didn’t fit their workflow. Maybe they need some training. Each conversation will tell you something useful about how these tools work in your context.
Then look at the high-adoption teams. What repositories are they using these tools on? What task types see the most activity? Which models or modes do they prefer? This data helps you identify patterns worth sharing. If your frontend team is getting great results using AI but your infrastructure team barely touches it, that’s valuable information too.
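If you want a rough, do-it-yourself read on adoption before reaching for dedicated tooling, a small script over exported pull request data is enough. The sketch below assumes a CSV export with team, author, and ai_assisted columns; those column names (and the export itself) are assumptions for illustration, not a prescribed format.

```python
# Rough adoption check: what share of each team's engineers opened at least
# one AI-assisted pull request recently? Assumes a CSV export with
# "team", "author", and "ai_assisted" columns -- hypothetical names.
import csv
from collections import defaultdict

engineers = defaultdict(set)  # team -> every PR author seen
ai_users = defaultdict(set)   # team -> authors with at least one AI-assisted PR

with open("pull_requests.csv", newline="") as f:
    for row in csv.DictReader(f):
        team, author = row["team"], row["author"]
        engineers[team].add(author)
        if row["ai_assisted"].strip().lower() == "true":
            ai_users[team].add(author)

for team in sorted(engineers):
    share = 100 * len(ai_users[team]) / len(engineers[team])
    print(f"{team}: {share:.0f}% of engineers used AI assistance "
          f"({len(ai_users[team])}/{len(engineers[team])})")
```

Even a crude number like this is enough to start the conversations described above; the point is to find the outliers, not to produce a precise metric.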
In Swarmia, you can see AI adoption rates, usage patterns across repositories and languages, and which modes and models teams prefer. This visibility helps you spot opportunities to increase adoption and understand what’s blocking progress — all before you start thinking about more rigorous productivity measurement.
What to do in this stage:
What not to do:
Once you have meaningful adoption, you can start examining AI’s impact on your delivery metrics. This is where things get interesting — and where you need to be thoughtful about what you’re measuring and why.
Start by looking at your existing engineering metrics through an AI lens. Measure the same things you’ve always measured — DORA metrics like change lead time, deployment frequency, change fail percentage, and failed deployment recovery time — but now segment by whether AI tools were involved.
This can give you an idea about whether AI-assisted work is moving through your system faster, slower, or about the same.
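As a concrete illustration of what that segmentation can look like, here’s a minimal sketch that compares cycle times for AI-assisted and other pull requests. It assumes you can export PRs with cycle_time_hours and ai_assisted fields; both names are hypothetical, and the comparison is a starting point rather than a methodology.

```python
# Minimal segmentation sketch: compare cycle time distributions for
# AI-assisted vs. other pull requests. Assumes a CSV export with
# "cycle_time_hours" and "ai_assisted" columns -- hypothetical names.
import csv
from statistics import median

buckets = {"AI-assisted": [], "other": []}

with open("pull_requests.csv", newline="") as f:
    for row in csv.DictReader(f):
        key = "AI-assisted" if row["ai_assisted"].strip().lower() == "true" else "other"
        buckets[key].append(float(row["cycle_time_hours"]))

for name, values in buckets.items():
    if values:
        print(f"{name}: n={len(values)}, median cycle time = {median(values):.1f}h")
```

Medians keep a handful of long-running PRs from dominating the comparison, and reporting the sample sizes alongside them is a useful reminder of how thin this data can be early on.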
That’s different from trying to calculate an isolated “AI productivity gain” number, which is both methodologically impossible and potentially misleading.
When you see differences between AI-assisted and non-AI-assisted work, you’re looking at correlation, not causation. And correlation can point in multiple directions.
If AI-assisted pull requests have shorter cycle times, that could mean the tools are genuinely speeding up delivery, that your most experienced engineers also happen to be the heaviest AI users, or that the teams who were already performing well adopted the tools fastest.
All of these scenarios are valuable to understand, but they require different responses. If senior engineers are the main AI users, you might need better onboarding for less experienced team members. If high-performing teams adopt faster, maybe you need to address systemic barriers elsewhere.
The key is not trying to prove pure causation — that would require controlled experiments that aren’t practical in most organizations. Instead, you’re looking for patterns that help you make better decisions.
To interpret your data more effectively:
Also, we all know by now that it’s not helpful to compare individual engineers, and that shouldn’t change when you’re measuring the impact of AI. Team dynamics and dependencies are too complex for those comparisons to be useful, and after all, the unit of delivery is the team.
Instead, look for actionable patterns in your data:
Combine this quantitative data with qualitative feedback from your teams. When you spot interesting patterns — like one team’s AI-assisted PRs moving notably faster — talk to them, or run a developer experience survey focused on AI usage. What are they doing differently? Can other teams learn from their approach?
The goal in measurement isn’t to produce a single “AI ROI” number, but to understand whether AI tools are helping your teams deliver better quality software faster, where they’re most effective, and where you might need to adjust your approach.
What to do in this stage:
What not to do:
This is where too many people want to start: figuring out whether it makes sense to pay $200 or so per engineer per month for Claude Code, Copilot, or whatever the tool du jour happens to be.
If you go straight to cost optimization without the previous stages, you’ll never actually succeed with these tools. You need to know what works in your context before you can make smart decisions about consolidating tools or adjusting spend.
Once you’ve gone through the other stages, optimization becomes clearer. You consolidate tools based on data, reduce unnecessary usage by giving the AI better context, and discover that certain tools work better for certain use cases.
If these tools help your teams move faster, maintain quality, and reduce friction, spending the equivalent of one or two engineers’ salaries for 100 engineers’ worth of assistance is obviously worthwhile. At roughly $200 per engineer per month, 100 seats come to about $240,000 a year.
On the other hand, spending $250k on AI coding tool licenses won’t help much if your teams are waiting five days for code reviews or two weeks for deployments.
What to do in this stage:
What not to do:
The appetite for experimentation with these tools will diminish over time. We’re in a window right now where everyone is trying to figure this out, and while that window won’t last forever, there’s no need to panic or feel like you’re behind.
Just keep perspective: the measurement approach you need now isn’t fundamentally different from how you’ve always understood engineering effectiveness. You’re looking for the same signals — are we shipping quality software reliably? Can our teams do their best work without burning out? Is the work flowing smoothly through the system?
If you’re searching for real and sustainable results from AI, this four-stage approach works. Don’t skip stages because you’re impatient, and don’t let the hype convince you there’s a shortcut.
So just start. Experiment first, adopt intentionally, measure where it makes sense, then answer the cost question. These tools are promising, and with the right approach, they can genuinely improve how your teams work.