
DORA metrics have evolved since they were first introduced in 2014, but they remain the gold standard for measuring software delivery performance. This guide, first published in 2022 and updated in 2026, covers both the current five-metric model and the context you need to use it effectively.
Ever since the book Accelerate was published in 2018, it’s been borderline impossible to have a conversation about measuring software development performance without any reference to the DORA metrics.
And while DORA metrics remain a powerful tool for understanding your software delivery performance, there are a number of things you should consider before you jump head-first into measuring them.
In this post, we’ll discuss how the DORA metrics came to be, what they are today, how to get started with measuring them, and how to avoid some of the typical mistakes software organizations make when they’re first starting out.
The DevOps Research and Assessment (DORA) team was founded in 2014 as an independent research group focused on investigating the practices and capabilities that drive high performance in software delivery and financial results.
In 2018, three members of the DORA team, Nicole Forsgren, Jez Humble, and Gene Kim, published a book called Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations, which goes into detail about the group’s research methodology and findings. The book uncovers a complex relationship between organizational culture, operational performance, and organizational performance. Eight years later, it remains the book that has had the most positive impact on our industry.
In 2019, DORA was acquired by Google and has continued to produce the most comprehensive research report in our industry each year. The biggest shift in focus came in 2025: rather than publishing its traditional State of DevOps report, DORA dedicated the entire 2025 report to AI-assisted software development.
The 2025 DORA Report provided the first in-depth look at how AI is changing core metrics, finding that AI adoption improves throughput but increases delivery instability. Among other findings, the report also identifies seven critical capabilities — from making internal data AI-accessible to working in small batches — that engineering leaders need to build before AI tools can deliver their full value.
Even though DORA research and Accelerate uncovered many complex relationships between culture, software delivery, and organizational performance, the most famous part of the group’s research is the set of software delivery performance metrics widely known as the DORA metrics or, less often now, the “four keys.”
And while many people still refer to the “four DORA metrics,” the framework has actually evolved to include five metrics as of 2024. This evolution reflects DORA’s ongoing research into what truly drives software delivery performance.
The five DORA metrics are now:

Software delivery throughput:
- Change lead time: how long it takes to get committed code running in production
- Deployment frequency: how often changes are deployed to production

Software delivery instability:
- Change failure rate: the percentage of deployments that require immediate intervention after release
- Failed deployment recovery time: how long it takes to recover from a deployment that caused a failure
- Deployment rework rate: the percentage of deployments that are unplanned fixes for production incidents
The unique aspect of the research is that these metrics were shown to predict an organization’s ability to deliver good business outcomes. This predictive capability makes DORA metrics not only essential for engineering teams but also valuable for investors evaluating a company’s operational efficiency.
Historically, measuring software development productivity was mostly a matter of opinion. But since your opinion is as good as mine, any discussion stalled easily and most organizations defaulted to doing nothing.
The team behind DORA applied scientific rigor to evaluating how some well-known DevOps best practices relate to business outcomes.
The metrics represent a simple and relatively harmless way to start your journey. The basic logic is: maximize your ability to iterate quickly while making sure that you’re not sacrificing quality.
In this space, being mostly harmless is already an achievement. The industry is full of attempts to stack rank developers based on the number of commits, or provide coaching based on the number of times they edited their own code.
In addition to gaining a fifth metric, the current framework groups the DORA metrics into two categories: throughput (how fast you can deliver) and instability (the quality and reliability of that delivery). Let’s go through them in a little more detail.
Change lead time (also known as lead time for change) captures the time it takes to get committed code to run in production.
The purpose of the metric is to highlight the waiting time in your development process. Your code needs to wait for someone to review it and it needs to get deployed. Sometimes it’s delayed further by a manual quality assurance process or an unreliable CI/CD pipeline.
These extra steps in your development process exist for a reason, but the ability to iterate quickly makes everything else run more smoothly. It might be worth taking some extra risk for the added agility, and in many cases, smaller batch size actually reduces risk.
DORA benchmarks suggest that elite teams get changes to production within a few hours on average, and anything in the ballpark of 24 hours is a great result.
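To make the definition concrete, here’s a minimal Python sketch, using made-up commit and deployment timestamps, of how you might compute change lead time from data pulled out of your version control system and CI/CD pipeline:

```python
from datetime import datetime
from statistics import median

# Hypothetical data: (commit_time, deployed_to_production_time) pairs
# collected from your version control system and CI/CD pipeline.
changes = [
    (datetime(2026, 1, 5, 9, 30), datetime(2026, 1, 5, 14, 10)),
    (datetime(2026, 1, 6, 11, 0), datetime(2026, 1, 7, 10, 45)),
    (datetime(2026, 1, 8, 16, 20), datetime(2026, 1, 9, 9, 5)),
]

# Change lead time for each change: how long committed code waited
# before it was running in production.
lead_times_hours = [
    (deployed - committed).total_seconds() / 3600
    for committed, deployed in changes
]

print(f"Median change lead time: {median(lead_times_hours):.1f} hours")
```

In practice you’d look at the distribution (or a high percentile) rather than a single average, since a few slow changes can hide behind a good-looking mean.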
For a team that’s interested in improving their change lead time, these are some common discussion topics:
- How long do changes wait for someone to review them?
- Could you work in smaller batches to make reviews and deployments faster?
- Are manual quality assurance steps adding unnecessary waiting time?
- Is your CI/CD pipeline fast and reliable enough to deploy every merged change?
Deployment frequency measures how often a team pushes changes to production. High-performing software teams ship often and in small increments.
Shipping often and in small batches is beneficial for two reasons. First, it helps software teams create customer value faster. Second, it reduces risk by making it easier to identify and fix any possible issues in production.
Deployment frequency is affected by a number of things:
- The size of each change: small batches are easier to review, test, and ship
- The level of automation in your build, test, and deployment pipeline
- Manual approval steps and fixed release schedules
- The team’s confidence that a bad deployment can be caught and fixed quickly
The best teams deploy to production after every change, multiple times a day. If deploying feels painful or stressful, you need to do it more frequently.
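As a rough illustration, here’s a small Python sketch (with made-up deployment dates) of how you could turn a list of production deployment timestamps from your CI/CD tool into a deployment frequency number:

```python
from collections import Counter
from datetime import date

# Hypothetical data: dates of production deployments from your CI/CD tool.
deployments = [
    date(2026, 1, 5), date(2026, 1, 5), date(2026, 1, 6),
    date(2026, 1, 7), date(2026, 1, 7), date(2026, 1, 7),
    date(2026, 1, 8),
]

# Average deployments per day over the observed period, plus the busiest day.
per_day = Counter(deployments)
days_observed = (max(deployments) - min(deployments)).days + 1
print(f"Deployments per day: {len(deployments) / days_observed:.1f}")
print(f"Busiest day: {per_day.most_common(1)[0]}")
```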
Previously known as mean time to recover (MTTR) or time to restore service, this metric was refined in 2023 to focus specifically on failures caused by software changes rather than external factors like infrastructure outages.
Failed deployment recovery time focuses on incidents caused by the changes that you’ve made yourself – as opposed to external factors, such as cloud provider downtime. This makes it a great control variable for the throughput metrics.
The definition of an incident or failure is up to you. Production downtime caused by a change is clearly a failure. Having to roll back a change is likely a good indication too. Still, bugs are a normal byproduct of newly built software and you don’t necessarily need to count every regression.
Good infrastructure will help you limit the blast radius of these issues. For example, a Kubernetes cluster that only sends traffic to instances if they respond to readiness and liveness checks can block deployments that would otherwise take the whole app down.
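Here’s a minimal sketch (in Python, with made-up incident data) of how recovery time could be computed once you know when a failing deployment went out and when service was restored, for example by a rollback or a hotfix deployment:

```python
from datetime import datetime
from statistics import median

# Hypothetical data: incidents caused by your own changes, with the time the
# failing deployment went out and the time service was restored.
incidents = [
    {"failed_at": datetime(2026, 1, 5, 14, 0), "restored_at": datetime(2026, 1, 5, 15, 30)},
    {"failed_at": datetime(2026, 1, 12, 9, 15), "restored_at": datetime(2026, 1, 12, 9, 55)},
]

recovery_hours = [
    (i["restored_at"] - i["failed_at"]).total_seconds() / 3600
    for i in incidents
]
print(f"Median failed deployment recovery time: {median(recovery_hours):.1f} hours")
```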
Change failure rate measures the percentage of deployments that require immediate intervention after going live, typically resulting in a rollback or a “hotfix” to quickly remediate the issue.
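The arithmetic itself is simple; the hypothetical numbers below stand in for counts you’d pull from your deployment and incident tracking tools:

```python
# Hypothetical monthly numbers from your deployment and incident tracking tools.
total_deployments = 40
deployments_needing_rollback_or_hotfix = 3

change_failure_rate = deployments_needing_rollback_or_hotfix / total_deployments * 100
print(f"Change failure rate: {change_failure_rate:.1f}%")  # 7.5% in this example
```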
Added to the DORA metrics in 2024, deployment rework rate captures the percentage of deployments that weren’t planned but happened in response to an incident in production.
This metric helps teams understand how much of their deployment activity is reactive rather than planned feature work. High rework rates indicate that teams are spending significant time fixing production issues instead of delivering new value.
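The difference to change failure rate is in what you count: here you count the deployments that were themselves unplanned fixes, not the deployments that caused the problems. A minimal sketch with made-up labels:

```python
# Hypothetical data: each production deployment labelled with whether it was
# planned feature work or an unplanned fix shipped in response to an incident.
deployments = [
    {"id": "d1", "unplanned_incident_fix": False},
    {"id": "d2", "unplanned_incident_fix": False},
    {"id": "d3", "unplanned_incident_fix": True},
    {"id": "d4", "unplanned_incident_fix": False},
    {"id": "d5", "unplanned_incident_fix": False},
]

rework = sum(d["unplanned_incident_fix"] for d in deployments)
deployment_rework_rate = rework / len(deployments) * 100
print(f"Deployment rework rate: {deployment_rework_rate:.0f}%")  # 20% in this example
```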
The beauty of the DORA metrics is that they offer a framework for measuring and benchmarking engineering performance across two dimensions: throughput (deployment frequency and change lead time) and instability (change failure rate, deployment rework rate, and failed deployment recovery time).
However, as anyone who’s ever worked in software engineering would attest: numbers — and especially aggregate ones — don’t always tell the whole truth. Here are some of the key issues we’ve seen with software organizations that are getting started with DORA metrics:
Cargo cults were common in early agile adoption. People would read a book about Scrum and argue about “the right way” to do things without understanding the underlying principles.
DevOps practices are not the only thing you need to care about. Great product management and product design practices still matter. Psychological safety still matters. Running a great product development organization takes more than just the metrics.
Aggregate values of DORA metrics are useful for two main reasons: following the long-term trends and getting the initial benchmark for your organization.
However, your team needs more than the aggregate number to start driving improvement. What are the individual data points? What are the contributing factors for these numbers? How should they be integrated into your existing daily and weekly workflows?
Measuring software development productivity is a delicate topic, and as such, top-down decisions can easily cause some controversy. On the other hand, without direction from the engineering leadership, it’s too easy to just give up.
The role of leadership is to build an environment where teams and individuals can be successful. Ensuring that feedback loops like these are in place is a perfect example of this. Thus, it makes sense to be proactive in this discussion.
Developers often have concerns about harmful metrics and individual performance tracking. We suggest proactively bringing it up and explaining how DORA metrics are philosophically aligned with how most developers think.
If you’re consistently getting code to production in 24 hours and you’re deploying every change without major issues, you don’t necessarily have to worry about DORA metrics too much. It’s still good to keep these numbers around to make sure that you’re not getting worse as complexity grows, but they don’t need to be top of mind all the time.
The good news is that your continuous improvement journey doesn’t need to stop there.
If you’re eager to start measuring the DORA metrics, you essentially have two paths: build your own solution, or use a purpose-built platform. We’ve written a comprehensive guide on the build vs. buy decision that can help you evaluate which approach makes sense for your organization, but we’ll touch on the basics below.
Building a DIY DORA metrics solution (even a vibe-coded one) requires connecting to your version control system, CI/CD pipeline, and various other sources to collect and aggregate the data. While this gives you complete control and customization, it also means taking on the ongoing maintenance burden of keeping integrations working as your toolchain evolves.
Most teams that go this route underestimate the effort required not just to build the initial dashboards, but to maintain data quality, handle edge cases, and evolve the metrics as their teams and processes change.
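For a sense of what the DIY path looks like, here’s a minimal Python sketch that pulls production deployments from the GitHub REST API and counts them per day as a starting point for deployment frequency. It assumes you record deployments through GitHub Deployments, and the repository name and GITHUB_TOKEN environment variable are placeholders; a real solution would also need pagination, multiple repositories, incident data, and ongoing maintenance as your toolchain changes:

```python
import os
from collections import Counter

import requests

# Hypothetical setup: a token with read access and one repository to analyze.
GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]
REPO = "your-org/your-repo"

# List recent deployments to the production environment via the GitHub REST API.
response = requests.get(
    f"https://api.github.com/repos/{REPO}/deployments",
    headers={"Authorization": f"Bearer {GITHUB_TOKEN}"},
    params={"environment": "production", "per_page": 100},
    timeout=30,
)
response.raise_for_status()

# Count deployments per calendar day from the ISO timestamps in the response.
per_day = Counter(d["created_at"][:10] for d in response.json())
for day, count in sorted(per_day.items()):
    print(day, count)
```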
For engineering leaders who are looking to not only measure the DORA metrics but also improve across all areas of engineering effectiveness (including business impact and developer experience), a tool like Swarmia might be a better fit.
Swarmia allows you to measure all of the DORA metrics (and more) with proper context, drill down into individual data points, and understand the contributing factors behind your numbers. The platform is designed to make metrics actionable, rather than just observable.


