Who is this article for?
This article is for:
- Anyone who is about to introduce software development productivity metrics to their team.
- Those who already use these metrics but want to revisit "Why these metrics?" and add their own insights.
The Toyota Production System (TPS)
Why Toyota? It’s a fair question. (If you’re already using the DORA 4 Keys and wondering “Why bring up Toyota?”, please read to the end!)
You won’t find many articles on the web that directly link the DORA 4 Keys and the Toyota Production System, but understanding TPS can give you a deeper grasp of the 4 Keys. That’s because the origins of “Lean” (and therefore DevOps) are deeply rooted in TPS. If you look into academic DevOps literature, TPS pops up everywhere as a keyword.
Personally, I discovered the DORA 4 Keys before learning about DevOps, and in the beginning, I blindly chased “Elite” or “High” scores. But studying the history and underlying philosophies of DevOps made me realize that these numbers are just the start—and helped me think more critically about how to actually use the 4 Keys.
Here are some TPS concepts that might be useful:
- Value-Added Work
  - Work that directly creates value for customers; the work you can actually charge for.
  - Example: painting the car body.
- Incidental Work
  - Necessary to create value, but not directly chargeable; something to be minimized.
  - Example: preparing tools and jigs.
- Waste
  - Overprocessing, excess inventory, unnecessary movement, etc.; should be eliminated.
  - In Lean terms, eliminating waste means removing pain and heavy labor from everyday work so the organization can achieve its goals.
Software Development Productivity Metrics: DORA 4 Keys
For agile-minded development teams, the “4 Keys” are the most well-known and widely used set of metrics for measuring team performance quantitatively.
- Lead Time for Changes
- Deployment Frequency
- Change Failure Rate
- Mean Time to Restore (MTTR)
Why did DORA pick these four? I’d like to share my own thoughts, digging into each.
Lead Time for Changes
The DORA 4 Keys define Lead Time for Changes as
“the time from when a customer request is made until it’s delivered.”
Naturally, you might wonder: where does measurement start?
DORA further splits Lead Time into two parts:
- Time for product/feature design and validation
- Time to deliver the product/feature to the customer
In practical software terms:
- Product design and development
- Delivery from code commit to production

Estimating product design and development is often unpredictable and highly variable. In contrast, delivery from commit to production is more measurable and stable. The 4 Keys focus on the latter.
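As a minimal sketch of what "commit to production" measurement can look like, the snippet below computes Lead Time for Changes from hypothetical (commit time, deploy time) pairs. DORA reports the median rather than the mean, since a few outlier changes can skew the average. The records here are made up for illustration:

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical records: (commit_time, deployed_time) for each change.
changes = [
    (datetime(2024, 5, 1, 9, 0), datetime(2024, 5, 1, 15, 30)),
    (datetime(2024, 5, 2, 10, 0), datetime(2024, 5, 3, 11, 0)),
    (datetime(2024, 5, 4, 8, 0), datetime(2024, 5, 4, 9, 45)),
]

# Lead time per change = time from commit to running in production.
lead_times = [deployed - committed for committed, deployed in changes]

# DORA uses the median, which is robust against a few slow outliers.
median_lead = median(lead_times)
print(median_lead)  # → 6:30:00
```

In practice these timestamps would come from your Git history and deploy logs; the point is that both endpoints are objectively recorded, which is exactly why this slice of the process is so measurable.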
“Incidental Work” and “Waste” in TPS Terms
The process from code commit to production is mostly “incidental work” in TPS terms and can be full of “waste.”
To make it more concrete, here are some examples:
Incidental Work:
- CI (builds, automated tests)
- Code reviews
- CD (deployment, release)
Waste:
- Waiting for CI/CD pipelines
- Deployment failures and rollbacks
- Manual environment setup or release steps
- Overly complex approval processes

Suppose your ratios are:
Value-Added : Incidental : Waste = 1 : 2 : 3
If you eliminate the waste:
Value-Added : Incidental : Waste = 1 : 2 : 0
The proportion of value-added work jumps from about 17% to 33%.
Why Lead Time is a Great Metric
In short, Lead Time for Changes has these powerful qualities:
Measurable
It’s objectively quantifiable.
Improvable
It makes waste and pain in daily work visible and actionable.
Comparable
Since it’s not a team-specific velocity metric, it’s easy to benchmark and compare across teams.
This makes Lead Time a solid starting point for ongoing improvement in any development environment.
Advanced Note
A quote from “Accelerate: The Science of Lean Software and DevOps” by Nicole Forsgren, Jez Humble, and Gene Kim:
Note that the focus here is IT delivery—specifically, from commit to production—rather than the entire software development process.
—from the foreword by Martin Fowler
As Martin Fowler points out, these metrics cover only a slice of the process: from commit to production. Because DORA emphasizes comparability, it limits itself to metrics that are universal and objectively measurable.
But if you analyze your own value stream in TPS terms—identifying what is value-added, incidental, or waste—even upstream processes can become targets for measurement and improvement.
Deployment Frequency
Another quote from “Accelerate”:
The second measure is batch size: the size of work moved through the process at once... Reducing batch size leads to shorter cycle times, smoother flow, faster feedback, reduced risk and overhead, improved motivation and urgency, and helps prevent cost and schedule overruns. But in software, where work is often invisible, measuring batch size is hard. That’s why we decided to use deployment frequency as a proxy.
This doesn’t mean “deploy more = good” as a blanket statement.
Batch size is a classic TPS/Lean concept—flowing work one piece at a time (instead of in large batches) leads to smoother, more efficient production.
Flow in TPS
Flow in TPS is about connecting processes and letting work move smoothly with minimal inventory and waiting. The ideal is continuous “one-piece” flow, supported by heijunka (production leveling), which smooths the workload so that flow is possible.
Example: Flow in Real Life
As a child, I helped my mother with home assembly work: making tie gift sets.
The process:
1. Fold a box
2. Neatly fold the tie
3. Bag the tie
4. Place in box
I used to do all the boxes first (batching), then all the ties, and so on. For small volumes, it’s fine, but for mass production, this leads to excess inventory, waiting, and extra costs.
Reducing Batch Size (Improving Flow)
If you want maximum efficiency:
- Fold a box
- Immediately fold and bag one tie
- Put it in the box
- Pack for shipping
- Repeat
This way:
- Less work-in-process (WIP) inventory
- Faster feedback if something’s wrong
- Shorter total lead time per item
- Faster cash flow
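The tie-packing example can be turned into a toy model. Assume four steps of one minute each and ten gift sets (all numbers are illustrative): with batching, the first finished set only appears near the very end, while one-piece flow produces it almost immediately, even though the total completion time is the same.

```python
# Toy model of batching vs. one-piece flow.
# 4 steps (fold box, fold tie, bag tie, place in box), 1 minute each, 10 items.
steps, items, minutes_per_step = 4, 10, 1

# Batching: do step 1 for ALL items, then step 2 for all items, and so on.
# The first item only finishes after the first 3 full passes plus one step.
first_done_batch = (steps - 1) * items * minutes_per_step + minutes_per_step

# One-piece flow: carry one item through all 4 steps before starting the next.
first_done_flow = steps * minutes_per_step

# Total time to finish everything is identical in this idealized model.
total_time = steps * items * minutes_per_step

print(first_done_batch, first_done_flow, total_time)  # → 31 4 40
```

The first finished item is also your first chance to notice a defect, so smaller batches mean feedback after 4 minutes instead of 31, which is exactly the "faster feedback" benefit listed above.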
Batch Size in Software Development
With software, “batch size” is invisible. That’s why DORA uses “deployment frequency” as a proxy—small, frequent deploys mean smaller batch sizes and better flow.
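As a rough sketch of measuring the proxy itself, deployment frequency can be computed by bucketing deploy dates per ISO week. The deploy log below is made up for illustration:

```python
from collections import Counter
from datetime import date

# Hypothetical deploy log (in practice, pulled from your CD pipeline).
deploys = [
    date(2024, 5, 1), date(2024, 5, 2), date(2024, 5, 2),
    date(2024, 5, 9),
    date(2024, 5, 15), date(2024, 5, 16),
]

# Count deploys per ISO calendar week.
per_week = Counter(d.isocalendar().week for d in deploys)
print(dict(per_week))  # → {18: 3, 19: 1, 20: 2}
```

A falling per-week count over time can be an early signal that batch sizes are creeping up again, even before lead times visibly degrade.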
Applying the Concept
Approaches like GitHub Flow fit well with this idea of “single-piece flow.”
But not every team can or should aim for super-frequent deploys—some teams (using Git Flow, for example) work in release branches, and deployment frequency alone may not tell the whole story.
In those cases, you can measure and improve the “flow” of release preparation—how smoothly things move to release—even if the deploy frequency itself is not high.
Change Failure Rate
Change Failure Rate is the percentage of changes that cause incidents or defects—a quality metric in DORA. In TPS terms, this is like letting defects slip to the next stage of production.
A high failure rate means you’re not building quality into the process (design, implementation, or testing).
In TPS, letting defects flow downstream is a major problem. Tools like jidoka (automation that stops on errors), standardized work, and mistake-proofing (poka-yoke) are used to prevent this.
For software teams, reducing change failure rate means:
- Robust automated testing (early detection)
- Feature flags for staged rollout (limiting blast radius)
- Blameless retrospectives (continuous improvement)
- Standardized release flow (reducing human error)
In short, lowering the change failure rate = building quality in, which is at the heart of TPS.
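The metric itself is just a ratio of failed changes to total changes. A minimal sketch, assuming a hypothetical deploy log where each entry records whether the change caused an incident or defect:

```python
# Hypothetical deploy log: True means the change caused an incident or defect.
deploy_outcomes = [
    False, False, True, False, False,
    False, True, False, False, False,
]

# Change Failure Rate = failed changes / total changes.
change_failure_rate = sum(deploy_outcomes) / len(deploy_outcomes)
print(f"{change_failure_rate:.0%}")  # → 20%
```

The hard part is not the arithmetic but agreeing, as a team, on what counts as a "failure" (incident, rollback, hotfix) so the numerator is consistent over time.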
Mean Time to Restore (MTTR)
MTTR measures the average time to restore service after a production incident.
This closely aligns with the TPS concept of stopping the line and fixing problems immediately. On the Toyota shop floor, anyone can pull the andon cord (signal light) to stop the line so the team can fix the issue fast.
For software teams, keeping MTTR low requires:
- Fast incident detection (monitoring and alerting)
- Visible logs and metrics (troubleshooting)
- Immediate rollback/hotfix procedures (operational readiness)
- A “one team” response (collaborative incident response)
Short MTTR is a sign of organizational agility and strength—a direct reflection of TPS’s “empowered teams and autonomous improvement” culture.
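Like the other metrics, MTTR reduces to simple arithmetic over incident records. A minimal sketch with made-up (detected, restored) timestamps:

```python
from datetime import datetime, timedelta

# Hypothetical incident log: (time_detected, time_restored) per incident.
incidents = [
    (datetime(2024, 5, 1, 14, 0), datetime(2024, 5, 1, 14, 30)),   # 30 min
    (datetime(2024, 5, 8, 9, 15), datetime(2024, 5, 8, 10, 45)),   # 90 min
]

# Downtime per incident, then the mean across incidents.
downtimes = [restored - detected for detected, restored in incidents]
mttr = sum(downtimes, timedelta()) / len(downtimes)
print(mttr)  # → 1:00:00
```

One caveat: the clock here starts at detection, so investing in monitoring and alerting (the first bullet above) shrinks the part of the outage this metric cannot even see.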
Summary
The essential capabilities and practices for applying the DORA 4 Keys are detailed in “Accelerate,” and since then, over 30 related capabilities have been identified.
Start by gradually introducing these capabilities in your team, running cycles of improvement.
And whenever you wonder, “Why is this practice important?”—look back to the Toyota Production System. It often holds the deeper meaning and fundamental principle behind the metrics.