
Building Evidence-Based Delivery Practices

Moving beyond gut-feel delivery decisions. How to build measurement into your delivery process without drowning in metrics.
20 September 2022·7 min read
Dr Tania Wolfgramm
Chief Research Officer
Most delivery teams make decisions based on experience and intuition. That's not wrong (experience matters), but it's incomplete. When I ask teams how they know their process is working, the answers are usually some variation of "we deliver things" or "the client seems happy." That's not evidence. That's hope.

What You Need to Know

  • Evidence-based delivery means using data, not just intuition, to make decisions about how you work
  • Start with three to five metrics that matter, not twenty. Metric overload produces dashboard fatigue, not insight
  • Leading indicators (cycle time, work-in-progress limits, defect injection rate) are more useful than lagging indicators (project completion, budget variance)
  • The goal isn't measurement for its own sake. It's measurement that changes behaviour

The Gap Between Feeling and Knowing

Here's a pattern I've seen across dozens of delivery teams. The team feels productive. The standup meetings are energetic. Work is moving. Then the quarterly review reveals that two-thirds of the stories delivered were rework, scope was underestimated by 40%, and the features that shipped didn't move the metrics the client cares about.
Nobody did anything wrong. The team was working hard on the things in front of them. The problem was that nobody had visibility into whether those things were the right things, or whether the process was actually efficient.
45% of agile teams report they don't measure the effectiveness of their delivery process beyond velocity.
Source: State of Agile Report, Digital.ai, 2022
Velocity is the metric most teams track. It's also the least useful. Velocity tells you how much you delivered. It tells you nothing about whether what you delivered was valuable, whether the process was efficient, or whether the team is sustainable.

What to Measure Instead

The metrics that actually change delivery quality fall into three categories.

Flow metrics

Cycle time: How long does a piece of work take from start to finish? Not the estimate; the actual elapsed time. Cycle time reveals process bottlenecks. If stories take two days of effort but ten days of elapsed time, the problem isn't development speed; it's wait time.
Work in progress (WIP): How many things are simultaneously in progress? High WIP is the most reliable predictor of delivery problems. Teams with too many things in flight finish everything slowly instead of finishing some things quickly.
When I see a team with fifteen items in progress, I don't need to look at anything else. High WIP is the universal symptom of a delivery process that's absorbing demand without managing it.
Dr Tania Wolfgramm
Chief Research Officer
Throughput: How many items does the team complete per week? Not story points, actual items. Throughput is more stable than velocity and more meaningful than effort estimates.
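All three flow metrics can be computed from nothing more than a list of work-item start and finish dates. A minimal sketch using hypothetical records (the item dates and function names here are illustrative, not a prescribed tool):

```python
from datetime import date

# Hypothetical work-item records: (started, finished) dates.
# finished=None means the item is still in progress.
items = [
    (date(2022, 9, 1), date(2022, 9, 6)),
    (date(2022, 9, 2), date(2022, 9, 12)),
    (date(2022, 9, 5), None),
    (date(2022, 9, 7), date(2022, 9, 9)),
]

def cycle_times(items):
    """Elapsed days from start to finish for each completed item."""
    return [(done - start).days for start, done in items if done]

def wip_on(items, day):
    """Count of items started but not yet finished as of `day`."""
    return sum(1 for start, done in items
               if start <= day and (done is None or done > day))

def throughput(items, week_start, week_end):
    """Count of items finished within the given week."""
    return sum(1 for _, done in items
               if done and week_start <= done <= week_end)

print(cycle_times(items))                                      # [5, 10, 2]
print(wip_on(items, date(2022, 9, 8)))                         # 3 items in flight
print(throughput(items, date(2022, 9, 5), date(2022, 9, 11)))  # 2 items that week
```

Note that cycle time here is elapsed calendar time, not effort, which is exactly what exposes wait time as described above.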

Quality metrics

Defect injection rate: How many defects are created per unit of work delivered? This is more useful than total defect count because it normalises for delivery volume. A rising injection rate means quality practices are degrading, even if the total defect count looks stable.
Rework percentage: What proportion of the team's capacity goes to fixing things that were already delivered? Some rework is inevitable. Above 20%, it's a process signal that something upstream needs attention, usually requirements quality or test coverage.
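Both quality metrics are simple ratios over a reporting period. A minimal sketch with hypothetical monthly numbers:

```python
def defect_injection_rate(defects_found, items_delivered):
    """Defects traced to a period's deliveries, per item delivered."""
    return defects_found / items_delivered

def rework_percentage(rework_items, total_items):
    """Share (as a percentage) of completed work that was fixing
    something already delivered."""
    return 100 * rework_items / total_items

# Hypothetical month: 40 items shipped, 12 new defects traced to them,
# and 9 of the 40 completed items were rework.
print(defect_injection_rate(12, 40))  # 0.3 defects per item
print(rework_percentage(9, 40))       # 22.5 -- above the 20% signal
```

Because the injection rate is normalised per item, it stays comparable month to month even when delivery volume swings.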

Value metrics

Feature adoption: Of the things you shipped, how many are actually being used? This is the metric that connects delivery to outcomes. Shipping features that nobody uses is waste, regardless of how efficiently you shipped them.
Time to value: How long after a feature is requested does it start producing value for users? This combines cycle time with deployment frequency and measures the thing that actually matters to the business.
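Both value metrics fall out of a per-feature record of when it was requested, when it was first used, and whether anyone uses it now. The feature records and field names below are hypothetical:

```python
from datetime import date

# Hypothetical shipped features with current usage and key dates.
features = [
    {"name": "export", "weekly_users": 340,
     "requested": date(2022, 6, 1), "first_used": date(2022, 7, 15)},
    {"name": "audit-log", "weekly_users": 0,
     "requested": date(2022, 5, 10), "first_used": None},
    {"name": "sso", "weekly_users": 120,
     "requested": date(2022, 4, 1), "first_used": date(2022, 8, 1)},
]

# Feature adoption: share of shipped features with any active users.
adopted = [f for f in features if f["weekly_users"] > 0]
adoption_rate = 100 * len(adopted) / len(features)

# Time to value: days from request to first real use, per adopted feature.
time_to_value = [(f["first_used"] - f["requested"]).days
                 for f in features if f["first_used"]]

print(f"{adoption_rate:.0f}% of shipped features are in use")
print(f"days from request to first use: {sorted(time_to_value)}")
```

The unused audit-log feature is the waste the adoption metric is designed to surface: it was shipped, but it never started producing value.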

How to Start

The instinct is to instrument everything and build an exhaustive dashboard. Don't. That takes months and produces something nobody looks at.

Week 1: Pick three metrics

Choose one flow metric (cycle time is the most useful starting point), one quality metric (defect injection rate), and one value metric (feature adoption if you can measure it, rework percentage if you can't). Three numbers. That's it.

Weeks 2-4: Establish baselines

Measure your current state. Don't try to improve anything yet. Just understand where you are. Most teams are surprised by what the data shows, particularly cycle time. The gap between perceived and actual cycle time is usually significant.

Months 2-3: Identify one improvement

Pick the metric with the most obvious improvement opportunity. Usually it's WIP, because limiting work in progress has an immediate effect on cycle time and throughput. Make one process change and observe the effect.
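Why limiting WIP moves cycle time so directly is captured by Little's Law: on average, cycle time equals WIP divided by throughput. A quick sketch with hypothetical numbers:

```python
def expected_cycle_time(avg_wip, weekly_throughput):
    """Little's Law: average cycle time = average WIP / average throughput.
    With throughput in items per week, the result is in weeks."""
    return avg_wip / weekly_throughput

# Fifteen items in flight, finishing five per week:
print(expected_cycle_time(15, 5))   # 3.0 weeks per item on average

# Halve the WIP and, all else being equal, cycle time halves too:
print(expected_cycle_time(7.5, 5))  # 1.5 weeks
```

This is why a WIP limit shows up in cycle time almost immediately: the queue gets shorter before anyone works any faster.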

Ongoing: Review monthly

Once a month, look at your three numbers. Are they trending in the right direction? If yes, maintain. If no, investigate. Keep the review short, thirty minutes maximum, focused on patterns rather than individual data points.

The Cultural Shift

The hardest part isn't the measurement. It's changing the culture from "we delivered a lot" to "we delivered the right things efficiently." Some teams resist measurement because it feels like surveillance. That's a reasonable concern and it needs to be addressed directly.
Evidence-based delivery isn't about monitoring individual performance. It's about understanding system behaviour. The metrics belong to the team, not to management. They're tools for the team to improve their own process, not tools for leadership to evaluate individuals.
When teams own their metrics and use them to improve their own work, the resistance disappears. When metrics are imposed from above and used for performance evaluation, they get gamed. The implementation approach determines the outcome.

Start Small, Stay Honest

The organisations that get the most value from evidence-based delivery are the ones that start with a few honest metrics and use them to make small, consistent improvements. Not the ones that build elaborate measurement frameworks. Small and honest beats elaborate and ignored, every time.