Why You Should Care
Ever built a feature "just in case" only to never use it? Or skipped implementing something flexible, only to refactor it weeks later?
We all face this dilemma: YAGNI (You Aren't Gonna Need It) vs. forward-looking design.
The problem? We make these decisions based on gut feeling, not data.
This post shows how to make forward-looking design measurable using ADRs (Architecture Decision Records).
The Core Problem
What we really want to know is:
How often do our predictions about future requirements actually come true?
Three dimensions to evaluate:
| Aspect | Detail |
|---|---|
| Prediction | "We'll need X feature in the future" |
| Implementation | Code/design we built in advance |
| Reality | Did that requirement actually come? |
We want to measure prediction accuracy.
YAGNI vs. Forward-Looking Design
YAGNI means "You Aren't Gonna Need It now", not "You'll Never Need It".
The problem is paying heavy upfront costs for low-accuracy predictions.
When forward-looking design makes sense
- Extension points (interfaces, hooks, plugin architecture; see the sketch after this list)
- Database schema separation
- Fields that are easy to add later
→ Low cost, low damage if wrong
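As a deliberately tiny illustration of a cheap extension point, here's a sketch in Python. The `ExportFormat` protocol and `CsvExport` class are made-up names for this post, not anything from a real codebase:

```python
from typing import Protocol

class ExportFormat(Protocol):
    """Extension point: new export formats plug in without touching callers."""
    def export(self, rows: list[dict]) -> bytes: ...

class CsvExport:
    """The only format we need today; adding JSON or XML later is one new class."""
    def export(self, rows: list[dict]) -> bytes:
        if not rows:
            return b""
        header = ",".join(rows[0].keys())
        body = "\n".join(",".join(str(v) for v in row.values()) for row in rows)
        return f"{header}\n{body}".encode()

def run_export(fmt: ExportFormat, rows: list[dict]) -> bytes:
    # Callers depend on the protocol, not on any concrete format.
    return fmt.export(rows)
```

The interface itself costs almost nothing today; the speculative formats are the part we skip.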
When YAGNI is the answer
- UI implementation
- Complex business logic
- External integrations
- Permission/billing logic
→ Will need complete rewrite if wrong
The Estimation Problem
You might think you could just calculate the value like this:
Value = (Probability × Cost_Saved_Later) − Cost_Now
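A quick worked example with made-up numbers (none of these come from a real project): a feature with a 20% chance of ever being needed, 15 person-days to retrofit later, 5 person-days to build now.

```python
# All inputs are hypothetical, purely to show how the formula behaves.
probability = 0.2          # chance the requirement ever materializes
cost_saved_later_pd = 15   # person-days saved later if we build it in advance
cost_now_pd = 5            # person-days to build it today

value_pd = probability * cost_saved_later_pd - cost_now_pd
print(f"Expected value: {value_pd:+.1f} person-days")  # -> -2.0, so skip it for now
```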
But here's the catch: we can't estimate these accurately.
- Nobody knows the real probability
- Future costs are unknown until we do the work
- Even upfront costs grow during implementation
If we could estimate accurately, we wouldn't need this system.
Learning from inaccuracy
We don't need perfect estimates.
What we need:
- Learn which areas have accurate predictions
- Learn which areas consistently miss
- Build organizational knowledge
Example:
| Area | Hit Rate | Tendency |
|---|---|---|
| Database design | 70% | Forward-looking OK |
| UI specs | 20% | Stick to YAGNI |
| External APIs | 10% | Definitely YAGNI |
Numbers are learning material, not absolute truth.
Recording Predictions in ADR
To make forward-looking design measurable, we need to record our decisions.
We use ADRs (Architecture Decision Records). I've written about ADRs before in this post:
"Why did we choose this again?" - How ADRs Solved Our Documentation Problem
Example ADR with forecast:
## Context
Customer-specific permission management might be needed in the future
## Decision
Keep simple role model for now
## Forecast
- Probability estimate: 30% (based on sales feedback)
- Cost if later: 20 person-days (rough estimate)
- Cost if now: 4 person-days (rough estimate)
- Decision: Don't build it now (the expected saving is marginal and the estimates are rough)
Estimates can be rough. What matters is recording the rationale.
Making It Measurable
Architecture
In our environment, this works:
Repositories (ADR + metadata.json)
↓
Jenkins (cross-repo scanning, diff collection)
↓
S3 (aggregated JSON)
↓
Microsoft Fabric (analysis & visualization)
↓
Dashboard
We already have Jenkins scanning repos for code complexity. We can extend this for ADR metadata.
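Here's a rough sketch of what that Jenkins step could run inside each repo checkout, assuming boto3 credentials are available on the agent. The bucket name, key layout, and repo name are placeholders:

```python
import json
from pathlib import Path

import boto3  # assumes AWS credentials are available on the Jenkins agent

BUCKET = "adr-metrics"   # placeholder bucket name
REPO_ROOT = Path(".")    # the Jenkins workspace for the repo being scanned

def collect_meta(repo_root: Path) -> list[dict]:
    """Read every ADR metadata file in the repo."""
    return [json.loads(p.read_text()) for p in repo_root.glob("docs/adr/*.meta.json")]

def upload(records: list[dict], repo_name: str) -> None:
    """Write one NDJSON line per ADR so downstream steps can append and merge."""
    body = "\n".join(json.dumps(r) for r in records)
    boto3.client("s3").put_object(
        Bucket=BUCKET,
        Key=f"adr-metadata/{repo_name}.ndjson",
        Body=body.encode(),
    )

if __name__ == "__main__":
    upload(collect_meta(REPO_ROOT), repo_name="example-repo")
```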
Metadata Design
Keep ADR content free-form. Standardize only the metadata for aggregation.
docs/adr/ADR-023.meta.json:
{
"adr_id": "ADR-023",
"type": "forecast",
"probability_estimate": 0.3,
"cost_now_estimate_pd": 4,
"cost_late_estimate_pd": 20,
"status": "pending",
"decision_date": "2025-11-01",
"outcome": {
"requirement_date": null,
"actual_cost_pd": null
}
}
Minimum fields needed:
- adr_id: unique identifier
- type: forecast
- probability_estimate: 0-1
- cost_now_estimate_pd: upfront cost (person-days)
- cost_late_estimate_pd: later cost (person-days)
- status: pending / hit / miss
- outcome: actual results
Treat estimates as "estimates", not gospel truth.
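A minimal validation sketch for that schema (field names follow the example above; anything stricter than these checks is up to you):

```python
REQUIRED_FIELDS = {
    "adr_id", "type", "probability_estimate",
    "cost_now_estimate_pd", "cost_late_estimate_pd", "status", "outcome",
}
VALID_STATUS = {"pending", "hit", "miss"}

def validate_meta(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the metadata is usable."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS - meta.keys()]
    if not 0 <= meta.get("probability_estimate", -1) <= 1:
        problems.append("probability_estimate must be between 0 and 1")
    if meta.get("status") not in VALID_STATUS:
        problems.append("status must be pending, hit, or miss")
    return problems
```

Running this in CI keeps the metadata aggregatable without constraining the free-form ADR text.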
Diff-Based Collection
Full scans get expensive. Collect only diffs:
# Record the last scanned commit SHA, then pick up only the metadata files that changed
git diff --name-only <prev>..<now> -- '*.meta.json'
Scales as repos grow.
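One possible shape for the diff-based collector, sketched in Python. Storing the last-seen SHA in a local file is an assumption here; Jenkins build metadata or S3 would work just as well:

```python
import subprocess
from pathlib import Path

STATE_FILE = Path(".last_scanned_sha")  # placeholder: could live in S3 or Jenkins instead

def changed_meta_files(repo_root: Path) -> list[str]:
    """Return the .meta.json paths touched since the last recorded scan."""
    head = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout.strip()
    if not STATE_FILE.exists():
        # First run: fall back to a full scan of the repo.
        changed = [str(p) for p in repo_root.glob("docs/adr/*.meta.json")]
    else:
        prev = STATE_FILE.read_text().strip()
        changed = subprocess.run(
            ["git", "diff", "--name-only", f"{prev}..{head}", "--", "*.meta.json"],
            cwd=repo_root, capture_output=True, text=True, check=True,
        ).stdout.splitlines()
    STATE_FILE.write_text(head)
    return changed
```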
Comparing Predictions to Reality
After 6-12 months, review:
| ID | Feature | Est. Prob. | Outcome | Est. Late Cost | Actual Cost | Result |
|---|---|---|---|---|---|---|
| F-001 | CSV bulk import | 30% | Came after 6mo | 15pd later | 1pd | Hit & overestimated |
| F-002 | i18n | 50% | Didn't come | - | - | Miss |
| F-003 | Advanced perms | 20% | Came after 3mo | 20pd later | 25pd | Hit & underestimated |
Focus on trends and deviation reasons, not absolute accuracy.
Aggregation & Visualization
Two Types of Output
Raw facts (NDJSON, append-only):
{"adr_id":"ADR-023","type":"forecast","status":"hit",...}
{"adr_id":"ADR-024","type":"forecast","status":"miss",...}
Snapshot (daily/weekly metrics):
{
"date": "2025-01-27",
"metrics": {
"success_rate": 0.30,
"total_forecasts": 20,
"hits": 6,
"misses": 14,
"avg_cost_deviation_pd": -3.5
}
}
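A sketch of how that snapshot could be folded out of the raw NDJSON. Field names follow the examples above; excluding pending forecasts from the success rate and computing cost deviation only for hits are assumptions on my part:

```python
import json
from datetime import date

def build_snapshot(ndjson_path: str) -> dict:
    """Fold the append-only forecast log into one day's metrics."""
    with open(ndjson_path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    forecasts = [r for r in records if r.get("type") == "forecast"]
    hits = [r for r in forecasts if r.get("status") == "hit"]
    misses = [r for r in forecasts if r.get("status") == "miss"]
    resolved = len(hits) + len(misses)  # pending forecasts don't count either way
    deviations = [
        r["outcome"]["actual_cost_pd"] - r["cost_late_estimate_pd"]
        for r in hits
        if r.get("outcome", {}).get("actual_cost_pd") is not None
    ]
    return {
        "date": date.today().isoformat(),
        "metrics": {
            "success_rate": len(hits) / resolved if resolved else None,
            "total_forecasts": len(forecasts),
            "hits": len(hits),
            "misses": len(misses),
            "avg_cost_deviation_pd": (
                sum(deviations) / len(deviations) if deviations else None
            ),
        },
    }
```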
What Leadership Wants to See
CTOs and executives probably care about:
- Forecast success rate (prediction accuracy)
- Cost savings trend (rough ROI)
- Learning curve (are we getting better?)
Treat ADRs as the transaction log. Handle visualization separately.
When Do We Know It Was Worth It?
Three evaluation moments:
① When requirement actually comes
"Did that requirement actually happen?"
- No → prediction missed
- Yes → move to next evaluation
Not "worth it" yet.
② When we measure implementation time (most important)
"How fast/cheap could we implement it?"
| Case | Additional work |
|---|---|
| With forward-looking design | 1 person-day |
| Without (estimated) | 10 person-days |
This is when we can say "forward-looking design paid off".
User adoption doesn't matter yet.
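In terms of the metadata file above, ① and ② boil down to two field updates. This helper is only a sketch; the function name and the review workflow around it are assumptions:

```python
import json
from datetime import date
from pathlib import Path

def record_outcome(meta_path: Path, actual_cost_pd: float | None, hit: bool) -> None:
    """Close out a forecast: ① mark hit/miss, ② record what the work actually cost."""
    meta = json.loads(meta_path.read_text())
    meta["status"] = "hit" if hit else "miss"
    meta["outcome"] = {
        "requirement_date": date.today().isoformat() if hit else None,
        "actual_cost_pd": actual_cost_pd if hit else None,
    }
    meta_path.write_text(json.dumps(meta, indent=2) + "\n")
```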
③ When users get value
This evaluates business value, but involves marketing, sales, timing, competition.
For technical decisions, focus on ② implementation cost difference.
Why Not Excel?
Excel management fails because:
- Updates scatter across time
- Unclear ownership
- Diverges from decision log
- Nobody looks at it
Excel becomes "create once, forget forever".
Treat ADRs as the input device, and visualization as a separate layer.
Summary
This system's goal isn't perfect estimates or perfect predictions.
Goals are:
- Record decisions
- Learn from results
- Improve prediction accuracy over time
Wrong estimates aren't failures. Making the same wrong decision repeatedly without learning is the failure.
Treat numbers as learning material, not absolute truth.
Next Steps
Planning to propose:
- Finalize metadata.json schema
- PoC with 2-3 repos
- Build Jenkins → S3 → Fabric pipeline
- Start with hit rate & cost deviation
- Run for 3 months, evaluate learning
Not sure if this will work, but worth trying to turn forward-looking design from "personal skill" into "organizational capability".
I write more about design decisions and engineering processes on my blog.
If you're interested, check it out: https://tielec.blog/