王凯
I Tracked Every Technical Decision for 6 Months. Here's What I Learned

Six months ago, I started a simple experiment: log every significant technical decision I make, along with my reasoning and expected outcome. Then review each one after enough time has passed to see the results.

The idea came from investment journals. Investors like George Soros and Ray Dalio have advocated for decades that the only way to improve decision quality is to create a written record and systematically review it. I figured if it works for people making billion-dollar bets, it might work for a developer making architecture and tooling choices.

183 decisions later, I have data. And the data tells a story I wasn't expecting.

The Setup

I kept it simple. A markdown file per month with a table:

| Date | Decision | Context | Alternatives | Expected Outcome | Review Date |
|------|----------|---------|--------------|------------------|-------------|

"Significant" meant any decision that would take more than a day to reverse or that affected people beyond myself. This excluded things like variable naming but included:

  • Technology choices (libraries, frameworks, services)
  • Architecture decisions (patterns, data flow, API design)
  • Process decisions (testing strategy, deployment approach, review process)
  • Priority decisions (what to build next, what to defer)

Every entry took about five minutes to write. At the end of each month, I spent an hour reviewing decisions from three months prior to check outcomes against expectations.

The Numbers

Over six months, I logged 183 decisions. Here's the breakdown:

  • Technology choices: 47 (26%)
  • Architecture decisions: 38 (21%)
  • Priority decisions: 52 (28%)
  • Process decisions: 29 (16%)
  • Other: 17 (9%)

Of the decisions I've had time to review (roughly the first three months, 94 decisions):

  • Clearly correct: 41 (44%)
  • Acceptable but not optimal: 33 (35%)
  • Wrong and costly: 12 (13%)
  • Too early to tell: 8 (8%)

A 13% failure rate surprised me. I would have guessed lower. More surprising was where the failures clustered.

Finding 1: My Worst Decisions Were Made Under Social Pressure

Of the 12 clearly wrong decisions, 9 were made in meetings with more than three people. Not because groups make bad decisions inherently, but because I was optimizing for group consensus rather than decision quality.

Here are the three patterns I identified:

The Loudest Voice Pattern: In four cases, I went with a technical direction because a respected senior engineer advocated for it strongly, even though my own analysis pointed elsewhere. In three of those four cases, my original instinct was correct.

One example: choosing a complex event-sourcing pattern for a service that turned out to need simple CRUD. A staff engineer made a compelling case for event sourcing in the architecture review. I had doubts but didn't push back. Six weeks later, we simplified to CRUD after wasting significant effort on event-sourcing infrastructure we didn't need.

The Compromise Pattern: In three cases, I chose a "middle ground" between two competing proposals to keep peace. The middle ground was worse than either original proposal because it inherited the constraints of both approaches without the benefits of either.

The Urgency Pattern: Two wrong decisions were made because someone framed the decision as time-sensitive when it wasn't. "We need to decide by EOD" is almost never true, but it short-circuits careful thinking.

The takeaway: I now have a personal rule. If I feel social pressure influencing a technical decision, I explicitly ask for 24 hours before committing. Decisions that genuinely can't wait a day are rare, and real emergencies are obvious. Manufactured urgency evaporates when you ask for a day.

Finding 2: Technology Choices Had the Worst Outcomes

My decision accuracy by category:

| Category | Clearly Correct | Acceptable | Wrong |
|----------|-----------------|------------|-------|
| Priority decisions | 52% | 35% | 13% |
| Process decisions | 56% | 33% | 11% |
| Architecture decisions | 42% | 40% | 18% |
| Technology choices | 29% | 37% | 34% |

A 34% wrong rate on technology choices is alarming. When I dug into why, the common thread was overweighting features and underweighting operational characteristics.

I'd choose a database because it had the right query capabilities, then discover its backup story was nightmarish. I'd pick a library because the API was elegant, then find it had undocumented memory leaks under concurrency. I'd select a service because the feature set was compelling, then realize the observability story was nonexistent.

The pattern: I was evaluating technologies like a user (features first) instead of like an operator (operational characteristics first).

After this discovery, I started using an evaluation template that puts operational concerns before features:

1. How does it fail? (failure modes, blast radius)
2. How do you fix it when it fails? (recovery process, MTTR)
3. How do you know it's about to fail? (monitoring, alerting)
4. How do you deploy and upgrade it? (zero-downtime, rollback)
5. Does it have the features we need? (finally, features)
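One way to make that ordering bite is to turn the template into a weighted scorecard, so operational criteria dominate the total. The weights and the two candidate ratings below are purely illustrative assumptions, not values from my journal:

```python
# Hypothetical scorecard: operational criteria carry more weight than features.
CRITERIA = [
    ("failure_modes", 3),   # how does it fail?
    ("recovery", 3),        # how do you fix it when it fails?
    ("monitoring", 2),      # how do you know it's about to fail?
    ("deploy_upgrade", 2),  # zero-downtime deploys, rollback
    ("features", 1),        # finally, features
]

def score(candidate):
    """Weighted total for a candidate rated 1-5 on each criterion."""
    return sum(weight * candidate[name] for name, weight in CRITERIA)

# Two made-up databases: one feature-rich, one operationally boring.
shiny = {"failure_modes": 2, "recovery": 2, "monitoring": 2,
         "deploy_upgrade": 3, "features": 5}
boring = {"failure_modes": 4, "recovery": 5, "monitoring": 4,
          "deploy_upgrade": 4, "features": 3}
```

With these weights the operationally boring option wins despite its weaker feature set, which is exactly the correction the template is meant to force.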

My technology choice accuracy improved noticeably in months four through six after adopting this template.

Finding 3: Speed Did Not Correlate With Quality

I tracked how long each decision took (from when I first engaged with the problem to when I committed to a direction). Then I correlated decision speed with outcomes.

There was no meaningful correlation. Fast decisions weren't worse than slow ones. Decisions that took weeks weren't better than decisions that took hours.

What did correlate with quality:

  1. Whether I wrote down alternatives. Decisions where I explicitly listed at least two alternatives before choosing had a 52% "clearly correct" rate vs. 31% for decisions where I didn't.

  2. Whether I did a premortem. Decisions where I wrote "This could go wrong if..." had a 58% "clearly correct" rate vs. 37% without.

  3. Whether I consulted someone with different expertise. Decisions where I talked to someone outside my immediate domain (e.g., asking an SRE about a database choice I was making as a backend dev) had a 61% "clearly correct" rate.

The lesson: the time you spend deciding matters less than how you spend it. Five minutes listing alternatives, five minutes on a premortem, and a 15-minute conversation with a domain expert beats three weeks of solo analysis every time.
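Each of those comparisons is just a conditional rate over the journal entries. A minimal sketch, with fabricated entries and field names of my own choosing:

```python
def correct_rate(decisions, condition=lambda d: True):
    """Fraction of decisions rated 'clearly correct' among those matching
    condition; None when nothing matches."""
    subset = [d for d in decisions if condition(d)]
    if not subset:
        return None
    return sum(d["outcome"] == "clearly correct" for d in subset) / len(subset)

# Tiny illustrative journal (fabricated, not the article's data)
journal = [
    {"outcome": "clearly correct", "listed_alternatives": True},
    {"outcome": "wrong", "listed_alternatives": False},
    {"outcome": "clearly correct", "listed_alternatives": True},
    {"outcome": "acceptable", "listed_alternatives": False},
]
```

Comparing `correct_rate(journal, lambda d: d["listed_alternatives"])` against its negation is the same split as the alternatives-vs-no-alternatives comparison above.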

Finding 4: I Was Dramatically Overconfident in Reversibility

In 23 decisions, I explicitly noted "this is easily reversible" as a reason to move fast. When I reviewed those decisions:

  • 9 were genuinely easy to reverse
  • 8 were reversible but much harder than I'd estimated
  • 6 were effectively irreversible due to data migrations, client integrations, or team investments

That's a miss rate of roughly 60% (14 of 23) on reversibility estimates. I was using "this is reversible" as a comfort blanket rather than an honest assessment.

The worst case: choosing an API response format because "we can always change it later." Six months on, three external partners had integrated with that format. Changing it would require coordinated migration across four teams and three companies. Technically reversible. Practically permanent.

Now I treat reversibility as something that degrades over time. A decision might be reversible on day one, partly reversible after a month, and effectively permanent after a quarter. If I'm using reversibility as justification for moving fast, I set a review date before the reversibility window closes.

Finding 5: My Predictions Were Poorly Calibrated

For each decision, I rated my confidence from 1 to 10. When I plotted confidence against actual outcomes:

  • When I said I was 90%+ confident, I was right about 70% of the time
  • When I said I was 50% confident, I was right about 50% of the time
  • When I said I was 30% confident, I was right about 45% of the time

Classic overconfidence bias. I was well-calibrated in the middle of the confidence range but significantly overconfident at the top. In practice, this meant that my "I'm sure about this" decisions were wrong nearly a third of the time.
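Checking calibration is mechanical once confidence and outcome are both logged: bucket decisions by stated confidence and compare against the observed hit rate. A sketch with a fabricated sample (not my actual journal data):

```python
from collections import defaultdict

def calibration(decisions):
    """Map each stated confidence level (1-10) to the observed fraction
    of those decisions that turned out correct."""
    buckets = defaultdict(list)
    for d in decisions:
        buckets[d["confidence"]].append(d["correct"])
    return {c: sum(v) / len(v) for c, v in sorted(buckets.items())}

# Fabricated sample showing the overconfidence shape: decisions logged at
# confidence 9 were right 7 times in 10; confidence 5, right 5 in 10.
sample = (
    [{"confidence": 9, "correct": True}] * 7
    + [{"confidence": 9, "correct": False}] * 3
    + [{"confidence": 5, "correct": True}] * 5
    + [{"confidence": 5, "correct": False}] * 5
)
```

Any bucket where the observed rate sits well below the stated confidence is an overconfidence zone.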

This finding alone justified the entire experiment. Knowing that your high-confidence predictions are less reliable than you think changes how you make decisions. I now treat my own "90% confidence" as roughly 70% and plan accordingly.

What Changed After Six Months

Based on these findings, I made five changes to how I work:

1. The 24-Hour Rule for Pressured Decisions. If social pressure is a factor, I sleep on it. No exceptions.

2. Operations-First Technology Evaluation. Failure modes and recovery processes before feature comparison. Always.

3. The Five-Minute Premortem. Before committing to any significant decision, I spend five minutes writing "This will go wrong if..." It takes almost no time and dramatically improves outcomes.

4. Reversibility Audits. I no longer trust my gut on reversibility. I explicitly list what would need to happen to reverse a decision, who would be affected, and when the reversal window closes.

5. Continued Journaling. This experiment is now a permanent practice. The feedback loop of logging decisions and reviewing them quarterly is the single most effective improvement I've made to my technical judgment.

The journaling practice has connected me to a broader community of people who take structured decision-making seriously. I've been particularly influenced by thinkers documented on resources like KeepRule's blog on decision thinking, where the intersection of investment wisdom and practical decision frameworks is explored in depth.

How to Start Your Own Decision Journal

You don't need a fancy tool. Here's the minimum viable decision journal:

Step 1: Create a file. Markdown, spreadsheet, whatever you'll actually use.

Step 2: For each significant decision, log:

  • What you decided
  • What alternatives you considered
  • Why you chose this option
  • Your confidence level (1-10)
  • What you expect to happen
  • What would tell you the decision was wrong
  • A review date (typically 3 months out)

Step 3: Set a monthly calendar reminder to review decisions from three months ago.
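If the log is machine-readable, that monthly review reduces to a filter: which entries have hit their review date but have no recorded outcome yet? A sketch, with field names assumed:

```python
from datetime import date

def due_for_review(decisions, today=None):
    """Entries whose review date has arrived and whose outcome is still
    unrecorded."""
    today = today or date.today()
    return [d for d in decisions
            if d["review_date"] <= today and d.get("outcome") is None]
```

Running this at the start of each review session gives you the exact list to work through.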

Step 4: After three months of data, look for patterns. Where are you consistently right? Where are you consistently wrong? What conditions lead to your worst decisions?

The first month of journaling feels tedious. The first review session is revelatory. By month three, you'll have insights about your own decision-making that no amount of reflection could provide.

The Meta-Lesson

The most important thing I learned isn't about any specific decision pattern. It's that technical judgment, the thing that separates senior engineers from everyone else, isn't a talent. It's a skill that can be systematically improved.

But only if you have data. Memory is unreliable, especially for decisions. We remember our successes vividly and quietly forget our failures. We reconstruct narratives that make our past decisions seem more rational than they were.

A decision journal prevents this. It gives you an honest mirror for your technical judgment. And once you see clearly, you can improve deliberately.

183 decisions in. I'm a measurably better engineer than I was six months ago. Not because I learned a new language or framework. Because I learned how I think, and where my thinking fails.


Have you ever formally tracked your technical decisions? If not, what's the biggest decision you've made in the last month, and what was your reasoning? I'm curious whether others see the same patterns I found.
