DEV Community

Saber Ouni
How We Linked Technical Decisions to Pull Requests on GitHub

I want to tell you about a mess that happened on a project I worked on last year.
We had this payment service. Handled subscriptions, invoicing, webhooks from Stripe — the usual stuff. The code had a weird pattern: every webhook was processed sequentially, one at a time, even though we were getting hundreds per minute during peak hours.
A new dev joined the team. Sharp guy. He looked at the code, immediately said "this is a bottleneck, I'll parallelize the webhook processing this sprint." Nobody pushed back because honestly, it looked like an obvious win.
Two weeks later, we're in production with the parallel version. And invoices start duplicating. Customers getting charged twice. Total chaos.
Turns out, the sequential processing was deliberate. Eight months earlier, the team had tried parallel processing and hit a nasty race condition with Stripe's idempotency keys. The fix was to process sequentially and accept the performance trade-off. It was the right call at the time.
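For flavor, the sequential setup was roughly this shape — an illustrative sketch, not the actual service code. The point is simply that a promise chain forces webhooks to be handled strictly one at a time, so two events touching the same invoice can never interleave:

```javascript
// Illustrative sketch: a tiny promise-chain queue that serializes
// webhook handling. Each event only starts once the previous one
// has finished, success or failure.
function createSequentialProcessor(handle) {
  let tail = Promise.resolve();

  return function enqueue(event) {
    // Chain this event onto the previous one; a failed event
    // must not block the rest of the queue.
    tail = tail.catch(() => {}).then(() => handle(event));
    return tail;
  };
}
```

Parallelizing means throwing away exactly this ordering guarantee — which is fine, unless your downstream (here, Stripe idempotency handling) secretly depends on it.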
But nobody wrote it down. The dev who made that decision had left the company. And we just burned two sprints rediscovering the exact same problem.
That's when I started thinking seriously about this.

The real problem isn't documentation
I know what you're thinking. "Just write better docs." Yeah, we tried that.
We had a Confluence space called "Technical Decisions." I think 4 people knew it existed. It had 12 pages, 8 of which were outdated, and the other 4 contradicted each other. Confluence is where decisions go to die. I'll fight anyone on this.
We also tried ADRs as markdown files in the repo — the classic /docs/adr/ folder approach. Honestly? It was better. At least the decisions lived with the code. But here's what happened in practice: nobody looked at them. When you're reviewing a PR or debugging at 11pm, you're not going to go browse a folder of markdown files hoping one of them is relevant to what you're looking at.
The decisions existed. They just weren't findable at the moment you needed them.

The idea that actually worked
It clicked for me during a PR review. I was reading a diff and thought "why the hell did they do it this way?" — and I realized that's the exact moment where I need the decision context. Not in a wiki. Not in a Slack search. Right here, in the PR.
The pull request is where a decision stops being abstract and becomes code. That's the linking point.
So we started doing something simple: every significant technical decision gets documented, and it gets linked to the PR that implements it. When you open that PR — today or in 18 months — the reasoning is right there.

What this looks like day to day
It's not complicated, which is probably why it actually stuck.
When the team makes a non-trivial technical choice (I'll define what "non-trivial" means in a sec), someone takes 5-10 minutes to write it down. We keep it short on purpose:
Context — 2-3 sentences max. What situation we're in, what constraint we're dealing with.
Options we looked at — What else we considered and why we didn't go with it. This is the part people skip, and it's honestly the most valuable part.
The decision — What we picked and the main reasons.
What we're accepting — Every decision has trade-offs. Write them down. Future you will thank present you.
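If it helps, the skeleton we keep in the repo looks something like this (the field names are ours — adapt freely):

```markdown
# Decision: <short title>

## Context
2-3 sentences. What situation we're in, what constraint we're dealing with.

## Options we looked at
- Option A → why not
- Option B → why not

## The decision
What we picked and the main reasons.

## What we're accepting
The trade-offs, and the conditions under which we should revisit this.
```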
Then when the dev opens a PR to implement it, the decision gets linked. We set up a GitHub bot that drops a comment on the PR with the full decision context. You can also just reference it in the PR description — that works fine too; less fancy, but it does the job.
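The bot is nothing exotic. A minimal version can be built with GitHub Actions and `actions/github-script` — this is a sketch, not our exact workflow, and both the `docs/decisions/` path and the convention of naming the decision file in the PR body are assumptions you'd adapt:

```yaml
# .github/workflows/decision-comment.yml — illustrative sketch.
# If the PR body contains a line like
#   "Decision: docs/decisions/0007-rate-limiting.md"
# post that file's content as a comment on the PR.
name: Link decision to PR
on:
  pull_request:
    types: [opened]

permissions:
  pull-requests: write
  contents: read

jobs:
  comment:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/github-script@v7
        with:
          script: |
            const fs = require("fs");
            const body = context.payload.pull_request.body || "";
            const match = body.match(/Decision:\s*(docs\/decisions\/\S+\.md)/);
            if (!match) return; // no decision referenced, nothing to do
            const decision = fs.readFileSync(match[1], "utf8");
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.payload.pull_request.number,
              body: `**Linked decision**\n\n${decision}`,
            });
```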

The magic is what happens 6 months later. Someone does a git blame, finds the PR, and boom — the decision is right there. No Slack archaeology. No "hey does anyone remember why we..."

A concrete example because I hate vague articles
We needed rate limiting on our API. The team had a 30-minute discussion and three options came up:
express-rate-limit — dead simple, but it only works for a single instance. We were running 4 instances behind a load balancer, so this was out immediately.
Custom solution with Redis — more work upfront but we already had Redis running for caching. Full control over the rules.
API Gateway (we looked at Kong) — would handle it cleanly but honestly felt like overkill for our current scale, and nobody on the team had ops experience with Kong.
We went with the Redis approach. Here's what the decision record looked like (I'm paraphrasing but it was roughly this):
Decision: API Rate Limiting

Context: Need rate limiting on public API. Running 4 instances
behind ALB. Already have Redis for session cache.

Considered:

  • express-rate-limit → single instance only, doesn't work for us
  • Custom Redis sliding window → uses existing infra, we control the rules
  • Kong API Gateway → good but overkill, nobody knows Kong on the team

Going with: Custom Redis sliding window.

Accepting: We own the maintenance. If we go past 10k RPM,
should probably revisit the gateway option. Also if the team
grows and we get someone with Kong experience, worth reconsidering.
Took maybe 7 minutes to write.
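For the curious, here's roughly what a sliding-window limiter looks like. This is an in-memory sketch for illustration, not our production code — the real thing kept the timestamps in Redis (a sorted set with ZADD plus ZREMRANGEBYSCORE-style pruning) so all 4 instances shared state, but the algorithm is the same. The clock is injectable to make the window easy to test:

```javascript
// Illustrative sliding-window rate limiter, in-memory for simplicity.
// In production you'd back `hits` with a Redis sorted set per key so
// the window is shared across instances.
function createSlidingWindowLimiter({ limit, windowMs, now = Date.now }) {
  const hits = new Map(); // key -> array of request timestamps

  return function allow(key) {
    const t = now();
    const windowStart = t - windowMs;
    // Drop timestamps that fell out of the window
    // (the ZREMRANGEBYSCORE step in the Redis version).
    const recent = (hits.get(key) || []).filter((ts) => ts > windowStart);
    if (recent.length >= limit) {
      hits.set(key, recent);
      return false; // over the limit: reject
    }
    recent.push(t); // record this request (the ZADD step)
    hits.set(key, recent);
    return true;
  };
}
```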
Fast forward 4 months. New dev sees the custom rate limiting code. His first instinct (understandably) is "why didn't they just use a library or a gateway?" He checks the PR. Reads the decision. Gets it. Moves on with his day.
Without the linked decision, that's a 30-minute Slack thread minimum. Probably a meeting. Maybe a "let's spike on migrating to Kong" that wastes a sprint.

What actually changed for us
I don't want to oversell this. It's not like everything was magically fixed overnight. But after a few months, some things were noticeably different.
New devs stopped asking "why" questions every other hour. One junior dev told me she'd spend 20 minutes reading linked decisions before her first standup and already understood more context than she usually gets in a first week. That one felt good.
PR reviews got sharper. Instead of reviewers guessing the intent, they could check the code against the decision. "The decision says we want fine-grained per-endpoint rules — but this implementation applies the same limit globally. Is that intentional or did we change the plan?"
We mostly stopped re-debating things. I say "mostly" because engineers love debating (myself included), but at least now the starting point was "here's what we decided and why" instead of "I think we should use Kafka" for the fifth time.
And — this one surprised me — we caught stale decisions faster. When someone modified code that had a linked decision, the decision naturally surfaced. Several times someone flagged "hey this decision was made when we had 500 users, we have 30k now, the constraints are totally different."

"But that's too much work"
I hear this a lot so let me be direct about it.
Writing a decision takes 5-10 minutes. Not documenting it costs you hours or days down the line, spread across multiple people. I won't pretend I've measured this scientifically but from what I've seen, the ratio is something like 10 minutes invested saves 2-4 hours of future confusion. Per decision. And that's conservative.
Also — and this is important — you don't document everything. If two reasonable engineers would both make the same choice, don't bother. You document the decisions where there's genuine tension between options, where the reasoning isn't obvious from the code, or where you're consciously accepting a trade-off.
In practice, for a team of 6-8 devs, that's maybe 3-5 decisions per sprint. Totally manageable.

If you want to try this tomorrow
You don't need any special tooling. Seriously. Start with this:
Create a /docs/decisions/ folder in your repo.
Use a dead simple template (context, options, decision, trade-offs — that's it).
When you open a PR that implements a significant decision, paste the decision in the PR description or reference it.
Add one line to your PR template: "Does this PR implement a significant technical choice? If yes, link the decision."
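Concretely, the addition to `.github/pull_request_template.md` can be as plain as this (the wording is ours — tweak it):

```markdown
## Decision
Does this PR implement a significant technical choice?
If yes, link the decision: <!-- e.g. docs/decisions/0007-rate-limiting.md -->
```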
That's it. You can get fancy with bots and automation later. The habit is what matters. Once your team starts doing this consistently, you'll wonder how you ever worked without it.
The tooling comes after the habit, not before. (And yeah, there are tools that make this smoother — I've been exploring a few — but I wanted to keep this article focused on the practice itself.)

The thing nobody says out loud
Every engineering team has 2-3 people who hold most of the architectural context in their heads. Everyone knows who they are. They're the ones who get pinged on every Slack thread, pulled into every design discussion, and asked "hey quick question" twelve times a day.
That's not sustainable. And it's a massive risk. What happens when they go on vacation? When they burn out? When they leave?
Code tells you what a system does. Git tells you what changed and when. But neither tells you why. And that "why" is probably the most valuable — and most fragile — knowledge in your entire organization.
Writing it down and linking it to the code that implements it is embarrassingly simple. But simple doesn't mean easy to adopt. It takes discipline. It takes a team that agrees this matters. And it takes someone to go first and show that it works.
If you've tried something similar in your team — or if you tried and it failed — I'd really like to hear about it. Drop a comment, I'm curious what works for different team sizes and setups.
