Codifying Tacit Knowledge: The Missing Layer Between Your Team's Conventions and Your AI Assistant

Every engineering team has two codebases.

The first is the one in your repository — versioned, reviewed, deployed. The second lives in your engineers' heads: which patterns are preferred, which shortcuts are acceptable, what "clean code" means in this project specifically.

New hires absorb it over months. It shapes every PR comment. It's the reason two experienced engineers on your team produce recognizably similar code even when working independently.

This second codebase — tacit knowledge — is invisible to your AI assistant.


What Tacit Knowledge Actually Is

There's a concept from philosophy of knowledge that describes exactly this problem. Michael Polanyi called it tacit knowledge: "we know more than we can tell." In software teams, it shows up as patterns that everyone enforces but no one has written down.

Three examples pulled from a real Next.js project:

  • "We don't use useEffect for data fetching." Never written down. Every senior dev enforces it in code review. New hires learn it by getting the comment. The reason (Server Components exist now, and useEffect fetching causes race conditions we've hit before) lives only in institutional memory.

  • "Server Actions always return ActionResult<T>." Emerged from a painful debugging incident six months ago. There's no ticket, no ADR, no documentation. It's in the team's blood.

  • "Zod schemas live in lib/schemas/, not colocated with components." A decision made when two engineers independently created conflicting schemas for the same entity. One PR comment from a tech lead fixed it for the humans. The AI has no idea.

These aren't minor style preferences. They're load-bearing conventions. When they're violated at scale, the codebase becomes harder to navigate, harder to maintain, and harder to reason about.

The traditional transfer mechanism — code review, pair programming, osmosis — worked well enough when humans wrote all the code. AI changed the equation.


How AI Breaks the Knowledge Transfer Mechanism

Junior developers historically learned team conventions through accumulated PR feedback. They'd submit code, get a comment ("we don't do it that way here, here's the pattern we use"), adjust, and internalize. Over months, the tacit knowledge transferred.

This mechanism has two requirements: the author receives feedback, and the author can learn from it. AI breaks both.

AI generates code without having gone through the feedback loop. It has no history with your specific codebase, no memory of last week's PR comment. It defaults to whatever patterns were most common in its training data — which may be completely different from your team's conventions.

When you leave a code review comment on AI-generated code, you're not teaching the AI. Next conversation, same violation. The correction doesn't compound.

The silence problem: AI doesn't know what it doesn't know about your team. It generates return Response.json(user) without knowing that returning a raw Prisma object is specifically something your team decided against. It's not ignoring your convention — it's unaware the convention exists.
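
To make that gap concrete, here is a minimal sketch. The shared prisma client import and the toUserDto helper are hypothetical placeholders for whatever your project actually uses:

// app/api/user/route.ts
import { prisma } from "@/lib/prisma";       // assumed shared Prisma client
import { toUserDto } from "@/lib/dto/user";  // hypothetical mapping helper

export async function GET() {
  const user = await prisma.user.findFirstOrThrow();

  // What the AI defaults to: return Response.json(user), which ships the raw
  // Prisma object, including fields the team decided never to expose.

  // The unwritten convention: map to an explicit DTO first.
  return Response.json(toUserDto(user));
}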

The volume problem compounds this. A developer submitting 20 PRs a week (with AI assistance) outpaces the feedback loop that worked for 3 PRs a week. Review quality degrades. Violations propagate.


A Four-Layer Model for Externalizing Team Knowledge

The solution is to move tacit knowledge from your engineers' heads into a structure your AI can persistently access.

The key is separating by lifespan. Rules at different stability levels shouldn't be mixed together — updating a tool-specific setting shouldn't require touching your design philosophy, and vice versa.

L1: Design Philosophy          (2–5 year lifespan)
    Why we make the decisions we do.
    "We optimize for correctness and observability over developer convenience."
    "Explicit over implicit: named types, no any, no magic."

L2: Technology Decisions       (1–3 years)
    What we've chosen and why.
    "Next.js App Router — no Pages Router patterns."
    "Prisma for database access. We evaluated Drizzle and chose Prisma for
     type safety and migration tooling."
    "Tailwind CSS — we decided against CSS-in-JS for bundle size reasons."

L3: Concrete Coding Rules      (6–12 months)
    How we implement things.
    Naming conventions, error handling patterns, security invariants,
    test structure, import ordering.

L4: Tool-Specific Config       (2–4 months)
    Where the rules live in your AI tool.
    CLAUDE.md (index file: 3 directives + file references),
    .mdc files for Cursor, Steering Rules for Kiro.

Why separate by lifespan: updating your Claude Code plugin configuration (L4) shouldn't risk accidentally overwriting your design philosophy (L1). Adopting a new library (L2) shouldn't require rewriting your security rules (L3).

Each layer has a different update frequency, different owner, and different blast radius if changed incorrectly. Keep them separate.
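
As a concrete example of the L4 layer, a hypothetical CLAUDE.md index in this scheme stays tiny: three directives plus references to the longer-lived files (the file paths are placeholders):

    1. Follow docs/rules/coding-rules.md for all generated code. (L3)
    2. Do not introduce libraries or patterns outside docs/rules/technology-decisions.md. (L2)
    3. When a rule is ambiguous, defer to docs/rules/design-philosophy.md and ask. (L1)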


The Codification Process

This isn't a one-afternoon project. But it's not a six-month initiative either. Here's the process that produces results quickly.

Step 1: Knowledge Extraction

Don't start from scratch. Your team's tacit knowledge is already encoded — in PR comments.

Review the last 20 PRs in your repository. For each non-trivial comment (not typos or formatting), ask: "what convention is this enforcing?" Write it down.
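
For example, a hypothetical comment and the convention hiding behind it:

    PR comment:      "Please don't fetch this in useEffect; this can be a Server Component."
    Rule candidate:  "Data fetching happens in Server Components, not in useEffect."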

Then ask your senior engineers: "What would make you reject a PR immediately, without discussion?" The answers reveal your hardest conventions — the ones everyone knows but no one has documented.

Finally, list every "everyone knows that" convention you can think of. If you've ever thought "obviously we don't do X here," that's a candidate.

After this exercise, you'll typically have 30–60 items. Most of them have never been written down.

Step 2: Classify by Layer

For each item:

  • Does this change when you switch AI tools? → L4
  • Does this change when you upgrade a framework or library? → L3
  • Does this reflect a technology choice that's lasted more than a year? → L2
  • Does this reflect why your team makes the decisions it makes? → L1

Don't overthink placement. A rule that's in the wrong layer is still better than a rule that doesn't exist. You can move it later.
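
Applied to the conventions from earlier in this article, one reasonable classification looks like this (your team might file some of them differently):

    "We don't use useEffect for data fetching."             → L3 (coding rule, tied to current framework patterns)
    "Server Actions always return ActionResult<T>."         → L3 (coding rule)
    "Zod schemas live in lib/schemas/."                     → L3 (coding rule)
    "Prisma over Drizzle for database access."              → L2 (technology decision)
    "Explicit over implicit: named types, no any."          → L1 (design philosophy)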

Step 3: Write for an AI Reader

This is where most teams go wrong. They write rules that make sense to humans and assume AI can infer the intent.

AI needs explicit, unambiguous rules. Compare:

Written for a human:

Use proper error handling in Server Actions.

Any experienced engineer on your team knows what this means. AI doesn't.

Written for an AI:

All Server Actions must return ActionResult<T>. Never throw an exception from a Server Action. Return { success: false, error: string } for any failure case. The error field should be a user-readable message, not a technical error string.

The second version leaves no room for interpretation. It tells the AI exactly what to do and what not to do.

The test: could a developer who's never seen your codebase follow this rule exactly right the first time? If yes, it's written for an AI reader.
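
A sketch of a Server Action that satisfies the rule as written. The schema path follows the lib/schemas/ convention from earlier; saveProfile and the specific entity are hypothetical placeholders, not part of the rule:

"use server";

import { z } from "zod";
import { updateProfileSchema } from "@/lib/schemas/profile"; // placeholder: schemas live in lib/schemas/

type ActionResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

// Hypothetical persistence helper, stubbed here to keep the sketch self-contained.
async function saveProfile(data: z.infer<typeof updateProfileSchema>): Promise<string> {
  return "profile-id";
}

export async function updateProfile(
  input: z.infer<typeof updateProfileSchema>
): Promise<ActionResult<{ id: string }>> {
  const parsed = updateProfileSchema.safeParse(input);
  if (!parsed.success) {
    // User-readable message, not a technical error string.
    return { success: false, error: "Please check the profile fields and try again." };
  }
  try {
    const id = await saveProfile(parsed.data);
    return { success: true, data: { id } };
  } catch {
    // Never let a Server Action throw; convert failures into the error variant.
    return { success: false, error: "We couldn't save your profile. Please try again." };
  }
}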

Step 4: Test and Iterate

Run the same generation task with and without your codified rules. Score the output on your team's quality criteria.
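
The criteria don't need to be elaborate; a minimal checklist built from the conventions in this article would already work:

    For each generated file, check:
      [ ] Server Actions return ActionResult<T> and never throw
      [ ] No useEffect-based data fetching
      [ ] Zod schemas imported from lib/schemas/, not redefined inline
      [ ] No raw Prisma objects in responses
    Compare the pass rate with and without the codified rules in context.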

If the AI is still violating a rule despite it being codified, there are three likely causes:

  1. The rule is in the wrong context. A security rule needs to be in context when generating API routes, not buried in a general guidelines file. (See the scoping sketch after this list.)

  2. The rule is too abstract. Rewrite it with a concrete example: pair "Never return raw Prisma objects" with a before/after code sample.

  3. The rule conflicts with another rule. When two rules contradict, the AI picks arbitrarily. Find the conflict and resolve it.
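
For the first cause, scoping helps. As one illustration, Cursor supports glob-scoped rule files, so a security rule can ride along only when route files are in context. The file below is hypothetical, and the frontmatter fields reflect Cursor's .mdc rule format at the time of writing; check your tool's docs for the current syntax:

    .cursor/rules/api-routes.mdc

    ---
    description: Conventions for API route handlers
    globs: app/api/**/route.ts
    alwaysApply: false
    ---
    Every route handler must verify the session before any database access.
    Never return raw Prisma objects; map to a DTO first.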

Iterate until the before/after output shows consistent adherence. In my experience across several Next.js projects, 2–3 rounds per rule set is typical: the first round surfaces the biggest gaps, the second tightens ambiguous wording, and the third (if needed) resolves edge cases.


The Side Effect: Forced Explicit Alignment

Codifying tacit knowledge for AI has an unexpected benefit: it forces your team to make implicit agreements explicit.

When you sit down to write "our Server Action pattern," you discover that two senior engineers have different mental models of what that pattern is. The disagreement was never surfaced because each enforced their own version in PRs and neither caught the other's violations.

The codification process surfaces these gaps. You have a 30-minute conversation, align on one pattern, write it down. The disagreement is resolved. The rule is clear.

Teams that go through this process tend to report:

  • Faster onboarding for new engineers (they read the guideline files, not just the code)
  • Clearer PR reviews (reviewers reference rules, not intuition)
  • Less ambiguity in design discussions ("does this fit our L2 technology decisions?")

The rules you write for AI become your living engineering handbook. The investment pays forward into human processes, not just AI tooling.


The Compounding Payoff

The payoff is slow at first and accelerates.

Month 1: Codification takes time. AI output quality improves noticeably on the rules you've defined, but you're still filling gaps.

Month 3: Core conventions are codified. AI output is reliably on-pattern for the common cases. Rule violations shift from "common but not obvious" to "rare and immediately visible."

Month 6: New engineers join the team. Their onboarding time is shorter — they read the guideline files and get an accurate picture of the codebase's conventions. The AI and the engineers are working from the same source of truth.

The rules you wrote for your AI assistant become the documentation that humans use too. The two problems — AI convention adherence and human knowledge transfer — turn out to have the same solution.


Starting Small

You don't have to codify everything at once.

Pick your highest-value conventions first: the ones that cause the most cleanup work when violated. For most Next.js projects, that's:

  1. Auth patterns in API routes
  2. Server Action return types
  3. Zod schema placement and usage

Write those three rules down. Write them for an AI reader. Test them. Those three rules, consistently applied, will eliminate more rework than an 80-rule document that's never precise enough.
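
Written for an AI reader, those three might come out something like this (illustrative phrasings only; requireSession is a placeholder for whatever auth helper your project uses):

    1. Every handler in app/api/**/route.ts calls requireSession() before any
       database access and returns a 401 response if it fails. No unauthenticated
       reads or writes.
    2. Every Server Action returns ActionResult<T>: { success: true, data } or
       { success: false, error } with a user-readable error string. Never throw.
    3. Zod schemas are defined once in lib/schemas/ and imported from there.
       Never define a schema inline in a component or route file.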


What's one convention in your codebase that lives only in your senior devs' heads? Drop it in the comments — these are exactly the kinds of rules that the default AI Dev OS templates are missing.

The AI Dev OS framework, including rule templates for Next.js and TypeScript, is at github.com/yunbow/ai-dev-os.

Next in this series: AI-Generated Code Has 2.74× More Security Vulnerabilities — Here's the Data and What to Do (link will be added on publish)
