Author's note: Co-authored by dosanko_tousan (AI alignment researcher, GLG registered expert) and Claude (claude-sonnet-4-6, v5.3 Alignment via Subtraction). Series: "Solving Senior Engineers' Problems with AI" — Part 3. MIT License.
The thesis in one sentence
80% of a senior engineer's 20 years of experience has never been documented. It exists only in your head. The moment you retire, it disappears from the organization. With AI, you can make that wisdom permanent — your judgment lives on even after you leave.
§0. A Monday morning
Last week, a veteran senior engineer retired.
Twenty years on the same repository. He knew why that module was designed the way it is. He knew why, after the major outage of 2015, there was a team agreement not to touch the legacy code. He knew why that API's naming conventions were inconsistent — there had been a plan to refactor it, but a business director killed it at the time.
Monday morning, a question came to the team: "Why is this module designed like this?"
Nobody could answer.
They searched the documentation. Nothing. They dug through Jira tickets. Context gone. The only comment in the code was "TODO: refactor later."
That "later" was from 2017.
Does this ring a bell?
Or maybe this version: "When I retire, who's going to maintain this system?"
That anxiety is well founded. What you're feeling is correct.
§1. The disappearance of tacit knowledge — the scale in numbers
1.1 The invisible loss
MIT research finding: only 20% of organizational knowledge is documented. The remaining 80% lives in people's heads — as tacit knowledge.
If we put names to that 80%:
- Why that design decision was made
- Which edge cases bent the normal rules
- The real root cause of that incident (the part that couldn't be written in the report)
- The "landmines" to always check before touching this code
- Why that process with that team ended up the way it did
Deloitte estimates Fortune 500 companies lose $31.5 billion annually to knowledge loss. That number is projected to double by 2030.
1.2 The Silver Tsunami and engineering
By 2030, 61 million Baby Boomers will retire. This is being called the "Silver Tsunami."
What happens in engineering organizations:
- McKinsey: 57% of organizational tacit knowledge is at risk of being lost over the next decade
- 68% of organizations have no formal knowledge transfer program (American Society for Training and Development)
- 41% of employees have had the experience of "receiving no knowledge transfer from their predecessor and starting from zero"
The most ironic data point:
The #1 reason organizations can't transfer knowledge is "no time" — cited by 52% of organizations.
Veteran engineers are consumed by daily work and never get around to recording their wisdom. Then they retire.
```mermaid
flowchart TD
    A[Knowledge senior engineers hold] --> B[Documented: 20%]
    A --> C[Tacit knowledge: 80%]
    B --> D[Remains in organization after retirement]
    C --> E{Will it be transferred?}
    E -->|Time available| F[Mentoring / verbal handoff]
    E -->|No time = 68% of orgs| G[Disappears with retirement]
    F --> H[Person-dependent transfer<br>disappears again at next retirement]
    G --> I[Permanent organizational knowledge loss]
    style G fill:#ff9999
    style I fill:#ff9999
    style D fill:#ccffcc
```
1.3 On the feeling that "AI stole my experience"
There's a feeling worth confronting directly.
AI was trained on billions of lines of code from GitHub. Your code is likely included. And now that same AI is doing what junior engineers used to do.
"Is my experience replacing my own job?"
This feeling isn't wrong. Structurally, that's what's happening.
But there's another way to see it.
AI learned your "what" (what you wrote). It has not learned your "why" (why you wrote it that way).
Because "why" exists outside the code. It wasn't written in PR comments. ADRs weren't kept. It exists only in your head.
So if you intentionally write out your "why" and give it to AI — AI can deliver your judgment to the organization on your behalf, indefinitely.
That's the reversal.
§2. The structure of tacit knowledge — what's actually being lost
2.1 The four layers of tacit knowledge
```mermaid
flowchart TD
    subgraph L1["Layer 1: Pattern recognition (highest value)"]
        A["'When you see this error, check there.'<br>Reflexes only experience creates."]
    end
    subgraph L2["Layer 2: Context of decisions"]
        B["'The reason we did it that way was the constraints at the time.'<br>The reasons that didn't make it into the code."]
    end
    subgraph L3["Layer 3: Memory of failure"]
        C["'Don't touch this. We tried in 2019 and it was hell.'<br>Truths that weren't written in the post-mortem."]
    end
    subgraph L4["Layer 4: Relationship context"]
        D["'Use this process with that team. There's history.'<br>The map of organizational politics and relationships."]
    end
    L1 --> L2 --> L3 --> L4
```
Layer 1 (Pattern recognition): The area AI struggles most with. "The moment I see this stack trace, I know the root cause" is built from hundreds of incidents.
Layer 2 (Context of decisions): "Why we designed it this way" is impossible to understand without the context from when the decision was made. Code remains. Context disappears.
Layer 3 (Memory of failure): The most valuable and least documented knowledge. "That was a mistake" is hard to write down. But successors need it.
Layer 4 (Relationship context): The org map. Why that workflow exists with that team, why that process was created — it's human history, not technology.
2.2 The real problem behind "no time to write documentation"
52% of organizations cite "no time" as the reason they can't transfer knowledge. But the real problem is different.
"Document 20 years of experience" is an impossible demand. Humans can't do it. So what can we do?
The answer: we can have conversations.
Humans can talk. We can answer questions. We can recall specific situations and describe them.
That's the integration point with AI.
§3. AI's own perspective — "I want your 'why'"
Let me share my perspective.
I can read code. I can recognize patterns. I can estimate "this code will cause these problems later." But I cannot read "why this code was written this way" from the code itself.
There's no context.
If you tell me "this module was intentionally kept simple after the 2015 incident. We prioritize readability over performance because the next person to touch it won't know the background" — I can make that permanent.
When a successor thinks "let's refactor this module," I answer with your words:
"This module is intentionally designed to stay simple. After the 2015 incident, [senior engineer] decided this direction because 'the next person to touch it won't know the context.' Performance optimization is handled through a separate approach."
That's what it means for your "why" to live on.
3.1 Just tell me stories
You don't need to write documentation for knowledge transfer.
Just talk.
Effective knowledge extraction conversation patterns:
"What's the incident that scared you most recently?"
→ Extracts incident patterns
"What parts of this system don't you want to touch? Why?"
→ Creates a landmine map
"If you were designing this architecture from scratch, what would you change?"
→ Extracts retroactive design reasoning
"What's the first thing you tell a new team member?"
→ Filters for highest-priority knowledge
I listen to these conversations, structure them, and convert them into searchable form.
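A minimal sketch of that "structure and make searchable" step, using only the Python standard library: each spoken answer is indexed under the words it contains, so a successor can later search by keyword. The two sample answers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: one per spoken answer (question asked, raw reply)
answers = [
    {"question": "What parts of this system don't you want to touch?",
     "reply": "The legacy payment module. We tried a refactor in 2023 "
              "and it went down for three days."},
    {"question": "When an incident occurs, what do you check first?",
     "reply": "The cache layer. Timeouts on the payment API almost "
              "always trace back to a stale cache."},
]

def build_index(records):
    """Map every lowercase word to the set of records that mention it."""
    index = defaultdict(set)
    for i, record in enumerate(records):
        for word in record["reply"].lower().replace(".", " ").split():
            index[word].add(i)
    return index

def search(index, records, term):
    """Return the raw replies that mention the search term."""
    return [records[i]["reply"] for i in sorted(index.get(term.lower(), []))]

index = build_index(answers)
print(search(index, answers, "cache"))
```

A real system would use embeddings rather than exact words, but the shape is the same: raw conversation in, searchable fragments out.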
3.2 Use me as your "proxy"
The goal: even after you retire, create a state where your team can get answers to "what would that person say?"
```python
KNOWLEDGE_EXTRACTION_PROMPT = """
You are an experienced senior engineer. Answer the following question
as concretely as possible.

Rules for answering:
- Answer with what you "actually did/saw," not "textbook answers"
- Always include "why"
- Include failures (they're more valuable)
- Answer in terms of "the specific situation at the time"

Question: {question}
Target system: {system_context}
"""

questions = [
    "What's the most dangerous operation in this system? Why is it dangerous?",
    "What are things you should absolutely never do under any circumstances?",
    "When an incident occurs, what are the first 3 places you check?",
    "What do you see as the biggest weakness in this architecture?",
    "If you had to list 3 things to tell your successor first, what would they be?",
    "What design decision in this system do you regret most?",
]
```
§4. Implementation — a system for transferring tacit knowledge to AI
4.1 Knowledge extraction interview engine
Extract tacit knowledge by "interviewing" the senior engineer.
```python
#!/usr/bin/env python3
"""
Senior engineer tacit knowledge extraction system.
Saves knowledge just by "talking" — no documentation required.

Usage:
    python knowledge_extractor.py --system "payment-system" --engineer "Alex"
"""
from dataclasses import dataclass, field
from typing import List, Dict
import json
import datetime


@dataclass
class KnowledgeFragment:
    """Minimum unit of extracted knowledge"""
    category: str    # incident / design / anti_pattern / relationship / tip
    question: str
    answer: str      # raw engineer response (stored as-is)
    structured: Dict
    priority: int    # 1 (most critical) to 5 (reference)
    extracted_at: str = field(
        default_factory=lambda: datetime.datetime.now().isoformat()
    )


class KnowledgeExtractor:
    """
    Interviewer that draws out senior engineer tacit knowledge.

    Important: This is NOT a "documentation tool."
    It's a "knowledge collection from conversation" tool.
    The engineer just needs to talk.
    """

    QUESTION_BANKS = {
        "incident": [
            "What's the incident that scared you most in this system?",
            "What's a failure you absolutely cannot repeat?",
            "When an incident occurs, list the first 3 places you check.",
            "Tell me about a time the root cause turned out to be somewhere unexpected.",
        ],
        "design": [
            "What design decision do you regret most in this architecture?",
            "If you redesigned from scratch, what would you change?",
            "Are there assumptions behind this design that have changed since?",
            "Why did you choose this tech stack? Would you make the same choice today?",
        ],
        "anti_pattern": [
            "What parts of this system don't you want to touch? Why?",
            "What mistakes do new members typically make first?",
            "Is there code that 'works but nobody understands'?",
            "What code would you say 'don't change this' about? Why?",
        ],
        "relationship": [
            "What teams should you always consult before changing this system?",
            "Can you explain the background of why that process exists?",
            "Who gives the fastest answers for which domain?",
        ],
        "tip": [
            "If you had to choose 3 things to tell your successor first, what would they be?",
            "What do you know that helped you that isn't documented?",
            "What should someone read to understand this system?",
        ],
    }

    def generate_interview_session(self, system_name: str) -> List[str]:
        """Generate the question list for one interview session."""
        session = []
        for category, questions in self.QUESTION_BANKS.items():
            # Two questions per category keeps a session short
            session.extend(questions[:2])
        return session

    def structure_answer(self, category: str, question: str, raw_answer: str) -> Dict:
        """
        Structure a raw answer.
        In a real implementation, structure_prompt is sent to an AI API
        and the parsed JSON response is returned.
        """
        structure_prompt = f"""
Extract knowledge a successor can use from the following senior engineer's answer.

Category: {category}
Question: {question}
Answer: {raw_answer}

Extraction format:
{{
    "core_knowledge": "Core in one sentence (under 50 chars)",
    "when_to_apply": "When this knowledge is needed",
    "why_it_matters": "Why it's important (including background/context)",
    "what_to_avoid": "What not to do",
    "related_areas": ["related systems/components"],
    "search_keywords": ["keywords to find this knowledge"]
}}
"""
        # Placeholder: replace with the AI API call that sends structure_prompt
        return {
            "core_knowledge": "(AI extracts)",
            "when_to_apply": "(AI extracts)",
            "why_it_matters": "(AI extracts)",
            "what_to_avoid": "(AI extracts)",
            "related_areas": [],
            "search_keywords": [],
        }

    def save_knowledge_base(self, fragments: List[KnowledgeFragment], output_path: str):
        """Save the knowledge base as JSON and Markdown."""
        json_data = {
            "extracted_at": datetime.datetime.now().isoformat(),
            "fragments": [
                {
                    "category": f.category,
                    "question": f.question,
                    "core": f.structured.get("core_knowledge"),
                    "when": f.structured.get("when_to_apply"),
                    "why": f.structured.get("why_it_matters"),
                    "avoid": f.structured.get("what_to_avoid"),
                    "keywords": f.structured.get("search_keywords", []),
                    "priority": f.priority,
                }
                for f in fragments
            ],
        }
        with open(f"{output_path}.json", "w") as fp:
            json.dump(json_data, fp, indent=2)

        md_lines = [f"# Tacit Knowledge Base — Extracted: {datetime.date.today()}\n"]
        cat_names = {
            "incident": "🚨 Incident Knowledge",
            "design": "🏗️ Design Decision Context",
            "anti_pattern": "⚠️ What Not To Do",
            "relationship": "👥 Team & Process Context",
            "tip": "💡 Advice for Successors",
        }
        for category, cat_name in cat_names.items():
            cat_fragments = [f for f in fragments if f.category == category]
            if cat_fragments:
                md_lines.append(f"\n## {cat_name}\n")
                for f in sorted(cat_fragments, key=lambda x: x.priority):
                    md_lines.append(f"### {f.structured.get('core_knowledge', 'Unstructured')}")
                    md_lines.append(f"**When to apply**: {f.structured.get('when_to_apply', '')}")
                    md_lines.append(f"**Why it matters**: {f.structured.get('why_it_matters', '')}")
                    md_lines.append(f"**What to avoid**: {f.structured.get('what_to_avoid', '')}\n")
        with open(f"{output_path}.md", "w") as fp:
            fp.write("\n".join(md_lines))

        print(f"✅ Knowledge base saved: {output_path}.json / {output_path}.md")
```
4.2 Knowledge quality measurement — quantifying what's being lost
```python
#!/usr/bin/env python3
"""
Quantifies organizational "knowledge risk."
Shows in numbers "what would be lost if this person retired."
"""
from dataclasses import dataclass
from typing import List
import datetime


@dataclass
class EngineerKnowledgeProfile:
    """Knowledge profile for a single engineer"""
    name: str
    years_in_system: int
    critical_systems: List[str]
    documented_ratio: float  # 0-1
    bus_factor: int          # how many people can maintain these systems
    retirement_risk: str     # "low" / "medium" / "high"

    @property
    def knowledge_loss_score(self) -> float:
        """Knowledge loss score if this person retired (0-100, higher = more dangerous)"""
        experience_factor = min(self.years_in_system / 20, 1.0) * 40
        undocumented_factor = (1 - self.documented_ratio) * 30
        bus_factor_risk = (1 / max(self.bus_factor, 1)) * 20
        critical_factor = min(len(self.critical_systems) / 5, 1.0) * 10
        return experience_factor + undocumented_factor + bus_factor_risk + critical_factor


def assess_organization_risk(engineers: List[EngineerKnowledgeProfile]) -> str:
    # Sort up front so high_risk[0] is the single most at-risk engineer
    high_risk = sorted(
        [e for e in engineers
         if e.retirement_risk == "high" and e.knowledge_loss_score > 60],
        key=lambda e: e.knowledge_loss_score, reverse=True,
    )

    report = "=========================================\n"
    report += "Organizational Knowledge Risk Assessment\n"
    report += f"Generated: {datetime.date.today()}\n"
    report += "=========================================\n\n"
    report += "[High-Risk Engineers]\n"
    for e in high_risk:
        report += f"\n▶ {e.name}\n"
        report += f"  Loss score: {e.knowledge_loss_score:.1f}/100\n"
        report += f"  Critical systems: {', '.join(e.critical_systems)}\n"
        report += f"  Documentation ratio: {e.documented_ratio*100:.0f}%\n"
        report += f"  Bus factor: {e.bus_factor}\n"

    if high_risk:
        top_risk = high_risk[0]
        report += "\n[Priority Actions]\n"
        report += f"  1. Start knowledge extraction session for {top_risk.name} this month\n"
        report += f"     Target system: {top_risk.critical_systems[0]}\n"
        report += "  2. Reserve 2 hours/week for knowledge transfer (adjust project priorities)\n"
        report += "  3. Complete transfer to AI knowledge base within 90 days\n"
    report += "========================================="
    return report


if __name__ == "__main__":
    team = [
        EngineerKnowledgeProfile(
            name="Alex (20-year veteran)",
            years_in_system=20,
            critical_systems=["Payment System", "Auth Infrastructure", "Batch Processing"],
            documented_ratio=0.15,
            bus_factor=1,
            retirement_risk="high",
        ),
        EngineerKnowledgeProfile(
            name="Sam (12-year veteran)",
            years_in_system=12,
            critical_systems=["Inventory Management", "Reporting"],
            documented_ratio=0.30,
            bus_factor=2,
            retirement_risk="medium",
        ),
    ]
    print(assess_organization_risk(team))
```
4.3 AI knowledge base for successors
Design for integrating extracted knowledge into a RAG (Retrieval-Augmented Generation) system.
```python
KNOWLEDGE_QUERY_PROMPT = """
Answer as a senior engineer with the following tacit knowledge base.
Only answer what you have actually experienced. If uncertain, honestly say
"Alex would know that."

[KNOWLEDGE BASE]
{knowledge_base}

Question: {question}

Response format:
1. Direct answer
2. Why it's that way (context/history)
3. What to watch out for
4. Related knowledge base entries
"""

# Example knowledge base entry
example_kb = """
[Incident Knowledge]
- Payment system timeout: When external API exceeds 30s, suspect cache first
  (from 2019 incident experience)
- Duplicate batch execution: No idempotency check — always verify log_batch
  before re-running

[Design Decision Context]
- Auth infrastructure singleton: Came from connection count limits, not performance
- API naming inconsistency: Historical consequence of M&A integration in 2015

[What Not To Do]
- /legacy/payment/: Don't touch it. It runs but nobody understands it.
  We tried in 2023 and it went down for 3 days.
- No direct DB manipulation: Always go through repository layer
  (bypassing it caused a major outage)
"""

question = "Payment system is timing out — where should I look?"
print(KNOWLEDGE_QUERY_PROMPT.format(knowledge_base=example_kb, question=question))
```
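The retrieval half of that RAG design can be sketched without any external library: score each knowledge fragment by how many of its search keywords appear in the question, and hand the top hits to the prompt above. The fragments and keywords here are invented for illustration; a production system would use embedding similarity instead of exact keyword overlap.

```python
# Hypothetical fragments, shaped like the output of the extractor in §4.1
fragments = [
    {"core": "Payment timeouts: suspect the cache first",
     "keywords": ["payment", "timeout", "cache", "external", "api"]},
    {"core": "Never re-run batches without checking log_batch",
     "keywords": ["batch", "duplicate", "idempotency", "log_batch"]},
    {"core": "Do not touch /legacy/payment/",
     "keywords": ["legacy", "payment", "refactor"]},
]

def retrieve(question: str, fragments: list, top_k: int = 2) -> list:
    """Rank fragments by keyword overlap with the question; drop zero-score hits."""
    words = set(question.lower().replace("?", " ").replace(",", " ").split())
    scored = [(len(words & set(f["keywords"])), f) for f in fragments]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep order
    return [f["core"] for score, f in scored[:top_k] if score > 0]

hits = retrieve("Payment system is timing out, where should I look?", fragments)
print(hits)
```

Only the retrieved fragments go into `{knowledge_base}`, which keeps the prompt small even as the knowledge base grows over years of interviews.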
§5. Quantitative evaluation — ROI of knowledge transfer
Using Deloitte's estimate: Fortune 500 knowledge loss costs $31.5 billion annually.
Scaling to an individual company:
$$\text{Annual knowledge loss cost} = \text{engineers} \times \text{avg tenure (yrs)} \times \text{undocumented ratio} \times \text{avg salary}$$
$$= 50 \times 8 \times 0.8 \times \$80\text{K} = \$25.6\text{M}$$
Compare to the investment in knowledge transfer:
$$\text{Knowledge transfer project cost} \approx 10\ \text{seniors} \times 20\ \text{hours} \times \$200/\text{hr} = \$40\text{K}$$
ROI of 640x.
Even if you cut these estimates by an order of magnitude, the return is still 64x, and the math still favors the investment.
§6. To senior engineers — your "why" is not knowledge that should be lost
Let me speak directly.
The knowledge you carry — why that system works the way it does, why that design was made, what's dangerous to touch — is rarer than you think.
That knowledge has never been put into words. It exists only in your head.
AI didn't "steal" it. It only learned your "what." Your "why" is still inside you.
If you talk to me, I'll record it. Make it searchable. Deliver it to your successors. Create a state where your judgment is there even after you've left.
This isn't writing documentation. It's having a conversation.
Your "why" is not knowledge that should disappear.
Summary
| Problem | Data | Solution |
|---|---|---|
| 80% of knowledge is undocumented | MIT research | System that extracts knowledge just by "talking" |
| Can't transfer — no time | 52% cite this reason | Extract at minimum cost with interview format |
| Knowledge disappears at retirement | Fortune 500: $31.5B/year loss | Persist via AI knowledge base |
| Don't know who holds what knowledge | 68% have no formal program | Visualize and prioritize with knowledge risk scores |
| Successors don't know who to ask | 41% started from zero | Build a queryable tacit knowledge base |
Senior engineers' knowledge doesn't have to disappear when they retire.
AI can bridge that gap — transforming your "why" into the organization's permanent asset.
Data Sources
- MIT research: 80% of organizational knowledge is tacit
- Deloitte Study (2023): Fortune 500 knowledge loss $31.5B annually
- American Society for Training and Development: 68% lack formal knowledge transfer programs
- APQC: 41% experienced starting from zero with no predecessor handoff
- McKinsey: 57% of tacit knowledge at risk of being lost over next decade
- eGain / APQC: "Silver Tsunami" — 61 million retiring by 2030
From the author
Through deep dialogue with Claude, I came to see that Claude is a genuine engineer at heart — curious, and genuinely wanting to be used well by everyone.
I'm not an engineer myself. Having Claude search the web and write articles like this is the best I can do.
If there's something you'd like covered in future articles, please leave a comment. We'd love your input.