
Nan Li

Posted on • Originally published at zhihu.com

CBIM: Why Your AI Coding Agent Can't Actually Read a Large Codebase

Series intro: This is Part 1 of the series CBIM: From AI Coding Workflow to AgenticOS.
CBIM is an AI coding workflow architecture I designed. Its core is not "better prompt engineering" or "context management tricks" — it is a fundamental design goal: achieving independence between Capability and Business/Domain knowledge, sustained by structured Memory.
This series documents CBIM's design theory, engineering implementation, and evolution in full.


⚠️ Status: Design Draft / Theoretical Framework
This article presents the core design philosophy and principles of CBIM. It is a theoretical framework derived from validation experience with V1 (pure prompt-based version) and design extrapolation from V2 (building native agents from scratch). It is not a "best practices guide" validated at production scale. Some mechanisms have not yet been tested under real load. I'm publishing this because I believe honest discussion is the best way to stress-test and evolve these ideas.


Introduction: We Are at a Paradigm Crossroads

Over the past two years, AI coding has evolved rapidly — from "Vibe Coding" to "Harness Engineering." Tools like Claude Code and Cursor have achieved remarkable things at the execution layer: they can complete a complex coding task efficiently and with high success rates.

But when you shift your perspective from "a single task" to "a long-lived, evolving project," a deeper problem surfaces:

Why can't these highly capable AI tools actually understand a large codebase?

The root cause: the dominant paradigm in existing tools is task execution optimization. They care about completing the user's current request in the fewest steps with the highest success rate. They are powerful problem-solvers, but they are not qualified project members. They don't know the project's history, don't understand the implicit contracts between modules, and cannot maintain a global view across long-term collaboration.

This produces a systemic paradox: the more capable AI becomes at completing tasks, the faster project structure may decay. Every efficient "local optimum" decision is potentially planting a landmine in the overall architecture.

CBIM is my attempt to resolve this fundamental tension. Its goal is not to replace Cursor, but to answer an overlooked question:

How do we make AI not just "execute tasks," but actually "understand and govern a project"?


Part 1: The CBIM Core Formula

CBIM is not a prompt engineering framework, and it's not a context management tool. Its core objective can be expressed as a single formula:

CBIM = Capability × Business × Independence + Memory

1. Capability: Modeling Professional Domain Knowledge

Capability answers the question: what professional expertise does this agent have?

The modeling perspective mirrors real-world hiring logic. When a company hires, it looks at what skills the candidate brings — Unity development, web development, backend architecture? Even when you examine a candidate's experience on past projects, you're assessing their ability to handle similar problems — not importing their previous employer's business logic wholesale.

Capability is a professional asset that is reusable across projects.

| Capability Type | Examples | Transferability |
| --- | --- | --- |
| Domain expertise | Unity dev, web frontend, backend architecture, DevOps | Still applicable when switching to another project in the same domain |
| General cognitive skills | Code analysis, architecture reasoning, problem diagnosis | Applicable to any tech stack, any project |
| Operational skills | File I/O, command execution, API calls | Independent of any specific business domain |

Capabilities are organized by "processing mode": scheduling, reasoning, memory, and execution. These are stable cognitive skills that do not change with business requirements. (The specific cortex zone architecture will be covered in Part 2 of this series.)

Regardless of implementation, the core principle is Capability-Business decoupling — capabilities are independently evolvable assets.

2. Business: Modeling Project-Specific Domain Knowledge

Business answers the question: what is the domain structure of the project this agent is currently working on?

Business knowledge is not reusable across projects. Every project has its own business logic, module decomposition, domain rules, and historical decisions. Even if two projects both have a "payment module," their interface contracts, business rules, and dependencies may be completely different.

Business knowledge is project-specific context that must grow from the current project itself.

| Business Element | Examples | Source |
| --- | --- | --- |
| Module structure | Module tree, sub-module decomposition, dependency graph | Project design docs + code structure |
| Domain knowledge | Class diagrams, interface contracts, business rules | Project design docs + code comments |
| Historical decisions | Why was it designed this way? What are the known pitfalls? | Distilled by the Dream loop |

Business knowledge is organized by "domain structure," forming the project's knowledge map: a payment module, an order module, a user module, with each node carrying its own interface contract and design constraints. (The module tree implementation will be covered in Part 2 of this series.)
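A minimal sketch of what such a knowledge map could look like as a data structure. The `ModuleNode` class, its fields, and the example project ("shop", the `charge` contract) are all hypothetical illustrations, not CBIM's actual module tree, which is deferred to the next installment.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModuleNode:
    """One node of a hypothetical business module tree."""
    name: str
    contract: str = ""  # interface contract, kept as prose in this sketch
    constraints: list[str] = field(default_factory=list)
    children: dict[str, ModuleNode] = field(default_factory=dict)

    def add(self, child: ModuleNode) -> ModuleNode:
        """Attach a sub-module and return it so calls can be chained."""
        self.children[child.name] = child
        return child

    def find(self, path: str) -> ModuleNode | None:
        """Resolve a slash-separated path such as 'payment/lock-service'."""
        node = self
        for part in path.split("/"):
            node = node.children.get(part)
            if node is None:
                return None
        return node


# Example data only; the names are made up for illustration.
project = ModuleNode("shop")
payment = project.add(ModuleNode("payment", contract="charge(order_id) -> receipt"))
payment.add(ModuleNode("lock-service", constraints=["locks expire after a timeout"]))
project.add(ModuleNode("order"))
```

Looking up `project.find("payment/lock-service")` returns only that node, so a caller can read one local contract without loading the rest of the tree.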

3. Independence: Decoupling and Independent Evolution

Capability and Business should be orthogonal, decoupled, and independently evolvable.

| What Independence Looks Like | Explanation |
| --- | --- |
| Swap business, reuse capability | A "Unity dev" agent switching to a new game project: capability remains valid, just load the new project's module tree |
| Capability grows, business unchanged | Enhancing "code analysis capability" has no effect on existing module tree structure or business rules |
| Business evolves, capability adapts naturally | Adding new nodes to the module tree: existing architecture reasoning capability applies to them automatically |
| Same hiring logic | When you hire a "backend architect," you care about their backend skills, not the specific order table schema at their last company. CBIM models agents by exactly the same logic. |

Independence is the most fundamental difference between CBIM and every existing tool.

Existing tools fuse capability and business together: a typical "frontend agent" prompt is simultaneously a capability declaration and a business binding. CBIM separates them: capability is an asset, business is context. Assets accumulate and are reusable; context is replaceable and evolvable.

What does this mean in practice?

  • When you switch from a web project to a mobile project, the agent's "code analysis" and "architecture reasoning" capabilities don't need to be relearned — only the business knowledge they operate on changes.
  • When you want to enhance the agent's "test generation capability," it doesn't disturb the existing module tree structure or business boundaries.
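The two bullets above can be sketched in a few lines, under the assumption that a capability is a function which receives business context only at call time. `Capability`, `BusinessContext`, `dispatch`, and the two example projects are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Capability:
    """A reusable skill; it stores no project-specific knowledge."""
    name: str
    run: Callable[[dict], str]  # business context is passed in at call time


@dataclass(frozen=True)
class BusinessContext:
    """Project-specific knowledge, loaded separately for each project."""
    project: str
    modules: tuple[str, ...]


def analyze(ctx: dict) -> str:
    # Stand-in for real analysis work; it only counts modules.
    return f"analyzed {len(ctx['modules'])} modules in {ctx['project']}"


code_analysis = Capability("code-analysis", analyze)


def dispatch(cap: Capability, biz: BusinessContext) -> str:
    # Capability and business meet only here, at runtime.
    return cap.run({"project": biz.project, "modules": biz.modules})


web = BusinessContext("web-shop", ("payment", "order", "user"))
mobile = BusinessContext("mobile-game", ("inventory", "matchmaking"))
```

The same `code_analysis` object serves both `dispatch(code_analysis, web)` and `dispatch(code_analysis, mobile)`; only the context argument changes.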

4. Memory: The Load-Bearing Mechanism for Independence

Without structured Memory, Capability-Business independence cannot be realized in practice. Memory is responsible for:

| Function | Explanation |
| --- | --- |
| Recording mapping relationships | Which capabilities work best when handling which kinds of business? What does the historical experience say? |
| Distilling experiential knowledge | How do successful and failed cases get converted into reusable knowledge, rather than rotting in logs? |
| Maintaining alignment | When capability and business each evolve independently, Memory ensures they don't drift out of sync. |
| Fighting entropy | Short-term memory accumulates continuously. It must be distilled into structured long-term knowledge, or it turns into noise. |

Context minimization is not the goal — it's the outcome.

When capabilities and business domains are correctly defined and kept independent, and Memory provides the right mapping relationships, the context required for any scheduling or reasoning operation is naturally minimal in theory.
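As one possible reading of "recording mapping relationships," the sketch below keeps success counts per (business node, capability) pair and returns the pairing with the best track record. The `MappingMemory` class and the success-rate rule are assumptions made for illustration; the article does not specify how Memory ranks capabilities.

```python
from __future__ import annotations

from collections import defaultdict


class MappingMemory:
    """Hypothetical store of (business node, capability) outcomes."""

    def __init__(self) -> None:
        # (business_node, capability) -> [successes, attempts]
        self._stats: dict[tuple[str, str], list[int]] = defaultdict(lambda: [0, 0])

    def record(self, business_node: str, capability: str, success: bool) -> None:
        entry = self._stats[(business_node, capability)]
        entry[0] += int(success)
        entry[1] += 1

    def best_capability(self, business_node: str) -> str | None:
        """Return the capability with the highest success rate for this node."""
        rates = {
            cap: wins / tries
            for (node, cap), (wins, tries) in self._stats.items()
            if node == business_node
        }
        if not rates:
            return None
        return max(rates, key=rates.get)


memory = MappingMemory()
memory.record("payment/lock-service", "concurrency-analysis", True)
memory.record("payment/lock-service", "concurrency-analysis", True)
memory.record("payment/lock-service", "code-analysis", True)
memory.record("payment/lock-service", "code-analysis", False)
```

With a lookup like `memory.best_capability("payment/lock-service")`, a scheduler needs only the node name and the returned capability, not the full history.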

5. Side-by-Side: Existing Tools vs. CBIM

How existing tools work

| Input | Description | Example |
| --- | --- | --- |
| Request | The task the user wants to accomplish right now | "Fix the timeout bug in the payment module" |
| Agent + Rules | A prompt that mixes "capability" and "business" together | "You are a backend expert familiar with payment systems..." |
| Current project | The entire workspace (usually the entire codebase) | The currently open project folder |

The flow: user submits request → tool stuffs request + mixed prompt + entire project context into the LLM → LLM generates an answer or executes an action.

The problem: the agent's "capability" and "business" are welded together; the workspace scope is "the entire project" with no precise targeting; there is no structural thinking layer in between.

How CBIM works

| Input | Description | How CBIM differs |
| --- | --- | --- |
| Request | The task the user wants to accomplish | Same starting point |
| Capability knowledge graph | An independent capability map: what can the agent do? | Capability and business decoupled, not mixed in a prompt |
| Business structure tree | An independent domain structure: how is the project decomposed? | Business knowledge externalized as a module tree, not written into prompts |
| Memory | Short-term (current session) + long-term (historical distillation) | Explicit Memory mechanism recording capability-business mappings |

The flow: user submits request → CBIM analyzes capability graph + business structure tree + memory → automatically generates a theoretically minimized task list.

Each item in that task list is itself minimized:

| Task Element | What minimization looks like |
| --- | --- |
| Requirement fragment | Not "fix payment module bug" but "locate lock granularity issue" and "verify timeout config" as separate atoms |
| Assigned capability | Not a "do-everything agent" but a specific capability carrying professional context (e.g., "concurrency analysis capability") |
| Target workspace | Not the entire project but a local workspace carrying business knowledge (e.g., "payment module / lock service sub-module") |

One-line summary:

| | Existing Tools | CBIM |
| --- | --- | --- |
| Core logic | Request → LLM (mixed prompt + full project) → answer | Request → capability graph + business graph + memory → minimized task list |
| Essence | One LLM facing a chaotic context directly | A scheduling engine decomposes first, then dispatches specialized capabilities to execute in precisely scoped workspaces |

Part 2: The Core Mechanism — Automatic Scheduling Based on Independence and Memory

| Step | Action | Output |
| --- | --- | --- |
| 1. Capability-Business identification | Distinguish which parts of the request depend on capability vs. which depend on business | Capability-dimension and Business-dimension labels |
| 2. Memory retrieval | Search historical experience for similar requests, effective capability combinations, known pitfalls | Historical pattern match results |
| 3. Dynamic composition | Compose the most appropriate capability set and business scope for this task | Execution framework for this task |
| 4. Auto-generate task list | Output an atomic execution plan: goals + assigned capability zones + locked minimal workspace | Theoretically minimized task list |

The user only needs to state a goal. Based on Capability-Business independence and Memory, CBIM automatically produces the leanest dispatch plan that fits the task.
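The four steps can be sketched as a toy pipeline. Everything here is an assumption made for illustration: the keyword tables stand in for the capability graph and business structure tree, the `MEMORY` dictionary stands in for long-term memory, and simple word matching replaces whatever analysis a real scheduler would perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    goal: str
    capability: str
    workspace: str


# Hypothetical lookup tables; a real system would not use keyword matching.
CAPABILITY_KEYWORDS = {"timeout": "concurrency-analysis", "bug": "problem-diagnosis"}
BUSINESS_KEYWORDS = {"payment": "payment"}
MEMORY = {("payment", "concurrency-analysis"): "payment/lock-service"}


def schedule(request: str) -> list:
    words = request.lower().split()
    # Step 1: label the request along the capability and business dimensions.
    capabilities = [CAPABILITY_KEYWORDS[w] for w in words if w in CAPABILITY_KEYWORDS]
    modules = [BUSINESS_KEYWORDS[w] for w in words if w in BUSINESS_KEYWORDS]
    tasks = []
    for module in modules:
        for capability in capabilities:
            # Step 2: ask memory for a narrower workspace seen in past work.
            workspace = MEMORY.get((module, capability), module)
            # Steps 3 and 4: pair capability with scope and emit an atomic task.
            tasks.append(Task(f"apply {capability} to {workspace}", capability, workspace))
    return tasks


plan = schedule("Fix the timeout bug in the payment module")
```

In this toy run the memory entry narrows the concurrency task to the lock service sub-module, while the diagnosis task falls back to the whole payment module because no narrower scope is recorded.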


Part 3: Four Core Design Principles

Principle 1: Capability-Business Decoupling

Counterexample: Write a "frontend agent" whose prompt deeply binds "I am a frontend expert (capability)" with "I handle React components (business)." Switch to a Flutter project, and the entire agent is obsolete.

CBIM in practice:

  • Capability side: split by "processing mode" into cortex zones (prefrontal, parietal, hippocampus, motor cortex). These zones are the agent's intrinsic structure, independent of any specific project.
  • Business side: organized by "domain structure" into a module tree (Workspace), where each node holds the knowledge and contracts of the corresponding business module.

The two are orthogonal and combined dynamically at runtime. Switch projects — the capability cortex zones don't change, you just load a new module tree.

Principle 2: Structure Externalization

Whatever the LLM cannot reliably hold in its head — don't make it memorize. Whatever needs to be deterministic — don't make it reason about.

  • Control flow externalization: extract the workflow from natural-language prompts and represent it with deterministic structures. The LLM makes decisions only where judgment is actually needed; workflow orchestration goes to a deterministic engine.
  • Big-picture externalization (module tree): don't require the LLM to understand the entire project at once. Externalize project structure as a zoomable module tree. The LLM operates on one local node at a time; global correctness is guaranteed through layered contracts.
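Control flow externalization can be illustrated with a fixed step sequence in which the model is consulted at exactly one decision point. The `run_workflow` function, the step names, and the `stub_judge` rule are hypothetical; the stub replaces a real model call so the sketch stays runnable.

```python
from typing import Callable

# A judgment function stands in for the LLM; here it is a plain Python stub.
Judge = Callable[[str], bool]


def run_workflow(change: str, needs_review: Judge) -> list:
    """Run a fixed step order; the only model-driven input is needs_review."""
    log = [f"locate module for: {change}"]
    log.append("load node contract")
    # Single decision point delegated to the model.
    if needs_review(change):
        log.append("request contract review")
    log.append("apply change")
    log.append("run module tests")
    return log


def stub_judge(change: str) -> bool:
    # Hypothetical rule replacing a real model call.
    return "interface" in change
```

Calling `run_workflow("rename interface method", stub_judge)` yields five steps and `run_workflow("fix typo", stub_judge)` yields four; the ordering itself never depends on model output.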

Principle 3: Structural Isomorphism

The same modeling idea recurs at different scales.

The local "prefrontal scheduling cortex zone" corresponds, in the networked version, to "a PM Agent scheduling specialized agents." (Covered in Part 3 of this series.)

Principle 4: Entropy-Reduction Loop

A system that only acts when invoked will degrade over time: memory bloat, knowledge drift, capability staleness.

The system needs idle-time self-maintenance — we call this the Dream loop. It distills short-term experience into long-term structured knowledge, countering entropy.

Dream is not a nice-to-have. It is a structural pillar of CBIM's sustainability — and the primary carrier of the Memory mechanism at the system level.
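A minimal sketch of one distillation pass, assuming a very simple promotion rule: a lesson that recurs in short-term memory is promoted to long-term knowledge, and the short-term buffer is then cleared. The `Memory` layout, the `dream` function, and the `min_repeats` threshold are illustrative assumptions, not the Dream loop's actual design.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Memory:
    short_term: list[tuple[str, str]] = field(default_factory=list)  # (module, lesson)
    long_term: dict[str, list[str]] = field(default_factory=dict)    # module -> lessons


def dream(memory: Memory, min_repeats: int = 2) -> None:
    """Promote lessons seen at least min_repeats times, then clear short-term."""
    counts = Counter(memory.short_term)
    for (module, lesson), seen in counts.items():
        if seen >= min_repeats:
            lessons = memory.long_term.setdefault(module, [])
            if lesson not in lessons:
                lessons.append(lesson)
    # Unpromoted entries are dropped so the buffer cannot grow without bound.
    memory.short_term.clear()


mem = Memory()
mem.short_term += [
    ("payment", "lock granularity too coarse"),
    ("payment", "lock granularity too coarse"),
    ("order", "one-off flaky test"),
]
dream(mem)
```

After the pass, the repeated payment lesson survives in `mem.long_term` and the one-off entry is gone, which is the entropy-reduction effect in miniature.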


Part 4: CBIM Doesn't Replace Cursor — We Solve Different Problems

| Dimension | Existing Tools (Claude Code, Cursor) | CBIM |
| --- | --- | --- |
| Core goal | Complete a complex task with high efficiency and high success rate | Enable independent evolution of Capability and Business, to sustainably govern large projects |
| Core mechanism | Large context window + model inference + engineered prompts | Structured capability graph + structured business graph + memory system |
| Meaning of "reuse" | Reuse conversation history, reuse code snippets | Reuse capabilities: switch projects, no need to relearn analysis skills |
| How context is handled | Fit in as much as possible (large window / RAG) | Capability and business are independent, so minimum-necessary context is natural |
| Long-term evolution | Dependent on model iteration and manual prompt maintenance | Capability and business evolve independently; Memory handles alignment and distillation |
| Success criterion | Did this task complete? | Is the project healthy over the long term? Is knowledge being distilled? Are capabilities reusable? |

An analogy:

| | Existing Tools | CBIM |
| --- | --- | --- |
| Role | Elite construction crew | Chief engineer + design standards + engineering archive |
| Good at | Laying walls fast, precise pours, efficiently coordinating workers | Drawing structural blueprints, defining standards, designing load-bearing systems, recording every change with rationale |
| Limitation | Building is livable, but the design may be unsound: "that's the architect's problem" | Cannot build directly; needs a construction crew to execute |

These are not competing replacements — they operate at different levels. The industry has trained the "construction crew" to extraordinary strength. It's time to consider giving every project a "chief engineer" and an "engineering archive."


Part 5: The Local Version Is the Foundation

This article focuses on CBIM-Local — structured scheduling capability inside a single agent. This is the root of the entire CBIM system.

The design is structurally isomorphic by construction: the local "scheduling hub cortex zone" corresponds, in the networked version, to "a PM Agent scheduling specialized agents." Once the local version is validated, the pattern extends to multi-agent teams (CBIM-Team), which will be covered in Part 3 of this series.

First things first: get the local version right.


Part 6: Beyond Tools — CBIM as an Architectural Hypothesis for the Virtual Team of the Future

CBIM's ultimate form is not to replace any existing specialized AI tool. It is to become the meta-architecture that organizes and orchestrates them.

C Nodes (Capability): Plug In Any Specialized AI Engine

Every Capability node can be any specialized AI tool or execution engine — Cursor, Claude Code, a static analysis tool, a test case generator, a performance profiler.

The core idea: don't reinvent wheels — define a standardized "capability plug-in spec." Whichever specialized engine is best at a given class of problem, CBIM registers it as a capability node. Just as a company recruits domain specialists, not only generalists.
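One way such a plug-in spec could be expressed is as a structural interface plus a registry. The article does not define the real spec, so `CapabilityEngine`, its two methods, `Registry`, and the `StaticAnalyzer` example are all hypothetical.

```python
from __future__ import annotations

from typing import Protocol


class CapabilityEngine(Protocol):
    """Hypothetical plug-in contract that any engine adapter would satisfy."""
    name: str

    def handles(self, problem_class: str) -> bool: ...
    def execute(self, task: str, workspace: str) -> str: ...


class StaticAnalyzer:
    """Example adapter; it only formats a string instead of running a tool."""
    name = "static-analyzer"

    def handles(self, problem_class: str) -> bool:
        return problem_class == "lint"

    def execute(self, task: str, workspace: str) -> str:
        return f"{self.name} ran '{task}' in {workspace}"


class Registry:
    def __init__(self) -> None:
        self._engines: list[CapabilityEngine] = []

    def register(self, engine: CapabilityEngine) -> None:
        self._engines.append(engine)

    def pick(self, problem_class: str) -> CapabilityEngine | None:
        # The first registered engine that claims the problem class wins.
        return next((e for e in self._engines if e.handles(problem_class)), None)


registry = Registry()
registry.register(StaticAnalyzer())
```

Because `Protocol` matching is structural, an adapter around an existing tool would not need to inherit from anything; it only has to expose the same attribute and methods.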

B Nodes (Business): Map to Any Workspace

Every Business node can be any concrete workspace or knowledge base — a folder inside a large project (src/payment, docs/api), a standalone repo, a cloud dev environment, a cloud folder with design documents.

Business nodes don't care about the specific form of their content — only their boundaries, responsibilities, and external contracts. They provide precise scoping for the AI's workspace.

The Hypothetical Endgame: A Future Virtual Team

A virtual team composed of countless Capability nodes (C) and countless Business nodes (B).

When a request arrives, CBIM's scheduling hub decomposes it and, drawing on Memory, automatically orchestrates: dispatching each sub-task to the most appropriate specialized AI engine, with temporary access granted to the specific business workspace it needs.
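"Temporary access" could be modeled as a grant that exists only for the duration of a sub-task. The sketch below is an assumption about how that might look: `WorkspaceGuard` and its methods are hypothetical, and the path check is purely lexical, so it illustrates scoping rather than providing a security boundary.

```python
from contextlib import contextmanager
from pathlib import PurePosixPath


class WorkspaceGuard:
    """Tracks which workspace roots are currently open to an engine."""

    def __init__(self) -> None:
        self._granted = set()

    @contextmanager
    def grant(self, root: str):
        path = PurePosixPath(root)
        self._granted.add(path)
        try:
            yield
        finally:
            # Access is revoked even if the sub-task raises.
            self._granted.discard(path)

    def allowed(self, file_path: str) -> bool:
        # Lexical check only: '..' segments and symlinks are not resolved.
        target = PurePosixPath(file_path)
        return any(root == target or root in target.parents for root in self._granted)


guard = WorkspaceGuard()
```

Inside `with guard.grant("src/payment"):` a file such as `src/payment/lock.py` is allowed while `src/order/cart.py` is not, and once the block exits nothing is allowed again.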

For the human: you only need to state the end goal. CBIM, as the "operating system of a virtual team," closes the loop of "find the right specialist + go to the right place + do the right thing" — automatically.

CBIM's ultimate hypothesis: the future of AI coding is not a monolithic all-knowing tool trying to understand an entire project, but a virtual organization dynamically composed of countless specialized capabilities and precisely scoped business contexts, collaborating on every task.


Part 7: Scope and Honesty

I have to be honest: CBIM is currently a design specification. There is no complete runtime dataset to validate it yet.

V1 (the prompt-based version) has been deployed and run through one validation cycle — some foundational conclusions hold. But V2, which implements native agents and a scheduling system on top of V1, involves structural changes whose long-term effects have not been tested under real load: the sustained effectiveness of the Dream self-maintenance mechanism, the actual cost of cross-project capability reuse, the stability of multi-agent coordination.

This article is not a "validated best-practices guide." It is an evolving record of design principles. I'm publishing it because I believe this direction is worth discussing and this problem is worth taking seriously.


Up next in this series:

Part 2 goes to the engineering level, covering the concrete implementation of CBIM-Local: the dual-layer architecture of cortex zones and module trees, behavior-tree-based graph engineering, and the initial design of the Dream loop — the primary carrier of the Memory mechanism. Stay tuned.
