Series intro: This is Part 1 of the series CBIM: From AI Coding Workflow to AgenticOS.
CBIM is an AI coding workflow architecture I designed. Its core is not "better prompt engineering" or "context management tricks" — it is a fundamental design goal: achieving independence between Capability and Business/Domain knowledge, sustained by structured Memory.
This series documents CBIM's design theory, engineering implementation, and evolution in full.
⚠️ Status: Design Draft / Theoretical Framework
This article presents the core design philosophy and principles of CBIM. It is a theoretical framework derived from validation experience with V1 (pure prompt-based version) and design extrapolation from V2 (building native agents from scratch). It is not a "best practices guide" validated at production scale. Some mechanisms have not yet been tested under real load. I'm publishing this because I believe honest discussion is the best way to stress-test and evolve these ideas.
Introduction: We Are at a Paradigm Crossroads
Over the past two years, AI coding has evolved rapidly — from "Vibe Coding" to "Harness Engineering." Tools like Claude Code and Cursor have achieved remarkable things at the execution layer: they can complete a complex coding task efficiently and with high success rates.
But when you shift your perspective from "a single task" to "a long-lived, evolving project," a deeper problem surfaces:
Why can't these highly capable AI tools actually understand a large codebase?
The root cause: the dominant paradigm in existing tools is task execution optimization. They care about completing the user's current request in the fewest steps with the highest success rate. They are powerful problem-solvers, but they are not qualified project members. They don't know the project's history, don't understand the implicit contracts between modules, and cannot maintain a global view across long-term collaboration.
This produces a systemic paradox: the more capable AI becomes at completing tasks, the faster project structure may decay. Every efficient "local optimum" decision is potentially planting a landmine in the overall architecture.
CBIM is my attempt to resolve this fundamental tension. Its goal is not to replace Cursor, but to answer an overlooked question:
How do we make AI not just "execute tasks," but actually "understand and govern a project"?
Part 1: The CBIM Core Formula
CBIM is not a prompt engineering framework, and it's not a context management tool. Its core objective can be expressed as a single formula:
CBIM = Capability × Business × Independence + Memory
1. Capability: Modeling Professional Domain Knowledge
Capability answers the question: what professional expertise does this agent have?
The modeling perspective mirrors real-world hiring logic. When a company hires, it looks at what skills the candidate brings — Unity development, web development, backend architecture? Even when you examine a candidate's experience on past projects, you're assessing their ability to handle similar problems — not importing their previous employer's business logic wholesale.
Capability is a professional asset that is reusable across projects.
| Capability Type | Examples | Transferability |
|---|---|---|
| Domain expertise | Unity dev, web frontend, backend architecture, DevOps | Still applicable when switching to another project in the same domain |
| General cognitive skills | Code analysis, architecture reasoning, problem diagnosis | Applicable to any tech stack, any project |
| Operational skills | File I/O, command execution, API calls | Independent of any specific business domain |
Capabilities are organized by "processing mode" (scheduling, reasoning, memory, execution). These are stable cognitive skills that do not change with business requirements. (The specific cortex zone architecture will be covered in Part 2 of this series.)
Regardless of implementation, the core principle is Capability-Business decoupling — capabilities are independently evolvable assets.
2. Business: Modeling Project-Specific Domain Knowledge
Business answers the question: what is the domain structure of the project this agent is currently working on?
Business knowledge is not reusable across projects. Every project has its own business logic, module decomposition, domain rules, and historical decisions. Even if two projects both have a "payment module," their interface contracts, business rules, and dependencies may be completely different.
Business knowledge is project-specific context that must grow from the current project itself.
| Business Element | Examples | Source |
|---|---|---|
| Module structure | Module tree, sub-module decomposition, dependency graph | Project design docs + code structure |
| Domain knowledge | Class diagrams, interface contracts, business rules | Project design docs + code comments |
| Historical decisions | Why was it designed this way? What are the known pitfalls? | Distilled by the Dream loop |
Business knowledge is organized by "domain structure," forming the project's knowledge map: payment module, order module, user module, and so on, with each node carrying its own interface contract and design constraints. (The module tree implementation will be covered in Part 2 of this series.)
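To make the "knowledge map" idea concrete, here is a minimal sketch of a module tree as a plain data structure. It is not the Part 2 implementation: the `ModuleNode` class, its fields, and the slash-separated path convention are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class ModuleNode:
    """One node of a hypothetical business module tree."""
    name: str
    contract: str = ""  # interface contract, kept as free text in this sketch
    constraints: list[str] = field(default_factory=list)
    children: dict[str, "ModuleNode"] = field(default_factory=dict)

    def add(self, child: "ModuleNode") -> "ModuleNode":
        """Attach a sub-module and return it so calls can be chained."""
        self.children[child.name] = child
        return child

    def find(self, path: str) -> Optional["ModuleNode"]:
        """Resolve a slash-separated path such as 'payment/lock-service'."""
        node = self
        for part in path.split("/"):
            node = node.children.get(part)
            if node is None:
                return None
        return node

    def walk(self, prefix: str = "") -> Iterator[str]:
        """Yield the path of every descendant, depth first."""
        for name, child in self.children.items():
            path = f"{prefix}/{name}" if prefix else name
            yield path
            yield from child.walk(path)


# Example tree using the modules named in the text; contract and constraint are invented.
root = ModuleNode("project")
payment = root.add(ModuleNode("payment", contract="charge(order_id) -> receipt"))
payment.add(ModuleNode("lock-service", constraints=["locks are per order, not global"]))
root.add(ModuleNode("order"))
root.add(ModuleNode("user"))
```

Calling `root.find("payment/lock-service")` returns only that node with its own constraints, which is the property the rest of the design relies on: a task can be pointed at one node without loading its siblings.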
3. Independence: Decoupling and Independent Evolution
Capability and Business should be orthogonal, decoupled, and independently evolvable.
| What Independence Looks Like | Explanation |
|---|---|
| Swap business, reuse capability | A "Unity dev" agent switching to a new game project: capability remains valid, just load the new project's module tree |
| Capability grows, business unchanged | Enhancing "code analysis capability" has no effect on existing module tree structure or business rules |
| Business evolves, capability adapts naturally | Adding new nodes to the module tree — existing architecture reasoning capability applies to them automatically |
| Same hiring logic | When you hire a "backend architect," you care about their backend skills, not the specific order table schema at their last company. CBIM models agents by exactly the same logic. |
Independence is the most fundamental difference between CBIM and the existing tools discussed here.
Existing tools fuse capability and business together: a typical "frontend agent" prompt is simultaneously a capability declaration and a business binding. CBIM separates them: capability is an asset, business is context. Assets accumulate and are reusable; context is replaceable and evolvable.
What does this mean in practice?
- When you switch from a web project to a mobile project, the agent's "code analysis" and "architecture reasoning" capabilities don't need to be relearned — only the business knowledge they operate on changes.
- When you want to enhance the agent's "test generation capability," it doesn't disturb the existing module tree structure or business boundaries.
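The two bullets above can be sketched as a pair of independent value types that an agent merely references. This is an illustrative model under my own assumptions: `Capability`, `BusinessContext`, `Agent`, and the project names are hypothetical and not part of any CBIM release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    mode: str  # processing mode: "scheduling", "reasoning", "memory" or "execution"


@dataclass(frozen=True)
class BusinessContext:
    project: str
    modules: tuple[str, ...]


@dataclass(frozen=True)
class Agent:
    capabilities: frozenset[Capability]
    business: BusinessContext

    def switch_project(self, new_business: BusinessContext) -> "Agent":
        """Swap business, reuse capability: the capability set is carried over as-is."""
        return Agent(self.capabilities, new_business)

    def add_capability(self, capability: Capability) -> "Agent":
        """Grow capability, leave business unchanged."""
        return Agent(self.capabilities | {capability}, self.business)


skills = frozenset({
    Capability("code analysis", "reasoning"),
    Capability("architecture reasoning", "reasoning"),
})
web = BusinessContext("web-shop", ("payment", "order", "user"))
mobile = BusinessContext("mobile-app", ("onboarding", "sync"))

agent = Agent(skills, web)
moved = agent.switch_project(mobile)
grown = agent.add_capability(Capability("test generation", "reasoning"))
```

Because neither type holds a reference to the other, changing one side cannot silently change the other. That is a much weaker guarantee than real orthogonality, but it shows where the seam sits.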
4. Memory: The Load-Bearing Mechanism for Independence
Without structured Memory, Capability-Business independence cannot be realized in practice. Memory is responsible for:
| Function | Explanation |
|---|---|
| Recording mapping relationships | Which capabilities work best when handling which kinds of business? What does the historical experience say? |
| Distilling experiential knowledge | How do successful and failed cases get converted into reusable knowledge, rather than rotting in logs? |
| Maintaining alignment | When capability and business each evolve independently, Memory ensures they don't drift out of sync. |
| Fighting entropy | Short-term memory accumulates continuously. It must be distilled into structured long-term knowledge, or it turns into noise. |
Context minimization is not the goal — it's the outcome.
When capabilities and business domains are correctly defined and kept independent, and Memory supplies the right mapping relationships, the context required for any single scheduling or reasoning operation should, in theory, already be close to the minimum.
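As a rough illustration of the "recording mapping relationships" function, the sketch below keeps a success count per (module, capability) pair and ranks capabilities for a given module. The `Outcome` record, the `MappingMemory` class, and ranking by success rate are assumptions chosen for brevity; the actual Memory design is not specified in this article.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    capability: str
    module: str
    success: bool


class MappingMemory:
    """Hypothetical store of which capability worked on which business module."""

    def __init__(self) -> None:
        # (module, capability) -> [successes, attempts]
        self._stats: dict[tuple[str, str], list[int]] = defaultdict(lambda: [0, 0])

    def record(self, outcome: Outcome) -> None:
        stats = self._stats[(outcome.module, outcome.capability)]
        stats[0] += int(outcome.success)
        stats[1] += 1

    def best_capabilities(self, module: str, top: int = 1) -> list[str]:
        """Rank capabilities for a module by success rate, ties broken by name."""
        scored = [
            (successes / attempts, capability)
            for (mod, capability), (successes, attempts) in self._stats.items()
            if mod == module
        ]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [capability for _, capability in scored[:top]]


memory = MappingMemory()
history = [
    ("concurrency analysis", True),
    ("concurrency analysis", True),
    ("log inspection", True),
    ("log inspection", False),
]
for capability, ok in history:
    memory.record(Outcome(capability, "payment", ok))

best = memory.best_capabilities("payment")
```

With the invented history above, `best` contains only "concurrency analysis", so a scheduler consulting this memory would load that one capability for the payment module rather than every capability it knows.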
5. Side-by-Side: Existing Tools vs. CBIM
How existing tools work
| Input | Description | Example |
|---|---|---|
| Request | The task the user wants to accomplish right now | "Fix the timeout bug in the payment module" |
| Agent + Rules | A prompt that mixes "capability" and "business" together | "You are a backend expert familiar with payment systems..." |
| Current project | The entire workspace (usually the entire codebase) | The currently open project folder |
The flow: user submits request → tool stuffs request + mixed prompt + entire project context into the LLM → LLM generates an answer or executes an action.
The problem: the agent's "capability" and "business" are welded together; the workspace scope is "the entire project" with no precise targeting; there is no structural thinking layer in between.
How CBIM works
| Input | Description | How CBIM differs |
|---|---|---|
| Request | The task the user wants to accomplish | Same starting point |
| Capability knowledge graph | An independent capability map: what can the agent do? | Capability and business decoupled, not mixed in a prompt |
| Business structure tree | An independent domain structure: how is the project decomposed? | Business knowledge externalized as a module tree, not written into prompts |
| Memory | Short-term (current session) + long-term (historical distillation) | Explicit Memory mechanism recording capability-business mappings |
The flow: user submits request → CBIM analyzes capability graph + business structure tree + memory → automatically generates a theoretically minimized task list.
Each item in that task list is itself minimized:
| Task Element | What minimization looks like |
|---|---|
| Requirement fragment | Not "fix payment module bug" but "locate lock granularity issue" and "verify timeout config" as separate atoms |
| Assigned capability | Not a "do-everything agent" but a specific capability carrying professional context (e.g., "concurrency analysis capability") |
| Target workspace | Not the entire project but a local workspace carrying business knowledge (e.g., "payment module / lock service sub-module") |
One-line summary:
| | Existing Tools | CBIM |
|---|---|---|
| Core logic | Request → LLM (mixed prompt + full project) → answer | Request → capability graph + business graph + memory → minimized task list |
| Essence | One LLM facing a chaotic context directly | A scheduling engine decomposes first, then dispatches specialized capabilities to execute in precisely scoped workspaces |
Part 2: The Core Mechanism — Automatic Scheduling Based on Independence and Memory
| Step | Action | Output |
|---|---|---|
| 1. Capability-Business identification | Distinguish which parts of the request depend on capability vs. which depend on business | Capability-dimension and Business-dimension labels |
| 2. Memory retrieval | Search historical experience for similar requests, effective capability combinations, known pitfalls | Historical pattern match results |
| 3. Dynamic composition | Compose the most appropriate capability set and business scope for this task | Execution framework for this task |
| 4. Auto-generate task list | Output an atomic execution plan: goals + assigned capability zones + locked minimal workspace | Theoretically minimized task list |
The user only needs to state a goal. CBIM, based on Capability-Business independence and Memory, automatically produces the leanest appropriate dispatch plan.
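The four steps in the table can be sketched end to end as below. Everything here is a stand-in: the keyword tables replace the capability graph and business tree, `PITFALLS` replaces Memory, and the request is assumed to arrive already split into fragments, which in practice is the hard part and would need model judgment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskItem:
    goal: str
    capability: str
    workspace: str


# Hypothetical stand-ins for the capability graph, business tree and Memory.
CAPABILITY_BY_KEYWORD = {"lock": "concurrency analysis", "timeout": "config inspection"}
WORKSPACE_BY_KEYWORD = {"lock": "payment/lock-service", "timeout": "payment/config"}
PITFALLS = {"payment/lock-service": "a global lock caused contention before"}


def identify(fragment: str) -> Optional[tuple[str, str]]:
    """Step 1: label a fragment with its capability and business dimensions."""
    for keyword, capability in CAPABILITY_BY_KEYWORD.items():
        if keyword in fragment:
            return capability, WORKSPACE_BY_KEYWORD[keyword]
    return None


def retrieve(workspace: str) -> Optional[str]:
    """Step 2: look up historical experience for the targeted workspace."""
    return PITFALLS.get(workspace)


def schedule(fragments: list[str]) -> list[TaskItem]:
    """Steps 3 and 4: compose capability and scope per fragment, emit the task list."""
    tasks = []
    for fragment in fragments:
        labels = identify(fragment)
        if labels is None:
            continue  # no match in this sketch; a real scheduler would escalate
        capability, workspace = labels
        pitfall = retrieve(workspace)
        goal = fragment if pitfall is None else f"{fragment} (known pitfall: {pitfall})"
        tasks.append(TaskItem(goal, capability, workspace))
    return tasks


plan = schedule(["locate lock granularity issue", "verify timeout config"])
```

Each resulting `TaskItem` carries exactly one capability and one narrow workspace path, which is the shape of the "theoretically minimized task list" described above.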
Part 3: Four Core Design Principles
Principle 1: Capability-Business Decoupling
Counterexample: Write a "frontend agent" whose prompt deeply binds "I am a frontend expert (capability)" with "I handle React components (business)." Switch to a Flutter project, and the entire agent is obsolete.
CBIM in practice:
- Capability side: split by "processing mode" into cortex zones (prefrontal, parietal, hippocampus, motor cortex). These zones are the agent's intrinsic structure, independent of any specific project.
- Business side: organized by "domain structure" into a module tree (Workspace), where each node holds the knowledge and contracts of the corresponding business module.
The two are orthogonal and combined dynamically at runtime. Switch projects — the capability cortex zones don't change, you just load a new module tree.
Principle 2: Structure Externalization
Whatever the LLM cannot reliably hold in its head — don't make it memorize. Whatever needs to be deterministic — don't make it reason about.
- Control flow externalization: extract the workflow from natural-language prompts and represent it with deterministic structures. The LLM makes decisions only where judgment is actually needed; workflow orchestration goes to a deterministic engine.
- Big-picture externalization (module tree): don't require the LLM to understand the entire project at once. Externalize project structure as a zoomable module tree. The LLM operates on one local node at a time; global correctness is guaranteed through layered contracts.
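Control flow externalization can be illustrated with a tiny table-driven workflow in which the model is consulted only at decision nodes. This is a plain state machine written for this example, not the behavior-tree engine planned for Part 2; the `WORKFLOW` table, step names, and the `stub_judge` function standing in for an LLM call are all hypothetical.

```python
from typing import Callable

# A judge receives the current state and returns a branch label.
# In a real system this is where the LLM would be called.
Judge = Callable[[dict], str]

WORKFLOW = {
    "start": {"kind": "action", "run": "collect_logs", "next": "triage"},
    "triage": {"kind": "decision", "branches": {"config": "fix_config", "code": "fix_code"}},
    "fix_config": {"kind": "action", "run": "patch_config", "next": "done"},
    "fix_code": {"kind": "action", "run": "patch_code", "next": "done"},
}


def run_workflow(workflow: dict, judge: Judge, state: dict) -> list[str]:
    """Walk the workflow deterministically; call `judge` only at decision nodes."""
    trace = []
    node = "start"
    while node != "done":
        step = workflow[node]
        if step["kind"] == "action":
            trace.append(step["run"])
            node = step["next"]
        else:
            choice = judge(state)
            # An unknown label raises KeyError instead of letting the model invent a path.
            node = step["branches"][choice]
    return trace


def stub_judge(state: dict) -> str:
    """Deterministic placeholder for the model's judgment."""
    return "config" if "timeout" in state["symptom"] else "code"


trace = run_workflow(WORKFLOW, stub_judge, {"symptom": "timeout on charge"})
```

The ordering of steps lives in the table, so it cannot drift with prompt wording. The only thing the judge can influence is which pre-declared branch is taken.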
Principle 3: Structural Isomorphism
The same modeling idea recurs at different scales.
The local "prefrontal scheduling cortex zone" corresponds, in the networked version, to "a PM Agent scheduling specialized agents." (Covered in Part 3 of this series.)
Principle 4: Entropy-Reduction Loop
A system that only acts when invoked will degrade over time: memory bloat, knowledge drift, capability staleness.
The system needs idle-time self-maintenance — we call this the Dream loop. It distills short-term experience into long-term structured knowledge, countering entropy.
Dream is not a nice-to-have. It is a structural pillar of CBIM's sustainability — and the primary carrier of the Memory mechanism at the system level.
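A deliberately simple sketch of one Dream pass follows. It assumes that "distillation" can be approximated by promoting lessons that recur in short-term memory, which is my simplification and not the mechanism the series will describe; the `Memory` layout and the `min_occurrences` threshold are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Memory:
    # Short-term entries are (module, lesson) pairs appended during sessions.
    short_term: list[tuple[str, str]] = field(default_factory=list)
    # Long-term knowledge is keyed by module.
    long_term: dict[str, list[str]] = field(default_factory=dict)


def dream(memory: Memory, min_occurrences: int = 2) -> None:
    """Promote lessons seen at least `min_occurrences` times, then clear short-term memory."""
    counts = Counter(memory.short_term)
    for (module, lesson), seen in counts.items():
        if seen >= min_occurrences:
            lessons = memory.long_term.setdefault(module, [])
            if lesson not in lessons:
                lessons.append(lesson)
    # One-off observations are dropped here; this is the entropy-reduction step.
    memory.short_term.clear()


memory = Memory()
memory.short_term += [
    ("payment", "lock per order, not globally"),
    ("payment", "lock per order, not globally"),
    ("order", "retry once on 503"),
]
dream(memory)
```

After the pass, the repeated payment lesson sits in long-term memory and the short-term list is empty. A real Dream loop would presumably summarize and merge lessons rather than count exact duplicates, but the shape is the same: bounded short-term input, structured long-term output.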
Part 4: CBIM Doesn't Replace Cursor — We Solve Different Problems
| Dimension | Existing Tools (Claude Code, Cursor) | CBIM |
|---|---|---|
| Core goal | Complete a complex task with high efficiency and high success rate | Enable independent evolution of Capability and Business, to sustainably govern large projects |
| Core mechanism | Large context window + model inference + engineered prompts | Structured capability graph + structured business graph + memory system |
| Meaning of "reuse" | Reuse conversation history, reuse code snippets | Reuse capabilities: switch projects, no need to relearn analysis skills |
| How context is handled | Fit in as much as possible (large window / RAG) | Capability and business are independent, so minimum-necessary context is natural |
| Long-term evolution | Dependent on model iteration and manual prompt maintenance | Capability and business evolve independently; Memory handles alignment and distillation |
| Success criterion | Did this task complete? | Is the project long-term healthy? Is knowledge being distilled? Are capabilities reusable? |
An analogy:
| | Existing Tools | CBIM |
|---|---|---|
| Role | Elite construction crew | Chief engineer + design standards + engineering archive |
| Good at | Laying walls fast, precise pours, efficiently coordinating workers | Drawing structural blueprints, defining standards, designing load-bearing systems, recording every change with rationale |
| Limitation | Building is livable, but the design may be unsound — "that's the architect's problem" | Cannot build directly — needs a construction crew to execute |
These are not competing replacements — they operate at different levels. The industry has trained the "construction crew" to extraordinary strength. It's time to consider giving every project a "chief engineer" and an "engineering archive."
Part 5: The Local Version Is the Foundation
This article focuses on CBIM-Local — structured scheduling capability inside a single agent. This is the root of the entire CBIM system.
The design is structurally isomorphic by construction: the local "scheduling hub cortex zone" corresponds, in the networked version, to "a PM Agent scheduling specialized agents." Once the local version is validated, the pattern extends to multi-agent teams (CBIM-Team), which will be covered in Part 3 of this series.
First things first: get the local version right.
Part 6: Beyond Tools — CBIM as an Architectural Hypothesis for the Virtual Team of the Future
CBIM's ultimate form is not to replace any existing specialized AI tool. It is to become the meta-architecture that organizes and orchestrates them.
C Nodes (Capability): Plug In Any Specialized AI Engine
Every Capability node can be any specialized AI tool or execution engine — Cursor, Claude Code, a static analysis tool, a test case generator, a performance profiler.
The core idea: don't reinvent wheels — define a standardized "capability plug-in spec." Whichever specialized engine is best at a given class of problem, CBIM registers it as a capability node. Just as a company recruits domain specialists, not only generalists.
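One way a "capability plug-in spec" could look is a structural interface that any engine adapter satisfies. No such spec has been published, so the `CapabilityNode` protocol, its two methods, and the `StaticAnalyzer` adapter below are purely hypothetical.

```python
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class CapabilityNode(Protocol):
    """Hypothetical plug-in contract that every registered engine satisfies."""
    name: str

    def can_handle(self, task_kind: str) -> bool: ...

    def execute(self, task: str, workspace: str) -> str: ...


class StaticAnalyzer:
    """Example adapter; a real one would wrap an actual tool."""
    name = "static-analyzer"

    def can_handle(self, task_kind: str) -> bool:
        return task_kind == "lint"

    def execute(self, task: str, workspace: str) -> str:
        return f"[{self.name}] {task} in {workspace}"


class Registry:
    def __init__(self) -> None:
        self._nodes: list[CapabilityNode] = []

    def register(self, node: object) -> None:
        # runtime_checkable verifies that the members exist, not their signatures.
        if not isinstance(node, CapabilityNode):
            raise TypeError(f"{node!r} does not satisfy the CapabilityNode spec")
        self._nodes.append(node)

    def pick(self, task_kind: str) -> Optional[CapabilityNode]:
        """Return the first registered node that claims the task kind, if any."""
        return next((n for n in self._nodes if n.can_handle(task_kind)), None)


registry = Registry()
registry.register(StaticAnalyzer())
result = registry.pick("lint").execute("check unused imports", "src/payment")
```

Using a structural protocol rather than a base class means an existing tool only needs a thin adapter exposing these members; it does not have to inherit from anything CBIM-specific.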
B Nodes (Business): Map to Any Workspace
Every Business node can be any concrete workspace or knowledge base — a folder inside a large project (src/payment, docs/api), a standalone repo, a cloud dev environment, a cloud folder with design documents.
Business nodes don't care about the specific form of their content — only their boundaries, responsibilities, and external contracts. They provide precise scoping for the AI's workspace.
The Hypothetical Endgame: A Future Virtual Team
A virtual team composed of countless Capability nodes (C) and countless Business nodes (B).
When a request arrives, CBIM's scheduling hub decomposes it and, drawing on Memory, automatically orchestrates: dispatching each sub-task to the most appropriate specialized AI engine, with temporary access granted to the specific business workspace it needs.
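"Temporary access to the specific business workspace" can be sketched as a grant that is valid only inside a `with` block and only under one path. The `WorkspaceGrant` class and `temporary_access` helper are assumptions for illustration; the check is purely lexical and does not resolve `..` segments or symlinks.

```python
from contextlib import contextmanager
from pathlib import PurePosixPath
from typing import Iterator


class WorkspaceGrant:
    """Hypothetical permission object scoped to one business workspace."""

    def __init__(self, root: str) -> None:
        self.root = PurePosixPath(root)
        self.active = True

    def allows(self, path: str) -> bool:
        """True while the grant is active and `path` is the root or lies beneath it."""
        target = PurePosixPath(path)
        return self.active and (target == self.root or self.root in target.parents)


@contextmanager
def temporary_access(root: str) -> Iterator[WorkspaceGrant]:
    """Yield a grant for `root` and revoke it when the block exits."""
    grant = WorkspaceGrant(root)
    try:
        yield grant
    finally:
        grant.active = False


with temporary_access("src/payment") as grant:
    inside_ok = grant.allows("src/payment/lock.py")
    outside_ok = grant.allows("src/order/model.py")

after_ok = grant.allows("src/payment/lock.py")
```

Here `inside_ok` is true, while `outside_ok` and `after_ok` are both false: the engine can touch the payment workspace during the sub-task and nothing else, before or after.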
For the human: you only need to state the end goal. CBIM, as the "operating system of a virtual team," closes the loop of "find the right specialist + go to the right place + do the right thing" — automatically.
CBIM's ultimate hypothesis: the future of AI coding is not a monolithic all-knowing tool trying to understand an entire project, but a virtual organization dynamically composed of countless specialized capabilities and precisely scoped business contexts, collaborating on every task.
Part 7: Scope and Honesty
I have to be honest: CBIM is currently a design specification. There is no complete runtime dataset to validate it yet.
V1 (the prompt-based version) has been deployed and run through one validation cycle — some foundational conclusions hold. But V2, which implements native agents and a scheduling system on top of V1, involves structural changes whose long-term effects have not been tested under real load: the sustained effectiveness of the Dream self-maintenance mechanism, the actual cost of cross-project capability reuse, the stability of multi-agent coordination.
This article is not a "validated best-practices guide." It is an evolving record of design principles. I'm publishing it because I believe this direction is worth discussing and this problem is worth taking seriously.
Up next in this series:
Part 2 goes to the engineering level, covering the concrete implementation of CBIM-Local: the dual-layer architecture of cortex zones and module trees, behavior-tree-based graph engineering, and the initial design of the Dream loop — the primary carrier of the Memory mechanism. Stay tuned.



