The architecture behind Factory de Odoo: a 28-agent pipeline that guarantees cross-module coherence across a full ERP module suite
Every time I tried using an AI code generator for Odoo ERP modules, I ran into the same wall. The individual files looked reasonable. Models were declared, views were structured, security rules were present. But as soon as I looked at the codebase as a whole, the cracks appeared: a Many2one field in Module 8 pointing at a model that Module 3 never actually declared. A security group defined twice in two different modules with slightly different XML IDs. Menu items from Module 6 and Module 11 both trying to parent themselves to a root menu that didn't exist yet. The generator hadn't failed. The output had.
The root cause is architectural: most code generators — AI-assisted or not — have no persistent shared state across files. Each file is generated in isolation, with only whatever fits in the current context window as "memory" of what came before. On a small project, this is manageable. On a 12-module ERP customization, it's a maintenance nightmare.
Factory de Odoo — https://github.com/TIFAQM/Factory-de-Odoo — is my attempt to solve this at the architecture level, not the prompt level.
1. The Problem: Code Generators Without Memory
The failure mode is easy to reproduce. Ask any AI code generator to build a multi-module Odoo addon suite and watch what happens around module four or five. The generator starts making assumptions it can no longer verify: that a particular model exists, that a security group was declared upstream, that a menu root was created somewhere earlier in the pipeline. Because there is no authoritative record of what has already been generated, those assumptions are essentially guesses.
This produces three distinct categories of failure.
Phantom references are the most common. A field like `partner_id = fields.Many2one('res.partner', ...)` is fine. But `project_phase_id = fields.Many2one('project.phase', ...)` in a module that was generated before `project.phase` was ever declared anywhere — that's a phantom reference. At install time, Odoo will either fail silently or raise a hard error. Either way, the developer is left hunting through a codebase they didn't write, trying to find a model that doesn't exist.
Duplicate declarations are subtler and often more damaging. A security group declared as base_module.group_manager in one module and base_module_v2.group_manager in another won't cause an immediate crash — but it will cause inconsistent access control behavior that's nearly impossible to trace without knowing both declarations exist. The same problem surfaces with duplicate menu item XML IDs, duplicate action names, and duplicate record rules with overlapping domains.
Broken hierarchies are specific to Odoo's menu system and its inherited view mechanism. Odoo builds its navigation tree from parent-child relationships between menu items. If Module 6 declares a menu item with parent_id="base_module.root_menu" but root_menu doesn't exist at the time Module 6 is installed — or worse, is defined in Module 9, which installs after Module 6 — the entire navigation breaks. AI generators that don't track menu structure globally will produce this kind of dependency inversion routinely.
The tempting answer is "just use a bigger context window." Feed all twelve modules into one prompt and let the model see everything at once. This doesn't actually work at scale for several reasons. Large context windows degrade in quality as they fill — models lose track of details introduced thousands of tokens earlier. The window is read-only: there's no mechanism for an agent generating Module 7 to update a shared record that an agent generating Module 10 can later query. And for a full ERP suite, the total codebase easily exceeds what any current context window can hold without compression that destroys the precision you need for code generation. The real problem isn't memory size. It's the absence of a write-capable shared state.
2. The Solution: A Model Registry as Shared Agent State
The central architectural insight behind Factory de Odoo is this: the data a generator needs to produce coherent code across modules is not large, and it's highly structured. You don't need agents to share full file contents. You need them to share a registry of declared entities — models, fields, relations, security groups, menu items — that every agent can read from and write to before emitting any code.
The registry is a typed, validated data structure that persists across the entire generation pipeline. It stores:
- **Model declarations**: every `_name` value declared across all modules, with its module of origin, its inheritance chain, and its field inventory
- **Field metadata**: for each field, its type, its `comodel_name` (for relational fields), its `compute` method if applicable, and whether it has a corresponding inverse
- **Computed fields and dependencies**: the full `depends` chain for each `@api.depends` declaration, validated against the fields that actually exist
- **Security groups**: canonical XML IDs for every group declaration, with the module that owns it
- **Menu structure**: the full menu tree, with declared parents and expected install-order dependencies
- **Inter-module relations**: the directed graph of which modules reference which models from which other modules
This is not a novel concept in software engineering. It's a symbol table — the same structure a compiler uses to track declared identifiers across compilation units. The insight is applying it to AI-assisted code generation, where the "compilation units" are LLM generations and the "compiler" is a pipeline that needs to catch semantic errors before they reach the filesystem.
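As a rough illustration, such a registry can be sketched with a few typed entries. The class and field names below are hypothetical, not the project's actual API:

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class FieldEntry:
    name: str
    ftype: str                        # e.g. "many2one", "char"
    comodel_name: str | None = None   # set for relational fields
    compute: str | None = None        # compute method name, if any

@dataclass
class ModelEntry:
    name: str                         # Odoo _name, e.g. "project.phase"
    module: str                       # module of origin
    inherit: list[str] = field(default_factory=list)
    fields: dict[str, FieldEntry] = field(default_factory=dict)

@dataclass
class Registry:
    models: dict[str, ModelEntry] = field(default_factory=dict)
    groups: dict[str, str] = field(default_factory=dict)        # XML ID -> owning module
    menus: dict[str, str | None] = field(default_factory=dict)  # XML ID -> parent XML ID

    def lookup_model(self, name: str) -> ModelEntry | None:
        return self.models.get(name)
```

The point is how small this is: a handful of dictionaries keyed by canonical names covers everything an agent needs to verify before emitting code.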
The read/write protocol is strict. Before an agent writes any file, it queries the registry for every external reference in the code it's about to generate. A Many2one field to project.phase requires a registry lookup: does project.phase exist? Which module declared it? Is that module declared as a dependency in the current module's manifest? If any of these checks fail, the agent does not emit the code. It raises a registry error that the orchestrator handles — either by reordering generation to declare the model first, or by flagging a genuine design inconsistency for the developer to resolve.
After a file is written, the agent writes its declarations back to the registry. Model name → module. Field list → model entry. Security group XML ID → group entry. This write is transactional: if validation fails, the write does not commit, and the registry remains in its last known good state.
The "no phantom reference" guarantee is a consequence of this protocol, not an extra check bolted on top. An agent literally cannot emit a reference to a model that hasn't been registered, because the field type constructor for relational fields performs a registry lookup at construction time. If the lookup returns empty, construction fails. This moves phantom reference detection from runtime (Odoo install) to generation time, when fixing it is cheap.
Collision detection is implemented at the registry write layer. When an agent attempts to register base_module.group_manager, the registry checks whether that XML ID already exists. If it does, and the registering module is different from the owning module, the write raises a collision error. The orchestrator logs it, and the agent is retried with an instruction to use the existing group rather than redeclare it. This eliminates duplicate security groups, duplicate menu IDs, and duplicate action names before they ever reach XML.
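A minimal sketch of the lookup and collision checks described above — simplified, with hypothetical names; the real registry also validates manifests, menu trees, and more:

```python
class RegistryError(Exception):
    """Raised when a lookup or write violates a registry invariant."""

class CollisionError(RegistryError):
    """An XML ID is already owned by a different module."""

class Registry:
    def __init__(self):
        self.models = {}   # Odoo _name -> owning module
        self.groups = {}   # XML ID -> owning module

    def resolve_comodel(self, comodel_name, current_module):
        # Phantom-reference check: the target model must already be registered
        # before any relational field pointing at it can be emitted.
        owner = self.models.get(comodel_name)
        if owner is None:
            raise RegistryError(
                f"{current_module}: relational target {comodel_name!r} "
                f"is not declared anywhere"
            )
        return owner

    def register_group(self, xml_id, module):
        # Collision check: an XML ID may only ever have one owning module.
        owner = self.groups.get(xml_id)
        if owner is not None and owner != module:
            raise CollisionError(f"{xml_id!r} already owned by {owner!r}")
        self.groups[xml_id] = module
```

In this scheme the orchestrator catches `RegistryError` subclasses and decides whether to reorder generation, reuse the existing declaration, or surface the conflict to the developer.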
3. The Two-Layer Agent Architecture
Factory de Odoo uses 28 agents organized into two distinct layers. Understanding why requires understanding the difference between decisions that are inherently cross-module and decisions that are inherently per-module.
The Orchestrator Layer (19 agents)
The orchestrator layer handles everything that requires a view of the whole system. Its agents are responsible for:
Domain decomposition: Taking a plain-English description of an ERP customization and breaking it into a coherent set of modules with explicit boundaries. This is the step where "we need to manage project phases and link them to sales orders" becomes a decision about whether project.phase belongs in a new project_phase module or extends the existing project module, and what the dependency graph between modules looks like.
Registry initialization: Before any code is generated, the orchestrator runs a planning pass that initializes the registry with the declared skeleton of every module — names, dependency relationships, and high-level model outlines. This means that when the pipeline layer starts generating Module 3, the registry already knows that Module 7 intends to declare project.phase. Module 3's agents can therefore safely create a Many2one to it, with the registry flagging a dependency that must be validated once Module 7's generation is complete.
Cross-module planning: Decisions like "which module owns the root menu item" and "which module declares the base security group that others inherit from" are made at the orchestrator layer and written to the registry before per-module generation begins. This eliminates the category of errors caused by two modules independently trying to own shared infrastructure.
Conflict resolution: When pipeline agents raise registry errors, the orchestrator decides how to resolve them — reorder generation, consolidate declarations, or surface the conflict to the developer as an intentional design question.
The orchestrator layer's 19 agents are not all running simultaneously. They execute in a defined sequence that mirrors the phases of architectural planning: domain analysis → module decomposition → dependency graph construction → registry initialization → pipeline handoff.
The Pipeline Layer (9 agents)
The pipeline layer handles per-module generation. Each of the 9 pipeline agents is responsible for one aspect of a module's output:
- **Model agent**: Generates `models/*.py` files, writing model and field declarations to the registry after each file
- **View agent**: Generates `views/*.xml` files, querying the registry to confirm that every field referenced in a view exists in the model it's bound to
- **Security agent**: Generates `security/ir.model.access.csv` and security group XML, querying the registry for existing groups before declaring new ones
- **Menu agent**: Generates menu item XML, querying the registry for the menu tree to validate parent references
- **Wizard agent**: Generates transient models and their views, with the same registry protocol as the model and view agents
- **Report agent**: Generates QWeb report templates and their actions
- **Test agent**: Generates unit tests, querying the registry for the full model and field inventory of the module under test
- **i18n agent**: Generates `.pot` translation templates by scanning all generated string literals
- **Manifest agent**: Generates `__manifest__.py`, querying the registry for the module's dependency list as established by the orchestrator
The pipeline agents are deliberately narrow. Each one knows how to generate one kind of artifact, query the registry for what it needs, and write its declarations back. They don't coordinate directly with each other — the registry mediates all cross-agent information exchange. This keeps individual agents simple and testable.
Knowledge Files: Keeping Agents Odoo-Idiomatic
A pure LLM approach to code generation will produce technically valid Python that is not Odoo-idiomatic. Odoo has strong conventions — about how computed fields declare their dependencies, how view inheritance uses XPath, how ACL lines are structured, how _sql_constraints interact with form validation — that are not well represented in general Python training data.
Factory de Odoo ships with over 80 WRONG/CORRECT example pairs that are injected into each agent's system prompt for the relevant artifact type. These aren't just style guidelines. They are executable specifications: the "WRONG" example is real code that would pass a syntax check and fail at Odoo install time or produce incorrect behavior, and the "CORRECT" example is the pattern the agent must use instead.
Examples include: the correct way to declare a Many2many with an explicit relation table name to avoid conflicts; how _inherit differs from _name in model extension; why @api.constrains cannot call sudo() without specific justification; and how to structure _sql_constraints so they don't shadow Python-level validation in confusing ways. This domain knowledge is what separates generated code that installs from generated code that works.
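An illustrative pair in the same spirit — paraphrased rather than taken verbatim from the repo — for the computed-field convention mentioned above:

```python
# WRONG: a stored computed field whose compute method declares no
# dependencies. It installs cleanly, but the ORM never knows when to
# recompute, so stored totals silently go stale.
total = fields.Float(compute='_compute_total', store=True)

def _compute_total(self):
    for rec in self:
        rec.total = rec.qty * rec.price

# CORRECT: @api.depends tells the ORM to invalidate and recompute
# whenever qty or price changes.
total = fields.Float(compute='_compute_total', store=True)

@api.depends('qty', 'price')
def _compute_total(self):
    for rec in self:
        rec.total = rec.qty * rec.price
```

(The fragment assumes an Odoo model body with `fields` and `api` imported from `odoo`; it is not runnable standalone.)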
4. The Auto-Fix Pipeline
Generated code that passes registry validation still has to run through Odoo's own checks and a linting pass before it can be considered complete. Factory de Odoo includes a deterministic post-processing pipeline that handles the class of errors that are mechanical to fix but tedious to catch manually.
The auto-fix pipeline operates in three stages.
Pylint-odoo violations are caught first. pylint-odoo is the standard linter for Odoo addons, and it has opinions about things like: missing __init__.py exports, field definitions missing string attributes, methods without docstrings in specific contexts, and translation function usage. The pipeline runs pylint-odoo against every generated file and applies a rule-based fixer for each violation category. Violations that can't be mechanically fixed are reported as warnings with the relevant pylint error code, so the developer knows exactly what needs human attention.
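As a toy illustration of what a rule-based fixer for one violation category might look like — the regex and behavior here are simplified assumptions, not the project's actual fixer, which keys off pylint-odoo's reported message IDs:

```python
import re

# Match a Char/Text field declaration so we can check its arguments.
FIELD_RE = re.compile(
    r"^(?P<indent>\s*)(?P<name>\w+)\s*=\s*fields\.(?P<type>Char|Text)\((?P<args>[^)]*)\)",
    re.MULTILINE,
)

def add_missing_strings(source: str) -> str:
    """Add an explicit string= label to Char/Text fields that lack one."""
    def fix(match: re.Match) -> str:
        args = match.group("args")
        if "string=" in args:
            return match.group(0)   # already labelled, leave untouched
        # Derive a human-readable label from the field name.
        label = match.group("name").replace("_", " ").title()
        sep = ", " if args.strip() else ""
        return (f"{match.group('indent')}{match.group('name')} = "
                f"fields.{match.group('type')}({args}{sep}string={label!r})")
    return FIELD_RE.sub(fix, source)
```

Because the transformation is deterministic, it can be applied to every generated file without risking the kind of drift a second LLM pass would introduce.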
Docker validation comes next. Every generated module suite is instantiated inside a Docker container running the target Odoo version (17.0, 18.0, or 19.0) and installed against an empty database. Install-time errors — missing model declarations, broken XML, invalid field types, dependency order failures — appear here. The pipeline captures Odoo's error log, parses it for known error patterns, and applies targeted fixes: adding missing dependencies to manifests, correcting XML ID references, and reordering record declarations to satisfy Odoo's loading order constraints.
Missing ACL detection runs last. A common failure mode in generated modules is a model that's declared in Python but missing from security/ir.model.access.csv. Odoo will install the module but deny all access to the model, producing confusing "you don't have access" errors for users. The auto-fix pipeline compares the set of declared models in the registry against the set of models with ACL entries and generates the missing rows, defaulting to read access for the base user group with a comment marking them for developer review.
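The ACL diff can be sketched as a set comparison between the registry and the CSV. This is a simplified assumption of how the check works — it presumes model external IDs follow the standard `model_<name with dots replaced by underscores>` convention:

```python
import csv
import io

def missing_acl_rows(registered_models, acl_csv_text):
    """Generate ir.model.access.csv rows for registry models with no ACL entry.

    registered_models: Odoo model names from the registry, e.g. ["work.order"].
    Defaults mirror the behavior described above: read-only access for
    base.group_user, to be reviewed by the developer.
    """
    covered = set()
    for row in csv.DictReader(io.StringIO(acl_csv_text)):
        # "model_work_order" -> "work.order"
        covered.add(row["model_id:id"].removeprefix("model_").replace("_", "."))
    rows = []
    for model in registered_models:
        if model not in covered:
            slug = model.replace(".", "_")
            # perm_read=1, write/create/unlink=0 -- flagged for review
            rows.append(
                f"access_{slug}_user,{model}.user,model_{slug},"
                f"base.group_user,1,0,0,0"
            )
    return rows
```
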
The philosophy here is that deterministic post-processing is more reliable than trying to get the generation prompt right enough to never produce these issues. LLMs are stochastic. A pylint fixer is not. For the class of problems that are structural and rule-based, running a fixer after generation is more robust than attempting to eliminate the error from the prompt.
5. What the Output Looks Like
A concrete example makes this more tangible. Suppose you describe the following domain in plain English:
"We need to manage equipment maintenance for multiple client sites. Each site has equipment. Equipment has scheduled maintenance intervals. When maintenance is due, the system should create a work order. Work orders are assigned to technicians and tracked to completion. Completed work orders generate service invoices."
From this description, the orchestrator layer decomposes the domain into a module plan: site_management (client sites and their equipment), maintenance_scheduling (maintenance intervals and triggers), work_order (work orders and technician assignment), and service_invoicing (invoice generation linked to work orders). The registry is initialized with each module's planned models and their inter-module dependencies: work_order depends on site_management for the equipment model; service_invoicing depends on work_order for the work order model.
The pipeline layer then generates, in dependency order:
- `site_management/models/site.py` and `equipment.py` — registered to the registry
- `maintenance_scheduling/models/maintenance_interval.py` — with a Many2one to `site.equipment`, validated against the registry
- `work_order/models/work_order.py` — with Many2one fields to both `site.equipment` and `maintenance.interval`, both registry-validated
- `service_invoicing/models/service_invoice.py` — with a Many2one to `work.order`, registry-validated
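That dependency order falls out of an ordinary topological sort over the registry's module graph. A minimal sketch using the example's module names (Python 3.9+ `graphlib`):

```python
from graphlib import TopologicalSorter

# Dependency graph from the example plan: each module maps to the set of
# modules it depends on (the registry stores this as a directed graph).
deps = {
    "site_management": set(),
    "maintenance_scheduling": {"site_management"},
    "work_order": {"site_management", "maintenance_scheduling"},
    "service_invoicing": {"work_order"},
}

generation_order = list(TopologicalSorter(deps).static_order())
print(generation_order)
# → ['site_management', 'maintenance_scheduling', 'work_order', 'service_invoicing']
```

`TopologicalSorter` also raises `CycleError` on a cyclic graph, which is exactly the acyclicity check the module dependency graph needs.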
Each module gets a complete set of artifacts: views with search panels and kanban/list/form layouts appropriate to the model, security groups (site_manager, maintenance_technician, invoicing_user), ACL entries for each group/model combination, menu items in a coherent hierarchy, QWeb invoice reports for service_invoicing, unit tests for model methods and field constraints, .pot translation templates, and correct __manifest__.py files with accurate dependency lists.
The developer receives a directory of four installable Odoo modules that install cleanly on first attempt, pass pylint-odoo with no errors, and have test coverage for the business logic. What they then do — extending the models, customizing the views, adjusting the security model for their org structure — is the actual interesting work. The scaffolding is not interesting. Getting the scaffolding right at scale is.
6. What's Still Hard
Factory de Odoo solves a well-defined problem cleanly. It does not solve all problems.
Complex relational graphs remain the hardest case. Many2many relations in Odoo require an explicit intermediate table name when the same pair of models participates in multiple Many2many relations. The registry tracks these table names and collision-detects on them, but for domains with dense many-to-many graphs — think a manufacturing scheduling domain where operations, resources, products, and work centers all have multiple many-to-many relationships with each other — the constraint space becomes difficult to navigate automatically. The current implementation handles common cases correctly but can fail in ways that require developer intervention on sufficiently complex graphs.
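The collision the registry detects here can be sketched in a few lines. The table-name derivation below approximates Odoo's default (the two models' table names, sorted and joined with a `_rel` suffix); the function names are hypothetical:

```python
def implicit_rel_table(model_a: str, model_b: str) -> str:
    """Approximate Odoo's default Many2many relation table name.

    Simplified assumption: table names are the model names with dots
    replaced by underscores; the real derivation lives in the Odoo ORM.
    """
    tables = sorted(m.replace(".", "_") for m in (model_a, model_b))
    return "_".join(tables) + "_rel"

def detect_m2m_collisions(m2m_fields):
    """m2m_fields: (field_name, model, comodel, explicit_relation_or_None).

    Returns pairs of field names whose relation tables would collide.
    """
    seen, collisions = {}, []
    for name, model, comodel, relation in m2m_fields:
        table = relation or implicit_rel_table(model, comodel)
        if table in seen:
            collisions.append((seen[table], name))
        else:
            seen[table] = name
    return collisions
```

Two Many2many fields between the same pair of models with no explicit `relation=` derive the same implicit table, which is exactly the case the registry forces agents to disambiguate.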
Testing stateful multi-agent pipelines is an open problem in the field, not just in this project. The pipeline is inherently stateful: the registry accumulates writes as generation proceeds, and the behavior of later agents depends on the accumulated state of earlier ones. Unit testing individual agents is straightforward. Testing the emergent behavior of the full pipeline — particularly the edge cases in orchestrator conflict resolution — requires either running the full pipeline (expensive) or building sophisticated mock registry states (complex). The current test suite covers unit-level and integration-level behavior well, but property-based testing of registry invariants across arbitrary generation sequences is a gap.
Semantic OCA search has limits. The OCA (Odoo Community Association) maintains a large catalog of community modules that implement common patterns. Factory de Odoo includes a semantic search step that checks whether a requested module already exists in OCA before generating it from scratch. The search uses embedding similarity, which works well for common domains but degrades for specialized verticals where OCA coverage is thin and the semantic distance between "what we need" and "what exists" is high. False negatives (missing an existing OCA module that would have been the right answer) are more likely than false positives here, but both happen.
7. How to Contribute
Factory de Odoo is a real project with real tests. As of writing: 2,953 tests, 33,200+ lines of code, and support for Odoo 17.0, 18.0, and 19.0 generation targets. It is actively maintained and has already generated production-deployed modules for real ERP customizations.
The two highest-value contribution areas right now are:
(1) Property-based tests for registry invariants using Hypothesis. The registry has a well-defined set of invariants: no model is referenced before it's declared; no XML ID is registered twice under different owners; the menu tree contains no cycles; the module dependency graph is acyclic. These invariants hold across all generated states, which makes them ideal candidates for property-based testing. A contributor who sets up a Hypothesis test suite that generates arbitrary sequences of registry operations and verifies invariant preservation would substantially improve the project's confidence in the registry's correctness — especially for the edge cases in conflict resolution that are hard to exercise with example-based tests.
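A sketch of what one such property could look like, written against a deliberately simplified stand-in registry (assumes Hypothesis is installed; the real invariant set is much larger):

```python
from hypothesis import given, strategies as st

class Registry:
    """Minimal stand-in for the real registry."""
    def __init__(self):
        self.groups = {}   # XML ID -> owning module

    def register_group(self, xml_id, module):
        owner = self.groups.get(xml_id)
        if owner is not None and owner != module:
            raise ValueError("collision")
        self.groups[xml_id] = module

# Arbitrary sequences of (xml_id, module) registration attempts.
ops = st.lists(st.tuples(st.sampled_from(["g1", "g2", "g3"]),
                         st.sampled_from(["mod_a", "mod_b"])))

@given(ops)
def test_no_xml_id_has_two_owners(operations):
    reg = Registry()
    first_owner = {}
    for xml_id, module in operations:
        try:
            reg.register_group(xml_id, module)
        except ValueError:
            continue   # a rejected write must not mutate state
        first_owner.setdefault(xml_id, module)
    # Invariant: every registered XML ID keeps its first owner forever.
    assert reg.groups == first_owner

test_no_xml_id_has_two_owners()
```

Hypothesis then shrinks any failing operation sequence to a minimal counterexample, which is precisely what's hard to get from example-based tests of conflict resolution.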
(2) A test fixture library for complex multi-module relation graphs. One of the most common blockers for contributors is that reproducing edge cases in relational generation requires constructing realistic ERP-scale module structures. This is time-consuming and discourages contribution on the harder problems. A library of pre-built test fixtures — representing complex multi-module domains with intentional edge cases (cycles, dense many-to-many graphs, deep inheritance chains, conflicting menu hierarchies) — would make it possible for a new contributor to reproduce a bug in thirty seconds rather than thirty minutes. This is the most direct way to lower the barrier to contribution on the parts of the project that need the most work.
Beyond these two areas, the project would benefit from additional WRONG/CORRECT pair documentation for Odoo 18 and 19 specific patterns (the existing knowledge base skews toward 17.0), improvements to the OCA semantic search index, and expanded i18n coverage for the generated translation templates.
The project is fully open source. Issues are tagged with good-first-issue and help-wanted for contributors who want a guided entry point. For contributors interested in the registry or pipeline architecture specifically, the design documentation in docs/architecture/ describes the invariants and protocols in detail.
The full project is open source at https://github.com/TIFAQM/Factory-de-Odoo — give it a star if the architecture is interesting, or open an issue if you want to contribute.