Every time I asked an AI assistant to add a CRUD endpoint, it tried to create 6 to 8 files.
Entity. Repository. Factory. Controller. DTO. Service. Interface. Test.
Not because the AI was wrong. Because the architecture demanded it.
I got tired of it. So I rethought the architecture from scratch — not for humans, but for the reality we're living in: AI is writing most of the code now.
The result is MicroCoreOS — an Atomic Microkernel Architecture where 1 file = 1 feature.
## The Real Problem: Architecture Was Designed for Humans
Clean Architecture, Hexagonal, N-Layer — these patterns were built around human cognitive constraints. Separation of concerns made sense when a senior dev had to navigate a codebase mentally and hand off work to junior devs who owned specific layers.
But that contract has changed.
When an LLM generates a feature, it doesn't need layers for cognitive separation. It needs minimal context and a predictable pattern. Every extra file is:
- More tokens consumed (= more cost, more latency)
- More surface area for hallucination
- More files to keep consistent across the context window
- More noise in every pull request
The architecture itself became the bottleneck.
## The Insight: The Context Window as a Design Constraint
The core idea of MicroCoreOS is simple:
> Design the architecture around the LLM's context window, not around human organizational preferences.
If the unit of work for an AI is a single feature, the unit of deployment should be a single file.
This is the Atomic Plugin: one file that contains the endpoint registration, the business logic, the database calls, and the event publishing. Everything the feature needs, nothing it doesn't.
```python
# domains/users/plugins/create_user_plugin.py
from core.base_plugin import BasePlugin
from domains.users.models.user import UserEntity


class CreateUserPlugin(BasePlugin):
    def __init__(self, http, db, event_bus, logger, auth):
        self.http = http
        self.db = db
        self.bus = event_bus
        self.logger = logger
        self.auth = auth

    async def on_boot(self):
        self.http.add_endpoint(
            "/users",
            "POST",
            self.handler,
            tags=["Users"],
            request_model=UserEntity
        )

    async def execute(self, data: dict):
        try:
            user = UserEntity(**data)
            password_hash = self.auth.hash_password(user.password) if user.password else None
            user_id = await self.db.execute(
                "INSERT INTO users (name, email, password_hash) VALUES (?, ?, ?)",
                (user.name, user.email, password_hash)
            )
            await self.bus.publish("user.created", {"id": user_id, "email": user.email})
            return {"success": True, "data": {"id": user_id, "name": user.name}}
        except Exception as e:
            self.logger.error(f"Failed to create user: {e}")
            return {"success": False, "error": str(e)}

    async def handler(self, data: dict, context):
        return await self.execute(data)
```
That's a complete feature. Endpoint registration, validation, database write, event publishing, error handling. One file. ~40 lines.
## The Architecture: Three Layers, Implicit Auto-wiring
MicroCoreOS has exactly three concepts. If you understand these three, you understand the entire framework.
### Tools — Stateless Infrastructure
Tools are pure technical capabilities with no business logic. HTTP server, database, event bus, logger, auth. They live in /tools and are completely stateless.
```
tools/
  http_server/  → FastAPI wrapper
  sqlite/       → Async SQLite
  event_bus/    → Pub/Sub + Async RPC
  auth/         → JWT + bcrypt
  logger/       → Structured logging
```
The key rule: Tools never import other Tools. They are atoms.
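To make the shape concrete, here is a sketch of what a `sqlite` tool could look like. The method names follow the `AI_CONTEXT.md` excerpt later in the post, but the class itself is my assumption, not the project's actual code; it wraps the stdlib `sqlite3` driver and offloads blocking calls to a thread.

```python
# tools/sqlite/tool.py — illustrative sketch, NOT the framework's real implementation
import asyncio
import sqlite3


class SqliteTool:
    """Stateless async wrapper over stdlib sqlite3 (assumed shape)."""

    def __init__(self, db_path: str = "app.db"):
        self.db_path = db_path  # configuration only; no business state

    async def query(self, sql: str, params: tuple = ()) -> list:
        """Read data. Returns a list of rows."""
        def _run():
            conn = sqlite3.connect(self.db_path)
            try:
                return conn.execute(sql, params).fetchall()
            finally:
                conn.close()
        # Offload the blocking driver call so the event loop stays responsive
        return await asyncio.to_thread(_run)

    async def execute(self, sql: str, params: tuple = ()) -> int:
        """Write data. Returns the last inserted row ID."""
        def _run():
            conn = sqlite3.connect(self.db_path)
            try:
                cur = conn.execute(sql, params)
                conn.commit()
                return cur.lastrowid
            finally:
                conn.close()
        return await asyncio.to_thread(_run)
```

Note what is absent: no domain imports, no other tools, no state beyond configuration. That is what keeps a tool an atom.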
### Plugins — Stateful Business Logic
Plugins are where your application lives. Each plugin is one feature, one file, in /domains/{domain}/plugins/.
The key rule: Plugins never import other Plugins. They communicate only through the event bus.
### The Kernel — The Blind Orchestrator

The Kernel (~240 lines total) does exactly three things:

- Discovers tools in `/tools` and boots them
- Discovers plugins in `/domains` and resolves their dependencies by parameter name
- Calls `on_boot()` on each plugin to register endpoints and subscriptions
That's it. The Kernel knows zero business rules. It cannot know them by design.
```python
# This is how dependency injection works — by parameter name
class CreateUserPlugin(BasePlugin):
    def __init__(self, http, db, event_bus, logger, auth):
        # The Kernel reads this signature and injects the matching tools.
        # No decorators. No config files. No manual wiring.
        ...
```
Drop a file in /domains/users/plugins/ and it boots automatically. Delete it and nothing breaks.
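The name-based resolution step can be sketched in a few lines of `inspect`. This is my reconstruction of the idea for illustration, not the kernel's actual code; `build_plugin` and its arguments are hypothetical names.

```python
# Hypothetical sketch of injection by parameter name, not the real kernel
import inspect


def build_plugin(plugin_cls, tools: dict):
    """Instantiate a plugin, passing each constructor parameter
    whose name matches a booted tool in the registry."""
    params = inspect.signature(plugin_cls.__init__).parameters
    kwargs = {
        name: tools[name]
        for name in params
        if name != "self" and name in tools
    }
    return plugin_cls(**kwargs)
```

The entire "wiring" is a dictionary lookup keyed by parameter name, which is why no decorators or config files are needed.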
## The AI Context Manifest
Every time the system boots, it regenerates AI_CONTEXT.md — a live inventory of every available tool and its exact method signatures.
```markdown
## 🛠️ Available Tools

### 🔧 Tool: `db` (Status: ✅)
- await query(sql, params): Read data. Returns list of rows.
- await execute(sql, params): Write data. Returns last ID.

### 🔧 Tool: `event_bus` (Status: ✅)
- await publish(event_name, data): Fire-and-forget broadcast.
- await subscribe(event_name, callback): Listen for events.
- await request(event_name, data, timeout=5): Async RPC.
```
When an AI assistant reads this file, it knows exactly what's available without exploring the codebase. The architecture documents itself for the AI.
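A manifest like this can be regenerated entirely from live objects with `inspect`. The sketch below is my assumed approach (the function name and exact output format are illustrative, and it omits the status emoji), but it shows why the document never drifts from the code: it is derived from the code.

```python
# Sketch: derive a tool manifest from live method signatures (assumed approach)
import inspect


def render_manifest(tools: dict) -> str:
    lines = ["## Available Tools", ""]
    for name, tool in tools.items():
        lines.append(f"### Tool: `{name}`")
        for attr, member in inspect.getmembers(tool, inspect.ismethod):
            if attr.startswith("_"):
                continue
            # Prefix coroutine methods so the reader knows to await them
            prefix = "await " if inspect.iscoroutinefunction(member) else ""
            doc = inspect.getdoc(member) or ""
            summary = doc.splitlines()[0] if doc else ""
            lines.append(f"- {prefix}{attr}{inspect.signature(member)}: {summary}")
        lines.append("")
    return "\n".join(lines)
```

Because the signatures come from `inspect.signature` on the booted instances, a renamed parameter or new method shows up in the manifest on the next boot with zero manual upkeep.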
## Estimated Token Usage

These figures are estimates based on repeated real-world experience implementing the same feature across architectures, not a controlled benchmark.
| Architecture | Files Touched | Input + Output Tokens |
|---|---|---|
| MicroCoreOS | 1 | ~1,000 |
| Vertical Slice | 2–3 | ~1,500 |
| N-Layer | 4–5 | ~2,500 |
| Hexagonal | 5–7 | ~3,500 |
| Clean Architecture | 6–8 | ~4,000 |
Dropping from ~4,000 tokens (Clean Architecture) to ~1,000 (MicroCoreOS) is roughly a 75% reduction per feature: proportionally lower cost, faster iteration, and far less context-window saturation, which is a common cause of hallucinations.
## Testing Is Not Complicated
One of the first objections is: "if everything is in one file, how do you test it?"
The answer is simpler than in any layered architecture. Because dependencies are injected via constructor, you mock them directly:
```python
import pytest
from unittest.mock import AsyncMock, MagicMock
from domains.users.plugins.create_user_plugin import CreateUserPlugin


@pytest.mark.asyncio
async def test_create_user_success():
    mock_db = AsyncMock()
    mock_db.execute.return_value = 42
    mock_bus = AsyncMock()
    mock_auth = MagicMock()
    mock_auth.hash_password.return_value = "hashed_pw"

    plugin = CreateUserPlugin(
        http=MagicMock(),
        db=mock_db,
        event_bus=mock_bus,
        logger=MagicMock(),
        auth=mock_auth
    )

    result = await plugin.execute({
        "name": "Ana",
        "email": "ana@example.com",
        "password": "secret"
    })

    assert result["success"] is True
    assert result["data"]["id"] == 42
    mock_bus.publish.assert_called_once_with(
        "user.created",
        {"id": 42, "email": "ana@example.com"}
    )
```
No test containers. No complex setup. No mock framework gymnastics. Just pass your mocks to the constructor.
## Tools Are Interchangeable
MicroCoreOS does not compete with FastAPI or SQLAlchemy. It orchestrates them.
The http tool uses FastAPI today. If your organization uses a different HTTP framework, you write a new tool that wraps it, implementing the same interface. Your plugins — all your business logic — don't change a single line.
- Your company uses Kafka → swap the `event_bus` tool
- Your company uses PostgreSQL → swap the `db` tool
- Your company uses Kong → swap the `http` tool

Your 200 plugins: untouched.
The contract between a plugin and a tool is the constructor parameter name and the methods documented in AI_CONTEXT.md. That's the entire API surface.
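Because the contract is nothing more than a parameter name plus the documented methods, any object exposing `query`/`execute` can stand in for the `db` tool. A toy in-memory replacement (purely illustrative; a real swap would wrap asyncpg, SQLAlchemy, and so on) makes the point:

```python
# A drop-in "db" tool backed by a plain list. Plugins receive it under the
# same parameter name and call the same methods, so they cannot tell the
# difference. Illustrative only, not part of the framework.
class InMemoryDbTool:
    def __init__(self):
        self._rows: list = []

    async def query(self, sql: str, params: tuple = ()) -> list:
        """Read data. Returns list of rows (the SQL is ignored in this toy)."""
        return list(self._rows)

    async def execute(self, sql: str, params: tuple = ()) -> int:
        """Write data. Returns the last ID."""
        self._rows.append(params)
        return len(self._rows)
```

Hand an instance of this to `CreateUserPlugin` in place of the real `db` and the plugin runs unchanged, which is exactly the interchangeability claim.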
## Event-Driven by Default
Plugins never import each other. Cross-domain communication is exclusively through the event bus.
```python
# In CreateUserPlugin
await self.bus.publish("user.created", {"id": user_id, "email": email})

# In WelcomeEmailPlugin — completely separate file, separate domain
async def on_boot(self):
    await self.bus.subscribe("user.created", self.send_welcome_email)
```
Delete WelcomeEmailPlugin and CreateUserPlugin doesn't know or care. Add a new SlackNotificationPlugin that also listens to user.created and no existing code changes.
This is not theoretical decoupling. It's structural decoupling enforced by the architecture itself.
## The Hybrid Async Engine
The Kernel handles sync and async transparently. If a plugin method is async def, it runs on the event loop. If it's def, the Kernel offloads it to a thread pool via asyncio.to_thread automatically.
This means you can use blocking libraries — legacy database drivers, CPU-heavy processing — without freezing the system. The performance model adapts to your code, not the other way around.
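The dispatch logic behind this is small. The sketch below is my reconstruction of the idea (the function name `dispatch` is hypothetical), built on `inspect.iscoroutinefunction` and `asyncio.to_thread`:

```python
# Sketch of sync/async-transparent dispatch (assumed, not the kernel's code)
import asyncio
import inspect


async def dispatch(handler, data: dict):
    """Run async handlers on the event loop; offload sync handlers
    to the default thread pool so they never block it."""
    if inspect.iscoroutinefunction(handler):
        return await handler(data)
    return await asyncio.to_thread(handler, data)
```

The caller awaits `dispatch` either way; whether the handler was written as `async def` or plain `def` becomes an implementation detail of the plugin.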
## What MicroCoreOS Is Not
It is not a replacement for your infrastructure choices. It is an application kernel that sits above them.
It is not opinionated about your database, your HTTP framework, or your message broker. It is opinionated about one thing: how features are structured and how they communicate.
It is not trying to solve distributed systems problems out of the box. It is trying to make the 80% of backend work — CRUD, events, background jobs, API endpoints — as fast and cheap to generate and maintain as possible.
## Get Started
```shell
git clone https://github.com/theanibalos/MicroCoreOS.git
cd MicroCoreOS
uv run main.py
# Visit http://localhost:5000/docs
```
The full source is on GitHub. The core kernel is ~240 lines. Read it once and you'll understand every behavior of the system.
⚠️ A Note on Stability: MicroCoreOS is currently an experimental proof-of-concept in active development (v0.1-alpha). The philosophy is solid, but the framework itself will undergo breaking changes. I'm sharing this early to spark a conversation about how we architect for AI. If you want to test it, break it, or improve the Tools, PRs are highly welcome!
MicroCoreOS is MIT licensed. Built because I was tired of explaining my architecture to AI — and now AI writes my plugins without explanation.