
I'm building an AI Company Builder — a platform where you describe a business idea in plain text, and a team of 28 AI agents researches the market, validates viability, designs the product, writes the code, creates content, and runs marketing. All autonomously.
This is not a chatbot. This is not another wrapper around ChatGPT. This is a full-stack agent orchestration platform with a two-layer architecture, persistent knowledge vault, multi-model routing across 300+ models, and a built-in marketplace for buying and selling AI-powered businesses.
I want to share the architecture, the tech stack, and the decisions I made — because I haven't seen anyone build exactly this combination yet.
The problem
Right now, if you want to launch a business with AI, you're stitching together 5-10 tools manually: ChatGPT for strategy, Lovable for code, Jasper for content, Perplexity for research, Notion for knowledge, Zapier for automation. Each tool does one thing. None of them talk to each other. And none of them understand your business as a whole.
What if one platform did it all? Not by being mediocre at everything — but by orchestrating specialized agents, each expert in their domain, all sharing context through a persistent knowledge vault?
The architecture: two layers, not one
Most agent platforms put all agents on the same level. Marketing agent, coding agent, sales agent — flat list, no hierarchy. This works for simple automation but breaks down when you're building a complete business.
I went with a two-layer approach:
Business layer — 7 manager agents that understand your specific business. Product Manager (Max), Marketing Lead (Ivy), Sales Strategist (Sam), Financial Analyst (Finn), Customer Success (Joy), Legal Advisor (Lex), and a Business Generator (Chief) that creates the whole structure from your description. These agents know your niche, your competitors, your audience.
Tool layer — 21 universal agents that do the actual work. Architect (Atlas), Designer (Maya), Frontend Dev (Kai), Backend Dev (Dev), Security (Shield), Researcher (Nova), Writer (Sage), and 14 more. These don't know your business — they know their craft. The business layer delegates to them with full context.
The key insight: business agents are per-business instances, tool agents are shared. If you run 3 businesses simultaneously, each has its own Max and Ivy, but they all share the same Atlas and Kai. This scales without multiplying costs.
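In code, the split looks roughly like this (names and structure are my illustration of the idea, not the platform's actual source):

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    role: str

# Tool layer: one shared instance per craft, reused by every business.
SHARED_TOOL_AGENTS = {
    "architect": Agent("Atlas", "architect"),
    "frontend": Agent("Kai", "frontend_dev"),
}

class Business:
    def __init__(self, description: str):
        self.description = description
        # Business layer: fresh manager instances per business, so each
        # Max/Ivy accumulates context for *this* niche only.
        self.managers = {
            "product": Agent("Max", "product_manager"),
            "marketing": Agent("Ivy", "marketing_lead"),
        }

    def delegate(self, craft: str) -> Agent:
        # Managers hand work to the shared tool agents, passing context along.
        return SHARED_TOOL_AGENTS[craft]

chess = Business("online chess school for kids")
saas = Business("invoicing SaaS")
# chess and saas each have their own Max, but share one Atlas.
```

Running two businesses creates two `Max` instances but only ever one `Atlas` — that's where the cost scaling comes from.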
Model routing: 90% quality at 7% cost
Running everything on Claude Opus 4.6 would cost a fortune. Running everything on a cheap model would produce garbage. The answer is intelligent routing.
I use OpenRouter as a gateway to 300+ models, organized in 4 tiers:
| Tier | Models | Cost/1M tokens | Use case |
|---|---|---|---|
| Free | Llama 3.3 70B | $0 | Routing, classification |
| Budget | DeepSeek V3, Gemini Flash | $0.14-0.60 | Content writing, planning |
| Performance | MiniMax M2.7 | $0.30-1.20 | Coding, testing, debugging |
| Premium | Claude Sonnet/Opus 4.6 | $3-25 | Architecture, security, design |
MiniMax M2.7 is the secret weapon here. In real-world tests, it delivers 90% of Opus quality for 7% of the cost. It found all 6 bugs and all 10 security vulnerabilities that Opus found — the fixes were just slightly less thorough. For most coding tasks, that's more than enough.
The system also auto-escalates: if an agent fails 3 times on a cheaper model, it automatically upgrades to the next tier. And auto-downgrades: 10 consecutive successes on Sonnet? The system suggests trying M2.7 next time.
A full project milestone that costs $50-80 on all-Opus runs $10-12 with routing. That's 80-85% savings.
The research stack: not just chat, actual research
This is where I think most agent platforms fall short. They can write code and generate content — but they can't research. They don't know what's happening in the market right now.
My stack includes four self-hosted research tools:
Perplexica — open-source Perplexity alternative. AI-powered web search with cited sources. When Nova (researcher agent) needs to analyze a market, she searches the web through Perplexica and gets answers with real citations, not hallucinations.
SurfSense — open-source NotebookLM alternative. Upload documents, chat with them, get cited answers. Hybrid search (semantic + full text). Can even generate podcasts from documents.
AnythingLLM — RAG workspace for document analysis. Upload PDFs, DOCX, code files — agents query them with grounded answers.
Firecrawl — web scraping via MCP. Agents can scrape any URL into clean markdown, crawl entire websites, extract structured data.
The combination means agents can research a market, analyze competitors, scrape their pricing pages, summarize uploaded pitch decks, and cite every claim with a real source.
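As a sketch of how an agent like Nova calls these services: the request payloads are small JSON bodies sent to the self-hosted endpoints. The field names below are my assumptions about the APIs, not verified against either project's docs — check your instance before relying on them:

```python
def firecrawl_scrape_payload(url: str) -> dict:
    """Request body for a Firecrawl-style scrape call.
    Field names are an assumption -- consult your instance's API docs."""
    return {"url": url, "formats": ["markdown"]}

def perplexica_search_payload(query: str) -> dict:
    """Request body for a Perplexica-style cited web search.
    Shape is illustrative only."""
    return {"query": query, "focusMode": "webSearch"}

# Nova would POST these to the self-hosted services, then feed the
# markdown and cited answers back into the vault.
pricing_req = firecrawl_scrape_payload("https://competitor.example/pricing")
market_req = perplexica_search_payload("chess education market size 2025")
```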
The gate system: think before you build
Here's what nobody else does. Before my system commits resources to building something, it analyzes whether it's worth building.
You write: "Build an online chess school for kids 6-14. Analyze viability first. Only proceed if rating is above 7/10."
The system runs a full analysis:
| Criterion | Weight | Score |
|---|---|---|
| Market size | 20% | 8/10 |
| Competition level | 20% | 7/10 |
| Niche uniqueness | 15% | 9/10 |
| Revenue potential | 15% | 8/10 |
| Acquisition cost | 15% | 5/10 |
| Channel accessibility | 15% | 8/10 |
| Overall | 100% | 7.5/10 |
If it passes the threshold — development begins. If not — the system explains why and suggests modifications. "Focus on children 6-10 instead of 6-14 — less competition, higher willingness to pay. Adjusted score: 8.1/10."
This saves thousands of dollars and weeks of development on ideas that won't work.
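The gate itself is just a weighted sum over those criteria. A minimal sketch — my reconstruction; in the real system the per-criterion scores would come from the research agents, not be hand-typed:

```python
# Weights from the example analysis above (they sum to 100%).
CRITERIA_WEIGHTS = {
    "market_size": 0.20,
    "competition_level": 0.20,
    "niche_uniqueness": 0.15,
    "revenue_potential": 0.15,
    "acquisition_cost": 0.15,
    "channel_accessibility": 0.15,
}

def viability_gate(scores: dict[str, float], threshold: float = 7.0):
    """Weighted overall score (0-10) plus a proceed/stop decision."""
    overall = sum(w * scores[k] for k, w in CRITERIA_WEIGHTS.items())
    return round(overall, 2), overall >= threshold

overall, proceed = viability_gate({
    "market_size": 8, "competition_level": 7, "niche_uniqueness": 9,
    "revenue_potential": 8, "acquisition_cost": 5, "channel_accessibility": 8,
})
# With these inputs the weighted sum works out to 7.5, above the 7/10 bar.
```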
Persistent memory: the Obsidian Vault
Every research finding, every architectural decision, every bug fix, every content plan — saved as markdown notes in an Obsidian-compatible vault with git version control.
The vault isn't just storage. It's a living knowledge base:
- Auto-indexing: Vault Librarian agent (Libra) maintains indexes, tags notes, creates links between related decisions
- Git history: every change tracked, every note timestamped, full rollback capability
- Memory consolidation: Libra periodically merges scattered notes into coherent knowledge structures
- Cross-project learning: insights from one project automatically available in related projects
After 3 months of operation, the vault contains hundreds of notes — and the system is measurably smarter. Nova doesn't re-research topics she already investigated. Atlas references past ADRs when making new architecture decisions. The knowledge compounds.
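The write path for a single note is simple: markdown with YAML frontmatter, then a git commit. A sketch under my own assumptions — the frontmatter fields and commit message format here are illustrative, not the platform's actual schema:

```python
import shutil
import subprocess
import tempfile
from datetime import date
from pathlib import Path

def save_vault_note(vault: Path, rel_path: str, title: str,
                    tags: list[str], body: str) -> Path:
    """Write an Obsidian-style markdown note with YAML frontmatter,
    then commit it so the change is tracked in git history."""
    note = vault / rel_path
    note.parent.mkdir(parents=True, exist_ok=True)
    frontmatter = (
        "---\n"
        f"title: {title}\n"
        f"date: {date.today().isoformat()}\n"
        f"tags: [{', '.join(tags)}]\n"
        "---\n\n"
    )
    note.write_text(frontmatter + body, encoding="utf-8")
    if shutil.which("git"):  # skip quietly if git is unavailable
        subprocess.run(["git", "-C", str(vault), "add", rel_path],
                       check=False, capture_output=True)
        subprocess.run(["git", "-C", str(vault), "commit", "-m",
                        f"vault: add {rel_path}"],
                       check=False, capture_output=True)
    return note

note_path = save_vault_note(Path(tempfile.mkdtemp()),
                            "projects/demo/decisions/ADR-001.md",
                            "ADR-001: API framework", ["architecture", "adr"],
                            "Chose FastAPI for the gateway.")
```

The frontmatter tags are what lets an agent like Libra index and cross-link notes later.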
Event-driven architecture: everything is observable
Every agent action emits an event to Redis pub/sub:
```json
{
  "agent": "Atlas",
  "action": "created_adr",
  "model": "claude-opus-4.6",
  "tokens_in": 2400,
  "tokens_out": 5100,
  "cost_usd": 0.142,
  "duration_ms": 8300,
  "vault_note": "projects/chess/decisions/ADR-001.md"
}
```
Multiple services subscribe: the audit logger saves to immutable JSONL, the cost tracker aggregates spending, the vault manager auto-saves results, and the live activity stream pushes to the Web UI via WebSocket.
This gives you:
- Full audit trail for compliance (EU AI Act, GDPR)
- Real-time cost tracking with ROI calculation ("Your agents saved $28,000 in equivalent human labor this month")
- Live activity feed — watch your agents work in real-time
- Kill switch — instantly halt all agent activity if something goes wrong
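On the publishing side, emitting an event is a one-liner with redis-py. The channel name and helper below are mine, a sketch of the pattern rather than the actual service code:

```python
import json

AGENT_EVENTS_CHANNEL = "agent.events"  # channel name is illustrative

def emit_event(redis_client, event: dict) -> None:
    """Publish one agent event to Redis pub/sub; every subscriber --
    audit logger, cost tracker, vault manager, WebSocket stream --
    receives the same message independently."""
    redis_client.publish(AGENT_EVENTS_CHANNEL, json.dumps(event))

# A subscriber with redis-py looks like:
#   ps = redis.Redis().pubsub()
#   ps.subscribe(AGENT_EVENTS_CHANNEL)
#   for msg in ps.listen(): ...
```

Because pub/sub is fan-out, adding a new consumer (say, an alerting service) never requires touching the agents themselves.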
A2A protocol readiness
Google's Agent-to-Agent protocol (A2A) is becoming the standard for inter-platform agent communication. 50+ partners including Salesforce, SAP, and PayPal are building on it.
I'm building A2A compatibility from day one. Every agent has an Agent Card — a JSON file describing its capabilities. External agents can discover our agents, send tasks, and receive results through standardized endpoints.
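For illustration, an Agent Card might look like this — the field names follow the public A2A draft spec as I understand it, so treat the exact schema as an assumption and check the current version before implementing:

```json
{
  "name": "Atlas",
  "description": "Software architecture agent: ADRs, system design, tech selection",
  "url": "https://example.com/agents/atlas",
  "version": "1.0.0",
  "capabilities": { "streaming": true, "pushNotifications": false },
  "skills": [
    {
      "id": "create_adr",
      "name": "Create ADR",
      "description": "Write an architecture decision record for a given design problem"
    }
  ]
}
```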
Why this matters: in 2027-2028, your business agents will negotiate with supplier agents, customer agents will talk to support agents across platforms, and marketing agents will coordinate campaigns with influencer agents — all machine-to-machine. Building the protocol layer now means we're ready when this arrives.
What's coming next
The full platform has 14 milestones. I'm currently on the build phase, deploying infrastructure on a Hetzner VPS with Claude Code + GSD-2 running the development process.
What I'm building toward:
- YouTube content pipeline: from idea to published video, fully automated
- Business Exchange: marketplace for buying and selling AI-powered businesses
- Cross-business learning: anonymous patterns shared across all businesses on the platform
- 400+ integrations via Composio: Gmail, Slack, HubSpot, Notion, Jira — one MCP server
- 7 business templates: Online Education, SaaS, Agency, E-commerce, Content, Marketplace, Coaching
Tech stack summary
| Layer | Technology |
|---|---|
| Server | Ubuntu 24.04, Hetzner CPX31 |
| AI Engine | Claude Code CLI via OpenRouter |
| Orchestration | GSD-2 + Ruflo |
| API Gateway | FastAPI + Redis pub/sub |
| Interfaces | Telegram (aiogram), React Web UI, REST API |
| Research | Perplexica, SurfSense, AnythingLLM, Firecrawl |
| Integrations | Composio (400+ apps), MCP servers |
| Memory | Obsidian Vault + MCPVault + git |
| Payments | Stripe, Paddle (merchant of record) |
Why I'm sharing this
Two reasons. First, I genuinely believe this is where software is heading — from tools to autonomous business operators. The predictions from Dario Amodei, Sam Altman, and every major AI lab point to agents handling multi-week projects autonomously by 2028. Building the platform for this now is a bet on the near future.
Second, building in public keeps me honest. If you see flaws in the architecture, I want to know. If you're building something similar, let's compare notes. If you want to be an early user — I'll be opening access soon.
Follow the build: I'll be posting weekly updates here on dev.to with technical deep dives into each component.
What would you build if you had 28 AI agents at your command?