
Pavel Gajvoronski


I'm Building a Platform That Deploys AI Companies From a Single Sentence


I'm building an AI Company Builder — a platform where you describe a business idea in plain text, and a team of 28 AI agents researches the market, validates viability, designs the product, writes the code, creates content, and runs marketing. All autonomously.

This is not a chatbot. This is not another wrapper around ChatGPT. This is a full-stack agent orchestration platform with a two-layer architecture, persistent knowledge vault, multi-model routing across 300+ models, and a built-in marketplace for buying and selling AI-powered businesses.

I want to share the architecture, the tech stack, and the decisions I made — because I haven't seen anyone build exactly this combination yet.

The problem

Right now, if you want to launch a business with AI, you're stitching together 5-10 tools manually: ChatGPT for strategy, Lovable for code, Jasper for content, Perplexity for research, Notion for knowledge, Zapier for automation. Each tool does one thing. None of them talk to each other. And none of them understand your business as a whole.

What if one platform did it all? Not by being mediocre at everything — but by orchestrating specialized agents, each expert in their domain, all sharing context through a persistent knowledge vault?

The architecture: two layers, not one

Most agent platforms put all agents on the same level. Marketing agent, coding agent, sales agent — flat list, no hierarchy. This works for simple automation but breaks down when you're building a complete business.

I went with a two-layer approach:

Business layer — 7 manager agents that understand your specific business. Product Manager (Max), Marketing Lead (Ivy), Sales Strategist (Sam), Financial Analyst (Finn), Customer Success (Joy), Legal Advisor (Lex), and a Business Generator (Chief) that creates the whole structure from your description. These agents know your niche, your competitors, your audience.

Tool layer — 21 universal agents that do the actual work. Architect (Atlas), Designer (Maya), Frontend Dev (Kai), Backend Dev (Dev), Security (Shield), Researcher (Nova), Writer (Sage), and 14 more. These don't know your business — they know their craft. The business layer delegates to them with full context.

The key insight: business agents are per-business instances, tool agents are shared. If you run 3 businesses simultaneously, each has its own Max and Ivy, but they all share the same Atlas and Kai. This scales without multiplying costs.
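The per-business vs. shared split can be made concrete with a small sketch. This is illustrative only — the class and agent names mirror the post, but `AgentRegistry`, `BusinessAgent`, and `ToolAgent` are hypothetical, not the platform's actual API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolAgent:
    name: str   # e.g. "Atlas"
    craft: str  # e.g. "architecture" — knows the craft, not the business

@dataclass
class BusinessAgent:
    name: str         # e.g. "Max"
    role: str         # e.g. "Product Manager"
    business_id: str  # bound to exactly one business instance

class AgentRegistry:
    def __init__(self):
        # Tool agents are singletons shared across every business.
        self.tool_agents = {a.name: a for a in [
            ToolAgent("Atlas", "architecture"),
            ToolAgent("Kai", "frontend"),
        ]}
        self.business_agents: dict[str, list[BusinessAgent]] = {}

    def spawn_business(self, business_id: str):
        # Each new business gets its own manager instances.
        self.business_agents[business_id] = [
            BusinessAgent("Max", "Product Manager", business_id),
            BusinessAgent("Ivy", "Marketing Lead", business_id),
        ]

registry = AgentRegistry()
registry.spawn_business("chess-school")
registry.spawn_business("saas-crm")

# Two businesses -> two distinct Maxes, but still one shared Atlas.
assert len(registry.business_agents) == 2
assert len(registry.tool_agents) == 2
```

Running three businesses adds three sets of lightweight manager instances, but the expensive tool-agent pool stays fixed — which is the scaling property described above.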

Model routing: 90% quality at 7% cost

Running everything on Claude Opus 4.6 would cost a fortune. Running everything on a cheap model would produce garbage. The answer is intelligent routing.

I use OpenRouter as a gateway to 300+ models, organized in 4 tiers:

| Tier | Models | Cost / 1M tokens | Use case |
| --- | --- | --- | --- |
| Free | Llama 3.3 70B | $0 | Routing, classification |
| Budget | DeepSeek V3, Gemini Flash | $0.14-0.60 | Content writing, planning |
| Performance | MiniMax M2.7 | $0.30-1.20 | Coding, testing, debugging |
| Premium | Claude Sonnet/Opus 4.6 | $3-25 | Architecture, security, design |

MiniMax M2.7 is the secret weapon here. In real-world tests, it delivers 90% of Opus quality for 7% of the cost. It found all 6 bugs and all 10 security vulnerabilities that Opus found — the fixes were just slightly less thorough. For most coding tasks, that's more than enough.

The system also auto-escalates: if an agent fails 3 times on a cheaper model, it automatically upgrades to the next tier. And auto-downgrades: 10 consecutive successes on Sonnet? The system suggests trying M2.7 next time.

A full project milestone that costs $50-80 on all-Opus runs $10-12 with routing. That's 80-85% savings.
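The escalation/downgrade logic is simple state tracking. Here is a minimal sketch: the tier names and the thresholds (3 failures up, 10 successes down) come from the post, while the `ModelRouter` class itself is illustrative:

```python
# Tiers ordered cheapest to most expensive, matching the table above.
TIERS = ["free", "budget", "performance", "premium"]

class ModelRouter:
    """Tracks per-agent outcomes and moves between tiers."""

    def __init__(self, start_tier: str = "budget"):
        self.tier = TIERS.index(start_tier)
        self.failures = 0
        self.successes = 0

    def record(self, success: bool) -> str:
        if success:
            self.successes += 1
            self.failures = 0
            # 10 consecutive successes -> suggest the next cheaper tier.
            if self.successes >= 10 and self.tier > 0:
                self.tier -= 1
                self.successes = 0
        else:
            self.failures += 1
            self.successes = 0
            # 3 consecutive failures -> escalate to the next tier up.
            if self.failures >= 3 and self.tier < len(TIERS) - 1:
                self.tier += 1
                self.failures = 0
        return TIERS[self.tier]

router = ModelRouter("budget")
router.record(False)
router.record(False)
assert router.record(False) == "performance"  # third failure escalates
```

A real implementation would also reset counters per task type and route the first attempt based on the task classification, but the core feedback loop is just this.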

The research stack: not just chat, actual research

This is where I think most agent platforms fall short. They can write code and generate content — but they can't research. They don't know what's happening in the market right now.

My stack includes four self-hosted research tools:

Perplexica — open-source Perplexity alternative. AI-powered web search with cited sources. When Nova (researcher agent) needs to analyze a market, she searches the web through Perplexica and gets answers with real citations, not hallucinations.

SurfSense — open-source NotebookLM alternative. Upload documents, chat with them, get cited answers. Hybrid search (semantic + full text). Can even generate podcasts from documents.

AnythingLLM — RAG workspace for document analysis. Upload PDFs, DOCX, code files — agents query them with grounded answers.

Firecrawl — web scraping via MCP. Agents can scrape any URL into clean markdown, crawl entire websites, extract structured data.

The combination means agents can research a market, analyze competitors, scrape their pricing pages, summarize uploaded pitch decks, and cite every claim with a real source.

The gate system: think before you build

Here's what nobody else does. Before my system commits resources to building something, it analyzes whether it's worth building.

You write: "Build an online chess school for kids 6-14. Analyze viability first. Only proceed if rating is above 7/10."

The system runs a full analysis:

| Criterion | Weight | Score |
| --- | --- | --- |
| Market size | 20% | 8/10 |
| Competition level | 20% | 7/10 |
| Niche uniqueness | 15% | 9/10 |
| Revenue potential | 15% | 8/10 |
| Acquisition cost | 15% | 5/10 |
| Channel accessibility | 15% | 8/10 |
| **Overall** | 100% | **7.5/10** |

If it passes the threshold — development begins. If not — the system explains why and suggests modifications. "Focus on children 6-10 instead of 6-14 — less competition, higher willingness to pay. Adjusted score: 8.1/10."

This saves thousands of dollars and weeks of development on ideas that won't work.
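The gate itself is a weighted sum. A minimal sketch, using the criteria and weights from the example analysis above (the dict layout and function name are mine, not the platform's):

```python
# Each criterion: (weight, score out of 10). Weights sum to 1.0.
CRITERIA = {
    "market_size":           (0.20, 8),
    "competition_level":     (0.20, 7),
    "niche_uniqueness":      (0.15, 9),
    "revenue_potential":     (0.15, 8),
    "acquisition_cost":      (0.15, 5),
    "channel_accessibility": (0.15, 8),
}

def viability_score(criteria: dict) -> float:
    """Weighted average of criterion scores, rounded to one decimal."""
    return round(sum(w * s for w, s in criteria.values()), 1)

score = viability_score(CRITERIA)
assert score == 7.5
assert score > 7.0  # above the user's 7/10 threshold -> development begins
```

The interesting part is not the arithmetic but where the scores come from: each criterion is researched by Nova through the web-search stack before it is scored, so the gate is grounded in current market data rather than the model's priors.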

Persistent memory: the Obsidian Vault

Every research finding, every architectural decision, every bug fix, every content plan — saved as markdown notes in an Obsidian-compatible vault with git version control.

The vault isn't just storage. It's a living knowledge base:

  • Auto-indexing: Vault Librarian agent (Libra) maintains indexes, tags notes, creates links between related decisions
  • Git history: every change tracked, every note timestamped, full rollback capability
  • Memory consolidation: Libra periodically merges scattered notes into coherent knowledge structures
  • Cross-project learning: insights from one project automatically available in related projects
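A vault write is conceptually just "markdown file in, git commit out". Here is a hedged sketch of that flow — the frontmatter fields, folder layout, and `save_note` helper are illustrative assumptions, not the platform's actual code:

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def save_note(vault: Path, rel_path: str, title: str, body: str,
              commit: bool = True) -> Path:
    """Write an Obsidian-style markdown note and track it in git."""
    note = vault / rel_path
    note.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat()
    # YAML frontmatter makes the note indexable by the Vault Librarian.
    note.write_text(f"---\ntitle: {title}\ncreated: {stamp}\n---\n\n{body}\n")
    if commit:
        # Git gives every note a timestamped history and full rollback.
        subprocess.run(["git", "-C", str(vault), "add", rel_path], check=True)
        subprocess.run(["git", "-C", str(vault), "commit", "-m",
                        f"vault: add {rel_path}"], check=True)
    return note
```

Everything downstream — Libra's indexing, consolidation, cross-project linking — operates on these plain markdown files, which is why the vault stays readable in Obsidian or any text editor.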

After 3 months of operation, the vault contains hundreds of notes — and the system is measurably smarter. Nova doesn't re-research topics she already investigated. Atlas references past ADRs when making new architecture decisions. The knowledge compounds.

Event-driven architecture: everything is observable

Every agent action emits an event to Redis pub/sub:

```json
{
  "agent": "Atlas",
  "action": "created_adr",
  "model": "claude-opus-4.6",
  "tokens_in": 2400,
  "tokens_out": 5100,
  "cost_usd": 0.142,
  "duration_ms": 8300,
  "vault_note": "projects/chess/decisions/ADR-001.md"
}
```

Multiple services subscribe: the audit logger saves to immutable JSONL, the cost tracker aggregates spending, the vault manager auto-saves results, and the live activity stream pushes to the Web UI via WebSocket.
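The production system runs this fan-out over Redis pub/sub; the self-contained sketch below models the same pattern with an in-process bus so it runs without a live Redis server. In the real setup, `bus.publish` corresponds to a Redis `PUBLISH` and each handler to a `SUBSCRIBE`d worker:

```python
import json
from collections import defaultdict

class EventBus:
    """In-process stand-in for Redis pub/sub: one publish, many handlers."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, channel: str, handler):
        self.subscribers[channel].append(handler)

    def publish(self, channel: str, event: dict):
        payload = json.dumps(event)  # events travel as JSON, as over Redis
        for handler in self.subscribers[channel]:
            handler(json.loads(payload))

audit_log = []                 # stands in for the append-only JSONL file
costs = defaultdict(float)     # stands in for the cost aggregator

def track_cost(event: dict):
    costs[event["agent"]] += event["cost_usd"]

bus = EventBus()
bus.subscribe("agent_events", audit_log.append)
bus.subscribe("agent_events", track_cost)

bus.publish("agent_events", {"agent": "Atlas", "action": "created_adr",
                             "cost_usd": 0.142})

assert audit_log[0]["action"] == "created_adr"
assert costs["Atlas"] == 0.142
```

The key property is that publishers never know who is listening — adding a new consumer (say, an alerting service) is one more `subscribe`, with no changes to any agent.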

This gives you:

  • Full audit trail for compliance (EU AI Act, GDPR)
  • Real-time cost tracking with ROI calculation ("Your agents saved $28,000 in equivalent human labor this month")
  • Live activity feed — watch your agents work in real-time
  • Kill switch — instantly halt all agent activity if something goes wrong

A2A protocol readiness

Google's Agent-to-Agent protocol (A2A) is becoming the standard for inter-platform agent communication. 50+ partners including Salesforce, SAP, and PayPal are building on it.

I'm building A2A compatibility from day one. Every agent has an Agent Card — a JSON file describing its capabilities. External agents can discover our agents, send tasks, and receive results through standardized endpoints.
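For concreteness, here is what an Agent Card for Atlas might look like. The field names approximate the published A2A schema (verify against the current spec before relying on them), and the URL is hypothetical; A2A agents expose their card at a well-known path such as `/.well-known/agent.json`:

```python
import json

# Illustrative Agent Card for the Architect agent. Field names follow the
# general shape of A2A Agent Cards; the exact schema may differ.
ATLAS_CARD = {
    "name": "Atlas",
    "description": "Software architecture agent: ADRs, system design reviews.",
    "url": "https://example.com/agents/atlas",  # hypothetical endpoint
    "version": "1.0.0",
    "capabilities": {"streaming": False, "pushNotifications": False},
    "skills": [
        {
            "id": "create_adr",
            "name": "Create ADR",
            "description": "Draft an architecture decision record for a project.",
        }
    ],
}

def card_json(card: dict) -> str:
    """Validate required fields and serialize the card for serving."""
    required = {"name", "description", "url", "version", "skills"}
    missing = required - card.keys()
    if missing:
        raise ValueError(f"agent card missing fields: {sorted(missing)}")
    return json.dumps(card, indent=2)
```

Discovery then reduces to an HTTP GET: an external agent fetches the card, reads the skill list, and sends a task to the advertised endpoint.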

Why this matters: in 2027-2028, your business agents will negotiate with supplier agents, customer agents will talk to support agents across platforms, and marketing agents will coordinate campaigns with influencer agents — all machine-to-machine. Building the protocol layer now means we're ready when this arrives.

What's coming next

The full platform has 14 milestones. I'm currently on the build phase, deploying infrastructure on a Hetzner VPS with Claude Code + GSD-2 running the development process.

What I'm building toward:

  • YouTube content pipeline: from idea to published video, fully automated
  • Business Exchange: marketplace for buying and selling AI-powered businesses
  • Cross-business learning: anonymous patterns shared across all businesses on the platform
  • 400+ integrations via Composio: Gmail, Slack, HubSpot, Notion, Jira — one MCP server
  • 7 business templates: Online Education, SaaS, Agency, E-commerce, Content, Marketplace, Coaching

Tech stack summary

| Layer | Technology |
| --- | --- |
| Server | Ubuntu 24.04, Hetzner CPX31 |
| AI Engine | Claude Code CLI via OpenRouter |
| Orchestration | GSD-2 + Ruflo |
| API Gateway | FastAPI + Redis pub/sub |
| Interfaces | Telegram (aiogram), React Web UI, REST API |
| Research | Perplexica, SurfSense, AnythingLLM, Firecrawl |
| Integrations | Composio (400+ apps), MCP servers |
| Memory | Obsidian Vault + MCPVault + git |
| Payments | Stripe, Paddle (MoR) |

Why I'm sharing this

Two reasons. First, I genuinely believe this is where software is heading — from tools to autonomous business operators. The predictions from Dario Amodei, Sam Altman, and every major AI lab point to agents handling multi-week projects autonomously by 2028. Building the platform for this now is a bet on the near future.

Second, building in public keeps me honest. If you see flaws in the architecture, I want to know. If you're building something similar, let's compare notes. If you want to be an early user — I'll be opening access soon.

Follow the build: I'll be posting weekly updates here on dev.to with technical deep dives into each component.


What would you build if you had 28 AI agents at your command?
