DEV Community

ModelHub Dev
ModelHub Dev

Posted on • Originally published at modelhub-api.com

How I Built 18 AI Employees in One Telegram Bot (Architecture Deep Dive)

How I Built 18 AI Employees in One Telegram Bot (Architecture Deep Dive)

A few months ago, I found myself spending way too much time switching between tools — checking shipping rates, responding to customer inquiries, monitoring inventory, posting on social media. Each task had its own app, its own login, its own notification system.

So I did what any developer would do: I built a bot army.

The result is ModelHub — a single Telegram bot that gives businesses access to 18 different AI "employees," each specialized in a different role. Here's how it works under the hood.


The Problem: Too Many SaaS Tools, Not Enough Integration

Most small businesses I work with have the same problem:

  • They use 5-10 different SaaS tools
  • Employees spend hours context-switching
  • Automation tools are either too expensive or too brittle
  • Custom development is out of budget

I wanted one interface. One place where a business owner (or a team lead, or a support manager) could type a message and get the right AI for whatever they needed.

The Multi-Bot Architecture

The system is built around a simple concept: one hub, many workers.

User → Telegram → Hub Bot → Worker Bot (specialized role)
                        ↕
                   User Selection
Enter fullscreen mode Exit fullscreen mode

Hub Bot (The Manager)

The main bot you interact with is the dispatcher. When you send it a message:

  1. It presents available AI roles as inline buttons
  2. You pick which "employee" you need
  3. It spawns a direct conversation with that worker bot
  4. The worker handles your request end-to-end
  5. When done, you're returned to the hub

This keeps everything clean. Each worker bot has its own context window, its own prompt, its own conversation history. They don't interfere with each other.

Worker Bots (The Employees)

Each worker bot is an individual Telegram bot with a different personality and skill set. Currently I've built 18, including:

Bot Specialty
Trade Clerk Shipping rates, customs docs, trade compliance
E-commerce Agent Product listings, inventory, order management
Customer Service Agent Handle returns, complaints, FAQs
Content Writer Blog posts, social media copy, product descriptions
Data Analyst Spreadsheet analysis, sales trends, reporting
HR Assistant Scheduling, employee queries, onboarding
Tech Support Debugging, setup guides, technical docs
Marketing Bot Ad copy, campaign ideas, A/B test suggestions
Translator Multi-language translation with context awareness
Researcher Web research, competitor analysis, market intel
And 8 more for specific niche workflows

Each one shares a common codebase but has a unique system prompt, behavior rules, and tool access.

Technical Stack

The whole thing runs on surprisingly modest infrastructure:

- Language: Python 3.11
- Framework: python-telegram-bot + Pyrogram
- API Server: Flask + gunicorn
- Database: SQLite (with WAL mode for concurrency)
- Deployment: $6/month Contabo VPS (Germany)
- LLM: GPT-4o-mini / Claude 3 Haiku (role-dependent)
Enter fullscreen mode Exit fullscreen mode

Why Python + Flask + python-telegram-bot?

I've been building Telegram bots for years, and this combo is the sweet spot for reliability vs complexity:

  • python-telegram-bot handles webhook registration, message routing, and inline keyboards beautifully
  • Pyrogram handles the MTProto layer for the worker bots (they use userbot-style interaction where needed)
  • Flask with gunicorn keeps the webhook server lightweight — no FastAPI overhead when you don't need async for every request
  • SQLite with WAL mode handles concurrent reads without a dedicated database server

The Core Insight: One Codebase, 18 Personalities

The biggest engineering decision was keeping a single codebase for all 18 bots.

Instead of 18 separate repos (which would be a nightmare to maintain), every bot loads from the same code. The difference is in the configuration:

# Simplified worker config
BOTS = {
    "trade_clerk": {
        "token": os.getenv("BOT_TOKEN_TRADE"),
        "system_prompt": SYSTEM_PROMPT_TRADE_CLERK,
        "tools": ["shipping_api", "customs_api", "currency_api"],
        "temperature": 0.2,
        "max_context": 16000
    },
    "ecommerce": {
        "token": os.getenv("BOT_TOKEN_ECOMMERCE"),
        "system_prompt": SYSTEM_PROMPT_ECOMMERCE,
        "tools": ["shopify_api", "inventory_api", "order_api"],
        "temperature": 0.3,
        "max_context": 16000
    },
    # ... 16 more
}
Enter fullscreen mode Exit fullscreen mode

Each worker bot process is spawned as a separate thread with its own webhook. The shared code means I can push a bug fix once and it applies to all 18 bots. New features go through the same pipeline.

The "personality" comes purely from prompt engineering. There's no fine-tuning. Just carefully crafted system prompts that define:

  • The bot's persona and tone
  • Its knowledge boundaries
  • Which APIs it can call
  • Escalation rules ("If you can't handle this, tell the user I'll forward this to a human")

Prompt Engineering at Scale

The hardest part wasn't the code — it was the prompts. Each bot needs to stay in character while being useful.

Key tricks I learned:

  1. Role-lock early — Put the persona definition in the first 200 tokens so the model anchors on it
  2. Tool definitions over examples — Instead of showing 50 examples of "how to respond," define what tools it has and let the model figure out the rest
  3. Hard constraints in the post-amble — After the main system prompt, add a "RULES" section in ALL CAPS for things it must never do
  4. Context budget per role — Trade clerk needs different token limits than content writer

How Pricing Works

The service runs on a freemium model:

  • Free trial: 300 messages, no credit card
  • Single role: $12.99/month (rent one AI employee)
  • Three roles: $29.99/month (pick any three)
  • Full access: $99/year (all 18 roles)

The pricing was a deliberate choice. At $12.99/role, it's cheaper than a single SaaS subscription for most of these tasks. And most businesses only need 2-3 roles regularly.

Infrastructure Reality Check

I'm running this on a $6/month Contabo VPS in Germany. That's it.

Handle 18 concurrent webhook listeners on this tiny box? Yes. Each bot is a Flask app behind gunicorn, and the total memory usage is about 480MB for all 18 (roughly 25MB per bot process for the webhook handler).

The real magic is that the LLM calls don't happen on this server — they go out to OpenAI/Anthropic APIs. So the VPS is just handling routing, prompt construction, and response formatting. Each request takes about 150-300ms of local processing, with the rest being LLM inference time.

What I'd Do Differently Next Time

If I were building this again:

  1. Use a message queue — Currently all bots register webhooks independently. A shared queue would simplify deployment
  2. Add persistent memory — Individual bot conversations don't share context. Sometimes I wish the trade clerk knew what the e-commerce bot just told the user
  3. Database migration — SQLite is fine for MVP, but I'd move to PostgreSQL for anything beyond personal use
  4. Containerize sooner — The VPS is manageable, but Docker would make scaling to new instances instant

Try It Yourself

If you want to check it out, the free trial is 300 messages — no signup, just start a conversation. The hub bot introduces you and you pick what roles you need.

The bot: @modelhub_bot

Or if you're a developer and want to build something similar, the architecture is straightforward: one Flask app per bot, all sharing a codebase, differentiated by prompt and tool config. The rest is just scaling the same pattern.


Built with Python, Flask, gunicorn, python-telegram-bot, running on a $6 Contabo VPS. AI models provided by OpenAI and Anthropic.

Top comments (0)