Akan

Building a Free, Multi-User Telegram Bot: When Infrastructure Constraints Drive Architecture

The Problem Space

At 2 AM with 43% battery and no power, I needed to build a system that could:

  • Send randomized messages to multiple users throughout the day
  • Scale to handle arbitrary user counts
  • Cost exactly $0 to run
  • Deploy and forget about it

The obvious solution—Twilio's WhatsApp API—sat behind a paywall. But constraints breed creativity, and what followed was an exercise in building production-grade infrastructure with free-tier services.

Architecture Overview

The final system consists of three core components:

1. Multi-User Bot with Individual Scheduling

# Each user gets their own schedule, persisted in JSON
users = {
    "chat_id": {
        "active": True,
        "messages_per_day": 3,
        "start_hour": 8,
        "end_hour": 22
    }
}

2. APScheduler for Randomized Delivery

import random

def schedule_user_affirmations(chat_id, messages_per_day, start_hour, end_hour):
    # `scheduler` and `send_affirmation_to_user` are defined elsewhere in the app
    for i in range(messages_per_day):
        # Pick a random minute inside [start_hour:00, end_hour:00)
        random_hour = random.randint(start_hour, end_hour - 1)
        random_minute = random.randint(0, 59)

        scheduler.add_job(
            send_affirmation_to_user,
            'cron',
            args=[chat_id],
            hour=random_hour,
            minute=random_minute,
            id=f'user_{chat_id}_msg_{i}',
            replace_existing=True  # rescheduling reuses these IDs, so overwrite the old job
        )

3. Webhook-Based Deployment

Polling vs. webhooks became critical for deployment. Telegram delivers a bot token's updates through one channel at a time: either a single getUpdates long-poll consumer or a webhook, never both. That creates an interesting constraint when deploying.

The Polling Problem

The initial implementation used infinity_polling():

# Works locally, breaks in production
bot.infinity_polling()

Error:

ApiTelegramException: Error code: 409. 
Description: Conflict: terminated by other getUpdates request

This happens because:

  1. Local instance starts polling
  2. Deployed instance starts polling
  3. Telegram allows only one getUpdates consumer per bot token, so each new request terminates the other instance's pending long poll with a 409
  4. Both instances keep retrying, creating a conflict loop

Solution: Webhook Architecture

if WEBHOOK_URL:
    # Production: Telegram pushes updates to us
    webhook_url = f"{WEBHOOK_URL}/{BOT_TOKEN}"
    bot.remove_webhook()
    bot.set_webhook(url=webhook_url)
else:
    # Development: We poll Telegram for updates
    bot.infinity_polling()

Flask endpoint to receive webhooks:

@app.route(f'/{BOT_TOKEN}', methods=['POST'])
def webhook():
    if request.headers.get('content-type') == 'application/json':
        json_string = request.get_data().decode('utf-8')
        update = telebot.types.Update.de_json(json_string)
        bot.process_new_updates([update])
        return '', 200
    return '', 403
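One optional hardening for the endpoint above, sketched under assumptions: Telegram's setWebhook accepts a secret_token, and each webhook request then carries that value in the X-Telegram-Bot-Api-Secret-Token header. The helper name is_trusted_webhook and the WEBHOOK_SECRET value are hypothetical and are not part of the original bot.

```python
import hmac

# Hypothetical secret; in practice it would come from an environment variable
WEBHOOK_SECRET = "change-me"

def is_trusted_webhook(headers, expected_secret=WEBHOOK_SECRET):
    """Return True when the request carries the expected Telegram secret token."""
    received = headers.get("X-Telegram-Bot-Api-Secret-Token", "")
    # compare_digest avoids leaking the secret through timing differences
    return hmac.compare_digest(received.encode(), expected_secret.encode())
```

Inside webhook(), a check such as `if not is_trusted_webhook(request.headers): return '', 403` could run before the body is parsed. The same secret would have to be passed when the webhook is registered; recent pyTelegramBotAPI versions appear to expose this as a secret_token argument on set_webhook, which is worth verifying against the installed version.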

Why This Matters

Polling (Development):

  • Bot continuously asks Telegram: "Any new messages?"
  • Simple, works for local testing
  • Cannot coexist with other instances

Webhooks (Production):

  • Telegram sends messages directly to your server
  • More efficient (no constant polling)
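  • A stray local poller cannot knock the deployed bot offline, since Telegram pushes to the one registered URL (a token has a single webhook, so separate environments need separate bot tokens)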
  • Production-grade approach

State Management

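User preferences persist across process restarts in a JSON file (this assumes the host keeps the file on disk; a platform with an ephemeral filesystem would reset it on redeploy):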

def load_users():
    try:
        with open(USERS_FILE, 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def save_users(users):
    with open(USERS_FILE, 'w') as f:
        json.dump(users, f, indent=2)

Trade-offs considered:

  • Redis/PostgreSQL: Requires additional services, kills free-tier budget
  • SQLite: Better for production, but adds complexity
  • JSON file: Simple, sufficient for <1000 users, zero infrastructure cost

For a constraint-driven project, JSON files are appropriate. The system can always migrate to a database when scale demands it.

Deployment: Free Tier Engineering

Platform: Render.com

Why Render:

  • True free tier (no credit card required)
  • Auto-deploys from GitHub
  • Includes SSL/HTTPS (required for Telegram webhooks)
  • Provides a persistent URL

Configuration (render.yaml):

services:
  - type: web
    name: affirmations-bot
    runtime: python
    buildCommand: pip install -r requirements_telegram.txt
    startCommand: python telegram_app_webhook.py
    envVars:
      - key: TELEGRAM_BOT_TOKEN
        sync: false
      - key: WEBHOOK_URL
        sync: false

The Free Tier Caveat

Render's free tier spins down after 15 minutes of inactivity. For a bot that needs to send scheduled messages, this is a problem.

Solution: UptimeRobot

  • Free monitoring service
  • Pings your app every 5 minutes
  • Keeps the Render service awake
  • Zero cost

GET https://affirmations-bot.onrender.com/health
Every 5 minutes
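The post pings a /health URL but does not show the handler behind it. A minimal sketch of what such a handler could return is below; the function name health_response, the payload fields, and the commented Flask wiring are all assumptions rather than the author's code.

```python
import json
import time

START_TIME = time.time()  # recorded once, when the process boots

def health_response(job_count, now=None):
    """Build a (body, status, headers) tuple, a shape Flask accepts from a view."""
    now = time.time() if now is None else now
    body = json.dumps({
        "status": "ok",
        "uptime_seconds": round(now - START_TIME),
        "scheduled_jobs": job_count,
    })
    return body, 200, {"Content-Type": "application/json"}

# Hypothetical wiring into the Flask app from the post:
# @app.route('/health')
# def health():
#     return health_response(len(scheduler.get_jobs()))
```

Reporting the scheduled job count is a design choice, not a requirement: it lets the monitor's response double as a quick check that the schedule survived the last restart.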

Scheduling Architecture

A daily reschedule keeps delivery times unpredictable:

def reschedule_all_users():
    """Runs at midnight, generates new random times"""
    users = load_users()
    for chat_id, user_data in users.items():
        if user_data.get('active', True):
            schedule_user_affirmations(
                int(chat_id),
                user_data.get('messages_per_day', 3),
                user_data.get('start_hour', 8),
                user_data.get('end_hour', 22)
            )

# Add to scheduler
scheduler.add_job(
    reschedule_all_users,
    'cron',
    hour=0,
    minute=1,
    id='daily_reschedule'
)

Result:

  • User receives 3 messages daily
  • Times randomized each day (e.g., 9:23, 14:47, 19:12)
  • No predictable patterns
  • Feels organic, not automated

User Experience Design

Bot commands follow Telegram conventions:

@bot.message_handler(commands=['start'])
def send_welcome(message):
    # Auto-subscribe new users
    # Generate initial schedule
    # Send welcome message
    pass  # body omitted in this excerpt

@bot.message_handler(commands=['settings'])
def show_settings(message):
    # Display current config
    # Provide customization options
    pass  # body omitted in this excerpt

@bot.message_handler(commands=['pause', 'resume'])
def toggle_subscription(message):
    # User controls their subscription
    # Preserves preferences for resume
    pass  # body omitted in this excerpt

Key insight: Don't over-engineer. Users want:

  1. /start → immediate value
  2. /settings → control
  3. /pause → temporary opt-out (not deletion)
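The "pause is not deletion" idea can be sketched as a small helper that flips only the active flag. The function name set_subscription and the sample user record are hypothetical; a real handler would presumably also persist the change with save_users and add or remove the user's scheduler jobs.

```python
def set_subscription(users, chat_id, active):
    """Flip the 'active' flag for one user, leaving other preferences untouched."""
    key = str(chat_id)  # JSON object keys are always strings
    if key not in users:
        return False  # unknown user: nothing to pause or resume
    users[key]["active"] = active
    return True

# Illustrative data, not taken from the real users file
users = {"12345": {"active": True, "messages_per_day": 5, "start_hour": 9, "end_hour": 21}}
set_subscription(users, 12345, False)   # what /pause would do
paused = dict(users["12345"])           # snapshot while paused
set_subscription(users, 12345, True)    # what /resume would do
```

Because the record is never removed, the user's custom messages_per_day and hours are still there when they resume.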

Technical Challenges & Solutions

Challenge 1: Timezone Handling

Users in different timezones need messages at their local hours.

Current solution: Server time + user-specified hours

start_hour = 8  # 8 AM server time

Future enhancement:

from datetime import datetime
import pytz
user_timezone = pytz.timezone(user_data.get('timezone', 'UTC'))
local_time = datetime.now(user_timezone)
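As a rough illustration of the per-user timezone idea, the sketch below converts a user's local slot to UTC. It swaps in the standard-library zoneinfo module (Python 3.9+) rather than pytz, and the function name local_slot_to_utc is hypothetical.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

def local_slot_to_utc(day, hour, minute, tz_name="UTC"):
    """Convert a wall-clock slot in the user's timezone to an aware UTC datetime."""
    local_dt = datetime.combine(day, time(hour, minute), tzinfo=ZoneInfo(tz_name))
    return local_dt.astimezone(timezone.utc)

# 08:00 in Lagos (UTC+1, no daylight saving) is 07:00 UTC
slot = local_slot_to_utc(date(2024, 1, 15), 8, 0, "Africa/Lagos")
```

An alternative that avoids manual conversion is to hand the timezone to the scheduler itself; APScheduler's cron trigger accepts a timezone argument, so each user's jobs could be declared in their own local hours.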

Challenge 2: Message Deduplication

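With random scheduling, two slots for the same user can land on the same minute, and the daily reschedule re-adds jobs that already exist.

Solution: give every slot a stable job ID. APScheduler rejects a reused ID with ConflictingIdError unless add_job is called with replace_existing=True, in which case the old job is overwritten instead of duplicated. The ID alone does not stop two slots from drawing the same time:

id=f'user_{chat_id}_msg_{i}'  # Unique per user, per message slot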
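If same-minute collisions matter, one option is to draw all of a user's slots at once without replacement. This is a sketch of that idea, not the bot's actual scheduling code; pick_send_times is a hypothetical helper.

```python
import random

def pick_send_times(messages_per_day, start_hour, end_hour, rng=random):
    """Return sorted, distinct (hour, minute) pairs inside [start_hour, end_hour)."""
    first = start_hour * 60
    last = end_hour * 60  # exclusive upper bound, in minutes since midnight
    # sample() draws without replacement, so no two slots share a minute
    minutes = rng.sample(range(first, last), messages_per_day)
    return [divmod(m, 60) for m in sorted(minutes)]

times = pick_send_times(3, 8, 22, rng=random.Random(42))
```

Each (hour, minute) pair could then be passed to the scheduler in place of the two independent randint calls. sample() raises ValueError if more messages are requested than there are minutes in the window.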

Challenge 3: State Corruption

What if the server crashes mid-write?

Mitigation:

import json
import os

def save_users(users):
    # Atomic write pattern: write a temp file, then swap it into place
    temp_file = USERS_FILE + '.tmp'
    with open(temp_file, 'w') as f:
        json.dump(users, f, indent=2)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the swap
    os.replace(temp_file, USERS_FILE)  # Atomic on POSIX
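The read side can be made equally forgiving. The sketch below is an assumption about how a loader might treat a damaged file; load_users_safely and the ".corrupt" suffix are hypothetical names, and the demo deliberately writes a truncated file to a temporary directory.

```python
import json
import os
import tempfile

def load_users_safely(path):
    """Load the users file, falling back to an empty dict if it is missing or corrupt."""
    try:
        with open(path, "r") as f:
            return json.load(f)
    except FileNotFoundError:
        return {}
    except json.JSONDecodeError:
        # Keep the damaged file for inspection instead of overwriting it
        os.replace(path, path + ".corrupt")
        return {}

# Demonstration with a deliberately truncated file
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "users.json")
with open(demo_path, "w") as f:
    f.write('{"12345": {"active": tr')  # simulated partial write
recovered = load_users_safely(demo_path)
```

Starting from an empty dict loses subscriptions, so setting the broken file aside rather than deleting it leaves a chance of manual recovery.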

Cost Breakdown

Component          Service            Cost
Messaging API      Telegram Bot API   $0
Hosting            Render.com         $0
Uptime Monitoring  UptimeRobot        $0
Version Control    GitHub             $0
Total                                 $0/month

Twilio equivalent: ~$0.005/message × 3 messages/day = $0.015/day/user, or about $0.45/month/user over 30 days

At 100 users: $45/month vs. $0.
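The comparison above is simple multiplication, and writing it as a function makes it easy to plug in other user counts. The $0.005 per-message price is the approximate figure quoted in this post, not a verified rate, and monthly_messaging_cost is a hypothetical helper.

```python
def monthly_messaging_cost(users, messages_per_day, price_per_message, days=30):
    """Estimated monthly spend in dollars under per-message pricing."""
    return users * messages_per_day * price_per_message * days

per_user = monthly_messaging_cost(1, 3, 0.005)         # about $0.45
hundred_users = monthly_messaging_cost(100, 3, 0.005)  # about $45
```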

Performance Characteristics

Single instance handles:

  • ~100 concurrent users comfortably
  • ~300 messages/day (3 per user)
  • Well under 0.01 requests/second on average (300 sends plus 288 keep-alive pings per day is about 0.007/second)
  • Peaks during scheduling windows

Bottlenecks:

  1. Telegram API rate limits (30 messages/second)
  2. Render free tier CPU/memory
  3. JSON file I/O (becomes issue >1000 users)
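At current volumes the rate limit is far away, but if many sends ever fell into the same minute, one simple way to respect it is to send in batches. This is a sketch only: batch_for_rate_limit is a hypothetical helper, and the 30-per-second figure is the one quoted in the list above.

```python
def batch_for_rate_limit(chat_ids, max_per_second=30):
    """Split recipients into batches, one batch to be sent per second."""
    return [chat_ids[i:i + max_per_second]
            for i in range(0, len(chat_ids), max_per_second)]

# 100 recipients become batches of 30, 30, 30, and 10
batches = batch_for_rate_limit(list(range(100)))
```

A sender would loop over the batches and sleep for a second between them, which keeps the burst rate at or below the limit without needing a queue.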

Scaling path:

  • Migrate to PostgreSQL (~1000 users)
  • Horizontal scaling with Redis queue (~10k users)
  • Switch to paid Render tier (~100k users)

Lessons from Constraint-Driven Development

1. Start with the Free Tier

Don't prematurely optimize for scale you don't have. JSON files work until they don't.

2. Understand Your Platform's Execution Model

Polling vs. webhooks isn't just a technical detail—it's the difference between working and not working in production.

3. Constraints Force Better Architecture

No database? You design for minimal state. No always-on hosting? You make your app stateless and resilient.

4. Documentation as Infrastructure

Half the battle is making it reproducible:

git clone repo
pip install -r requirements_telegram.txt
# Add bot token to .env
python telegram_app_webhook.py

If it takes 5+ steps, you're doing it wrong.

The Meta-Problem: Environment Parity

Building from Lagos means:

  • Intermittent power → Local development gets interrupted
  • Slow/expensive internet → Downloading dependencies is costly
  • Limited payment options → Many services unavailable
  • Time zone challenges → Debugging with US-based support

These aren't excuses—they're parameters. Good engineering adapts.

Development environment:

Power: 43% battery, no outlet
Internet: 3G tethered from phone
Time: 2:47 AM
Deadline: Yesterday

Production environment:

Power: ✓ Always on
Internet: ✓ High bandwidth
Time: ✓ 24/7 availability
Cost: $0 (hard constraint)

The gap between these environments shapes the architecture. You build:

  • Offline-first documentation (can't count on Stack Overflow loading)
  • Minimal dependencies (pip install takes forever)
  • Aggressive caching (can't re-download on every restart)
  • Robust error handling (can't debug when offline)

What's Next

Immediate improvements:

  1. Add timezone support per user
  2. Implement message templates (user-customizable)
  3. Add analytics dashboard (messages sent, active users)

Future architecture:

  1. Migrate to PostgreSQL when users > 500
  2. Add message queue (Celery + Redis) for reliability
  3. Implement A/B testing for message timing
  4. Add web interface for non-Telegram management

System evolution pattern:

JSON file → SQLite → PostgreSQL → Distributed DB
Single instance → Load balanced → Microservices
Monolith → Modular monolith → Services

Migrate when the pain exceeds the migration cost. Not before.

Code Repository

Full implementation: [GitHub link]

Stack:

  • Python 3.11
  • Flask (web server)
  • pyTelegramBotAPI (Telegram SDK)
  • APScheduler (job scheduling)
  • Render (hosting)

To deploy your own:

git clone [repo]
pip install -r requirements_telegram.txt
# Add TELEGRAM_BOT_TOKEN to .env
python telegram_app_webhook.py

Closing Thoughts

The "right" solution isn't always the obvious one. When Twilio was gated, I could have:

  1. Paid for it (out of budget)
  2. Given up (not an option)
  3. Found another way (what actually happened)

Engineering isn't just about writing code—it's about navigating constraints, making trade-offs, and shipping despite the environment.

Resource-lean contexts don't produce worse engineers. They produce engineers who:

  • Understand trade-offs deeply
  • Build resilient systems by default
  • Know when "good enough" is actually good enough
  • Can build production infrastructure for $0

The feature shipped. The users are happy. The cost is zero.

That's the only metric that matters.


Technical Stack:

  • Telegram Bot API (webhooks)
  • Flask (HTTP server)
  • APScheduler (cron-like scheduling)
  • Render.com (PaaS hosting)
  • GitHub (CI/CD via git push)

Performance:

  • 100 users, 3 messages/day = 300 messages/day
  • Well under 0.01 req/sec average (about 0.007 req/sec from 300 sends plus 288 keep-alive pings per day)
  • <100ms p99 latency (webhook processing)
  • Zero cost at any scale under 10k users

Want to build something similar?
The repository is open source. Fork it, modify it, deploy it. All the infrastructure patterns are reusable.

Sometimes the best technology is the free technology you can ship today. Please try it here: https://web.telegram.org/k/#@my_affirmation_fr_bot


Written at 4:23 AM, 19% battery remaining, on generator power. Deployed successfully to production before the power cut out again.
