Akan

Posted on Feb 4

Building a Free, Multi-User Telegram Bot: When Infrastructure Constraints Drive Architecture

#webdev #programming #ai #productivity

The Problem Space

At 2 AM with 43% battery and no power, I needed to build a system that could:

Send randomized messages to multiple users throughout the day
Scale to handle arbitrary user counts
Cost exactly $0 to run
Deploy and forget about it

The obvious solution—Twilio's WhatsApp API—sat behind a paywall. But constraints breed creativity, and what followed was an exercise in building production-grade infrastructure with free-tier services.

Architecture Overview

The final system consists of three core components:

1. Multi-User Bot with Individual Scheduling

# Each user gets their own schedule, persisted in JSON
users = {
    "chat_id": {
        "active": True,
        "messages_per_day": 3,
        "start_hour": 8,
        "end_hour": 22
    }
}

2. APScheduler for Randomized Delivery

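import random

def schedule_user_affirmations(chat_id, messages_per_day, start_hour, end_hour):
    """Schedule one user's messages at random times inside their daily window."""
    for i in range(messages_per_day):
        # randint is inclusive on both ends, so the last valid hour is end_hour - 1
        random_hour = random.randint(start_hour, end_hour - 1)
        random_minute = random.randint(0, 59)

        scheduler.add_job(
            send_affirmation_to_user,
            'cron',
            args=[chat_id],
            hour=random_hour,
            minute=random_minute,
            id=f'user_{chat_id}_msg_{i}',
            # Overwrite the existing job for this slot; without this flag,
            # re-adding the same id raises ConflictingIdError
            replace_existing=True
        )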

3. Webhook-Based Deployment

Polling vs. webhooks became critical for deployment. Telegram accepts only one getUpdates (long-polling) consumer per bot token at a time, and getUpdates cannot be used while a webhook is set, which creates an interesting constraint when deploying.

The Polling Problem

Initial implementation used infinity_polling():

# Works locally, breaks in production
bot.infinity_polling()

Error:

ApiTelegramException: Error code: 409. 
Description: Conflict: terminated by other getUpdates request

This happens because:

Local instance starts polling
Deployed instance starts polling
Telegram allows one pending getUpdates request per bot token, so each new request terminates the other instance's request
Both instances keep retrying, creating a conflict loop

Solution: Webhook Architecture

if WEBHOOK_URL:
    # Production: Telegram pushes updates to us
    webhook_url = f"{WEBHOOK_URL}/{BOT_TOKEN}"
    bot.remove_webhook()
    bot.set_webhook(url=webhook_url)
else:
    # Development: We poll Telegram for updates
    bot.infinity_polling()

Flask endpoint to receive webhooks:

@app.route(f'/{BOT_TOKEN}', methods=['POST'])
def webhook():
    if request.headers.get('content-type') == 'application/json':
        json_string = request.get_data().decode('utf-8')
        update = telebot.types.Update.de_json(json_string)
        bot.process_new_updates([update])
        return '', 200
    return '', 403

Why This Matters

Polling (Development):

Bot continuously asks Telegram: "Any new messages?"
Simple, works for local testing
Cannot coexist with other instances

Webhooks (Production):

Telegram sends messages directly to your server
More efficient (no constant polling)
Multiple environments can coexist, provided each uses its own bot token (a token has exactly one webhook URL)
Production-grade approach
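
The split above can be captured in one small function that decides the mode from configuration. This is a minimal sketch, assuming the same WEBHOOK_URL and bot token values used earlier; the name resolve_update_mode and its return shape are illustrative and not taken from the bot's code. It rejects non-HTTPS URLs because Telegram only delivers webhooks over HTTPS.

```python
from urllib.parse import urlparse


def resolve_update_mode(webhook_base, bot_token):
    """Return ("webhook", full_url) or ("polling", None).

    Hypothetical helper: webhook_base plays the role of WEBHOOK_URL.
    """
    if not webhook_base:
        # No public URL configured: fall back to long polling (local dev)
        return ("polling", None)

    parsed = urlparse(webhook_base)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("webhook URL must be an absolute https:// URL")

    # Same path convention as the Flask route: /<BOT_TOKEN>
    return ("webhook", f"{webhook_base.rstrip('/')}/{bot_token}")


print(resolve_update_mode("", "123:ABC"))
print(resolve_update_mode("https://example.onrender.com/", "123:ABC"))
```

Failing fast on a malformed URL at startup is cheaper than discovering later that Telegram never delivered an update.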

State Management

User preferences persist across restarts using JSON:

def load_users():
    try:
        with open(USERS_FILE, 'r') as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

def save_users(users):
    with open(USERS_FILE, 'w') as f:
        json.dump(users, f, indent=2)

Trade-offs considered:

Redis/PostgreSQL: Requires additional services, kills free-tier budget
SQLite: Better for production, but adds complexity
JSON file: Simple, sufficient for <1000 users, zero infrastructure cost

For a constraint-driven project, JSON files are appropriate. The system can always migrate to a database when scale demands it.

Deployment: Free Tier Engineering

Platform: Render.com

Why Render:

True free tier (no credit card required)
Auto-deploys from GitHub
Includes SSL/HTTPS (required for Telegram webhooks)
Provides a persistent URL

Configuration (render.yaml):

services:
  - type: web
    name: affirmations-bot
    runtime: python
    buildCommand: pip install -r requirements_telegram.txt
    startCommand: python telegram_app_webhook.py
    envVars:
      - key: TELEGRAM_BOT_TOKEN
        sync: false
      - key: WEBHOOK_URL
        sync: false

The Free Tier Caveat

Render's free tier spins down after 15 minutes of inactivity. For a bot that needs to send scheduled messages, this is a problem.

Solution: UptimeRobot

Free monitoring service
Pings your app every 5 minutes
Keeps the service awake
Zero cost

GET https://affirmations-bot.onrender.com/health
Every 5 minutes

Scheduling Architecture

Daily reschedule pattern prevents predictability:

def reschedule_all_users():
    """Runs at midnight, generates new random times"""
    users = load_users()
    for chat_id, user_data in users.items():
        if user_data.get('active', True):
            schedule_user_affirmations(
                int(chat_id),
                user_data.get('messages_per_day', 3),
                user_data.get('start_hour', 8),
                user_data.get('end_hour', 22)
            )

# Add to scheduler
scheduler.add_job(
    reschedule_all_users,
    'cron',
    hour=0,
    minute=1,
    id='daily_reschedule'
)

Result:

User receives 3 messages daily
Times randomized each day (e.g., 9:23, 14:47, 19:12)
No predictable patterns
Feels organic, not automated

User Experience Design

Bot commands follow Telegram conventions:

@bot.message_handler(commands=['start'])
def send_welcome(message):
    # Auto-subscribe new users
    # Generate initial schedule
    # Send welcome message
    ...  # implementation omitted

@bot.message_handler(commands=['settings'])
def show_settings(message):
    # Display current config
    # Provide customization options
    ...  # implementation omitted

@bot.message_handler(commands=['pause', 'resume'])
def toggle_subscription(message):
    # User controls their subscription
    # Preserves preferences for resume
    ...  # implementation omitted

Key insight: Don't over-engineer. Users want:

/start → immediate value
/settings → control
/pause → temporary opt-out (not deletion)
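
One way to make /pause a true temporary opt-out is to flip only the active flag and leave every other preference untouched. This is a minimal sketch operating on the users dictionary shown earlier; set_active is a hypothetical helper, and a real handler would also need to save the file and add or remove the user's scheduled jobs.

```python
def set_active(users, chat_id, active):
    """Flip a user's subscription flag without touching other preferences.

    Returns True if the user exists and was updated, False otherwise.
    JSON object keys are strings, so the chat id is normalised to str.
    """
    user = users.get(str(chat_id))
    if user is None:
        return False
    user["active"] = active
    return True


users = {"42": {"active": True, "messages_per_day": 5, "start_hour": 9, "end_hour": 21}}
set_active(users, 42, False)   # /pause
print(users["42"])
set_active(users, 42, True)    # /resume
print(users["42"])
```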

Technical Challenges & Solutions

Challenge 1: Timezone Handling

Users in different timezones need messages at their local hours.

Current solution: Server time + user-specified hours

start_hour = 8  # 8 AM server time

Future enhancement:

user_timezone = pytz.timezone(user_data.get('timezone', 'UTC'))
local_time = datetime.now(user_timezone)
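
One possible approach is to convert each user's local hour to server time before scheduling. This is a minimal sketch using only the standard library; local_time_to_utc is a hypothetical helper, and the fixed UTC+1 offset stands in for a real zone such as zoneinfo.ZoneInfo("Africa/Lagos"). APScheduler's cron trigger also accepts a timezone argument, which may make manual conversion unnecessary.

```python
from datetime import date, datetime, time, timedelta, timezone


def local_time_to_utc(hour, minute, tz, on_date):
    """Convert a wall-clock time in tz on on_date to a UTC datetime."""
    local_dt = datetime.combine(on_date, time(hour, minute), tzinfo=tz)
    return local_dt.astimezone(timezone.utc)


# Stand-in for a user's zone; any tzinfo object works here
lagos_like = timezone(timedelta(hours=1))

utc_dt = local_time_to_utc(8, 0, lagos_like, date(2025, 2, 4))
print(utc_dt.hour, utc_dt.minute)  # 7 0
```

The date matters because the conversion can cross midnight, and for zones with daylight saving the offset depends on the day.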

Challenge 2: Message Deduplication

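With random scheduling, two of a user's messages could land on the same minute, and the midnight reschedule could register the same slot twice.

Solution: APScheduler job IDs guard against duplicate registrations. Each slot has a stable ID, so adding a job with replace_existing=True overwrites the previous day's job instead of stacking a second one (without that flag, reusing an ID raises ConflictingIdError). The IDs alone do not stop two slots from drawing the same time:

id=f'user_{chat_id}_msg_{i}'  # Unique per user, per message slot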
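
Job IDs keep a slot from being registered twice, but two slots can still draw the same minute. One way to rule that out is to sample distinct minutes from the window in a single draw. This is a minimal sketch; pick_distinct_times is a hypothetical helper and not part of the original bot.

```python
import random


def pick_distinct_times(messages_per_day, start_hour, end_hour, rng=random):
    """Return sorted, distinct (hour, minute) pairs with start_hour <= hour < end_hour."""
    window = range(start_hour * 60, end_hour * 60)  # minutes since midnight
    if messages_per_day > len(window):
        raise ValueError("more messages requested than minutes in the window")
    # sample() draws without replacement, so no two slots share a minute
    minutes = sorted(rng.sample(window, messages_per_day))
    return [divmod(m, 60) for m in minutes]


print(pick_distinct_times(3, 8, 22, random.Random(0)))
```

Each returned pair can be passed as the hour and minute of one scheduled job.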

Challenge 3: State Corruption

What if the server crashes mid-write?

Mitigation:

def save_users(users):
    # Atomic write pattern
    temp_file = USERS_FILE + '.tmp'
    with open(temp_file, 'w') as f:
        json.dump(users, f, indent=2)
    os.replace(temp_file, USERS_FILE)  # Atomic on POSIX
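
A self-contained version of the same pattern, with the imports it needs and a load function that survives a missing or unreadable file. This is a sketch: the fsync call and the fallback to an empty dictionary on invalid JSON are assumptions added here, not behavior confirmed by the original code, and the path parameter replaces the USERS_FILE constant so the example can run in a temporary directory.

```python
import json
import os
import tempfile


def save_users(users, path):
    """Write JSON to a temp file in the same directory, then swap it in."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(users, f, indent=2)
        f.flush()
        os.fsync(f.fileno())  # push bytes to disk before the rename
    os.replace(tmp_path, path)  # readers see the old file or the new one


def load_users(path):
    """Return saved users, or {} if the file is missing or not valid JSON."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "users.json")
    print(load_users(path))            # {}
    save_users({"42": {"active": True}}, path)
    print(load_users(path))            # {'42': {'active': True}}
```

Silently returning an empty dictionary on corrupt JSON keeps the bot running but drops every subscription, so logging that case is worth considering.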

Cost Breakdown

Component	Service	Cost
Messaging API	Telegram Bot API	$0
Hosting	Render.com	$0
Uptime Monitoring	UptimeRobot	$0
Version Control	GitHub	$0
Total		$0/month

Twilio equivalent: ~$0.005/message × 3 messages/day = $0.015/day/user, or about $0.45/month/user over a 30-day month

At 100 users: $45/month vs. $0.
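
The arithmetic behind that comparison, written out so the inputs are easy to change. The per-message price is the approximate figure quoted above and the 30-day month is an assumption; actual pricing varies.

```python
def monthly_cost(users, messages_per_day=3, price_per_message=0.005, days=30):
    """Estimated monthly spend for a paid per-message API."""
    return users * messages_per_day * price_per_message * days


print(f"${monthly_cost(1):.2f} per user")        # $0.45 per user
print(f"${monthly_cost(100):.2f} at 100 users")  # $45.00 at 100 users
```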

Performance Characteristics

Single instance handles:

~100 concurrent users comfortably
~300 messages/day (3 per user)
~0.5 requests/second average
Peaks during scheduling windows

Bottlenecks:

Telegram API rate limits (roughly 30 messages/second across chats, about 1 message/second within a single chat)
Render free tier CPU/memory
JSON file I/O (becomes issue >1000 users)
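
For scale, 300 messages a day averages out to roughly 0.0035 sends per second, far below the API limit; the limit only matters when many messages fire in the same instant. This is a minimal sketch of pacing such a burst under a per-second cap. send_paced and its parameters are hypothetical, and the send and sleep callables are injected so the logic can be exercised without touching Telegram.

```python
import time


def send_paced(chat_ids, send, max_per_second=30, sleep=time.sleep):
    """Call send(chat_id) for each id, pausing after every full batch."""
    sent = 0
    for index, chat_id in enumerate(chat_ids, start=1):
        send(chat_id)
        sent += 1
        # After each full batch, wait a second before starting the next one
        if index % max_per_second == 0 and index < len(chat_ids):
            sleep(1.0)
    return sent


delivered = []
pauses = []
send_paced(list(range(65)), delivered.append, sleep=pauses.append)
print(len(delivered), len(pauses))  # 65 2
```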

Scaling path:

Migrate to PostgreSQL (~1000 users)
Horizontal scaling with Redis queue (~10k users)
Switch to paid Render tier (~100k users)

Lessons from Constraint-Driven Development

1. Start with the Free Tier

Don't prematurely optimize for scale you don't have. JSON files work until they don't.

2. Understand Your Platform's Execution Model

Polling vs. webhooks isn't just a technical detail—it's the difference between working and not working in production.

3. Constraints Force Better Architecture

No database? You design for minimal state. No always-on hosting? You make your app stateless and resilient.

4. Documentation as Infrastructure

Half the battle is making it reproducible:

git clone repo
pip install -r requirements_telegram.txt
# Add bot token to .env
python telegram_app_webhook.py

If it takes 5+ steps, you're doing it wrong.

The Meta-Problem: Environment Parity

Building from Lagos means:

Intermittent power → Local development gets interrupted
Slow/expensive internet → Downloading dependencies is costly
Limited payment options → Many services unavailable
Time zone challenges → Debugging with US-based support

These aren't excuses—they're parameters. Good engineering adapts.

Development environment:

Power: 43% battery, no outlet
Internet: 3G tethered from phone
Time: 2:47 AM
Deadline: Yesterday

Production environment:

Power: ✓ Always on
Internet: ✓ High bandwidth
Time: ✓ 24/7 availability
Cost: $0 (hard constraint)

The gap between these environments shapes the architecture. You build:

Offline-first documentation (can't count on Stack Overflow loading)
Minimal dependencies (pip install takes forever)
Aggressive caching (can't re-download on every restart)
Robust error handling (can't debug when offline)

What's Next

Immediate improvements:

Add timezone support per user
Implement message templates (user-customizable)
Add analytics dashboard (messages sent, active users)

Future architecture:

Migrate to PostgreSQL when users > 500
Add message queue (Celery + Redis) for reliability
Implement A/B testing for message timing
Add web interface for non-Telegram management

System evolution pattern:

JSON file → SQLite → PostgreSQL → Distributed DB
Single instance → Load balanced → Microservices
Monolith → Modular monolith → Services

Migrate when the pain exceeds the migration cost. Not before.

Code Repository

Full implementation: [GitHub link]

Stack:

Python 3.11
Flask (web server)
pyTelegramBotAPI (Telegram SDK)
APScheduler (job scheduling)
Render (hosting)

To deploy your own:

git clone [repo]
pip install -r requirements_telegram.txt
# Add TELEGRAM_BOT_TOKEN to .env
python telegram_app_webhook.py

Closing Thoughts

The "right" solution isn't always the obvious one. When Twilio was gated, I could have:

Paid for it (out of budget)
Given up (not an option)
Found another way (what actually happened)

Engineering isn't just about writing code—it's about navigating constraints, making trade-offs, and shipping despite the environment.

Resource-lean contexts don't produce worse engineers. They produce engineers who:

Understand trade-offs deeply
Build resilient systems by default
Know when "good enough" is actually good enough
Can build production infrastructure for $0

The feature shipped. The users are happy. The cost is zero.

That's the only metric that matters.

Technical Stack:

Telegram Bot API (webhooks)
Flask (HTTP server)
APScheduler (cron-like scheduling)
Render.com (PaaS hosting)
GitHub (CI/CD via git push)

Performance:

100 users, 3 messages/day = 300 messages/day
~0.5 req/sec average
<100ms p99 latency (webhook processing)
Zero cost at any scale under 10k users

Want to build something similar?
The repository is open source. Fork it, modify it, deploy it. All the infrastructure patterns are reusable.

Sometimes the best technology is the free technology you can ship today. Please try it here: https://web.telegram.org/k/#@my_affirmation_fr_bot

Written at 4:23 AM, 19% battery remaining, on generator power. Deployed successfully to production before the power cut out again.
