Chalom Ellezam
How I built a Discord 'ship-tracker' bot in a weekend (and the 3-process architecture that keeps it alive 24/7)

Disclosure: I'm a senior backend tech lead and I run HostingGuru. This bot runs on HostingGuru's Pro tier, but the architecture (web service + worker + scheduled job) works on any platform that supports those three primitives. I'll point out where each piece runs.


I co-run a small Discord community for indie founders building dev tools. About 220 members, mostly early-stage SaaS people, lots of Claude Code / Cursor enthusiasts. Every Monday I used to manually scroll through the previous week's #i-shipped channel and write a digest message: "this week we shipped X, Y, Z."

It took 30 minutes every Monday morning. After 5 weeks of it I did the math: 30 min × 52 weeks = 26 hours a year of me doing what a bot could do better. So one Saturday I built ShipTrack, the bot that's been keeping my Mondays free for 6 months now.

This is the build log. It's mostly about an architecture decision (3 separate processes instead of 1) that turned out to be the difference between "bot keeps crashing" and "bot just works."

What the bot does

Three things, in order of complexity:

  1. Listens for the /ship slash command. When a member runs /ship "Launched my AI todo app, feedback welcome: link.com", the bot logs the launch into a database and reacts with 🚀 in the channel.
  2. Tracks #i-shipped channel messages. When anyone posts in that channel (without the slash command), the bot detects launch-shaped content (heuristic: contains a URL plus at least one of "shipped", "launched", "live"), logs it, and reacts.
  3. Posts a weekly digest every Monday at 9am UTC. The bot pulls all launches from the last 7 days, formats them into a nice list, and posts it to #announcements with @-mentions of the founders.

That's it. Three things. But they map to three completely different kinds of computation, which is where v1 went wrong.

v1: the naive setup that crashed in 15 minutes

I started simple. One Node.js file. node bot.js. Deploy to a Render free web service. Done in 30 minutes.

It worked on my laptop. It worked for the first 14 minutes after deploy. Then Render's free tier put the service to sleep because it saw no incoming HTTP traffic, and a Discord bot doesn't get HTTP traffic by default. It maintains a long-lived WebSocket connection to Discord's gateway. Render couldn't see that traffic. To Render, my bot was idle. So Render killed it.

When the bot came back from sleep 30 seconds later, it tried to reconnect to Discord's gateway. Discord now saw two sessions for the same bot: the old one got torn down and the new one came up with stale state. Members started seeing the bot react to messages twice. Slash commands timed out.

This is the kind of bug that takes a long time to diagnose if you've never seen it before, because everything looks fine in your logs. There's no error, just slightly wrong behavior. I wasted 4 hours.

Why Discord bots are weirder than they look

The thing nobody tells you when you start: a Discord bot has two completely different communication channels with Discord's servers, and they have totally different operational requirements.

Channel 1: the gateway (WebSocket, persistent).

  • The bot opens a WebSocket to wss://gateway.discord.gg
  • Stays open forever
  • Receives every event in real time (member joined, message posted, reaction added)
  • Sends heartbeats every 41.25 seconds
  • If the connection drops for >60 seconds, you have to fully re-authenticate and resync state

Channel 2: slash commands (HTTP, on-demand).

  • Discord POSTs to YOUR endpoint when a user runs a slash command
  • You have 3 seconds to respond or Discord shows "interaction failed" to the user
  • Public HTTP endpoint with signed payload verification

These two channels don't fit on the same kind of host. The gateway needs always-on. The slash command webhook needs public HTTPS that wakes up fast. Most "deploy your Node app" flows assume one or the other, not both.

v2: three processes, three responsibilities

The architecture I landed on has three pieces:

┌────────────────────────┐   ┌────────────────────────┐   ┌────────────────────────┐
│  WEB SERVICE           │   │  WORKER                │   │  SCHEDULED SCRIPT      │
│  HTTPS endpoint        │   │  Always-on process     │   │  Runs Monday 9am UTC   │
│  Slash command webhook │   │  Discord gateway       │   │  Generates weekly      │
│  /api/discord/interact │   │  WebSocket connection  │   │  digest                │
└────────────────────────┘   └────────────────────────┘   └────────────────────────┘
            │                            │                            │
            └────────────────────────────┴────────────────────────────┘
                                         │
                                  ┌──────────────┐
                                  │  Postgres    │
                                  │  (launches)  │
                                  └──────────────┘

Three separate deployments, one shared database. Each process does what it's good at and nothing else.
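For reference, here's a hypothetical sketch of the query shapes the shared db.js could issue against that Postgres `launches` table. The actual wrapper isn't shown in this post; these are pure builders (column names mirror the `db.launches.insert()` and `find()` calls used below), which a real db.js would hand to a pg Pool's `query()`:

```javascript
// Hypothetical query builders for the shared "launches" table.
// A real db.js would pass these objects straight to pg's pool.query().
function insertLaunchQuery(row) {
  return {
    text: `INSERT INTO launches
             (user_id, username, text, channel_id, message_id, created_at)
           VALUES ($1, $2, $3, $4, $5, $6)`,
    values: [row.user_id, row.username, row.text,
             row.channel_id, row.message_id ?? null, row.created_at],
  };
}

// Mirrors the Mongo-ish find({ created_at: { $gte: since } }) call the
// digest script uses; the wrapper translates it to plain SQL.
function findLaunchesQuery(where) {
  return {
    text: 'SELECT * FROM launches WHERE created_at >= $1',
    values: [where.created_at.$gte],
  };
}
```

One table, three readers/writers: the web service and worker insert, the digest script selects. No shared in-process state anywhere.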

Process 1: the web service (slash commands)

This is a tiny Express app. One endpoint. It responds in under a second.

// web-service/server.js
import express from 'express';
import { verifyKey } from 'discord-interactions';
import { db } from './db.js';

const app = express();
app.use(express.json({
  verify: (req, _res, buf) => { req.rawBody = buf; }
}));

app.post('/api/discord/interact', async (req, res) => {
  // 1. Verify Discord signed the request
  const signature = req.get('X-Signature-Ed25519');
  const timestamp = req.get('X-Signature-Timestamp');
  // verifyKey is async in recent versions of discord-interactions
  const valid = await verifyKey(req.rawBody, signature, timestamp, process.env.DISCORD_PUBLIC_KEY);
  if (!valid) return res.status(401).send('invalid signature');

  // 2. Discord sometimes pings to check liveness
  if (req.body.type === 1) return res.json({ type: 1 });

  // 3. Slash command β€” log the launch and respond fast
  if (req.body.type === 2 && req.body.data?.name === 'ship') {
    const userId = req.body.member.user.id;
    const username = req.body.member.user.username;
    const text = req.body.data.options?.[0]?.value || '';

    await db.launches.insert({
      user_id: userId,
      username,
      text,
      channel_id: req.body.channel_id,
      created_at: new Date(),
    });

    return res.json({
      type: 4,
      data: { content: `🚀 Logged your ship, ${username}!` },
    });
  }

  res.json({ type: 4, data: { content: 'Unknown command' } });
});

app.listen(process.env.PORT || 3000);

Deploy this as a normal web service. It can sleep on free tiers: Discord sends a request only when someone runs /ship, and 1 second of cold start before responding is fine. (For HostingGuru, I picked the Hobby tier with the always-on free guarantee anyway, but the architecture works either way.)

Process 2: the worker (gateway + reactions)

This is the long-running part. It opens the WebSocket connection to Discord and listens for messages. It can't sleep. Ever.

// worker/bot.js
import { Client, GatewayIntentBits, Events } from 'discord.js';
import { db } from './db.js';

const client = new Client({
  intents: [
    GatewayIntentBits.Guilds,
    GatewayIntentBits.GuildMessages,
    GatewayIntentBits.MessageContent,
  ],
});

const SHIPPED_CHANNEL_ID = process.env.SHIPPED_CHANNEL_ID;

client.on(Events.MessageCreate, async (msg) => {
  if (msg.author.bot) return;
  if (msg.channel.id !== SHIPPED_CHANNEL_ID) return;

  // Heuristic: contains a URL + a "shipped"-ish word
  const hasUrl = /https?:\/\/\S+/.test(msg.content);
  const hasShipWord = /\b(shipped|launched|live|released)\b/i.test(msg.content);
  if (!hasUrl || !hasShipWord) return;

  await db.launches.insert({
    user_id: msg.author.id,
    username: msg.author.username,
    text: msg.content,
    channel_id: msg.channel.id,
    message_id: msg.id,
    created_at: new Date(),
  });

  await msg.react('🚀');
});

client.on(Events.ClientReady, () => {
  console.log(`ShipTrack online as ${client.user.tag}`);
});

client.login(process.env.DISCORD_BOT_TOKEN);

Deploy this as a background worker. On HostingGuru this is the Pro tier worker process type, declared Procfile-style like Heroku's old `worker:` line:

# hostingguru.yml (or similar config)
processes:
  bot:
    type: worker
    command: node worker/bot.js
    always_on: true

The platform keeps it running. If it crashes, it restarts. If you push new code, it gracefully reconnects. No HTTP traffic required to keep it alive; that's the whole point of a worker process type vs a web service.
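One detail worth making explicit: a clean reconnect on deploy only works if the worker closes its gateway session when the platform sends SIGTERM. A minimal sketch of that hook (my own addition, not from the post's code; `client` is a discord.js Client):

```javascript
// Graceful-shutdown hook for the worker: on SIGTERM (deploy) or SIGINT,
// close the gateway session cleanly so the replacement instance doesn't
// collide with a lingering session.
function registerGracefulShutdown(client, exit = (code) => process.exit(code)) {
  const shutdown = async () => {
    await client.destroy(); // tears down the WebSocket connection
    exit(0);
  };
  process.on('SIGTERM', shutdown);
  process.on('SIGINT', shutdown);
  return shutdown; // returned so it can be triggered or tested directly
}
```

In the worker you'd call `registerGracefulShutdown(client)` right after `client.login(...)`.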

Process 3: the scheduled script (weekly digest)

This one runs once a week. It's an "on-demand" script: runs, finishes, exits. Costs almost nothing.

// scripts/weekly-digest.js
import { Client, GatewayIntentBits } from 'discord.js';
import { db } from './db.js';

const client = new Client({ intents: [GatewayIntentBits.Guilds] });
await client.login(process.env.DISCORD_BOT_TOKEN);

const since = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000);
const launches = await db.launches.find({
  created_at: { $gte: since },
});

if (launches.length === 0) {
  await client.destroy();
  process.exit(0);
}

const formatted = launches
  .map(l => `• <@${l.user_id}> shipped: ${l.text.slice(0, 200)}`)
  .join('\n');

const channel = await client.channels.fetch(process.env.ANNOUNCEMENTS_CHANNEL_ID);
await channel.send({
  content: `**📦 This week we shipped (${launches.length} launches):**\n\n${formatted}`,
  allowedMentions: { users: [] }, // mention in formatting only, don't ping
});

await client.destroy();
process.exit(0);

On HostingGuru, this runs as an on-demand script triggered by a schedule:

processes:
  weekly-digest:
    type: script
    command: node scripts/weekly-digest.js
    schedule: "0 9 * * 1"  # every Monday at 9:00 UTC

The platform spins up an ephemeral container at the scheduled time, runs the script, captures the output, exits. You pay for ~3 seconds of compute per week. If you've ever fought with Heroku Scheduler, you'll appreciate that the script lives in your repo, version-controlled, with the same env vars as the rest of your app.

Why this architecture matters

The naive temptation is to put all three in one Node process: HTTP server + Discord client + a setInterval for the digest. Don't. Three reasons:

  1. Crash blast radius. If your slash-command handler throws, the gateway connection survives. If your gateway disconnects mid-deploy, the slash commands keep working. Each process is independently restartable.
  2. Scaling differs. If you have 5,000 slash commands an hour, you scale the web service. The worker stays at 1 instance (you only need one Discord gateway connection per bot). Different processes, different scaling profiles.
  3. Costs differ. The worker burns CPU cycles 24/7 just maintaining a heartbeat. The script runs 3 seconds a week. Putting them on the same dyno is paying always-on prices for a once-a-week task.

This three-process pattern is what you want for any bot or background-heavy service: not just Discord. Slack apps. Telegram bots. Webhook receivers with async fanout. The shape repeats.

What I learned

  1. WebSocket gateway = needs a worker, not a web service. Free web tiers will sleep your bot. Workers don't sleep on platforms that respect the worker primitive.
  2. Slash commands ≠ gateway. They're HTTP, you can host them anywhere, but the 3-second response cap is real. Don't do heavy work inline: log to DB, respond, finish processing async.
  3. Use a real database, not in-memory state. I tried "just use a JSON file" for v0. Workers restart, files vanish, members lost their launch history once. Two days later I wired up Postgres.
  4. Scheduled scripts > setInterval. A setInterval in your worker will drift against wall-clock time, miss runs during deploys, and double-fire if you scale to 2 instances. A scheduled script run as a separate process fires once, on schedule, no matter what the worker was doing.
  5. Always reply within 3 seconds. If your slash command handler does anything slow (a database query over ~1s, an external API call), respond with a deferred response (type: 5) and follow up later via Discord's interaction webhook.
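To make the deferred-response lesson concrete, here's a hedged sketch of that flow. The URL shape is Discord's "edit original interaction response" endpoint; `DISCORD_APP_ID` and `doSlowWork` are placeholders of mine, not names from the bot:

```javascript
// Sketch of a deferred slash-command reply: acknowledge with type 5 inside
// the 3-second window, then PATCH the original response once the slow work
// is done.
function followupUrl(appId, interactionToken) {
  // Discord's "edit original interaction response" endpoint
  return `https://discord.com/api/v10/webhooks/${appId}/${interactionToken}/messages/@original`;
}

async function handleShipDeferred(body, res, doSlowWork) {
  res.json({ type: 5 }); // DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE: user sees "thinking..."
  const content = await doSlowWork(body); // the slow part happens after the ack
  // A real handler would now fetch() this with method PATCH:
  return {
    url: followupUrl(process.env.DISCORD_APP_ID, body.token),
    method: 'PATCH',
    payload: { content },
  };
}
```

The key point is ordering: the `res.json({ type: 5 })` ack goes out first, and everything slow happens after it.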

When this architecture is overkill

If you're building a bot for 5 friends to play a Discord trivia game, run it in one Node process on your laptop. You don't need three processes. You don't need a database. You probably don't need slash commands.

The three-process pattern starts paying off when at least one of these is true:

  • The bot is mission-critical (community would notice if it's down).
  • The bot has > ~50 users sending it traffic.
  • You need scheduled jobs that must run even if the bot crashed yesterday.
  • You're deploying it on a platform where free web tiers sleep.

For ShipTrack at 220 members + weekly digest, all four were true. So the three-process setup paid for itself the first time the worker crashed at 2am and the slash commands kept working through it.

A note on HostingGuru

The reason I'm building on HostingGuru (besides the obvious "I run it" disclosure at the top) is that the three primitives I needed (web service, worker, on-demand scheduled script) are first-class citizens on the platform. Same repo, same env vars, three lines of YAML config. No fighting with Heroku Scheduler vs Heroku dynos vs one-off `heroku run` jobs. No spinning up separate ECS task definitions on AWS.

If you're building anything with these three shapes (and most bots, webhook receivers, and background-heavy services have them), the Pro tier (€35/mo) gets you 10 services with workers and on-demand scripts included. The free Starter tier supports the web service piece if you want to wire your worker elsewhere.

What's next

ShipTrack v3 is on my todo list. I want to add:

  • An LLM-powered launch summary at the bottom of each digest ("this week's theme: AI productivity tools")
  • A /profile @user command that shows someone's all-time launches
  • A leaderboard of "most active shipper this quarter"

If you've built a Discord bot recently and have hard-won lessons, I'd love to hear them in the comments. Especially the embarrassing v1 stories.


Previous posts in this series:

1. Heroku just went into "sustaining engineering mode." Here are 5 alternatives whose free tier actually doesn't sleep

2. I built my MVP with Claude Code. Now I need to deploy it. Here's what nobody tells you.

3. Your AI app is silently burning $2,000/month and you don't know it.

4. Telegram alerts for any production app: a 5-minute setup.
