<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Chalom Ellezam</title>
    <description>The latest articles on DEV Community by Chalom Ellezam (@chalom_ellezam_5989bce65e).</description>
    <link>https://dev.to/chalom_ellezam_5989bce65e</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3849936%2F958b9107-f1c7-43d7-b3ac-90788b7e70fd.png</url>
      <title>DEV Community: Chalom Ellezam</title>
      <link>https://dev.to/chalom_ellezam_5989bce65e</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/chalom_ellezam_5989bce65e"/>
    <language>en</language>
    <item>
      <title>Your indie SaaS has zero working Postgres backups. Here's the 20-minute fix (and the drill you need to run before you sleep tonight).</title>
      <dc:creator>Chalom Ellezam</dc:creator>
      <pubDate>Thu, 14 May 2026 08:05:30 +0000</pubDate>
      <link>https://dev.to/chalom_ellezam_5989bce65e/your-indie-saas-has-zero-working-postgres-backups-heres-the-20-minute-fix-and-the-drill-you-need-2hpj</link>
      <guid>https://dev.to/chalom_ellezam_5989bce65e/your-indie-saas-has-zero-working-postgres-backups-heres-the-20-minute-fix-and-the-drill-you-need-2hpj</guid>
      <description>&lt;p&gt;&lt;em&gt;I'm a senior backend tech lead in Paris and I run HostingGuru, a managed PaaS. I'll mention HG exactly once near the end. Everything else in this article works on any platform you ship on.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A founder DMed me last month about a Render Postgres instance that had been humming along for nine months without a hiccup. Stripe charges going through, customers happy, MRR pushing past €4k. He wanted my opinion on something else, but during the call I asked how often his database backed up. He paused, opened the Render dashboard, clicked Postgres, then Backups, and saw exactly one snapshot from the day he provisioned the instance.&lt;/p&gt;

&lt;p&gt;This is not rare. I review side projects and small SaaS stacks every week, and "zero working backups" is the single most common operational bug in the deployment of someone who shipped fast and learned to operate later. It's also the cheapest bug in the world to fix. You can wire up a working strategy in twenty minutes. The reason most solo founders don't is that the canonical advice (configure AWS RDS, set up S3 lifecycle policies, write a Lambda, install Datadog) is written for a team of four. So we're skipping all of that.&lt;/p&gt;

&lt;h2&gt;
  
  
  The lie we tell ourselves about "managed" databases
&lt;/h2&gt;

&lt;p&gt;Almost every managed Postgres provider has a "backups" tab. Render does. Supabase does. Railway does. Neon does. AWS RDS does. People glance at this tab once, see the word "automatic," and assume the problem is solved.&lt;/p&gt;

&lt;p&gt;It is not solved, for three reasons.&lt;/p&gt;

&lt;p&gt;First, the default retention window on free and hobby tiers is much shorter than founders think. Render's hobby Postgres retains daily snapshots, but only for a few days, and the snapshots stop the moment you exceed plan limits. Supabase free retention covers a single day on the current tier. Neon has branching but no automatic point-in-time recovery on the free plan past 24 hours. None of this is hidden, but I have not yet met an indie founder who has read the fine print on the plan they signed up for in 2024.&lt;/p&gt;

&lt;p&gt;Second, "managed" backups are usually stored on the same vendor as your live database. If your account is suspended (billing failure, terms-of-service trigger, a misunderstanding on a 2am support ticket), your backups vanish with the rest of the instance. I have watched this happen twice in two years. Both founders had paying customers. Both lost data they would have paid me five figures to recover. There was nothing to recover.&lt;/p&gt;

&lt;p&gt;Third, the restore path is almost never tested. A snapshot you cannot restore from in under fifteen minutes is not a backup. It is hope.&lt;/p&gt;

&lt;p&gt;Working backups for a solo founder satisfy three properties: they run automatically, they live somewhere your primary vendor cannot touch, and you have personally restored from them at least once.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a real backup strategy looks like for a 1-person SaaS
&lt;/h2&gt;

&lt;p&gt;You want three things in place. None of them require a DevOps hire.&lt;/p&gt;

&lt;p&gt;A nightly logical dump of your production database, pushed to an external store. "Logical" means &lt;code&gt;pg_dump&lt;/code&gt;, not a filesystem snapshot. Logical dumps are slower than snapshots, but they are portable: you can restore them onto any Postgres of the same or a newer major version, on any provider, from any laptop. For a SaaS with a database under 10 GB (which is most indie SaaS in their first two years), this is the right primitive.&lt;/p&gt;

&lt;p&gt;A retention policy of at least thirty days, daily. For most products, the urgent question is not "what was the data five minutes ago" (that's what your live database is for). The question is "what did the data look like before the migration I shipped on Tuesday that quietly nuked the &lt;code&gt;users.timezone&lt;/code&gt; column." Thirty daily snapshots cover nearly every realistic incident I've seen in fifteen years. If you need point-in-time recovery within seconds, you are past the threshold of this article and you should pay an ops person.&lt;/p&gt;

&lt;p&gt;A restore drill, run once, written down. The single most useful operational habit I've ever developed is restoring a backup to a scratch database, running a quick query against it, and timing how long the whole thing took. The first time I did this on a project at koodos labs back in NYC, the restore "worked" but the encoding settings on the receiving instance differed enough that a couple of emoji-heavy columns came back mangled. Better to find that out on a Sunday afternoon than during an incident.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 4-line cron job that gets you 80% there
&lt;/h2&gt;

&lt;p&gt;Here is the smallest setup that satisfies all three properties. Drop it on any host with cron and &lt;code&gt;pg_dump&lt;/code&gt; available, plus credentials to write to one external object store (Backblaze B2, Cloudflare R2, Wasabi, or AWS S3 if you must).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;TS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; +%Y%m%d-%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/tmp/backup-&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;

pg_dump &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DATABASE_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;plain &lt;span class="nt"&gt;--no-owner&lt;/span&gt; &lt;span class="nt"&gt;--no-privileges&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="nt"&gt;-9&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

rclone copyto &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"b2:my-bucket/postgres/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;

&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



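&lt;p&gt;Four substantive lines. Put it in &lt;code&gt;/etc/cron.daily&lt;/code&gt; (or your platform's scheduled-jobs feature), make it executable, set &lt;code&gt;DATABASE_URL&lt;/code&gt; and the &lt;code&gt;rclone&lt;/code&gt; config in env vars, and you have a daily off-vendor backup. Cron starts jobs with an almost empty environment, so export those variables at the top of the script or source them from a file. On Debian and Ubuntu, &lt;code&gt;run-parts&lt;/code&gt; also skips file names that contain a dot, so call it &lt;code&gt;pg-backup&lt;/code&gt; rather than &lt;code&gt;pg-backup.sh&lt;/code&gt;. &lt;code&gt;rclone&lt;/code&gt; is one binary, no dependencies, and it talks to virtually every cloud storage provider with the same syntax.&lt;/p&gt;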

&lt;p&gt;For retention, give the bucket a lifecycle rule: keep objects for 30 days, then delete. Cloudflare R2 and Backblaze B2 both have these in their UI under "Bucket settings." You don't need to write code for the rotation, just configure it once.&lt;/p&gt;

&lt;p&gt;What this setup does not do: it does not protect you against a backup that is silently broken (a dump that says "succeeded" but is missing tables because of a permission issue, or a &lt;code&gt;gzip&lt;/code&gt; that truncated because the disk filled up). The simplest defense is a size sanity check. If today's compressed dump is dramatically smaller than yesterday's, something is wrong, and you want a Telegram message before you find out the hard way.&lt;/p&gt;

&lt;p&gt;Here is the version I actually use, which adds that check.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

&lt;span class="nv"&gt;TS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; +%Y%m%d-%H%M%S&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/tmp/backup-&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;
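&lt;span class="nv"&gt;LAST_SIZE_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/var/lib/backups/last-size"&lt;/span&gt;
&lt;span class="c"&gt;# Create the state directory on first run so the size write below cannot abort the script before the upload&lt;/span&gt;
&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;dirname&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LAST_SIZE_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;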

pg_dump &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DATABASE_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;plain &lt;span class="nt"&gt;--no-owner&lt;/span&gt; &lt;span class="nt"&gt;--no-privileges&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;gzip&lt;/span&gt; &lt;span class="nt"&gt;-9&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="nv"&gt;SIZE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;stat&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt;%s &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;LAST&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LAST_SIZE_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SIZE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;RATIO&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SIZE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="nv"&gt;b&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LAST&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s1"&gt;'BEGIN{print a/b}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="s2"&gt;"BEGIN {exit !(&lt;/span&gt;&lt;span class="nv"&gt;$RATIO&lt;/span&gt;&lt;span class="s2"&gt; &amp;lt; 0.8)}"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"https://api.telegram.org/bot&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TG_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/sendMessage"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;chat_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TG_CHAT&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nv"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"Backup shrank to &lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;RATIO&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;x of yesterday. Investigate."&lt;/span&gt;
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SIZE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LAST_SIZE_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
rclone copyto &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"b2:my-bucket/postgres/&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TS&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.sql.gz"&lt;/span&gt;
&lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you don't have Telegram alerting wired up yet, see article #4 in this series. The Telegram piece takes five minutes and is the thing I'd put on every project before I'd install Sentry.&lt;/p&gt;

&lt;p&gt;A note for the Postgres pedants: &lt;code&gt;--format=plain&lt;/code&gt; is intentional. Custom format (&lt;code&gt;-Fc&lt;/code&gt;) is faster and smaller, and &lt;code&gt;pg_restore&lt;/code&gt; is more flexible against it, but plain SQL is human-readable. I have personally done a partial restore by opening a backup in &lt;code&gt;vim&lt;/code&gt; and copying out the rows I needed. You will not regret choosing plain text the first time you need it.&lt;/p&gt;
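&lt;p&gt;For that partial-restore case, here is a minimal sketch of pulling one table's rows out of a plain-format dump without opening an editor. It assumes the dump uses the default &lt;code&gt;COPY&lt;/code&gt; output (not &lt;code&gt;--inserts&lt;/code&gt;). The helper name &lt;code&gt;extract_table&lt;/code&gt; and the table names are illustrative, not part of any tool.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: print one table's COPY block from a gzipped plain-format pg_dump.
# extract_table is a hypothetical helper name.
# Usage: extract_table /tmp/restore.sql.gz public.users
extract_table() {
  local dump="$1" table="$2"
  # Plain dumps store rows as "COPY schema.table (cols) FROM stdin;",
  # then tab-separated rows, then a terminating "\." line.
  gzip -dc "$dump" | awk -v t="$table" '
    index($0, "COPY " t " ") == 1 { printing = 1 }
    printing { print; if ($0 == "\\.") printing = 0 }
  '
}
```

&lt;p&gt;The output is itself valid input for &lt;code&gt;psql&lt;/code&gt;, so piping it into a scratch database that already has the table definition should load just those rows.&lt;/p&gt;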

&lt;h2&gt;
  
  
  Restoring is the test you actually need to run
&lt;/h2&gt;

&lt;p&gt;This is the step almost everyone skips. It is the step that turns "I have backups" into "I have working backups." They are not the same thing.&lt;/p&gt;

&lt;p&gt;Once a quarter, do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Pull yesterday's backup down&lt;/span&gt;
rclone copyto b2:my-bucket/postgres/&amp;lt;yesterday&amp;gt;.sql.gz /tmp/restore.sql.gz

&lt;span class="c"&gt;# 2. Spin up a scratch Postgres locally&lt;/span&gt;
docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; scratch-pg &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-p&lt;/span&gt; 5433:5432 &lt;span class="se"&gt;\&lt;/span&gt;
  postgres:16

&lt;span class="c"&gt;# 3. Restore&lt;/span&gt;
&lt;span class="nb"&gt;gunzip&lt;/span&gt; &lt;span class="nt"&gt;-c&lt;/span&gt; /tmp/restore.sql.gz | psql &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-p&lt;/span&gt; 5433 &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-d&lt;/span&gt; postgres

&lt;span class="c"&gt;# 4. Sanity-check a row count&lt;/span&gt;
psql &lt;span class="nt"&gt;-h&lt;/span&gt; localhost &lt;span class="nt"&gt;-p&lt;/span&gt; 5433 &lt;span class="nt"&gt;-U&lt;/span&gt; postgres &lt;span class="nt"&gt;-d&lt;/span&gt; postgres &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"select count(*) from users;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Time it. Write the elapsed minutes down. The number you want in your head is your recovery time: how long from "database is gone" to "product is back up." For most indie SaaS the answer should be under an hour, and most of that should be data transfer. If it takes you four hours to figure out how to restore, your backups are doing less for you than you think.&lt;/p&gt;

&lt;p&gt;The first time you run this drill, you will hit one of these problems. The restore fails because the dump expects something the scratch instance doesn't have: an extension such as &lt;code&gt;pgvector&lt;/code&gt; or &lt;code&gt;pg_trgm&lt;/code&gt; that isn't installed there, or a schema outside &lt;code&gt;public&lt;/code&gt; that you forgot existed. The Postgres major versions are incompatible because you upgraded the live instance and forgot the dump tooling. The &lt;code&gt;gunzip&lt;/code&gt; fails on a corrupted file because last night's upload timed out and you only stored the truncated piece. The role definitions clash because you used &lt;code&gt;--no-owner&lt;/code&gt; but a function depends on a specific role.&lt;/p&gt;

&lt;p&gt;Every one of those is easier to debug on a Sunday afternoon than at 3am during an incident.&lt;/p&gt;
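&lt;p&gt;Two of those failures, the truncated archive and the half-written dump, can be caught mechanically. A minimal sketch, assuming a gzipped plain-format dump: &lt;code&gt;verify_dump&lt;/code&gt; is a hypothetical helper name, and the trailer text it looks for is the comment that &lt;code&gt;pg_dump&lt;/code&gt; writes at the end of a plain-format dump, so check it against the output of your own version.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: succeed only if a gzipped plain-format dump looks complete.
# verify_dump is a hypothetical helper name.
verify_dump() {
  local file="$1"
  # gzip -t tests the archive and fails if it is truncated or corrupt.
  gzip -tq "$file" || return 1
  # A finished plain-format dump ends with this trailer comment.
  gzip -dc "$file" | tail -n 10 | grep -q 'PostgreSQL database dump complete'
}
```

&lt;p&gt;In a script that runs under &lt;code&gt;set -e&lt;/code&gt;, a bare &lt;code&gt;verify_dump "$FILE"&lt;/code&gt; placed before the upload stops the job instead of shipping a broken file. Running it again on the file you download during the drill covers the upload leg as well.&lt;/p&gt;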

&lt;h2&gt;
  
  
  The three mistakes I see every single week
&lt;/h2&gt;

&lt;p&gt;The first is using only the vendor's built-in backups. We covered this above. If your provider suspends your account, your billing card expires, or the region has a bad day, the backups go with the database. Off-vendor storage is not optional.&lt;/p&gt;

&lt;p&gt;The second is backing up only the database. Indie SaaS often has uploaded user files (avatars, generated PDFs, CSV exports, AI-generated images) sitting on the same disk as the app, or in the vendor's local volume. If you are already using S3-compatible object storage for uploads, you are fine: those buckets have their own durability and you can mirror them with &lt;code&gt;rclone sync&lt;/code&gt; on the same schedule as your DB. If you are storing uploads on the dyno's local disk, you have unbacked-up state, and the day the dyno is recycled you discover this. Move that to object storage first.&lt;/p&gt;

&lt;p&gt;The third is the one I want to spend a paragraph on, because it bites people who think they did everything right. If your app stores PII encrypted at the column level (which it should, especially under GDPR), the database dump is useless without the encryption key. The key lives in an env var or a secrets manager. Back that up too. Store it separately, in a password manager or a dedicated secrets vault, and write down the recovery procedure. I lost an afternoon to this exact configuration drift on a client project a few years ago. The database came back fine and we still couldn't read half the columns.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;I run HostingGuru because I spent fifteen years (Oney, BeReal, Ringover, koodos labs, agency contracts) writing variations of the cron above for every new project. At some point I got tired of rebuilding the same scaffolding and shipped a managed PaaS where daily off-vendor Postgres backups, encrypted env vars, and AI-driven Telegram alerts (including the "your backup shrank" pattern from above) are part of the default experience. EU and US data centers, GDPR, ISO 27001, the routine. The free tier doesn't sleep. That's the only mention you'll get. Everything in this article works on Render, Railway, Fly.io, Supabase, a raw VPS, or your own Kubernetes cluster.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do tonight regardless of which platform you use
&lt;/h2&gt;

&lt;p&gt;Five steps, in order. Block off an hour.&lt;/p&gt;

&lt;p&gt;Step one. Pull your live database with &lt;code&gt;pg_dump&lt;/code&gt; from your laptop right now. Time it. If you can do it at all, you have a baseline. If you cannot (credentials are wrong, network rules block you, you've forgotten the password to the live DB role), fix that first. You need this skill to exist before you automate anything.&lt;/p&gt;

&lt;p&gt;Step two. Create a bucket on Backblaze B2 or Cloudflare R2. Both have free tiers that cover a 10 GB SaaS for years. Generate an access key, store it in your password manager, and verify you can &lt;code&gt;rclone copyto&lt;/code&gt; a test file into it.&lt;/p&gt;

&lt;p&gt;Step three. Wire up the cron script from this article. Put it on whatever scheduler your platform exposes (Render scheduled jobs, Fly cron, GitHub Actions on a &lt;code&gt;schedule:&lt;/code&gt;, your PaaS's on-demand script feature, or a &lt;code&gt;/etc/cron.daily&lt;/code&gt; entry on a VPS). Run it once manually. Confirm the file lands in the bucket.&lt;/p&gt;

&lt;p&gt;Step four. Set a lifecycle rule on the bucket: retain for 30 days, delete after. This step takes 90 seconds in the B2 or R2 UI. Without it, you'll pay for storage forever and the bucket will become a haystack.&lt;/p&gt;

&lt;p&gt;Step five. Block off ninety minutes next weekend for the restore drill. Restore yesterday's dump to a scratch Postgres, run a row-count query, write down the elapsed time. That number is your measured recovery time, and it is what any recovery time objective you promise has to be based on. Now you can answer the customer who asks "what happens if you lose my data" without lying.&lt;/p&gt;

&lt;p&gt;If you only do step one tonight, you've already moved the needle. Most founders haven't.&lt;/p&gt;

&lt;h2&gt;
  
  
  One question
&lt;/h2&gt;

&lt;p&gt;I'd be curious to hear from anyone reading: have you ever actually restored from a backup in production, not as a drill but because something went wrong? What broke in the restore that you didn't expect? The interesting failure modes are not in the docs, and I'd love to read them in the comments.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Previous posts in this series:&lt;/em&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Heroku just went into "sustaining engineering mode." Here are 5 alternatives whose free tier actually doesn't sleep.&lt;/li&gt;
&lt;li&gt;I built my MVP with Claude Code. Now I need to deploy it. Here's what nobody tells you.&lt;/li&gt;
&lt;li&gt;Your AI app is silently burning $2,000/month and you don't know it. Here are the 5 patterns that bite founders.&lt;/li&gt;
&lt;li&gt;Telegram alerts for any production app: a 5-minute setup (no SaaS, no signup, just curl)&lt;/li&gt;
&lt;li&gt;How I built a Discord 'ship-tracker' bot in a weekend (and the 3-process architecture that keeps it alive 24/7)&lt;/li&gt;
&lt;li&gt;I migrated 12 client projects off Heroku. Here's the playbook (and the 7 things that bit me every single time).&lt;/li&gt;
&lt;li&gt;The Claude Code → production checklist: 15 things that aren't obvious until they bite you&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>postgres</category>
      <category>devops</category>
      <category>indiehackers</category>
      <category>beginners</category>
    </item>
    <item>
      <title>The Claude Code production checklist: 15 things that aren't obvious until they bite you</title>
      <dc:creator>Chalom Ellezam</dc:creator>
      <pubDate>Tue, 12 May 2026 12:14:16 +0000</pubDate>
      <link>https://dev.to/chalom_ellezam_5989bce65e/the-claude-code-production-checklist-15-things-that-arent-obvious-until-they-bite-you-3p7n</link>
      <guid>https://dev.to/chalom_ellezam_5989bce65e/the-claude-code-production-checklist-15-things-that-arent-obvious-until-they-bite-you-3p7n</guid>
      <description>&lt;p&gt;&lt;em&gt;Disclosure: I'm a senior backend tech lead and I run HostingGuru. This list applies to any platform; HostingGuru happens to handle a few of these for you automatically, which I'll flag honestly when relevant.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;I've helped about a dozen non-technical founders take their first Claude Code MVP from &lt;code&gt;localhost:3000&lt;/code&gt; to a real production URL in the last six months. They are &lt;em&gt;much&lt;/em&gt; better at shipping than the same founders would have been two years ago.&lt;/p&gt;

&lt;p&gt;But the same 15 things keep biting them in the first two weeks after launch. Almost none of these are about the code being wrong. They are about production being a different environment with its own rules, rules nobody warned anyone about because everyone assumed you already knew them.&lt;/p&gt;

&lt;p&gt;This is the checklist I now send to every founder before they go live. If you went live in the last 30 days and didn't go through it, you almost certainly have at least four of these issues right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Your &lt;code&gt;.env&lt;/code&gt; file is in your git history
&lt;/h2&gt;

&lt;p&gt;Even if your current code has &lt;code&gt;.env&lt;/code&gt; in &lt;code&gt;.gitignore&lt;/code&gt;, check git history:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git log &lt;span class="nt"&gt;--all&lt;/span&gt; &lt;span class="nt"&gt;--full-history&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; .env
git log &lt;span class="nt"&gt;--all&lt;/span&gt; &lt;span class="nt"&gt;--full-history&lt;/span&gt; &lt;span class="nt"&gt;--&lt;/span&gt; .env.local
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If anything shows up, your API keys were exposed at some point. They are still exposed, because git history is forever. &lt;strong&gt;Rotate every key in that file.&lt;/strong&gt; Today. Yes, even if the repo is private. Future you who makes the repo public for an open-source moment will thank present you.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. You have one set of API keys for dev and prod
&lt;/h2&gt;

&lt;p&gt;Same &lt;code&gt;OPENAI_API_KEY&lt;/code&gt;, same &lt;code&gt;STRIPE_SECRET_KEY&lt;/code&gt;, same &lt;code&gt;SENDGRID_API_KEY&lt;/code&gt; in your laptop and on the production server. The first time you accidentally run a test script that fires 500 emails or charges 200 cards, you'll wish you had a &lt;code&gt;*_DEV&lt;/code&gt; and a &lt;code&gt;*_PROD&lt;/code&gt;. Make separate keys per environment. Today.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Your Stripe webhook is unsigned
&lt;/h2&gt;

&lt;p&gt;When Stripe POSTs to &lt;code&gt;/api/webhooks/stripe&lt;/code&gt;, you should verify the signature header before trusting the payload. If your code just reads &lt;code&gt;req.body.amount&lt;/code&gt; and credits the user's account, anyone on the internet can hit that URL with fake events and give themselves credits.&lt;/p&gt;

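&lt;p&gt;The fix is three lines. &lt;code&gt;constructEvent&lt;/code&gt; needs the raw, unparsed request body and throws if the signature does not match, so catch the error and answer with a 400:&lt;br&gt;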
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;stripe-signature&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;stripe&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;webhooks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;constructEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rawBody&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;STRIPE_WEBHOOK_SECRET&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Now use event.type, event.data — verified.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Required reading: &lt;a href="https://stripe.com/docs/webhooks/signatures" rel="noopener noreferrer"&gt;Stripe's webhook signature docs&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. You're using Stripe test keys in production
&lt;/h2&gt;

&lt;p&gt;Your &lt;code&gt;STRIPE_SECRET_KEY&lt;/code&gt; starts with &lt;code&gt;sk_test_...&lt;/code&gt; instead of &lt;code&gt;sk_live_...&lt;/code&gt;. Real payments hit Stripe's test environment, which... doesn't charge anybody. You launch, you celebrate the first sale, three days later you realize Stripe has collected $0 for you.&lt;/p&gt;

&lt;p&gt;Same with &lt;code&gt;STRIPE_PUBLISHABLE_KEY&lt;/code&gt; and the frontend &lt;code&gt;pk_test_*&lt;/code&gt; vs &lt;code&gt;pk_live_*&lt;/code&gt;. Match them. Double-check after every deploy in the first week.&lt;/p&gt;
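&lt;p&gt;That double-check can be scripted. A minimal sketch of a pre-deploy guard that looks only at the key prefixes named above: the function names &lt;code&gt;stripe_key_mode&lt;/code&gt; and &lt;code&gt;check_stripe_keys&lt;/code&gt; are hypothetical, and it assumes both keys are exposed as environment variables under the names used in this item.&lt;/p&gt;

```shell
#!/usr/bin/env bash
# Sketch: classify a Stripe key as live or test by its prefix.
# stripe_key_mode and check_stripe_keys are hypothetical helper names.
stripe_key_mode() {
  case "$1" in
    sk_live_*|pk_live_*) echo live ;;
    sk_test_*|pk_test_*) echo test ;;
    *) echo unknown ;;
  esac
}

# Return non-zero unless both the secret and publishable keys are live keys.
check_stripe_keys() {
  local secret_mode publishable_mode
  secret_mode=$(stripe_key_mode "${STRIPE_SECRET_KEY:-}")
  publishable_mode=$(stripe_key_mode "${STRIPE_PUBLISHABLE_KEY:-}")
  if [ "$secret_mode" != live ] || [ "$publishable_mode" != live ]; then
    echo "Stripe keys: secret=$secret_mode publishable=$publishable_mode"
    return 1
  fi
}
```

&lt;p&gt;Calling &lt;code&gt;check_stripe_keys&lt;/code&gt; as a step in the production deploy script, and failing the deploy when it returns non-zero, turns the manual check into something that runs every time.&lt;/p&gt;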

&lt;h2&gt;
  
  
  5. Your database has no backup strategy
&lt;/h2&gt;

&lt;p&gt;"I'll set up backups later" is a sentence I have heard about 100 times. Approximately zero of those people set up backups later. Then someone (often Claude) runs a migration that drops a table, and the conversation is over.&lt;/p&gt;

&lt;p&gt;Most managed databases (Supabase, Neon, managed Postgres on any PaaS) have automatic daily backups built in but &lt;strong&gt;only if you turn them on&lt;/strong&gt;. Click around your database dashboard now. If you don't see "Backups enabled," fix it before reading item 6.&lt;/p&gt;

&lt;p&gt;For self-hosted: &lt;code&gt;pg_dump&lt;/code&gt; to S3 / Cloudflare R2 nightly via a cron. Test the restore. Once. The 5 minutes you spend testing is the difference between "we recovered" and "we lost everything."&lt;/p&gt;

&lt;h2&gt;
  
  
  6. You have no rate limiting on AI endpoints
&lt;/h2&gt;

&lt;p&gt;You have a &lt;code&gt;/api/chat&lt;/code&gt; route that calls OpenAI. Someone (a scraper, a bored teen, your competitor) discovers it and hits it in a &lt;code&gt;for&lt;/code&gt; loop. By the time you notice, your OpenAI bill is up by $400 and the abuser has stopped.&lt;/p&gt;

&lt;p&gt;Even a stupid rate limit is much better than no rate limit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Crude but works&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ipHits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ip&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ipHits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;recentHits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;hits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;recentHits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;slow down&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;recentHits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;now&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;ipHits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ip&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;recentHits&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// ... your real handler&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



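&lt;p&gt;10 calls/minute per IP. Most legitimate users won't hit it. Most abusers will. If you run behind a reverse proxy or load balancer, configure Express's &lt;code&gt;trust proxy&lt;/code&gt; setting first, otherwise &lt;code&gt;req.ip&lt;/code&gt; is the proxy's address and every user shares one bucket. For real production, use a library (express-rate-limit for Express, slowapi for FastAPI, etc.) with Redis-backed counters.&lt;/p&gt;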
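&lt;p&gt;One caveat with the crude version above: the &lt;code&gt;Map&lt;/code&gt; never forgets an IP, so it grows for as long as the process lives. Here is a framework-free sketch of the same sliding window with pruning. The names &lt;code&gt;createRateLimiter&lt;/code&gt;, &lt;code&gt;allow&lt;/code&gt; and &lt;code&gt;sweep&lt;/code&gt; are mine, not from any library.&lt;/p&gt;

```javascript
// Sliding-window limiter, same idea as the crude version, with stale entries pruned.
function createRateLimiter({ limit, windowMs }) {
  const hitsByKey = new Map();

  // Returns true if the request may proceed, false if it should get a 429.
  function allow(key, now = Date.now()) {
    const recent = (hitsByKey.get(key) || []).filter((t) => windowMs > now - t);
    if (recent.length >= limit) {
      hitsByKey.set(key, recent);
      return false;
    }
    recent.push(now);
    hitsByKey.set(key, recent);
    return true;
  }

  // Drops keys with no hits inside the window and returns how many keys remain.
  function sweep(now = Date.now()) {
    for (const [key, hits] of hitsByKey) {
      if (hits.every((t) => now - t >= windowMs)) hitsByKey.delete(key);
    }
    return hitsByKey.size;
  }

  return { allow, sweep };
}
```

&lt;p&gt;In the handler you would return the 429 whenever &lt;code&gt;allow(req.ip)&lt;/code&gt; is false, and call &lt;code&gt;sweep()&lt;/code&gt; from a &lt;code&gt;setInterval&lt;/code&gt; every minute or so. It is still per-process memory, so it resets on deploy and doesn't share state across instances.&lt;/p&gt;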

&lt;h2&gt;
  
  
  7. CORS is wide open
&lt;/h2&gt;

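&lt;p&gt;You have &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt; in your headers or &lt;code&gt;cors({ origin: '*' })&lt;/code&gt; in your Express setup. For a public read-only API, fine. For anything with auth, it means any random website can call your API from a visitor's browser and read the response. Browsers won't let a page read a cookie-bearing response when the allowed origin is a wildcard, so the usual next "fix" is to reflect whatever origin arrives and enable credentials, and at that point any site can make authenticated requests as your logged-in users.&lt;/p&gt;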

&lt;p&gt;Set &lt;code&gt;origin&lt;/code&gt; to your specific frontend domain. If you need both &lt;code&gt;https://yourapp.com&lt;/code&gt; and &lt;code&gt;https://www.yourapp.com&lt;/code&gt;, use an allowlist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;allowed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://yourapp.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://www.yourapp.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;cors&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;cb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;origin&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  8. You haven't set up error tracking
&lt;/h2&gt;

&lt;p&gt;Sentry takes 25 minutes to install. Until you have it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your users find bugs before you do&lt;/li&gt;
&lt;li&gt;You spend hours guessing what broke from screenshots&lt;/li&gt;
&lt;li&gt;You miss the bugs that don't generate user complaints (and there are a lot of those)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Install it tonight. Free tier covers 5K errors/month, plenty for any startup under 10K users.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--save&lt;/span&gt; @sentry/nextjs   &lt;span class="c"&gt;# or @sentry/node, @sentry/react, etc. (Python: pip install sentry-sdk)&lt;/span&gt;
npx @sentry/wizard &lt;span class="nt"&gt;-i&lt;/span&gt; nextjs        &lt;span class="c"&gt;# runs a guided setup&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  9. Your app ships source maps to production
&lt;/h2&gt;

&lt;p&gt;Source maps make stack traces readable, but if they're served publicly, anyone can open Chrome DevTools and read your original (TypeScript / unminified) code. This includes your API logic, your prompts to OpenAI, your business rules.&lt;/p&gt;

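&lt;p&gt;For Next.js, browser source maps are off by default in production builds, so mostly make sure nobody has flipped this to &lt;code&gt;true&lt;/code&gt;:&lt;br&gt;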
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// next.config.js&lt;/span&gt;
&lt;span class="nx"&gt;module&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exports&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;productionBrowserSourceMaps&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Upload source maps to Sentry instead (so YOU can debug stack traces) and exclude them from the public bundle.&lt;/p&gt;

&lt;h2&gt;
  
  
  10. You have no &lt;code&gt;/healthz&lt;/code&gt; endpoint
&lt;/h2&gt;

&lt;p&gt;Most hosting platforms periodically ping a health endpoint to know if your app is alive. If you don't have one, the platform pings your homepage, which loads your full app stack including AI calls, which is slow and expensive.&lt;/p&gt;

&lt;p&gt;Add one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/healthz&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure your hosting platform's health check to point at &lt;code&gt;/healthz&lt;/code&gt;. Cheap, fast, useful.&lt;/p&gt;

&lt;h2&gt;
  
  
  11. Your DNS TTL is too high
&lt;/h2&gt;

&lt;p&gt;Most domain registrars default to a 24-hour or 4-hour TTL (Time To Live) on DNS records. This means when you change your domain to point at a new host, browsers and ISPs cache the old DNS for up to 24 hours.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before any DNS migration&lt;/strong&gt;, log into your registrar or DNS provider (Namecheap, OVH, Cloudflare) and set TTL to &lt;strong&gt;300 seconds&lt;/strong&gt; (5 minutes). Wait out the &lt;em&gt;old&lt;/em&gt; TTL (up to 24 hours) so every cache has picked up the shorter value. Then make your changes. Propagation will take minutes, not hours.&lt;/p&gt;

&lt;p&gt;Do this &lt;em&gt;before&lt;/em&gt; you need to migrate, not the day of.&lt;/p&gt;
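&lt;p&gt;The waiting rule is just arithmetic, but it is easy to get wrong under pressure, so here it is as a sketch. &lt;code&gt;currentTtlSeconds&lt;/code&gt; and &lt;code&gt;earliestCutover&lt;/code&gt; are hypothetical helpers. Note that a caching resolver reports the &lt;em&gt;remaining&lt;/em&gt; TTL, so the lookup is a rough check rather than the configured value.&lt;/p&gt;

```javascript
const dns = require('node:dns').promises;

// Looks up the TTL (in seconds) your resolver currently reports for a hostname's A records.
async function currentTtlSeconds(hostname) {
  const records = await dns.resolve4(hostname, { ttl: true });
  return Math.max(...records.map((r) => r.ttl));
}

// Earliest safe cutover: the moment you lowered the TTL plus the OLD TTL,
// because caches may keep serving the old record for that long.
function earliestCutover(loweredAt, oldTtlSeconds) {
  return new Date(loweredAt.getTime() + oldTtlSeconds * 1000);
}
```

&lt;p&gt;With a 24-hour TTL lowered on Thursday at 08:00 UTC, &lt;code&gt;earliestCutover&lt;/code&gt; gives Friday at 08:00 UTC. Migrating any earlier means some visitors can still be pinned to the old host for hours.&lt;/p&gt;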

&lt;h2&gt;
  
  
  12. Your runtime version isn't pinned
&lt;/h2&gt;

&lt;p&gt;Claude Code generates code targeting whatever Node / Python / Ruby version it's currently aware of (often the latest). Your hosting platform might run an older default. Result: subtle bugs that work on your laptop and break in prod.&lt;/p&gt;

&lt;p&gt;For Node, in your &lt;code&gt;package.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"engines"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"20.x"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Python, create &lt;code&gt;runtime.txt&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python-3.11.7
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Ruby, &lt;code&gt;.ruby-version&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;3.2.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pin once. Forget about it. Your future self never debugs "works on my laptop" again.&lt;/p&gt;
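&lt;p&gt;One gap worth knowing: npm only warns about an &lt;code&gt;engines&lt;/code&gt; mismatch unless &lt;code&gt;engine-strict&lt;/code&gt; is set, so the pin can be ignored without anyone noticing. A small startup guard closes that gap. This is a sketch with hypothetical names (&lt;code&gt;matchesPinnedMajor&lt;/code&gt;, &lt;code&gt;assertRuntime&lt;/code&gt;) that only compares major versions.&lt;/p&gt;

```javascript
// Compares the running Node major version with a pin such as "20.x" or "20".
function matchesPinnedMajor(runningVersion, pinned) {
  const runningMajor = runningVersion.replace(/^v/, '').split('.')[0];
  const pinnedMajor = pinned.split('.')[0];
  return runningMajor === pinnedMajor;
}

// Throws at boot if package.json has no pin or the runtime doesn't match it.
function assertRuntime(pkg, runningVersion = process.version) {
  const pinned = pkg.engines ? pkg.engines.node : undefined;
  if (!pinned) throw new Error('package.json has no engines.node pin');
  if (!matchesPinnedMajor(runningVersion, pinned)) {
    throw new Error(`Node ${runningVersion} does not match pinned ${pinned}`);
  }
  return true;
}
```

&lt;p&gt;Calling &lt;code&gt;assertRuntime(require('./package.json'))&lt;/code&gt; as the first line of your entry point turns a silent version drift into an immediate, readable crash.&lt;/p&gt;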

&lt;h2&gt;
  
  
  13. Server secrets are in your frontend bundle
&lt;/h2&gt;

&lt;p&gt;In Next.js, environment variables prefixed with &lt;code&gt;NEXT_PUBLIC_&lt;/code&gt; are sent to the browser. If you accidentally name your server secret &lt;code&gt;NEXT_PUBLIC_STRIPE_SECRET_KEY&lt;/code&gt;, you have just published it to every user's Chrome.&lt;/p&gt;

&lt;p&gt;Rule:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;NEXT_PUBLIC_*&lt;/code&gt; → safe for the browser (analytics IDs, Stripe &lt;strong&gt;publishable&lt;/strong&gt; keys, feature flags)&lt;/li&gt;
&lt;li&gt;Any actual &lt;em&gt;secret&lt;/em&gt; → no prefix. Server-only.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same idea in Vite (&lt;code&gt;VITE_*&lt;/code&gt;), Create React App (&lt;code&gt;REACT_APP_*&lt;/code&gt;), Astro, etc. &lt;strong&gt;Audit your &lt;code&gt;.env&lt;/code&gt;&lt;/strong&gt; for any "PUBLIC" variable that shouldn't be.&lt;/p&gt;
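&lt;p&gt;The audit is easy to script. The sketch below is a heuristic, not a guarantee: &lt;code&gt;findLeakyPublicVars&lt;/code&gt; is a hypothetical helper, the name patterns are my guesses at what a secret looks like, and it will flag the occasional token that is meant to be public.&lt;/p&gt;

```javascript
// Prefixes that common frameworks expose to the browser bundle.
const PUBLIC_PREFIX = /^(NEXT_PUBLIC_|VITE_|REACT_APP_|PUBLIC_)/;
// Heuristics only: names or values that look like server-side secrets.
const SECRET_NAME = /(SECRET|PRIVATE|PASSWORD|TOKEN)/i;
const SECRET_VALUE = /^(sk_live_|sk_test_)/;

// Returns the names of public-prefixed variables that look like secrets.
function findLeakyPublicVars(env) {
  return Object.entries(env)
    .filter(([name]) => PUBLIC_PREFIX.test(name))
    .filter(([name, value]) => SECRET_NAME.test(name) || SECRET_VALUE.test(String(value)))
    .map(([name]) => name)
    .sort();
}
```

&lt;p&gt;Run it against &lt;code&gt;process.env&lt;/code&gt; in CI and fail the build if it returns anything you haven't explicitly reviewed.&lt;/p&gt;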

&lt;h2&gt;
  
  
  14. You have no monitoring on cron jobs
&lt;/h2&gt;

&lt;p&gt;You set up a nightly job at &lt;code&gt;0 3 * * *&lt;/code&gt;. It worked for the first three days. It hasn't run in two weeks because of a &lt;code&gt;node_modules&lt;/code&gt; issue you didn't notice. You only realize when you check the database and see no new data.&lt;/p&gt;

&lt;p&gt;Two ways to know your crons are running:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Easy way&lt;/strong&gt;: every cron pings a service like &lt;a href="https://healthchecks.io" rel="noopener noreferrer"&gt;healthchecks.io&lt;/a&gt; (free for solo use). If the ping doesn't arrive within the expected window, it emails you.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;0 3 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /opt/myapp/nightly.sh &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; curl &lt;span class="nt"&gt;-fsS&lt;/span&gt; &lt;span class="nt"&gt;-m&lt;/span&gt; 10 https://hc-ping.com/&amp;lt;uuid&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
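&lt;p&gt;If the job itself is a Node script, the same ping can live in code, which also lets you report failures instead of just going quiet. A minimal sketch: &lt;code&gt;pingUrl&lt;/code&gt; and &lt;code&gt;runMonitored&lt;/code&gt; are hypothetical names, and the &lt;code&gt;ping&lt;/code&gt; function is injected so you can swap in whatever HTTP client you use.&lt;/p&gt;

```javascript
// Builds the Healthchecks.io ping URL; appending /fail reports a failed run.
function pingUrl(uuid, ok) {
  const base = `https://hc-ping.com/${uuid}`;
  return ok ? base : `${base}/fail`;
}

// Runs `job`, then reports the outcome through `ping` and returns true on success.
async function runMonitored(job, uuid, ping) {
  let ok = true;
  try {
    await job();
  } catch (err) {
    ok = false;
    console.error('cron job failed:', err);
  }
  await ping(pingUrl(uuid, ok));
  return ok;
}
```

&lt;p&gt;On Node 18+ the call would be something like &lt;code&gt;runMonitored(nightlyJob, uuid, (url) =&gt; fetch(url))&lt;/code&gt;. A failed run then alerts you right away rather than after the missed-ping grace period.&lt;/p&gt;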



&lt;p&gt;&lt;strong&gt;Harder way&lt;/strong&gt;: hosted platforms with built-in cron observability. HostingGuru does this via its AI monitoring layer (more on that at the end). Render and Railway expose cron logs but you have to remember to look.&lt;/p&gt;

&lt;h2&gt;
  
  
  15. Your app crashes silently on startup, restarts forever
&lt;/h2&gt;

&lt;p&gt;A crash on boot looks like this: your platform starts your container, your app throws an uncaught exception within 2 seconds, platform restarts, repeat. Externally, your domain just returns 503s.&lt;/p&gt;

&lt;p&gt;If you don't have a &lt;code&gt;process.on('uncaughtException')&lt;/code&gt; handler that logs to your error tracker AND alerts you on Telegram/Slack/email, this can go on for hours before you notice.&lt;/p&gt;

&lt;p&gt;Minimum viable setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;uncaughtException&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;FATAL:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// Optionally: send to Sentry, post to Telegram, etc.&lt;/span&gt;
  &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// let the platform restart cleanly&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then make sure your hosting platform's health check is configured (item 10), so the platform can flag the deploy as unhealthy instead of restarting it forever in silence.&lt;/p&gt;

&lt;h2&gt;
  
  
  The faster path through this list
&lt;/h2&gt;

&lt;p&gt;Some of these are handled for you on managed PaaS platforms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HostingGuru&lt;/strong&gt; (full disclosure: I build it): encrypted env vars (so item 13's blast radius is smaller if you misname), AI-monitored crash loops (so item 15 pings you on Telegram automatically), built-in cron job monitoring with the same Telegram alert (item 14). The list of &lt;em&gt;what you still need to do yourself&lt;/em&gt; remains long though: items 1–4 (key hygiene), 5 (backups), 6–7 (rate limiting + CORS), 9 (source maps), 11 (DNS), 12 (runtime pinning). The platform can't fix code-level decisions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Render / Railway / Fly.io&lt;/strong&gt;: similar pattern. Some items (env vars in dashboard, basic process restart logic) are handled. The code-level items remain yours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VPS + Coolify or Dokku&lt;/strong&gt;: you handle all 15 yourself. That's fine if you have the time and discipline. Most solo founders don't.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The right move is: pick a platform that handles a few of these so you can focus on the rest, then &lt;em&gt;actually go through the rest&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do tonight, in order
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Run &lt;code&gt;git log --all -- .env&lt;/code&gt; (item 1, 30 seconds). If anything appears, rotate keys immediately.&lt;/li&gt;
&lt;li&gt;Check your Stripe key prefix (item 4, 10 seconds): &lt;code&gt;echo "$STRIPE_SECRET_KEY" | head -c 8&lt;/code&gt;. It should print &lt;code&gt;sk_live_&lt;/code&gt; in prod, and nothing of the secret itself.&lt;/li&gt;
&lt;li&gt;Add a &lt;code&gt;/healthz&lt;/code&gt; endpoint (item 10, 2 minutes).&lt;/li&gt;
&lt;li&gt;Add an &lt;code&gt;uncaughtException&lt;/code&gt; handler (item 15, 5 minutes).&lt;/li&gt;
&lt;li&gt;Verify your database has backups enabled (item 5, 2 minutes in your DB dashboard).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's 10 minutes for the five most critical items. The other 10 you can do over the week.&lt;/p&gt;

&lt;p&gt;If you build with Claude Code, this list is also a useful prompt: &lt;em&gt;"Audit my repo against the following 15 items..."&lt;/em&gt; Claude will go through each one and tell you which apply, which don't, and give you a fix for the ones that do. It catches most of them. The audit takes maybe 15 minutes total.&lt;/p&gt;

&lt;p&gt;What's the most embarrassing thing you've shipped to production that this list would have caught? I'm collecting horror stories for v2.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Previous posts in this series:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;1. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/heroku-just-went-into-sustaining-engineering-mode-here-are-5-alternatives-whose-free-tier-58id"&gt;Heroku just went into "sustaining engineering mode." Here are 5 alternatives whose free tier actually doesn't sleep&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;2. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;I built my MVP with Claude Code. Now I need to deploy it. Here's what nobody tells you.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;3. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;Your AI app is silently burning $2,000/month and you don't know it.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;4. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;Telegram alerts for any production app — a 5-minute setup.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;5. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;How I built a Discord 'ship-tracker' bot in a weekend.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;6. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/i-migrated-12-client-projects-off-heroku-heres-the-playbook-and-the-7-things-that-bit-me-every-1j4j"&gt;I migrated 12 client projects off Heroku. Here's the playbook.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>beginners</category>
      <category>webdev</category>
    </item>
    <item>
      <title>I migrated 12 client projects off Heroku. Here's the playbook (and the 7 things that bit me every single time).</title>
      <dc:creator>Chalom Ellezam</dc:creator>
      <pubDate>Mon, 11 May 2026 13:18:26 +0000</pubDate>
      <link>https://dev.to/chalom_ellezam_5989bce65e/i-migrated-12-client-projects-off-heroku-heres-the-playbook-and-the-7-things-that-bit-me-every-1j4j</link>
      <guid>https://dev.to/chalom_ellezam_5989bce65e/i-migrated-12-client-projects-off-heroku-heres-the-playbook-and-the-7-things-that-bit-me-every-1j4j</guid>
      <description>&lt;p&gt;&lt;em&gt;Disclosure: I'm a senior backend tech lead and I run HostingGuru. Six of these 12 migrations landed on HostingGuru. The other six went to Render, Railway, Fly.io, or back to a VPS. The playbook below works regardless of destination — it's the Heroku-side problems I want you to skip.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Since Heroku announced "sustaining engineering mode" in February 2026, I've been the person clients call when they need to get off the platform without breaking production. As of this week I've done 12 migrations — Rails apps, Django apps, Node services, Python workers — for clients ranging from 200-MAU side projects to 80K-MAU consumer apps.&lt;/p&gt;

&lt;p&gt;If you're staring at your Heroku dashboard wondering when to leave, this is the post I wish someone had written before I started doing these.&lt;/p&gt;

&lt;p&gt;It's two parts: &lt;strong&gt;the playbook&lt;/strong&gt; (the order of operations that works), and &lt;strong&gt;the 7 things that bit me&lt;/strong&gt; (the stuff nobody warns you about until you hit it at 11pm on launch night).&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 1 — The playbook
&lt;/h2&gt;

&lt;p&gt;Every migration I've done follows the same order, step 0 through step 8. I've tried shuffling it. The shuffling always costs me. Just do it in this order.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 0: Decide the destination first
&lt;/h3&gt;

&lt;p&gt;Don't start migrating until you know where you're going. Each destination changes step 4 and step 7 significantly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Render&lt;/strong&gt;: closest to old Heroku ergonomics, web service sleeps on free tier, Postgres is solid&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Railway&lt;/strong&gt;: best DX for small projects, usage-based pricing surprises at scale, Postgres reliability has been variable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fly.io&lt;/strong&gt;: best if you need multi-region, requires a &lt;code&gt;fly.toml&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HostingGuru&lt;/strong&gt;: managed PaaS, EU + US, AI monitoring built in, predictable pricing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS / GCP&lt;/strong&gt;: only if you have a real DevOps person on the team&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VPS + Coolify/Dokku&lt;/strong&gt;: cheapest, you maintain the server&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the rest of this playbook I'll be platform-agnostic except where it matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Inventory what's actually running
&lt;/h3&gt;

&lt;p&gt;Before touching anything, list every dyno, every add-on, every config var, every scheduled job.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;heroku ps &lt;span class="nt"&gt;--app&lt;/span&gt; yourapp        &lt;span class="c"&gt;# web + worker dynos&lt;/span&gt;
heroku addons &lt;span class="nt"&gt;--app&lt;/span&gt; yourapp     &lt;span class="c"&gt;# databases, redis, monitoring, etc.&lt;/span&gt;
heroku config &lt;span class="nt"&gt;--app&lt;/span&gt; yourapp     &lt;span class="c"&gt;# env vars (don't paste this in Slack)&lt;/span&gt;
heroku features &lt;span class="nt"&gt;--app&lt;/span&gt; yourapp   &lt;span class="c"&gt;# any legacy/labs features still on&lt;/span&gt;
heroku addons:open scheduler &lt;span class="nt"&gt;--app&lt;/span&gt; yourapp  &lt;span class="c"&gt;# if using Heroku Scheduler: jobs are listed in its dashboard&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



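&lt;p&gt;Pipe these into a markdown file (&lt;code&gt;MIGRATION.md&lt;/code&gt;). You will refer to it at least six times. Redact the &lt;code&gt;heroku config&lt;/code&gt; values before committing it, or keep the file out of git: those are live secrets.&lt;/p&gt;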

&lt;h3&gt;
  
  
  Step 2: Provision the destination
&lt;/h3&gt;

&lt;p&gt;On the new platform, create everything that needs to exist &lt;em&gt;before&lt;/em&gt; the cutover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The web service (don't deploy real code yet — a placeholder is fine)&lt;/li&gt;
&lt;li&gt;Workers if you have them&lt;/li&gt;
&lt;li&gt;Database (always Postgres for me; we'll cover the dump-restore in step 5)&lt;/li&gt;
&lt;li&gt;Redis if you have Sidekiq/BullMQ&lt;/li&gt;
&lt;li&gt;Object storage if you use S3-equivalent&lt;/li&gt;
&lt;li&gt;Any external API webhooks (you'll need to update their target URLs in step 8)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Goal at end of step 2: every "thing" exists on the new side, empty.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Copy env vars carefully
&lt;/h3&gt;

&lt;p&gt;This is where every migration gets a paper cut. Heroku's &lt;code&gt;heroku config&lt;/code&gt; output looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;DATABASE_URL&lt;/span&gt;: &lt;span class="n"&gt;postgres&lt;/span&gt;://&lt;span class="n"&gt;user&lt;/span&gt;:&lt;span class="n"&gt;pass&lt;/span&gt;@&lt;span class="n"&gt;host&lt;/span&gt;:&lt;span class="m"&gt;5432&lt;/span&gt;/&lt;span class="n"&gt;db&lt;/span&gt;?&lt;span class="n"&gt;sslmode&lt;/span&gt;=&lt;span class="n"&gt;require&lt;/span&gt;
&lt;span class="n"&gt;REDIS_URL&lt;/span&gt;:    &lt;span class="n"&gt;redis&lt;/span&gt;://...
&lt;span class="n"&gt;RAILS_MASTER_KEY&lt;/span&gt;: ...
&lt;span class="n"&gt;STRIPE_SECRET_KEY&lt;/span&gt;: &lt;span class="n"&gt;sk_live_&lt;/span&gt;...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things to watch:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;DATABASE_URL&lt;/strong&gt; will need to change to the new DB's URL. Don't paste the Heroku one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch the trailing &lt;code&gt;?sslmode=require&lt;/code&gt;&lt;/strong&gt; — this is the #1 silent migration killer (more on this in Part 2).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Heroku auto-generates &lt;code&gt;PORT&lt;/code&gt; for you&lt;/strong&gt;; most other platforms do too, so don't manually copy it.&lt;/li&gt;
&lt;/ol&gt;
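&lt;p&gt;To keep the copy mechanical, I'd rather filter the config in code than by eye. The sketch below works on the parsed output of &lt;code&gt;heroku config --json&lt;/code&gt;. &lt;code&gt;splitConfig&lt;/code&gt; is a hypothetical helper, and the list of platform-managed keys is an assumption you should adjust to your add-ons.&lt;/p&gt;

```javascript
// Keys the destination platform provides itself (an assumed list; extend as needed).
// Copying Heroku's values would point the new app at the old database or the wrong port.
const PLATFORM_MANAGED = new Set(['DATABASE_URL', 'REDIS_URL', 'PORT']);

// Splits a Heroku config object into values to copy as-is and keys that
// must be set fresh on the new platform.
function splitConfig(herokuConfig) {
  const copy = {};
  const setFresh = [];
  for (const [key, value] of Object.entries(herokuConfig)) {
    if (PLATFORM_MANAGED.has(key)) setFresh.push(key);
    else copy[key] = value;
  }
  return { copy, setFresh: setFresh.sort() };
}
```

&lt;p&gt;Everything in &lt;code&gt;copy&lt;/code&gt; goes into the new platform's env settings unchanged. Everything in &lt;code&gt;setFresh&lt;/code&gt; is a reminder to use the new platform's own value.&lt;/p&gt;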

&lt;h3&gt;
  
  
  Step 4: Deploy code to the new platform — get "hello world" working
&lt;/h3&gt;

&lt;p&gt;Push your code to the new platform. Don't migrate the database yet. Just confirm the platform can build your code and your &lt;code&gt;/healthz&lt;/code&gt; endpoint returns 200.&lt;/p&gt;

&lt;p&gt;If you have a deploy config file (&lt;code&gt;render.yaml&lt;/code&gt;, &lt;code&gt;fly.toml&lt;/code&gt;, &lt;code&gt;hostingguru.yml&lt;/code&gt;, whatever), put it in your repo BEFORE the migration and merge it. Don't be discovering whether it works at 11pm.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Database migration (the scary part)
&lt;/h3&gt;

&lt;p&gt;Backup → restore. The commands are basically the same on every platform:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# From Heroku&lt;/span&gt;
heroku pg:backups:capture &lt;span class="nt"&gt;--app&lt;/span&gt; yourapp
heroku pg:backups:download &lt;span class="nt"&gt;--app&lt;/span&gt; yourapp &lt;span class="nt"&gt;--output&lt;/span&gt; /tmp/dump.tar

&lt;span class="c"&gt;# To new platform (psql connection details from the new DB)&lt;/span&gt;
psql &lt;span class="nt"&gt;-h&lt;/span&gt; NEW_HOST &lt;span class="nt"&gt;-p&lt;/span&gt; NEW_PORT &lt;span class="nt"&gt;-U&lt;/span&gt; NEW_USER &lt;span class="nt"&gt;-d&lt;/span&gt; NEW_DB &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s2"&gt;"DROP SCHEMA public CASCADE; CREATE SCHEMA public;"&lt;/span&gt;

pg_restore &lt;span class="nt"&gt;--verbose&lt;/span&gt; &lt;span class="nt"&gt;--no-owner&lt;/span&gt; &lt;span class="nt"&gt;--no-acl&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-h&lt;/span&gt; NEW_HOST &lt;span class="nt"&gt;-p&lt;/span&gt; NEW_PORT &lt;span class="nt"&gt;-U&lt;/span&gt; NEW_USER &lt;span class="nt"&gt;-d&lt;/span&gt; NEW_DB /tmp/dump.tar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Then verify with row counts&lt;/strong&gt;, every single table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'users'&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="k"&gt;table_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
&lt;span class="k"&gt;UNION&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'posts'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;
&lt;span class="k"&gt;UNION&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="s1"&gt;'orders'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compare to Heroku. If anything mismatches, stop, investigate, don't proceed.&lt;/p&gt;
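&lt;p&gt;With more than a handful of tables, I'd generate the query and diff the results rather than eyeball them. A rough sketch, assuming you already have the table names and can run the query on both sides with your usual Postgres client. &lt;code&gt;rowCountQuery&lt;/code&gt; and &lt;code&gt;diffRowCounts&lt;/code&gt; are hypothetical helpers.&lt;/p&gt;

```javascript
// Builds one UNION ALL query that counts rows in every listed table.
// Only pass table names you trust (e.g. read from your own schema).
function rowCountQuery(tables) {
  return tables
    .map((t) => `SELECT '${t}' AS table_name, COUNT(*) AS row_count FROM "${t}"`)
    .join('\nUNION ALL ') + ';';
}

// Compares { table: count } maps from Heroku (source) and the new database (target).
function diffRowCounts(source, target) {
  const mismatches = [];
  for (const table of Object.keys(source).sort()) {
    if (source[table] !== target[table]) {
      mismatches.push({ table, source: source[table], target: target[table] });
    }
  }
  return mismatches;
}
```

&lt;p&gt;An empty array from &lt;code&gt;diffRowCounts&lt;/code&gt; is the green light. A table missing on the new side shows up with &lt;code&gt;target: undefined&lt;/code&gt;.&lt;/p&gt;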

&lt;h3&gt;
  
  
  Step 6: Put Heroku in maintenance mode + flip DNS
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;heroku maintenance:on &lt;span class="nt"&gt;--app&lt;/span&gt; yourapp
&lt;span class="c"&gt;# users see a "we'll be back" page&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then update your DNS provider (Namecheap, OVH, whatever) to point at the new host. The TTL should already be low (300 seconds) by now, lowered at least one old-TTL period beforehand (a day is safe), so the change propagates in minutes rather than hours.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 7: Final dump + restore
&lt;/h3&gt;

&lt;p&gt;Yes, dump the database again. Heroku has been writing data for the last few hours while you set up. Do a fresh capture, restore on the new side. Then row-count verify again.&lt;/p&gt;

&lt;p&gt;Yes, this means downtime equal to the dump-restore time. For most apps under 5GB, this is 5–15 minutes. Schedule the migration window.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 8: Reroute external webhooks, then drop maintenance mode
&lt;/h3&gt;

&lt;p&gt;Anything that calls &lt;em&gt;into&lt;/em&gt; your app from outside needs its target URL updated to the new host:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stripe webhooks&lt;/li&gt;
&lt;li&gt;SendGrid event webhooks&lt;/li&gt;
&lt;li&gt;OAuth callback URLs (Google, GitHub, Slack)&lt;/li&gt;
&lt;li&gt;Custom integrations&lt;/li&gt;
&lt;li&gt;Cron jobs hitting your &lt;code&gt;/api/...&lt;/code&gt; endpoints from external services&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then &lt;code&gt;heroku maintenance:off&lt;/code&gt;. Then watch your logs for 10 minutes straight. If the logs are quiet and your &lt;code&gt;/healthz&lt;/code&gt; is green, you're done.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 2 — The 7 things that bit me
&lt;/h2&gt;

&lt;p&gt;This is the part nobody warns you about. Every one of these has cost me at least 45 minutes of debugging at least once.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bite 1: The &lt;code&gt;heroku_ext&lt;/code&gt; schema doesn't exist anywhere else
&lt;/h3&gt;

&lt;p&gt;Heroku installs Postgres extensions in a special &lt;code&gt;heroku_ext&lt;/code&gt; schema (not &lt;code&gt;public&lt;/code&gt;). When you &lt;code&gt;pg_restore&lt;/code&gt; to a non-Heroku Postgres, the restore tries to install extensions in &lt;code&gt;heroku_ext&lt;/code&gt; and fails:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;pg_restore: error: could not execute query: ERROR:  schema "heroku_ext" does not exist
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The restore often &lt;em&gt;appears&lt;/em&gt; to complete despite the error, but the extensions (uuid-ossp, unaccent, pg_trgm, etc.) aren't installed. Then your app boots and gets a &lt;code&gt;function uuid_generate_v4() does not exist&lt;/code&gt; 500 error in production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: after restore, recreate the extensions manually:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="nv"&gt;"uuid-ossp"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="nv"&gt;"unaccent"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="nv"&gt;"pg_trgm"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To see which extensions you'll need, run &lt;code&gt;heroku pg:psql --app yourapp -c "SELECT extname FROM pg_extension;"&lt;/code&gt; before you migrate.&lt;/p&gt;
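&lt;p&gt;Once you have that list, you can generate the &lt;code&gt;CREATE EXTENSION&lt;/code&gt; statements instead of typing them. A minimal sketch, assuming you've pasted the output of the &lt;code&gt;pg_extension&lt;/code&gt; query into an array. &lt;code&gt;extensionStatements&lt;/code&gt; is a hypothetical helper, and the sample names are only an example.&lt;/p&gt;

```javascript
// Build CREATE EXTENSION statements from the extension names reported by
// the old database. plpgsql is installed by default, so skip it.
function extensionStatements(extNames) {
  return extNames
    .filter(function (name) { return name !== 'plpgsql'; })
    .map(function (name) {
      // Double-quote the name because some contain a hyphen (uuid-ossp).
      return `CREATE EXTENSION IF NOT EXISTS "${name}";`;
    });
}

// Hypothetical output of the pg_extension query.
const found = ['plpgsql', 'uuid-ossp', 'pg_trgm'];
console.log(extensionStatements(found).join('\n'));
```

&lt;p&gt;Run the generated statements on the new database right after the restore, then re-run the &lt;code&gt;pg_extension&lt;/code&gt; query there to confirm both lists match.&lt;/p&gt;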

&lt;h3&gt;
  
  
  Bite 2: &lt;code&gt;pg_stat_statements&lt;/code&gt; is enabled by default on Heroku, often not elsewhere
&lt;/h3&gt;

&lt;p&gt;If you have any internal slow-query monitoring (e.g. PgHero, a custom admin dashboard), it queries &lt;code&gt;pg_stat_statements&lt;/code&gt;. On the new platform this extension is often not enabled by default. Your slow-query dashboard silently shows zero queries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: enable it on the new platform. On Render/Railway/HostingGuru it's a one-line config flag. On a VPS with your own Postgres, edit &lt;code&gt;postgresql.conf&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;shared_preload_libraries&lt;/span&gt; = &lt;span class="s1"&gt;'pg_stat_statements'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



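&lt;p&gt;Restart Postgres (&lt;code&gt;shared_preload_libraries&lt;/code&gt; is only read at server start), then run &lt;code&gt;CREATE EXTENSION pg_stat_statements;&lt;/code&gt;.&lt;/p&gt;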

&lt;h3&gt;
  
  
  Bite 3: Heroku's &lt;code&gt;DATABASE_URL&lt;/code&gt; bakes in &lt;code&gt;sslmode=require&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Heroku's connection string format always includes &lt;code&gt;?sslmode=require&lt;/code&gt;. Most apps rely on this being implicit. When you move, some platforms don't add it by default, and your Rails / Django / Node app starts logging cryptic SSL errors at higher load (when the connection pool churns):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PG::ConnectionBad: SSL connection has been closed unexpectedly
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: explicitly add &lt;code&gt;?sslmode=require&lt;/code&gt; to your new &lt;code&gt;DATABASE_URL&lt;/code&gt;. Or set &lt;code&gt;PGSSLMODE=require&lt;/code&gt; as an env var. Diff the connection strings character-by-character before flipping traffic.&lt;/p&gt;
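&lt;p&gt;If you'd rather enforce this in code than trust a hand-edited env var, you can normalize the URL at boot. A minimal sketch using Node's built-in &lt;code&gt;URL&lt;/code&gt; class. &lt;code&gt;withSslRequired&lt;/code&gt; is a hypothetical helper, and the credentials and host are placeholders.&lt;/p&gt;

```javascript
// Return the connection string with sslmode=require set, leaving any
// other query parameters untouched.
function withSslRequired(databaseUrl) {
  const url = new URL(databaseUrl);
  url.searchParams.set('sslmode', 'require');
  return url.toString();
}

// Hypothetical credentials and host, for illustration only.
console.log(withSslRequired('postgres://app:secret@db.example.com:5432/appdb'));
```

&lt;p&gt;Because &lt;code&gt;searchParams.set&lt;/code&gt; overwrites an existing value, this also corrects a URL that arrived with a weaker &lt;code&gt;sslmode&lt;/code&gt;. Check that your driver actually honors the parameter, since not every client library reads it from the URL.&lt;/p&gt;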

&lt;h3&gt;
  
  
  Bite 4: Scheduled jobs run on different time zones
&lt;/h3&gt;

&lt;p&gt;Heroku Scheduler runs jobs in &lt;strong&gt;UTC&lt;/strong&gt;. Some platforms default to UTC, some default to the platform's region timezone, and some fall back to the host's local timezone when the cron entry doesn't specify one.&lt;/p&gt;

&lt;p&gt;If you have a &lt;code&gt;0 9 * * *&lt;/code&gt; job that ran at 9am UTC on Heroku (10am or 11am in Paris, depending on the season), and you migrate to a platform that interprets it as 9am local-time-of-some-California-datacenter, your morning report now runs at 6pm Paris time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: always explicitly check the new platform's cron timezone before migrating. Most modern platforms (Render, Railway, HostingGuru) are UTC. AWS EventBridge defaults to UTC. Older or self-hosted setups vary.&lt;/p&gt;
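&lt;p&gt;To see what a schedule means for your users, print the same instant in a few timezones before you trust the new platform. A minimal sketch using the built-in &lt;code&gt;Intl.DateTimeFormat&lt;/code&gt;. &lt;code&gt;timeInZone&lt;/code&gt; is a hypothetical helper, and the date is an arbitrary winter Monday.&lt;/p&gt;

```javascript
// Format an instant as HH:MM wall-clock time in a given IANA timezone.
function timeInZone(date, timeZone) {
  return new Intl.DateTimeFormat('en-GB', {
    timeZone,
    hour: '2-digit',
    minute: '2-digit',
    hourCycle: 'h23',
  }).format(date);
}

// A job scheduled for 09:00 UTC on an arbitrary winter Monday.
const run = new Date(Date.UTC(2026, 0, 12, 9, 0));
for (const zone of ['UTC', 'Europe/Paris', 'America/Los_Angeles']) {
  console.log(zone, timeInZone(run, zone));
}
```

&lt;p&gt;The same instant lands on a different wall-clock hour in each zone, which is exactly the gap a platform opens up when it reads your cron expression in local time. Repeat the check with a summer date, because daylight saving shifts the offsets.&lt;/p&gt;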

&lt;h3&gt;
  
  
  Bite 5: Heroku's &lt;code&gt;DYNO&lt;/code&gt; env var doesn't exist anywhere else
&lt;/h3&gt;

&lt;p&gt;Heroku sets &lt;code&gt;DYNO=web.1&lt;/code&gt; or &lt;code&gt;DYNO=worker.1&lt;/code&gt; automatically. Some apps use this to determine "am I a web process or a worker?" so they can skip certain initialization. After migration, &lt;code&gt;DYNO&lt;/code&gt; is unset, and your worker process tries to bind to a port (because the "am I web?" check defaults to true).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: grep your codebase for &lt;code&gt;process.env.DYNO&lt;/code&gt;, &lt;code&gt;ENV["DYNO"]&lt;/code&gt;, &lt;code&gt;os.environ.get('DYNO')&lt;/code&gt;. Replace with explicit env vars you set per process (&lt;code&gt;PROCESS_TYPE=web&lt;/code&gt; vs &lt;code&gt;worker&lt;/code&gt;).&lt;/p&gt;
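&lt;p&gt;The replacement check can be a few lines. This is a minimal sketch, assuming you set &lt;code&gt;PROCESS_TYPE&lt;/code&gt; yourself on each service (it's a name I'm picking for illustration, not something any platform sets for you). It fails loudly when the variable is missing instead of quietly defaulting to "web".&lt;/p&gt;

```javascript
// Read the process role from an explicit variable instead of Heroku's DYNO.
// PROCESS_TYPE is set by you per service; it is not a platform default.
function processType(env) {
  const allowed = ['web', 'worker'];
  const type = env.PROCESS_TYPE;
  if (!allowed.includes(type)) {
    throw new Error(`PROCESS_TYPE must be one of ${allowed.join(', ')}, got ${type}`);
  }
  return type;
}

// Only the web process should bind a port.
if (processType({ PROCESS_TYPE: 'worker' }) === 'web') {
  console.log('starting HTTP listener');
} else {
  console.log('skipping HTTP listener');
}
```

&lt;p&gt;In a real app you'd pass &lt;code&gt;process.env&lt;/code&gt; instead of the literal object. Throwing on an unset value is the point: the original bug comes from a silent default.&lt;/p&gt;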

&lt;h3&gt;
  
  
  Bite 6: Heroku Postgres "follower" replicas don't survive
&lt;/h3&gt;

&lt;p&gt;If you used Heroku Postgres followers (read replicas) for analytics queries, those don't migrate. The new platform may or may not have a managed replica option, and the connection string format is always different.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: if you don't use the replica heavily, just point all queries at primary for the first week post-migration. Then add a replica on the new platform if needed. Don't try to bring the replica online during the migration itself.&lt;/p&gt;
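&lt;p&gt;One way to make "point everything at primary for now" a config change rather than a code change is a fallback when the replica URL is unset. A minimal sketch: &lt;code&gt;REPLICA_DATABASE_URL&lt;/code&gt; is a hypothetical variable name, and the hosts are placeholders.&lt;/p&gt;

```javascript
// Pick the URL for read-only queries: the replica when one is configured,
// otherwise the primary. REPLICA_DATABASE_URL is a hypothetical name.
function readUrl(env) {
  return env.REPLICA_DATABASE_URL || env.DATABASE_URL;
}

// During the first week only DATABASE_URL is set, so reads hit the primary.
console.log(readUrl({ DATABASE_URL: 'postgres://primary.example.com/appdb' }));
```

&lt;p&gt;When you add a replica later, setting the extra variable moves analytics reads over without a deploy of new code.&lt;/p&gt;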

&lt;h3&gt;
  
  
  Bite 7: Buildpacks vs Docker — your build will behave subtly differently
&lt;/h3&gt;

&lt;p&gt;Heroku auto-detects your stack via buildpacks (e.g. heroku/python, heroku/ruby). Some new platforms also auto-detect it, but with their own builders (Railway uses Nixpacks, Render has its own native runtimes). Some require a &lt;code&gt;Dockerfile&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Even when both platforms auto-detect, the builder and its default versions differ. I've seen the same app get Python 3.11.7 on Heroku and Python 3.12.1 on Render, and then a library breaks because Python 3.12 removed something it relied on (&lt;code&gt;distutils&lt;/code&gt;, for example).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix&lt;/strong&gt;: pin your runtime version explicitly. For Python, that's a &lt;code&gt;.python-version&lt;/code&gt; or &lt;code&gt;runtime.txt&lt;/code&gt;. For Ruby, that's &lt;code&gt;.ruby-version&lt;/code&gt;. For Node, that's &lt;code&gt;"engines": { "node": "20.11.x" }&lt;/code&gt; in &lt;code&gt;package.json&lt;/code&gt;.&lt;/p&gt;
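&lt;p&gt;A pin only helps if something notices when it's ignored. For Node, a boot-time check is cheap. This is a minimal sketch: &lt;code&gt;checkNodeMajor&lt;/code&gt; is a hypothetical helper, and 20 simply mirrors the &lt;code&gt;engines&lt;/code&gt; example.&lt;/p&gt;

```javascript
// Compare the running Node major version against the one you pinned.
function checkNodeMajor(expectedMajor, actualVersion) {
  const actualMajor = Number(actualVersion.split('.')[0]);
  if (actualMajor !== expectedMajor) {
    throw new Error(`expected Node ${expectedMajor}.x, running ${actualVersion}`);
  }
  return actualMajor;
}

// process.versions.node is a string such as "20.11.1" (no leading "v").
// Replace 20 with whatever major version you pinned.
try {
  checkNodeMajor(20, process.versions.node);
  console.log('runtime matches the pin');
} catch (err) {
  console.error(err.message);
}
```

&lt;p&gt;Whether you log the mismatch or refuse to boot is a judgment call. During a migration I'd rather have the deploy fail than discover the drift from a stack trace.&lt;/p&gt;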




&lt;h2&gt;
  
  
  The full migration timeline I quote clients
&lt;/h2&gt;

&lt;p&gt;For a "small to medium" Rails or Django app (1 web service, 1 worker, 1 Postgres &amp;lt; 5GB, &amp;lt; 20 env vars):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Step 0 (decide destination): 1–3 days (mostly waiting for client approval)&lt;/li&gt;
&lt;li&gt;Steps 1–4 (prep): half a day of focused work&lt;/li&gt;
&lt;li&gt;Step 5 (DB migration dry-run): 1 hour&lt;/li&gt;
&lt;li&gt;Steps 6–8 (cutover window): 30–60 minutes with the team on standby&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Total: roughly &lt;strong&gt;5–6 hours of hands-on work&lt;/strong&gt; (counting half a day as about 4 hours), spread across a 1-week calendar window. The week is for sanity, not because the work takes that long.&lt;/p&gt;

&lt;p&gt;If anyone quotes you "we'll migrate Heroku in 30 minutes," they haven't done it. There's always something you didn't expect — most often one of the 7 bites above.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd do if I were migrating today
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Run the inventory&lt;/strong&gt; (&lt;code&gt;heroku ps&lt;/code&gt;, &lt;code&gt;addons&lt;/code&gt;, &lt;code&gt;config&lt;/code&gt;, &lt;code&gt;scheduler:jobs&lt;/code&gt;) and put it in a markdown file &lt;em&gt;today&lt;/em&gt;. You'll need it whether you migrate this month or in 6 months.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick a destination&lt;/strong&gt; that fits your team's operational maturity. If you don't have a DevOps person, pick a managed PaaS. If you have someone who already runs Kubernetes, you have more options.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pin your runtime versions&lt;/strong&gt; in your repo &lt;em&gt;before&lt;/em&gt; the migration. This is the cheapest insurance policy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do one dry run&lt;/strong&gt; of the database migration to a throwaway DB on the new platform. Verify row counts. Then do the real one later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schedule the cutover&lt;/strong&gt; for a Tuesday or Wednesday morning, not a Friday. If something breaks, you want a full work week to fix it.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  A note on the platform I run
&lt;/h2&gt;

&lt;p&gt;If you're picking a destination and you want one that ships with the operational stuff (Telegram alerts, log-pattern detection, EU data center, predictable pricing), HostingGuru is built around exactly that. Pro tier is €35/mo for 10 services with workers and on-demand scripts included (the same primitives Heroku had with &lt;code&gt;worker:&lt;/code&gt; dynos and &lt;code&gt;Heroku Scheduler&lt;/code&gt;). Free tier never sleeps.&lt;/p&gt;

&lt;p&gt;If you pick another destination, the playbook above still applies — just translate "step 4: deploy code" to whatever that platform's deploy flow is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;Heroku is in maintenance mode, not gone. Your app will keep working there. But every month you delay migrating, you're betting that nothing breaks on a platform that's stopped investing in fixing things. That's not crazy in 2026. It just gets crazier with time, not less.&lt;/p&gt;

&lt;p&gt;Whenever you do migrate, the order of operations matters more than which destination you pick. Pick the order. Execute Tuesday morning. Watch the logs for 10 minutes. Then take Wednesday off — you've earned it.&lt;/p&gt;

&lt;p&gt;What's the dumbest thing that broke during your migration? I'm collecting these for v2 of this post.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Previous posts in this series:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;1. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/heroku-just-went-into-sustaining-engineering-mode-here-are-5-alternatives-whose-free-tier-58id"&gt;Heroku just went into "sustaining engineering mode." Here are 5 alternatives whose free tier actually doesn't sleep&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;2. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;I built my MVP with Claude Code. Now I need to deploy it. Here's what nobody tells you.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;3. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;Your AI app is silently burning $2,000/month and you don't know it.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;4. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;Telegram alerts for any production app — a 5-minute setup.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;5. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;How I built a Discord 'ship-tracker' bot in a weekend.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>heroku</category>
      <category>devops</category>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How I built a Discord 'ship-tracker' bot in a weekend (and the 3-process architecture that keeps it alive 24/7)</title>
      <dc:creator>Chalom Ellezam</dc:creator>
      <pubDate>Fri, 08 May 2026 17:54:27 +0000</pubDate>
      <link>https://dev.to/chalom_ellezam_5989bce65e/how-i-built-a-discord-ship-tracker-bot-in-a-weekend-and-the-3-process-architecture-that-keeps-it-a71</link>
      <guid>https://dev.to/chalom_ellezam_5989bce65e/how-i-built-a-discord-ship-tracker-bot-in-a-weekend-and-the-3-process-architecture-that-keeps-it-a71</guid>
      <description>&lt;p&gt;&lt;em&gt;Disclosure: I'm a senior backend tech lead and I run HostingGuru. This bot runs on HostingGuru's Pro tier — but the architecture (web service + worker + scheduled job) works on any platform that supports those three primitives. I'll point out where each piece runs.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;I co-run a small Discord community for indie founders building dev tools. About 220 members, mostly early-stage SaaS people, lots of Claude Code / Cursor enthusiasts. Every Monday I used to manually scroll through the previous week's &lt;code&gt;#i-shipped&lt;/code&gt; channel and write a digest message: "this week we shipped X, Y, Z."&lt;/p&gt;

&lt;p&gt;It took 30 minutes every Monday morning. After 5 weeks of it I did the math — 30 min × 52 weeks = 26 hours a year of me doing what a bot could do better. So one Saturday I built &lt;strong&gt;ShipTrack&lt;/strong&gt;, the bot that's been keeping my Mondays free for 6 months now.&lt;/p&gt;

&lt;p&gt;This is the build log. It's mostly about an architecture decision (3 separate processes instead of 1) that turned out to be the difference between "bot keeps crashing" and "bot just works."&lt;/p&gt;

&lt;h2&gt;
  
  
  What the bot does
&lt;/h2&gt;

&lt;p&gt;Three things, in order of complexity:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Listens for the &lt;code&gt;/ship&lt;/code&gt; slash command.&lt;/strong&gt; When a member runs &lt;code&gt;/ship "Launched my AI todo app — feedback welcome: link.com"&lt;/code&gt;, the bot logs the launch into a database and reacts with 🚀 in the channel.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tracks &lt;code&gt;#i-shipped&lt;/code&gt; channel messages.&lt;/strong&gt; When anyone posts in that channel (without slash command), the bot detects launch-shaped content (heuristic: contains a URL + at least one of "shipped", "launched", "live"), logs it, reacts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Posts a weekly digest&lt;/strong&gt; every Monday at 9am UTC. The bot pulls all launches from the last 7 days, formats them into a nice list, and posts it to &lt;code&gt;#announcements&lt;/code&gt; with @-mentions of the founders.&lt;/li&gt;
&lt;/ol&gt;
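&lt;p&gt;The launch-detection heuristic from item 2 fits in one small function. This is a minimal sketch of the rule as described (a URL plus at least one keyword), not necessarily the exact code ShipTrack runs. &lt;code&gt;looksLikeLaunch&lt;/code&gt; is a name I'm using for illustration.&lt;/p&gt;

```javascript
// Heuristic: a message counts as a launch when it contains a URL
// and at least one of the launch keywords.
function looksLikeLaunch(text) {
  const hasUrl = /https?:\/\/\S+/i.test(text);
  const hasKeyword = /\b(shipped|launched|live)\b/i.test(text);
  return hasUrl ? hasKeyword : false;
}

// Hypothetical channel messages.
const samples = [
  'Just shipped v2 of my app: https://example.com',
  'Is the site live yet?',
  'Reading https://example.com/blog this morning',
];
for (const message of samples) {
  console.log(looksLikeLaunch(message), message);
}
```

&lt;p&gt;The URL pattern only matches links that start with &lt;code&gt;http://&lt;/code&gt; or &lt;code&gt;https://&lt;/code&gt;, so a bare domain such as &lt;code&gt;link.com&lt;/code&gt; would be missed. Loosen it if your members post links that way, and expect some false positives either way.&lt;/p&gt;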

&lt;p&gt;That's it. Three things. But they map to three completely different &lt;em&gt;kinds&lt;/em&gt; of computation, which is where v1 went wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  v1: the naive setup that crashed in 15 minutes
&lt;/h2&gt;

&lt;p&gt;I started simple. One Node.js file. &lt;code&gt;node bot.js&lt;/code&gt;. Deploy to a Render free web service. Done in 30 minutes.&lt;/p&gt;

&lt;p&gt;It worked on my laptop. It worked for the first 14 minutes after deploy. Then Render's free tier put the service to sleep due to no incoming HTTP traffic — and a Discord bot &lt;strong&gt;doesn't get HTTP traffic&lt;/strong&gt; by default. It maintains a long-lived WebSocket connection to Discord's gateway. Render couldn't see that traffic. To Render, my bot was idle. So Render killed it.&lt;/p&gt;

&lt;p&gt;When the bot came back from sleep 30 seconds later, it tried to reconnect to Discord's gateway. Discord saw two sessions for the same bot. The old session got disconnected with a &lt;code&gt;4008 Reconnect&lt;/code&gt; and the new one inherited some weird state. Members started seeing the bot react to messages twice. Slash commands timed out.&lt;/p&gt;

&lt;p&gt;This is the kind of bug that takes a long time to diagnose if you've never seen it before, because &lt;strong&gt;everything looks fine in your logs&lt;/strong&gt;. There's no error, just slightly wrong behavior. I wasted 4 hours.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Discord bots are weirder than they look
&lt;/h2&gt;

&lt;p&gt;The thing nobody tells you when you start: a Discord bot has &lt;em&gt;two&lt;/em&gt; completely different communication channels with Discord's servers, and they have totally different operational requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Channel 1: the gateway (WebSocket, persistent).&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The bot opens a WebSocket to &lt;code&gt;wss://gateway.discord.gg&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Stays open forever&lt;/li&gt;
&lt;li&gt;Receives every event in real time (member joined, message posted, reaction added)&lt;/li&gt;
&lt;li&gt;Sends heartbeats at the interval Discord announces in its Hello payload (typically about 41.25 seconds)&lt;/li&gt;
&lt;li&gt;If the connection drops for &amp;gt;60 seconds, you have to fully re-authenticate and resync state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Channel 2: slash commands (HTTP, on-demand).&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Discord POSTs to YOUR endpoint when a user runs a slash command&lt;/li&gt;
&lt;li&gt;You have &lt;strong&gt;3 seconds&lt;/strong&gt; to respond or Discord shows "interaction failed" to the user&lt;/li&gt;
&lt;li&gt;Public HTTP endpoint with signed payload verification&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These two channels don't fit on the same kind of host. The gateway needs &lt;strong&gt;always-on&lt;/strong&gt;. The slash command webhook needs &lt;strong&gt;public HTTPS that wakes up fast&lt;/strong&gt;. Most "deploy your Node app" flows assume one or the other, not both.&lt;/p&gt;

&lt;h2&gt;
  
  
  v2: three processes, three responsibilities
&lt;/h2&gt;

&lt;p&gt;The architecture I landed on has three pieces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────┐    ┌──────────────────────┐    ┌──────────────────────┐
│  WEB SERVICE          │   │  WORKER               │   │  SCHEDULED SCRIPT     │
│  HTTPS endpoint       │   │  Always-on process    │   │  Runs Monday 9am UTC  │
│  Slash command webhook│   │  Discord gateway      │   │  Generates weekly     │
│  /api/discord/interact│   │  WebSocket connection │   │  digest               │
└──────────────────────┘    └──────────────────────┘    └──────────────────────┘
              │                         │                          │
              └─────────────────────────┴──────────────────────────┘
                                        │
                                ┌──────────────┐
                                │  Postgres    │
                                │  (launches)  │
                                └──────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three separate deployments, one shared database. Each process does what it's good at and nothing else.&lt;/p&gt;

&lt;h3&gt;
  
  
  Process 1: the web service (slash commands)
&lt;/h3&gt;

&lt;p&gt;This is a tiny Express app. One endpoint. Returns under 1 second.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// web-service/server.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;verifyKey&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;discord-interactions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./db.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;express&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;_res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rawBody&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/discord/interact&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// 1. Verify Discord signed the request&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;signature&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Signature-Ed25519&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;X-Signature-Timestamp&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// await works whether verifyKey returns a boolean or a Promise (newer releases are async)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;valid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;verifyKey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rawBody&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;signature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DISCORD_PUBLIC_KEY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;valid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;invalid signature&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Discord sometimes pings to check liveness&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Slash command — log the launch and respond fast&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ship&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;member&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;username&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;member&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;?.[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;launches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;channel_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;channel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`🚀 Logged your ship, &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;!`&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unknown command&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PORT&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Deploy this as a normal &lt;strong&gt;web service&lt;/strong&gt;. It can sleep on free tiers: Discord sends a request only when someone runs &lt;code&gt;/ship&lt;/code&gt;, and a cold start of about a second is fine as long as the reply still lands inside Discord's 3-second response window. (For HostingGuru, I picked the Hobby tier with the always-on free guarantee anyway, but the architecture works either way.)&lt;/p&gt;

&lt;h3&gt;
  
  
  Process 2: the worker (gateway + reactions)
&lt;/h3&gt;

&lt;p&gt;This is the long-running part. It opens the WebSocket connection to Discord and listens for messages. It can't sleep. Ever.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// worker/bot.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;GatewayIntentBits&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Events&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;discord.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./db.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;intents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nx"&gt;GatewayIntentBits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Guilds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;GatewayIntentBits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GuildMessages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;GatewayIntentBits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;MessageContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;SHIPPED_CHANNEL_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SHIPPED_CHANNEL_ID&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;MessageCreate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;bot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;SHIPPED_CHANNEL_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Heuristic: contains a URL + a "shipped"-ish word&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hasUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sr"&gt;/https&lt;/span&gt;&lt;span class="se"&gt;?&lt;/span&gt;&lt;span class="sr"&gt;:&lt;/span&gt;&lt;span class="se"&gt;\/\/\S&lt;/span&gt;&lt;span class="sr"&gt;+/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hasShipWord&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\b(&lt;/span&gt;&lt;span class="sr"&gt;shipped|launched|live|released&lt;/span&gt;&lt;span class="se"&gt;)\b&lt;/span&gt;&lt;span class="sr"&gt;/i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;hasUrl&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;hasShipWord&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;launches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;author&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;channel_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;message_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;react&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;🚀&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;Events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ClientReady&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`ShipTrack online as &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DISCORD_BOT_TOKEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Deploy this as a &lt;strong&gt;background worker&lt;/strong&gt;. On HostingGuru this is the Pro tier &lt;code&gt;worker&lt;/code&gt; process type — same &lt;code&gt;Procfile&lt;/code&gt;-style declaration as Heroku's old &lt;code&gt;worker:&lt;/code&gt; line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# hostingguru.yml (or similar config)&lt;/span&gt;
&lt;span class="na"&gt;processes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;bot&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;worker&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node worker/bot.js&lt;/span&gt;
    &lt;span class="na"&gt;always_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The platform keeps it running. If it crashes, it restarts. If you push new code, it gracefully reconnects. &lt;strong&gt;No HTTP traffic required to keep it alive&lt;/strong&gt; — that's the whole point of a worker process type vs a web service.&lt;/p&gt;
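&lt;p&gt;A clean reconnect on deploy is easier if the worker closes its gateway connection when it is told to stop. Here is a minimal sketch, assuming the platform sends &lt;code&gt;SIGTERM&lt;/code&gt; before it replaces the container (a common convention, but check your host). &lt;code&gt;installGracefulShutdown&lt;/code&gt; is a hypothetical helper name, not something discord.js ships.&lt;/p&gt;

```javascript
// Sketch only: installGracefulShutdown is a hypothetical helper, not part of discord.js.
// `client` is anything with an async destroy() method, such as a discord.js Client.
// `proc` defaults to the real process object and exists so the helper can be tested.
function installGracefulShutdown(client, proc = process) {
  let shuttingDown = false;

  const shutdown = async (signal) => {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;
    console.log(`Received ${signal}, closing the gateway connection`);
    try {
      await client.destroy(); // closes the WebSocket instead of letting it time out
    } finally {
      proc.exit(0);
    }
  };

  proc.on('SIGTERM', () => shutdown('SIGTERM'));
  proc.on('SIGINT', () => shutdown('SIGINT'));
  return shutdown;
}
```

&lt;p&gt;In &lt;code&gt;worker/bot.js&lt;/code&gt; this would be one call, &lt;code&gt;installGracefulShutdown(client)&lt;/code&gt;, placed before &lt;code&gt;client.login(...)&lt;/code&gt;.&lt;/p&gt;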

&lt;h3&gt;
  
  
  Process 3: the scheduled script (weekly digest)
&lt;/h3&gt;

&lt;p&gt;This one runs once a week. It's an "on-demand" script — runs, finishes, exits. Costs almost nothing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// scripts/weekly-digest.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;GatewayIntentBits&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;discord.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./db.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;intents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;GatewayIntentBits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Guilds&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;login&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DISCORD_BOT_TOKEN&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;since&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;launches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;launches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;$gte&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;since&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;launches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;destroy&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formatted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;launches&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;l&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;`• &amp;lt;@&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;l&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;gt; shipped: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;l&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;channel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;channels&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ANNOUNCEMENTS_CHANNEL_ID&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`**📦 This week we shipped (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;launches&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; launches):**\n\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;formatted&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;allowedMentions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;users&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="c1"&gt;// render the mentions, but don't ping anyone&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;destroy&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On HostingGuru, this runs as an &lt;strong&gt;on-demand script&lt;/strong&gt; triggered by a schedule:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;processes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;weekly-digest&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;script&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;node scripts/weekly-digest.js&lt;/span&gt;
    &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;9&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;  &lt;span class="c1"&gt;# every Monday at 9:00 UTC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The platform spins up an ephemeral container at the scheduled time, runs the script, captures the output, exits. You pay for ~3 seconds of compute per week. If you've ever fought with Heroku Scheduler, you'll appreciate that the script lives in your repo, version-controlled, with the same env vars as the rest of your app.&lt;/p&gt;
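&lt;p&gt;One caveat with the digest script above: it sends everything as a single message, and Discord caps message content at 2,000 characters. A busy week could exceed that. The sketch below splits the lines into chunks that fit. &lt;code&gt;chunkLines&lt;/code&gt; and the 1,900-character default are my own illustrative choices, not part of the original script.&lt;/p&gt;

```javascript
// Sketch only: chunkLines is a hypothetical helper. Discord caps message
// content at 2,000 characters; the default of 1,900 leaves room for a header.
function chunkLines(lines, limit = 1900) {
  const chunks = [];
  let current = '';
  for (const line of lines) {
    const clipped = line.slice(0, limit); // a single line is never longer than the limit
    const candidate = current === '' ? clipped : `${current}\n${clipped}`;
    if (candidate.length > limit) {
      chunks.push(current); // current is non-empty here, because clipped alone always fits
      current = clipped;
    } else {
      current = candidate;
    }
  }
  if (current !== '') chunks.push(current);
  return chunks;
}
```

&lt;p&gt;In the digest, that would mean building the per-launch lines as an array, then calling &lt;code&gt;channel.send&lt;/code&gt; once per chunk with the same &lt;code&gt;allowedMentions&lt;/code&gt; option.&lt;/p&gt;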

&lt;h2&gt;
  
  
  Why this architecture matters
&lt;/h2&gt;

&lt;p&gt;The naive temptation is to put all three in one Node process: HTTP server + Discord client + a &lt;code&gt;setInterval&lt;/code&gt; for the digest. &lt;strong&gt;Don't.&lt;/strong&gt; Three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Crash blast radius.&lt;/strong&gt; If your slash-command handler throws, the gateway connection survives. If your gateway disconnects mid-deploy, the slash commands keep working. Each process is independently restartable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaling differs.&lt;/strong&gt; If you have 5,000 slash commands an hour, you scale the web service. The worker stays at 1 instance (a bot this size needs only one Discord gateway connection). Different processes, different scaling profiles.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Costs differ.&lt;/strong&gt; The worker burns CPU cycles 24/7 just maintaining a heartbeat. The script runs 3 seconds a week. Putting them on the same dyno is paying always-on prices for a once-a-week task.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This three-process pattern is what you want for &lt;strong&gt;any bot or background-heavy service&lt;/strong&gt;, not just Discord: Slack apps, Telegram bots, webhook receivers with async fanout. The shape repeats.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;WebSocket gateway = needs a worker, not a web service.&lt;/strong&gt; Free web tiers will sleep your bot. Workers don't sleep on platforms that respect the worker primitive.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slash commands ≠ gateway.&lt;/strong&gt; They're HTTP, you can host them anywhere, but the 3-second response cap is real. Don't do heavy work inline — log to DB, respond, finish processing async.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use a real database, not local files or in-memory state.&lt;/strong&gt; I tried "just use a JSON file" for v0. Workers restart, the container's filesystem is wiped, and members lost their launch history once. Two days later I wired up Postgres.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled scripts &amp;gt; setInterval.&lt;/strong&gt; A &lt;code&gt;setInterval&lt;/code&gt; in your worker tied to wall-clock time will drift, miss runs during deploys, and double-fire if you scale to 2 instances. A scheduled script runs as its own process on the platform's clock, so it fires once per scheduled slot no matter how many worker instances exist or when you last deployed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Always reply within 3 seconds.&lt;/strong&gt; If your slash command handler does anything slow (database query &amp;gt; 1s, external API), respond with a deferred response (&lt;code&gt;type: 5&lt;/code&gt;) and follow up later by editing that response through the interaction's webhook endpoint.&lt;/li&gt;
&lt;/ol&gt;
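&lt;p&gt;The deferred-response advice in the last point can be sketched as follows. This assumes an Express-style &lt;code&gt;res&lt;/code&gt;, Node 18+ for the global &lt;code&gt;fetch&lt;/code&gt;, and a web service that keeps running after it has replied. &lt;code&gt;handleSlowCommand&lt;/code&gt; and &lt;code&gt;doSlowWork&lt;/code&gt; are hypothetical names; the URL is Discord's documented endpoint for editing the original interaction response.&lt;/p&gt;

```javascript
// Sketch only: handleSlowCommand and doSlowWork are hypothetical names.
// `res` is an Express-style response; `fetchImpl` defaults to the global fetch (Node 18+).
async function handleSlowCommand(interaction, res, doSlowWork, fetchImpl = fetch) {
  // 1. Acknowledge right away. Type 5 tells Discord the real reply comes later.
  res.json({ type: 5 });

  // 2. Do the slow part after the acknowledgement has gone out.
  let content;
  try {
    content = await doSlowWork(interaction);
  } catch (err) {
    content = 'Something went wrong while processing that command.';
  }

  // 3. Replace the deferred placeholder by editing the original response.
  const url = `https://discord.com/api/v10/webhooks/${interaction.application_id}/${interaction.token}/messages/@original`;
  return fetchImpl(url, {
    method: 'PATCH',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ content }),
  });
}
```

&lt;p&gt;The interaction token in that URL expires after 15 minutes, so the slow work has to finish within that window.&lt;/p&gt;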

&lt;h2&gt;
  
  
  When this architecture is overkill
&lt;/h2&gt;

&lt;p&gt;If you're building a bot for 5 friends to play a Discord trivia game, run it in one Node process on your laptop. You don't need three processes. You don't need a database. You probably don't need slash commands.&lt;/p&gt;

&lt;p&gt;The three-process pattern starts paying off when &lt;strong&gt;at least one of these is true&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The bot is mission-critical (community would notice if it's down).&lt;/li&gt;
&lt;li&gt;The bot has &amp;gt; ~50 users sending it traffic.&lt;/li&gt;
&lt;li&gt;You need scheduled jobs that must run even if the bot crashed yesterday.&lt;/li&gt;
&lt;li&gt;You're deploying it on a platform where free web tiers sleep.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For ShipTrack at 220 members + weekly digest, all four were true. So the three-process setup paid for itself the first time the worker crashed at 2am and the slash commands kept working through it.&lt;/p&gt;

&lt;h2&gt;
  
  
  A note on HostingGuru
&lt;/h2&gt;

&lt;p&gt;The reason I'm building on HostingGuru (besides the obvious "I run it" disclosure at the top) is that the three primitives I needed — &lt;strong&gt;web service&lt;/strong&gt;, &lt;strong&gt;worker&lt;/strong&gt;, &lt;strong&gt;on-demand scheduled script&lt;/strong&gt; — are first-class citizens on the platform. Same repo, same env vars, three lines of YAML config. No fighting with Heroku Scheduler vs Heroku dynos vs Heroku one-off &lt;code&gt;heroku run&lt;/code&gt; jobs. No spinning up separate ECS task definitions on AWS.&lt;/p&gt;

&lt;p&gt;If you're building anything with these three shapes — and most bots, webhook receivers, and background-heavy services have them — Pro tier (€35/mo) gets you 10 services with workers and on-demand scripts included. The free Starter tier supports the web service piece if you want to wire your worker elsewhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;ShipTrack v3 is on my todo list. I want to add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An LLM-powered launch summary at the bottom of each digest ("this week's theme: AI productivity tools")&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;/profile @user&lt;/code&gt; command that shows someone's all-time launches&lt;/li&gt;
&lt;li&gt;A leaderboard of "most active shipper this quarter"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you've built a Discord bot recently and have hard-won lessons, I'd love to hear them in the comments. Especially the embarrassing v1 stories.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Previous posts in this series:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;1. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/heroku-just-went-into-sustaining-engineering-mode-here-are-5-alternatives-whose-free-tier-58id"&gt;Heroku just went into "sustaining engineering mode." Here are 5 alternatives whose free tier actually doesn't sleep&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;2. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;I built my MVP with Claude Code. Now I need to deploy it. Here's what nobody tells you.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;3. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;Your AI app is silently burning $2,000/month and you don't know it.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;4. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/"&gt;Telegram alerts for any production app — a 5-minute setup.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>discord</category>
      <category>javascript</category>
      <category>devops</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Telegram alerts for any production app — a 5-minute setup (no SaaS, no signup, just curl)</title>
      <dc:creator>Chalom Ellezam</dc:creator>
      <pubDate>Mon, 04 May 2026 12:36:33 +0000</pubDate>
      <link>https://dev.to/chalom_ellezam_5989bce65e/telegram-alerts-for-any-production-app-a-5-minute-setup-no-saas-no-signup-just-curl-3pgf</link>
      <guid>https://dev.to/chalom_ellezam_5989bce65e/telegram-alerts-for-any-production-app-a-5-minute-setup-no-saas-no-signup-just-curl-3pgf</guid>
      <description>&lt;p&gt;&lt;em&gt;Disclosure: I'm a senior backend tech lead and I run HostingGuru, where Telegram alerts ship as a built-in feature. This tutorial works on any platform — it's the manual version of what HostingGuru does for you. Useful even if you never become a customer.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;There's a hierarchy of where production alerts go, ranked by how likely you are to actually see them.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Email → 14% open rate within an hour, less at 3am.&lt;/li&gt;
&lt;li&gt;Slack → muted in 6 of 10 teams I've seen, especially "alerts" channels.&lt;/li&gt;
&lt;li&gt;Phone-call paging (PagerDuty, Opsgenie) → works, but $20+/user/month and overkill for solo founders.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Telegram&lt;/strong&gt; → notification on lock screen, no setup cost, works on every phone, you'll see it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For a solo founder or a small team, Telegram alerts hit a sweet spot: the notification is &lt;strong&gt;annoying enough that you'll see it&lt;/strong&gt;, &lt;strong&gt;easy enough that you'll set it up&lt;/strong&gt;, and &lt;strong&gt;free&lt;/strong&gt;. After 8 years of trying every paging tool, this is what I default to for early-stage projects.&lt;/p&gt;

&lt;p&gt;Here's how to wire it up in 5 minutes, plus what I learned about &lt;em&gt;what&lt;/em&gt; to alert on.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Create a Telegram bot (60 seconds)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Open Telegram, search for &lt;code&gt;@BotFather&lt;/code&gt;, start a chat.&lt;/li&gt;
&lt;li&gt;Send &lt;code&gt;/newbot&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Pick a name. Then a username (must end in &lt;code&gt;bot&lt;/code&gt;, e.g. &lt;code&gt;myapp_alerts_bot&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;BotFather replies with a token like &lt;code&gt;7234567890:AAFq...&lt;/code&gt;. &lt;strong&gt;Save it.&lt;/strong&gt; This is your &lt;code&gt;TELEGRAM_BOT_TOKEN&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. The bot exists. Now we need somewhere to send messages.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Get your chat ID (60 seconds)
&lt;/h2&gt;

&lt;p&gt;You need the ID of the chat where alerts will land. Two options:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option A — Personal alerts (just for you):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open a chat with your new bot. Send it any message ("hello").&lt;/li&gt;
&lt;li&gt;Visit &lt;code&gt;https://api.telegram.org/bot&amp;lt;YOUR_TOKEN&amp;gt;/getUpdates&lt;/code&gt; in your browser.&lt;/li&gt;
&lt;li&gt;Find the &lt;code&gt;chat.id&lt;/code&gt; field in the JSON response. It's a number like &lt;code&gt;123456789&lt;/code&gt;. &lt;strong&gt;That's your chat ID.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Option B — Group alerts (for the team):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a Telegram group (or use an existing one).&lt;/li&gt;
&lt;li&gt;Add your bot to the group.&lt;/li&gt;
&lt;li&gt;Send any message in the group.&lt;/li&gt;
&lt;li&gt;Visit the same &lt;code&gt;getUpdates&lt;/code&gt; URL. The chat ID for groups is negative (e.g. &lt;code&gt;-987654321&lt;/code&gt;).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Save the chat ID as &lt;code&gt;TELEGRAM_CHAT_ID&lt;/code&gt;.&lt;/p&gt;
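&lt;p&gt;If you would rather not read raw JSON in the browser, the same lookup can be scripted. This is a minimal sketch assuming Node 18+ (for the global &lt;code&gt;fetch&lt;/code&gt;); &lt;code&gt;extractChats&lt;/code&gt; and &lt;code&gt;listChats&lt;/code&gt; are hypothetical helper names. It only looks at regular messages, which covers both options above.&lt;/p&gt;

```javascript
// Sketch only: extractChats and listChats are hypothetical helpers.
// extractChats reads the JSON body returned by Telegram's getUpdates method.
function extractChats(response) {
  const seen = new Map(); // chat id -> human-readable label, de-duplicated
  for (const update of response.result || []) {
    const chat = update.message?.chat;
    if (chat) {
      seen.set(chat.id, chat.title || chat.username || chat.first_name || '(unnamed)');
    }
  }
  return [...seen].map(([id, label]) => ({ id, label }));
}

// Calls getUpdates for the given bot token and lists the chats that messaged the bot.
async function listChats(token, fetchImpl = fetch) {
  const res = await fetchImpl(`https://api.telegram.org/bot${token}/getUpdates`);
  return extractChats(await res.json());
}
```

&lt;p&gt;Running &lt;code&gt;listChats(process.env.TELEGRAM_BOT_TOKEN)&lt;/code&gt; after sending the bot a message should list your private chat, plus any group the bot has seen a message in.&lt;/p&gt;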

&lt;h2&gt;
  
  
  Step 3: Send your first alert (30 seconds)
&lt;/h2&gt;

&lt;p&gt;The send API is one HTTP call. From a terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://api.telegram.org/bot&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TELEGRAM_BOT_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/sendMessage"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"chat_id=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TELEGRAM_CHAT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"text=Hello from production"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your phone should buzz immediately. If it does, the wiring is done. Now we just need to call this from your app when something interesting happens.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: Wire it into your app (3 minutes)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Node.js / TypeScript
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// alerts.ts&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TG_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TELEGRAM_BOT_TOKEN&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TG_CHAT&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TELEGRAM_CHAT_ID&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;TG_TOKEN&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;TG_CHAT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// disabled in dev&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
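    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`https://api.telegram.org/bot&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;TG_TOKEN&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/sendMessage`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;TG_CHAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;// stay under Telegram's 4096-character cap&lt;/span&gt;
        &lt;span class="na"&gt;parse_mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Markdown&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="c1"&gt;// fetch resolves on HTTP errors too, e.g. a 400 when Telegram cannot parse the Markdown&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;alert rejected&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;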
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// never let alerting crash your app&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;alert failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use it anywhere:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;alert&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./alerts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;uncaughtException&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
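  &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`🚨 *Uncaught exception*\n&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\`&lt;/span&gt;&lt;span class="s2"&gt;\n\nstack:\n&lt;/span&gt;&lt;span class="se"&gt;\`\`\`&lt;/span&gt;&lt;span class="s2"&gt;\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\n&lt;/span&gt;&lt;span class="se"&gt;\`\`\`&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="c1"&gt;// state is undefined after an uncaught exception: report, then exit&lt;/span&gt;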
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// or in business logic&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;failed&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`⚠️ Stripe webhook retry rate at &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/min — investigate`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# alerts.py
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;TG_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TELEGRAM_BOT_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;TG_CHAT&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;TELEGRAM_CHAT_ID&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;TG_TOKEN&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;TG_CHAT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;https://api.telegram.org/bot&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;TG_TOKEN&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/sendMessage&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;chat_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TG_CHAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;parse_mode&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Markdown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
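        &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# turn a 4xx/5xx reply (e.g. unparseable Markdown) into an exception for the handler below&lt;/span&gt;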
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# never let alerting crash your app
&lt;/span&gt;        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;alert failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wire it to the unhandled exception hook:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;alerts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;alert&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;excepthook&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exc_value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exc_traceback&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;🚨 *Uncaught exception*&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;`&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;exc_type&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;exc_value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;`&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;__excepthook__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exc_value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exc_traceback&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;excepthook&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;excepthook&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
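
&lt;p&gt;One caveat with &lt;code&gt;parse_mode: 'Markdown'&lt;/code&gt;: exception text often contains stray &lt;code&gt;_&lt;/code&gt;, &lt;code&gt;*&lt;/code&gt; or backticks, and Telegram rejects a message it cannot parse with an HTTP 400. A possible workaround, sketched below, is to retry once as plain text. The &lt;code&gt;post&lt;/code&gt; callable is a hypothetical stand-in for your HTTP client (it takes the JSON payload and returns the status code), which also keeps the logic testable without the network.&lt;/p&gt;

```python
from typing import Callable

# Hypothetical transport type: takes the JSON payload, returns the HTTP status code.
Post = Callable[[dict], int]


def send_with_fallback(post: Post, chat_id: str, text: str) -> bool:
    """Try Markdown first; on HTTP 400, resend the same text as plain text."""
    payload = {"chat_id": chat_id, "text": text[:4000], "parse_mode": "Markdown"}
    status = post(payload)
    if status == 400:
        # Most likely unbalanced *, _ or ` in the message: drop the formatting.
        plain = {"chat_id": chat_id, "text": text[:4000]}
        status = post(plain)
    return status == 200
```

&lt;p&gt;With &lt;code&gt;requests&lt;/code&gt;, the transport could be as small as &lt;code&gt;lambda p: requests.post(url, json=p, timeout=5).status_code&lt;/code&gt;. An ugly unformatted alert still beats one that never arrives.&lt;/p&gt;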



&lt;h3&gt;
  
  
  Plain bash (great for cron jobs)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="c"&gt;# Save as /usr/local/bin/tg-alert and make it executable (chmod +x)&lt;/span&gt;
curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"https://api.telegram.org/bot&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TELEGRAM_BOT_TOKEN&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/sendMessage"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data-urlencode&lt;/span&gt; &lt;span class="s2"&gt;"chat_id=&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;TELEGRAM_CHAT_ID&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data-urlencode&lt;/span&gt; &lt;span class="s2"&gt;"text=&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /dev/null
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



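&lt;p&gt;Then any cron job can call it. Cron runs with a minimal environment, so define &lt;code&gt;TELEGRAM_BOT_TOKEN&lt;/code&gt; and &lt;code&gt;TELEGRAM_CHAT_ID&lt;/code&gt; at the top of the crontab (or source an env file inside the script), and use the full path if &lt;code&gt;/usr/local/bin&lt;/code&gt; is not on cron's &lt;code&gt;PATH&lt;/code&gt;:&lt;br&gt;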
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;0 3 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /opt/myapp/nightly-backup.sh &lt;span class="o"&gt;||&lt;/span&gt; tg-alert &lt;span class="s2"&gt;"❌ Nightly backup failed at &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;hostname&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole infrastructure. You're done.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: The rate-limiting trick (the part most tutorials skip)
&lt;/h2&gt;

&lt;p&gt;Here's the failure mode of every "I just wired alerts" project: a bug fires 20 times a second, your phone buzzes 200 times in 10 seconds, you mute the bot in frustration, and you miss the next &lt;em&gt;real&lt;/em&gt; alert two days later.&lt;/p&gt;

&lt;p&gt;You need a &lt;strong&gt;rate limiter&lt;/strong&gt; between your code and Telegram. The simplest one: deduplicate identical messages within a window.&lt;/p&gt;

&lt;h3&gt;
  
  
  Node.js — in-memory dedupe
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DEDUPE_WINDOW_MS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// 5 minutes&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// use first 200 chars as dedupe key&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;last&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;last&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;last&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;DEDUPE_WINDOW_MS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// skip&lt;/span&gt;
  &lt;span class="nx"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

  &lt;span class="c1"&gt;// ... rest of the send logic&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means &lt;em&gt;the same alert&lt;/em&gt; won't fire more than once every 5 minutes, but &lt;em&gt;different alerts&lt;/em&gt; still go through. A retry loop firing the same exception 800 times in 10 minutes will produce &lt;strong&gt;two or three Telegram messages&lt;/strong&gt;, not 800. You'll still know it's happening.&lt;/p&gt;
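
&lt;p&gt;For the Python helper, the same idea might look like the sketch below. The &lt;code&gt;Deduper&lt;/code&gt; class and its injectable &lt;code&gt;clock&lt;/code&gt; are my own additions, not part of the original helper; the clock is there so the window can be tested without sleeping, and expired keys are pruned on every call so the dict stays small.&lt;/p&gt;

```python
import time
from typing import Callable, Dict


class Deduper:
    """Suppress repeats of the same message inside a time window."""

    def __init__(self, window_s: float = 300.0, clock: Callable[[], float] = time.monotonic):
        self.window_s = window_s
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.seen: Dict[str, float] = {}

    def should_send(self, message: str) -> bool:
        now = self.clock()
        # Drop expired keys so the dict does not grow forever.
        self.seen = {k: t for k, t in self.seen.items() if self.window_s > now - t}
        key = message[:200]  # same key rule as the Node.js version
        if key in self.seen:
            return False
        self.seen[key] = now
        return True
```

&lt;p&gt;In &lt;code&gt;alert()&lt;/code&gt;, create one module-level &lt;code&gt;Deduper()&lt;/code&gt; and return early when &lt;code&gt;should_send(message)&lt;/code&gt; is false. Pruning on every call is linear in the number of distinct recent messages, which is fine at alerting volumes.&lt;/p&gt;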

&lt;h3&gt;
  
  
  Redis-backed (for multi-instance apps)
&lt;/h3&gt;

&lt;p&gt;If your app runs on multiple servers, in-memory dedupe isn't enough: each instance keeps its own map, so the same alert goes out once per server. Use Redis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
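&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createHash&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;node:crypto&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Redis&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ioredis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;REDIS_URL&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// fixed-length key, so long messages don't bloat Redis&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;createHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sha256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;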

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`tg-dedupe:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;))}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="kd"&gt;set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;NX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// already alerted in last 5 min&lt;/span&gt;

  &lt;span class="c1"&gt;// ... send to Telegram&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;EX 300 NX&lt;/code&gt; does the magic: set the key with a 5-minute TTL, but only if it doesn't already exist. If two servers try to send the same alert simultaneously, only one wins.&lt;/p&gt;
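
&lt;p&gt;A Python service can claim the same key. The sketch below assumes a redis-py style client, where &lt;code&gt;set(name, value, ex=..., nx=True)&lt;/code&gt; returns &lt;code&gt;True&lt;/code&gt; when the key was written and &lt;code&gt;None&lt;/code&gt; when it already existed. The function name &lt;code&gt;claim_alert&lt;/code&gt; is hypothetical, and the client is passed in rather than created here so you can swap in your own connection.&lt;/p&gt;

```python
import hashlib


def claim_alert(client, message: str, ttl_s: int = 300) -> bool:
    """Return True only for the first caller to claim this message within ttl_s.

    `client` is assumed to be a redis-py style client, where
    set(name, value, ex=..., nx=True) returns True on success and None otherwise.
    """
    digest = hashlib.sha256(message[:200].encode("utf-8")).hexdigest()
    key = f"tg-dedupe:{digest}"
    return bool(client.set(key, "1", ex=ttl_s, nx=True))
```

&lt;p&gt;Because the check and the write are a single Redis command, there is no window in which two instances can both decide to send.&lt;/p&gt;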

&lt;h2&gt;
  
  
  What to alert on (the harder question)
&lt;/h2&gt;

&lt;p&gt;The tutorial above is the easy part. The hard part is &lt;em&gt;what&lt;/em&gt; to alert on. Bad alerts → alert fatigue → muted bot → you miss the real ones.&lt;/p&gt;

&lt;p&gt;Five things that are usually worth a Telegram ping:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Uncaught exceptions in the main process&lt;/strong&gt; — these usually mean a process is about to die or has died.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Job queue depth above N&lt;/strong&gt; — if Sidekiq/BullMQ/Celery has more than e.g. 1000 jobs queued, something is producing faster than consuming. Investigate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;5xx error rate above 1%&lt;/strong&gt; — not every 500 needs a ping, but the &lt;em&gt;rate&lt;/em&gt; exceeding a threshold does.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specific business events that mean money is leaking&lt;/strong&gt; — failed payment retries, stale webhook signatures, expired refresh tokens.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cron jobs that didn't run&lt;/strong&gt; — if your nightly backup didn't fire by 3:30am, you want to know at 3:30am, not at 9am when you check email.&lt;/li&gt;
&lt;/ol&gt;
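
&lt;p&gt;The last item is the odd one out: a job that never ran cannot send its own alert, so something else has to notice the silence. One common approach is a heartbeat file that the job touches on success, plus a separate check a little later. The sketch below assumes that setup; &lt;code&gt;is_overdue&lt;/code&gt; and the heartbeat path are hypothetical names.&lt;/p&gt;

```python
import os
import time
from typing import Optional


def is_overdue(heartbeat_path: str, max_age_s: float, now: Optional[float] = None) -> bool:
    """True when the heartbeat file is missing or older than max_age_s seconds."""
    current = time.time() if now is None else now
    try:
        age = current - os.path.getmtime(heartbeat_path)
    except OSError:
        return True  # never written: treat as overdue
    return age > max_age_s
```

&lt;p&gt;The backup script would touch the file as its final step, and a second cron entry at 3:30am would call &lt;code&gt;is_overdue(path, 3600)&lt;/code&gt; and alert when it returns &lt;code&gt;True&lt;/code&gt;.&lt;/p&gt;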

&lt;p&gt;Five things that are usually &lt;strong&gt;NOT&lt;/strong&gt; worth a ping:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Individual 4xx errors (these are mostly user error or scrapers).&lt;/li&gt;
&lt;li&gt;Slow queries (log them, don't page on them).&lt;/li&gt;
&lt;li&gt;Anything you can't act on in the next 30 minutes.&lt;/li&gt;
&lt;li&gt;Anything where the action is "do nothing, it'll resolve itself" (most CDN hiccups, most rate-limit blips).&lt;/li&gt;
&lt;li&gt;Anything informational ("user X signed up" — use a different channel for celebration).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The test I use: &lt;strong&gt;if I'm at dinner and my phone buzzes, will I be glad I knew, or annoyed?&lt;/strong&gt; If the answer is "annoyed," it doesn't belong on Telegram.&lt;/p&gt;
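
&lt;p&gt;Rate-based rules such as "5xx above 1%" need a little state, because a single response tells you nothing about the rate. Here is a rough sketch of a sliding-window counter. The class name, the 60-second window, and the &lt;code&gt;min_requests&lt;/code&gt; floor are my assumptions; tune them to your traffic.&lt;/p&gt;

```python
from collections import deque


class ErrorRateMonitor:
    """Track the share of 5xx responses over a sliding time window."""

    def __init__(self, window_s: float = 60.0, threshold: float = 0.01, min_requests: int = 100):
        self.window_s = window_s
        self.threshold = threshold
        self.min_requests = min_requests  # avoid paging on 1 error out of 3 requests
        self.events = deque()  # (timestamp, is_error) pairs, oldest first

    def record(self, now: float, status: int) -> bool:
        """Record one response; return True when the rate is above the threshold."""
        self.events.append((now, status >= 500))
        # Evict everything older than the window.
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()
        total = len(self.events)
        if self.min_requests > total:
            return False
        errors = sum(1 for _, is_error in self.events if is_error)
        return errors / total > self.threshold
```

&lt;p&gt;Call &lt;code&gt;record()&lt;/code&gt; from your response middleware and send an alert when it returns &lt;code&gt;True&lt;/code&gt;. It stays true for as long as the rate is high, so pair it with the dedupe from Step 5.&lt;/p&gt;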

&lt;h2&gt;
  
  
  A note on alert &lt;em&gt;resolution&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;A subtle thing most tutorials skip: alerts should also tell you when something is &lt;em&gt;fixed&lt;/em&gt;. If your job queue depth crossed 1000 and you got pinged, you also want a "back to normal" ping when it recovers. Otherwise you spend 20 minutes manually checking the dashboard.&lt;/p&gt;

&lt;p&gt;The simplest pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;inAlertState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkQueueDepth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;inAlertState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`🔴 Queue depth: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;inAlertState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;depth&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;inAlertState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;alert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`🟢 Queue back to &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;depth&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;inAlertState&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two messages per incident, not 200. Your bot stays usable.&lt;/p&gt;
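
&lt;p&gt;If you track more than one metric, the same fire-and-resolve pattern can be generalized. This is one possible shape in Python, with a hypothetical &lt;code&gt;ThresholdWatcher&lt;/code&gt; that keeps one flag per metric name. As in the snippet above, the recovery threshold sits well below the firing threshold so a value hovering near the limit does not flap, and the state lives in process memory.&lt;/p&gt;

```python
from typing import Dict, Optional


class ThresholdWatcher:
    """Emit one message when a metric crosses `high`, one when it falls below `low`."""

    def __init__(self, high: float, low: float):
        self.high = high
        self.low = low  # lower than `high` so a value near the limit does not flap
        self.firing: Dict[str, bool] = {}

    def check(self, name: str, value: float) -> Optional[str]:
        """Return the message to send, or None when nothing changed."""
        firing = self.firing.get(name, False)
        if value > self.high and not firing:
            self.firing[name] = True
            return f"🔴 {name}: {value}"
        if self.low > value and firing:
            self.firing[name] = False
            return f"🟢 {name} back to {value}"
        return None
```

&lt;p&gt;Pass any non-&lt;code&gt;None&lt;/code&gt; result straight to &lt;code&gt;alert()&lt;/code&gt;. On a multi-instance deployment the flags would need to live somewhere shared, such as the Redis instance used for dedupe.&lt;/p&gt;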

&lt;h2&gt;
  
  
  When to stop building this and use a platform
&lt;/h2&gt;

&lt;p&gt;Everything above takes about an evening to wire up properly. For a small team, that's a great trade.&lt;/p&gt;

&lt;p&gt;There are three signs it's time to stop building this and use a platform that does it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You're spending more than 1 day a quarter maintaining your alerting setup.&lt;/li&gt;
&lt;li&gt;You realize you need &lt;em&gt;pattern&lt;/em&gt; detection (retry loops, token spikes, anomalous response times) — those are much harder to write yourself than threshold alerts.&lt;/li&gt;
&lt;li&gt;You've muted your own bot more than once.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's the gap &lt;strong&gt;HostingGuru&lt;/strong&gt; fills. The platform tails your production logs, runs pattern detection automatically (retry loops, token spikes, hot fingerprints, anomalous latency, silent cron failures), and pings you on Telegram with a link to the relevant logs. No code in your app, no Redis dedupe to maintain — it's part of the hosting layer. €19/mo Hobby, €35/mo Pro.&lt;/p&gt;

&lt;p&gt;If you're already deployed somewhere else, the homemade Telegram setup above is a fine start — and probably better than no alerts at all.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do after you finish reading this
&lt;/h2&gt;

&lt;p&gt;Concretely, in the next 30 minutes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create the bot via BotFather.&lt;/li&gt;
&lt;li&gt;Get your chat ID with &lt;code&gt;getUpdates&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Add &lt;code&gt;TELEGRAM_BOT_TOKEN&lt;/code&gt; and &lt;code&gt;TELEGRAM_CHAT_ID&lt;/code&gt; to your env vars on whatever platform you use.&lt;/li&gt;
&lt;li&gt;Drop the &lt;code&gt;alert(message)&lt;/code&gt; helper into your codebase.&lt;/li&gt;
&lt;li&gt;Wire it to your top-level uncaught exception handler.&lt;/li&gt;
&lt;li&gt;Add the dedupe logic so a runaway loop doesn't spam you.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's the minimum viable production alerting setup for any app. It costs $0, works in any country, lives on your phone, and survives every channel rotation in your team.&lt;/p&gt;

&lt;p&gt;If you wire something interesting on top of this — anomaly detection, business event alerts, anything cool — drop it in the comments. I'm always looking for new patterns to steal.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Previous posts in this series:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;1. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/heroku-just-went-into-sustaining-engineering-mode-here-are-5-alternatives-whose-free-tier-58id"&gt;Heroku just went into "sustaining engineering mode." Here are 5 alternatives whose free tier actually doesn't sleep&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;2. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/i-built-my-mvp-with-claude-code-now-i-need-to-deploy-it-heres-what-nobody-tells-you-2c8c"&gt;I built my MVP with Claude Code. Now I need to deploy it. Here's what nobody tells you.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;3. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/your-ai-app-is-silently-burning-2000month-and-you-dont-know-it-here-are-the-5-patterns-that-51pn"&gt;Your AI app is silently burning $2,000/month and you don't know it. Here are the 5 patterns that bite founders.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>monitoring</category>
      <category>devops</category>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Your AI app is silently burning $2,000/month and you don't know it. Here are the 5 patterns that bite founders.</title>
      <dc:creator>Chalom Ellezam</dc:creator>
      <pubDate>Wed, 29 Apr 2026 13:47:32 +0000</pubDate>
      <link>https://dev.to/chalom_ellezam_5989bce65e/your-ai-app-is-silently-burning-2000month-and-you-dont-know-it-here-are-the-5-patterns-that-51pn</link>
      <guid>https://dev.to/chalom_ellezam_5989bce65e/your-ai-app-is-silently-burning-2000month-and-you-dont-know-it-here-are-the-5-patterns-that-51pn</guid>
      <description>&lt;p&gt;&lt;em&gt;Disclosure: I'm a senior backend tech lead and I run HostingGuru, where built-in AI monitoring is the feature I'm proudest of. This article will mention HostingGuru once at the end, but the patterns and detection methods below work on any platform — I want this useful even if you never become a customer.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The cleanest version of this story is one I keep hearing from founders, with small variations each time:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"We woke up to a $2,400 OpenAI bill. The product still works. Sentry is green. Our error rate is normal. We have no idea what happened."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then they dig in. They find a webhook handler that's been retrying a Stripe event for 11 days because a key was rotated and the retry logic capped its backoff at "30 minutes between attempts" instead of "stop after 24 hours." Each retry calls an LLM to summarize the event for an internal log. That's 11 days × 48 retries a day × 8K tokens × $0.04. The math is unforgiving.&lt;/p&gt;

&lt;p&gt;Or they find an agent that's been self-triggering. Or a context window that quietly grew from 4K to 80K tokens because nobody noticed a bug stuffing the entire conversation history into every prompt. Or a cron job that runs at 3am and produces output nobody reads, but produces it via Claude Sonnet at $3 per million input tokens.&lt;/p&gt;

&lt;p&gt;This is the 2026 version of a problem that used to be small. AI made it expensive.&lt;/p&gt;

&lt;p&gt;I want to walk you through the five patterns I see most often, why they're invisible to traditional monitoring, and what you can actually do about them tonight.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is harder than it used to be
&lt;/h2&gt;

&lt;p&gt;Pre-AI, a runaway loop in your app was annoying. It maxed out a CPU, your alerting noticed the CPU pegged, you got paged, you fixed it. Total damage: a few hours of degraded service, maybe a small AWS bill bump.&lt;/p&gt;

&lt;p&gt;Post-AI, a runaway loop is &lt;em&gt;expensive&lt;/em&gt;. Each iteration calls an LLM. Each LLM call costs real money. Worst of all, &lt;strong&gt;the loop doesn't show up as a problem in any of your existing tools&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sentry&lt;/strong&gt;: aggregates errors by fingerprint. Same retry loop = "1 issue, +850 events." It looks like one bug, not eight hundred.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CloudWatch / Datadog&lt;/strong&gt;: traffic and CPU look fine — a retry loop is just a steady stream of requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stripe / your billing dashboard&lt;/strong&gt;: shows charges &lt;em&gt;after&lt;/em&gt; they happen, on a 24-48h delay.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Your inbox&lt;/strong&gt;: silent. The OpenAI / Anthropic / Stripe APIs don't email you when one customer is making 50,000 calls an hour.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first signal you usually get is the credit card alert from your bank. By then, you're $1,000+ in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 1: The infinite retry loop
&lt;/h2&gt;

&lt;p&gt;The classic. A background job hits what looks like a transient error. Your retry logic backs off exponentially, settles at one retry every 30 minutes, and never gives up. But the underlying issue is permanent: a webhook secret was rotated, an API key was deactivated, a file path was renamed. The job will retry forever.&lt;/p&gt;

&lt;p&gt;If the job involves an LLM call (summarizing the error, deciding next action, generating a fallback response), every retry costs tokens. Multiply by however long until someone notices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real example I saw last month&lt;/strong&gt;: a B2B SaaS doing email parsing. Their email parser used GPT-4 to extract structured data. One specific email format consistently failed validation downstream. The retry queue kept retrying. 11,000 emails × 6 retries × $0.10 per call = $6,600 wasted before the founder noticed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to detect it tonight&lt;/strong&gt;: query your job queue (BullMQ, Sidekiq, Celery, whatever) for jobs that have been "active" or "failed" for more than 24 hours. Set a hard cap: any job that retries more than 10 times either pages someone or gets dropped, no exceptions.&lt;/p&gt;
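
&lt;p&gt;As a rough sketch of that hard cap, assume each job record exposes an attempt count and a first-attempt timestamp. The field names &lt;code&gt;attempts&lt;/code&gt; and &lt;code&gt;firstAttemptAt&lt;/code&gt; are hypothetical, not taken from BullMQ, Sidekiq, or Celery, so map them to whatever your queue actually returns:&lt;/p&gt;

```javascript
// Hypothetical job shape: { id, attempts, firstAttemptAt } (timestamps in ms).
// Real queues expose similar data under different names.
const MAX_ATTEMPTS = 10;
const MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24 hours

// Returns the jobs that retried too often or have been stuck too long.
function findRunawayJobs(jobs, now = Date.now()) {
  return jobs.filter((job) => {
    const tooManyRetries = job.attempts > MAX_ATTEMPTS;
    const stuckTooLong = now - job.firstAttemptAt > MAX_AGE_MS;
    return tooManyRetries || stuckTooLong;
  });
}

// Example with made-up data.
const now = Date.UTC(2026, 3, 29, 12, 0, 0);
const hour = 60 * 60 * 1000;
const sample = [
  { id: "a", attempts: 2, firstAttemptAt: now - 1 * hour },
  { id: "b", attempts: 14, firstAttemptAt: now - 3 * hour },
  { id: "c", attempts: 4, firstAttemptAt: now - 30 * hour },
];
const runaway = findRunawayJobs(sample, now).map((job) => job.id);
console.log(runaway); // jobs "b" and "c" are flagged
```

&lt;p&gt;Whatever this returns is your list to page on or drop. Running it on a schedule is probably enough for a small team.&lt;/p&gt;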

&lt;h2&gt;
  
  
  Pattern 2: The self-triggering agent
&lt;/h2&gt;

&lt;p&gt;Multi-agent systems are particularly good at this one. Agent A produces output. Agent B reads Agent A's output and decides "I should ping Agent A for clarification." Agent A produces a clarification. Agent B reads it and decides "I should clarify the clarification." The conversation continues until you run out of context or money, whichever comes first.&lt;/p&gt;

&lt;p&gt;I saw this kill a YC startup's monthly budget in 14 hours. They'd shipped a "research assistant" that orchestrated three agents. A user typed an ambiguous query. The agents started clarifying each other. By the time the user's session timed out, the system had made 4,200 LLM calls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to detect it tonight&lt;/strong&gt;: hard-cap your multi-agent loops at 10 turns. After 10 back-and-forth iterations, the system returns whatever it has and exits. If you're using LangChain or similar, this is one config flag. If you've written your own orchestration, it's three lines of code.&lt;/p&gt;
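
&lt;p&gt;For hand-rolled orchestration, the cap can be as small as a bounded loop. This is only a sketch: &lt;code&gt;runWithTurnCap&lt;/code&gt; and the &lt;code&gt;step&lt;/code&gt; callback are hypothetical names, and &lt;code&gt;step&lt;/code&gt; stands in for whatever actually calls your agents.&lt;/p&gt;

```javascript
// Runs an agent exchange for at most maxTurns iterations.
// `step` is a hypothetical callback: it receives the current state and
// returns { state, done }. In a real system it would call your agents.
async function runWithTurnCap(step, initialState, maxTurns = 10) {
  let state = initialState;
  for (let turn = 1; maxTurns >= turn; turn++) {
    const result = await step(state, turn);
    state = result.state;
    if (result.done) {
      return { state, turns: turn, capped: false };
    }
  }
  // Cap reached: return whatever we have instead of looping on.
  return { state, turns: maxTurns, capped: true };
}

// Two "agents" that would keep clarifying each other forever.
const neverDone = async (state) => ({ state: state + 1, done: false });

runWithTurnCap(neverDone, 0).then((outcome) => {
  console.log(outcome); // capped is true after 10 turns
});
```

&lt;p&gt;Returning a &lt;code&gt;capped&lt;/code&gt; flag instead of throwing lets the caller hand the user a partial answer and log the event for later review.&lt;/p&gt;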

&lt;h2&gt;
  
  
  Pattern 3: The "fingerprint aggregation" blind spot
&lt;/h2&gt;

&lt;p&gt;This is the most insidious one because it specifically defeats Sentry / Bugsnag / Honeybadger.&lt;/p&gt;

&lt;p&gt;Error monitoring tools group errors by &lt;em&gt;fingerprint&lt;/em&gt; (basically a hash of the stack trace + error message). Same fingerprint = "this is the same bug." The dashboard shows "+842 events on this issue" with a slowly incrementing counter.&lt;/p&gt;

&lt;p&gt;The problem: a &lt;em&gt;retry loop firing the same error 800 times&lt;/em&gt; looks identical to &lt;em&gt;800 different users hitting the same bug once&lt;/em&gt;. Your error tool can't tell them apart. Both show up as "+800 events on the same issue." If you're not specifically watching event-rate per fingerprint, you'll miss the loop entirely.&lt;/p&gt;

&lt;p&gt;The default Sentry alerts trigger on &lt;em&gt;new&lt;/em&gt; issues, not on suddenly-very-noisy &lt;em&gt;existing&lt;/em&gt; issues. So a bug that's been silently looping at 50/sec for 6 hours doesn't trip any alerts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to detect it tonight&lt;/strong&gt;: add a custom Sentry alert on "any single issue with &amp;gt; 100 events per hour." Most teams forget this exists. It's the alert that catches the silent loops.&lt;/p&gt;
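
&lt;p&gt;If your error tool can't express that rule, the same check can be approximated in your own code. The sketch below counts events per fingerprint over a sliding one-hour window. &lt;code&gt;createRateTracker&lt;/code&gt; and its parameters are illustrative names, not part of Sentry's SDK, and the fingerprint string is made up.&lt;/p&gt;

```javascript
// Tracks event timestamps per fingerprint and reports when one
// fingerprint exceeds `limit` events within `windowMs`.
function createRateTracker(limit = 100, windowMs = 60 * 60 * 1000) {
  const events = new Map(); // fingerprint -> array of timestamps (ms)

  return function record(fingerprint, now = Date.now()) {
    const cutoff = now - windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (events.get(fingerprint) || []).filter((t) => t > cutoff);
    recent.push(now);
    events.set(fingerprint, recent);
    return { count: recent.length, hot: recent.length > limit };
  };
}

// Example: the same error firing once a second trips the limit of 100.
const record = createRateTracker();
let status;
for (let i = 1; 101 >= i; i++) {
  status = record("stripe-webhook:InvalidSignature", i * 1000);
}
console.log(status); // count is 101, hot is true
```

&lt;p&gt;This keeps everything in memory and rescans the array on each event, which is fine for a single process at modest volume. A shared counter store would be the natural next step if you run several instances.&lt;/p&gt;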

&lt;h2&gt;
  
  
  Pattern 4: The context window that quietly grew
&lt;/h2&gt;

&lt;p&gt;Here's how this happens: you ship an AI feature with a 4K-token context window. Works fine in dev. In prod, a customer accumulates a long conversation history. Your code (or worse, Claude's code from when it built the feature) appends the &lt;em&gt;entire&lt;/em&gt; conversation history to every new prompt without truncation.&lt;/p&gt;

&lt;p&gt;Six months later, that customer has a 60K-token conversation. Every interaction now costs 15× what it did at launch. Multiplied across all your power users, you've quietly 5x'd your per-user AI cost without noticing — because the increase is gradual and the dashboard just shows "monthly OpenAI bill went up."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to detect it tonight&lt;/strong&gt;: log the input token count of every LLM call (most SDKs return this). Plot the p95 input token count over time. If it's trending up, you have context bloat. The fix is usually a sliding window or a summarization step.&lt;/p&gt;
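
&lt;p&gt;A minimal version of that check might look like the sketch below. It assumes you already collect the input token count your SDK reports for each call. The function name and the sample numbers are made up for illustration.&lt;/p&gt;

```javascript
// Nearest-rank p95 of a list of input token counts.
function p95(values) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  // Integer arithmetic avoids floating-point surprises in the rank.
  const rank = Math.ceil((95 * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Made-up samples: most calls stay small, a few power users bloat.
const week1 = [900, 1100, 1200, 1000, 950, 1300, 1150, 1050, 980, 4000];
const week8 = [950, 1200, 1250, 1100, 20000, 1400, 35000, 1080, 60000, 990];

console.log(p95(week1), p95(week8)); // 4000 60000
```

&lt;p&gt;The median of both samples barely moves, which is why p95 is the number to plot: context bloat shows up in the tail first.&lt;/p&gt;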

&lt;p&gt;This is also where I see the most "Claude Code did this and now I owe $400" stories. Claude is generous with context — it'll happily concatenate everything if you don't tell it not to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pattern 5: The cron job that never reads its output
&lt;/h2&gt;

&lt;p&gt;Less dramatic, more common: a &lt;code&gt;0 3 * * *&lt;/code&gt; cron job kicks off every night at 3am. It runs an analysis. It generates a report. It writes the report to a database table or an S3 bucket. &lt;em&gt;Nobody reads the report.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This was useful when you had it built last year. Then the team member who used it left. Then the report became stale. Then it became wrong. But the cron keeps running every night, calling the LLM, eating tokens. Quietly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to detect it tonight&lt;/strong&gt;: list every cron job in your system. For each one, ask: "if this stopped running tomorrow, would anyone notice within 7 days?" If the answer is no, kill it. (You can always add it back if someone complains.)&lt;/p&gt;

&lt;h2&gt;
  
  
  What "good monitoring" looks like for AI apps
&lt;/h2&gt;

&lt;p&gt;Traditional monitoring tools (Sentry, Datadog, CloudWatch) are great at finding &lt;em&gt;errors&lt;/em&gt;. They're bad at finding &lt;em&gt;patterns&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The patterns above all share two properties:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;They're not errors.&lt;/strong&gt; They're successful behavior at high volume.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;They don't trigger alerts.&lt;/strong&gt; Each individual call looks fine. Only the aggregate rate is wrong.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What you actually need is a layer that watches &lt;em&gt;behavior&lt;/em&gt;, not errors. Some signals worth tracking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Token usage per user per day (spike = investigation trigger)&lt;/li&gt;
&lt;li&gt;LLM call rate per service (steady ≠ healthy if it's been steady for 18 hours unattended)&lt;/li&gt;
&lt;li&gt;Job queue length over time (growing slowly = retry loop accumulating)&lt;/li&gt;
&lt;li&gt;Per-fingerprint event rate (the Sentry blind spot above)&lt;/li&gt;
&lt;li&gt;Cost per active user (rising = something's bloating somewhere)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can build this yourself. It takes about 2 weeks of work for a backend engineer. You can also use a platform that has it built in, which is what I want to be honest about now.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built (and why I built it)
&lt;/h2&gt;

&lt;p&gt;I'm a senior backend tech lead. I've shipped production systems for BeReal, Oney, Ringover. I built &lt;strong&gt;HostingGuru&lt;/strong&gt; because the gap between "Sentry tells me when something errors" and "I get a Telegram ping at 3am that says 'this Stripe webhook handler has retried 200 times in the last hour, here's the link to the logs'" was the gap I kept finding myself filling manually for clients.&lt;/p&gt;

&lt;p&gt;HostingGuru's AI monitoring tails your production logs and alerts on patterns, not errors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retry loops&lt;/strong&gt; detected when the same operation fires faster than expected, regardless of error rate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token spikes&lt;/strong&gt; detected when a user's per-day LLM cost jumps significantly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hot fingerprints&lt;/strong&gt; detected when one Sentry-style issue suddenly explodes in event rate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anomalous response times&lt;/strong&gt; detected when p95 latency jumps without an obvious traffic cause&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Silent cron failures&lt;/strong&gt; detected when a job that ran consistently for 30 days suddenly stops&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Alerts go to &lt;strong&gt;Telegram&lt;/strong&gt; by default — because that's where founders actually look at 3am. (Email and Slack also supported.)&lt;/p&gt;

&lt;p&gt;It works on any app deployed to HostingGuru, on any of the 14+ frameworks we support. The alerts and pattern detection are part of the platform — no extra config, no extra subscription.&lt;/p&gt;

&lt;p&gt;If you've ever woken up to a surprise bill, this is the layer that would have caught it before it happened.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do tonight, regardless of which platform you use
&lt;/h2&gt;

&lt;p&gt;You don't need to switch hosts to catch most of these. Five concrete moves:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Run a query on your job queue&lt;/strong&gt; for any job retrying more than 10 times. Cancel them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cap your multi-agent loops at 10 turns&lt;/strong&gt; in code. One commit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add a Sentry alert&lt;/strong&gt; on "any single issue with &amp;gt; 100 events per hour."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log token counts&lt;/strong&gt; on every LLM call and track how p95 input tokens trend over time. If the trend is upward, fix your context truncation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;List every cron job&lt;/strong&gt; and kill any whose output nobody reads.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These five moves take an evening. They prevent the vast majority of the surprise-bill stories I hear. Whether you do them on HostingGuru, Render, Railway, AWS, or your own VPS, you should do them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The harder truth
&lt;/h2&gt;

&lt;p&gt;The hardest part of running an AI-powered product in 2026 isn't building it. AI tools made building it 10x cheaper. The hard part is &lt;em&gt;operating&lt;/em&gt; it — knowing what's running, what it's costing, what's broken in a way that doesn't show up as broken.&lt;/p&gt;

&lt;p&gt;The cost of a runaway loop went from "annoying" to "expensive" the moment AI became a per-call API charge. The tools we use to monitor production didn't get the memo. Sentry was designed in a world where errors were the primary problem; it's still the best at that, but it's not the right tool for "your tokens are leaking somewhere."&lt;/p&gt;

&lt;p&gt;Until that gap closes across the whole industry, you have to build it yourself or use a platform that has it built in. Either path is fine. The one path that doesn't end well is "we'll find out at the end of the month."&lt;/p&gt;

&lt;p&gt;If you've had a $2,000 surprise bill, what was the cause? I'm collecting these stories — drop them in the comments. The patterns repeat surprisingly often.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Previous posts in this series:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;1. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/heroku-just-went-into-sustaining-engineering-mode-here-are-5-alternatives-whose-free-tier-58id"&gt;Heroku just went into "sustaining engineering mode." Here are 5 alternatives whose free tier actually doesn't sleep&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;2. &lt;a href="https://dev.to/chalom_ellezam_5989bce65e/i-built-my-mvp-with-claude-code-now-i-need-to-deploy-it-heres-what-nobody-tells-you-2c8c"&gt;I built my MVP with Claude Code. Now I need to deploy it. Here's what nobody tells you.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devops</category>
      <category>webdev</category>
      <category>observability</category>
    </item>
    <item>
      <title>I built my MVP with Claude Code. Now I need to deploy it. Here's what nobody tells you.</title>
      <dc:creator>Chalom Ellezam</dc:creator>
      <pubDate>Mon, 27 Apr 2026 18:54:41 +0000</pubDate>
      <link>https://dev.to/chalom_ellezam_5989bce65e/i-built-my-mvp-with-claude-code-now-i-need-to-deploy-it-heres-what-nobody-tells-you-2c8c</link>
      <guid>https://dev.to/chalom_ellezam_5989bce65e/i-built-my-mvp-with-claude-code-now-i-need-to-deploy-it-heres-what-nobody-tells-you-2c8c</guid>
      <description>&lt;p&gt;&lt;em&gt;Disclosure up front: I'm a senior backend tech lead by trade and I run HostingGuru, one of the platforms mentioned at the end. I tried to make this useful even if you pick something else.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;In 2026, a strange thing has happened. Hundreds of thousands of people who couldn't write a &lt;code&gt;for&lt;/code&gt; loop two years ago are now shipping working software. They open Claude Code, Cursor, Lovable, or Bolt, describe what they want, and a few hours later they have a working app on their laptop.&lt;/p&gt;

&lt;p&gt;I've watched this from up close. Friends who run agencies, founders who used to pay €30k for an MVP, business school grads with no engineering background — they all show me the same screen: a Mac terminal, an app running on &lt;code&gt;localhost:3000&lt;/code&gt;, and a face that's half pride, half panic.&lt;/p&gt;

&lt;p&gt;The pride: &lt;em&gt;"I built this myself."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The panic: &lt;em&gt;"How do I put it on the internet?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That panic is what this post is about. Because the gap between "it works on my laptop" and "people can use it from a phone in another country" is wider than you think — and almost no one is writing for the person facing it for the first time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The deployment cliff nobody warns you about
&lt;/h2&gt;

&lt;p&gt;Building software with AI got 10x easier in 2 years. Deploying it didn't.&lt;/p&gt;

&lt;p&gt;When you ask Claude Code to build a feature, it works. Maybe not perfectly, but it works. When you ask it &lt;em&gt;"how do I deploy this?"&lt;/em&gt;, you get one of three answers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A list of commands that assume you already know what Docker is&lt;/li&gt;
&lt;li&gt;A reference to "your hosting provider" — implying you have one&lt;/li&gt;
&lt;li&gt;A polite suggestion to "consult your DevOps team"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You don't have a DevOps team. You don't have a hosting provider. You have a laptop and a working app and a launch date.&lt;/p&gt;

&lt;p&gt;This is the deployment cliff. It's the moment where the AI assistant that taught you to code stops being useful, because deployment is half technical knowledge and half operational vigilance — and operational vigilance is exactly what AI is &lt;em&gt;worst&lt;/em&gt; at.&lt;/p&gt;

&lt;h2&gt;
  
  
  The four things you actually need
&lt;/h2&gt;

&lt;p&gt;Stripped of jargon, putting an app on the internet is four things:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. A computer that's always on
&lt;/h3&gt;

&lt;p&gt;When your app runs on &lt;code&gt;localhost:3000&lt;/code&gt;, your laptop is the computer. The moment you close the lid, the app is gone. To stay online 24/7, your code needs to live on a computer that someone else keeps running. That's hosting.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. An address people can type
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;localhost:3000&lt;/code&gt; only works for you. To let other people in, the computer needs a public address — usually a domain like &lt;code&gt;myapp.com&lt;/code&gt;. You buy this from a registrar (Namecheap, OVH, Gandi). Then you point it at the computer.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. A place to store your secrets
&lt;/h3&gt;

&lt;p&gt;Your app probably has API keys: for OpenAI, for Stripe, for whatever other services you're using. These can't live in your code, where anyone with access to the repository could read them. They live in &lt;em&gt;environment variables&lt;/em&gt;, secure storage that your hosting platform provides.&lt;/p&gt;
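
&lt;p&gt;In a Node.js app, reading those values typically goes through &lt;code&gt;process.env&lt;/code&gt;. The sketch below fails fast when a key is missing. &lt;code&gt;requireEnv&lt;/code&gt; is a hypothetical helper, and &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; is just an example variable name.&lt;/p&gt;

```javascript
// Reads a required environment variable or throws with a clear message,
// so a missing key fails at startup instead of mid-request.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example with a plain object standing in for process.env.
const fakeEnv = { OPENAI_API_KEY: "sk-example" };
console.log(requireEnv("OPENAI_API_KEY", fakeEnv)); // prints "sk-example"
```

&lt;p&gt;In real code you would call &lt;code&gt;requireEnv("OPENAI_API_KEY")&lt;/code&gt; once at startup and let the platform supply the value.&lt;/p&gt;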

&lt;h3&gt;
  
  
  4. A way to know when something breaks
&lt;/h3&gt;

&lt;p&gt;This is the one nobody mentions. Once your app is live, things break. APIs go down, retries fail, payments don't go through. The question isn't whether something will break — it's whether you'll know about it before your users do.&lt;/p&gt;

&lt;p&gt;That's what monitoring is. It's the layer that watches your app and tells you when something's wrong.&lt;/p&gt;

&lt;p&gt;If you're a non-technical founder, item #4 is the part that will quietly kill your launch. Not because the bug is hard to fix, but because you won't know it happened. Three days later a user emails &lt;em&gt;"hey, your signup form has been broken since Tuesday"&lt;/em&gt; and you realize you've lost 70% of the people who tried to sign up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why most platforms aren't built for you
&lt;/h2&gt;

&lt;p&gt;Now you Google &lt;em&gt;"deploy app"&lt;/em&gt; and you hit a wall of options. Heroku, Vercel, Render, Railway, Fly.io, AWS, Google Cloud, DigitalOcean. Let me save you the research:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS, Google Cloud, Azure&lt;/strong&gt;: Skip. These are infrastructure-as-toolboxes. They're powerful and cheap &lt;em&gt;if&lt;/em&gt; you're willing to spend three weeks learning them. You're not. Don't even click.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Heroku&lt;/strong&gt;: Was the easy answer for a decade. Entered "sustaining engineering mode" in February 2026 — meaning the company stopped shipping new features. Still works, but the trajectory is downward and the free tier is gone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vercel, Netlify&lt;/strong&gt;: Great for &lt;em&gt;frontends&lt;/em&gt;. If your app is purely a Next.js / React / static site, these are excellent. The moment you have a backend (a Node.js API, a Python script, a database connection), you're going to fight against the Serverless Functions abstraction. If you're not technical, this fight will exhaust you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Render&lt;/strong&gt;: The closest to old Heroku. Solid product. The catch: the free tier sleeps after 15 minutes of inactivity, and cold starts can take around 30 seconds. For a demo or a small product, that's a problem: your users see a blank screen for 30 seconds and many leave.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Railway&lt;/strong&gt;: Slick UX, usage-based pricing. Good for small projects. The catch: usage-based billing is fair when traffic is small and surprising when traffic spikes. A runaway script can produce a 4-figure bill quietly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fly.io&lt;/strong&gt;: Powerful, multi-region. Requires a &lt;code&gt;fly.toml&lt;/code&gt; config file you'll have to learn to write or paste from documentation. Fine if you're a developer, more friction if you're not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Self-hosted (a VPS on OVH/DigitalOcean + something like Coolify)&lt;/strong&gt;: Cheapest. You manage the server: OS updates, security patches, backups, SSL renewals, monitoring setup. Everyone tells you it's "not that hard." It actually is — when you're trying to focus on your product, not on Linux.&lt;/p&gt;

&lt;p&gt;You see the pattern. Every option assumes you know one or two things you don't yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "deploying for non-developers" actually needs to look like
&lt;/h2&gt;

&lt;p&gt;Here's the honest list of what someone shipping their first AI-built MVP needs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One simple action&lt;/strong&gt;: connect your GitHub repo, click deploy. No &lt;code&gt;fly.toml&lt;/code&gt;, no &lt;code&gt;Dockerfile&lt;/code&gt;, no YAML, no terminal commands.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A free tier that doesn't sleep&lt;/strong&gt;: because if you're showing a demo to your first 10 users and the site takes 30 seconds to load, you've lost the demo.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain + SSL handled for you&lt;/strong&gt;: you should never have to type &lt;code&gt;certbot&lt;/code&gt;. Type your domain, the platform deals with HTTPS.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment variables in a clean UI&lt;/strong&gt;: paste your API keys, save, done.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring built in, not bolted on&lt;/strong&gt;: when something breaks at 3am, the platform should tell you. Ideally on a channel you actually check (Telegram, email, Slack).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable pricing&lt;/strong&gt;: a flat monthly tier. €19, €29, whatever. Not "usage-based" that surprises you. You know your bill before the month starts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A human you can ask&lt;/strong&gt;: when you're stuck, you don't want a 200-page documentation site. You want someone who replies in a Discord or by email and says &lt;em&gt;"oh, paste this in your env vars."&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The platforms above each tick some boxes. None tick all of them, with one exception I want to be honest about.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built (and why I built it)
&lt;/h2&gt;

&lt;p&gt;I'm a senior backend tech lead. I've shipped production systems for BeReal, Oney, Ringover. I've watched hundreds of friends try to deploy side projects and either give up or burn money on infrastructure they don't need.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;HostingGuru&lt;/strong&gt; — a managed PaaS positioned exactly at this gap. It's not the cheapest. It's not the most powerful. It's the one I'd want my non-technical friends to use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Free tier that never sleeps.&lt;/strong&gt; Your demo stays online, period.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connect GitHub, hit deploy.&lt;/strong&gt; No Dockerfile. No YAML. No CLI required.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;14+ frameworks supported&lt;/strong&gt; out of the box: Next.js, Django, Rails, FastAPI, Express, Go, Rust, plus Docker if you want it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in AI monitoring with Telegram alerts.&lt;/strong&gt; It tails your production logs and pings you when something looks off — retry loops draining your tokens at 3am, anomalous error spikes, response time degradation. This is the feature I'm proudest of, because it's the one most often missing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EU + US data centers&lt;/strong&gt; (Germany + Oregon). ISO 27001, GDPR-compliant. Important if your users are in Europe.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Predictable pricing.&lt;/strong&gt; €19/mo Hobby (3 services). €35/mo Pro (10 services, custom domains, encrypted env vars, background workers).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human support.&lt;/strong&gt; I read every Discord message myself.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For non-technical founders specifically, we also offer &lt;strong&gt;white-glove setup&lt;/strong&gt; for the Hobby tier: you tell us your repo, we set up the deployment and the domain, and you start with a working app. €19/mo + a one-time setup. You touch nothing.&lt;/p&gt;

&lt;p&gt;If you're shipping your first Claude Code or Lovable app and you've been staring at AWS pricing tables for an hour, this is built for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do next, regardless of which platform you pick
&lt;/h2&gt;

&lt;p&gt;Whether you go with HostingGuru, Render, Railway, or anything else, this is the order of operations that won't get you stuck:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Push your code to GitHub.&lt;/strong&gt; If you don't have a GitHub account yet, make one. Every modern platform deploys from there.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pick a platform that matches your tech.&lt;/strong&gt; Next.js → most platforms work. Django/Rails → Render or HostingGuru. Static site → Netlify or Vercel.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Buy a domain.&lt;/strong&gt; Cheapest from Namecheap or OVH (€10–15/year). Don't buy through your hosting platform — domains and hosting should be separate, so if you ever switch hosts you keep the domain.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set up environment variables BEFORE deploying.&lt;/strong&gt; Look at your &lt;code&gt;.env&lt;/code&gt; file (the one you don't push to GitHub). Every line in there needs to be set on the platform.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy a "hello world" version first.&lt;/strong&gt; Not your full app — just confirm the platform can build something. Then add complexity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set up monitoring on day one.&lt;/strong&gt; Even just an UptimeRobot ping every 5 minutes to your homepage. You want to know when the site is down before users tell you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set up alerts on a channel you check.&lt;/strong&gt; Email is fine. Telegram is better. Slack if your team uses it. &lt;em&gt;Not&lt;/em&gt; an internal dashboard you'll forget exists.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. That's the whole thing. The first time you do it, it'll take an afternoon. The second time, twenty minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  A note on Claude Code specifically
&lt;/h2&gt;

&lt;p&gt;If you built your app with Claude Code, you have one big advantage: the codebase tends to be cleaner and more conventional than what a beginner would write by hand. Most modern PaaS platforms can deploy it without modification, because Claude tends to use standard project structures (a clear &lt;code&gt;package.json&lt;/code&gt;, a sensible &lt;code&gt;requirements.txt&lt;/code&gt;, environment variables already factored out).&lt;/p&gt;

&lt;p&gt;What you should &lt;em&gt;not&lt;/em&gt; do: ask Claude Code to "deploy this for me." It can't. Deployment requires real-time interaction with the platform's API or dashboard, which an AI agent can't reliably do unattended yet. Claude Code is for &lt;em&gt;building&lt;/em&gt;. Deployment is for you (or for a managed service that handles it on your behalf).&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;If you're reading this because you finished your first MVP last weekend and your laptop is uncomfortably warm, congratulations. The fact that the deployment step feels overwhelming is a sign you're about to learn something most engineers spent years learning. You'll do it once, it'll feel impossible, and then it'll be done.&lt;/p&gt;

&lt;p&gt;If you want to skip that learning curve and have someone do it for you while you focus on the product, we offer that — that's &lt;a href="https://hostingguru.io" rel="noopener noreferrer"&gt;HostingGuru&lt;/a&gt;. If you want to do it yourself with maximum DX, Render is also a great call. Either way: don't let the deployment step block your launch. Pick a platform tonight, ship something visible tomorrow.&lt;/p&gt;

&lt;p&gt;What did you build? I'm curious what people are shipping with Claude Code these days. Drop a link in the comments.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you found this useful, the previous post in this series was "&lt;a href="https://dev.to/chalom_ellezam_5989bce65e/heroku-just-went-into-sustaining-engineering-mode-here-are-5-alternatives-whose-free-tier-58id"&gt;Heroku just went into 'sustaining engineering mode'. Here are 5 alternatives whose free tier actually doesn't sleep&lt;/a&gt;" — a comparison of 5 PaaS options for the developer side of the same question.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>beginners</category>
      <category>webdev</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>Heroku just went into "sustaining engineering mode." Here are 5 alternatives whose free tier actually doesn't sleep.</title>
      <dc:creator>Chalom Ellezam</dc:creator>
      <pubDate>Fri, 24 Apr 2026 08:00:27 +0000</pubDate>
      <link>https://dev.to/chalom_ellezam_5989bce65e/heroku-just-went-into-sustaining-engineering-mode-here-are-5-alternatives-whose-free-tier-58id</link>
      <guid>https://dev.to/chalom_ellezam_5989bce65e/heroku-just-went-into-sustaining-engineering-mode-here-are-5-alternatives-whose-free-tier-58id</guid>
      <description>&lt;p&gt;In February 2026, Heroku quietly announced it was entering &lt;strong&gt;"sustaining engineering mode"&lt;/strong&gt; — shifting focus away from new features toward "stability and security." They also stopped offering Enterprise contracts to new customers.&lt;/p&gt;

&lt;p&gt;Translation, for the rest of us: Heroku is in maintenance mode. It still works. It's not getting better.&lt;/p&gt;

&lt;p&gt;If you've been on Heroku (especially if you came back after the free-dyno era ended), this is a good moment to revisit the landscape. The PaaS market has actually gotten &lt;em&gt;better&lt;/em&gt; since Heroku stopped innovating. The alternatives below all do parts of what Heroku did well — often for less money.&lt;/p&gt;

&lt;p&gt;But here's the question I rarely see asked in "Heroku alternatives" listicles, and it matters a lot for side projects and MVPs:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Does the free tier actually stay up, or does it sleep after 15 minutes of inactivity?&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the #1 gotcha in the alternatives market. A lot of "free" tiers aren't really free if your first user of the day gets a 30-second cold start. Below, each platform is evaluated specifically on that question, plus pricing, free-tier limits, and who it's actually best for.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. HostingGuru
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Free tier: never sleeps.&lt;/strong&gt; This is the specific promise — your app stays online, period. One always-on service, free SSL, GitHub auto-deploy, a free custom subdomain. No credit card.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Paid tiers:&lt;/strong&gt; $19/mo Hobby (3 services, custom domains, encrypted env vars) → $35/mo Pro (10 services, guaranteed resources, background workers, on-demand scripts).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frameworks:&lt;/strong&gt; 14+ including Next.js, Django, Rails, Laravel, FastAPI, Express, Rust, Go, Docker, static sites.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data centers:&lt;/strong&gt; Germany (EU) and Oregon (US). ISO 27001 / GDPR compliant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's different:&lt;/strong&gt; built-in AI monitoring that tails production logs and pings the team on Telegram when something looks off — hangs, retry loops quietly burning tokens at 3am, unusual error spikes. Most PaaS products leave observability entirely to you.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; solo devs, freelancers, and small teams who want a never-sleeps free tier to start on, and a predictable paid path from there. Also: anyone who needs EU hosting for GDPR reasons.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Caveats:&lt;/strong&gt; smaller team than the big players. Discord + email is where help happens.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Render
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Free tier: sleeps.&lt;/strong&gt; Free web services spin down after 15 minutes of inactivity. Cold start is ~30 seconds. There are no free background workers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Paid tiers:&lt;/strong&gt; $7/mo Starter web service (no sleep, 512MB RAM, 0.5 CPU). Managed Postgres from $7/mo. Cron jobs, preview environments, background workers all available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frameworks:&lt;/strong&gt; most things, with Dockerfile fallback for anything else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's different:&lt;/strong&gt; probably the most "Heroku-like" option out there. Blueprints (infra-as-code), preview environments on PRs, cron jobs, background workers, managed Postgres. Polished product, long track record.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; teams who liked the Heroku model and will pay $7+/mo right away. If you specifically want a free tier that stays up, Render isn't it.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Railway
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Free tier: time-limited trial credit.&lt;/strong&gt; You get ~$5 of trial credit that expires. After that, billing is usage-based (roughly $0.000231 per GB-minute of memory, which works out to about $10 per GB per month). There is no perpetual free tier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Paid:&lt;/strong&gt; pay-as-you-go; a typical small web service runs $5–20/mo depending on traffic and memory footprint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frameworks:&lt;/strong&gt; anything that speaks HTTP, really. Nixpacks handles most stacks automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's different:&lt;/strong&gt; slickest developer UX in this list. Deploy-from-GitHub is one-click. Usage-based billing feels fair if your project is genuinely small, and scales smoothly if it grows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; builders who hate pricing tiers and prefer usage-based billing that scales with the app. Not a fit if your MVP needs to run for free while you figure out product-market fit.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Fly.io
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Free tier: 3 shared-cpu-1x VMs, ~256MB RAM each, ~3GB persistent volume.&lt;/strong&gt; They've tightened the free allowance over time (it used to be more generous), but it's still one of the few real "free always-on" options outside HostingGuru.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Paid:&lt;/strong&gt; pay-as-you-go beyond the free resources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frameworks:&lt;/strong&gt; anything Dockerizable. There's a &lt;code&gt;fly.toml&lt;/code&gt; file you'll touch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's different:&lt;/strong&gt; multi-region deployment is built in. Your app can run close to your users globally, not just in a single region. It has a more "infra-native" feel than the other options here and leans toward the ops side of the spectrum.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; devs who want edge/multi-region behavior, are comfortable with slightly lower-level abstractions, and don't mind writing a config file.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. DigitalOcean App Platform
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Free tier: static sites only.&lt;/strong&gt; Dynamic web services start at $5/mo. Managed databases from $15/mo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Paid:&lt;/strong&gt; from $5/mo for a basic web service. Higher tiers unlock more CPU/RAM and autoscaling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frameworks:&lt;/strong&gt; buildpack-based, supports the main stacks and Docker fallback.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's different:&lt;/strong&gt; lives inside the broader DigitalOcean ecosystem — Droplets, managed databases, Spaces object storage, Kubernetes. The upgrade path is clear: if you outgrow App Platform, you can move the same app to Droplets or DOKS without switching providers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; teams who plan to scale beyond PaaS later and want a single provider to grow into.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Free tier stays up?&lt;/th&gt;
&lt;th&gt;Min paid&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HostingGuru&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes, never sleeps&lt;/td&gt;
&lt;td&gt;$19/mo&lt;/td&gt;
&lt;td&gt;Always-on free start + AI monitoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Render&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No (sleeps after 15 min idle)&lt;/td&gt;
&lt;td&gt;$7/mo&lt;/td&gt;
&lt;td&gt;Heroku-like, paid-from-day-one&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Railway&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Trial credit only&lt;/td&gt;
&lt;td&gt;~$5–20/mo usage&lt;/td&gt;
&lt;td&gt;Usage-based, scales with the app&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fly.io&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes (3 shared VMs)&lt;/td&gt;
&lt;td&gt;Pay-as-you-go&lt;/td&gt;
&lt;td&gt;Multi-region / edge&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DO App Platform&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Static only&lt;/td&gt;
&lt;td&gt;$5/mo&lt;/td&gt;
&lt;td&gt;Scale-out path inside DigitalOcean&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  How to actually pick
&lt;/h2&gt;

&lt;p&gt;Three questions cut through the marketing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Does your free tier need to stay up?&lt;/strong&gt;&lt;br&gt;
If yes → HostingGuru or Fly.io. If no → Render is probably the closest to Heroku ergonomics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Do you want a tier ladder or usage-based pricing?&lt;/strong&gt;&lt;br&gt;
Tiers (HostingGuru, Render, DigitalOcean) are predictable and simple. Usage-based (Railway, Fly.io) is fair if your project is small — and occasionally produces surprises if traffic spikes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Where does your data need to live?&lt;/strong&gt;&lt;br&gt;
For European customers or GDPR-sensitive workloads → HostingGuru and Fly.io both have EU regions. Render is US-primary (an EU region exists but is less mature). DigitalOcean has global data centers, but App Platform defaults to the US.&lt;/p&gt;




&lt;h2&gt;
  
  
  Migrating from Heroku
&lt;/h2&gt;

&lt;p&gt;Most of these have a "deploy from GitHub" flow that makes migration fairly painless for standard stacks (Node, Python, Ruby, Go). The tricky parts are usually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled jobs.&lt;/strong&gt; Heroku Scheduler → each platform has its own cron story. Most support cron syntax natively now.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Background workers.&lt;/strong&gt; Test carefully; resource limits differ.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Managed Postgres data migration.&lt;/strong&gt; &lt;code&gt;pg_dump&lt;/code&gt; from Heroku, then a restore into the new managed DB, is the universal answer. Expect roughly 5 minutes of downtime for a small DB, because writes have to stop while the dump and restore run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment variables.&lt;/strong&gt; Most platforms let you bulk import from a &lt;code&gt;.env&lt;/code&gt; file or paste from Heroku config vars.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom domains and SSL.&lt;/strong&gt; Trivial everywhere — swap DNS, SSL auto-provisions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For a Rails or Django app with one web dyno, one worker, and a Postgres DB, a full migration is usually a 1–2 hour job end to end.&lt;/p&gt;
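
&lt;p&gt;The &lt;code&gt;pg_dump&lt;/code&gt; and restore step from the list above can be sketched as a small Python wrapper around the standard PostgreSQL client tools. This is a sketch under assumptions: the function names and the &lt;code&gt;heroku.dump&lt;/code&gt; file name are hypothetical, both connection URLs are placeholders you supply, and &lt;code&gt;pg_dump&lt;/code&gt; and &lt;code&gt;pg_restore&lt;/code&gt; must be installed locally in a version at least as new as the source server.&lt;/p&gt;

```python
import subprocess


def build_dump_command(source_url, dump_path):
    """pg_dump in custom format, which pg_restore can read."""
    return [
        "pg_dump",
        "--format=custom",
        "--no-owner",  # role names rarely match between providers
        "--no-acl",    # skip GRANT/REVOKE statements for the same reason
        f"--file={dump_path}",
        f"--dbname={source_url}",
    ]


def build_restore_command(target_url, dump_path):
    """pg_restore into the new, empty managed database."""
    return [
        "pg_restore",
        "--no-owner",
        "--no-acl",
        "--exit-on-error",
        f"--dbname={target_url}",
        dump_path,
    ]


def migrate(source_url, target_url, dump_path="heroku.dump"):
    """Dump the old database, then load it into the new one.

    Stop writes to the old database first; rows written once the dump
    has started are not carried over. check=True aborts on any failure.
    """
    subprocess.run(build_dump_command(source_url, dump_path), check=True)
    subprocess.run(build_restore_command(target_url, dump_path), check=True)
```

&lt;p&gt;Running &lt;code&gt;migrate&lt;/code&gt; against a throwaway target database first is a cheap way to confirm the dump restores cleanly before you switch DNS. Note that connection URLs passed on the command line can be visible to other users on the same machine, so run this from a machine you trust.&lt;/p&gt;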




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Heroku's transition to "sustaining engineering mode" is, in a way, a gift. It forces a market that had gone quiet to start innovating again, and the alternatives above reflect that.&lt;/p&gt;

&lt;p&gt;If your top priority is &lt;em&gt;"free tier that stays up"&lt;/em&gt; — look at &lt;strong&gt;HostingGuru&lt;/strong&gt; (what I build) or &lt;strong&gt;Fly.io&lt;/strong&gt;. If your priority is &lt;em&gt;"Heroku ergonomics, willing to pay $7/mo from day one"&lt;/em&gt; — &lt;strong&gt;Render&lt;/strong&gt;. If your priority is &lt;em&gt;"usage-based, no tiers"&lt;/em&gt; — &lt;strong&gt;Railway&lt;/strong&gt;. If your priority is &lt;em&gt;"future-proof path to bigger infrastructure"&lt;/em&gt; — &lt;strong&gt;DigitalOcean App Platform&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What are you moving to? I'm curious what specifically pushes people one direction or the other: the free-tier question, the pricing model, the frameworks supported, the region. Leave a comment; I'm happy to answer specific migration questions.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you want to try HostingGuru's always-on free tier, it's at &lt;a href="https://hostingguru.io" rel="noopener noreferrer"&gt;hostingguru.io&lt;/a&gt;. No credit card, no sleep, GitHub-to-production in about 90 seconds. I'd love feedback if you give it a spin.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cloud</category>
      <category>news</category>
      <category>sideprojects</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
