<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sumedhvats</title>
    <description>The latest articles on DEV Community by Sumedhvats (@sumedhvats).</description>
    <link>https://dev.to/sumedhvats</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3017143%2Fb59d4173-ce6c-4f15-9642-50d8ff4c27aa.png</url>
      <title>DEV Community: Sumedhvats</title>
      <link>https://dev.to/sumedhvats</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sumedhvats"/>
    <language>en</language>
    <item>
      <title>How I Shrunk My Docker Images by 98% (Go + Next.js)</title>
      <dc:creator>Sumedhvats</dc:creator>
      <pubDate>Tue, 16 Jun 2026 13:21:51 +0000</pubDate>
      <link>https://dev.to/sumedhvats/how-i-shrunk-my-docker-images-by-98-go-nextjs-196l</link>
      <guid>https://dev.to/sumedhvats/how-i-shrunk-my-docker-images-by-98-go-nextjs-196l</guid>
      <description>&lt;p&gt;I was building &lt;a href="https://paste.sumedh.app" rel="noopener noreferrer"&gt;pasteCTL&lt;/a&gt; — a real-time collaborative paste/code sharing app (&lt;a href="https://github.com/Sumedhvats/pasteCTL_web" rel="noopener noreferrer"&gt;source on GitHub&lt;/a&gt;) — and at some point I opened lazydocker to check on things and saw this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3492pdu4smp5kwqn38h1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3492pdu4smp5kwqn38h1.png" alt="Initial docker image size on lazydocker"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Combined, my two services were sitting at nearly &lt;strong&gt;3.6GB&lt;/strong&gt;. For a Go API and a Next.js frontend.&lt;/p&gt;

&lt;p&gt;The app worked fine. But those numbers were going to slow down every deploy, eat registry storage, and make cold starts painful. So I fixed both of them. Here's the full process, step by step.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Docker Images Get So Big
&lt;/h2&gt;

&lt;p&gt;When you write &lt;code&gt;FROM golang:1.26&lt;/code&gt; or &lt;code&gt;FROM node&lt;/code&gt;, you're pulling an image designed for &lt;em&gt;development&lt;/em&gt; — it includes the compiler, build tools, package managers, debug utilities, and a full OS userland. All of that gets baked into your final image even though none of it runs in production.&lt;/p&gt;

&lt;p&gt;The Go compiler alone is ~600MB. The default &lt;code&gt;node&lt;/code&gt; image (Debian-based) is over 1GB before you install a single dependency. By the time you run &lt;code&gt;npm install&lt;/code&gt; or &lt;code&gt;go mod download&lt;/code&gt;, you're already deep in the hole.&lt;/p&gt;

&lt;p&gt;The fix is &lt;strong&gt;multi-stage builds&lt;/strong&gt;: use one container to compile and build, then throw it away and copy only the output into a minimal runtime container.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1 — Fix the Go Backend
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Before
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; golang:1.26&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; FRONTEND_URL=http://paste.sumedh.app&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; DATABASE_URL=https://notdburl.com&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; git ca-certificates tzdata
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /backend&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;go mod download
&lt;span class="k"&gt;RUN &lt;/span&gt;go build &lt;span class="nt"&gt;-o&lt;/span&gt; main cmd/main.go
&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x main
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8080&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["./main"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This uses &lt;code&gt;golang:1.26&lt;/code&gt; — the full Debian-based image. It compiles the binary and then leaves the entire Go toolchain, Debian base, and all source code sitting inside the final image. Size: &lt;strong&gt;1.60GB&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  After
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;golang:1.26-alpine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apk add &lt;span class="nt"&gt;--no-cache&lt;/span&gt; git ca-certificates tzdata
&lt;span class="k"&gt;RUN &lt;/span&gt;adduser &lt;span class="nt"&gt;-D&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; &lt;span class="s1"&gt;''&lt;/span&gt; appuser

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /backend&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; go.mod go.sum ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;go mod download
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;&lt;span class="nv"&gt;CGO_ENABLED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0 &lt;span class="nv"&gt;GOOS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;linux &lt;span class="nv"&gt;GOARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;amd64 &lt;span class="se"&gt;\
&lt;/span&gt;    go build &lt;span class="nt"&gt;-trimpath&lt;/span&gt; &lt;span class="nt"&gt;-ldflags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"-w -s -extldflags '-static'"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; /app/main ./cmd

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;scratch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;runner&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /usr/share/zoneinfo /usr/share/zoneinfo&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /etc/passwd /etc/passwd&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /app/main /main&lt;/span&gt;

&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; appuser&lt;/span&gt;
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8080&lt;/span&gt;
&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["/main"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Size: &lt;strong&gt;~15MB&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here's what changed and why each change matters:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Switch to &lt;code&gt;golang:1.26-alpine&lt;/code&gt; for the build stage.&lt;/strong&gt; Alpine is a minimal Linux distro. The Alpine-based Go image is a fraction of the Debian one — same compiler, none of the Debian bloat.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;FROM scratch&lt;/code&gt; for the runner.&lt;/strong&gt; This is the biggest backend win. &lt;code&gt;scratch&lt;/code&gt; is a completely empty image — zero bytes, no OS, no shell, nothing. Since Go can compile to a fully static binary with no OS dependencies at all, you don't need an OS in the runner. The final image size is essentially just your binary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Copy &lt;code&gt;go.mod&lt;/code&gt; and &lt;code&gt;go.sum&lt;/code&gt; before copying source.&lt;/strong&gt; Docker caches each layer. If you &lt;code&gt;COPY . .&lt;/code&gt; everything at once, any change to any source file invalidates the dependency download layer and re-downloads all your modules. Copying only the mod files first means that layer only busts when your dependencies actually change.&lt;/p&gt;

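&lt;p&gt;&lt;strong&gt;Add &lt;code&gt;-trimpath&lt;/code&gt; and &lt;code&gt;-extldflags '-static'&lt;/code&gt; to the build.&lt;/strong&gt; &lt;code&gt;-trimpath&lt;/code&gt; removes local file system paths from the compiled binary, which makes builds reproducible across machines. The &lt;code&gt;-w -s&lt;/code&gt; linker flags strip the DWARF debug info and the symbol table, which is where most of the size saving comes from. With &lt;code&gt;CGO_ENABLED=0&lt;/code&gt; the binary is already statically linked, which is what lets it run in a &lt;code&gt;scratch&lt;/code&gt; container; &lt;code&gt;-extldflags '-static'&lt;/code&gt; only takes effect when an external linker is involved, so here it acts as a safeguard against accidentally linking dynamic C libraries.&lt;/p&gt;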

&lt;p&gt;&lt;strong&gt;Create the non-root user in the builder stage.&lt;/strong&gt; Because &lt;code&gt;scratch&lt;/code&gt; is empty, there's no &lt;code&gt;adduser&lt;/code&gt; binary in the runner. You create the user in the builder and copy &lt;code&gt;/etc/passwd&lt;/code&gt; across so the &lt;code&gt;USER appuser&lt;/code&gt; directive has something to reference.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Copy certificates and timezone data from the builder.&lt;/strong&gt; &lt;code&gt;scratch&lt;/code&gt; has no filesystem at all, so anything your app needs at runtime must be explicitly copied. CA certificates are needed for outbound HTTPS calls, and &lt;code&gt;zoneinfo&lt;/code&gt; is needed if your app does anything timezone-aware.&lt;/p&gt;
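
&lt;p&gt;For the non-root user step above, a numeric user is a possible alternative to copying &lt;code&gt;/etc/passwd&lt;/code&gt;. The fragment below is a sketch, not what pasteCTL uses: the UID/GID &lt;code&gt;10001&lt;/code&gt; is an arbitrary value picked for illustration. &lt;code&gt;USER&lt;/code&gt; accepts numeric IDs, so no passwd entry has to exist in the runner.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;# Sketch: runner stage with a numeric user instead of a named one.
# 10001 is an arbitrary, hypothetical UID/GID.
# The certificate and zoneinfo COPY lines are omitted for brevity.
FROM scratch AS runner
COPY --from=builder /app/main /main
USER 10001:10001
EXPOSE 8080
ENTRYPOINT ["/main"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The trade-off is that anything which looks up the current user's name or home directory will find no entry, so the named-user approach is the safer default if your app does that.&lt;/p&gt;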




&lt;h2&gt;
  
  
  Step 2 — Fix the Next.js Frontend
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Before
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; node&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NEXT_PUBLIC_BACKEND_URL = http://paste.sumedh.app&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NEXT_PUBLIC_WS_URL = ws://paste.sumedh.app&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /frontend&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--legacy-peer-deps&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm run build
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 3000&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["npm","run","start"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This uses &lt;code&gt;FROM node&lt;/code&gt; — the full default Node image, over 1GB by itself. Then &lt;code&gt;npm install&lt;/code&gt; pulls in all of &lt;code&gt;node_modules&lt;/code&gt; including every dev dependency: TypeScript, ESLint, webpack, the works. Size: &lt;strong&gt;1.98GB&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  After
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;node:22-alpine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apk add &lt;span class="nt"&gt;--no-cache&lt;/span&gt; libc6-compat
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /frontend&lt;/span&gt;

&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; NEXT_PUBLIC_BACKEND_URL&lt;/span&gt;
&lt;span class="k"&gt;ARG&lt;/span&gt;&lt;span class="s"&gt; NEXT_PUBLIC_WS_URL&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NEXT_PUBLIC_BACKEND_URL=$NEXT_PUBLIC_BACKEND_URL \&lt;/span&gt;
    NEXT_PUBLIC_WS_URL=$NEXT_PUBLIC_WS_URL \
    NEXT_TELEMETRY_DISABLED=1

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; package.json package-lock.json ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm ci &lt;span class="nt"&gt;--legacy-peer-deps&lt;/span&gt;

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;npm run build

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;node:22-alpine&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;runner&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apk add &lt;span class="nt"&gt;--no-cache&lt;/span&gt; libc6-compat

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; NODE_ENV=production \&lt;/span&gt;
    NEXT_TELEMETRY_DISABLED=1 \
    PORT=3000 \
    HOSTNAME=0.0.0.0

&lt;span class="k"&gt;RUN &lt;/span&gt;addgroup &lt;span class="nt"&gt;--system&lt;/span&gt; &lt;span class="nt"&gt;--gid&lt;/span&gt; 1001 nodejs &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    adduser &lt;span class="nt"&gt;--system&lt;/span&gt; &lt;span class="nt"&gt;--uid&lt;/span&gt; 1001 nextjs

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder --chown=nextjs:nodejs /frontend/.next/standalone ./&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder --chown=nextjs:nodejs /frontend/.next/static ./.next/static&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder --chown=nextjs:nodejs /frontend/public ./public&lt;/span&gt;

&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; nextjs&lt;/span&gt;
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 3000&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["node", "server.js"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Size: &lt;strong&gt;190.53MB&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Here's what changed:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Switch from &lt;code&gt;node&lt;/code&gt; to &lt;code&gt;node:22-alpine&lt;/code&gt;.&lt;/strong&gt; Same move as the backend — Alpine drops the Debian userland. &lt;code&gt;node:22-alpine&lt;/code&gt; is around 130MB vs 1GB+ for the default.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Add &lt;code&gt;libc6-compat&lt;/code&gt; to both stages.&lt;/strong&gt; Alpine uses musl libc instead of glibc. Some native Node dependencies in a Next.js project, such as SWC (the Rust-based compiler) or image optimization libraries like Sharp, may ship binaries that expect glibc, and without this compatibility shim the build or the runtime can crash. Because the problem can show up in either place, it goes in both the builder and the runner.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Copy &lt;code&gt;package.json&lt;/code&gt; and &lt;code&gt;package-lock.json&lt;/code&gt; first.&lt;/strong&gt; Same layer caching logic as the Go backend. Your &lt;code&gt;node_modules&lt;/code&gt; only rebuilds when your dependency files change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;npm ci&lt;/code&gt; instead of &lt;code&gt;npm install&lt;/code&gt;.&lt;/strong&gt; &lt;code&gt;npm ci&lt;/code&gt; installs exactly what's in &lt;code&gt;package-lock.json&lt;/code&gt;, skips the resolution step, and is faster and fully reproducible.&lt;/p&gt;

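&lt;p&gt;&lt;strong&gt;Use &lt;code&gt;ARG&lt;/code&gt; for public env variables.&lt;/strong&gt; The original Dockerfile hardcoded the URLs. Next.js inlines &lt;code&gt;NEXT_PUBLIC_*&lt;/code&gt; variables into the client bundle during &lt;code&gt;npm run build&lt;/code&gt;, so they have to be set in the builder stage; setting them only when the container starts has no effect on the browser code. Using &lt;code&gt;ARG&lt;/code&gt; lets you pass them in at build time so the same Dockerfile works across environments.&lt;/p&gt;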

&lt;p&gt;&lt;strong&gt;Copy only the standalone output in the runner.&lt;/strong&gt; Next.js has an &lt;code&gt;output: 'standalone'&lt;/code&gt; mode that produces a minimal self-contained server bundle under &lt;code&gt;.next/standalone&lt;/code&gt; with only the production Node dependencies your app actually needs — not all of &lt;code&gt;node_modules&lt;/code&gt;. You also copy &lt;code&gt;.next/static&lt;/code&gt; and &lt;code&gt;public&lt;/code&gt;. That's it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; You need &lt;code&gt;output: 'standalone'&lt;/code&gt; in your &lt;code&gt;next.config.js&lt;/code&gt; for this to work. Without it, the &lt;code&gt;.next/standalone&lt;/code&gt; directory won't be generated and the &lt;code&gt;COPY --from=builder&lt;/code&gt; step in the runner stage will fail.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Step 3 — The .dockerignore Files
&lt;/h2&gt;

&lt;p&gt;A frequently missed cause of slow builds and inflated images is an oversized build context. Everything in the context directory that isn't excluded can be sent to the Docker daemon, and &lt;code&gt;COPY . .&lt;/code&gt; then copies all of it into the build stage, even files the build never needs. Add these to both directories.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;./backend/.dockerignore&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;.&lt;span class="n"&gt;git&lt;/span&gt;
.&lt;span class="n"&gt;idea&lt;/span&gt;
.&lt;span class="n"&gt;vscode&lt;/span&gt;
*.&lt;span class="n"&gt;md&lt;/span&gt;
&lt;span class="n"&gt;bin&lt;/span&gt;/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;code&gt;./frontend/.dockerignore&lt;/code&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;.&lt;span class="n"&gt;git&lt;/span&gt;
.&lt;span class="n"&gt;next&lt;/span&gt;
&lt;span class="n"&gt;node_modules&lt;/span&gt;
.&lt;span class="n"&gt;env&lt;/span&gt;*.&lt;span class="n"&gt;local&lt;/span&gt;
*.&lt;span class="n"&gt;md&lt;/span&gt;
.&lt;span class="n"&gt;vscode&lt;/span&gt;
.&lt;span class="n"&gt;idea&lt;/span&gt;
&lt;span class="n"&gt;npm&lt;/span&gt;-&lt;span class="n"&gt;debug&lt;/span&gt;.&lt;span class="n"&gt;log&lt;/span&gt;*
&lt;span class="n"&gt;yarn&lt;/span&gt;-&lt;span class="n"&gt;debug&lt;/span&gt;.&lt;span class="n"&gt;log&lt;/span&gt;*
&lt;span class="n"&gt;yarn&lt;/span&gt;-&lt;span class="n"&gt;error&lt;/span&gt;.&lt;span class="n"&gt;log&lt;/span&gt;*
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



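&lt;p&gt;The &lt;code&gt;node_modules&lt;/code&gt; entry is especially important for the frontend. Without it, your entire local &lt;code&gt;node_modules&lt;/code&gt; folder is part of the build context, and &lt;code&gt;COPY . .&lt;/code&gt; writes it over the packages that &lt;code&gt;npm ci&lt;/code&gt; just installed inside the container. Host-built packages, possibly compiled for a different OS or libc, end up mixed with the container's, which causes subtle and confusing bugs.&lt;/p&gt;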




&lt;h2&gt;
  
  
  Step 4 — Tighten the Docker Compose
&lt;/h2&gt;

&lt;p&gt;With the images sorted, there were a couple of things worth fixing in the compose file too.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Container logs grow indefinitely by default.&lt;/strong&gt; On a long-running server, uncapped JSON logs will quietly eat your disk. Adding a log driver config caps each service at 30MB total (3 files × 10MB).&lt;/p&gt;

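&lt;p&gt;&lt;strong&gt;The Postgres volume mount had a bug.&lt;/strong&gt; The original used &lt;code&gt;/var/lib/postgresql&lt;/code&gt; as the mount target. The official Postgres image has traditionally stored data in &lt;code&gt;/var/lib/postgresql/data&lt;/code&gt;, and mounting the parent can cause initialization to fail if that directory already has files in it. The data directory layout can differ between major versions of the image, so check the image documentation for the recommended mount target of the tag you run.&lt;br&gt;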
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pasteCTL_web&lt;/span&gt;

&lt;span class="na"&gt;x-logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nl"&gt;&amp;amp;default-logging&lt;/span&gt;
  &lt;span class="na"&gt;driver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json-file"&lt;/span&gt;
  &lt;span class="na"&gt;options&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;max-size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;10m"&lt;/span&gt;
    &lt;span class="na"&gt;max-file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3"&lt;/span&gt;

&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./backend&lt;/span&gt;
      &lt;span class="na"&gt;dockerfile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfile&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pastectl_backend&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;FRONTEND_URL=http://localhost:3000&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;DATABASE_URL=postgres://user:password@db:5432/pastectl&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;condition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;service_healthy&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wget&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-qO-&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;http://localhost:8080/api/health&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;||&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;exit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
    &lt;span class="na"&gt;logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;*default-logging&lt;/span&gt;

  &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres:18-alpine&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pastectl_db&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_USER&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;password&lt;/span&gt;
      &lt;span class="na"&gt;POSTGRES_DB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pastectl&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5432:5432"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;pgdata:/var/lib/postgresql/data&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD-SHELL"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pg_isready&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-U&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-d&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pastectl"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;5s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
    &lt;span class="na"&gt;logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;*default-logging&lt;/span&gt;

  &lt;span class="na"&gt;frontend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./frontend&lt;/span&gt;
      &lt;span class="na"&gt;dockerfile&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Dockerfile&lt;/span&gt;
      &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;NEXT_PUBLIC_BACKEND_URL=http://localhost:8080&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;NEXT_PUBLIC_WS_URL=ws://localhost:8080&lt;/span&gt;
    &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pastectl_frontend&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3000:3000"&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;condition&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;service_healthy&lt;/span&gt;
    &lt;span class="na"&gt;restart&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;unless-stopped&lt;/span&gt;
    &lt;span class="na"&gt;logging&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;*default-logging&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;pgdata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The Final Numbers
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwee7aoq11hki0mf338za.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwee7aoq11hki0mf338za.png" alt="Final docker image size on lazydocker"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;3.58GB down to ~205MB, a reduction of roughly 94% overall (about 99% for the Go backend and about 90% for the frontend). Both services, same functionality, no changes to application code.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Could Still Improve
&lt;/h2&gt;

&lt;p&gt;These Dockerfiles are solid for production, but there are a few things worth exploring if you want to go further:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Distroless runner for the frontend.&lt;/strong&gt; Google's &lt;code&gt;gcr.io/distroless/nodejs22-debian12&lt;/code&gt; image is more locked-down than Alpine — no shell, no package manager, no utilities. Harder to debug but a smaller attack surface.&lt;/p&gt;
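
&lt;p&gt;As a rough sketch of what the distroless option could look like, the runner stage below swaps the Alpine base for the image named above. I haven't run this against pasteCTL: the &lt;code&gt;:nonroot&lt;/code&gt; tag and the UID/GID &lt;code&gt;65532&lt;/code&gt; are assumptions based on distroless conventions, so check them against the image documentation. There is no shell in the image, so the stage cannot contain any &lt;code&gt;RUN&lt;/code&gt; steps.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;# Sketch only: distroless runner for the Next.js standalone output.
# Assumes the same builder stage as the Dockerfile above.
FROM gcr.io/distroless/nodejs22-debian12:nonroot AS runner
WORKDIR /app

ENV NODE_ENV=production \
    NEXT_TELEMETRY_DISABLED=1 \
    PORT=3000 \
    HOSTNAME=0.0.0.0

# 65532 is assumed to be the UID/GID of the image's nonroot user.
COPY --from=builder --chown=65532:65532 /frontend/.next/standalone ./
COPY --from=builder --chown=65532:65532 /frontend/.next/static ./.next/static
COPY --from=builder --chown=65532:65532 /frontend/public ./public

EXPOSE 3000

# The distroless Node image already uses node as its entrypoint,
# so CMD only names the script to run.
CMD ["server.js"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Since this base is Debian with glibc, the &lt;code&gt;libc6-compat&lt;/code&gt; shim should not be needed in the runner, though native modules built on Alpine in the builder may then need a second look.&lt;/p&gt;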

&lt;p&gt;&lt;strong&gt;&lt;code&gt;docker scout&lt;/code&gt; or Trivy for CVE scanning.&lt;/strong&gt; Smaller images have fewer vulnerabilities, but Alpine and even scratch aren't immune. Running a scanner in CI catches issues before they reach production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BuildKit cache mounts for Go modules.&lt;/strong&gt; Alongside or instead of the &lt;code&gt;COPY go.mod&lt;/code&gt; trick, BuildKit's &lt;code&gt;RUN --mount=type=cache&lt;/code&gt; keeps the module cache between builds on the same machine. Even when the dependency layer is invalidated, only modules that are missing from the cache get downloaded, which makes repeated local builds significantly faster.&lt;/p&gt;
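
&lt;p&gt;A minimal sketch of the cache-mount idea for the builder stage, assuming BuildKit is enabled and the default paths of the official &lt;code&gt;golang&lt;/code&gt; image (&lt;code&gt;/go/pkg/mod&lt;/code&gt; for modules, &lt;code&gt;/root/.cache/go-build&lt;/code&gt; for the compiler's build cache). It is an illustration, not the Dockerfile pasteCTL ships.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;# syntax=docker/dockerfile:1
FROM golang:1.26-alpine AS builder
WORKDIR /backend

COPY go.mod go.sum ./
# Keep downloaded modules on the build host between builds.
RUN --mount=type=cache,target=/go/pkg/mod \
    go mod download

COPY . .
# Reuse both the module cache and the Go build cache while compiling.
RUN --mount=type=cache,target=/go/pkg/mod \
    --mount=type=cache,target=/root/.cache/go-build \
    CGO_ENABLED=0 go build -trimpath -ldflags="-w -s" -o /app/main ./cmd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The cache lives on the machine that runs the build, so a fresh CI runner gets no benefit unless the cache is exported and restored separately.&lt;/p&gt;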

&lt;p&gt;&lt;strong&gt;Pin base image digests.&lt;/strong&gt; &lt;code&gt;golang:1.26-alpine&lt;/code&gt; is a mutable tag — it can change under you. For reproducible builds, pin to the SHA digest: &lt;code&gt;FROM golang:1.26-alpine@sha256:...&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build the images in CI and push to a registry.&lt;/strong&gt; Right now the images are built on the host. Moving the build to GitHub Actions and pushing to GHCR or Docker Hub means your production server only pulls, never builds.&lt;/p&gt;




&lt;p&gt;The full source for pasteCTL is on &lt;a href="https://github.com/Sumedhvats/pasteCTL_web" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; if you want to look at the actual Dockerfiles in context.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>go</category>
      <category>nextjs</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Production-Ready Rate Limiter in Go: From Side Project to Distributed System</title>
      <dc:creator>Sumedhvats</dc:creator>
      <pubDate>Mon, 03 Nov 2025 09:59:11 +0000</pubDate>
      <link>https://dev.to/sumedhvats/production-ready-rate-limiter-in-go-from-side-project-to-distributed-system-1h3c</link>
      <guid>https://dev.to/sumedhvats/production-ready-rate-limiter-in-go-from-side-project-to-distributed-system-1h3c</guid>
      <description>&lt;h2&gt;
  
  
  A deep dive into three algorithms, atomic Redis operations, and building a high-performance, flexible library from scratch.
&lt;/h2&gt;

&lt;p&gt;When you're building a new service, rate limiting is one of those things you &lt;em&gt;know&lt;/em&gt; you need, but you often start with something simple. Maybe it's a basic in-memory counter. But what happens when your service grows? When you move from a single server to a distributed system, that simple counter breaks down. You're stuck rewriting your rate limiting logic.&lt;/p&gt;

&lt;p&gt;Most Go rate limiters I found forced me into a single algorithm (usually token bucket) or locked me into a specific storage backend. This was the problem I set out to solve.&lt;/p&gt;

&lt;p&gt;I decided to build &lt;strong&gt;&lt;a href="https://github.com/sumedhvats/rate-limiter-go" rel="noopener noreferrer"&gt;rate-limiter-go&lt;/a&gt;&lt;/strong&gt;, a library that scales with you. It provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Multiple battle-tested algorithms&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable storage&lt;/strong&gt; (in-memory or Redis)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomic Redis operations&lt;/strong&gt; for concurrency-safe, production-ready limiting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this post, I'm going to walk you through the journey of building it: the algorithms I explored, the edge cases I found, and the final high-performance library.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 1: The Quest for the "Perfect" Algorithm
&lt;/h2&gt;

&lt;p&gt;Rate limiting seems simple, but there are many ways to do it, each with critical trade-offs.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Naive Start: Fixed Window
&lt;/h3&gt;

&lt;p&gt;This is the most intuitive approach. You divide time into fixed "windows" (e.g., one minute) and allow a certain number of requests in that window.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mental Model:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set a limit (e.g., 100 requests per minute).&lt;/li&gt;
&lt;li&gt;If the time is &lt;code&gt;12:24:02&lt;/code&gt;, the window is &lt;code&gt;12:24:00&lt;/code&gt; to &lt;code&gt;12:24:59&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;All requests in this period increment a single counter.&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;counter &amp;gt; 100&lt;/code&gt;, reject.&lt;/li&gt;
&lt;li&gt;At &lt;code&gt;12:25:00&lt;/code&gt;, the counter resets to 0.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;The Problem: Burst Errors&lt;/strong&gt;&lt;br&gt;
This algorithm has a major flaw. Imagine your limit is 100 requests/minute. A user could send 100 requests at &lt;code&gt;12:24:59&lt;/code&gt; (which are allowed) and then &lt;em&gt;another&lt;/em&gt; 100 requests at &lt;code&gt;12:25:00&lt;/code&gt; (which are also allowed, as it's a new window).&lt;/p&gt;

&lt;p&gt;This user just sent &lt;strong&gt;200 requests in two seconds&lt;/strong&gt;, effectively doubling your intended rate limit and bypassing your protection.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;When to Use It:&lt;/strong&gt; Simple, low-traffic, or single-node setups where absolute precision isn't critical.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  2. The "Smooth" Approach: Token Bucket
&lt;/h3&gt;

&lt;p&gt;This algorithm is a classic for a reason. It's designed to handle bursts gracefully while maintaining a steady average rate.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mental Model:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Each user gets a "bucket" with a maximum capacity (e.g., 100 tokens).&lt;/li&gt;
&lt;li&gt;The bucket is refilled at a constant rate (e.g., 10 tokens per second).&lt;/li&gt;
&lt;li&gt;Every request tries to consume one token.&lt;/li&gt;
&lt;li&gt;If a token is available, the request is allowed.&lt;/li&gt;
&lt;li&gt;If the bucket is empty, the request is rejected.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is much better. It allows a user to "save up" tokens to send a short burst (up to the bucket capacity), but they can't exceed the steady-state refill rate over the long term.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Implementation Edge Cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clock Skew:&lt;/strong&gt; In a distributed system, different servers will have different clocks, leading to inconsistent refill calculations. &lt;strong&gt;Solution:&lt;/strong&gt; Use Redis server time (&lt;code&gt;TIME&lt;/code&gt; command) as the single source of truth.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Float Precision:&lt;/strong&gt; Refill rates are often fractional (e.g., 1.66 tokens/sec), so repeated refills can accumulate floating-point error. &lt;strong&gt;Solution:&lt;/strong&gt; Round to a fixed precision before comparing, or store tokens in a scaled integer unit.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;When to Use It:&lt;/strong&gt; This is ideal for most public APIs. It provides smooth flow control and allows for legitimate, short-term bursts of traffic.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  3. The Balanced Approach: Sliding Window Counter
&lt;/h3&gt;

&lt;p&gt;This was the algorithm that struck the best balance for me. It solves the "burst error" of the Fixed Window but is simpler to implement and often more performant than a Token Bucket.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Mental Model:&lt;/strong&gt;&lt;br&gt;
This algorithm smooths out the rate by considering a &lt;em&gt;weighted average&lt;/em&gt; of the &lt;strong&gt;previous&lt;/strong&gt; window and the &lt;strong&gt;current&lt;/strong&gt; window.&lt;/p&gt;

&lt;p&gt;Imagine a 1-minute window (limit 100).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It's &lt;code&gt;12:25:15&lt;/code&gt; (so, we are 25% of the way through the current window).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Previous Window (&lt;code&gt;12:24&lt;/code&gt;):&lt;/strong&gt; Had 80 requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Current Window (&lt;code&gt;12:25&lt;/code&gt;):&lt;/strong&gt; Has 10 requests so far.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We don't just look at the &lt;code&gt;10&lt;/code&gt; requests. We calculate a weighted count:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Weight of Previous Window: 75% (the sliding window reaches back to &lt;code&gt;12:24:15&lt;/code&gt;, so it still overlaps the last 75% of the previous window)&lt;/li&gt;
&lt;li&gt;Weight of Current Window: 100% (every request counted so far in the current window falls inside the sliding window)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;Weighted Count = (80 requests * 75%) + 10 requests&lt;/code&gt;&lt;br&gt;
&lt;code&gt;Weighted Count = 60 + 10 = 70&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The user's current effective count is 70, so they can continue making requests. This approach gracefully "slides" the count from one window to the next. It is an approximation, because it assumes the previous window's requests were evenly spread, but it largely removes the boundary burst problem of the Fixed Window.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;When to Use It:&lt;/strong&gt; My recommendation for most general-purpose, distributed rate limiting. It provides excellent accuracy and performance without the complexity of token management.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Part 2: From Theory to Production Library
&lt;/h2&gt;

&lt;p&gt;Knowing the algorithms is one thing; implementing them in a production-ready way is another. Here were my core design goals for &lt;strong&gt;&lt;a href="https://github.com/sumedhvats/rate-limiter-go" rel="noopener noreferrer"&gt;rate-limiter-go&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Storage Backend Abstraction
&lt;/h3&gt;

&lt;p&gt;I wanted to start with in-memory storage for development and scale to Redis in production &lt;em&gt;without changing my application code&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I defined a simple &lt;code&gt;Storage&lt;/code&gt; interface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Storage&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="n"&gt;Delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="n"&gt;Increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, I can initialize my limiter with either backend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// Development: in-memory&lt;/span&gt;
&lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;storage&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewMemoryStorage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c"&gt;// Production: Redis (same interface)&lt;/span&gt;
&lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;storage&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRedisStorage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"redis-cluster:6379"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Same limiter code works with both&lt;/span&gt;
&lt;span class="n"&gt;rateLimiter&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewSlidingWindowLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Atomic Redis Operations
&lt;/h3&gt;

&lt;p&gt;In a concurrent system, you can't just &lt;code&gt;GET&lt;/code&gt; a value, check it, and then &lt;code&gt;SET&lt;/code&gt; it. This is a classic race condition.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# !! RACE CONDITION !!&lt;/span&gt;
&lt;span class="c"&gt;# Client 1 GETS count (99)&lt;/span&gt;
&lt;span class="c"&gt;# Client 2 GETS count (99)&lt;/span&gt;
&lt;span class="c"&gt;# Client 1 increments to 100, SETS 100. (Allowed)&lt;/span&gt;
&lt;span class="c"&gt;# Client 2 increments to 100, SETS 100. (Also Allowed)&lt;/span&gt;
&lt;span class="c"&gt;# !! We just allowed 101 requests !!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The solution is to perform all operations &lt;strong&gt;atomically&lt;/strong&gt;. I used &lt;strong&gt;Lua scripts&lt;/strong&gt;, which Redis guarantees will run without interruption.&lt;/p&gt;

&lt;p&gt;Here is the (simplified) Lua script for the Fixed Window algorithm. It gets, checks, increments, and sets the expiry all in one atomic step.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
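&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Fixed Window example (simplified)&lt;/span&gt;
&lt;span class="c1"&gt;-- Assumed bindings: KEYS[1] = counter key; ARGV = limit, increment, window length in seconds&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;increment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ARGV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="s1"&gt;'0'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;increment&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;  &lt;span class="c1"&gt;-- Denied&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'INCRBY'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'EXPIRE'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ttl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;-- Set the TTL only when the window starts, so later requests do not extend it&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;  &lt;span class="c1"&gt;-- Allowed&lt;/span&gt;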
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;No race conditions. No approximate counting. Just correctness.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3. High-Performance In-Memory Storage
&lt;/h3&gt;

&lt;p&gt;For the in-memory backend, the obvious choice is a &lt;code&gt;sync.Mutex&lt;/code&gt; wrapping a &lt;code&gt;map[string]int&lt;/code&gt;, but a single lock serializes every request. Go's documentation describes &lt;code&gt;sync.Map&lt;/code&gt; as optimized for two cases: "when the entry for a given key is only ever written once but read many times," and when multiple goroutines read, write, and overwrite entries for disjoint sets of keys.&lt;/p&gt;

&lt;p&gt;A rate limiter cache is the &lt;em&gt;opposite&lt;/em&gt; of the first case: keys are read and written on almost &lt;em&gt;every request&lt;/em&gt;. It is much closer to the second, since different clients mostly touch different keys.&lt;/p&gt;

&lt;p&gt;My implementation for in-memory storage therefore uses &lt;code&gt;sync.Map&lt;/code&gt; and leverages its &lt;code&gt;CompareAndSwap&lt;/code&gt; (CAS) operation to safely increment counters under high concurrency, which performs better than a single, global mutex blocking all goroutines.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part 3: Putting It All Together
&lt;/h2&gt;

&lt;p&gt;Here's what the final library looks like in practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  Quick Start: 5 Lines to Rate Limiting
&lt;/h3&gt;

&lt;p&gt;This is all it takes to add rate limiting to any function.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"time"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/sumedhvats/rate-limiter-go/pkg/limiter"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/sumedhvats/rate-limiter-go/pkg/storage"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// 1. Create in-memory storage&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;storage&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewMemoryStorage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// 2. Create limiter: 10 requests per minute&lt;/span&gt;
    &lt;span class="n"&gt;rateLimiter&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewSlidingWindowLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Rate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Window&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Minute&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c"&gt;// 3. Check if request is allowed&lt;/span&gt;
    &lt;span class="n"&gt;allowed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;rateLimiter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"user:alice"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// 4. Deny if the limit is exceeded&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;allowed&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Rate limit exceeded!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c"&gt;// 5. Allow&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Request allowed!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Most Common Use Case: HTTP Middleware
&lt;/h3&gt;

&lt;p&gt;Of course, the most common need is for an HTTP API. I built a middleware that handles everything automatically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"net/http"&lt;/span&gt;
    &lt;span class="s"&gt;"time"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/sumedhvats/rate-limiter-go/middleware"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/sumedhvats/rate-limiter-go/pkg/limiter"&lt;/span&gt;
    &lt;span class="s"&gt;"github.com/sumedhvats/rate-limiter-go/pkg/storage"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Use Redis for a distributed system&lt;/span&gt;
    &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;storage&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewRedisStorage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"localhost:6379"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// 100 requests per minute per IP&lt;/span&gt;
    &lt;span class="n"&gt;rateLimiter&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewSlidingWindowLimiter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limiter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Rate&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="m"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Window&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Minute&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="c"&gt;// Apply middleware&lt;/span&gt;
    &lt;span class="n"&gt;mux&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServeMux&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandleFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/api/data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dataHandler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c"&gt;// The middleware automatically uses IP address as the key&lt;/span&gt;
    &lt;span class="n"&gt;handler&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;middleware&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RateLimitMiddleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;middleware&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Limiter&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;rateLimiter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})(&lt;/span&gt;&lt;span class="n"&gt;mux&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ListenAndServe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;":8080"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;dataHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Data served successfully"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This middleware automatically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extracts the client IP (handling &lt;code&gt;X-Forwarded-For&lt;/code&gt; proxies).&lt;/li&gt;
&lt;li&gt;Returns a &lt;code&gt;429 Too Many Requests&lt;/code&gt; JSON error.&lt;/li&gt;
&lt;li&gt;Adds the conventional rate limit headers (&lt;code&gt;X-RateLimit-Limit&lt;/code&gt;, &lt;code&gt;X-RateLimit-Remaining&lt;/code&gt;, &lt;code&gt;X-RateLimit-Reset&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Part 4: The Proof: Does it Scale?
&lt;/h2&gt;

&lt;p&gt;I built this for performance, so I benchmarked it heavily. Here are the results on my 12th Gen Intel i5.&lt;/p&gt;

&lt;p&gt;This first test shows a realistic, concurrent load with many different keys (e.g., many different users hitting the API).&lt;/p&gt;

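&lt;p&gt;&lt;strong&gt;Multiple Keys (Realistic Load) - Concurrent&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Time/op&lt;/th&gt;
&lt;th&gt;Memory/op&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sliding Window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;68 ns/op&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;100 B/op&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Token Bucket&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;76 ns/op&lt;/td&gt;
&lt;td&gt;160 B/op&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fixed Window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;130 ns/op&lt;/td&gt;
&lt;td&gt;261 B/op&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;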

&lt;p&gt;This test shows how the system scales when hammering the cache with 10,000 unique keys.&lt;/p&gt;

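&lt;p&gt;&lt;strong&gt;Scalability (10K Keys) - Concurrent&lt;/strong&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;th&gt;Time/op&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Token Bucket&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;56 ns/op&lt;/td&gt;
&lt;td&gt;~17M ops/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sliding Window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;74 ns/op&lt;/td&gt;
&lt;td&gt;~13M ops/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fixed Window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;95 ns/op&lt;/td&gt;
&lt;td&gt;~10M ops/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;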

&lt;p&gt;&lt;strong&gt;Key Insights:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sliding Window&lt;/strong&gt; and &lt;strong&gt;Token Bucket&lt;/strong&gt; are the clear winners, both handling roughly &lt;strong&gt;13-17 million operations per second&lt;/strong&gt; in the 10K-key benchmark.&lt;/li&gt;
&lt;li&gt;They are lightweight, allocating 100-160 bytes per operation.&lt;/li&gt;
&lt;li&gt;Per-operation cost stays under 100 ns even with 10,000 unique keys.&lt;/li&gt;
&lt;/ul&gt;
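
&lt;p&gt;The throughput column is just the reciprocal of the time per operation. A quick sanity check, using a hypothetical &lt;code&gt;opsPerSecond&lt;/code&gt; helper and the 10K-key numbers from the table:&lt;/p&gt;

```go
package main

import "fmt"

// opsPerSecond converts a benchmark's ns/op figure into operations per second.
func opsPerSecond(nsPerOp float64) float64 {
	return 1e9 / nsPerOp
}

func main() {
	names := []string{"Token Bucket", "Sliding Window", "Fixed Window"}
	nsPerOp := []float64{56, 74, 95}
	for i, name := range names {
		fmt.Printf("%-15s %.1fM ops/sec\n", name, opsPerSecond(nsPerOp[i])/1e6)
	}
}
```

&lt;p&gt;This works out to about 17.9, 13.5 and 10.5 million operations per second, in line with the rounded figures in the table.&lt;/p&gt;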




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building this library was a fantastic journey through algorithm design, concurrency patterns in Go, and atomic database operations with Redis.&lt;/p&gt;

&lt;p&gt;I started with a simple goal: create a rate limiter that wouldn't need to be rewritten when a project scaled. The result is a library that lets you choose the right algorithm for the job, scales from a single in-memory instance to a distributed Redis cluster, and operates with atomic, concurrency-safe guarantees.&lt;/p&gt;

&lt;p&gt;If you want to check out the code, contribute, or use the library in your own project, you can find it on GitHub.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://pkg.go.dev/github.com/sumedhvats/rate-limiter-go" rel="noopener noreferrer"&gt;Go reference&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://goreportcard.com/report/github.com/sumedhvats/rate-limiter-go" rel="noopener noreferrer"&gt;Go Report Card&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/sumedhvats/rate-limiter-go" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thanks for reading!&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>algorithms</category>
      <category>opensource</category>
      <category>go</category>
    </item>
  </channel>
</rss>
