<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: suresh peddinti</title>
    <description>The latest articles on DEV Community by suresh peddinti (@suresh_peddinti_5272d04b0).</description>
    <link>https://dev.to/suresh_peddinti_5272d04b0</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3826088%2Fcc6c88ed-44de-41d6-9465-7dfcd909312f.png</url>
      <title>DEV Community: suresh peddinti</title>
      <link>https://dev.to/suresh_peddinti_5272d04b0</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/suresh_peddinti_5272d04b0"/>
    <language>en</language>
    <item>
      <title>RLAAS: Rate Limiting as a Service Across Modern Systems</title>
      <dc:creator>suresh peddinti</dc:creator>
      <pubDate>Mon, 16 Mar 2026 01:16:43 +0000</pubDate>
      <link>https://dev.to/suresh_peddinti_5272d04b0/rlaas-rate-limiting-as-a-service-rate-limiting-across-modern-systems-32kd</link>
      <guid>https://dev.to/suresh_peddinti_5272d04b0/rlaas-rate-limiting-as-a-service-rate-limiting-across-modern-systems-32kd</guid>
      <description>&lt;p&gt;RLAAS (Rate Limiting As A Service) is an open-source, policy-driven platform for controlling HTTP traffic, logs, spans, metrics, and events with one reusable engine.&lt;/p&gt;

&lt;h2&gt;The Problem Nobody Talks About&lt;/h2&gt;

&lt;p&gt;Every engineering team knows they need rate limiting. But most solutions only protect one layer — the API gateway.&lt;/p&gt;

&lt;p&gt;What happens to everything else?&lt;/p&gt;

&lt;p&gt;Here are the real pain points I kept running into:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Log floods&lt;/strong&gt; — A bug sends millions of error logs to your observability stack. Costs spike. Dashboards break. On-call engineers drown in noise.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Metric storms&lt;/strong&gt; — A chatty service emits 50x normal Datadog metrics during a deployment. Your bill triples overnight.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Kafka cascades&lt;/strong&gt; — A slow consumer falls behind. Retries pile up. One service takes down the entire event pipeline.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Sidecar blindspots&lt;/strong&gt; — Traffic between services inside a mesh never hits your gateway. There’s nothing enforcing limits there.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Copy-paste rate limiting&lt;/strong&gt; — Every team reimplements throttling logic in their own service, with their own bugs, their own edge cases, and no shared policy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The root cause is the same in every case: &lt;strong&gt;rate limiting is treated as a gateway feature, not a platform capability&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That’s what I set out to fix.&lt;/p&gt;
&lt;h2&gt;Introducing RLAAS&lt;/h2&gt;

&lt;p&gt;RLAAS (Rate Limiting As A Service) is an open-source, policy-driven platform written in Go. It applies consistent rate limiting decisions across multiple domains — HTTP, gRPC, telemetry, events, and sidecars — using one unified engine.&lt;/p&gt;

&lt;p&gt;Instead of scattered, per-service throttling code, you define policies once and enforce them everywhere.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr9yqdljmo2zgj71ne1c3.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr9yqdljmo2zgj71ne1c3.gif" alt=" " width="720" height="1152"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The core idea is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;One policy engine. Multiple providers. Multiple deployment models.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;Algorithms It Supports Today&lt;/h2&gt;

&lt;p&gt;RLAAS doesn’t lock you into a single algorithm. Each policy independently chooses the algorithm that fits its traffic pattern:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Token Bucket&lt;/strong&gt; — A bucket refills tokens at a fixed rate. Requests consume tokens. When empty, requests are throttled. Great for bursty traffic you want to smooth out without hard-blocking. Example: allow up to 100 API calls per minute with short bursts permitted.&lt;/p&gt;
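
&lt;p&gt;For intuition, here is a minimal single-process Python sketch of the token bucket idea (an illustration of the algorithm only, not RLAAS's actual implementation):&lt;/p&gt;

```python
import time

class TokenBucket:
    """Refills `rate` tokens per second up to `capacity`; each request spends one token."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start full so a burst is allowed immediately
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Roughly 100 calls/minute on average, with bursts of up to 10 permitted.
bucket = TokenBucket(rate=100 / 60, capacity=10)
grants = [bucket.allow() for _ in range(12)]
print(grants.count(True))   # 10: the burst passes, the overflow is throttled
```

&lt;p&gt;The burst size is simply the bucket capacity, which is what makes this algorithm forgiving of short spikes.&lt;/p&gt;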

&lt;p&gt;&lt;strong&gt;Sliding Window&lt;/strong&gt; — Tracks requests across a continuously rolling time window. Eliminates the “boundary spike” problem where clients fire double the limit by straddling two fixed-window edges. Best for accurate per-user and per-tenant quota enforcement.&lt;/p&gt;
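
&lt;p&gt;A rough Python sketch of the idea (a single-process illustration, not the distributed implementation) keeps a deque of recent timestamps:&lt;/p&gt;

```python
import collections

class SlidingWindowLimiter:
    """Allows at most `limit` events inside any rolling `window`-second span."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.events = collections.deque()   # timestamps of accepted events

    def allow(self, now: float) -> bool:
        # Evict timestamps that have slid out of the window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()
        if len(self.events) < self.limit:
            self.events.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(limit=3, window=60)
print([limiter.allow(t) for t in (0, 10, 20, 30)])   # [True, True, True, False]
print(limiter.allow(61))   # True: the event from t=0 has slid out of the window
```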

&lt;p&gt;&lt;strong&gt;Fixed Window&lt;/strong&gt; — Counts requests in a hard time slot (e.g., 0–60 s). Simple, cheap, and predictable. Best when coarse-grained limits matter more than precision.&lt;/p&gt;
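
&lt;p&gt;Sketched in Python (illustrative only), a fixed window is just a counter keyed by the current time slot; the last lines demonstrate the boundary spike that sliding windows avoid:&lt;/p&gt;

```python
class FixedWindowLimiter:
    """Counts requests per hard time slot, e.g. one counter per 60-second bucket."""

    def __init__(self, limit: int, window: int):
        self.limit = limit
        self.window = window
        self.slot = None
        self.count = 0

    def allow(self, now: float) -> bool:
        slot = int(now // self.window)
        if slot != self.slot:       # entered a new window: reset the counter
            self.slot, self.count = slot, 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False

limiter = FixedWindowLimiter(limit=2, window=60)
print([limiter.allow(t) for t in (1, 2, 3, 61)])    # [True, True, False, True]

# Boundary spike: 2 requests at the end of one slot plus 2 at the start of the
# next means 4 requests within a few seconds, double the nominal limit.
spike = FixedWindowLimiter(limit=2, window=60)
print([spike.allow(t) for t in (58, 59, 60, 61)])   # [True, True, True, True]
```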

&lt;p&gt;&lt;strong&gt;Leaky Bucket&lt;/strong&gt; — Enforces a strict, steady output rate regardless of how bursty the input is. Useful for protecting downstream services that can’t handle spikes even if the total volume is within limits.&lt;/p&gt;
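
&lt;p&gt;A minimal Python sketch (again just the algorithm, not the production code): track queued work as a level that drains at a fixed rate, and shed anything that would overflow:&lt;/p&gt;

```python
class LeakyBucket:
    """Work leaks out at `rate` per second; requests beyond `capacity` are rejected."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # fixed drain rate
        self.capacity = capacity    # maximum queued work
        self.level = 0.0
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Drain for the elapsed time, then try to add this request's unit of work.
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level + 1.0 <= self.capacity:
            self.level += 1.0
            return True
        return False

bucket = LeakyBucket(rate=1.0, capacity=3)
print([bucket.allow(0) for _ in range(5)])   # [True, True, True, False, False]
print(bucket.allow(2))   # True: two units leaked out during the two seconds
```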

&lt;p&gt;&lt;strong&gt;Concurrent Request Limiter&lt;/strong&gt; — Caps the number of in-flight requests at any moment. Essential for protecting slow upstream dependencies from being overwhelmed by parallel callers.&lt;/p&gt;
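
&lt;p&gt;A concurrency cap is naturally a semaphore. Here is a small Python sketch of the load-shedding variant (reject rather than queue when the cap is hit):&lt;/p&gt;

```python
import threading

class ConcurrencyLimiter:
    """Caps in-flight requests; acquire before calling upstream, release when done."""

    def __init__(self, max_in_flight: int):
        self.sem = threading.BoundedSemaphore(max_in_flight)

    def try_acquire(self) -> bool:
        # Non-blocking: a caller over the cap is shed immediately, not queued.
        return self.sem.acquire(blocking=False)

    def release(self) -> None:
        self.sem.release()

limiter = ConcurrencyLimiter(max_in_flight=2)
print([limiter.try_acquire() for _ in range(3)])   # [True, True, False]
limiter.release()                                  # one request finished
print(limiter.try_acquire())                       # True: a slot freed up
```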

&lt;p&gt;A single RLAAS deployment can run all algorithms simultaneously across different policies and resources.&lt;/p&gt;
&lt;h2&gt;What RLAAS Integrates With&lt;/h2&gt;

&lt;p&gt;One policy engine. Many integration points:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18m63q6e6kyzpbl60cp5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F18m63q6e6kyzpbl60cp5.png" alt=" " width="800" height="486"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;Decisions — More Than Just Allow or Deny&lt;/h2&gt;

&lt;p&gt;Most rate limiters return two answers: pass or reject. RLAAS returns five:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhw8ftef9cu3y83h1pegw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhw8ftef9cu3y83h1pegw.png" alt=" " width="800" height="356"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Shadow mode is especially powerful during rollouts. You can observe exactly what would have been throttled before flipping enforcement on — no surprises, no incidents.&lt;/p&gt;

&lt;p&gt;Each policy declares its own action, so one policy can DENY abusive callers while another DROPs noisy telemetry and a third runs in SHADOW mode while the team validates the thresholds.&lt;/p&gt;
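
&lt;p&gt;In pseudocode terms (the decision names below are illustrative, not RLAAS's exact wire format), shadow mode means a would-be rejection is logged but never enforced:&lt;/p&gt;

```python
import logging

logging.basicConfig(level=logging.INFO)

def apply_decision(decision: str, shadow: bool) -> bool:
    """Return True if the request should proceed.

    In shadow mode every request proceeds; violations are only logged,
    which is what makes dry-run rollouts safe.
    """
    if decision == "allow":
        return True
    if shadow:
        logging.info("shadow: would have applied %s", decision)
        return True
    return False

print(apply_decision("deny", shadow=True))    # True: observed, not enforced
print(apply_decision("deny", shadow=False))   # False: enforced
```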
&lt;h2&gt;Three Ways to Deploy It&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Embedded SDK&lt;/strong&gt; — Import the library directly into your service. Zero network hop. Full control. Works in Go, Python, Java, and TypeScript.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Centralized Service&lt;/strong&gt; — Deploy &lt;code&gt;rlaas-server&lt;/code&gt; as a shared microservice. All your services call it over gRPC or HTTP to get allow/deny decisions. One place to manage all policies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sidecar / Agent&lt;/strong&gt; — Run &lt;code&gt;rlaas-agent&lt;/code&gt; as a sidecar next to your workload. No code changes needed. Intercepts traffic at the infrastructure level. Works with Kubernetes, service meshes, and bare metal alike.&lt;/p&gt;
&lt;h2&gt;How Policies Work&lt;/h2&gt;

&lt;p&gt;Policies are declarative and version-controlled. You define who the policy applies to, which algorithm to use, the limit, the window, and what to do when the limit is hit:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"nw-payments-logs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"org_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"northwind"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"resource"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"payments.logs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"algorithm"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sliding_window"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"limit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"window_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"drop"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No code changes. No redeploys. Policy updates take effect immediately.&lt;/p&gt;

&lt;h2&gt;Why Open Source?&lt;/h2&gt;

&lt;p&gt;Rate limiting logic is not your competitive advantage. It’s infrastructure — the same way a load balancer or a message queue is infrastructure. It should be shared, composable, and policy-driven rather than handcrafted inside each microservice.&lt;/p&gt;

&lt;p&gt;RLAAS is MIT-licensed. Every algorithm, every adapter, every SDK is open source and built to be extended.&lt;/p&gt;

&lt;h2&gt;Try It&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Docs&lt;/strong&gt;: &lt;a href="https://rlaas-io.github.io/rlaas/" rel="noopener noreferrer"&gt;https://rlaas-io.github.io/rlaas/&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/rlaas-io/rlaas" rel="noopener noreferrer"&gt;https://github.com/rlaas-io/rlaas&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If this solves a problem you’re dealing with, open an issue, contribute an adapter, or share it with your team.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>monitoring</category>
      <category>opensource</category>
      <category>sre</category>
    </item>
  </channel>
</rss>
