<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Neural Download</title>
    <description>The latest articles on DEV Community by Neural Download (@neuraldownload).</description>
    <link>https://dev.to/neuraldownload</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3813456%2F871bb0b9-3efa-4457-9255-80ec5f421887.png</url>
      <title>DEV Community: Neural Download</title>
      <link>https://dev.to/neuraldownload</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/neuraldownload"/>
    <language>en</language>
    <item>
      <title>Rust</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Wed, 06 May 2026 20:07:48 +0000</pubDate>
      <link>https://dev.to/neuraldownload/rust-2i32</link>
      <guid>https://dev.to/neuraldownload/rust-2i32</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=Sy5YlVEW3N4" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=Sy5YlVEW3N4&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Rust is in places you probably don't think about. AWS Firecracker, the microVM your Lambda runs on, has been written in it since 2017. Cloudflare's edge proxy Pingora handles a trillion requests a day in Rust. Discord rewrote their hottest read path. The new Python tooling — uv and ruff from Charlie Marsh's Astral — is Rust all the way down. The Linux kernel started accepting Rust code in 6.1.&lt;/p&gt;

&lt;p&gt;Five extremely serious infrastructure teams reached for the same language at roughly the same time. That is not a meme. So what is its actual deal?&lt;/p&gt;

&lt;h2&gt;The bug class it kills&lt;/h2&gt;

&lt;p&gt;When people say Rust is safe, they don't mean it kills all bugs. They mean specific classes. Five of them, mostly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use after free&lt;/strong&gt; — return a pointer to memory that's been freed; next caller reads garbage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Double free&lt;/strong&gt; — release the same allocation twice; heap bookkeeping corrupts under load.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data race&lt;/strong&gt; — two threads touch shared memory without coordinating; you read values nobody wrote.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Null pointer dereference&lt;/strong&gt; — the thing that crashes your service at 3 AM.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Buffer overflow&lt;/strong&gt; — the bug class that owned the security industry from the nineties to roughly now.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Microsoft's Security Response Center analyzed the CVEs they assigned from 2006 through 2018. Around 70% were memory-safety bugs. Chromium ran the same analysis on a completely different codebase. Same number. Two of the largest C/C++ codebases on Earth, two independent counts, one answer.&lt;/p&gt;

&lt;p&gt;Rust eliminates that whole class by construction. Not by tooling, not by fuzzers, not by linters — at compile time, before the binary exists.&lt;/p&gt;

&lt;h2&gt;The three rules nobody draws&lt;/h2&gt;

&lt;p&gt;The borrow checker isn't arbitrary. It enforces three concrete rules:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Each value has exactly one owner.&lt;/strong&gt; When the owner goes out of scope, the value is dropped. The compiler inserts the deallocation statically — there's no garbage collector running at runtime to figure it out.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You can have many readers, or one writer. Never both.&lt;/strong&gt; Niko Matsakis, who designed the borrow checker, calls this &lt;em&gt;mutation XOR sharing&lt;/em&gt;. Three words, the entire model. (See the sketch just after this list.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;A reference can't outlive what it points at.&lt;/strong&gt; The compiler tracks lifetimes and refuses to compile a reference that would survive its target.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
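
&lt;p&gt;Rule two is the one you'll trip over first. A minimal sketch, which any recent rustc rejects:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;fn main() {
    let mut scores = vec![1, 2, 3];

    let first = &amp;amp;scores[0]; // shared borrow: a reader exists
    scores.push(4);           // mutable borrow: a writer, while the reader is live
    println!("{first}");      // reader used here, so the borrows overlap: error[E0502]
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;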

&lt;p&gt;Most languages let ownership stay implicit. Rust forces you to encode it in the type system. The friction in your first month is that encoding work. The payoff is bugs that never make it to runtime.&lt;/p&gt;

&lt;p&gt;There's an opt-out. Write &lt;code&gt;unsafe&lt;/code&gt; and the compiler stops checking. But you have to ask for it, and inside that block you own the safety contract.&lt;/p&gt;
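
&lt;p&gt;What the opt-out looks like, as a minimal sketch:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;fn main() {
    let x: u32 = 42;
    let p = &amp;amp;x as *const u32; // creating a raw pointer is safe
    let y = unsafe { *p };       // dereferencing it is not: you vouch for this line
    println!("{y}");
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;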

&lt;h2&gt;Speed is a side effect of safety&lt;/h2&gt;

&lt;p&gt;Most explainers tell you Rust is fast &lt;em&gt;and&lt;/em&gt; safe, like those are two parallel features the language happened to ship together. Wrong frame.&lt;/p&gt;

&lt;p&gt;Watch the actual chain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rule one says exactly one owner, so the compiler always knows when each value dies.&lt;/li&gt;
&lt;li&gt;So the compiler can statically insert the deallocation, like a destructor (see the sketch after this list).&lt;/li&gt;
&lt;li&gt;So there's no traced collector at runtime, no pause-the-world step, no GC tax.&lt;/li&gt;
&lt;li&gt;So the program a Rust compiler emits has the same memory layout a C compiler would emit. Same speed, with proven safety.&lt;/li&gt;
&lt;/ul&gt;
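
&lt;p&gt;You can watch the second step happen. A sketch with an explicit &lt;code&gt;Drop&lt;/code&gt; impl, so the statically inserted deallocation point becomes visible:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;struct Buffer(Vec&amp;lt;u8&amp;gt;);

impl Drop for Buffer {
    fn drop(&amp;amp;mut self) {
        println!("buffer freed"); // runs wherever the compiler inserted the drop
    }
}

fn main() {
    {
        let b = Buffer(vec![0; 1024]);
        println!("buffer holds {} bytes", b.0.len());
    } // owner leaves scope: "buffer freed" prints here, no GC involved
    println!("after the block");
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;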

&lt;p&gt;Speed isn't a parallel feature. &lt;strong&gt;Speed is a side effect of safety.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The receipt: Discord's 2020 post on rewriting their Read States service from Go to Rust. Their Go version was getting latency spikes "roughly every 2 minutes" — Go's garbage collector forcing collection cycles even when nothing needed collecting. Rust version: zero spikes. Average response time dropped to microseconds. They didn't pick Rust because it was cool; they picked it because the GC tax was a structural cost they couldn't optimize away in Go.&lt;/p&gt;

&lt;h2&gt;Where the hype is real, where it's overblown&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Real:&lt;/strong&gt; kernels, browsers, embedded firmware, edge proxies, hypervisors that run other people's code, anywhere a 2 ms p99 spike costs real money. Anywhere a memory bug is a security incident, not just a crash. The math always checks out for blast-radius-critical hot paths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Overblown:&lt;/strong&gt; most web app backends. If your bottleneck is the database, the borrow checker tax doesn't pay you back. Go and Java and TypeScript will keep working fine for that shape of work. Discord rewrote one hot-path microservice — not their entire web app. The post got read as "rewrite everything in Rust." That isn't what they did.&lt;/p&gt;

&lt;p&gt;Rust isn't perfect either. Async Rust still has rough edges, the GUI story isn't settled, and learning the borrow checker takes weeks. How long Rust stays in this position is a real question.&lt;/p&gt;

&lt;h2&gt;The heuristic&lt;/h2&gt;

&lt;p&gt;Next time it comes up in your team, ask: &lt;em&gt;is this a blast-radius-critical hot path, or are we chasing a meme?&lt;/em&gt; If it's a hot path where memory bugs are security incidents and a 2 ms pause costs money — the math probably checks out. If it's a CRUD endpoint where the latency budget is 200 ms — almost certainly not.&lt;/p&gt;

&lt;p&gt;Adoption is real; you can name five places. The bug class it kills is the 70% class. The mechanism is three rules. And the speed comes free, because compile-time enforcement is what removes the GC entirely.&lt;/p&gt;

&lt;p&gt;That's its deal.&lt;/p&gt;

</description>
      <category>rustrustprogramminglanguagebor</category>
    </item>
    <item>
      <title>Agent Skills Explained (Simply &amp; Visually)</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Mon, 04 May 2026 18:13:20 +0000</pubDate>
      <link>https://dev.to/neuraldownload/agent-skills-explained-simply-visually-1bo7</link>
      <guid>https://dev.to/neuraldownload/agent-skills-explained-simply-visually-1bo7</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=3yL8WbcwEXI" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=3yL8WbcwEXI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You wrote a skill. Then it didn't fire.&lt;/p&gt;

&lt;p&gt;The prompt was right there. The folder was sitting in the right place. The agent just ignored it. And tweaking the wording didn't help.&lt;/p&gt;

&lt;p&gt;This is one of the most common bugs with Anthropic's Agent Skills standard, and it almost always traces back to one wrong assumption: most devs treat the description field like a label. It's not. It's the router.&lt;/p&gt;

&lt;h2&gt;What an Agent Skill actually is&lt;/h2&gt;

&lt;p&gt;A skill is a folder. Inside that folder is one required file called &lt;code&gt;SKILL.md&lt;/code&gt;. Everything else — extra docs, scripts, templates — is optional and lives in subfolders.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SKILL.md&lt;/code&gt; has YAML frontmatter at the top with two required fields: &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt;. Below the frontmatter is plain markdown — the instructions the agent follows once the skill is active.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;code-review&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Review&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;pull&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;requests&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;—&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;check&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;naming,&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;missing&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;tests,"&lt;/span&gt;
  &lt;span class="s"&gt;and security issues. Use this when the user pastes a diff or&lt;/span&gt;
  &lt;span class="s"&gt;says "review this PR."&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Code Review Playbook&lt;/span&gt;
&lt;span class="p"&gt;
1.&lt;/span&gt; Read the diff.
&lt;span class="p"&gt;2.&lt;/span&gt; Check naming rules.
&lt;span class="p"&gt;3.&lt;/span&gt; Flag missing tests.
&lt;span class="p"&gt;4.&lt;/span&gt; If anything touches auth, see references/security.md.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. A name, a description, a body. Anthropic's own example — the PDF skill — has the same shape.&lt;/p&gt;

&lt;h2&gt;The description is a router, not a tagline&lt;/h2&gt;

&lt;p&gt;When the agent starts up, it reads the &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; of every skill in your library. Just those two fields. About a hundred tokens per skill. That's the only thing the agent knows about a skill until something triggers it.&lt;/p&gt;

&lt;p&gt;So when you type a prompt, the agent scans every description and decides: does this match? If your description for &lt;code&gt;code-review&lt;/code&gt; says "helps with code," the agent has no idea when to fire it. Code is everywhere. It might fire on a typo question. It might miss an actual review request.&lt;/p&gt;

&lt;p&gt;The fix is to write the description like a function signature for the agent's decision logic. Be specific about &lt;em&gt;when&lt;/em&gt;, not just &lt;em&gt;what&lt;/em&gt;. List the user phrasings that should trigger it. Anthropic ships an entire skill called &lt;code&gt;skill-creator&lt;/code&gt; with a trigger evaluation loop just for tuning these.&lt;/p&gt;
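
&lt;p&gt;Side by side, the difference is stark. The vague version is the default failure mode:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# A label. The agent can't route on it.
description: Helps with code.

# A router. WHAT it does, plus WHEN it should fire.
description: Review pull requests — check naming, missing tests,
  and security issues. Use this when the user pastes a diff or
  says "review this PR."
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;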

&lt;h2&gt;Progressive disclosure: three loading layers&lt;/h2&gt;

&lt;p&gt;Once the agent decides to fire a skill, the second mechanism takes over. It's how the spec stays cheap as your library grows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1 — Metadata.&lt;/strong&gt; Name and description for every skill, always loaded, ~100 tokens each. Twenty skills cost you ~2,000 tokens of overhead. That's the price of having them available.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2 — Body.&lt;/strong&gt; When the agent decides a skill matches, it reads the full &lt;code&gt;SKILL.md&lt;/code&gt; into context. The spec recommends keeping this under 5,000 tokens, and Anthropic's best-practice guidance says aim for under 500 lines. That's the activation cost for one skill, paid every time it fires.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 3 — References.&lt;/strong&gt; Inside the body, you point to other files in &lt;code&gt;references/&lt;/code&gt;, &lt;code&gt;scripts/&lt;/code&gt;, or &lt;code&gt;assets/&lt;/code&gt;. Those don't load until the body tells the agent to read them. Our &lt;code&gt;code-review&lt;/code&gt; skill keeps a security checklist in &lt;code&gt;references/security.md&lt;/code&gt; — that file only enters the conversation when the diff actually touches auth code.&lt;/p&gt;
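
&lt;p&gt;On disk, the three layers map onto the folder like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;code-review/
├── SKILL.md             # Layer 1 + Layer 2: frontmatter, then the body
└── references/
    └── security.md      # Layer 3: loaded only when the body points here
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;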

&lt;p&gt;The 500-line guideline isn't arbitrary. It nudges you to push detail outward. The body becomes a map; the references become the territory.&lt;/p&gt;

&lt;h2&gt;Skills are not MCP&lt;/h2&gt;

&lt;p&gt;If you've heard of MCP, you might be wondering where skills fit. They get conflated constantly. They shouldn't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP exposes capabilities. Skills teach when and how to use them.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;MCP is a protocol — it standardizes how an agent calls a tool. Your GitHub MCP server exposes "create issue," "list pulls," "post comment." The agent invokes those across a wire.&lt;/p&gt;

&lt;p&gt;A skill doesn't expose a callable function. It teaches procedure — the order of steps, the team's conventions, what to do first.&lt;/p&gt;

&lt;p&gt;The two compose. A &lt;code&gt;code-review&lt;/code&gt; skill body might say &lt;em&gt;"read the diff, check naming, then call the GitHub MCP tool to post the comment."&lt;/em&gt; The skill is the playbook; MCP is the hands. Anthropic describes a skill as &lt;em&gt;"an onboarding guide for a new hire."&lt;/em&gt; MCP is what gets the new hire badged into the building.&lt;/p&gt;

&lt;h2&gt;What to do next&lt;/h2&gt;

&lt;p&gt;If your skill isn't firing, four things in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Description is the router.&lt;/strong&gt; Vague description, wrong routing. Add WHEN-clauses with the user phrasings that should trigger it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Body is the playbook.&lt;/strong&gt; Loaded only when the description matches. Keep it under ~500 lines.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;References are the territory.&lt;/strong&gt; Loaded only when the body says to load them. Push deep detail outward.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills compose with MCP.&lt;/strong&gt; They don't replace it. Skill bodies call MCP tools.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When you write your next skill, draft the description first. Before the body. Before a single instruction. Then test it — paste a prompt that should fire it, paste one that shouldn't, and watch which one the agent picks up. If it fires the wrong way, the description needs more WHEN, not more WHAT.&lt;/p&gt;

&lt;p&gt;That's the spec. Description, body, references — composed with MCP. Go ship one that actually fires.&lt;/p&gt;

</description>
      <category>agentskillsanthropicagentskill</category>
    </item>
    <item>
      <title>Mcp</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Mon, 04 May 2026 17:53:36 +0000</pubDate>
      <link>https://dev.to/neuraldownload/mcp-31cf</link>
      <guid>https://dev.to/neuraldownload/mcp-31cf</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=OoPtezIMQ9Q" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=OoPtezIMQ9Q&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You've heard the term. MCP server. Maybe in a Cursor changelog. Maybe in a Slack thread. Maybe your team set one up and you nodded along.&lt;/p&gt;

&lt;p&gt;Most developers hear those three letters and think the same thing. &lt;em&gt;Another API spec.&lt;/em&gt; Another acronym. Another integration to wire up later.&lt;/p&gt;

&lt;p&gt;That's the misread. MCP isn't an API. It's the thing that stops you from writing forty different APIs.&lt;/p&gt;

&lt;h2&gt;The N×M problem&lt;/h2&gt;

&lt;p&gt;Picture the world before MCP.&lt;/p&gt;

&lt;p&gt;You have an LLM client — Claude Desktop, Cursor, your own agent. You want it to talk to GitHub, your filesystem, Postgres, Slack, Google Drive, Linear, Stripe. Eight tools.&lt;/p&gt;

&lt;p&gt;That's eight integrations. Annoying, but fine.&lt;/p&gt;

&lt;p&gt;Now another LLM client appears. ChatGPT adds tool calling. Cursor wants the same connections. Continue. Cline. Five clients.&lt;/p&gt;

&lt;p&gt;Five clients × eight tools = &lt;strong&gt;forty wires&lt;/strong&gt;. Every new tool means five new integrations. Every new client means eight new integrations. Nobody writes them. Most of them never exist.&lt;/p&gt;

&lt;p&gt;This isn't theoretical. The same explosion happened to IDEs in the early 2010s. Every editor wrote its own Python support, its own TypeScript support, its own Rust support. Microsoft fixed it in 2016 with the Language Server Protocol. LSP collapsed N×M for IDE/language pairs.&lt;/p&gt;

&lt;p&gt;MCP is doing the same trick for AI tools.&lt;/p&gt;

&lt;h2&gt;What MCP actually is&lt;/h2&gt;

&lt;p&gt;A client process and a server process. They talk &lt;strong&gt;JSON-RPC 2.0&lt;/strong&gt; — over standard input/output if the server runs locally, over HTTP if it's remote. Boring on purpose.&lt;/p&gt;

&lt;p&gt;What's interesting is what you can ask the server to do. MCP defines exactly three things a server can offer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt; — callable functions. &lt;code&gt;read_file&lt;/code&gt;, &lt;code&gt;query_database&lt;/code&gt;, &lt;code&gt;send_message&lt;/code&gt;. Each named, typed, with a description the model reads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resources&lt;/strong&gt; — readable data. A file, a database row, a config. Each addressable by URI, like a tiny URL pointing inside the server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompts&lt;/strong&gt; — reusable templates the user can invoke. A "summarize this PR" template that fills in the diff. Parameterized starting points.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Three primitives. Tools you call, resources you read, prompts you reuse. That's the whole API surface.&lt;/p&gt;
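
&lt;p&gt;On the wire, calling a tool is one JSON-RPC request and one response. Roughly this shape (the method name comes from the MCP spec; the tool arguments are invented):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;→ {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
   "params": {"name": "read_file", "arguments": {"path": "notes.txt"}}}

← {"jsonrpc": "2.0", "id": 1,
   "result": {"content": [{"type": "text", "text": "..."}]}}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;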

&lt;p&gt;Why three? Because Anthropic looked at LSP and copied the shape. LSP defines a small set of things a language server can offer — hover, go-to-definition, format. Small enough that anyone can implement. Big enough that an editor that speaks LSP gets every language for free.&lt;/p&gt;

&lt;p&gt;MCP is the same bet. Three primitives. Small enough to implement in an afternoon. Big enough that a client that speaks them gets every tool for free.&lt;/p&gt;

&lt;h2&gt;The collapse&lt;/h2&gt;

&lt;p&gt;Add MCP to the chaos.&lt;/p&gt;

&lt;p&gt;Each tool implements the protocol once. Eight implementations. The GitHub team writes the GitHub server. The Postgres team writes the Postgres server.&lt;/p&gt;

&lt;p&gt;Each client speaks the protocol once. Five implementations.&lt;/p&gt;

&lt;p&gt;Total wires? &lt;strong&gt;Eight + five = thirteen.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Forty becomes thirteen. N×M becomes N+M. The same letters in the math, a different operation between them. Multiplication collapsed to addition.&lt;/p&gt;

&lt;p&gt;Add a ninth tool? One new implementation. Every client gets it for free. Add a sixth client? One new implementation. Every tool already works.&lt;/p&gt;

&lt;p&gt;The framing the docs landed on: &lt;strong&gt;USB-C for AI&lt;/strong&gt;. One plug shape. Many devices.&lt;/p&gt;

&lt;h2&gt;The compose moment&lt;/h2&gt;

&lt;p&gt;Where it stops feeling like an integration story and starts feeling like an operating system: one client with four MCP servers running. Filesystem. GitHub. Slack. Postgres.&lt;/p&gt;

&lt;p&gt;You type one prompt: &lt;em&gt;"Find the bug from yesterday's incident, write a fix, push the PR, post it to the channel."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Watch the agent work. It opens the filesystem server, reads the incident logs. Opens Postgres, runs an EXPLAIN, confirms the missing index. Opens GitHub, branches off main, opens a PR. Opens Slack, posts the link.&lt;/p&gt;

&lt;p&gt;Four servers. One sentence. The agent composed them. None of them know about each other. The protocol is what made them composable.&lt;/p&gt;

&lt;p&gt;Anthropic has been pushing this further. There's a recent piece from their engineering team on treating MCP servers as code APIs — letting the agent write code that imports them, filters in execution, returns just the answer. In one example, they took a Drive-to-Salesforce workflow and dropped the model's token usage from 150,000 down to 2,000. &lt;strong&gt;98.7% less context&lt;/strong&gt; for the same result.&lt;/p&gt;

&lt;p&gt;That's not an incremental improvement. That's a different kind of program.&lt;/p&gt;

&lt;h2&gt;The mental model&lt;/h2&gt;

&lt;p&gt;Lock it in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP is a protocol&lt;/strong&gt;. JSON-RPC over stdio or HTTP. Three primitives a server offers — tools, resources, prompts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It exists because of N×M&lt;/strong&gt;. Five clients × eight tools = forty integrations. With MCP, eight + five = thirteen. The same trick LSP played for IDEs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's spreading because every major LLM client picked it up&lt;/strong&gt;. Anthropic shipped it in November 2024. OpenAI added support. Microsoft. Google. The Linux Foundation took it over in December 2025. Ninety-seven million SDK downloads a month by March 2026.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next time someone on your team stands up an MCP server, you know what they actually built. Not another API. A plug. Speak the protocol once, and every client in the ecosystem can reach you. That's the deal.&lt;/p&gt;

</description>
      <category>mcpmodelcontextprotocolanthrop</category>
    </item>
    <item>
      <title>Key-Value Stores Aren't Simple</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Wed, 29 Apr 2026 23:29:31 +0000</pubDate>
      <link>https://dev.to/neuraldownload/key-value-stores-arent-simple-36a9</link>
      <guid>https://dev.to/neuraldownload/key-value-stores-arent-simple-36a9</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=cSzh7YwDvYI" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=cSzh7YwDvYI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Two lines of bash, and you have a key-value store:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;db_set&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;,&lt;/span&gt;&lt;span class="nv"&gt;$2&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /tmp/db&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
db_get&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="s2"&gt;"^&lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;,"&lt;/span&gt; /tmp/db | &lt;span class="nb"&gt;tail&lt;/span&gt; &lt;span class="nt"&gt;-1&lt;/span&gt; | &lt;span class="nb"&gt;cut&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;, &lt;span class="nt"&gt;-f2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Append to a file. Grep what you wrote. Run it. It works. That's not a joke — it's the opening of the storage chapter in Martin Kleppmann's &lt;em&gt;Designing Data-Intensive Applications&lt;/em&gt;. Three methods on the API surface — get, put, delete — and a working KV store in two shell functions. So why is Redis 80,000 lines of C? Why has Meta's ZippyDB been in production since 2013? Why does Discord expose the same &lt;code&gt;get/put/delete&lt;/code&gt; interface over its trillion-message data layer, yet keep rebuilding what sits underneath?&lt;/p&gt;

&lt;p&gt;Because the API is a trap. The three-method interface hides five enormous design decisions, and every production KV store made a different choice on every axis. The signature looks substitutable. The systems behind it are not.&lt;/p&gt;

&lt;h2&gt;Five Dimensions Hiding Behind get / put / delete&lt;/h2&gt;

&lt;p&gt;Here are the five — durability, consistency, latency tails, sharding, and hot keys — each grounded in a real production system.&lt;/p&gt;

&lt;h3&gt;1. Durability — what does "OK" mean?&lt;/h3&gt;

&lt;p&gt;You call &lt;code&gt;put&lt;/code&gt;. The server returns &lt;code&gt;OK&lt;/code&gt;. What just happened?&lt;/p&gt;

&lt;p&gt;That depends on the system. &lt;code&gt;OK&lt;/code&gt; in one KV means the bytes are queued in memory on this one machine. &lt;code&gt;OK&lt;/code&gt; in another means a majority of replicas have it in their write-ahead log and the primary fsync'd to disk. The function signature is identical. The promise is wildly different.&lt;/p&gt;
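
&lt;p&gt;The gap fits in a toy. A sketch in Rust: same &lt;code&gt;put&lt;/code&gt; signature, two meanings of &lt;code&gt;OK&lt;/code&gt;. The file path and the flag are made up for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;use std::fs::OpenOptions;
use std::io::Write;

// Toy put(): the fsync flag decides what "OK" means.
fn put(key: &amp;amp;str, value: &amp;amp;str, fsync: bool) -&amp;gt; std::io::Result&amp;lt;()&amp;gt; {
    let mut db = OpenOptions::new().create(true).append(true).open("/tmp/db")?;
    writeln!(db, "{key},{value}")?; // "OK" #1: bytes handed to the OS page cache
    if fsync {
        db.sync_all()?;             // "OK" #2: bytes forced onto the device
    }
    Ok(())                          // same return value, wildly different promise
}

fn main() -&amp;gt; std::io::Result&amp;lt;()&amp;gt; {
    put("user:1", "alice", false)?; // fast: lost if power dies right now
    put("user:2", "bob", true)      // slow: survives the crash
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;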

&lt;p&gt;Meta's &lt;strong&gt;ZippyDB&lt;/strong&gt; exposes the choice as an option flag on the same &lt;code&gt;put&lt;/code&gt; call. Default mode: ack only after a majority of replicas have logged the write to their Paxos logs &lt;em&gt;and&lt;/em&gt; the primary has flushed to RocksDB. Strong, slow. Fast-acknowledge mode: ack the moment the primary has the write queued for replication. Fast, fragile. Same client code. Two completely different durability stories.&lt;/p&gt;

&lt;p&gt;At the floor of the spectrum sits default Redis — in-memory only. Crash the box and you lose every write since the last snapshot. Turn on AOF (append-only file) with &lt;code&gt;everysec&lt;/code&gt; flushing and you still have a one-second loss window on power failure.&lt;/p&gt;

&lt;p&gt;In 2013, Kyle Kingsbury (Aphyr) ran Jepsen on a Redis cluster with async replication and a network partition. He sent 2,000 writes. Redis claimed 1,998 of them succeeded. Only 872 were actually present after the failover settled. &lt;strong&gt;Redis dropped 56% of the writes it had explicitly told the client succeeded&lt;/strong&gt; — because async replication plus failover plus partition is a recipe for silent data loss, and the API never said which mode you were in.&lt;/p&gt;

&lt;h3&gt;2. Consistency — what can the next read see?&lt;/h3&gt;

&lt;p&gt;Two clients. One writes a value. The other reads — immediately, on a different node. What does the reader see?&lt;/p&gt;

&lt;p&gt;If the system is eventually consistent, the reader might see the new value, the old one, or briefly both. Stale reads aren't a bug; they're a feature. Werner Vogels framed the trade-off best while building Dynamo at Amazon: strong consistency is non-negotiable for a bank balance, and overkill for a shopping cart. The cart can lose an item and add it back. The KV store has no idea which one you're storing — that's your problem.&lt;/p&gt;

&lt;p&gt;Dropbox built &lt;strong&gt;Panda&lt;/strong&gt; to be the cart-and-balance-and-everything-else metadata layer for their filesystem. Two petabytes of data. Tens of millions of QPS. Single-digit-millisecond latency. &lt;strong&gt;Linearizable reads. ACID transactions across multiple keys with two-phase commit.&lt;/strong&gt; Hybrid logical clocks (HLCs) tag every write with a monotonic version that's tied to wall-clock time, so the system can answer reads at a consistent snapshot and know exactly what was visible.&lt;/p&gt;

&lt;p&gt;That's not a normal KV store. It's a transactional KV. Same &lt;code&gt;put&lt;/code&gt; call you'd write against Redis — but commits or rolls back atomically across multiple keys. Dropbox explicitly rejected CockroachDB (quorum replication added 80ms write latency, incompatible with their target), FoundationDB (centralized timestamp oracle hit a single-process scaling cap), and Vitess (no production cross-shard ACID). The transactional-KV space is real, and the ergonomics matter as much as the API.&lt;/p&gt;

&lt;p&gt;When consistency goes wrong, Jepsen calls it "multiple timelines of a single key." That's split-brain in three words: two halves of the cluster see different histories of the same key, and when the partition heals, one of them gets quietly overwritten. Panda makes that impossible by design. Default-configured Redis can produce it under failure. The function signature, again, doesn't tell you.&lt;/p&gt;

&lt;h3&gt;3. Latency — p99, not p50&lt;/h3&gt;

&lt;p&gt;"My database is fast." What does that mean?&lt;/p&gt;

&lt;p&gt;p50 (the median) is the marketing number — half your requests are this fast or faster. The number that pages you at 3 AM is &lt;strong&gt;p99&lt;/strong&gt; — the slowest 1% of requests. Marc Brooker, who writes more clearly about tail latency than anyone, calls it "those times when your system is weirdly slow."&lt;/p&gt;

&lt;p&gt;Why does it matter at scale? Because tails compound. Imagine one server with a 1% chance of being slow. That's fine. Now your request fans out to 100 servers and you wait for all of them. The probability that at least one is slow is &lt;strong&gt;63%&lt;/strong&gt;. Jeff Dean and Luiz Barroso laid this out in &lt;em&gt;The Tail at Scale&lt;/em&gt; (2013) — what was rare on a single server becomes normal on a fleet.&lt;/p&gt;
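
&lt;p&gt;The arithmetic is worth seeing once:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;P(all 100 fast)      = 0.99^100 ≈ 0.37
P(at least one slow) = 1 - 0.37 ≈ 63%
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;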

&lt;p&gt;Discord saw the math live in production. On Cassandra, their insert p99 swung between 5 and 70 ms — Java GC pauses, compaction backlogs, hot partitions cascading into quorum reads. Different symptoms, one disease: the tail. In 2022 they migrated trillions of messages to ScyllaDB — same data model, written in C++, no JVM, shard-per-core architecture. &lt;strong&gt;p99 dropped to a steady 5 ms.&lt;/strong&gt; The migration ran at 3.2M messages/sec for nine days, and the 2022 World Cup Final hit during the cutover window. The system didn't notice.&lt;/p&gt;

&lt;h3&gt;4. Sharding — who owns this key?&lt;/h3&gt;

&lt;p&gt;Above some scale, the cluster has to decide which node owns which key. That decision — the partition map — is where 90% of operational pain lives.&lt;/p&gt;

&lt;p&gt;The naive answer: &lt;code&gt;hash(key) % num_servers&lt;/code&gt;. Add a server, and ~90% of keys reshuffle. Catastrophe.&lt;/p&gt;
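
&lt;p&gt;The catastrophe is easy to measure. A sketch (the hash function and key count are arbitrary):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Which server owns this key under naive modulo placement?
fn owner(key: u64, num_servers: u64) -&amp;gt; u64 {
    let mut h = DefaultHasher::new();
    key.hash(&amp;amp;mut h);
    h.finish() % num_servers
}

fn main() {
    // Count the keys whose owner changes when a fifth server joins.
    let moved = (0..10_000u64)
        .filter(|&amp;amp;k| owner(k, 4) != owner(k, 5))
        .count();
    println!("{moved} of 10000 keys moved"); // the vast majority reshuffle
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;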

&lt;p&gt;The real answers: &lt;strong&gt;consistent hashing&lt;/strong&gt; (only ~1/N keys move when a node joins or leaves), &lt;strong&gt;range partitioning&lt;/strong&gt; (keys are stored in sorted ranges, great for ordered scans, vulnerable to hot ranges), or &lt;strong&gt;composite keys&lt;/strong&gt; (Discord's choice). Discord shards on &lt;code&gt;(channel_id, time_bucket)&lt;/code&gt; with Snowflake IDs that sort chronologically — old conversations live on cold nodes, active channels stay on hot nodes, and any single channel's messages cluster together for cache locality.&lt;/p&gt;

&lt;p&gt;Pick the partition key wrong and one shard burns while the others sleep. Pick it right and your cluster scales linearly until the next problem hits.&lt;/p&gt;

&lt;h3&gt;5. Hot keys — when one key is everyone's key&lt;/h3&gt;

&lt;p&gt;Even a perfect partition map can get you killed by one key.&lt;/p&gt;

&lt;p&gt;Discord said it best in their migration writeup: "A server with hundreds of thousands of people sends orders of magnitude more messages than a small group of friends." One channel ID. One partition. Ten times the traffic of any other partition.&lt;/p&gt;

&lt;p&gt;Why does that cascade across the cluster? In a quorum-replicated system, every read pulls in two or three nodes that hold copies of that partition. The hot partition's nodes get hammered, fall behind, and now every other query that happens to touch one of those nodes — for completely unrelated keys — slows down too. The hot partition pollutes the rest of the cluster.&lt;/p&gt;

&lt;p&gt;Discord's fix lives &lt;strong&gt;above&lt;/strong&gt; the database, not inside it. They built a Rust data services layer between the gateway and the storage cluster. Its job: when a thousand users open the same channel at the same moment, the layer collapses that into a single database query and fans the same answer back to all thousand readers. One row, one query, a thousand happy clients. Request coalescing.&lt;/p&gt;

&lt;p&gt;The API never told you which keys are hot. It can't — hot keys come from your users, and your users change every hour. The mitigation has to live in the layer that knows about your users: caches, coalescers, replication strategies, sometimes app-level pre-sharding. Never in get, put, or delete.&lt;/p&gt;

&lt;h2&gt;Five Questions to Ask Before You Pick a KV Store&lt;/h2&gt;

&lt;p&gt;Three methods on top. Five enormous decisions underneath. Next time you reach for a key-value store, you're not picking an interface. You're picking five answers to five hard questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Durability&lt;/strong&gt; — what does &lt;code&gt;OK&lt;/code&gt; actually mean? Bytes in RAM? Replicated? Fsynced? Quorum-acked across regions?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt; — what can the next read see? Eventual? Read-your-writes? Linearizable? Tunable per query?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt; — what's the p99, not the p50? What happens when one request fans out to many nodes?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sharding&lt;/strong&gt; — how does the cluster pick the node that owns a key? What happens when you add or remove one?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hot keys&lt;/strong&gt; — when 80% of reads hit one key, who absorbs the heat? The database, or the layer above it?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Pick on purpose. The API is the same. The systems are not.&lt;/p&gt;

</description>
      <category>keyvaluestore</category>
      <category>keyvaluestores</category>
      <category>kvstore</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Stack vs Heap: Be Sure You Know Where Your Variables Live</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Tue, 28 Apr 2026 21:31:11 +0000</pubDate>
      <link>https://dev.to/neuraldownload/stack-vs-heap-be-sure-you-know-where-your-variables-live-5aec</link>
      <guid>https://dev.to/neuraldownload/stack-vs-heap-be-sure-you-know-where-your-variables-live-5aec</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=TP3_ZWncjqI" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=TP3_ZWncjqI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Look at this function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;make_int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// returning a pointer to a local&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now look at the caller:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;make_int&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;                  &lt;span class="c1"&gt;// some other function&lt;/span&gt;
&lt;span class="n"&gt;printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%d&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;     &lt;span class="c1"&gt;// ?&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It builds. The compiler may print a warning, but it builds. And then it lies — sometimes printing garbage, sometimes printing 42, sometimes crashing. Welcome to &lt;strong&gt;undefined behavior&lt;/strong&gt;: the C standard makes no promises about what happens.&lt;/p&gt;

&lt;p&gt;The variable looks fine in the source. Five lines, perfectly clear. There's nothing wrong with the syntax. But the variable's lifetime ended when &lt;code&gt;make_int&lt;/code&gt; returned, and the pointer didn't get the memo.&lt;/p&gt;

&lt;p&gt;That bug has a name. By the end of this post, you'll know it, why the language allows it, and the rule that prevents it.&lt;/p&gt;

&lt;h2&gt;What "stack allocation" actually is&lt;/h2&gt;

&lt;p&gt;Here's how &lt;code&gt;make_int&lt;/code&gt; actually compiled (gcc, x86-64, no optimization):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;make_int:
    push    rbp
    mov     rbp, rsp
    sub     rsp, 16          ; &amp;lt;-- this line
    mov     DWORD PTR [rbp-4], 42
    lea     rax, [rbp-4]
    leave
    ret
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look at the marked line. &lt;code&gt;sub rsp, 16&lt;/code&gt;. &lt;strong&gt;That's the entire stack allocation.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;rsp&lt;/code&gt; is a register. The stack pointer. The compiler subtracts 16 from it. Now there are 16 bytes of memory between where the pointer was and where it is now. That's where the function's locals live.&lt;/p&gt;

&lt;p&gt;That's all stack allocation means. Move a register. One instruction.&lt;/p&gt;

&lt;p&gt;When the function returns, the compiler emits the opposite — &lt;code&gt;add rsp, 16&lt;/code&gt; (or &lt;code&gt;leave&lt;/code&gt;, which does the same thing). The pointer slides back up. The bytes down there are still there. But the program no longer thinks they belong to anything.&lt;/p&gt;

&lt;p&gt;The lifetime ended. The pointer didn't.&lt;/p&gt;

&lt;p&gt;Then the next function runs. It needs space too. &lt;code&gt;sub rsp, 16&lt;/code&gt;. Same instruction. Same memory. The pointer you took home from &lt;code&gt;make_int&lt;/code&gt; still points at that address — but that address is now somebody else's variable. Read it later, and you'll most likely see whatever the next function wrote there.&lt;/p&gt;

&lt;p&gt;The local didn't get freed. Its lifetime simply ended. The slot moved on without it.&lt;/p&gt;

&lt;h2&gt;The heap doesn't work like that&lt;/h2&gt;

&lt;p&gt;Heap memory is not a register you can move. It's a region the allocator manages on your behalf — and the allocator has to actually do work.&lt;/p&gt;

&lt;p&gt;Think of going to a restaurant. You tell the host how many people. The host walks the room, finds an empty table that fits, and leads you to it. That walking — that searching — that's heap allocation.&lt;/p&gt;

&lt;p&gt;The allocator tracks which blocks of memory are in use and which are free. When you ask for memory, it searches its bookkeeping data, finds a block big enough, marks it taken, and hands you back a pointer.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Stack allocation is one instruction.&lt;br&gt;
Heap allocation is bookkeeping.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's most of the performance gap, in one sentence. On the stack, moving the register &lt;strong&gt;is&lt;/strong&gt; the bookkeeping; the CPU just moves it. Heap allocation &lt;strong&gt;needs&lt;/strong&gt; bookkeeping. The allocator has to think about it.&lt;/p&gt;

&lt;p&gt;So why bother with the heap at all?&lt;/p&gt;

&lt;h2&gt;The rule&lt;/h2&gt;

&lt;p&gt;Because of one rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If a value's lifetime is bounded by a single function call, it can live on the stack. If anything has to outlive the function, the stack can't hold it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The frame is going away. The slot coming back becomes somebody else's storage the moment you return. So if the data has to live longer than that, it has to live somewhere durable — somewhere the function leaving doesn't kill it. For most allocations, that's the heap. Bookkeeping is the price of outliving your scope.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lifetime determines location. Not size. Not speed. Lifetime.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Different languages enforce this differently. C trusts you to get it right. Rust refuses to compile when lifetimes don't add up. Same rule. Different enforcement.&lt;/p&gt;
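
&lt;p&gt;Here's the five-line bug from the top, translated. Rust doesn't warn; it refuses to build:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;fn make_int() -&amp;gt; &amp;amp;'static i32 {
    let x = 42;
    &amp;amp;x // error[E0515]: cannot return reference to local variable `x`
}

fn main() {
    let p = make_int();
    println!("{p}");
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;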

&lt;h2&gt;The bug, named&lt;/h2&gt;

&lt;p&gt;So look back at the bug from the start.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;make_int&lt;/code&gt; allocated an &lt;code&gt;int&lt;/code&gt; on its stack frame. It returned a pointer to it. The frame popped. The next function reused those bytes. The pointer was now pointing at lies.&lt;/p&gt;

&lt;p&gt;That bug has a name. &lt;strong&gt;Dangling pointer.&lt;/strong&gt; When the storage was on the stack and the function returned, it's called &lt;strong&gt;use-after-return&lt;/strong&gt;. (When the storage was on the heap and you called &lt;code&gt;free&lt;/code&gt;, it's called use-after-free. Same shape, different storage.)&lt;/p&gt;

&lt;p&gt;Why did the language allow it? Because C decided to trust you. The compiler may warn, but the standard doesn't refuse. The lifetime rule is real — it just isn't enforced.&lt;/p&gt;

&lt;h2&gt;The recap, in order&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stack allocation&lt;/strong&gt; is moving the stack pointer. One instruction.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Heap allocation&lt;/strong&gt; is bookkeeping. The allocator has to find space.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lifetime&lt;/strong&gt; decides which one you need.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A dangling pointer&lt;/strong&gt; is a pointer that outlived its storage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So next time you write &lt;code&gt;return &amp;amp;local&lt;/code&gt;, or any pointer that might outlive its source — ask one question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does this data outlive the frame?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If yes, it can't live on the stack.&lt;/p&gt;

</description>
      <category>stackvsheap</category>
      <category>stackmemory</category>
      <category>heapmemory</category>
      <category>memoryallocation</category>
    </item>
    <item>
      <title>15 HTTP Status Codes You'll Actually Hit | Coffee Time</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Mon, 27 Apr 2026 19:57:42 +0000</pubDate>
      <link>https://dev.to/neuraldownload/15-http-status-codes-youll-actually-hit-coffee-time-3l1i</link>
      <guid>https://dev.to/neuraldownload/15-http-status-codes-youll-actually-hit-coffee-time-3l1i</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=ZAMNET5jEzI" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=ZAMNET5jEzI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Most HTTP status code references walk the codes numerically — 1xx, 2xx, 3xx, 4xx, 5xx — like a flashcard deck. That ordering teaches the &lt;em&gt;number system&lt;/em&gt; but not what actually happens to a request.&lt;/p&gt;

&lt;p&gt;Here's a different lens: walk them in the order a request hits them. Connection. Success. Redirect. Client error. Server error. Each transition is a story beat, and once you see them as conversational moves between client and server, the table stops being a memorized list.&lt;/p&gt;

&lt;p&gt;Fifteen codes you'll actually hit in production, in journey order.&lt;/p&gt;

&lt;h2&gt;The Good Path: 100, 200, 204&lt;/h2&gt;

&lt;p&gt;Three numbers come back when everything works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;100 Continue&lt;/strong&gt; is the polite knock. Your client wants to upload 50 megabytes. Instead of pushing the bytes blindly, it sends the headers first with &lt;code&gt;Expect: 100-continue&lt;/code&gt; and waits. The server checks auth, content length, all of it — and replies &lt;code&gt;100&lt;/code&gt; only if it intends to accept the body. If the answer was going to be a &lt;code&gt;401&lt;/code&gt; or a &lt;code&gt;413&lt;/code&gt;, the 50 megabytes never leave your machine.&lt;/p&gt;
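
&lt;p&gt;The handshake, condensed (the path and size are invented):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /upload HTTP/1.1
Content-Length: 52428800
Expect: 100-continue

HTTP/1.1 100 Continue
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;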

&lt;p&gt;&lt;strong&gt;200 OK&lt;/strong&gt; is the workhorse. Request landed, server did the work, here's the body.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;204 No Content&lt;/strong&gt; is "I did it, nothing to say." Common after a &lt;code&gt;DELETE&lt;/code&gt; — there's no representation of nothing. The spec actually forbids a body on a 204; sending one is a protocol violation.&lt;/p&gt;

&lt;h2&gt;The Redirect Trap: 301, 304, 308&lt;/h2&gt;

&lt;p&gt;Sometimes the server doesn't answer — it points somewhere else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;301 Moved Permanently&lt;/strong&gt; sounds clean. It is not. POST a form to a URL that returns 301, and your browser will silently change the POST to a GET, drop the body, and follow the new URL with no payload. The spec literally codifies this: &lt;em&gt;"For historical reasons, a user agent MAY change the request method from POST to GET for the subsequent request."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That bug broke form submissions on the web for twenty years. Browsers did it so consistently that codifying the wrong behavior was easier than fixing it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;308 Permanent Redirect&lt;/strong&gt; is the fix. Same meaning as 301, except the method is preserved. Your POST stays a POST. Your body comes with you. 308 exists for one reason: to undo 301's mistake.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;304 Not Modified&lt;/strong&gt; is the cache validator. Your browser sends a hash (&lt;code&gt;If-None-Match: "abc123"&lt;/code&gt;), the server compares it against the current &lt;code&gt;ETag&lt;/code&gt;, and if nothing's changed it replies &lt;code&gt;304&lt;/code&gt; with the header section only. No body. The smallest useful response in HTTP — every revisit to a cached asset costs ~200 bytes of headers instead of the full payload.&lt;/p&gt;
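
&lt;p&gt;The exchange, condensed (the asset name is invented; the ETag is the one above):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /app.js HTTP/1.1
If-None-Match: "abc123"

HTTP/1.1 304 Not Modified
ETag: "abc123"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;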

&lt;h2&gt;Your Fault: 400, 401, 403, 404&lt;/h2&gt;

&lt;p&gt;Four ways the server tells you the problem is on your side.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;400 Bad Request&lt;/strong&gt; — your JSON has a trailing comma, your headers are malformed, the server couldn't parse what you sent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;401 Unauthorized&lt;/strong&gt; is misnamed. It actually means &lt;em&gt;unauthenticated&lt;/em&gt; — the server doesn't know who you are. And here's the part most APIs get wrong: a &lt;code&gt;401&lt;/code&gt; MUST send back a &lt;code&gt;WWW-Authenticate&lt;/code&gt; header. That header tells the client how to retry — which auth scheme, which realm. Without it, the client has no path forward. It's spec-mandated, not optional. If your API returns &lt;code&gt;401&lt;/code&gt; with no challenge header, you're violating HTTP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;403 Forbidden&lt;/strong&gt; means the server knows exactly who you are. You just can't have this. Sending the same credentials again won't help.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;404 Not Found&lt;/strong&gt; — and here's where it gets interesting. &lt;strong&gt;404 is allowed to lie.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Try to hit a private GitHub repo while logged in. By the rules, GitHub should return &lt;code&gt;403&lt;/code&gt; — you exist, you're authenticated, you don't have access. Instead, GitHub returns &lt;code&gt;404&lt;/code&gt;. Same response as if the repo didn't exist at all.&lt;/p&gt;

&lt;p&gt;The spec explicitly blesses this. RFC 9110: &lt;em&gt;"An origin server that wishes to 'hide' the current existence of a forbidden target resource MAY instead respond with a status code of 404 (Not Found)."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The lie is the feature. If GitHub returned &lt;code&gt;403&lt;/code&gt;, an attacker could probe URLs and learn which private repos exist by which ones say "you can't have this" versus "no such thing." &lt;code&gt;404&lt;/code&gt; collapses both signals into one.&lt;/p&gt;

&lt;h2&gt;The Personality Codes: 418, 429, 451&lt;/h2&gt;

&lt;p&gt;Three more 4xx codes, written by humans.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;418 I'm a teapot.&lt;/strong&gt; April 1, 1998 — Larry Masinter at Xerox wrote a joke RFC for a coffee pot control protocol. &lt;code&gt;418&lt;/code&gt; meant &lt;em&gt;you tried to brew coffee in a teapot&lt;/em&gt;. Pure gag.&lt;/p&gt;

&lt;p&gt;Twenty years later, the IETF tried to clean it up and remove &lt;code&gt;418&lt;/code&gt; from the registry. The internet revolted. &lt;em&gt;Save 418&lt;/em&gt;. Major web frameworks refused to drop their &lt;code&gt;418&lt;/code&gt; handlers. The IETF reversed course. RFC 9110 §15.5.19 codifies the truce: &lt;em&gt;"the definition of an application-specific 418 status code... has been deployed as a joke often enough for the code to be unusable for any future use. Therefore, the 418 status code is reserved in the IANA HTTP Status Code Registry."&lt;/em&gt; The joke became permanent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;429 Too Many Requests&lt;/strong&gt; is the only 4xx that tells you when to come back. Headers can include &lt;code&gt;Retry-After&lt;/code&gt; — five seconds, an exact timestamp, whatever. Every other 4xx is a flat rejection; &lt;code&gt;429&lt;/code&gt; is a contract.&lt;/p&gt;
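
&lt;p&gt;The contract, in full (the five-second value is arbitrary):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HTTP/1.1 429 Too Many Requests
Retry-After: 5
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;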

&lt;p&gt;&lt;strong&gt;451 Unavailable For Legal Reasons&lt;/strong&gt; — the number is a Fahrenheit 451 reference. Bradbury's novel about burning books. Tim Bray proposed it in 2013, ratified as RFC 7725 in 2016. When a court or government has demanded the server block this content, &lt;code&gt;451&lt;/code&gt; is the response. It can include a &lt;code&gt;Link&lt;/code&gt; header with &lt;code&gt;rel="blocked-by"&lt;/code&gt; pointing to the entity demanding the block. Censorship made machine-readable.&lt;/p&gt;
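
&lt;p&gt;A machine-readable block, sketched (the URL is invented):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HTTP/1.1 451 Unavailable For Legal Reasons
Link: &amp;lt;https://legal.example.com/takedown-order&amp;gt;; rel="blocked-by"
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;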

&lt;h2&gt;Their Fault: 500, 502, 503, 504&lt;/h2&gt;

&lt;p&gt;Now it's the server's fault — but not always the &lt;em&gt;same&lt;/em&gt; server.&lt;/p&gt;

&lt;p&gt;In production, your traffic doesn't hit one box. It hits a load balancer. The load balancer talks to a backend behind it. The 5xx family points fingers at four different places along that chain.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;500 Internal Server Error&lt;/strong&gt; — something broke inside the backend. Uncaught exception, null pointer. The catch-all.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;502 Bad Gateway&lt;/strong&gt; — the load balancer talked to the backend, the backend replied, and the reply was &lt;em&gt;garbage&lt;/em&gt;. Malformed bytes. Connection reset mid-response. Upstream gave junk.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;503 Service Unavailable&lt;/strong&gt; — &lt;em&gt;this&lt;/em&gt; server, the one you're talking to, is overloaded or in maintenance. Often comes with &lt;code&gt;Retry-After&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;504 Gateway Timeout&lt;/strong&gt; — the load balancer asked the backend a question and got &lt;em&gt;silence&lt;/em&gt;. No reply within the timeout window.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mnemonic: 500 = backend crashed, 502 = backend replied with garbage, 503 = this box is overloaded, 504 = backend went silent. Four codes, four different broken links.&lt;/p&gt;

&lt;h2&gt;The Map&lt;/h2&gt;

&lt;p&gt;The whole point of journey ordering: when you see a code in production, you can place it.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1xx — handshake&lt;/li&gt;
&lt;li&gt;2xx — request landed&lt;/li&gt;
&lt;li&gt;3xx — request rerouted&lt;/li&gt;
&lt;li&gt;4xx — request rejected (you)&lt;/li&gt;
&lt;li&gt;5xx — request died inside (them)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next time you hit a code, don't ask &lt;em&gt;what does this number mean&lt;/em&gt;. Ask &lt;em&gt;where on the journey did this request die&lt;/em&gt;. The answer is already in the number.&lt;/p&gt;

</description>
      <category>httpstatuscodes</category>
      <category>httpcodes</category>
      <category>404notfound</category>
      <category>500internalservererror</category>
    </item>
    <item>
      <title>Microservices Aren't About Services</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Sun, 26 Apr 2026 04:20:54 +0000</pubDate>
      <link>https://dev.to/neuraldownload/microservices-arent-about-services-2c2j</link>
      <guid>https://dev.to/neuraldownload/microservices-arent-about-services-2c2j</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=4F0dlOMWGHE" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=4F0dlOMWGHE&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Monolith or microservices? You've been in this meeting before. One engineer says "we need to move to microservices or we won't scale." Another one says "we tried microservices at my last company and it was a disaster." Both of them think they're arguing about architecture.&lt;/p&gt;

&lt;p&gt;They're not. They're arguing about Conway's Law, and neither of them has noticed yet.&lt;/p&gt;

&lt;p&gt;Here's the one-sentence version of what every balanced treatment of this topic is actually trying to tell you: &lt;strong&gt;microservices pay off when multiple autonomous teams need to deploy independently across cleanly separable domains. Otherwise, a monolith is cheaper.&lt;/strong&gt; Everything below is the detail.&lt;/p&gt;

&lt;h2&gt;Microservices aren't about services&lt;/h2&gt;

&lt;p&gt;The most common definition you'll hear — "a microservice is a small service" — is wrong. Size is a symptom, not the rule.&lt;/p&gt;

&lt;p&gt;A microservice is defined by one property: &lt;strong&gt;independent deployability&lt;/strong&gt;. You can push one service without coordinating with the others. No shared database. No lockstep release. One team ships; nobody else has to wait.&lt;/p&gt;

&lt;p&gt;If your services share a database, or they have to be deployed together, or a change to one requires a change to another — you don't have microservices. You have a distributed monolith. That's strictly worse than a regular monolith, because you've paid the distributed-systems tax and gotten nothing back.&lt;/p&gt;

&lt;p&gt;So the question isn't "do we want lots of small services?" It's "do our teams actually need to ship independently?"&lt;/p&gt;

&lt;h2&gt;The case FOR, steelmanned&lt;/h2&gt;

&lt;p&gt;There are two things a monolith physically cannot do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One: independent deploy cadence.&lt;/strong&gt; If you have a pricing team that ships experiments five times a day, and a payments team that ships quarterly under compliance review, they cannot coexist on the same release train. One team's test suite is the other team's blocker. In 2002, Amazon figured this out and made a company-wide rule: no team communicates with another team except via service APIs. Every API had to be externalizable. The result, years later, was AWS as a business, and deploys happening somewhere around every eleven seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Two: Conway's Law.&lt;/strong&gt; Named after Melvin Conway, 1968: &lt;em&gt;"Any organization that designs a system will produce a design whose structure is a copy of the organization's communication structure."&lt;/em&gt; Your architecture is a photograph of your org chart. You can't escape it. If you want autonomous services, you first need autonomous teams. Every successful microservices migration you've heard of is an org restructure in disguise.&lt;/p&gt;

&lt;p&gt;That's the pro argument. It isn't "small services are better." It's: when your org has outgrown a single team's coordination ceiling, microservices are the tool that lets teams decouple from each other.&lt;/p&gt;

&lt;p&gt;The catch is right there in the sentence: &lt;em&gt;"when your org has outgrown a single team's coordination ceiling."&lt;/em&gt; Most teams haven't.&lt;/p&gt;

&lt;h2&gt;
  
  
  The case AGAINST, steelmanned
&lt;/h2&gt;

&lt;p&gt;Two pathologies that microservices have and monoliths don't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One: the math.&lt;/strong&gt; Suppose each service call has a 99% chance of being fast. Now a single user request fans out to 100 services. What's the probability that at least &lt;em&gt;one&lt;/em&gt; of those calls is slow?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;P(at least one slow) = 1 - (0.99)^N

  N=10   →  10%
  N=100  →  63%
  N=1000 →  ~100%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;63%. Two-thirds of your user requests hit at least one slow backend, not because anything is broken, but because probability scales with fanout. Jeff Dean and Luiz Barroso named this &lt;em&gt;"the tail at scale"&lt;/em&gt; in a 2013 paper. In-process calls have latency variance too — but the network multiplies it across every hop. Fanout turns rare slowness into the common case. And you only see it in production.&lt;/p&gt;
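
&lt;p&gt;If you'd rather watch the math happen than trust the formula, here's a minimal simulation — illustrative only; the 1% slow-call rate and the trial count are assumptions for the sketch, not measured numbers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import random

# A user request fans out to N backends; each call is independently
# "slow" with probability p. The request is slow if ANY call is slow.
def p_request_slow(fanout, trials=100_000, p_call_slow=0.01):
    slow = sum(
        any(random.random() &amp;lt; p_call_slow for _ in range(fanout))
        for _ in range(trials)
    )
    return slow / trials

for n in (1, 10, 100):
    print(n, p_request_slow(n))   # ~0.01, ~0.10, ~0.63
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
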

&lt;p&gt;&lt;strong&gt;Two: the distributed monolith.&lt;/strong&gt; The most-named, least-visualized failure mode in the whole debate. You set out to build microservices. You end up with twelve services that call each other synchronously five-deep, share a database, must be deployed together, and can't be run locally. You paid every distributed-systems tax, and got none of the distributed-systems benefits.&lt;/p&gt;

&lt;p&gt;That's the worst tradeoff in software architecture. And it's what you get when you adopt the &lt;em&gt;architecture&lt;/em&gt; without the &lt;em&gt;org structure&lt;/em&gt; that justifies it.&lt;/p&gt;

&lt;h2&gt;
  
  
  When each one wins
&lt;/h2&gt;

&lt;p&gt;Three questions. Answer in order. Stop at the first "no."&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Do you have 15+ engineers across multiple autonomous teams contributing to the same codebase?&lt;/li&gt;
&lt;li&gt;Do those teams actually need to deploy at different cadences, and is the shared pipeline the real bottleneck?&lt;/li&gt;
&lt;li&gt;Can your product be genuinely split into separate domains that each feel like they could be bought from a different company?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Three "yes" answers → microservices earn their premium.&lt;/p&gt;

&lt;p&gt;Any "no" → monolith, or modular monolith. Shopify runs a Rails monolith with over a thousand engineers and handles thirty terabytes per minute at flash-sale peak. Stack Overflow serves 200M+ requests a day off eleven web servers. These aren't companies that failed to graduate to microservices. They're companies that looked at the three questions and decided they don't have the organizational problem.&lt;/p&gt;

&lt;p&gt;Here's the honest footnote. Martin Fowler popularized the word "microservices" in 2014. One year later, he wrote an essay called &lt;em&gt;MonolithFirst&lt;/em&gt;. His line: &lt;em&gt;"Almost all the successful microservice stories have started with a monolith that got too big."&lt;/em&gt; The same person who named the architecture also named the mistake of starting with it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The verdict
&lt;/h2&gt;

&lt;p&gt;Microservices solve an organizational scaling problem by introducing distributed-systems problems. If you don't have the first, you don't want the second.&lt;/p&gt;

&lt;p&gt;Next time this debate starts at your company, don't argue about Netflix or Amazon. Write the three questions on a whiteboard. If the answers are "one team, no, and no" — you already have your answer.&lt;/p&gt;

&lt;p&gt;Build the architecture that actually matches the team you have. Not the one you wish you had.&lt;/p&gt;

</description>
      <category>microservices</category>
      <category>monolith</category>
      <category>microservicesvsmonolith</category>
      <category>distributedsystems</category>
    </item>
    <item>
      <title>Auth: It's Easier Than You Think</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Fri, 24 Apr 2026 18:56:04 +0000</pubDate>
      <link>https://dev.to/neuraldownload/auth-its-easier-than-you-think-1phm</link>
      <guid>https://dev.to/neuraldownload/auth-its-easier-than-you-think-1phm</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=XeFqLDL4lVA" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=XeFqLDL4lVA&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bearer token. Access token. ID token. Session cookie. OAuth. OIDC. JWT. API key. You open the auth docs and they throw ten words at you.&lt;/p&gt;

&lt;p&gt;It feels like ten things. It's four. Here they are.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Identity — who you CLAIM to be
&lt;/h2&gt;

&lt;p&gt;Every auth system starts with the same question: who are you? Not yet — who do you &lt;em&gt;claim&lt;/em&gt; to be?&lt;/p&gt;

&lt;p&gt;That's identity. It's a claim. &lt;code&gt;jane@example.com&lt;/code&gt;. User ID &lt;code&gt;42&lt;/code&gt;. &lt;code&gt;service: payments&lt;/code&gt;. A string, a number, a subject line. The system has no reason yet to believe you, but you made the claim.&lt;/p&gt;

&lt;p&gt;Here's the part most tutorials skip. Identity and proof are two separate things. The email address alone is not authentication. It's just what you told the server.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Credential — proof of the claim
&lt;/h2&gt;

&lt;p&gt;A credential is proof the claim is real. Something only you could produce.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A password that only you know.&lt;/li&gt;
&lt;li&gt;A private key that only you hold.&lt;/li&gt;
&lt;li&gt;A signature from a device only you own.&lt;/li&gt;
&lt;li&gt;Your fingerprint, your face, a code from your authenticator app.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of those play the same role: proof attached to a claim. When you see "username and password" in one system and "private key and signature" in another, you're looking at the same primitive, wearing different clothes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API keys are the weird one.&lt;/strong&gt; An API key is a credential that skips the claim — the key itself IS the identity. Possession equals authentication. The server sees the key, looks it up, and now knows who you are AND that you're allowed to be here. Credential and session, fused into one long-lived string. That's why leaking an API key is catastrophic — there's no second factor. The string is the whole auth system.&lt;/p&gt;

&lt;p&gt;Together, identity plus credential answers: &lt;em&gt;"you are who you said you were."&lt;/em&gt; Authentication is done. But on the very next request, the server forgets. Unless you give it something to remember.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Session — the permit you carry across requests
&lt;/h2&gt;

&lt;p&gt;HTTP has no memory. Every request arrives alone. If the server made you log in on every click, the internet would be unusable.&lt;/p&gt;

&lt;p&gt;So after you prove who you are, the server hands you a permit. Something small you can carry. You show it on every subsequent request; the server says &lt;em&gt;"I already know this one"&lt;/em&gt; and lets you in.&lt;/p&gt;

&lt;p&gt;That permit is a session.&lt;/p&gt;

&lt;p&gt;A session isn't a data structure — it's a role. It's the thing that carries your proven identity across requests so you don't re-prove it every single time.&lt;/p&gt;

&lt;p&gt;There are two common shapes for that permit. Same role, different geometry:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shape one — the session cookie.&lt;/strong&gt; The server stores everything in its database: who you are, what you can do, when this expires. It hands you an opaque ID like &lt;code&gt;sess_k2n9xq&lt;/code&gt;. That ID means nothing to you or to anyone else — it's just a ticket stub. The server looks it up in its own memory on every request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shape two — the JWT.&lt;/strong&gt; The server doesn't want to store anything, so it writes everything you need on a piece of paper and &lt;em&gt;signs&lt;/em&gt; it with its own secret key. Your identity, your permissions, when this expires — all signed. You carry the paper; the server just verifies the signature. That's a JSON Web Token.&lt;/p&gt;

&lt;p&gt;Same primitive, same role. Opaque ID vs. signed claim bag. One requires a database lookup, one does not. And that trade-off is the entire debate behind every "sessions vs. tokens" blog post you've ever skimmed.&lt;/p&gt;

&lt;p&gt;Both are sessions. Don't let the vocabulary trick you.&lt;/p&gt;
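
&lt;p&gt;To make the two geometries concrete, here's a stdlib-only sketch — illustrative, not production auth; a real system should reach for a vetted session store or JWT library:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import base64, hashlib, hmac, json, os, time

SERVER_SECRET = os.urandom(32)

# Shape one: opaque ID. All state lives server-side; the string is a stub.
sessions = {}                                    # the server's own memory
def issue_cookie(user_id):
    sid = "sess_" + os.urandom(8).hex()
    sessions[sid] = {"user": user_id, "exp": time.time() + 3600}
    return sid                                   # meaningless without the lookup

# Shape two: signed claim bag, JWT-style. No server-side state at all.
def issue_token(user_id):
    claims = json.dumps({"user": user_id, "exp": time.time() + 3600}).encode()
    sig = hmac.new(SERVER_SECRET, claims, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(claims).decode() + "." +
            base64.urlsafe_b64encode(sig).decode())

def verify_token(token):
    claims_b64, sig_b64 = token.split(".")
    claims = base64.urlsafe_b64decode(claims_b64)
    expected = hmac.new(SERVER_SECRET, claims, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(claims)                    # identity rides inside the permit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
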

&lt;h2&gt;
  
  
  4. Permission — what the session is allowed to DO
&lt;/h2&gt;

&lt;p&gt;The server knows who you are. It knows you've been here recently. But there's one more question before it actually does anything: are you &lt;em&gt;allowed&lt;/em&gt; to do this?&lt;/p&gt;

&lt;p&gt;That's permission. And it's completely separate from the first three.&lt;/p&gt;

&lt;p&gt;Read a user's profile? Maybe. Delete their account? Probably not. Charge their card? Only if you're billing. Permission is what the authenticated you is allowed to do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication answers "who are you." Authorization answers "what can you do."&lt;/strong&gt; They're different primitives, use different data, and fail in different ways. If your authentication is broken, random strangers get in. If your authorization is broken, logged-in users access things they shouldn't. Both are breaches. But the fix lives in a completely different part of your code.&lt;/p&gt;

&lt;p&gt;So the four primitives split across a gate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Identity + Credential&lt;/strong&gt; on the left — they prove who you are&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Permission&lt;/strong&gt; on the right — it decides what you can do&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session&lt;/strong&gt; straddles the gate — it carries the "who" forward so the "what" can get asked on every request without redoing the proof&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The payoff: every scary acronym is these four
&lt;/h2&gt;

&lt;p&gt;Watch what was hiding behind the big words.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Sign in with Google."&lt;/strong&gt; That's OAuth — a delegation flow. Instead of your app asking for a password, it sends you to Google. Google authenticates you. Google hands your app a session with a specific permission: &lt;em&gt;"this app can read this person's email address."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Look closely. What did your app receive? A &lt;strong&gt;session&lt;/strong&gt;. A &lt;strong&gt;permission&lt;/strong&gt;. And nothing else.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OAuth gives you three of the four primitives.&lt;/strong&gt; Session and permission directly; credential was handled by Google. But identity? OAuth alone does NOT tell your app who the user is. You got a permission to access their stuff — you did not get a proof of who they are.&lt;/p&gt;

&lt;p&gt;This is the part the internet keeps getting wrong. &lt;strong&gt;OAuth by itself is authorization, not authentication.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In 2014, a spec added the missing piece: &lt;strong&gt;OIDC&lt;/strong&gt; (OpenID Connect). OIDC is literally OAuth plus one extra thing — an ID token, a signed JWT saying &lt;em&gt;"this user is &lt;a href="mailto:jane@example.com"&gt;jane@example.com&lt;/a&gt;."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That's the whole difference. OAuth hands you session + permission. OIDC hands you session + permission + identity. All four primitives accounted for. That's how "Sign in with Google" actually becomes a real login.&lt;/p&gt;

&lt;p&gt;Every provider's login button, every enterprise SSO, every federated identity system, is some flavor of this — the four primitives, composed across two parties.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your homework
&lt;/h2&gt;

&lt;p&gt;Next time you open an auth doc, try this: before you touch a single line of code, read every header, every field, every token name, and &lt;em&gt;label&lt;/em&gt; it.&lt;/p&gt;

&lt;p&gt;Is this proving identity?&lt;br&gt;
Is this carrying a session?&lt;br&gt;
Is this granting permission?&lt;/p&gt;

&lt;p&gt;Once you know which of the four you're looking at, the docs stop being scary. They're just four things in slightly different shapes.&lt;/p&gt;

</description>
      <category>authentication</category>
      <category>authorization</category>
      <category>auth</category>
      <category>authexplained</category>
    </item>
    <item>
      <title>6 Minutes to Finally Understand Why Postgres Keeps Winning</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Fri, 24 Apr 2026 02:01:36 +0000</pubDate>
      <link>https://dev.to/neuraldownload/6-minutes-to-finally-understand-why-postgres-keeps-winning-31i7</link>
      <guid>https://dev.to/neuraldownload/6-minutes-to-finally-understand-why-postgres-keeps-winning-31i7</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=fY-pGkrLXg4" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=fY-pGkrLXg4&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You're building a RAG app. Your team says: Postgres for the data, Pinecone for the vector search. You nod — because that's what you always do, one database per job.&lt;/p&gt;

&lt;p&gt;Here's the thing nobody tells you up front: Postgres is the database that became a platform. The vector search, the geospatial queries, the time-series rollups, the fuzzy text search — all of it might already be Postgres. And three specific design decisions are what made that possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  MVCC: Readers Don't Wait for Writers
&lt;/h2&gt;

&lt;p&gt;Two transactions on the same row. One reads. One updates. Same instant.&lt;/p&gt;

&lt;p&gt;In a traditional row-lock database, one of them has to wait. In Postgres, for ordinary reads and writes, neither one waits — and the reason is one fact that almost nobody teaches in an intro.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In Postgres, an UPDATE does not change the row. It writes a new row.&lt;/strong&gt; Every row has two hidden system columns: &lt;code&gt;xmin&lt;/code&gt; (the transaction that created it) and &lt;code&gt;xmax&lt;/code&gt; (the transaction that ended its visibility by updating or deleting it). When one transaction reads and another updates the same row, each one is looking at a different version.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- txid 100: reads row. xmin=100 is visible to it.&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- txid 101 at the same instant: updates. Creates NEW row with xmin=101.&lt;/span&gt;
&lt;span class="c1"&gt;-- Old row gets stamped xmax=101.&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'new@example.com'&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both queries return without waiting. This is &lt;strong&gt;MVCC&lt;/strong&gt; — multi-version concurrency control. Postgres still uses locks for schema changes and explicit &lt;code&gt;SELECT FOR UPDATE&lt;/code&gt;, but for ordinary read-write conflicts on the same row, versioning replaces blocking.&lt;/p&gt;
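
&lt;p&gt;You don't have to take the hidden columns on faith — they're directly selectable. A quick way to watch the versioning (the &lt;code&gt;users&lt;/code&gt; table is just this article's running example):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;-- xmin and xmax are real system columns on every Postgres table.
SELECT xmin, xmax, id, email FROM users WHERE id = 42;
-- Run it from two sessions around an uncommitted UPDATE: one session sees
-- the old version (xmax stamped), the updater sees its new version (fresh xmin).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
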

&lt;p&gt;And critically: this behavior isn't reserved for the &lt;code&gt;users&lt;/code&gt; table. Every Postgres extension inherits it. Your vector search is lock-free. Your geospatial queries are lock-free. All of it, for free, from the core.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Planner: EXPLAIN ANALYZE Shows You Everything
&lt;/h2&gt;

&lt;p&gt;Paste &lt;code&gt;EXPLAIN&lt;/code&gt; in front of any query and Postgres shows you the plan it's about to execute — not the SQL you wrote, but a tree of operators (scans, joins, sorts, indexes) with estimated costs. Add &lt;code&gt;ANALYZE&lt;/code&gt;, and Postgres runs the query and fills in real timing at every node.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;ANALYZE&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'CA'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output tells you: did it use an index on &lt;code&gt;country&lt;/code&gt;? Did it hash-join or nested-loop? How many rows did each node actually produce vs. expect? You don't have to guess.&lt;/p&gt;

&lt;p&gt;The mental model: &lt;strong&gt;your SQL is the input; the plan is the output.&lt;/strong&gt; The planner builds the plan, the executor runs it. You wrote what answer you want — the planner picks how.&lt;/p&gt;

&lt;p&gt;Here's the part that matters for the platform argument. The same planner that handles a &lt;code&gt;JOIN&lt;/code&gt; on &lt;code&gt;users&lt;/code&gt; also handles a &lt;code&gt;pgvector&lt;/code&gt; nearest-neighbor query. Same tree, same operator framework. Extensions can even teach the planner how to estimate costs for their own operators. That's why, when you bolt on vector search, geospatial, or time-series, the queries feel native — to the planner, they &lt;strong&gt;are&lt;/strong&gt; native.&lt;/p&gt;

&lt;h2&gt;
  
  
  Extensions: One Engine, Many Databases
&lt;/h2&gt;

&lt;p&gt;Here's the decision that actually made Postgres what it is: &lt;code&gt;CREATE EXTENSION&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Run this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;pgvector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That one command installs new functionality into your Postgres instance (the pgvector project names its extension &lt;code&gt;vector&lt;/code&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A new type: &lt;code&gt;vector&lt;/code&gt;, for storing arrays of floats&lt;/li&gt;
&lt;li&gt;A new distance operator: &lt;code&gt;&amp;lt;-&amp;gt;&lt;/code&gt;, computing the L2 distance between two vectors&lt;/li&gt;
&lt;li&gt;New index access methods (&lt;code&gt;ivfflat&lt;/code&gt;, &lt;code&gt;hnsw&lt;/code&gt;) for nearest-neighbor search&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And it all plugs into the same engine. Same transactions. Same MVCC. Same planner.&lt;/p&gt;

&lt;p&gt;Now you can do this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;ivfflat&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector_l2_ops&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&amp;gt;&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's vector similarity search, running inside the same transaction as your &lt;code&gt;users&lt;/code&gt; table. No separate service. No separate API. No separate backup story.&lt;/p&gt;

&lt;p&gt;And this is the &lt;strong&gt;pattern&lt;/strong&gt; — not the feature. Once you see it once, you see it everywhere:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PostGIS&lt;/strong&gt; adds geometry types, spatial operators (&lt;code&gt;ST_DWithin&lt;/code&gt;, &lt;code&gt;ST_Intersects&lt;/code&gt;), GiST-based spatial indexes. Same mechanism.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pg_trgm&lt;/strong&gt; adds trigram-based fuzzy text matching, so &lt;code&gt;WHERE name % 'databse'&lt;/code&gt; matches "database". Same mechanism.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TimescaleDB&lt;/strong&gt; goes further — it layers its own chunking machinery for time-series — but it plugs into the same transactions, the same planner, the same write-ahead log.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One engine. Many databases.&lt;/p&gt;

&lt;p&gt;And because every change to core Postgres storage — whether it's an &lt;code&gt;INSERT&lt;/code&gt; into &lt;code&gt;users&lt;/code&gt; or an update to your &lt;code&gt;pgvector&lt;/code&gt; index — goes through the same write-ahead log before touching the data file, extensions built on that storage inherit the same crash-recovery story as your primary tables. Kill the process mid-write, restart, the log replays.&lt;/p&gt;

&lt;h2&gt;
  
  
  So What Do You Do With This
&lt;/h2&gt;

&lt;p&gt;Next time your team is about to add another datastore, run through four checks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Does Postgres have the data type?&lt;/li&gt;
&lt;li&gt;Does it have an index for your query pattern?&lt;/li&gt;
&lt;li&gt;Is there an extension that covers the use case?&lt;/li&gt;
&lt;li&gt;Is the latency acceptable for your workload?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If it passes all four — don't add the service yet. If it fails one, or you need extreme scale, distribution, or operational isolation — then add the separate service, intentionally.&lt;/p&gt;

&lt;p&gt;You don't have to like every database. You just have to know what Postgres already does.&lt;/p&gt;

</description>
      <category>postgres</category>
      <category>database</category>
      <category>mvcc</category>
    </item>
    <item>
      <title>How Unicode Actually Works</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Wed, 22 Apr 2026 20:25:27 +0000</pubDate>
      <link>https://dev.to/neuraldownload/how-unicode-actually-works-jdf</link>
      <guid>https://dev.to/neuraldownload/how-unicode-actually-works-jdf</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=Z_LQa_NeA8w" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=Z_LQa_NeA8w&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One emoji can have three different lengths at the same time.&lt;/p&gt;

&lt;p&gt;In UTF-8 bytes, the family emoji &lt;code&gt;👨‍👩‍👧‍👦&lt;/code&gt; is &lt;strong&gt;25&lt;/strong&gt;. In Unicode code points, it's &lt;strong&gt;7&lt;/strong&gt;. In grapheme clusters, it's &lt;strong&gt;1&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;All three answers are correct.&lt;/p&gt;

&lt;p&gt;That's the bug waiting underneath almost every piece of text handling code: we keep asking "how long is this string?" as if the question only has one meaning.&lt;/p&gt;

&lt;p&gt;Unicode exists because text actually lives in three different layers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 1: Bytes
&lt;/h2&gt;

&lt;p&gt;At the bottom, computers only store &lt;strong&gt;bytes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;ASCII was the first successful shared mapping: a byte value for &lt;code&gt;A&lt;/code&gt;, a byte value for &lt;code&gt;z&lt;/code&gt;, a byte value for space. It was clean, simple, and completely insufficient. ASCII only gave you 128 slots. Enough for English. Not enough for the world.&lt;/p&gt;

&lt;p&gt;So every region built its own encoding table. Shift-JIS in Japan. KOI8 in Russia. Latin-1 in Western Europe. Each worked locally. None agreed globally. Move text between systems and you got &lt;strong&gt;mojibake&lt;/strong&gt; — garbage symbols where words should be.&lt;/p&gt;

&lt;p&gt;Unicode fixed the &lt;em&gt;agreement&lt;/em&gt; problem by separating character identity from byte storage.&lt;/p&gt;

&lt;p&gt;UTF-8 fixed the &lt;em&gt;storage&lt;/em&gt; problem by keeping ASCII as one byte and expanding only when needed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1 byte for ASCII&lt;/li&gt;
&lt;li&gt;2 bytes for many European and Middle Eastern scripts&lt;/li&gt;
&lt;li&gt;3 bytes for most modern writing systems&lt;/li&gt;
&lt;li&gt;4 bytes for everything else, including emoji&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's why UTF-8 won. English stays compact. Old ASCII files still work. And the encoding can represent the full Unicode space.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 2: Code Points
&lt;/h2&gt;

&lt;p&gt;Unicode's core idea is almost boring:&lt;/p&gt;

&lt;p&gt;Give every character an abstract number.&lt;/p&gt;

&lt;p&gt;That's a &lt;strong&gt;code point&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;A&lt;/code&gt; is &lt;code&gt;U+0041&lt;/code&gt;. The Arabic letter alef is &lt;code&gt;U+0627&lt;/code&gt;. The Chinese character for water is &lt;code&gt;U+6C34&lt;/code&gt;. A snowflake is &lt;code&gt;U+2744&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is the level most developers &lt;em&gt;think&lt;/em&gt; they're working at when they say "character." But a code point is not the same thing as bytes, and it's not the same thing as what a human sees on screen.&lt;/p&gt;

&lt;p&gt;A code point answers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What symbol is this?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It does &lt;strong&gt;not&lt;/strong&gt; answer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How many bytes does it take in memory?&lt;/li&gt;
&lt;li&gt;How many visible characters will a user perceive?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That split is where most Unicode confusion starts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The &lt;code&gt;é&lt;/code&gt; Problem
&lt;/h2&gt;

&lt;p&gt;Take the letter &lt;code&gt;é&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It can be represented in Unicode two different ways:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;U+00E9              -&amp;gt; é
U+0065 U+0301       -&amp;gt; e + combining acute accent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Visually, they're the same.&lt;/p&gt;

&lt;p&gt;Under the hood, they are different sequences.&lt;/p&gt;

&lt;p&gt;So now all the "simple" operations stop being simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Equality checks can fail&lt;/li&gt;
&lt;li&gt;String lengths can differ&lt;/li&gt;
&lt;li&gt;Search can miss identical-looking text&lt;/li&gt;
&lt;li&gt;Cursor movement can behave strangely&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a rendering bug. It's a modeling bug. Your code assumed one visible character always equals one code point. Unicode does not make that promise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 3: Grapheme Clusters
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;grapheme cluster&lt;/strong&gt; is what a human reader experiences as one character.&lt;/p&gt;

&lt;p&gt;Sometimes that's one code point. Sometimes it's several code points working together.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;é&lt;/code&gt; example already proves it. One visible unit can be either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one precomposed code point, or&lt;/li&gt;
&lt;li&gt;two code points: base letter + combining mark&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Emoji make the same idea impossible to ignore.&lt;/p&gt;

&lt;p&gt;The family emoji &lt;code&gt;👨‍👩‍👧‍👦&lt;/code&gt; is not one atomic symbol in storage. It's a sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;man + ZWJ + woman + ZWJ + boy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;zero-width joiner&lt;/strong&gt; (&lt;code&gt;ZWJ&lt;/code&gt;) is invisible glue. It tells the renderer to combine neighboring code points into one displayed unit.&lt;/p&gt;

&lt;p&gt;So the same string now has three perfectly valid measurements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;25 bytes&lt;/strong&gt; in UTF-8&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;7 code points&lt;/strong&gt; in Unicode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1 grapheme cluster&lt;/strong&gt; on screen&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your app limits usernames by bytes, that's one answer.&lt;br&gt;
If your parser iterates code points, that's another answer.&lt;br&gt;
If your text editor moves by user-visible characters, that's a third answer.&lt;/p&gt;

&lt;p&gt;The number isn't wrong. The level is.&lt;/p&gt;
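
&lt;p&gt;All three measurements are one REPL session away — a short Python sketch; note the grapheme count leans on the third-party &lt;code&gt;regex&lt;/code&gt; package, because the standard library has no grapheme iterator:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# The same string, measured at each layer (Python 3).
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"  # 👨‍👩‍👧‍👦

print(len(family.encode("utf-8")))          # 25 — bytes on the wire
print(len(family))                          # 7  — code points

import regex                                # pip install regex
print(len(regex.findall(r"\X", family)))    # 1  — grapheme clusters
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
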

&lt;h2&gt;
  
  
  Why String APIs Feel Inconsistent
&lt;/h2&gt;

&lt;p&gt;Developers often think text APIs are inconsistent because Unicode is complicated. The real issue is that different APIs are answering different questions.&lt;/p&gt;

&lt;p&gt;One API is counting bytes because it cares about storage.&lt;br&gt;
Another is counting code points because it cares about encoded symbols.&lt;br&gt;
Another is moving over grapheme clusters because it cares about what a user sees.&lt;/p&gt;

&lt;p&gt;They're not disagreeing. They're working at different layers.&lt;/p&gt;

&lt;p&gt;Once you see the stack clearly, a lot of "Unicode weirdness" stops being weird:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;UTF-8 length bugs are byte-level bugs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;é != é&lt;/code&gt; bugs are normalization bugs&lt;/li&gt;
&lt;li&gt;Broken cursor movement is a grapheme-cluster bug&lt;/li&gt;
&lt;li&gt;Emoji limits exploding in databases are "you counted the wrong layer" bugs&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Normalization Is Not Optional
&lt;/h2&gt;

&lt;p&gt;Because Unicode allows multiple valid representations of the same visible text, serious text processing usually needs &lt;strong&gt;normalization&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The two common forms are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;NFC&lt;/strong&gt;: prefer single precomposed code points where possible&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NFD&lt;/strong&gt;: decompose into base characters plus combining marks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If two strings need to compare equal, normalize them to the same form first.&lt;/p&gt;

&lt;p&gt;Without that step, you're trusting visually identical text to also be byte-identical. That's not safe.&lt;/p&gt;
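
&lt;p&gt;In Python, normalization is one standard-library call — a minimal sketch of the trap and the fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import unicodedata

precomposed = "\u00e9"      # é as one code point
decomposed  = "e\u0301"     # e + combining acute accent — two code points

print(precomposed == decomposed)        # False: same pixels, different sequences
print(unicodedata.normalize("NFC", decomposed) == precomposed)   # True
print(len(precomposed), len(decomposed))                         # 1 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
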

&lt;h2&gt;
  
  
  The Real Mental Model
&lt;/h2&gt;

&lt;p&gt;Unicode is not "a bigger ASCII."&lt;/p&gt;

&lt;p&gt;It's a layered model:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Bytes&lt;/strong&gt; — how text is stored&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code points&lt;/strong&gt; — the abstract symbols Unicode defines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Grapheme clusters&lt;/strong&gt; — what a human actually perceives as one character&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Most production bugs happen when code silently swaps one layer for another.&lt;/p&gt;

&lt;p&gt;You ask for "character count."&lt;br&gt;
The runtime gives you code points.&lt;br&gt;
The product manager means user-visible characters.&lt;br&gt;
The database limit is actually bytes.&lt;br&gt;
Now everyone is technically correct, and the software is still broken.&lt;/p&gt;

&lt;p&gt;That's Unicode in one sentence:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Text has multiple valid lengths because text has multiple layers.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And once you internalize that, string handling stops feeling arbitrary. It starts feeling precise.&lt;/p&gt;

</description>
      <category>unicode</category>
      <category>utf8</category>
      <category>utf16</category>
      <category>codepoints</category>
    </item>
    <item>
      <title>Rate Limiting: The 4 Algorithms Behind Every 429</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Tue, 21 Apr 2026 23:03:46 +0000</pubDate>
      <link>https://dev.to/neuraldownload/rate-limiting-the-4-algorithms-behind-every-429-1bkb</link>
      <guid>https://dev.to/neuraldownload/rate-limiting-the-4-algorithms-behind-every-429-1bkb</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=H0SWt7MB0lI" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=H0SWt7MB0lI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Two terminals. Same &lt;code&gt;curl&lt;/code&gt;. Same second. One of them returns a hundred green &lt;code&gt;200 OK&lt;/code&gt; responses. The other slams red at request six. Both are valid APIs. Both send the same status code when they refuse. Behind the refusal — four completely different machines.&lt;/p&gt;

&lt;p&gt;This is what every engineer runs into and almost nobody looks at straight on. &lt;code&gt;429 Too Many Requests&lt;/code&gt; isn't a protocol. It's a signal. The machinery that &lt;em&gt;decides&lt;/em&gt; when to fire it is a design choice — and the choice is why your one-liner integration breaks at Cloudflare but sails through at Stripe.&lt;/p&gt;

&lt;p&gt;A rate limiter is really just a question: &lt;em&gt;how many requests has this client sent in the last N seconds?&lt;/em&gt; Four algorithms, four different answers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fixed window — cheap and broken
&lt;/h2&gt;

&lt;p&gt;The simplest thing that could possibly work: keep a counter per client, keyed by the current minute. Every request increments it. Past the limit, return 429. At the next minute boundary, reset to zero.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INCR  rate:alice:2026-04-21T14:05
EXPIRE rate:alice:2026-04-21T14:05 60
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One number per client. One Redis &lt;code&gt;INCR&lt;/code&gt;. Ships in ten lines. It has exactly one bug.&lt;/p&gt;

&lt;p&gt;Imagine the counter is at zero at &lt;code&gt;11:59:59.9&lt;/code&gt;. A hundred requests fire in the final tenth of a second — all allowed, and the counter hits 100. Clock ticks to &lt;code&gt;12:00:00.1&lt;/code&gt;. Counter slams back to zero. A hundred more requests fire immediately — all allowed again. Two hundred requests in two-tenths of a second under a limit of "100 per minute."&lt;/p&gt;

&lt;p&gt;Fixed window is still the cheapest thing you can run. It just leaves a door open at every minute boundary. Close that door, and you get the next algorithm.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sliding window — exact, or cheap, pick one
&lt;/h2&gt;

&lt;p&gt;Stop thinking in calendar windows. Keep a &lt;em&gt;list&lt;/em&gt;. Every request drops a timestamp on a timeline. Draw a 60-second window. Count only the timestamps inside. As time moves forward, the window slides. Old timestamps fall off the left edge.&lt;/p&gt;

&lt;p&gt;No boundary seam. Exactly right.&lt;/p&gt;

&lt;p&gt;Exactly expensive. A client at 10,000 requests per hour carries 10,000 timestamps in memory under a one-hour window. Now multiply by every client.&lt;/p&gt;

&lt;p&gt;Cloudflare faced this at scale and picked an approximation instead. Two counters per client — last minute's count and this minute's — weighted by how far you've slid into the new minute.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prev_count&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;window_remaining&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;window_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;curr_count&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Forty-two requests last minute. Eighteen so far this minute, a quarter of the way through. &lt;code&gt;42 × 0.75 + 18 = 49.5&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It isn't exact. Cloudflare measured it across 400 million of their requests anyway — wrong answer on three of every hundred thousand. Two numbers per client, close enough to right, runs on &lt;code&gt;GET&lt;/code&gt;/&lt;code&gt;SET&lt;/code&gt;/&lt;code&gt;INCR&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Token bucket — stop counting requests, count capacity
&lt;/h2&gt;

&lt;p&gt;Flip the whole mental model. Don't count what came in. Count what's left.&lt;/p&gt;

&lt;p&gt;A bucket holds tokens, capped at some capacity C. Tokens drip in at rate R per second. Every request reaches in and grabs one. Empty bucket, rejected. That's the algorithm.&lt;/p&gt;

&lt;p&gt;The interesting behavior shows up when a client sits idle. At 10 tokens per second, if they wait 10 seconds, the bucket fills to 100. Now they can fire 100 requests in a single second — every one gets a token. Then the bucket drains, and they're back to steady 10/sec.&lt;/p&gt;
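
&lt;p&gt;The whole machine fits in a dozen lines — a minimal single-process sketch; a shared deployment would keep this state somewhere like Redis, but the logic is identical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import time

class TokenBucket:
    def __init__(self, capacity, rate):
        self.capacity = capacity            # C: burst ceiling
        self.rate = rate                    # R: tokens per second
        self.tokens = float(capacity)       # start full
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Lazy refill: idle time banks tokens, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens &amp;gt;= 1:
            self.tokens -= 1                # grab a token
            return True
        return False                        # empty bucket → 429
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note there's no background timer: the bucket tops itself up from elapsed time at the moment of the next request, which is exactly what lets an idle client bank capacity.&lt;/p&gt;
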

&lt;p&gt;Sprint, then jog. That's the feature, not a bug.&lt;/p&gt;

&lt;p&gt;Stripe wants a user to be able to load a dashboard in a burst. Then idle. Then another burst. Humans and dashboards and mobile apps do not send at constant rates. Token bucket doesn't make them pretend to.&lt;/p&gt;

&lt;p&gt;The algorithm was described in 1986 by Jonathan Turner for ATM networks — 53-byte cells moving over phone lines. Now Stripe, AWS API Gateway, and countless modern APIs all use variants of it. The problem didn't change. Just the packets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Leaky bucket — the inverse twin
&lt;/h2&gt;

&lt;p&gt;Flip the bucket upside down and you get the other classical algorithm. Requests now &lt;em&gt;fill&lt;/em&gt; the bucket from the top. The bucket leaks out the bottom at a fixed rate. Once the bucket is full, the next request spills over the top — dropped.&lt;/p&gt;

&lt;p&gt;Token bucket polices. Leaky bucket &lt;em&gt;shapes&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Nginx's &lt;code&gt;limit_req&lt;/code&gt; directive is a leaky bucket. Configure it &lt;code&gt;rate=1r/s burst=5&lt;/code&gt; and five requests arriving in the same instant don't get rejected — they line up. Nginx drains them one per second to the upstream. Requests six and seven, arriving while the queue is still full, get dropped.&lt;/p&gt;

&lt;p&gt;Same mathematical family as token bucket. Different posture. Leaky bucket is what you want between your edge and a downstream that breaks under bursts.&lt;/p&gt;
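
&lt;p&gt;In code it's the token bucket with the signs flipped — a sketch of the simple drop-on-overflow meter; modeling nginx's queue-and-delay behavior would add a queue on top of this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import time

class LeakyBucket:
    def __init__(self, capacity, rate):
        self.capacity = capacity            # how much can pile up
        self.rate = rate                    # drain rate per second
        self.level = 0.0                    # start empty
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Time drains the bucket at a fixed rate, never below empty.
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level + 1 &amp;lt;= self.capacity:
            self.level += 1                 # the request pours in
            return True
        return False                        # overflow → drop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
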

&lt;p&gt;Four algorithms. One question underneath each — when does the server forget what it's counted?&lt;/p&gt;

&lt;h2&gt;
  
  
  The takeaway
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fixed window&lt;/strong&gt; forgets at the minute mark. Cheap, simple, broken at the seam.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sliding window&lt;/strong&gt; forgets as timestamps age out. Exact, or Cloudflare's 99.997%-accurate approximation with two counters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token bucket&lt;/strong&gt; doesn't count requests at all — it counts unused capacity. Sit idle, bank tokens, sprint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaky bucket&lt;/strong&gt; is the inverse — requests fill, time drains the tally, overflow drops.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same &lt;code&gt;429&lt;/code&gt;. Four completely different forgetting strategies.&lt;/p&gt;

&lt;p&gt;When you hit 429, the right question isn't &lt;em&gt;am I sending too much?&lt;/em&gt; It's &lt;em&gt;which bucket just rejected me?&lt;/em&gt; The answer tells you what to do next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If it was Stripe (token bucket), you probably bursted past capacity — wait a second and the bucket refills.&lt;/li&gt;
&lt;li&gt;If it was Cloudflare (sliding window), your last 60 seconds of traffic is the counted metric — actually slow down.&lt;/li&gt;
&lt;li&gt;If it was a legacy fixed-window limiter, you might just be bad-luck-timing the reset boundary — wait for the next minute.&lt;/li&gt;
&lt;li&gt;If nginx is shaping your traffic (leaky bucket queue), the requests aren't lost — they're queued. Expect latency, not failure.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same three digits. Four different machines. The choice of machine is the design.&lt;/p&gt;

</description>
      <category>ratelimiting</category>
      <category>ratelimiter</category>
      <category>429</category>
      <category>tokenbucket</category>
    </item>
    <item>
      <title>Protobuf: Why Google's Servers Don't Speak JSON</title>
      <dc:creator>Neural Download</dc:creator>
      <pubDate>Mon, 20 Apr 2026 21:10:56 +0000</pubDate>
      <link>https://dev.to/neuraldownload/protobuf-why-googles-servers-dont-speak-json-1k67</link>
      <guid>https://dev.to/neuraldownload/protobuf-why-googles-servers-dont-speak-json-1k67</guid>
      <description>&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=OsyKxWxGtiI" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=OsyKxWxGtiI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Your API returns a user. Three fields — an ID, a name, a bit for active. That exact payload, as compact JSON, is forty-one bytes on the wire. As protobuf, it's twelve. But compactness is the least interesting thing about protobuf. The real reason Google built it is something JSON fundamentally cannot do.&lt;/p&gt;

&lt;h2&gt;
  
  
  The JSON bill you pay on every request
&lt;/h2&gt;

&lt;p&gt;Take this payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;12345&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"Alice"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"active"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fourteen bytes of that are actual data — &lt;code&gt;12345&lt;/code&gt;, &lt;code&gt;Alice&lt;/code&gt;, &lt;code&gt;true&lt;/code&gt;. The other twenty-seven are JSON describing itself — quotes, colons, commas, and the key names &lt;code&gt;id&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, and &lt;code&gt;active&lt;/code&gt; spelled out in every single message. Every request. Every response. Forever.&lt;/p&gt;

&lt;p&gt;You can compress it. You can minify it. It still has to spell itself out on the wire because every reader has to parse a self-describing document.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real problem isn't bytes — it's fragility
&lt;/h2&gt;

&lt;p&gt;A week later you add a field: &lt;code&gt;email&lt;/code&gt;. You ship the new server. But an old client out on someone's phone still only knows three fields. It gets the new payload and hits &lt;code&gt;"email": "a@b.c"&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;What happens next depends entirely on the library. It might crash. Silently drop the field. Corrupt a nearby field. Reject the whole message. JSON itself has no opinion — there's no contract telling the client how to walk past something unknown.&lt;/p&gt;

&lt;p&gt;This is the wall Google hit at scale. And the answer wasn't a smaller JSON. It was a format where the wire itself tells the reader how to skip something it has never seen.&lt;/p&gt;

&lt;h2&gt;
  
  
  Put the schema on both sides, not in the bytes
&lt;/h2&gt;

&lt;p&gt;A protobuf message is defined by a &lt;code&gt;.proto&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;int32&lt;/span&gt;  &lt;span class="na"&gt;id&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;bool&lt;/span&gt;   &lt;span class="na"&gt;active&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That schema gets compiled into code on both sides. It is never sent over the wire. So what ends up on the wire?&lt;/p&gt;

&lt;p&gt;For &lt;code&gt;id = 12345&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One byte tag: "field one, type varint"&lt;/li&gt;
&lt;li&gt;Two bytes for 12345 as a varint&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For &lt;code&gt;name = "Alice"&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One byte tag: "field two, type length-delimited"&lt;/li&gt;
&lt;li&gt;One length byte: 5&lt;/li&gt;
&lt;li&gt;Five bytes: &lt;code&gt;A l i c e&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For &lt;code&gt;active = true&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One byte tag: "field three, type varint"&lt;/li&gt;
&lt;li&gt;One byte: 1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Twelve bytes. Same data. JSON spent twenty-seven bytes describing its own structure. Protobuf spent four — three tags and one length byte — and not one of them spells out a field name.&lt;/p&gt;

&lt;h2&gt;
  
  
  The varint — small numbers, small bytes
&lt;/h2&gt;

&lt;p&gt;That "two bytes for 12345" is a varint. Protobuf slices an integer into groups of seven bits. Each group goes into a byte. The top bit of the byte is a continuation flag — one means "more bytes coming," zero means "this is the last one." A reader walks one byte at a time and stops when the flag clears.&lt;/p&gt;

&lt;p&gt;Small numbers use one byte. Huge numbers use more. You never pay for bits you don't need.&lt;/p&gt;
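
&lt;p&gt;The encoder is short enough to sketch in Python — protobuf's real runtimes do this in lower-level languages, but the byte math is identical:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def encode_varint(n):
    # Emit 7-bit groups, least significant first; the top bit of every
    # byte except the last says "more bytes coming."
    out = bytearray()
    while True:
        group, n = n &amp;amp; 0x7F, n &amp;gt;&amp;gt; 7
        if n:
            out.append(group | 0x80)        # continuation flag set
        else:
            out.append(group)
            return bytes(out)

print(encode_varint(1).hex())        # 01   — one byte
print(encode_varint(300).hex())      # ac02
print(encode_varint(12345).hex())    # b960 — the two bytes from the example
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
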

&lt;h2&gt;
  
  
  The tag byte is doing two jobs
&lt;/h2&gt;

&lt;p&gt;That "one byte tag" is where the magic lives. The bottom three bits encode the &lt;em&gt;wire type&lt;/em&gt;. The remaining bits encode the &lt;em&gt;field number&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cpp"&gt;&lt;code&gt;&lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field_number&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;wire_type&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wire types are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;0&lt;/code&gt; — varint (int32, int64, bool, enum)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;1&lt;/code&gt; — fixed 64-bit (double, fixed64)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;2&lt;/code&gt; — length-delimited (string, bytes, sub-message)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;5&lt;/code&gt; — fixed 32-bit (float, fixed32)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This split is what makes protobuf evolvable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Unknown field? The wire type tells you how to skip
&lt;/h2&gt;

&lt;p&gt;Back to the schema evolution scenario. Old client, new message with an extra &lt;code&gt;email&lt;/code&gt; field. Watch the parser:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tag → field 1, varint. Known. Parse ID.&lt;/li&gt;
&lt;li&gt;Tag → field 2, length-delimited. Known. Parse name.&lt;/li&gt;
&lt;li&gt;Tag → field 3, varint. Known. Parse active.&lt;/li&gt;
&lt;li&gt;Tag → field 4. Never heard of it. But wire type says length-delimited — so read the length, skip that many bytes, continue.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No crash. No guess. No corruption. The unknown field just passes through. Some runtimes will even preserve the raw bytes of unknown fields so the client can re-serialize the message and send it back untouched.&lt;/p&gt;

&lt;p&gt;You can add fields. You can rename fields — the name was never on the wire. You can remove a field and the reader just keeps going. The one rule: don't reuse a field number for a different meaning, which is why every protobuf schema pins a number next to every field.&lt;/p&gt;
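
&lt;p&gt;Here's that walk as runnable Python — a sketch covering the two wire types in this example; the fixed 32- and 64-bit types skip a constant four or eight bytes the same way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def decode_varint(buf, i):
    # Read one varint starting at buf[i]; return (value, next index).
    shift = value = 0
    while True:
        b = buf[i]; i += 1
        value |= (b &amp;amp; 0x7F) &amp;lt;&amp;lt; shift
        if not b &amp;amp; 0x80:
            return value, i
        shift += 7

KNOWN = {1: "id", 2: "name", 3: "active"}        # the old client's schema

def walk(buf):
    i = 0
    while i &amp;lt; len(buf):
        tag, i = decode_varint(buf, i)
        field, wire_type = tag &amp;gt;&amp;gt; 3, tag &amp;amp; 0x07
        if wire_type == 0:                            # varint
            value, i = decode_varint(buf, i)
        elif wire_type == 2:                          # length-delimited
            length, i = decode_varint(buf, i)
            value, i = buf[i:i + length], i + length  # known or not, skippable
        else:
            raise NotImplementedError("fixed32/64 left out of this sketch")
        print(KNOWN.get(field, "unknown field %d" % field), "=", value)

# The 12-byte User message plus an unknown field 4 carrying "a@b.c":
walk(bytes.fromhex("08b9601205416c696365180122056140622e63"))
# id = 12345 / name = b'Alice' / active = 1 / unknown field 4 = b'a@b.c'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
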

&lt;h2&gt;
  
  
  What's actually on the line under your gRPC calls
&lt;/h2&gt;

&lt;p&gt;Every gRPC call on the planet uses this format. It's what moves messages inside Kubernetes service meshes. It's on the wire between Google's own data centers. It's how Android push notifications get to your phone.&lt;/p&gt;

&lt;p&gt;Not because twelve bytes is smaller than forty-one. Because the bytes were designed so tomorrow's schema can't break yesterday's code.&lt;/p&gt;

</description>
      <category>protobuf</category>
      <category>protocolbuffers</category>
      <category>grpc</category>
      <category>wireformat</category>
    </item>
  </channel>
</rss>
