<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jonathan Murray</title>
    <description>The latest articles on DEV Community by Jonathan Murray (@jon_at_backboardio).</description>
    <link>https://dev.to/jon_at_backboardio</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3824580%2Fcbf3ef23-2d0b-4576-90ff-0d46b2119ea8.png</url>
      <title>DEV Community: Jonathan Murray</title>
      <link>https://dev.to/jon_at_backboardio</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jon_at_backboardio"/>
    <language>en</language>
    <item>
      <title>OpenAI and Anthropic are Friendster and MySpace, if Subquadratic proves to be true.</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Wed, 06 May 2026 15:24:34 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/openai-and-anthropic-are-friendster-and-myspace-if-subquadratic-proves-to-be-true-nb6</link>
      <guid>https://dev.to/jon_at_backboardio/openai-and-anthropic-are-friendster-and-myspace-if-subquadratic-proves-to-be-true-nb6</guid>
      <description>&lt;p&gt;If you've ever shipped an LLM-powered feature that needed to reason over a real codebase, a real contract, or a real research corpus, you already know the shape of the problem. The model technically accepts a million tokens of context. In practice, the answers get worse as the context gets longer, and your infra bill gets worse faster than that.&lt;/p&gt;

&lt;p&gt;SubQ, the model at the center of this post, is built around &lt;strong&gt;SSA (Subquadratic Sparse Attention)&lt;/strong&gt;, a linearly scaling attention mechanism designed for long-context retrieval, reasoning, and software engineering workloads. The technical results are strong on their own merits: a 52.2× prefill speedup at 1M tokens, 95.0% on RULER, 65.9% on MRCR v2, and 81.8% on SWE-Bench Verified.&lt;/p&gt;

&lt;p&gt;But the more interesting question is what happens to the industry if results like these stop being a one-off. The valuations, pricing, and competitive narrative around the major labs have been priced as if compute is the moat — as if maximizing token use and burning more dollars per call is the cost of doing business at the frontier. SSA is one of the first credible signals that this might not be true for much longer. And if it isn't, the OpenAIs and Anthropics of today look less like permanent fixtures and more like the Friendsters and MySpaces of the next platform shift.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem isn't "missing context." It's fragmented context.
&lt;/h2&gt;

&lt;p&gt;The hard problems enterprise AI needs to solve are long-context problems. Codebases, contracts, enterprise corpora, databases, spreadsheets, research collections, and long-running agent sessions rarely fail because the answer is &lt;em&gt;absent&lt;/em&gt;. They fail because the relevant evidence is distributed across a large body of context, referenced indirectly, and only meaningful when multiple pieces are held in view at once.&lt;/p&gt;

&lt;p&gt;If you build with these systems, this list will look familiar:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a codebase where a function is defined in one module, called in dozens of others, and constrained by tests elsewhere&lt;/li&gt;
&lt;li&gt;a contract where an obligation depends on a definition, an exception, and a referenced clause several pages apart&lt;/li&gt;
&lt;li&gt;a research workflow where a conclusion depends on reconciling evidence across many papers&lt;/li&gt;
&lt;li&gt;a long-running coding task where prior planning decisions, intermediate edits, review notes, and regressions all matter&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These aren't lookup problems. They're multi-hop reasoning problems over fragmented corpora. And the workarounds we've been using — chunking, RAG, agentic decomposition, recursive summarization — all have the same shape. They preserve some signal and lose some signal. RAG keeps semantic similarity but loses position, hierarchy, neighboring context, and reference structure. Agentic workflows decompose tasks into smaller calls but compound errors across steps and bake hand-authored orchestration policy into the system. The bitter lesson keeps showing up: scaffolding that works today doesn't generalize tomorrow.&lt;/p&gt;

&lt;p&gt;SSA is an attempt to remove more of the reason that scaffolding is necessary in the first place.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why dense attention is the bottleneck
&lt;/h2&gt;

&lt;p&gt;Attention is a retrieval operation built into the model. Each token acts as a query, compares itself against every other token, scores their relevance, and aggregates their information into its next representation. Powerful, because every token gets access to the full context. Expensive, for the exact same reason — every query compares against every key, and the cost grows quadratically with sequence length.&lt;/p&gt;

&lt;p&gt;At small contexts this is fine. At hundreds of thousands to millions of tokens, it becomes the dominant constraint. Doubling context doesn't double cost; it quadruples it.&lt;/p&gt;

&lt;p&gt;And here's the part that should bother any engineer: most of that work is wasted. In trained models, the vast majority of attention weights are near zero. The model performs the full all-pairs comparison, but only a small fraction of those interactions meaningfully influence the output. Dense attention isn't just quadratic — it's &lt;em&gt;wastefully&lt;/em&gt; quadratic.&lt;/p&gt;
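&lt;p&gt;The scaling claim is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below is illustrative only; the head dimension and the factor of two are assumptions, not measurements from any real model.&lt;/p&gt;

```python
def dense_attention_flops(n, d=128):
    """Rough FLOP count for one dense attention head:
    Q·K^T scoring plus the weighted sum over V, both ~n^2·d."""
    return 2 * n * n * d

# Doubling the context quadruples the attention cost.
ratio = dense_attention_flops(200_000) / dense_attention_flops(100_000)
print(ratio)  # 4.0
```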

&lt;p&gt;FlashAttention made this much more practical at today's context lengths by avoiding materialization of the full attention matrix and optimizing memory movement. That's a real win. But it doesn't change the underlying scaling. The number of comparisons is still the same. The model still does quadratic work; it just does that work more efficiently.&lt;/p&gt;

&lt;p&gt;System-level workarounds — retrieval pipelines, context compaction, recursive decomposition, agentic orchestration — make dense-attention systems usable. None of them change the scaling law. They route around the limitation. The quadratic cost is the boundary they're routing around.&lt;/p&gt;

&lt;h2&gt;
  
  
  What prior efficient architectures gave up
&lt;/h2&gt;

&lt;p&gt;The field has spent years trying to make attention cheaper. The hard part isn't reducing cost. It's reducing cost without breaking retrieval. Every prior approach traded something away.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fixed-pattern sparse attention&lt;/strong&gt; — sliding windows, strided patterns, dilated masks — gets subquadratic scaling by deciding &lt;em&gt;in advance&lt;/em&gt; which positions a token can attend to. The routing decision is positional, not content-aware. The model decides where to look before it knows what it's looking for. When the relevant information falls outside the pattern, it's invisible.&lt;/p&gt;
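&lt;p&gt;A minimal sketch of what "positional, not content-aware" means in practice: the mask below is fixed before the model ever sees the content, so anything outside the window is simply unreachable.&lt;/p&gt;

```python
import numpy as np

def sliding_window_mask(n, window):
    """Causal sliding-window mask: token i may attend only to
    positions i-window+1 .. i, chosen purely by position."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return (j <= i) & (j > i - window)

mask = sliding_window_mask(8, window=3)
# Token 7 can never see token 0, however relevant token 0 is.
print(bool(mask[7, 0]), bool(mask[7, 6]))  # False True
```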

&lt;p&gt;&lt;strong&gt;State space models and recurrent alternatives&lt;/strong&gt; drop the all-pairs comparison entirely, replacing it with a compressed state that evolves across the sequence. Linear scaling by construction — but the state has fixed capacity. Information gets summarized, blurred, or discarded as the sequence grows. Great at gist and structure, weaker at retrieving a specific fact introduced arbitrarily far back.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hybrid architectures&lt;/strong&gt; combine both ideas: efficient layers do most of the compute, dense attention layers preserve retrieval. Works in practice, but the dense layers stay load-bearing. As context grows, their quadratic cost dominates again. The benefit is scalar, not asymptotic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DeepSeek Sparse Attention&lt;/strong&gt; offloads attention's quadratic cost onto a "lightning indexer" that selects, per query, which keys to attend to. But the indexer is itself quadratic — it scores every query against every key with smaller constants but the same O(n²) scaling. The complexity has been moved, not removed.&lt;/p&gt;

&lt;p&gt;The pattern is consistent. Fixed sparsity gives up content-dependent routing. Recurrent models give up exact retrieval. Hybrids reintroduce the original cost. DeepSeek-style indexers stay quadratic and become cost-prohibitive at scale.&lt;/p&gt;

&lt;p&gt;The open problem isn't "make attention faster." It's: build a mechanism that's efficient, content-dependent, &lt;strong&gt;and&lt;/strong&gt; capable of retrieving from arbitrary positions across long context.&lt;/p&gt;

&lt;h2&gt;
  
  
  How SSA works
&lt;/h2&gt;

&lt;p&gt;SSA changes how attention work is allocated. The core idea is &lt;strong&gt;content-dependent selection&lt;/strong&gt;: for each query, the model selects which parts of the sequence are worth attending to, and computes attention exactly over those positions.&lt;/p&gt;

&lt;p&gt;Dense attention assumes every pair might matter, so it evaluates all of them. In practice, almost none do. SSA drops that assumption. It doesn't approximate attention — it restricts attention to the positions that actually carry signal, and skips the rest.&lt;/p&gt;

&lt;p&gt;That gives SSA three properties that matter together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Linear scaling in compute and memory.&lt;/strong&gt; Attention cost grows with the number of selected positions, not the full sequence. Long context becomes economically usable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content-dependent routing.&lt;/strong&gt; The model decides where to look based on meaning, not position. Relevant information can be retrieved regardless of where it appears.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sparse retrieval from arbitrary positions.&lt;/strong&gt; Unlike recurrent or compressed approaches, SSA preserves the ability to recover specific information introduced far earlier in the sequence.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The practical distinction matters: SSA is not just a faster implementation of dense attention. It reduces the &lt;em&gt;amount&lt;/em&gt; of attention work the model performs. That reduction is what shows up as speed.&lt;/p&gt;
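&lt;p&gt;The post doesn't publish SSA's selection mechanism, so treat the following as a generic sketch of the &lt;em&gt;idea&lt;/em&gt; of content-dependent sparse attention: score, select a small budget of positions per query, then compute exact softmax attention over only the selected subset. (A real subquadratic kernel must also make the selection step itself cheaper than all-pairs scoring, which this toy version does not.)&lt;/p&gt;

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Per query: score all keys, keep the top-k by content,
    then compute exact softmax attention over that subset only.
    Toy illustration; the scoring here is still all-pairs."""
    n, d = Q.shape
    out = np.zeros_like(V)
    scores = Q @ K.T / np.sqrt(d)                  # selection scores
    for i in range(n):
        idx = np.argpartition(scores[i], -k)[-k:]  # top-k positions
        s = scores[i, idx]
        w = np.exp(s - s.max())
        w /= w.sum()                               # exact softmax over subset
        out[i] = w @ V[idx]
    return out

rng = np.random.default_rng(0)
n, d, k = 64, 16, 8
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = topk_sparse_attention(Q, K, V, k)
print(out.shape)  # (64, 16)
```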

&lt;p&gt;Measured in wall-clock input processing time on B200s, SSA achieves the following speedups over standard attention with FlashAttention-2 (FlashAttention-3 did not produce a speedup over FA-2 on B200s):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Context length&lt;/th&gt;
&lt;th&gt;SSA speed increase vs. FlashAttention&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;7.2×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;13.2×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;512K&lt;/td&gt;
&lt;td&gt;23.0×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1M&lt;/td&gt;
&lt;td&gt;52.2×&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This is the throughput inversion that matters in production. Dense attention becomes &lt;em&gt;slower&lt;/em&gt; relative to SSA as context grows. SSA gets more advantageous exactly where long-context workloads become most valuable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Training SSA for long-context behavior
&lt;/h2&gt;

&lt;p&gt;Architecture is necessary but not sufficient. A model can have a long context window and still fail to use it well. SSA was trained to make long-context use reliable, not just possible.&lt;/p&gt;

&lt;p&gt;The training pipeline is three stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Pre-training&lt;/strong&gt; establishes base language modeling capability and the long-context representations the selection mechanism uses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supervised fine-tuning&lt;/strong&gt; shapes behavior toward instruction following, structured reasoning, and the code generation patterns enterprise workloads need.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reinforcement learning&lt;/strong&gt; targets the behaviors that are hardest to induce through supervised examples: reliable long-context retrieval, and coding behavior that uses the available context aggressively instead of defaulting to local reasoning.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last stage is the one developers should care about. Long-context failures often look &lt;em&gt;plausible&lt;/em&gt;. A model answers from nearby context because nearby evidence is easier to use, even when the decisive evidence is much earlier. It produces a locally correct patch that violates an interface defined elsewhere. It summarizes a prior decision instead of preserving the exact constraint that should govern a later step. SSA's RL stage is designed around exactly those failure modes.&lt;/p&gt;

&lt;p&gt;Training data emphasizes long-form sources with high information density and cross-reference structure — the kind of data that forces the selection mechanism to learn routing over large positional distances. The goal isn't benchmark memorization. It's teaching the model to attend to what matters regardless of where it sits.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the training infrastructure matters too
&lt;/h2&gt;

&lt;p&gt;Long-context training isn't only a modeling problem. It's a systems problem that only shows up at scale. At million-token sequence lengths, failure modes that are invisible at shorter contexts become binding — memory pressure, sequence partitioning across devices, gradient instability, numerical precision, kernel efficiency. These determine whether training runs at all.&lt;/p&gt;

&lt;p&gt;The SSA training stack runs stably at 1M tokens and beyond, maintains linear memory scaling across the training pipeline, and uses distributed sequence parallelism to shard sequences across devices when they exceed single-device limits.&lt;/p&gt;
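&lt;p&gt;The sequence-parallel piece is conceptually simple even if the engineering isn't. Here is a hypothetical sketch of the data layout only; the real system also has to exchange selected KV blocks between shards, which is where the hard work lives.&lt;/p&gt;

```python
def shard_sequence(tokens, n_devices):
    """Split one long sequence into contiguous, near-equal shards,
    one per device, so no single device holds the full sequence."""
    per = -(-len(tokens) // n_devices)  # ceiling division
    return [tokens[i * per:(i + 1) * per] for i in range(n_devices)]

shards = shard_sequence(range(1_000_000), 8)
print(len(shards), len(shards[0]))  # 8 125000
```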

&lt;p&gt;The consequence isn't just that long-context training becomes possible. It becomes &lt;strong&gt;iterable&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Under dense attention, long-context experiments are expensive enough that they get treated as reserved runs. With SSA's linear scaling, they become routine. More ablations, more evaluations, faster feedback, targeted fixes on the behaviors that actually matter at long context.&lt;/p&gt;

&lt;p&gt;That's the deeper implication. SSA doesn't only reduce the cost of inference. It reduces the cost of &lt;em&gt;learning&lt;/em&gt; long-context behavior in the first place — and that's the thing that compounds for developers downstream.&lt;/p&gt;

&lt;h2&gt;
  
  
  Evaluating functional context, not nominal context
&lt;/h2&gt;

&lt;p&gt;An advertised context window doesn't tell you how much context a model can use. The real question is whether the model can retrieve, connect, and reason over evidence distributed across that window.&lt;/p&gt;

&lt;p&gt;SubQ is evaluated across two axes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deployment viability&lt;/strong&gt; — compute reduction and wall-clock speed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval capability&lt;/strong&gt; — RULER, MRCR v2, and SWE-Bench Verified&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Needle-in-a-Haystack tests exact retrieval of a single target. RULER extends that to multi-hop retrieval, aggregation, variable tracking, and selective filtering. MRCR v2 goes further: the model must locate and integrate multiple pieces of evidence distributed across the context, where the relevant set isn't given in advance. That's closer to the shape of real work — finding one fact isn't enough; the model has to determine which pieces matter and combine them into a coherent answer. More general benchmarks will be published in the upcoming model card.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Compute and speed
&lt;/h3&gt;

&lt;p&gt;SSA's linear scaling means doubling context length doubles attention compute, rather than quadrupling it. At 1M tokens, that's a &lt;strong&gt;62.5× attention FLOP reduction&lt;/strong&gt; relative to standard quadratic attention.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Context length&lt;/th&gt;
&lt;th&gt;Attention FLOP reduction vs. standard attention&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;8×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1M&lt;/td&gt;
&lt;td&gt;62.5×&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
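&lt;p&gt;Worth noting: the two reported reductions are mutually consistent with a &lt;em&gt;fixed&lt;/em&gt; per-query attention budget of roughly 16K positions (128K / 8 = 1M / 62.5 = 16K). That budget is my inference from the table, not a published number.&lt;/p&gt;

```python
# Dense attention: ~n^2 comparisons. A fixed per-query budget b: ~n·b.
# The FLOP reduction is then n^2 / (n·b) = n / b.
BUDGET = 16_000  # hypothetical budget implied by the reported ratios

def flop_reduction(n, budget=BUDGET):
    return n / budget

print(flop_reduction(128_000))    # 8.0
print(flop_reduction(1_000_000))  # 62.5
```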

&lt;p&gt;Wall-clock speed is the more product-relevant result: a &lt;strong&gt;52.2× prefill speedup&lt;/strong&gt; over dense attention at 1M tokens. That's the difference between a long-context system that behaves like an interactive tool and one that feels like an offline batch job.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Context length&lt;/th&gt;
&lt;th&gt;Input processing speed increase&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;128K&lt;/td&gt;
&lt;td&gt;7.2×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;256K&lt;/td&gt;
&lt;td&gt;13.2×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;512K&lt;/td&gt;
&lt;td&gt;23.0×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1M&lt;/td&gt;
&lt;td&gt;52.2×&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  RULER
&lt;/h3&gt;

&lt;p&gt;RULER tests retrieval and reasoning beyond simple needle lookup — multi-hop retrieval, aggregation, variable tracking, selective filtering.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;RULER @ 128K&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SSA / SubQ&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;95.0%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.6&lt;/td&gt;
&lt;td&gt;94.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For real workflows this matters because multi-hop tasks compound. A missed reference early in the chain can corrupt every conclusion downstream.&lt;/p&gt;

&lt;h3&gt;
  
  
  MRCR v2
&lt;/h3&gt;

&lt;p&gt;MRCR v2 is the most demanding retrieval benchmark in this set. It evaluates the ability to locate and integrate multiple non-adjacent pieces of evidence across long context.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;MRCR v2 score&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SSA / SubQ&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;65.9%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.6&lt;/td&gt;
&lt;td&gt;78.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT 5.5&lt;/td&gt;
&lt;td&gt;74.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT 5.4&lt;/td&gt;
&lt;td&gt;36.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.7&lt;/td&gt;
&lt;td&gt;32.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 3.1 Pro&lt;/td&gt;
&lt;td&gt;26.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;SubQ lands at 65.9% — below Opus 4.6 and GPT 5.5, but well ahead of GPT 5.4, Opus 4.7, and Gemini 3.1 Pro. That's the clearest evidence for the gap between &lt;em&gt;nominal&lt;/em&gt; and &lt;em&gt;functional&lt;/em&gt; context. A model can accept a long input and still fail to reason reliably over that input. MRCR v2 surfaces the gap because it requires retrieval &lt;em&gt;and&lt;/em&gt; combination, not just token processing.&lt;/p&gt;

&lt;h3&gt;
  
  
  SWE-Bench Verified
&lt;/h3&gt;

&lt;p&gt;SWE-Bench Verified is an end-to-end software engineering benchmark on real GitHub issues. Not a pure retrieval test — it asks whether the model can use codebase understanding to localize bugs, reason about implementation constraints, and produce patches.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;SWE-Bench Verified&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SSA / SubQ&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;81.8%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.7&lt;/td&gt;
&lt;td&gt;87.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.6&lt;/td&gt;
&lt;td&gt;80.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 3.1 Pro&lt;/td&gt;
&lt;td&gt;80.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT 5.4&lt;/td&gt;
&lt;td&gt;not reported&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPT 5.5&lt;/td&gt;
&lt;td&gt;not reported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Sitting at 81.8% — behind only Opus 4.7, and ahead of Opus 4.6 and Gemini 3.1 Pro on a real-world coding benchmark, while running on a subquadratic architecture — is the result that should land hardest for developers. This is the workload most of us actually care about.&lt;/p&gt;

&lt;h2&gt;
  
  
  The part nobody priced in
&lt;/h2&gt;

&lt;p&gt;Step back from the architecture for a second and look at what the current AI industry is actually selling.&lt;/p&gt;

&lt;p&gt;The valuations, the capex, the data center buildouts, the multi-year compute contracts — all of it is underwritten by an assumption that frontier intelligence requires frontier-scale spend. Long context costs a lot. Reasoning costs a lot. Agents cost a lot. The premise running through every pitch deck and earnings call is that the labs with the most GPUs win, and the rest of the market pays for tokens at whatever margin those labs choose.&lt;/p&gt;

&lt;p&gt;SSA is one architecture, on one model, with one set of benchmarks. But the result it points at is uncomfortable for that premise: &lt;strong&gt;the dominant cost of long-context inference may not be a law of physics — it may be an artifact of dense attention.&lt;/strong&gt; A 52.2× prefill speedup at 1M tokens isn't a 10% efficiency gain. It is the kind of step-change that, if it generalizes, rewrites the unit economics of the entire industry.&lt;/p&gt;

&lt;p&gt;If you don't have to maximize tokens consumed and dollars burned to get frontier-quality long-context behavior, a lot of the moat narrative collapses with it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the incumbents look more fragile than they're priced
&lt;/h2&gt;

&lt;p&gt;The Friendster and MySpace comparison isn't snark — it's a specific lesson. Both had network effects. Both had brand. Both had scale advantages that looked durable right up until a better-architected product showed up and the users moved over a weekend. The moat people &lt;em&gt;talked&lt;/em&gt; about (network effects, switching costs) turned out to be much weaker than the moat that actually mattered (a better product on a better stack).&lt;/p&gt;

&lt;p&gt;The current frontier labs have a similar mismatch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API-level switching cost is near zero.&lt;/strong&gt; Most production code paths abstract the model behind a thin client. Swapping providers is a config change, not a migration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute scarcity is the moat people brag about.&lt;/strong&gt; It is also the moat that subquadratic architectures attack first. If a challenger can match frontier quality at a fraction of the FLOPs, the capex advantage flips into a capex &lt;em&gt;liability&lt;/em&gt; — billions of dollars of GPU contracts depreciating against a more efficient successor.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pricing power assumes scarcity.&lt;/strong&gt; Today's per-token prices for long context look reasonable because the underlying compute is genuinely expensive. Drop the cost of a 1M-token prefill by 50× and the same prices start looking like rent extraction, not value capture.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brand isn't a defense once parity exists.&lt;/strong&gt; "Nobody got fired for buying OpenAI" works until a model with comparable benchmarks costs an order of magnitude less to serve. Then it works against them, the same way "nobody got fired for choosing IBM" did.&lt;/li&gt;
&lt;/ul&gt;
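&lt;p&gt;The "config change" point deserves to be concrete. If your call sites only ever see a thin client, moving between vendors is an environment variable. All names below are hypothetical.&lt;/p&gt;

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class LLMConfig:
    """Everything provider-specific lives here; call sites never see it."""
    base_url: str
    model: str

# Hypothetical registry: swapping providers is a one-line env change,
# not a code migration, when every vendor speaks a common chat API.
PROVIDERS = {
    "incumbent":  LLMConfig("https://api.incumbent.example/v1", "frontier-dense-1"),
    "challenger": LLMConfig("https://api.challenger.example/v1", "subquadratic-1"),
}

cfg = PROVIDERS[os.environ.get("LLM_PROVIDER", "incumbent")]
print(cfg.model)
```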

&lt;p&gt;This isn't a prediction that any specific lab disappears. Anthropic, OpenAI, and Google have real assets — distribution, talent, training data, alignment research, regulatory relationships. Those don't evaporate. But the &lt;em&gt;valuations&lt;/em&gt; and the &lt;em&gt;pricing power&lt;/em&gt; are built on the assumption that frontier compute is a stable moat, and that assumption depends on dense attention staying expensive.&lt;/p&gt;

&lt;p&gt;SSA is one of the first credible signals that it might not.&lt;/p&gt;

&lt;h2&gt;
  
  
  What developers should actually take away
&lt;/h2&gt;

&lt;p&gt;Strip out the industry analysis and the practical takeaways for anyone building on top of these systems are pretty clean:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Long context as a product surface is about to get a lot cheaper and a lot better.&lt;/strong&gt; If you've been deferring long-context features because the economics didn't pencil, the economics are about to pencil.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A nominal context window has never told you what a model can actually use.&lt;/strong&gt; RULER 95.0% and MRCR v2 65.9% on a subquadratic architecture is the gap between marketing tokens and functional tokens, and that gap is closing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Less hand-authored scaffolding.&lt;/strong&gt; Chunking, recursive summarization, and bespoke orchestration are workarounds for an attention bottleneck. As that bottleneck loosens, the scaffolding becomes a maintenance burden rather than an asset.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch where the open and challenger labs go next.&lt;/strong&gt; Efficient architectures disproportionately benefit teams that don't already own a hyperscaler-sized GPU fleet. The next frontier-quality model that runs cheaply on commodity infra is the one to track.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't lock into long-term commitments priced on dense-attention economics.&lt;/strong&gt; Multi-year contracts written against today's per-token costs are the riskiest thing on the table if a successor architecture cuts those costs by an order of magnitude.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SSA on its own is one paper, one architecture, one set of numbers. The reason it's worth paying attention to is what it implies if the result is real and replicable: the AI bubble's tightest correlation — bigger spend, better model — gets a lot weaker. That's good for developers, good for customers, and meaningfully bad for any incumbent whose story to investors depends on the old curve holding.&lt;/p&gt;

&lt;p&gt;The Friendsters and MySpaces of this cycle won't lose because their products got worse. They'll lose because someone shows up with a better-architected stack at a fraction of the cost, and the switching cost turns out to have been a config flag the whole time.&lt;/p&gt;

&lt;p&gt;Worth watching.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>discuss</category>
      <category>llm</category>
    </item>
    <item>
      <title>Very cool use of Backboard!</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Sat, 02 May 2026 02:29:45 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/very-cool-use-of-backboard-2e38</link>
      <guid>https://dev.to/jon_at_backboardio/very-cool-use-of-backboard-2e38</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/arqamwd/terra-triage-i-built-a-3-agent-wildlife-dispatcher-that-learns-from-every-referral-efk" class="crayons-story__hidden-navigation-link"&gt;Terra Triage: I Built a 3-Agent Wildlife Dispatcher That Learns From Every Referral&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
      &lt;a href="https://dev.to/arqamwd/terra-triage-i-built-a-3-agent-wildlife-dispatcher-that-learns-from-every-referral-efk" class="crayons-article__context-note crayons-article__context-note__feed"&gt;&lt;p&gt;DEV Weekend Challenge: Earth Day&lt;/p&gt;

&lt;/a&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/arqamwd" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3760002%2Feb94d8d9-e8ef-4932-ab99-d07a12fe197b.jpeg" alt="arqamwd profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/arqamwd" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Arqam Waheed
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Arqam Waheed
                &lt;a href="/++"&gt;&lt;img alt="Subscriber" class="subscription-icon" src="https://assets.dev.to/assets/subscription-icon-805dfa7ac7dd660f07ed8d654877270825b07a92a03841aa99a1093bd00431b2.png"&gt;&lt;/a&gt;
              
              &lt;div id="story-author-preview-content-3523816" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/arqamwd" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3760002%2Feb94d8d9-e8ef-4932-ab99-d07a12fe197b.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Arqam Waheed&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/arqamwd/terra-triage-i-built-a-3-agent-wildlife-dispatcher-that-learns-from-every-referral-efk" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Apr 20&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/arqamwd/terra-triage-i-built-a-3-agent-wildlife-dispatcher-that-learns-from-every-referral-efk" id="article-link-3523816"&gt;
          Terra Triage: I Built a 3-Agent Wildlife Dispatcher That Learns From Every Referral
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/devchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;devchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/weekendchallenge"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;weekendchallenge&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/backboard"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;backboard&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/arqamwd/terra-triage-i-built-a-3-agent-wildlife-dispatcher-that-learns-from-every-referral-efk" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/raised-hands-74b2099fd66a39f2d7eed9305ee0f4553df0eb7b4f11b01b6b1b499973048fe5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;24&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/arqamwd/terra-triage-i-built-a-3-agent-wildlife-dispatcher-that-learns-from-every-referral-efk#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              3&lt;span class="hidden s:inline"&gt; comments&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            9 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
      <category>agents</category>
      <category>ai</category>
      <category>automation</category>
      <category>showdev</category>
    </item>
    <item>
      <title>"Of Course" Erodes Trust Faster Than Bad Code ... Two Words That Are Killing Your Career</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Thu, 30 Apr 2026 18:38:04 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/of-course-erodes-trust-faster-than-bad-code-two-words-that-are-killing-your-career-2h59</link>
      <guid>https://dev.to/jon_at_backboardio/of-course-erodes-trust-faster-than-bad-code-two-words-that-are-killing-your-career-2h59</guid>
      <description>&lt;p&gt;You already have the job or the internship. You're on the team. You're in the meetings. You're in the Slack channels.&lt;/p&gt;

&lt;p&gt;And the thing that's going to hold you back has nothing to do with your code.&lt;/p&gt;

&lt;p&gt;Someone you work with says "hey, could we do X?" and you say "yeah, of course." Feels confident. Feels like you just proved you've got it.&lt;/p&gt;

&lt;p&gt;But you gave an answer that contains zero information. No cost. No timeline. No tradeoff. No indication of whether you even understood the question. And now you're either about to disappear for two weeks and come back with something nobody asked for, or pull an all-nighter for something that was just a question, not a request.&lt;/p&gt;

&lt;p&gt;Both started with "of course."&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I'm Writing This
&lt;/h2&gt;

&lt;p&gt;I'm a non-technical founder. I don't write the code. But I build alongside my team every day. I set direction, I think through problems, I get my hands dirty in the product.&lt;/p&gt;

&lt;p&gt;The devs who accelerated fastest on my team were never the ones who said yes the fastest. They were the ones who slowed down long enough to make sure we were talking about the same thing. The ones who said "of course" to everything burned out, shipped the wrong thing, and lost trust. Not because they weren't talented, but because they never gave anyone a chance to actually collaborate with them.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Whisper That Sounds Like a Yell
&lt;/h2&gt;

&lt;p&gt;Not everything a founder or lead says carries the same weight. But it doesn't always feel that way from your side. When the person steering the ship says "hey what if we tried this," it can land like a mandate even when it's just a thought.&lt;/p&gt;

&lt;p&gt;So before you go heads-down for 48 hours on something mentioned in a 5-minute conversation, ask:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Is this urgent or is this something we should plan for?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Are you asking me to build this or are you asking if it's possible?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Where does this sit relative to what I'm working on right now?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;And yeah, sometimes the answer is going to be "yes, it's urgent, do it right now, and please don't ask me any more questions." That happens. But even that is better than the silence you were working inside of before. That five-second question just saved you from building the wrong thing at the wrong pace.&lt;/p&gt;




&lt;h2&gt;
  
  
  "Of Course" Erodes Trust Faster Than Bad Code
&lt;/h2&gt;

&lt;p&gt;You say "of course." You go dark. A week passes. Someone checks in. The thing isn't done, or it's half-done, or it's not what was asked for. The people around you start second-guessing every "of course" that comes after it.&lt;/p&gt;

&lt;p&gt;That didn't happen because you're a bad developer. It happened because you skipped making sure everyone was on the same page before you started building.&lt;/p&gt;

&lt;p&gt;If you're stuck, say so. If it's more complex than expected, flag it. If the original ask doesn't make sense technically, speak up. "That won't work because of X, but here's what would" is one of the most valuable sentences in engineering.&lt;/p&gt;

&lt;p&gt;If you can build but you can't communicate what you're building, why you built it that way, and what could go wrong, you are operating at half your potential.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Best Devs Actually Sound Like
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;"I want to make sure I understand what you're looking for before I start."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"That's a cool idea. Here's what it would take and here's what we'd need to deprioritize."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"I can do a rough version by Friday to see if it's even the right direction. Want that instead of the full build?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Honestly, I'm not sure yet. Let me look into it and come back to you tomorrow."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;None of those sound weak. They sound like someone you'd hand the keys to. Someone who respects the problem enough to not pretend it's already solved.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Line
&lt;/h2&gt;

&lt;p&gt;The most dangerous dev says "of course."&lt;/p&gt;

&lt;p&gt;The most valuable dev says "let me make sure I understand the problem first."&lt;/p&gt;

&lt;p&gt;You already got the job. Now show them why they were right.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mentorship</category>
      <category>discuss</category>
      <category>founder</category>
    </item>
    <item>
      <title>The Hidden Challenge of Multi-LLM Context Management</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Fri, 24 Apr 2026 20:19:51 +0000</pubDate>
      <link>https://dev.to/backboardio/the-hidden-challenge-of-multi-llm-context-management-1pbh</link>
      <guid>https://dev.to/backboardio/the-hidden-challenge-of-multi-llm-context-management-1pbh</guid>
      <description>&lt;h1&gt;
  
  
  Why token counting isn't a solved problem when building across providers
&lt;/h1&gt;

&lt;p&gt;Building AI products that span multiple LLM providers involves a challenge most developers don't anticipate until they hit it: context windows are not interoperable.&lt;/p&gt;

&lt;p&gt;On the surface, managing context in a multi-LLM system seems straightforward. You track how long conversations get, trim when needed, and move on. In practice, it's considerably more complex — and if you're routing requests across providers like OpenAI, Anthropic, Google, Cohere, or xAI, there's a fundamental mismatch that can break your product in subtle ways.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tokenization Problem
&lt;/h2&gt;

&lt;p&gt;Every major LLM provider uses its own tokenizer. These tokenizers don't agree. The same block of text produces different token counts depending on which model processes it. The difference is often 10–20%, sometimes more.&lt;/p&gt;

&lt;p&gt;What this means in practice: a conversation that fits comfortably in one model's context window may silently overflow another's. A prompt routed to OpenAI might count as 1,200 tokens; the same prompt routed to Claude might count as 1,450. That gap matters.&lt;/p&gt;
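&lt;p&gt;The arithmetic of that gap is worth making concrete. A minimal sketch using the illustrative counts above (a real system would measure with tiktoken on the OpenAI side and Anthropic's count-tokens endpoint on the Claude side; these numbers are just the example's):&lt;/p&gt;

```python
# Hypothetical counts for the same prompt under two tokenizers,
# taken from the example above. Illustrative only.
openai_tokens = 1200
claude_tokens = 1450

drift = (claude_tokens - openai_tokens) / openai_tokens
print(f"cross-provider drift: {drift:.1%}")  # cross-provider drift: 20.8%

# A context budget calibrated to one tokenizer silently shrinks the
# headroom the other model actually has.
budget = 1300
print("fits by OpenAI count:", budget >= openai_tokens)  # True
print("fits by Claude count:", budget >= claude_tokens)  # False: 1450 exceeds it
```

&lt;p&gt;A prompt that clears one tokenizer's count with room to spare can already be over the other's.&lt;/p&gt;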

&lt;h2&gt;
  
  
  Where It Breaks
&lt;/h2&gt;

&lt;p&gt;The failure modes tend to show up at the boundaries. When you switch providers mid-conversation, the new model has to ingest the full prior context. If your context management layer was calibrated to the previous model's tokenizer, the new model may see a context that's already at or over the limit — before it's even responded to anything new.&lt;/p&gt;

&lt;p&gt;This produces three common failure patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unexpected context-window overflow:&lt;/strong&gt; the conversation that worked before now breaches the limit&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inconsistent truncation:&lt;/strong&gt; different models truncate at different points, changing what prior context the model actually sees&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Routing failures&lt;/strong&gt; that are unpredictable because the numbers your system used don't match the numbers the model actually used&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Simple Estimates Fail
&lt;/h2&gt;

&lt;p&gt;The instinct is to maintain a single "token estimate" with a generous safety margin. The problem is that the margin you'd need varies by provider, model version, and content type (code tokenizes differently than prose). A margin calibrated for one use case will either be too tight for another, causing failures, or too generous, causing unnecessary truncation that degrades conversation quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution: Provider-Aware Token Counting
&lt;/h2&gt;

&lt;p&gt;A robust multi-LLM context management layer makes token counting provider-specific. Rather than maintaining a single estimate, it measures each prompt the way the actual target model will measure it. The routing layer uses these per-provider measurements to make decisions before requests are sent.&lt;/p&gt;

&lt;p&gt;This lets the system stay ahead of context limits: it knows when a conversation is approaching an edge, trims or compresses history calibrated to the specific model receiving the request, and avoids the pricing and failure surprises that come from miscounted tokens.&lt;/p&gt;
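&lt;p&gt;Sketched in code, with crude character-ratio counters standing in for the real per-provider tokenizers. The ratios and limits here are illustrative assumptions, not published numbers; the shape of the layer is the point:&lt;/p&gt;

```python
# Provider-aware token counting, minimal sketch. Real counters would call
# tiktoken, Anthropic's count-tokens API, etc. -- these stand-ins just
# guarantee the two providers disagree, as real tokenizers do.
COUNTERS = {
    "openai":    lambda text: len(text) // 4,
    "anthropic": lambda text: len(text) // 3,  # counts higher on the same text
}

# Illustrative context limits per provider.
LIMITS = {"openai": 128_000, "anthropic": 200_000}

def fits(provider: str, history: list, reserve: int = 4_096) -> bool:
    """Measure the history the way the target model will, then check it
    against that model's limit minus a reserve for the reply."""
    used = sum(COUNTERS[provider](m) for m in history)
    return LIMITS[provider] - reserve >= used

def trim_for(provider: str, history: list, reserve: int = 4_096) -> list:
    """Drop oldest turns until the target provider's own count fits."""
    h = list(history)
    while h and not fits(provider, h, reserve):
        h.pop(0)
    return h
```

&lt;p&gt;The routing layer calls &lt;code&gt;fits&lt;/code&gt; with the provider it is about to send to, not the provider the conversation started on; that one argument is the whole difference between a single global estimate and provider-aware counting.&lt;/p&gt;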

&lt;p&gt;The end result is what users should see: a smooth conversation experience, regardless of which model is serving it. The complexity of "every model speaks a slightly different token language" stays inside the infrastructure layer, invisible to the people using the product.&lt;/p&gt;

&lt;p&gt;This is the approach we've taken in our adaptive context window management component, and it's become a foundational part of how we think about multi-LLM routing more broadly.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Rob Imbeault&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Apr 17, 2026&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>llm</category>
    </item>
    <item>
      <title>Why LLM Reasoning Is Breaking AI Infrastructure (And How to Fix It)</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Fri, 24 Apr 2026 20:18:05 +0000</pubDate>
      <link>https://dev.to/backboardio/why-llm-reasoning-is-breaking-ai-infrastructure-and-how-to-fix-it-2aik</link>
      <guid>https://dev.to/backboardio/why-llm-reasoning-is-breaking-ai-infrastructure-and-how-to-fix-it-2aik</guid>
      <description>&lt;p&gt;If you've tried building anything serious on top of large language models (LLMs) recently, you've probably run into this:&lt;/p&gt;

&lt;p&gt;"Thinking" is supposed to make models better. In practice, it makes your infrastructure worse.&lt;/p&gt;

&lt;p&gt;This isn't a model problem—it's an infrastructure and abstraction problem. And it's getting worse as teams scale across multiple AI providers.&lt;/p&gt;

&lt;p&gt;Let's break down exactly where things go wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Illusion of "Just Turn On Reasoning"
&lt;/h2&gt;

&lt;p&gt;At a high level, LLM reasoning sounds straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Turn reasoning on → better answers&lt;/li&gt;
&lt;li&gt;Turn reasoning off → cheaper, faster&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But in production systems, reality looks very different.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What actually happens:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Models sometimes skip reasoning even when explicitly prompted to use it&lt;/li&gt;
&lt;li&gt;Models over-reason on trivial queries, wasting tokens&lt;/li&gt;
&lt;li&gt;Behavior is inconsistent across providers and model versions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of predictable performance, you get variability.&lt;/p&gt;

&lt;p&gt;You're no longer just building an AI product—you're debugging model behavior at runtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fragmentation Problem in LLM Reasoning
&lt;/h2&gt;

&lt;p&gt;One of the biggest hidden challenges in AI infrastructure today is fragmentation.&lt;/p&gt;

&lt;p&gt;Every major provider has implemented reasoning differently:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI&lt;/strong&gt; → reasoning effort levels (low, medium, high)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic (Claude)&lt;/strong&gt; → explicit reasoning token budgets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google AI (Gemini)&lt;/strong&gt; → hybrid approaches depending on model version&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's just input configuration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Output fragmentation is even worse:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Some models return separate reasoning blocks&lt;/li&gt;
&lt;li&gt;Others provide summarized reasoning&lt;/li&gt;
&lt;li&gt;Some mix reasoning directly into standard responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No shared schema&lt;/li&gt;
&lt;li&gt;No standardized interface&lt;/li&gt;
&lt;li&gt;No predictable structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What this means for developers:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you're building a multi-model AI system, you now need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Input normalization layers&lt;/li&gt;
&lt;li&gt;Output parsing logic per provider&lt;/li&gt;
&lt;li&gt;Custom handling for reasoning formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At this point, "simple API routing" becomes complex middleware engineering.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI Cost Optimization Becomes a Moving Target
&lt;/h2&gt;

&lt;p&gt;Reasoning doesn't just impact performance—it breaks cost predictability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Billing inconsistencies across providers:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Some expose reasoning tokens explicitly&lt;/li&gt;
&lt;li&gt;Others bundle them into total usage&lt;/li&gt;
&lt;li&gt;Some introduce custom billing fields&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now you're not just optimizing latency or quality.&lt;/p&gt;

&lt;p&gt;You're building a cost translation layer across providers.&lt;/p&gt;

&lt;p&gt;This adds complexity to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Forecasting&lt;/li&gt;
&lt;li&gt;Budget control&lt;/li&gt;
&lt;li&gt;Scaling decisions&lt;/li&gt;
&lt;/ul&gt;
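&lt;p&gt;The translation layer itself can start small; it's the per-provider bookkeeping that compounds. A hedged sketch, with raw field names based on what the vendors document today (verify them against the SDK you actually use before trusting any invoice math to this):&lt;/p&gt;

```python
# Fold each provider's usage report into one schema. Field names are
# simplified approximations of the documented usage objects.
def normalize_usage(provider: str, raw: dict) -> dict:
    if provider == "openai":
        # reasoning tokens exposed separately in a details object
        details = raw.get("completion_tokens_details", {})
        return {
            "input": raw["prompt_tokens"],
            "output": raw["completion_tokens"],
            "reasoning": details.get("reasoning_tokens", 0),
        }
    if provider == "anthropic":
        # thinking tokens billed inside output_tokens, not broken out here
        return {
            "input": raw["input_tokens"],
            "output": raw["output_tokens"],
            "reasoning": None,  # bundled; unknowable from this report alone
        }
    raise ValueError(f"unknown provider: {provider}")
```

&lt;p&gt;Note the &lt;code&gt;None&lt;/code&gt;: for some providers the reasoning spend simply isn't recoverable from the usage report, which is exactly why forecasting gets hard.&lt;/p&gt;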

&lt;h2&gt;
  
  
  Why Multi-Model Switching Breaks Systems
&lt;/h2&gt;

&lt;p&gt;In theory, switching between LLM providers should improve reliability and cost efficiency.&lt;/p&gt;

&lt;p&gt;In practice, it introduces system instability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Even within a single provider:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Different endpoints behave differently&lt;/li&gt;
&lt;li&gt;Input formats change&lt;/li&gt;
&lt;li&gt;Output schemas change&lt;/li&gt;
&lt;li&gt;Reasoning structures vary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Now add state management:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What context should persist?&lt;/li&gt;
&lt;li&gt;How do you maintain reasoning continuity?&lt;/li&gt;
&lt;li&gt;How do you prevent token explosion?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The result:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most teams either:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Abandon portability, or&lt;/li&gt;
&lt;li&gt;Build fragile adapter layers that constantly break&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Real Problem: Lack of Abstraction
&lt;/h2&gt;

&lt;p&gt;After working through these challenges, one thing becomes clear:&lt;/p&gt;

&lt;p&gt;The core issue isn't reasoning—it's the absence of a unified abstraction layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Developers today are forced to:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Learn multiple reasoning systems&lt;/li&gt;
&lt;li&gt;Normalize different response formats&lt;/li&gt;
&lt;li&gt;Track multiple billing models&lt;/li&gt;
&lt;li&gt;Rebuild state handling for each provider&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not scalable.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "Unified LLM Reasoning" Should Look Like
&lt;/h2&gt;

&lt;p&gt;To make AI infrastructure truly production-ready, reasoning needs to be abstracted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A unified system should provide:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single reasoning parameter&lt;/li&gt;
&lt;li&gt;Direct control over reasoning budgets&lt;/li&gt;
&lt;li&gt;Consistent behavior across models&lt;/li&gt;
&lt;li&gt;Standardized input/output formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The impact:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Developers can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tune reasoning without provider lock-in&lt;/li&gt;
&lt;li&gt;Switch models without rewriting logic&lt;/li&gt;
&lt;li&gt;Maintain consistent state across systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And most importantly:&lt;/p&gt;

&lt;p&gt;Stop thinking about thinking.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Uncomfortable Truth About Scaling AI Systems
&lt;/h2&gt;

&lt;p&gt;If you're working with LLMs and haven't encountered these issues yet—you will.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complexity compounds rapidly when you:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add a second provider&lt;/li&gt;
&lt;li&gt;Enable reasoning features&lt;/li&gt;
&lt;li&gt;Optimize for cost&lt;/li&gt;
&lt;li&gt;Maintain persistent context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At that point:&lt;/p&gt;

&lt;p&gt;You're no longer building your product. You're building AI infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future of AI Platforms
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Short-term impact of a unified abstraction layer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reduced engineering time (weeks to months saved)&lt;/li&gt;
&lt;li&gt;Lower debugging overhead&lt;/li&gt;
&lt;li&gt;More predictable cost structures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Long-term shift:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The winning AI platforms won't be defined by model quality alone.&lt;/p&gt;

&lt;p&gt;They will be defined by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Interoperability&lt;/strong&gt; (model interchangeability)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Statefulness&lt;/strong&gt; (persistent, portable context)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the real unlock in the next phase of AI development.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Audit for Your AI Stack
&lt;/h2&gt;

&lt;p&gt;If you're currently integrating multiple LLM providers, ask yourself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How many reasoning formats are you handling?&lt;/li&gt;
&lt;li&gt;How portable is your state management layer?&lt;/li&gt;
&lt;li&gt;How predictable are your AI costs?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If those answers aren't clean and consistent:&lt;/p&gt;

&lt;p&gt;You're already paying the infrastructure tax.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Rob Imbeault&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Apr 20, 2026&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>I Broke SSO Trying to Center a Div. Let's Talk About Tokenmaxxing</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Fri, 24 Apr 2026 15:28:12 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/i-broke-sso-trying-to-center-a-div-lets-talk-about-tokenmaxxing-1h10</link>
      <guid>https://dev.to/jon_at_backboardio/i-broke-sso-trying-to-center-a-div-lets-talk-about-tokenmaxxing-1h10</guid>
      <description>&lt;h2&gt;
  
  
  &lt;a href="https://backboard.io/cli-form" rel="noopener noreferrer"&gt;Backboard CODEGEN CLI Waitlist&lt;/a&gt;
&lt;/h2&gt;

&lt;p&gt;A couple weeks ago, I tried to recenter some text on one of my side project SSO pages.&lt;/p&gt;

&lt;p&gt;That's the whole task. Move the text. Left a bit. Right a bit. Until it's in the middle. Center. Middle.&lt;/p&gt;

&lt;p&gt;I opened Claude Code. I said, roughly, "hey, center this."&lt;/p&gt;

&lt;p&gt;Fifteen minutes later I was two bugs deep, SSO was broken — not the text, the &lt;em&gt;whole login flow&lt;/em&gt; — and I'd hit my usage limit trying to unbreak the thing I broke while trying to do the thing that should've taken eight seconds in the inspector.&lt;/p&gt;

&lt;p&gt;Palm. Face.&lt;/p&gt;

&lt;p&gt;"Why did I just do that?"&lt;/p&gt;




&lt;p&gt;That was tokenmaxxing. You know what tokenmaxxing is. Your timeline knows. There is an entire subgenre of VC on X right now posting, with their whole chest, variations of &lt;em&gt;"if your engineers aren't maxing out their token budgets every day, they aren't working hard enough."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Three thousand likes. Quote tweets from other VCs agreeing. "This," they type. "100%," they type.&lt;/p&gt;

&lt;p&gt;I want to say this clearly, one time, so we can move on: that is insane.&lt;/p&gt;

&lt;p&gt;Measuring engineering effort by token spend is like measuring a chef by how much gas they burn. Congratulations. Your kitchen is on fire and the soup is fine.&lt;/p&gt;

&lt;p&gt;Tokenmaxxing is when the answer to every problem — a typo, a bug, a bad schema, a bad decision, a bad Tuesday — is to shove more context, more tokens, more model at it until the problem stops complaining.&lt;/p&gt;

&lt;p&gt;It is &lt;code&gt;console.log("hello world")&lt;/code&gt; wearing a $400 watch.&lt;/p&gt;




&lt;p&gt;A lot of people are going to read this and get defensive. I get it. I've done it.&lt;/p&gt;

&lt;p&gt;I once built a "documentation agent" that loaded the entire repo into context and then asked, very politely, whether we had a login page.&lt;/p&gt;

&lt;p&gt;We did. It was in &lt;code&gt;routes/login.tsx&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That query cost $2.17.&lt;/p&gt;

&lt;p&gt;I tell myself it was research.&lt;/p&gt;




&lt;p&gt;Here's the part nobody says out loud: brute-force compute is the new jQuery.&lt;/p&gt;

&lt;p&gt;Not in the "it works, ship it" way. In the "we're going to look at this in three years and wince" way.&lt;/p&gt;

&lt;p&gt;We're living in a window where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A 100k-token prompt to find one number is considered normal.&lt;/li&gt;
&lt;li&gt;"Just pass the whole codebase" is a real architectural decision that real adults say out loud in real meetings.&lt;/li&gt;
&lt;li&gt;The solution to hallucinations is more tokens. The solution to latency is more tokens. The solution to your cat being sad is, apparently, more tokens.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And the people selling the tokens? Thrilled. Obviously. You would be too.&lt;/p&gt;




&lt;p&gt;I want to be clear: I love LLMs. I use them constantly. I have emotions about them I will not discuss here.&lt;/p&gt;

&lt;p&gt;But the current game is rigged in a very specific direction. The model companies make more money when you're lazy. Your sloppy prompt is their margin. Your 90,000-token scaffolding is someone's yacht.&lt;/p&gt;

&lt;p&gt;Meanwhile, the indie devs — the people who built the internet worth having — are getting priced out of the exact kind of tinkering that used to be free. You can't "just try something" when "just trying something" is $40.&lt;/p&gt;

&lt;p&gt;The next big app should not require a $10k/month API budget to prototype. It used to require a laptop and an unreasonable amount of Red Bull. I'd like to go back to that, if possible.&lt;/p&gt;




&lt;p&gt;So here is my proposal, which I will now name dramatically so it fits in a tweet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Token Minimizing Revolution.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It has two rules, and they are embarrassingly obvious.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Precision over volume.&lt;/strong&gt; A small, clever retrieval beats a giant dumb context every time. RAG, fine-tunes, routers, caches. Boring stuff. Works.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token-golfing is the new code-golfing.&lt;/strong&gt; The flex is not "look what I made the big model do." The flex is "look what I made the small model do."&lt;/li&gt;
&lt;/ol&gt;
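&lt;p&gt;Rule one doesn't need embeddings to start. A deliberately boring sketch (a hypothetical keyword retriever, not a production recommendation) that sends twenty matching lines instead of the whole repo:&lt;/p&gt;

```python
# Token-golfing at its crudest: retrieve only the lines that plausibly
# matter and send those. A real system would use an index or embeddings;
# even this keyword version cuts the prompt by orders of magnitude
# compared with "just pass the whole codebase".
def tiny_retrieve(files: dict, query: str, max_lines: int = 20) -> str:
    """files maps path -&gt; file contents; returns at most max_lines hits."""
    terms = query.lower().split()
    hits = []
    for path, text in files.items():
        for line in text.splitlines():
            if any(t in line.lower() for t in terms):
                hits.append(f"{path}: {line.strip()}")
    return "\n".join(hits[:max_lines])
```

&lt;p&gt;Twenty lines of context would have answered my $2.17 "do we have a login page" question for fractions of a cent.&lt;/p&gt;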




&lt;p&gt;We're building something right now at backboard.io that is the opposite of tokenmaxxing. &lt;/p&gt;

&lt;p&gt;But if you've been feeling that itch — the one where you look at your API bill and think &lt;em&gt;this is not a technology problem, this is a vibes problem&lt;/em&gt; — you are not alone.&lt;/p&gt;

&lt;p&gt;The revolution will be small. Efficient. Under budget.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Stop building what your customers ask for</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Wed, 22 Apr 2026 13:44:39 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/stop-building-what-your-customers-ask-for-3d16</link>
      <guid>https://dev.to/jon_at_backboardio/stop-building-what-your-customers-ask-for-3d16</guid>
      <description>&lt;p&gt;I was at a conference this week.&lt;/p&gt;

&lt;p&gt;Bunch of stakeholders on stage. Hospital admins, big-name buyers, a couple of policy folks. The message to founders was loud and clear:&lt;/p&gt;

&lt;p&gt;"You need to be consulting us. You need to be adapting your products to our suggestions."&lt;/p&gt;

&lt;p&gt;And honestly? I hated it.&lt;/p&gt;

&lt;p&gt;Not because they were completely wrong. They were half right. They were just shouting the half that was wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Here's the part that's true
&lt;/h2&gt;

&lt;p&gt;Building in a vacuum is how you ship things nobody uses. Founders, especially technical ones, have a real habit of deciding what the world needs from inside a Notion doc.&lt;/p&gt;

&lt;p&gt;So yes. Talk to users. Ride along. Watch people struggle with your product. All of that.&lt;/p&gt;

&lt;p&gt;The stakeholders aren't crazy for wanting a seat at the table.&lt;/p&gt;

&lt;h2&gt;
  
  
  Here's the part that breaks things
&lt;/h2&gt;

&lt;p&gt;"Listen to us" slowly turns into "do what we say."&lt;/p&gt;

&lt;p&gt;And that's where it gets weird.&lt;/p&gt;

&lt;p&gt;Because every dev on earth has learned this lesson already. It's called a bug report.&lt;/p&gt;

&lt;p&gt;A user says "the login is slow." You dig in. The login isn't slow. They're on hotel wifi and there's no loading spinner, so it &lt;em&gt;feels&lt;/em&gt; frozen. The complaint was real. The proposed fix, "make the login faster," was useless.&lt;/p&gt;

&lt;p&gt;Stakeholder feedback works exactly the same way.&lt;/p&gt;

&lt;p&gt;The pain is the signal. The proposed fix is a guess. Usually a bad one.&lt;/p&gt;

&lt;p&gt;A senior eng who shipped whatever the ticket said would get laughed out of the room. Why do we call a founder who ships whatever the customer asks for "responsive"?&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the fix is almost always wrong
&lt;/h2&gt;

&lt;p&gt;Three reasons, no mystery to any of this:&lt;/p&gt;

&lt;p&gt;1) Stakeholders see their slice. Not the whole system. Of course their fix is local.&lt;br&gt;
2) They imagine solutions inside the workflow they already have. Which is often the exact workflow you're trying to change.&lt;br&gt;
3) The thing that would actually solve the problem doesn't exist in their vocabulary yet. That's kind of your job.&lt;/p&gt;

&lt;p&gt;When a cardiologist says "add a button that auto-generates the referral letter," the real signal is &lt;em&gt;referrals are friction&lt;/em&gt;. The button might be the worst possible version of the fix. Maybe the letter shouldn't exist. Maybe the referral shouldn't need a letter. That's a conversation. Not a ticket.&lt;/p&gt;

&lt;h2&gt;
  
  
  The receipt: healthcare AI just ran this experiment for us
&lt;/h2&gt;

&lt;p&gt;For years, stakeholders told the industry they wanted "AI that can pass the medical boards."&lt;/p&gt;

&lt;p&gt;The industry listened. Every model got tuned on USMLE-style questions. Board-exam scores became the benchmark everyone pointed at.&lt;/p&gt;

&lt;p&gt;This month, JAMA Network Open dropped a study across 21 top LLMs (ChatGPT, Claude, Gemini, DeepSeek, Grok). Final-diagnosis accuracy on complete cases? Over 90%.&lt;/p&gt;

&lt;p&gt;Differential diagnosis, the thing an actual doctor does all day? Failed more than 80% of the time.&lt;/p&gt;

&lt;p&gt;The stakeholders asked for the wrong benchmark. Founders shipped it. We now have a generation of models that ace trivia and fold on reasoning.&lt;/p&gt;

&lt;p&gt;The founders who had pushed back, the ones who said &lt;em&gt;we hear you want trustworthy AI, we're not going to chase board scores to prove it&lt;/em&gt;, would look prescient right now. The ones who obeyed built an industry of exam-passers.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to actually do it
&lt;/h2&gt;

&lt;p&gt;When a stakeholder hands me a feature request, I try to never put it in the backlog as written. Three questions first:&lt;/p&gt;

&lt;p&gt;1) What were they trying to do when they felt the pain?&lt;br&gt;
2) What's the actual friction, stripped of their proposed fix?&lt;br&gt;
3) What would "solved" feel like, regardless of how it gets built?&lt;/p&gt;

&lt;p&gt;Rule of thumb: if a stakeholder ask fits neatly into a Jira ticket, I haven't translated it yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Back to the conference
&lt;/h2&gt;

&lt;p&gt;I get why the stakeholders were on that stage. They've been burned by founders who ignored them. They want to be heard.&lt;/p&gt;

&lt;p&gt;But "heard" is not the same as "obeyed." And founders who treat customer feedback as a spec instead of a bug report end up building slightly nicer versions of the thing that already isn't working.&lt;/p&gt;

&lt;p&gt;Listen obsessively.&lt;br&gt;
Obey selectively.&lt;br&gt;
And be willing to tell the room that the button they're asking for isn't the thing they actually need.&lt;/p&gt;

&lt;p&gt;That's not arrogance. That's the job.&lt;/p&gt;

&lt;p&gt;What's a piece of stakeholder feedback you took literally and regretted? Or one you translated into something better and it worked?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>discuss</category>
      <category>customersuccess</category>
      <category>programming</category>
    </item>
    <item>
      <title>🙏🏻🙏🏻🙏🏻🙏🏻💪🏻💪🏻💪🏻💪🏻</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Tue, 21 Apr 2026 18:47:22 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/-1n79</link>
      <guid>https://dev.to/jon_at_backboardio/-1n79</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/ranjancse/building-conversational-intelligence-with-backboard-turning-conversations-into-a-living-1mip" class="crayons-story__hidden-navigation-link"&gt;Building Conversational Intelligence with Backboard: Turning Conversations into a Living Intelligence System&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/ranjancse" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1211275%2F88edf8cd-cc3a-4aac-91b7-934631126085.png" alt="ranjancse profile" class="crayons-avatar__image" width="764" height="750"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/ranjancse" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Ranjan Dailata
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Ranjan Dailata
                
              
              &lt;div id="story-author-preview-content-3528675" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/ranjancse" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1211275%2F88edf8cd-cc3a-4aac-91b7-934631126085.png" class="crayons-avatar__image" alt="" width="764" height="750"&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Ranjan Dailata&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/ranjancse/building-conversational-intelligence-with-backboard-turning-conversations-into-a-living-1mip" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Apr 21&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/ranjancse/building-conversational-intelligence-with-backboard-turning-conversations-into-a-living-1mip" id="article-link-3528675"&gt;
          Building Conversational Intelligence with Backboard: Turning Conversations into a Living Intelligence System
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/machinelearning"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;machinelearning&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/nlp"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;nlp&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/productivity"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;productivity&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/ranjancse/building-conversational-intelligence-with-backboard-turning-conversations-into-a-living-1mip" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;10&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/ranjancse/building-conversational-intelligence-with-backboard-turning-conversations-into-a-living-1mip#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              &lt;span class="hidden s:inline"&gt;Add Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            4 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>Two Days, Two Hacks: The Lovable Disclosure and the Pattern Nobody Wants to Talk About</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Mon, 20 Apr 2026 18:14:55 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/two-days-two-hacks-the-lovable-disclosure-and-the-pattern-nobody-wants-to-talk-about-47eh</link>
      <guid>https://dev.to/jon_at_backboardio/two-days-two-hacks-the-lovable-disclosure-and-the-pattern-nobody-wants-to-talk-about-47eh</guid>
      <description>&lt;p&gt;Yesterday I wrote about the Vercel incident and walked through &lt;a href="https://dev.to/jon_at_backboardio/vercel-hack-why-you-need-to-rotate-your-non-sensitive-environment-variables-today-25mh"&gt;why you need to rotate your "non-sensitive" environment variables today&lt;/a&gt;. I thought that would be the week's security post.&lt;/p&gt;

&lt;p&gt;Then I woke up to @weezerOSINT's disclosure about Lovable, and now I am starting to wonder if someone out there is just running an end-to-end test on the mythos of the modern AI-dev stack.&lt;/p&gt;

&lt;p&gt;Two days. Two incidents. Totally different root causes. Same uncomfortable conclusion.&lt;/p&gt;

&lt;h2&gt;
  
  
  What dropped
&lt;/h2&gt;

&lt;p&gt;The short version: security researcher @weezerOSINT made a free Lovable account and was able to read other users' source code, database credentials, AI chat histories, and customer data. Any free account. Every project created before November 2025.&lt;/p&gt;

&lt;p&gt;The screenshot making the rounds shows a response from &lt;code&gt;api.lovable.dev/GetProjectMessagesOutputBody.json&lt;/code&gt; with another user's prompts, AI reasoning traces, task lists, and project IDs sitting there in plain JSON. The bug is Broken Object Level Authorization on Lovable's own platform API, not the more familiar "the generated app shipped without Supabase RLS" story we got in February.&lt;/p&gt;

&lt;p&gt;The part that actually made me set my coffee down: the report was filed through Lovable's bug bounty program 48 days ago, marked as a duplicate of an earlier informative report, and left open. At the time of the disclosure it reportedly still worked.&lt;/p&gt;

&lt;p&gt;Forty. Eight. Days.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this one hits different
&lt;/h2&gt;

&lt;p&gt;The February Lovable wave was a story about generated apps. The takeaway was "audit the output" — a thing developers already know how to do, at least in principle. You could imagine a fix: better defaults, RLS on by default in the scaffolds, a linter that yells at you when a table is public.&lt;/p&gt;

&lt;p&gt;This one is a story about the platform itself. The thing you trusted to hold your code, your keys, your customer data — the control plane, not the output — had a missing auth check on a production API endpoint for at least seven weeks after someone told them about it.&lt;/p&gt;

&lt;p&gt;Stack this next to the Vercel situation and a pattern starts to emerge. In the Vercel case, the breach came through a third-party AI tool that had been granted a Workspace OAuth scope that went further than anyone audited. In the Lovable case, it is the platform's own API failing to check "is this caller allowed to see this object." Different failure modes, same underlying theme: the trust boundaries in the AI-assisted-dev stack are drawn with marker, and the marker is washing off in the rain.&lt;/p&gt;

&lt;h2&gt;
  
  
  The vibe-coding angle
&lt;/h2&gt;

&lt;p&gt;Here is the thing that will keep me up tonight. When you vibe-code an app, you do not type &lt;code&gt;process.env.STRIPE_KEY&lt;/code&gt; into a &lt;code&gt;.env&lt;/code&gt; file and move on. You paste the key into the chat so the AI can wire it up. You paste the database URL into the chat to fix a schema bug. You paste a sample customer record into the chat to get the types right.&lt;/p&gt;

&lt;p&gt;Every one of those messages lives in the project's chat history. The disclosed endpoint returned chat histories. So it is not just "your generated app is exposed" — it is "every secret you ever mentioned in a conversation with Lovable is sitting in a JSON response that any free account could fetch."&lt;/p&gt;

&lt;p&gt;If you have built on Lovable, go read your own chat history right now, with the eyes of an attacker. Search for &lt;code&gt;sk-&lt;/code&gt;, &lt;code&gt;postgres://&lt;/code&gt;, &lt;code&gt;Bearer&lt;/code&gt;, anything that looks like a secret. Every match is a key to rotate at the source. Not rename. Rotate. Revoke at the provider and reissue.&lt;/p&gt;
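One way to run that search is a small script over an exported copy of your chat history. The three patterns below are illustrative assumptions covering the shapes mentioned above, not an exhaustive secret scanner:

```python
import re

# Hypothetical sketch: scan exported chat text for secret-shaped strings.
# Patterns are illustrative assumptions, not a complete ruleset.
SECRET_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),
    "postgres_url": re.compile(r"postgres(?:ql)?://\S+"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{16,}"),
}

def find_secrets(text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, match) pairs for anything secret-shaped."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.findall(text):
            hits.append((name, match))
    return hits

if __name__ == "__main__":
    sample = "connect with postgres://user:pass@db.example.com/app"
    for name, value in find_secrets(sample):
        print(f"{name}: rotate at the source -> {value[:24]}")
```

Anything it flags goes on the rotation list. Treat a clean run as no guarantee, since plenty of secrets do not match a tidy prefix.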

&lt;h2&gt;
  
  
  What I actually think is going on
&lt;/h2&gt;

&lt;p&gt;I do not think someone is literally targeting the AI-dev ecosystem on a two-day schedule for dramatic effect. What I think is happening is that this category of tools grew very fast, shipped a lot of features, pointed their best engineers at the next feature rather than the last one, and is now discovering that "trust boundaries" is a feature that does not show up in a demo.&lt;/p&gt;

&lt;p&gt;The vibe-coding productivity is real. I still use these tools. I will still use them next week. But I am going to stop pretending that a platform saying "secure by default" counts for anything until I see a disclosure track record that backs it up. Forty-eight days on a report with the title "Broken Object Level Authorization on Lovable API leads to unauthorized access to user data and project source code" is, to use a technical term, a lot.&lt;/p&gt;

&lt;h2&gt;
  
  
  If you are shipping on Lovable right now
&lt;/h2&gt;

&lt;p&gt;Short version, because I already wrote the long version yesterday for Vercel and the shape is the same:&lt;/p&gt;

&lt;p&gt;Rotate anything a Lovable project ever touched. Revoke at the upstream provider, not just in the Lovable dashboard. Audit your chat histories for pasted secrets. Turn on RLS on every Supabase table while you are in there. If personal data was exposed, talk to a lawyer today about your disclosure obligations, because "we used an AI app builder" is not going to hold up in front of a regulator.&lt;/p&gt;

&lt;p&gt;Two days. Two hacks. Maybe it is the start of a trend, maybe it is the week from hell, maybe someone really is testing the mythos. Either way, rotate your keys and get back to building.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Source for the disclosure: &lt;a href="https://x.com/weezerosint/status/2046170666131669027" rel="noopener noreferrer"&gt;@weezerOSINT on X&lt;/a&gt;. If you have audited a Lovable project in the last day and found something worth sharing, the comments are open.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>hacked</category>
      <category>discuss</category>
    </item>
    <item>
      <title>My co-founder is just being honest in this post. ;)</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Mon, 20 Apr 2026 18:09:14 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/my-co-founder-is-just-being-honest-in-this-post--1c2h</link>
      <guid>https://dev.to/jon_at_backboardio/my-co-founder-is-just-being-honest-in-this-post--1c2h</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/robimbeault/i-think-therefore-i-am-a-big-pain-in-the-a-3a9m" class="crayons-story__hidden-navigation-link"&gt;I Think Therefore I Am… A Big Pain in the A$$&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/robimbeault" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818726%2F3d165aef-8612-4c2c-aba9-6dd7754f4f84.jpeg" alt="robimbeault profile" class="crayons-avatar__image"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/robimbeault" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Robert Imbeault
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Robert Imbeault
                
              
              &lt;div id="story-author-preview-content-3512354" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/robimbeault" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818726%2F3d165aef-8612-4c2c-aba9-6dd7754f4f84.jpeg" class="crayons-avatar__image" alt=""&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Robert Imbeault&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/robimbeault/i-think-therefore-i-am-a-big-pain-in-the-a-3a9m" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Apr 20&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/robimbeault/i-think-therefore-i-am-a-big-pain-in-the-a-3a9m" id="article-link-3512354"&gt;
          I Think Therefore I Am… A Big Pain in the A$$
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/llm"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;llm&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/developers"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;developers&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/reasoning"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;reasoning&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/robimbeault/i-think-therefore-i-am-a-big-pain-in-the-a-3a9m" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="18" height="18"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;5&lt;span class="hidden s:inline"&gt; reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/robimbeault/i-think-therefore-i-am-a-big-pain-in-the-a-3a9m#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              Comments


              1&lt;span class="hidden s:inline"&gt; comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            3 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>Anyone working on applications for Recursive Language Models? #discussion #rlm</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Mon, 20 Apr 2026 16:34:38 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/anyone-working-on-applications-for-recursive-language-models-discussion-rlm-2fp0</link>
      <guid>https://dev.to/jon_at_backboardio/anyone-working-on-applications-for-recursive-language-models-discussion-rlm-2fp0</guid>
      <description></description>
      <category>ai</category>
      <category>discuss</category>
      <category>machinelearning</category>
      <category>nlp</category>
    </item>
    <item>
      <title>Vercel Hack: Why You Need to Rotate Your "Non-Sensitive" Environment Variables Today</title>
      <dc:creator>Jonathan Murray</dc:creator>
      <pubDate>Mon, 20 Apr 2026 02:49:24 +0000</pubDate>
      <link>https://dev.to/jon_at_backboardio/vercel-hack-why-you-need-to-rotate-your-non-sensitive-environment-variables-today-25mh</link>
      <guid>https://dev.to/jon_at_backboardio/vercel-hack-why-you-need-to-rotate-your-non-sensitive-environment-variables-today-25mh</guid>
      <description>&lt;p&gt;If you deploy on Vercel, todays headlines about a security incident might have caused some stress. &lt;/p&gt;

&lt;p&gt;I know firsthand how disruptive supply chain alerts can be. Take a deep breath. &lt;/p&gt;

&lt;p&gt;We are going to separate the noise from the facts and focus on the practical steps you can take today to secure your infrastructure.&lt;/p&gt;

&lt;p&gt;Here is a straightforward guide to protecting your applications.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Actually Happened
&lt;/h3&gt;

&lt;p&gt;Before we jump into the steps, here are the verified facts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Root Cause:&lt;/strong&gt; Vercel confirmed unauthorized access to internal systems via a compromised third-party AI tool with a Google Workspace OAuth integration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Exposure:&lt;/strong&gt; Environment variables marked as "Sensitive" remained encrypted and protected. However, standard or non-sensitive environment variables were likely exposed to the attacker.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Claims:&lt;/strong&gt; A threat actor using the name ShinyHunters claims to be selling Vercel data. Vercel is actively handling the situation and their core services remain online.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because non-sensitive variables were likely exposed, your immediate priority is auditing and rotating your credentials.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step-by-Step Remediation Guide
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Step 1: Audit Your Vercel Environment Variables
&lt;/h4&gt;

&lt;p&gt;Log into your Vercel dashboard and review the environment variables for every active project. You are looking for anything that was not explicitly marked with the "Sensitive" flag. Pay close attention to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Database connection strings (Postgres, MongoDB, Redis)&lt;/li&gt;
&lt;li&gt;Third-party API keys (Stripe, SendGrid, OpenAI)&lt;/li&gt;
&lt;li&gt;Authentication secrets and JWT keys&lt;/li&gt;
&lt;/ul&gt;
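If you pull a project's variables down locally (for example with `vercel env pull`), a rough heuristic pass can surface candidates for the steps below. The name hints and value prefixes here are my assumptions, not Vercel's own classification:

```python
import re

# Illustrative heuristics for "this value looks like a secret".
# Names and prefixes are assumptions, not Vercel's rules.
SECRET_HINTS = re.compile(r"(secret|token|key|password|pwd|connection)",
                          re.IGNORECASE)

def flag_env_lines(env_text: str) -> list[str]:
    """Return names of variables whose name or value looks secret-shaped."""
    flagged = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        looks_secret = bool(SECRET_HINTS.search(name)) or \
            value.startswith(("sk-", "postgres://", "mongodb+srv://"))
        if looks_secret:
            flagged.append(name)
    return flagged
```

Every name it prints is a candidate for the revoke-and-reissue steps that follow, not proof of exposure on its own.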

&lt;h4&gt;
  
  
  Step 2: Revoke Upstream Credentials
&lt;/h4&gt;

&lt;p&gt;If you find a secret stored as a non-sensitive variable, changing it in Vercel is not enough. You must invalidate the compromised key at the source.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to the service provider (AWS, Supabase, Stripe, etc.).&lt;/li&gt;
&lt;li&gt;Revoke or delete the old credential entirely.&lt;/li&gt;
&lt;li&gt;Generate a brand new credential.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Step 3: Update and Flag as Sensitive
&lt;/h4&gt;

&lt;p&gt;Take your newly generated keys and update them in your Vercel projects. When you do this, make absolutely sure you check the box to mark the variable as "Sensitive". This ensures the value is encrypted at rest and hidden from the dashboard UI going forward.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 4: Audit Your OAuth Integrations
&lt;/h4&gt;

&lt;p&gt;Since this breach originated from a compromised Workspace app, use this opportunity to clean up your own team integrations.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Review your GitHub organization settings and remove unrecognized OAuth apps.&lt;/li&gt;
&lt;li&gt;Check your Google Workspace integrations.&lt;/li&gt;
&lt;li&gt;Revoke access for any third-party tools your team no longer uses.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Step 5: Monitor Your Logs
&lt;/h4&gt;

&lt;p&gt;Keep a close eye on your application and database logs over the next few days. Look for unfamiliar IP addresses accessing your database or unexpected spikes in API usage. These are telltale signs that a leaked key is in use.&lt;/p&gt;
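One low-effort way to watch for this is to tally source IPs in your access logs against an allowlist. A minimal sketch, assuming combined-format logs and a hand-maintained `KNOWN_IPS` set (both are placeholders for your own setup):

```python
from collections import Counter
import re

# Sketch: tally source IPs in an access log and surface addresses
# outside your allowlist. KNOWN_IPS is an assumption -- substitute
# the addresses you actually expect to see.
KNOWN_IPS = {"203.0.113.10", "203.0.113.11"}
IP_RE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s")

def unfamiliar_ips(log_text: str) -> Counter:
    """Count requests per IP, skipping addresses on the allowlist."""
    counts = Counter()
    for line in log_text.splitlines():
        m = IP_RE.match(line)
        if m and m.group(1) not in KNOWN_IPS:
            counts[m.group(1)] += 1
    return counts
```

A sudden new address with a high count is exactly the spike worth investigating.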

&lt;h3&gt;
  
  
  Moving Forward
&lt;/h3&gt;

&lt;p&gt;Security incidents are stressful, but handling them methodically is your best defense. By rotating your exposed keys and locking down your variables, you close the door on the immediate risks. Run through the checklist, secure your workspace, and get back to building.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vercel</category>
      <category>security</category>
      <category>hack</category>
    </item>
  </channel>
</rss>
