<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Paulo Victor Leite Lima Gomes</title>
    <description>The latest articles on DEV Community by Paulo Victor Leite Lima Gomes (@pvgomes).</description>
    <link>https://dev.to/pvgomes</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F109646%2F27accb17-594d-4776-b421-db7cca109bfe.jpg</url>
      <title>DEV Community: Paulo Victor Leite Lima Gomes</title>
      <link>https://dev.to/pvgomes</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pvgomes"/>
    <language>en</language>
    <item>
      <title>tokens are now more expensive than juniors, and less predictable</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Fri, 01 May 2026 09:05:02 +0000</pubDate>
      <link>https://dev.to/pvgomes/tokens-are-now-more-expensive-than-juniors-and-less-predictable-ei5</link>
      <guid>https://dev.to/pvgomes/tokens-are-now-more-expensive-than-juniors-and-less-predictable-ei5</guid>
      <description>&lt;p&gt;I think a lot of companies are still telling themselves a very comforting story about AI costs.&lt;/p&gt;

&lt;p&gt;The story goes like this:&lt;/p&gt;

&lt;p&gt;Tokens are cheap.&lt;br&gt;
Models keep getting better.&lt;br&gt;
A few copilots here, a few agents there, maybe a chatbot for support, maybe some code generation in CI, and somehow this all stays in the “software subscription” bucket.&lt;/p&gt;

&lt;p&gt;I do not buy that story anymore.&lt;/p&gt;

&lt;p&gt;My take is simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;tokens are starting to behave less like a cheap productivity feature and more like a volatile labor line item.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And in a growing number of workflows, they are already expensive enough to compete with what companies would happily pay for junior humans.&lt;br&gt;
Not just junior developers.&lt;br&gt;
Junior assistants too.&lt;/p&gt;

&lt;p&gt;The worst part is not even the absolute price.&lt;br&gt;
It is the unpredictability.&lt;/p&gt;

&lt;p&gt;A junior hire has a salary.&lt;br&gt;
A token budget has moods.&lt;/p&gt;

&lt;h2&gt;
  
  
  the spreadsheet starts lying very early
&lt;/h2&gt;

&lt;p&gt;On paper, token prices still look harmless.&lt;br&gt;
They are quoted per million tokens, which is a wonderful way to make real usage feel abstract.&lt;/p&gt;

&lt;p&gt;A few examples from current public pricing pages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI GPT-5.4: &lt;strong&gt;$2.50 / 1M input&lt;/strong&gt; and &lt;strong&gt;$15 / 1M output&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic Claude Sonnet 4.6: &lt;strong&gt;$3 / 1M input&lt;/strong&gt; and &lt;strong&gt;$15 / 1M output&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Google Gemini 2.5 Pro: &lt;strong&gt;$1.25 / 1M input&lt;/strong&gt; and &lt;strong&gt;$10 / 1M output&lt;/strong&gt; for prompts up to 200k tokens, then &lt;strong&gt;$2.50 input&lt;/strong&gt; and &lt;strong&gt;$15 output&lt;/strong&gt; beyond that threshold&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That still sounds cheap if you are thinking about a few prompts in a playground.&lt;br&gt;
It stops sounding cheap the moment AI stops being a toy and starts becoming part of your operating model.&lt;/p&gt;

&lt;p&gt;Let’s do slightly less fake math.&lt;/p&gt;

&lt;p&gt;Imagine a team with 10 people using coding agents, document summarizers, support drafting, and internal automation.&lt;br&gt;
Nothing science-fiction here.&lt;br&gt;
Just normal “we adopted AI everywhere” behavior.&lt;/p&gt;

&lt;p&gt;Assume each seat consumes &lt;strong&gt;5 million input tokens and 2 million output tokens per workday&lt;/strong&gt;.&lt;br&gt;
That is not tiny, but it is also not insane once you include long contexts, retries, tool traces, generated code, explanations, and review loops.&lt;/p&gt;

&lt;p&gt;Here is what that looks like over roughly 22 workdays:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider/model&lt;/th&gt;
&lt;th&gt;Approx monthly cost for 10 seats&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI GPT-5.4&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$9,350&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Sonnet 4.6&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$9,900&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.5 Pro&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$5,775 to $9,350&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That range on Gemini is already part of the point.&lt;br&gt;
The same team can pay very different numbers depending on prompt size behavior.&lt;/p&gt;
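
&lt;p&gt;If you want to sanity-check that table instead of trusting my spreadsheet, the math fits in a few lines.&lt;br&gt;
A minimal sketch, using the list prices quoted above and the 5M input / 2M output per seat per day assumption:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# rough monthly token cost per provider, using the assumptions above:
# 10 seats, 22 workdays, 5M input + 2M output tokens per seat per day
SEATS, WORKDAYS = 10, 22
INPUT_M, OUTPUT_M = 5, 2  # millions of tokens per seat per day

# (input $/1M, output $/1M) from the public pricing pages in the references
PRICES = {
    "OpenAI GPT-5.4": (2.50, 15.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Gemini 2.5 Pro, prompts under 200k": (1.25, 10.00),
    "Gemini 2.5 Pro, prompts over 200k": (2.50, 15.00),
}

for model, (in_rate, out_rate) in PRICES.items():
    daily_per_seat = INPUT_M * in_rate + OUTPUT_M * out_rate
    monthly = daily_per_seat * SEATS * WORKDAYS
    print(f"{model}: ${monthly:,.0f}/month")
# OpenAI $9,350 | Claude $9,900 | Gemini $5,775 to $9,350
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;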

&lt;p&gt;Now compare that with actual wage data.&lt;br&gt;
The U.S. Bureau of Labor Statistics lists:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;$47,460/year&lt;/strong&gt; as the 2024 median pay for secretaries and administrative assistants&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$133,080/year&lt;/strong&gt; as the 2024 median pay for software developers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$79,850/year&lt;/strong&gt; as the lower 10th percentile for software developers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Monthly, that works out to roughly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;$3,955/month&lt;/strong&gt; for an administrative assistant at the median&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$6,654/month&lt;/strong&gt; for the lower 10th percentile of software developers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$11,090/month&lt;/strong&gt; for the median software developer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So no, one engineer casually using a model is not suddenly more expensive than a junior developer.&lt;br&gt;
That would be a silly headline.&lt;/p&gt;

&lt;p&gt;But a company-wide AI workflow absolutely can become more expensive than junior labor, very fast.&lt;br&gt;
And in some cases it already is.&lt;/p&gt;

&lt;p&gt;Five heavy AI seats can outrun a median administrative assistant.&lt;br&gt;
Ten can get uncomfortably close to, or exceed, what many companies would budget for an early-career developer.&lt;br&gt;
That is before you count observability, vector databases, eval pipelines, orchestration glue, and the humans still needed to check whether the machine did something stupid.&lt;/p&gt;

&lt;h2&gt;
  
  
  token costs are worse than salaries because they are less stable
&lt;/h2&gt;

&lt;p&gt;This is the part I think many executives still do not fully internalize.&lt;/p&gt;

&lt;p&gt;A salary is expensive, yes.&lt;br&gt;
But it is legible.&lt;/p&gt;

&lt;p&gt;Token spend is worse in one important way:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;you often do not know the real cost profile until after the workflow becomes popular.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A few reasons:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. output is where the pain lives
&lt;/h3&gt;

&lt;p&gt;A lot of people anchor on input pricing because it looks small.&lt;br&gt;
That is the wrong anchor.&lt;/p&gt;

&lt;p&gt;The expensive part is often output.&lt;br&gt;
Especially when models reason longer, explain more, retry more, or emit giant blobs of code and text at a verbosity nobody asked for.&lt;/p&gt;

&lt;p&gt;OpenAI GPT-5.4 is 6x more expensive on output than input.&lt;br&gt;
Claude Sonnet 4.6 is 5x more expensive on output than input.&lt;br&gt;
Gemini 2.5 Pro jumps hard on output too.&lt;/p&gt;

&lt;p&gt;So the team that says, “we only send a lot of context” is often missing the real bill.&lt;br&gt;
The bill usually shows up when the system starts talking back too much.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. the same work can suddenly tokenize differently
&lt;/h3&gt;

&lt;p&gt;Anthropic documents that Claude Opus 4.7 uses a new tokenizer that may consume &lt;strong&gt;up to 35% more tokens for the same fixed text&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That should make every finance person mildly uncomfortable.&lt;/p&gt;

&lt;p&gt;Imagine paying 35% more for the same semantic workload because the tokenizer changed.&lt;br&gt;
Not because your product changed.&lt;br&gt;
Not because customers changed.&lt;br&gt;
Just because the vendor changed how text gets counted.&lt;/p&gt;

&lt;p&gt;That is not labor-like.&lt;br&gt;
That is utility-bill-like.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. thresholds and modes quietly change the bill
&lt;/h3&gt;

&lt;p&gt;Gemini 2.5 Pro charges one rate for prompts up to 200k tokens and a higher one above that.&lt;br&gt;
Anthropic has regional multipliers and a fast mode with premium pricing.&lt;br&gt;
OpenAI offers batch discounts, but also a data residency premium.&lt;/p&gt;

&lt;p&gt;So even if the application behavior looks “the same” from the outside, the internal billing shape can move around because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompts got longer&lt;/li&gt;
&lt;li&gt;cache hit rates dropped&lt;/li&gt;
&lt;li&gt;a team enabled a faster mode&lt;/li&gt;
&lt;li&gt;a product shifted regions&lt;/li&gt;
&lt;li&gt;grounding or search got added&lt;/li&gt;
&lt;li&gt;the model started generating more output than last month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not predictable staffing.&lt;br&gt;
That is spend drift.&lt;/p&gt;
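
&lt;p&gt;To make the threshold effect concrete, here is a minimal sketch of a Gemini-style tier split, using the rates quoted earlier.&lt;br&gt;
The function is my illustration, not anyone’s actual billing code:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# hypothetical cost function for tiered pricing like Gemini 2.5 Pro:
# prompts above 200k tokens are billed at the higher rate
def tiered_cost(input_tokens: int, output_tokens: int) -&gt; float:
    if input_tokens &lt;= 200_000:
        in_rate, out_rate = 1.25, 10.00  # $/1M tokens, small-prompt tier
    else:
        in_rate, out_rate = 2.50, 15.00  # $/1M tokens, large-prompt tier
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# the same workload, nudged across the threshold:
print(tiered_cost(199_000, 8_000))  # ~$0.33
print(tiered_cost(201_000, 8_000))  # ~$0.62, nearly 2x for 2k extra tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;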

&lt;h3&gt;
  
  
  4. agents multiply hidden tokens
&lt;/h3&gt;

&lt;p&gt;This gets worse with agents.&lt;/p&gt;

&lt;p&gt;A normal chat interaction is one thing.&lt;br&gt;
An agent loop is another beast entirely.&lt;/p&gt;

&lt;p&gt;Now you are paying for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the original prompt&lt;/li&gt;
&lt;li&gt;tool schemas&lt;/li&gt;
&lt;li&gt;tool results&lt;/li&gt;
&lt;li&gt;chain-of-thought-adjacent reasoning budgets, depending on platform semantics&lt;/li&gt;
&lt;li&gt;retries&lt;/li&gt;
&lt;li&gt;file context&lt;/li&gt;
&lt;li&gt;summaries of prior turns&lt;/li&gt;
&lt;li&gt;review passes&lt;/li&gt;
&lt;li&gt;self-correction loops&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;People love saying “the agent did this task in 8 minutes.”&lt;br&gt;
Cool.&lt;br&gt;
What they often do not say is that the agent may have consumed the token equivalent of several ordinary interactions to get there.&lt;/p&gt;

&lt;p&gt;That means your marginal cost per useful result is often much blurrier than the dashboard suggests.&lt;/p&gt;
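
&lt;p&gt;One way to see it is to meter the whole loop instead of the final answer.&lt;br&gt;
A toy sketch, with made-up numbers shaped like a real agent run:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# toy accounting for one agent run: every loop iteration pays again
# for context, tool schemas, tool results, and new output
run_log = [
    # (input_tokens, output_tokens) per model call in one agent task
    (12_000, 1_500),  # initial prompt + tool schemas
    (18_000, 2_200),  # + tool results from step 1
    (26_000, 3_000),  # + accumulated context, retry after a bad call
    (31_000, 4_800),  # + review pass and final answer
]

IN_RATE, OUT_RATE = 2.50, 15.00  # $/1M tokens, output-heavy pricing

total = sum(i * IN_RATE + o * OUT_RATE for i, o in run_log) / 1_000_000
print(f"one 'quick' agent task: ${total:.2f}")
# the task looks like one interaction; the meter sees four,
# each one re-sending a growing context window
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;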

&lt;h2&gt;
  
  
  this does not mean “stop using AI”
&lt;/h2&gt;

&lt;p&gt;To be clear, I am not making the boomer argument here.&lt;/p&gt;

&lt;p&gt;I am not saying, “AI is too expensive, go back to doing everything manually.”&lt;br&gt;
That would be dumb.&lt;/p&gt;

&lt;p&gt;AI is real leverage.&lt;br&gt;
It is already useful.&lt;br&gt;
It can absolutely make a strong person much stronger.&lt;/p&gt;

&lt;p&gt;But I think companies need to stop treating token spend as if it were automatically better than human spend.&lt;/p&gt;

&lt;p&gt;Sometimes it is.&lt;br&gt;
Sometimes it is not.&lt;br&gt;
And sometimes it is only better if a human is still clearly in charge of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;scope&lt;/li&gt;
&lt;li&gt;review&lt;/li&gt;
&lt;li&gt;escalation&lt;/li&gt;
&lt;li&gt;quality control&lt;/li&gt;
&lt;li&gt;budget discipline&lt;/li&gt;
&lt;li&gt;model selection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The winning pattern is not “replace juniors with tokens.”&lt;br&gt;
The winning pattern is more like:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;use tokens to amplify good people, while good people remain the owners of correctness, cost, and consequences.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is a much more boring sentence.&lt;br&gt;
It is also the one that survives contact with finance.&lt;/p&gt;

&lt;h2&gt;
  
  
  my opinionated version
&lt;/h2&gt;

&lt;p&gt;I think a lot of AI adoption right now is being sold with the same bad habit we saw in early cloud conversations.&lt;/p&gt;

&lt;p&gt;People love the upside story.&lt;br&gt;
Nobody wants to dwell on the bill shape.&lt;/p&gt;

&lt;p&gt;So teams say things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“it is only a few dollars per million tokens”&lt;/li&gt;
&lt;li&gt;“the model is cheap enough”&lt;/li&gt;
&lt;li&gt;“we will optimize later”&lt;/li&gt;
&lt;li&gt;“let’s just let everyone use the best model for now”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is exactly how small variable costs become strategic costs.&lt;/p&gt;

&lt;p&gt;And unlike hiring, token spend can get uglier without any emotionally obvious moment.&lt;br&gt;
You do not interview a token.&lt;br&gt;
You do not onboard a token.&lt;br&gt;
You do not notice 14 small workflow expansions the same way you notice one new headcount request.&lt;/p&gt;

&lt;p&gt;That is why this category is dangerous.&lt;br&gt;
It slips past normal management instincts.&lt;/p&gt;

&lt;p&gt;You would debate a junior hire.&lt;br&gt;
You might not debate a bunch of “helpful” agent workflows until the invoice starts looking like a small payroll category.&lt;/p&gt;

&lt;h2&gt;
  
  
  what smart companies should do instead
&lt;/h2&gt;

&lt;p&gt;My recommendation is not anti-AI.&lt;br&gt;
It is anti-delusion.&lt;/p&gt;

&lt;p&gt;If you are serious about using models across the company, then do a few boring things early:&lt;/p&gt;

&lt;h3&gt;
  
  
  price workflows, not prompts
&lt;/h3&gt;

&lt;p&gt;Do not benchmark one cute demo request.&lt;br&gt;
Measure the full workflow:&lt;br&gt;
retries, context growth, tool calls, review passes, and average output length.&lt;/p&gt;
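
&lt;p&gt;In practice that means aggregating usage per workflow run, not per request.&lt;br&gt;
A minimal sketch of the idea; the event fields are illustrative, not from any specific vendor SDK:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# aggregate token usage per workflow run instead of per prompt
# the event shape here is a placeholder, not a real vendor schema
from collections import defaultdict

events = [
    {"run": "deploy-review-42", "kind": "prompt", "in": 9_000, "out": 1_200},
    {"run": "deploy-review-42", "kind": "retry", "in": 9_400, "out": 1_100},
    {"run": "deploy-review-42", "kind": "tool", "in": 15_000, "out": 600},
    {"run": "deploy-review-42", "kind": "review", "in": 21_000, "out": 2_500},
]

per_run = defaultdict(lambda: {"in": 0, "out": 0, "calls": 0})
for e in events:
    agg = per_run[e["run"]]
    agg["in"] += e["in"]
    agg["out"] += e["out"]
    agg["calls"] += 1

for run, agg in per_run.items():
    print(run, agg)  # the number to budget against is this, not one call
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;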

&lt;h3&gt;
  
  
  assign model tiers intentionally
&lt;/h3&gt;

&lt;p&gt;Not every task deserves the frontier model.&lt;br&gt;
Most companies are massively overpaying because they use the most expensive reasoning setup for work that could be routed to a cheaper model.&lt;/p&gt;

&lt;h3&gt;
  
  
  put humans on the acceptance boundary
&lt;/h3&gt;

&lt;p&gt;Do not use expensive models as a management substitute.&lt;br&gt;
If the output matters, a human should still own acceptance.&lt;br&gt;
Otherwise you are paying for generation and then paying again for the fallout.&lt;/p&gt;

&lt;h3&gt;
  
  
  treat token budgets like cloud budgets
&lt;/h3&gt;

&lt;p&gt;Tag them.&lt;br&gt;
Attribute them.&lt;br&gt;
Alert on them.&lt;br&gt;
Set hard ceilings where needed.&lt;/p&gt;

&lt;p&gt;Cloud taught us this already.&lt;br&gt;
Variable spend is only “efficient” when someone is actually watching it.&lt;/p&gt;
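
&lt;p&gt;A minimal sketch of what a hard ceiling can look like in code.&lt;br&gt;
The budgets and team names are placeholders; the point is that the check runs before the call, not during next month’s invoice review:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# hedged sketch: per-team monthly token budget with a hard ceiling
BUDGETS = {"support-drafting": 1_500.00, "code-agents": 4_000.00}  # $/month
spent = {"support-drafting": 1_480.25, "code-agents": 900.10}

class BudgetExceeded(Exception):
    pass

def charge(team: str, estimated_cost: float) -&gt; None:
    """Refuse the call if it would push the team past its ceiling."""
    if spent[team] + estimated_cost &gt; BUDGETS[team]:
        raise BudgetExceeded(f"{team} would exceed ${BUDGETS[team]:,.2f}")
    spent[team] += estimated_cost

charge("code-agents", 12.50)  # fine, well under the ceiling
try:
    charge("support-drafting", 25.00)  # would breach the ceiling
except BudgetExceeded as err:
    print("blocked:", err)  # alert and refuse instead of silently spending
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;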

&lt;h3&gt;
  
  
  optimize for controlled leverage
&lt;/h3&gt;

&lt;p&gt;The right comparison is not “AI versus humans.”&lt;br&gt;
It is “AI plus one good human versus the old way of working.”&lt;/p&gt;

&lt;p&gt;That framing usually leads to better architecture and more honest economics.&lt;/p&gt;

&lt;h2&gt;
  
  
  my take
&lt;/h2&gt;

&lt;p&gt;Tokens are still useful.&lt;br&gt;
Sometimes incredibly useful.&lt;/p&gt;

&lt;p&gt;But they are no longer a cute rounding error.&lt;br&gt;
And they are definitely not predictable enough to treat as a harmless software snack.&lt;/p&gt;

&lt;p&gt;For many teams, token spend is becoming a real labor-adjacent budget category.&lt;br&gt;
In some workflows it is already expensive enough to beat junior human cost.&lt;br&gt;
In many more, it is at least expensive enough that the comparison should happen before the rollout, not after the invoice.&lt;/p&gt;

&lt;p&gt;So no, I would not stop using AI.&lt;/p&gt;

&lt;p&gt;I would just stop pretending that tokens are magically cheaper than people.&lt;br&gt;
They are often cheaper than some kinds of work.&lt;br&gt;
That is different.&lt;/p&gt;

&lt;p&gt;And unlike people, tokens come with a billing model that can change under your feet, a cost profile that explodes with usage patterns, and a nasty habit of looking cheap right until they are not.&lt;/p&gt;

&lt;p&gt;That is why my current default is simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;use AI aggressively, but never let the token budget operate without adult supervision.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;OpenAI, &lt;em&gt;API Pricing&lt;/em&gt; — &lt;a href="https://openai.com/api/pricing/" rel="noopener noreferrer"&gt;https://openai.com/api/pricing/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic, &lt;em&gt;Claude pricing&lt;/em&gt; — &lt;a href="https://docs.anthropic.com/en/docs/about-claude/pricing" rel="noopener noreferrer"&gt;https://docs.anthropic.com/en/docs/about-claude/pricing&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Google, &lt;em&gt;Gemini Developer API pricing&lt;/em&gt; — &lt;a href="https://ai.google.dev/gemini-api/docs/pricing" rel="noopener noreferrer"&gt;https://ai.google.dev/gemini-api/docs/pricing&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;U.S. Bureau of Labor Statistics, &lt;em&gt;Software Developers, Quality Assurance Analysts, and Testers&lt;/em&gt; — &lt;a href="https://www.bls.gov/ooh/computer-and-information-technology/software-developers.htm" rel="noopener noreferrer"&gt;https://www.bls.gov/ooh/computer-and-information-technology/software-developers.htm&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;U.S. Bureau of Labor Statistics, &lt;em&gt;Secretaries and Administrative Assistants&lt;/em&gt; — &lt;a href="https://www.bls.gov/ooh/office-and-administrative-support/secretaries-and-administrative-assistants.htm" rel="noopener noreferrer"&gt;https://www.bls.gov/ooh/office-and-administrative-support/secretaries-and-administrative-assistants.htm&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>opinion</category>
      <category>devops</category>
    </item>
    <item>
      <title>controller staleness is the hidden tax of platform automation</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Fri, 01 May 2026 00:02:16 +0000</pubDate>
      <link>https://dev.to/pvgomes/controller-staleness-is-the-hidden-tax-of-platform-automation-45e</link>
      <guid>https://dev.to/pvgomes/controller-staleness-is-the-hidden-tax-of-platform-automation-45e</guid>
      <description>&lt;p&gt;I think a lot of platform engineering discourse still has a very annoying habit.&lt;/p&gt;

&lt;p&gt;We keep treating automation as if the main risk is not having enough of it.&lt;/p&gt;

&lt;p&gt;Not enough controllers.&lt;br&gt;
Not enough reconcilers.&lt;br&gt;
Not enough policy engines.&lt;br&gt;
Not enough workflows.&lt;br&gt;
Not enough AI copilots orchestrating the orchestrators.&lt;/p&gt;

&lt;p&gt;And sure, sometimes that is true.&lt;br&gt;
But once a system gets a bit serious, the failure mode changes.&lt;br&gt;
The problem is usually not that you lack automation.&lt;br&gt;
The problem is that you now have automation making decisions from a stale mental model of reality.&lt;/p&gt;

&lt;p&gt;That is why the Kubernetes v1.36 work on &lt;strong&gt;staleness mitigation and observability for controllers&lt;/strong&gt; is more important than it sounds.&lt;br&gt;
It is not just a controller-author quality-of-life improvement.&lt;br&gt;
It is a small but very clear signal about the next platform pain point.&lt;/p&gt;

&lt;p&gt;My take is simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;controller staleness is the hidden tax of platform automation, and the more teams automate, the more expensive that tax gets.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  automation is only smart if its view of the world is fresh enough
&lt;/h2&gt;

&lt;p&gt;A lot of infrastructure automation depends on a pretty fragile assumption:&lt;br&gt;
that the thing making a decision is acting on an acceptably current view of the system.&lt;/p&gt;

&lt;p&gt;That sounds obvious when you say it out loud.&lt;br&gt;
But a surprising amount of platform logic quietly assumes it anyway.&lt;/p&gt;

&lt;p&gt;Controllers watch resources, build a cached view of cluster state, and then reconcile toward some desired outcome.&lt;br&gt;
That model is powerful because it scales much better than constant direct reads.&lt;br&gt;
It is also exactly where the subtle bugs show up.&lt;/p&gt;

&lt;p&gt;Kubernetes described the problem pretty bluntly in the v1.36 post: controller staleness can lead to controllers taking incorrect actions, often because the author made assumptions that only fail once the cache falls behind reality.&lt;br&gt;
And that is the nasty part.&lt;br&gt;
These issues often do not look dramatic at first.&lt;br&gt;
They look like occasional weirdness.&lt;br&gt;
A duplicate action here.&lt;br&gt;
A delayed correction there.&lt;br&gt;
A reconciliation loop that technically succeeds while doing the wrong thing for a few minutes.&lt;/p&gt;

&lt;p&gt;That is why staleness is such a good platform topic.&lt;br&gt;
It sits right in the uncomfortable zone between “works fine in normal demos” and “causes expensive production behavior.”&lt;/p&gt;

&lt;h2&gt;
  
  
  the hard part of automation is not execution. it is timing and truth
&lt;/h2&gt;

&lt;p&gt;I think this is where a lot of modern platform thinking gets too romantic.&lt;/p&gt;

&lt;p&gt;People love the idea of automated systems because automated systems feel decisive.&lt;br&gt;
A desired state exists, a controller sees drift, the controller corrects it, everyone goes home happy.&lt;/p&gt;

&lt;p&gt;Real life is more annoying.&lt;/p&gt;

&lt;p&gt;In real systems, automation is constantly negotiating with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;partial visibility&lt;/li&gt;
&lt;li&gt;event delays&lt;/li&gt;
&lt;li&gt;retries&lt;/li&gt;
&lt;li&gt;caches&lt;/li&gt;
&lt;li&gt;race conditions&lt;/li&gt;
&lt;li&gt;eventual consistency&lt;/li&gt;
&lt;li&gt;competing controllers&lt;/li&gt;
&lt;li&gt;humans making changes at inconvenient times&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the real challenge is not only “can the system act?”&lt;br&gt;
It is “can the system act based on a trustworthy-enough view of reality?”&lt;/p&gt;

&lt;p&gt;That distinction matters a lot.&lt;br&gt;
Because if your automation gets stronger while your freshness guarantees stay fuzzy, you are not really scaling trust.&lt;br&gt;
You are scaling the blast radius of outdated assumptions.&lt;/p&gt;

&lt;p&gt;That is the hidden tax.&lt;br&gt;
Not the compute bill.&lt;br&gt;
Not the YAML sprawl.&lt;br&gt;
The cognitive and operational cost of having more autonomous behavior than your observability and consistency model can safely support.&lt;/p&gt;

&lt;h2&gt;
  
  
  this is not just a kubernetes problem
&lt;/h2&gt;

&lt;p&gt;Kubernetes controllers make the issue easy to see, but the pattern is much broader.&lt;/p&gt;

&lt;p&gt;You can find the same shape everywhere now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;internal platform workflows acting on lagging state from APIs&lt;/li&gt;
&lt;li&gt;cost automation reacting to yesterday’s data as if it were real time&lt;/li&gt;
&lt;li&gt;deployment systems assuming their inventory view is current when it is already drifting&lt;/li&gt;
&lt;li&gt;security automation revoking or granting based on incomplete propagation&lt;/li&gt;
&lt;li&gt;AI agents chaining actions across tools with a stale understanding of what the previous step actually changed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last one is where this gets especially relevant.&lt;br&gt;
A lot of "agentic" demos look impressive because they show automation doing more steps.&lt;br&gt;
Very few of them spend enough time on whether the agent is acting on fresh, verified state between steps.&lt;/p&gt;

&lt;p&gt;Honestly, that is why I keep being skeptical of the shallow version of AI platform enthusiasm.&lt;br&gt;
We are adding more decision-making loops into systems that already struggle with stale state in much simpler automation.&lt;br&gt;
The problem does not disappear because the interface got friendlier.&lt;br&gt;
It usually gets harder to see.&lt;/p&gt;

&lt;h2&gt;
  
  
  observability for controllers is really observability for trust
&lt;/h2&gt;

&lt;p&gt;One thing I like about the Kubernetes v1.36 direction here is that it treats staleness as something you should not just tolerate silently.&lt;br&gt;
You should be able to detect it, reason about it, and design around it.&lt;/p&gt;

&lt;p&gt;That sounds small.&lt;br&gt;
It is not.&lt;/p&gt;

&lt;p&gt;A lot of platform incidents happen because the system was technically doing what it was built to do, but under conditions the builders were not properly measuring.&lt;br&gt;
A stale controller is a great example.&lt;br&gt;
The logic might be correct.&lt;br&gt;
The intent might be correct.&lt;br&gt;
The action might still be wrong because the world moved and the automation did not notice fast enough.&lt;/p&gt;

&lt;p&gt;That means the observability question is bigger than metrics trivia.&lt;br&gt;
It is really a trust question:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how stale can this controller become before its actions are unsafe?&lt;/li&gt;
&lt;li&gt;which reconciliations depend on fresh reads versus eventually consistent cache views?&lt;/li&gt;
&lt;li&gt;where are we assuming ordering that the platform does not really guarantee?&lt;/li&gt;
&lt;li&gt;which automation loops should refuse to act when their view of state is too old?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the grown-up version of platform automation.&lt;br&gt;
Not “make it autonomous and hope.”&lt;br&gt;
More like “make it autonomous inside clearly observed truth boundaries.”&lt;/p&gt;
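
&lt;p&gt;To make that last question less abstract, here is a minimal sketch of a refuse-to-act-on-stale-state guard.&lt;br&gt;
This is my illustration of the principle, not the actual Kubernetes v1.36 API:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# illustrative staleness guard for a reconcile loop, not Kubernetes API code
import time

MAX_STALENESS_SECONDS = 30  # freshness budget this loop is allowed to act on

def reconcile(cached_state: dict) -&gt; None:
    age = time.time() - cached_state["observed_at"]
    if age &gt; MAX_STALENESS_SECONDS:
        # safe no-op: requeue and wait for a fresher view instead of
        # acting confidently on a world that may no longer exist
        print(f"skipping: cached view is {age:.0f}s old, requeueing")
        return
    apply_desired_state(cached_state)  # normal reconciliation path

def apply_desired_state(state: dict) -&gt; None:
    print("acting on state observed at", state["observed_at"])

reconcile({"observed_at": time.time() - 120})  # refuses: too stale
reconcile({"observed_at": time.time() - 5})    # proceeds: fresh enough
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;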

&lt;h2&gt;
  
  
  platform teams should think less about magic and more about control surfaces
&lt;/h2&gt;

&lt;p&gt;This is also why I think the most valuable platform engineering work right now is weirdly unglamorous.&lt;/p&gt;

&lt;p&gt;Not the giant internal developer portal launch.&lt;br&gt;
Not the seventh wrapper around LLM tool invocation.&lt;br&gt;
Not the architectural diagram where every box sounds intelligent.&lt;/p&gt;

&lt;p&gt;The valuable work is often things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;defining where freshness matters more than throughput&lt;/li&gt;
&lt;li&gt;making state lag visible before it becomes user-visible damage&lt;/li&gt;
&lt;li&gt;deciding which control loops need hard safeguards&lt;/li&gt;
&lt;li&gt;building reconciliation logic that can prove it is acting on current-enough information&lt;/li&gt;
&lt;li&gt;teaching teams that “eventually consistent” is not a decorative phrase&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not as sexy as talking about fully autonomous platforms.&lt;br&gt;
But it is much closer to what keeps systems from becoming haunted.&lt;/p&gt;

&lt;p&gt;And yes, I said haunted.&lt;br&gt;
Because stale automation has exactly that vibe.&lt;br&gt;
Something changed.&lt;br&gt;
Some controller noticed too late.&lt;br&gt;
Another system reacted to the wrong intermediate state.&lt;br&gt;
And now everyone is trying to explain why the system behaved like it believed in ghosts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia.tenor.com%2F1T2mQK4h5vAAAAAC%2Fconfused-math.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia.tenor.com%2F1T2mQK4h5vAAAAAC%2Fconfused-math.gif" alt="haunted automation energy" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  more automation means more responsibility to constrain automation
&lt;/h2&gt;

&lt;p&gt;I think this is the part many teams still underestimate.&lt;/p&gt;

&lt;p&gt;When you increase automation, you do not only gain leverage.&lt;br&gt;
You also take on a stronger obligation to define the conditions under which that automation is trustworthy.&lt;/p&gt;

&lt;p&gt;That means automation design has to include things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;freshness assumptions&lt;/li&gt;
&lt;li&gt;backoff behavior&lt;/li&gt;
&lt;li&gt;conflict handling&lt;/li&gt;
&lt;li&gt;idempotency&lt;/li&gt;
&lt;li&gt;safe no-op conditions&lt;/li&gt;
&lt;li&gt;clear refusal modes when state confidence is too low&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is one reason I think platform engineering is slowly becoming less about tooling assembly and more about operational philosophy.&lt;br&gt;
What do we allow the machine to do automatically?&lt;br&gt;
Under what evidence?&lt;br&gt;
With what rollback path?&lt;br&gt;
With what visibility?&lt;/p&gt;

&lt;p&gt;Those are not secondary implementation details anymore.&lt;br&gt;
They are the real product decisions of the platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  my take
&lt;/h2&gt;

&lt;p&gt;The Kubernetes controller staleness work matters because it highlights a problem that a lot of modern infrastructure is about to feel more sharply.&lt;/p&gt;

&lt;p&gt;As platforms add more controllers, more policy engines, more automation layers, and more AI-shaped orchestration, the scarce resource is not only compute or developer time.&lt;br&gt;
It is &lt;strong&gt;trustworthy system awareness&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If the automation loop cannot see reality clearly enough, then adding more automation does not reliably create more control.&lt;br&gt;
Sometimes it just creates faster confusion.&lt;/p&gt;

&lt;p&gt;That is why I think controller staleness is the hidden tax of platform automation.&lt;br&gt;
It is the price teams pay when automated systems are allowed to act with more confidence than their view of the world deserves.&lt;/p&gt;

&lt;p&gt;The next generation of strong platform teams will not just ask, “what can we automate?”&lt;br&gt;
They will ask a better question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;how fresh does the truth need to be before we let the machine touch anything important?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is a much less flashy question.&lt;br&gt;
And a much more useful one.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Kubernetes, &lt;em&gt;Kubernetes v1.36: Staleness Mitigation and Observability for Controllers&lt;/em&gt; — &lt;a href="https://kubernetes.io/blog/2026/04/28/kubernetes-v1-36-staleness-mitigation-for-controllers/" rel="noopener noreferrer"&gt;https://kubernetes.io/blog/2026/04/28/kubernetes-v1-36-staleness-mitigation-for-controllers/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Kubernetes, &lt;em&gt;Gateway API v1.5: Moving features to Stable&lt;/em&gt; — &lt;a href="https://kubernetes.io/blog/2026/04/24/gateway-api-v1-5/" rel="noopener noreferrer"&gt;https://kubernetes.io/blog/2026/04/24/gateway-api-v1-5/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Martin Fowler, &lt;em&gt;Structured-Prompt-Driven Development (SPDD)&lt;/em&gt; — &lt;a href="https://martinfowler.com/articles/structured-prompt-driven/" rel="noopener noreferrer"&gt;https://martinfowler.com/articles/structured-prompt-driven/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>platformengineering</category>
      <category>automation</category>
      <category>ai</category>
    </item>
    <item>
      <title>your second brain should not be a folder full of markdown</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Thu, 30 Apr 2026 15:02:24 +0000</pubDate>
      <link>https://dev.to/pvgomes/your-second-brain-should-not-be-a-folder-full-of-markdown-ka0</link>
      <guid>https://dev.to/pvgomes/your-second-brain-should-not-be-a-folder-full-of-markdown-ka0</guid>
      <description>&lt;p&gt;I like markdown.&lt;br&gt;
I really do.&lt;/p&gt;

&lt;p&gt;Markdown is simple, portable, git-friendly, easy to back up, and great for writing.&lt;br&gt;
But I also think a lot of “second brain” tools quietly fall apart at the exact moment they are supposed to become useful.&lt;/p&gt;

&lt;p&gt;They work nicely while your memory is small.&lt;br&gt;
Then one day you want to find that one thing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the exact hour you went to the gym three Tuesdays ago&lt;/li&gt;
&lt;li&gt;the bakery you liked in a city you visited once&lt;/li&gt;
&lt;li&gt;the detailed advice a friend gave you in a long conversation six months ago&lt;/li&gt;
&lt;li&gt;that random insight you had during a walk and saved from your phone&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And suddenly your “second brain” is just a polite pile of files.&lt;/p&gt;

&lt;p&gt;That is the moment where I think most markdown-first memory systems reveal their real limitation:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;they are optimized for storing text, not for retrieving memory.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is why I find &lt;a href="https://github.com/brazanation/jurupari" rel="noopener noreferrer"&gt;Jurupari&lt;/a&gt; interesting.&lt;br&gt;
Not because it adds more note-taking ceremony.&lt;br&gt;
Quite the opposite.&lt;br&gt;
Because it strips the idea down to what actually matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;store memories in a real database&lt;/li&gt;
&lt;li&gt;search them semantically&lt;/li&gt;
&lt;li&gt;expose them through MCP and HTTP&lt;/li&gt;
&lt;li&gt;let any AI tool read and write them for you&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is much closer to what a real second brain should be.&lt;/p&gt;
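
&lt;p&gt;Roughly, the shape of it over HTTP.&lt;br&gt;
The endpoint paths and payload fields below are my guesses for illustration, not copied from the Jurupari API, so check the README before wiring anything up:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# hypothetical usage sketch; endpoint paths and field names are assumptions,
# not taken from the actual Jurupari API
import requests

BASE = "https://jurupari.example.com"  # wherever you deployed it
HEADERS = {"Authorization": "Bearer YOUR_JURUPARI_TOKEN"}

# write a memory
requests.post(f"{BASE}/memories", headers=HEADERS, json={
    "content": "I went to the gym today at 07:10.",
})

# search it semantically later
resp = requests.get(f"{BASE}/memories/search", headers=HEADERS,
                    params={"q": "last morning gym session"})
print(resp.json())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
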
&lt;h2&gt;
  
  
  the problem with markdown second brains
&lt;/h2&gt;

&lt;p&gt;A folder full of notes feels smart at first.&lt;br&gt;
Engineers especially love it because it feels open and under control.&lt;br&gt;
No vendor lock-in, no weird proprietary format, just files.&lt;/p&gt;

&lt;p&gt;I get the appeal.&lt;br&gt;
I have that instinct too.&lt;/p&gt;

&lt;p&gt;But once memory stops being “a few documents I can browse manually” and becomes “an extension of my day-to-day thinking,” files start getting awkward.&lt;/p&gt;

&lt;p&gt;The problem is not that markdown is bad.&lt;br&gt;
The problem is that &lt;strong&gt;memory retrieval is a search problem&lt;/strong&gt;, and search gets much better when you treat it like a database problem instead of a filesystem hobby.&lt;/p&gt;

&lt;p&gt;If your idea of memory is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;searchable history&lt;/li&gt;
&lt;li&gt;timeline fragments&lt;/li&gt;
&lt;li&gt;personal facts&lt;/li&gt;
&lt;li&gt;recurring preferences&lt;/li&gt;
&lt;li&gt;conversation details&lt;/li&gt;
&lt;li&gt;lightweight journaling&lt;/li&gt;
&lt;li&gt;structured and unstructured recall&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...then embedded search plus a proper data model beats filename gymnastics every time.&lt;/p&gt;
&lt;h2&gt;
  
  
  what jurupari gets right
&lt;/h2&gt;

&lt;p&gt;Jurupari is basically a very simple personal knowledge base with the right primitives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt; for storage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pgvector&lt;/strong&gt; for semantic search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP&lt;/strong&gt; so AI tools can use it directly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP API&lt;/strong&gt; for direct integration and automation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CRUD support&lt;/strong&gt;, not just retrieval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last part matters a lot.&lt;br&gt;
A lot of “memory” integrations are glorified search adapters.&lt;br&gt;
They can retrieve context, maybe rank snippets, maybe inject them into a prompt.&lt;br&gt;
But they cannot really behave like a durable memory system because writing is awkward or missing.&lt;/p&gt;

&lt;p&gt;Jurupari fixes that.&lt;/p&gt;

&lt;p&gt;With MCP in front of it, memory stops being a manual note-taking ritual and becomes something much more natural:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Hey, save this on my Jurupari memory.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the right abstraction.&lt;br&gt;
I do not want to stop what I am doing, open another app, decide on a folder, decide on a title, decide on tags, and become my own archivist.&lt;br&gt;
I want memory capture to be cheap.&lt;/p&gt;

&lt;p&gt;If the system is good, I should be able to talk to Claude, GPT, OpenClaw, Hermes, n8n, or any other MCP-capable tool and say:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;save this&lt;/li&gt;
&lt;li&gt;find that&lt;/li&gt;
&lt;li&gt;update this&lt;/li&gt;
&lt;li&gt;remove that&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a second brain.&lt;br&gt;
Not a graveyard of notes.&lt;/p&gt;
&lt;h2&gt;
  
  
  semantic search is the real feature
&lt;/h2&gt;

&lt;p&gt;The real power here is not “you can store notes in Postgres.”&lt;br&gt;
That part is almost boring.&lt;/p&gt;

&lt;p&gt;The real feature is that semantic search changes how you interact with memory.&lt;/p&gt;

&lt;p&gt;You do not need to remember the exact words you used.&lt;br&gt;
You just need to remember what you meant.&lt;/p&gt;

&lt;p&gt;That is a huge difference.&lt;/p&gt;

&lt;p&gt;A filesystem usually rewards perfect recall:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;correct filename&lt;/li&gt;
&lt;li&gt;correct folder&lt;/li&gt;
&lt;li&gt;correct keyword&lt;/li&gt;
&lt;li&gt;correct tagging habit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A semantic memory system rewards approximate recall:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“find that thing I wrote about feeling tired after leg day”&lt;/li&gt;
&lt;li&gt;“what was the coffee place I liked near the station?”&lt;/li&gt;
&lt;li&gt;“search my memory for that conversation about changing jobs”&lt;/li&gt;
&lt;li&gt;“what did I say last month about sleep quality?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is much closer to how human memory actually works.&lt;/p&gt;
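
&lt;p&gt;Under the hood, approximate recall is roughly this query shape.&lt;br&gt;
A hedged sketch assuming a pgvector table; the table and column names are mine, not Jurupari’s actual schema:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# hedged sketch of semantic recall with pgvector; table and column
# names are illustrative, not Jurupari's actual schema
import psycopg  # psycopg 3

def search_memories(conn, query_embedding: list[float], limit: int = 5):
    # pgvector accepts a "[x,y,z]" text literal cast to ::vector, and
    # "&lt;=&gt;" is its cosine-distance operator, so smallest = closest meaning
    vec = "[" + ",".join(str(x) for x in query_embedding) + "]"
    with conn.cursor() as cur:
        cur.execute(
            "SELECT content FROM memories "
            "ORDER BY embedding &lt;=&gt; %s::vector LIMIT %s",
            (vec, limit),
        )
        return [row[0] for row in cur.fetchall()]

# usage, with embed() standing in for whatever embedding call you use:
# conn = psycopg.connect("postgresql://...")
# search_memories(conn, embed("coffee place near the station"))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
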
&lt;h2&gt;
  
  
  this is where a second brain becomes actually useful
&lt;/h2&gt;

&lt;p&gt;A lot of “second brain” marketing is weirdly grandiose.&lt;br&gt;
It talks like you are building a digital philosopher king inside your laptop.&lt;br&gt;
I do not think that is the useful framing.&lt;/p&gt;

&lt;p&gt;The useful framing is much simpler:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;your memory gets more valuable when it becomes easy to save and easy to find.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That means very normal things suddenly become worth recording.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;
&lt;h3&gt;
  
  
  1. everyday activity logging
&lt;/h3&gt;

&lt;p&gt;You want to remember what time you did something.&lt;br&gt;
Not because it is deep or poetic, but because reality is slippery.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what time did I go to the gym?&lt;/li&gt;
&lt;li&gt;when did I stop by the bakery?&lt;/li&gt;
&lt;li&gt;what time did I take the dog out?&lt;/li&gt;
&lt;li&gt;when did I last call my parents?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Prompt examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Save this on my Jurupari memory: I went to the gym today at 07:10.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Save this memory: I went to the bakery at 08:35 and bought sourdough and two pastéis de nata.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Search my Jurupari memory for the last time I went to the gym in the morning.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. detailed conversations
&lt;/h3&gt;

&lt;p&gt;Sometimes the most useful thing to remember is not a task.&lt;br&gt;
It is context.&lt;/p&gt;

&lt;p&gt;Maybe a friend told you something important.&lt;br&gt;
Maybe you had a subtle conversation with your partner.&lt;br&gt;
Maybe someone gave you advice that only makes sense when you preserve the detail.&lt;/p&gt;

&lt;p&gt;Prompt examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Save this on my Jurupari memory: today I talked to Daniel for an hour. He said he is thinking about leaving his job because the team structure changed, he feels blocked by management, and he wants to move closer to product strategy.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Find my memory about Daniel thinking of leaving his job.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is way better than hoping you named the note &lt;code&gt;career-chat-daniel-maybe-job-change-final-final.md&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. journal entries that you can actually recover
&lt;/h3&gt;

&lt;p&gt;This is the part I like most.&lt;br&gt;
Jurupari can work like a journal, but not the kind of journal you write and then lose inside your own archive.&lt;/p&gt;

&lt;p&gt;You can keep small fragments of life:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what made you anxious today&lt;/li&gt;
&lt;li&gt;what went well this week&lt;/li&gt;
&lt;li&gt;a lesson from a hard conversation&lt;/li&gt;
&lt;li&gt;a small win you want to remember&lt;/li&gt;
&lt;li&gt;a pattern you are noticing in your energy, habits, or mood&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Prompt examples:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Save this memory: I felt unusually focused today after sleeping 8 hours and going for a 20-minute walk before work.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Search my memory for patterns involving focus, sleep, and walking.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is where the “second brain” idea stops being branding and starts becoming practical.&lt;/p&gt;

&lt;h2&gt;
  
  
  mcp is what makes this feel native instead of bolted on
&lt;/h2&gt;

&lt;p&gt;The reason this gets much more interesting now than a few years ago is MCP.&lt;/p&gt;

&lt;p&gt;Without MCP, a memory system is usually another app you have to remember to use.&lt;br&gt;
With MCP, memory becomes part of the interface layer of your AI tools.&lt;/p&gt;

&lt;p&gt;That changes behavior.&lt;/p&gt;

&lt;p&gt;Instead of thinking:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I should go open my note system and save this.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You think:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Hey, save this.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is a much lower-friction action.&lt;br&gt;
And low friction is everything for personal memory systems.&lt;br&gt;
Because the best memory tool is not the one with the fanciest graph view.&lt;br&gt;
It is the one you actually keep feeding.&lt;/p&gt;

&lt;p&gt;Jurupari is especially nice here because it is not trying to trap you inside one product surface.&lt;br&gt;
You can plug it into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude&lt;/li&gt;
&lt;li&gt;GPT&lt;/li&gt;
&lt;li&gt;OpenClaw&lt;/li&gt;
&lt;li&gt;Hermes&lt;/li&gt;
&lt;li&gt;n8n&lt;/li&gt;
&lt;li&gt;other MCP-capable tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the memory follows your workflow instead of demanding a new one.&lt;/p&gt;
&lt;h2&gt;
  
  
  the real second brain is writable
&lt;/h2&gt;

&lt;p&gt;I think this is the most underrated idea in the whole space.&lt;/p&gt;

&lt;p&gt;A real second brain cannot be read-only.&lt;/p&gt;

&lt;p&gt;If an AI can search your memory but cannot update it, correct it, append to it, or save new facts when you ask, then it is not really your second brain.&lt;br&gt;
It is just a retrieval plugin.&lt;/p&gt;

&lt;p&gt;Jurupari exposing CRUD through MCP is the important design choice.&lt;br&gt;
That is what makes these flows possible:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Save this on my Jurupari memory: the plumber said he will come on Friday between 14:00 and 16:00.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Update that memory: the plumber moved it to Saturday at 10:30.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Delete the duplicate note about the bakery. Keep the one with the exact time.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That sounds small, but it is the difference between “search over notes” and “persistent memory you can manage conversationally.”&lt;/p&gt;

&lt;h2&gt;
  
  
  how to run your own jurupari
&lt;/h2&gt;

&lt;p&gt;The nice part is that this is not some giant infrastructure project.&lt;br&gt;
The repo is refreshingly direct.&lt;/p&gt;

&lt;p&gt;At a high level, the setup is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;deploy Jurupari somewhere you like&lt;/li&gt;
&lt;li&gt;point it at a PostgreSQL database with pgvector&lt;/li&gt;
&lt;li&gt;set your environment variables&lt;/li&gt;
&lt;li&gt;run the API&lt;/li&gt;
&lt;li&gt;expose MCP so your AI tools can connect to it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;From the project README, the local dev flow is basically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cp&lt;/span&gt; .env.example .env
&lt;span class="c"&gt;# fill DATABASE_URL, OPENAI_API_KEY, JURUPARI_TOKEN&lt;/span&gt;

docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
pnpm &lt;span class="nb"&gt;install
&lt;/span&gt;pnpm db:push
pnpm dev:api
pnpm &lt;span class="nt"&gt;--filter&lt;/span&gt; @jurupari/mcp build &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; node packages/mcp/dist/index.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And if you want a simple hosted version, the project explicitly mentions deployment on places like &lt;strong&gt;AWS&lt;/strong&gt; or &lt;strong&gt;Railway&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The mental model is straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Postgres + pgvector&lt;/strong&gt; stores and indexes your memory&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;the API&lt;/strong&gt; gives you direct application access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;the MCP server&lt;/strong&gt; lets AI clients talk to the memory naturally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;the token model&lt;/strong&gt; controls read/write access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is also a nice split between remote and local MCP setups:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Remote SSE&lt;/strong&gt; for web clients and remote integrations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local stdio&lt;/strong&gt; for tools like Claude Desktop, Claude Code, or Cursor&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means you can choose convenience or locality depending on your setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  who this is for
&lt;/h2&gt;

&lt;p&gt;I think Jurupari makes the most sense for people who:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;already use AI tools every day&lt;/li&gt;
&lt;li&gt;are tired of fragmented personal context&lt;/li&gt;
&lt;li&gt;want memory to be available across tools&lt;/li&gt;
&lt;li&gt;prefer owning their own stack&lt;/li&gt;
&lt;li&gt;understand that retrieval quality matters more than note aesthetics&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Especially engineers.&lt;br&gt;
Because engineers often over-romanticize plain files and under-invest in retrieval.&lt;/p&gt;

&lt;p&gt;I say that with love.&lt;br&gt;
We do this all the time.&lt;br&gt;
We will build a beautiful directory tree and call it knowledge management, then act surprised when finding anything becomes annoying.&lt;/p&gt;

&lt;h2&gt;
  
  
  my take
&lt;/h2&gt;

&lt;p&gt;If you want a writing system, markdown is still fantastic.&lt;br&gt;
If you want a durable searchable memory that can live behind your favorite AI tools, markdown folders are usually the wrong center of gravity.&lt;/p&gt;

&lt;p&gt;That is why I think Jurupari is a much more honest version of the “second brain” idea.&lt;/p&gt;

&lt;p&gt;It does not pretend memory is about collecting pretty notes.&lt;br&gt;
It treats memory like what it actually becomes at scale:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a search problem&lt;/li&gt;
&lt;li&gt;a retrieval problem&lt;/li&gt;
&lt;li&gt;a write problem&lt;/li&gt;
&lt;li&gt;a data-model problem&lt;/li&gt;
&lt;li&gt;an interface problem&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And once you see it that way, the architecture becomes obvious.&lt;/p&gt;

&lt;p&gt;Use a real database.&lt;br&gt;
Use semantic search.&lt;br&gt;
Expose CRUD.&lt;br&gt;
Plug it into the tools you already talk to.&lt;/p&gt;

&lt;p&gt;That is much closer to a real second brain than a synced folder full of markdown will ever be.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Jurupari GitHub repository — &lt;a href="https://github.com/brazanation/jurupari" rel="noopener noreferrer"&gt;https://github.com/brazanation/jurupari&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Jurupari README — &lt;a href="https://raw.githubusercontent.com/brazanation/jurupari/main/README.md" rel="noopener noreferrer"&gt;https://raw.githubusercontent.com/brazanation/jurupari/main/README.md&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>opinion</category>
      <category>devops</category>
    </item>
    <item>
      <title>Do you know about the AI transparency index? You should</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Thu, 30 Apr 2026 10:00:12 +0000</pubDate>
      <link>https://dev.to/pvgomes/ai-transparency-index-on-pvgomescom-2p1k</link>
      <guid>https://dev.to/pvgomes/ai-transparency-index-on-pvgomescom-2p1k</guid>
      <description>&lt;h3&gt;
  
  
  the AI transparency index numbers are uncomfortable, and you should know them anyway
&lt;/h3&gt;

&lt;p&gt;stanford's &lt;a href="https://crfm.stanford.edu/fmti/December-2025/index.html" rel="noopener noreferrer"&gt;foundation model transparency index&lt;/a&gt; dropped its december 2025 edition and if you build anything on top of these models, you should probably read it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;the mean score dropped 17 points.&lt;/strong&gt; from 58 to 41. meta down 29, mistral down 37, openai down 14. this is not a documentation problem — these companies have entire policy teams. it's a choice.&lt;/p&gt;

&lt;p&gt;a few things that stood out to me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ibm scored 95.&lt;/strong&gt; first place across all three years. nobody talks about this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;open-weight ≠ transparent.&lt;/strong&gt; deepseek and alibaba release weights and still scored 32 and 26. publishing weights is not the same as being auditable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;training data is still a black box everywhere.&lt;/strong&gt; what they trained on, whether they had licenses, how they handled pii — consistently the worst-scoring subdomain, three years running.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;anthropic didn't submit a report.&lt;/strong&gt; the fmti team built one manually. anthropic ranked 2nd. good score, bad signal.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;as engineers we're the ones building on top of these systems. when something goes wrong in production, "we didn't disclose how we trained it" is not an answer you can give anyone.&lt;/p&gt;

&lt;p&gt;the index doesn't fix that. but it names who's trying to be honest versus who's retreating as market share grows. that's useful signal when choosing what to build on.&lt;/p&gt;

&lt;h3&gt;
  
  
  why you should know about the fmti
&lt;/h3&gt;

&lt;p&gt;most people pick their ai provider based on benchmarks, pricing, or vibes. the foundation model transparency index measures something different: how honest a company is about what they actually built.&lt;br&gt;
that matters more than most engineers realize.&lt;br&gt;
when you integrate a model into a product, you inherit its risks — biased outputs, leaked training patterns, copyright exposure, opaque safety evaluations. you can't audit what was never disclosed. and when something breaks, you're the one explaining it to stakeholders, not the lab.&lt;br&gt;
the fmti gives you a structured way to ask: does this provider tell me enough to reason about what i'm building on?&lt;br&gt;
it's not perfect. scores can be gamed, and disclosure isn't the same as safety. but it's one of the few independent, recurring attempts to hold this industry accountable before regulators do it badly.&lt;br&gt;
if you're doing vendor evaluation, building on llms in a regulated domain, or just tired of treating "trust us" as an architecture decision — this index is worth bookmarking.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>opinion</category>
      <category>devops</category>
    </item>
    <item>
      <title>software engineers are becoming reliability engineers for generated output</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Thu, 30 Apr 2026 05:28:16 +0000</pubDate>
      <link>https://dev.to/pvgomes/software-engineers-are-becoming-reliability-engineers-for-generated-output-3b75</link>
      <guid>https://dev.to/pvgomes/software-engineers-are-becoming-reliability-engineers-for-generated-output-3b75</guid>
      <description>&lt;p&gt;The funny thing about the whole “AI will replace software engineers” discourse is that it keeps describing the wrong job.&lt;/p&gt;

&lt;p&gt;Yes, models can produce more code, more docs, more tests, more plans, more tickets, and more very convincing nonsense than ever.&lt;br&gt;
That part is real.&lt;/p&gt;

&lt;p&gt;But once you use these systems in real work, the shape of the engineering job starts changing.&lt;br&gt;
Less “person who types every line.”&lt;br&gt;
More “person who decides what deserves to become real.”&lt;/p&gt;

&lt;p&gt;My take is simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;software engineers are quietly becoming reliability engineers for generated output.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not just for code, by the way.&lt;br&gt;
For migrations, SQL, runbooks, Terraform, docs, architecture notes, postmortems, prompts, dashboards, and all the other machine-produced artifacts that now show up faster than humans can properly trust them.&lt;/p&gt;

&lt;p&gt;That is where I think the job is moving.&lt;br&gt;
Not toward disappearance.&lt;br&gt;
Toward accountability.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fva7wu928ndeifqie6k3c.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fva7wu928ndeifqie6k3c.gif" alt="this is fine but with more tokens" width="498" height="498"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  the real shift is that plausible output became cheap
&lt;/h2&gt;

&lt;p&gt;The biggest change is not that AI writes things.&lt;br&gt;
The biggest change is that AI makes &lt;strong&gt;plausible output cheap&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For most of software history, output had a natural throttle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;humans were slow&lt;/li&gt;
&lt;li&gt;writing took effort&lt;/li&gt;
&lt;li&gt;context switching was expensive&lt;/li&gt;
&lt;li&gt;making a mess required actual labor&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now a model can generate in minutes what would have taken a human hours.&lt;br&gt;
That sounds like pure leverage until you hit the obvious second-order effect:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;if output gets cheap, bad output gets cheap too.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Actually, not just cheap.&lt;br&gt;
Abundant.&lt;br&gt;
And often polished enough to fool tired people.&lt;/p&gt;

&lt;p&gt;That is why I do not think the durable engineering moat is “being the person who can produce the first draft fastest.”&lt;br&gt;
The machines are already very strong there.&lt;/p&gt;

&lt;p&gt;The moat is increasingly being the person who can answer harder questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;is this correct?&lt;/li&gt;
&lt;li&gt;is it safe?&lt;/li&gt;
&lt;li&gt;what assumptions is it making?&lt;/li&gt;
&lt;li&gt;what breaks under concurrency, retries, or bad inputs?&lt;/li&gt;
&lt;li&gt;does it fit the rest of the system, or does it just look locally smart?&lt;/li&gt;
&lt;li&gt;what damage happens if this gets merged, deployed, or copied ten more times?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is reliability thinking, even when the artifact is “just code.”&lt;/p&gt;

&lt;h2&gt;
  
  
  ai does not remove the backlog. it changes the queue.
&lt;/h2&gt;

&lt;p&gt;One thing I keep noticing is that AI does not simply reduce work.&lt;br&gt;
It often &lt;strong&gt;changes the queue&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;You used to have a backlog of things not yet written.&lt;br&gt;
Now you increasingly get a backlog of things already generated but not yet trusted.&lt;/p&gt;

&lt;p&gt;That is a very different problem.&lt;/p&gt;

&lt;p&gt;A few examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The model generates a migration quickly, but someone still has to verify rollback safety, locking behavior, and ugly data edge cases (see the sketch after this list).&lt;/li&gt;
&lt;li&gt;The model produces a Kubernetes manifest quickly, but someone still has to spot the security assumptions, fake resource guesses, and operational nonsense.&lt;/li&gt;
&lt;li&gt;The model writes lots of tests quickly, but someone still has to figure out whether the tests validate behavior or just mirror implementation trivia.&lt;/li&gt;
&lt;li&gt;The model drafts a design doc fast, but someone still has to separate actual architecture from autocomplete theater.&lt;/li&gt;
&lt;/ul&gt;
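
&lt;p&gt;Here is that migration check as a minimal sketch: a reviewer-side lint that flags generated migrations with known-dangerous shapes.&lt;br&gt;
The patterns are illustrative, nowhere near exhaustive.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# flag generated SQL migrations that tend to hide locking or rollback pain.
import re

RISKY = [
    (r"\bALTER\s+TABLE\b.*\bNOT\s+NULL\b", "possible table rewrite / long lock"),
    (r"\bDROP\s+(TABLE|COLUMN)\b", "destructive change: verify the rollback path"),
    (r"\bCREATE\s+INDEX\b(?!.*\bCONCURRENTLY\b)", "blocking index build (Postgres)"),
]

def review_migration(sql: str) -&amp;gt; list[str]:
    findings = []
    for pattern, why in RISKY:
        if re.search(pattern, sql, re.IGNORECASE | re.DOTALL):
            findings.append(why)
    return findings

print(review_migration("CREATE INDEX idx_users_email ON users (email);"))
# ['blocking index build (Postgres)']
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;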

&lt;p&gt;So the bottleneck does not disappear.&lt;br&gt;
It moves.&lt;/p&gt;

&lt;p&gt;In a lot of teams, the new bottleneck is no longer raw production.&lt;br&gt;
It is validation, integration, and accountability.&lt;/p&gt;

&lt;p&gt;That is why the “AI replaces engineers” framing feels shallow to me.&lt;br&gt;
A better framing is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI increases the volume of candidate artifacts, and engineers become the reliability system that decides which ones deserve reality.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  the new senior engineer smell test
&lt;/h2&gt;

&lt;p&gt;I think this is going to change what seniority feels like.&lt;/p&gt;

&lt;p&gt;Strong engineers are increasingly the people who can look at AI-generated artifacts and immediately feel where the lies are.&lt;/p&gt;

&lt;p&gt;You know the vibe:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“this looks clean but it is hiding a consistency problem”&lt;/li&gt;
&lt;li&gt;“this abstraction reads well and will be miserable to operate”&lt;/li&gt;
&lt;li&gt;“this PR is syntactically fine and semantically clueless”&lt;/li&gt;
&lt;li&gt;“this agent plan is doing six steps because it does not understand the real system boundary”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That kind of judgment does not demo well.&lt;br&gt;
But it is probably getting more valuable, not less.&lt;/p&gt;

&lt;p&gt;Because if AI keeps making output cheaper, then high-quality skepticism becomes a force multiplier.&lt;/p&gt;

&lt;h2&gt;
  
  
  the job is shifting from authoring to acceptance
&lt;/h2&gt;

&lt;p&gt;Maybe the simplest way to say it is this:&lt;/p&gt;

&lt;p&gt;software engineers are spending less of their future being pure authors, and more of it being &lt;strong&gt;acceptance systems&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not passive approvers.&lt;br&gt;
Not human lint rules.&lt;br&gt;
Something more active:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deciding which generated changes are worth keeping&lt;/li&gt;
&lt;li&gt;defining the tests and invariants the machine has to satisfy&lt;/li&gt;
&lt;li&gt;shaping the architecture so local generation cannot create global chaos&lt;/li&gt;
&lt;li&gt;building tooling that catches the model’s favorite failure modes&lt;/li&gt;
&lt;li&gt;creating boundaries where generation is useful and boundaries where it is dangerous&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a different kind of leverage.&lt;br&gt;
And honestly, it is a more senior kind.&lt;/p&gt;

&lt;p&gt;Typing faster was never the deepest layer of engineering anyway.&lt;br&gt;
Understanding consequences was.&lt;/p&gt;

&lt;h2&gt;
  
  
  this is why systems thinking matters even more now
&lt;/h2&gt;

&lt;p&gt;Generated output is usually strongest locally and weakest systemically.&lt;/p&gt;

&lt;p&gt;Models are pretty good at producing a function, an endpoint, a refactor diff, a script, a nice-looking explanation.&lt;br&gt;
They are much less trustworthy when the real question is about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;long-range coupling&lt;/li&gt;
&lt;li&gt;rollback behavior&lt;/li&gt;
&lt;li&gt;observability gaps&lt;/li&gt;
&lt;li&gt;cost shape&lt;/li&gt;
&lt;li&gt;security boundaries&lt;/li&gt;
&lt;li&gt;human maintenance burden six months later&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is where good engineers still earn their keep.&lt;br&gt;
Not by being the fastest autocomplete in the room.&lt;br&gt;
By being the person who sees the system that the autocomplete cannot really hold together on its own.&lt;/p&gt;

&lt;p&gt;That is also why I am skeptical of the lazy “every engineer becomes a prompt engineer” storyline.&lt;br&gt;
Maybe every engineer becomes a bit better at delegation.&lt;br&gt;
Fine.&lt;br&gt;
But the durable skill is still evaluation under constraints.&lt;br&gt;
That is basically systems judgment.&lt;/p&gt;

&lt;h2&gt;
  
  
  this applies way beyond code
&lt;/h2&gt;

&lt;p&gt;One reason I like the phrase “generated output” more than just “AI code” is that the pattern is much broader than programming.&lt;/p&gt;

&lt;p&gt;Modern engineering work is full of machine-assisted artifacts now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;incident summaries&lt;/li&gt;
&lt;li&gt;postmortems&lt;/li&gt;
&lt;li&gt;architecture docs&lt;/li&gt;
&lt;li&gt;customer replies&lt;/li&gt;
&lt;li&gt;risk assessments&lt;/li&gt;
&lt;li&gt;onboarding guides&lt;/li&gt;
&lt;li&gt;support runbooks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these can now be generated faster.&lt;br&gt;
And all of them can now fail faster too.&lt;/p&gt;

&lt;p&gt;A wrong incident summary can send a team chasing the wrong problem.&lt;br&gt;
A polished but shallow design doc can approve a bad direction.&lt;br&gt;
A confident security explanation can normalize unsafe assumptions.&lt;br&gt;
A cheerful AI-generated runbook can make an outage worse.&lt;/p&gt;

&lt;p&gt;So the reliability role is not limited to code review.&lt;br&gt;
It is becoming a broader discipline of truth maintenance around machine-produced artifacts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia.tenor.com%2F8m6G9b27wwoAAAAC%2Ftyping-fast.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia.tenor.com%2F8m6G9b27wwoAAAAC%2Ftyping-fast.gif" alt="someone still has to own prod" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  my take
&lt;/h2&gt;

&lt;p&gt;I do not think software engineers are disappearing.&lt;br&gt;
I think the center of gravity of the job is shifting.&lt;/p&gt;

&lt;p&gt;AI changes the economics of engineering by making production cheaper and verification more important.&lt;br&gt;
Once that happens, the valuable humans are not the ones who merely produce more.&lt;br&gt;
They are the ones who can keep generated output aligned with reality.&lt;/p&gt;

&lt;p&gt;That means correctness.&lt;br&gt;
Safety.&lt;br&gt;
Operability.&lt;br&gt;
Maintainability.&lt;br&gt;
Context.&lt;br&gt;
Accountability.&lt;/p&gt;

&lt;p&gt;So yes, engineers will still write code.&lt;br&gt;
Probably a lot of it.&lt;br&gt;
But the deeper job is starting to look more like reliability engineering for a world where output is abundant, confidence is synthetic, and mistakes can replicate faster than understanding.&lt;/p&gt;

&lt;p&gt;Honestly, that sounds much closer to the real profession anyway.&lt;/p&gt;

&lt;p&gt;The job was never just to make software.&lt;br&gt;
It was to make software that deserves to exist.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Princeton NLP, &lt;em&gt;SWE-bench: Can Language Models Resolve Real-World GitHub Issues?&lt;/em&gt; — &lt;a href="https://www.swebench.com/" rel="noopener noreferrer"&gt;https://www.swebench.com/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;OpenAI et al., &lt;em&gt;GPT-4 Technical Report&lt;/em&gt; — &lt;a href="https://arxiv.org/abs/2303.08774" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2303.08774&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>reliability</category>
      <category>llm</category>
    </item>
    <item>
      <title>github failed at the only thing they should do: git</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Wed, 29 Apr 2026 14:23:13 +0000</pubDate>
      <link>https://dev.to/pvgomes/github-failed-at-the-only-thing-they-should-do-git-2gm</link>
      <guid>https://dev.to/pvgomes/github-failed-at-the-only-thing-they-should-do-git-2gm</guid>
      <description>&lt;p&gt;Yes, I know GitHub does a lot more than Git these days.&lt;br&gt;
Actions, packages, security tooling, issue tracking, enterprise management, AI features, code search, and half the software industry glued into one UI.&lt;/p&gt;

&lt;p&gt;But that is exactly why this story is so funny in the worst possible way.&lt;/p&gt;

&lt;p&gt;When a company built around Git takes a critical hit in the &lt;strong&gt;git push pipeline&lt;/strong&gt;, my reaction is pretty simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;guys, this was the one thing you absolutely had to not mess up.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is my opinion here, and I do not think it is unfair.&lt;br&gt;
GitHub failed at the only thing they should do: Git.&lt;/p&gt;

&lt;h2&gt;
  
  
  what happened
&lt;/h2&gt;

&lt;p&gt;On April 28, GitHub published two important posts about a serious incident and the vulnerability behind it.&lt;br&gt;
Wiz also published its own breakdown of the issue, tracked as &lt;strong&gt;CVE-2026-3854&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The short version:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wiz described it as a &lt;strong&gt;CVSS 8.7&lt;/strong&gt; remote code execution vulnerability&lt;/li&gt;
&lt;li&gt;the affected area was GitHub’s &lt;strong&gt;git push pipeline&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;GitHub said it &lt;strong&gt;validated, fixed, and investigated&lt;/strong&gt; the issue in under two hours&lt;/li&gt;
&lt;li&gt;GitHub also said it found &lt;strong&gt;no evidence of exploitation&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;separately, GitHub published an availability update acknowledging customer-facing impact during the incident window&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is already enough to make engineers pay attention.&lt;br&gt;
A critical RCE in the push path is not some side quest.&lt;br&gt;
That is the bloodstream.&lt;/p&gt;

&lt;h2&gt;
  
  
  this is why the story matters more than the cve headline
&lt;/h2&gt;

&lt;p&gt;A lot of people will read this as just another security incident.&lt;br&gt;
That is too shallow.&lt;/p&gt;

&lt;p&gt;The deeper issue is trust concentration.&lt;br&gt;
GitHub is not just a code host anymore.&lt;br&gt;
For a lot of teams, GitHub is now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;source control&lt;/li&gt;
&lt;li&gt;CI/CD trigger point&lt;/li&gt;
&lt;li&gt;release workflow anchor&lt;/li&gt;
&lt;li&gt;access-control boundary&lt;/li&gt;
&lt;li&gt;automation hub&lt;/li&gt;
&lt;li&gt;policy surface&lt;/li&gt;
&lt;li&gt;audit trail&lt;/li&gt;
&lt;li&gt;developer identity checkpoint&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means a bug in the push pipeline is not just “one bug.”&lt;br&gt;
It touches one of the highest-trust points in the entire software-delivery chain.&lt;/p&gt;

&lt;p&gt;And that is why my reaction is stronger than “well, incidents happen.”&lt;br&gt;
Of course incidents happen.&lt;br&gt;
But some incidents hit the exact place where your credibility is supposed to be strongest.&lt;br&gt;
This was one of those.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpf1tnqhxpqqfybigg82.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpf1tnqhxpqqfybigg82.gif" alt="well this is awkward" width="500" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  github’s response was fast, but that does not erase the failure
&lt;/h2&gt;

&lt;p&gt;To be fair, GitHub’s own security post says it moved very quickly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;validate the report&lt;/li&gt;
&lt;li&gt;mitigate the issue&lt;/li&gt;
&lt;li&gt;investigate for exploitation&lt;/li&gt;
&lt;li&gt;confirm no exploitation evidence&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is good.&lt;br&gt;
That is exactly what they should do in incident response.&lt;/p&gt;

&lt;p&gt;But fast response does not cancel the original problem.&lt;br&gt;
It just means the recovery side looked competent.&lt;/p&gt;

&lt;p&gt;And that distinction matters.&lt;br&gt;
A good fire brigade does not mean you should stop asking why the building caught fire.&lt;/p&gt;

&lt;h2&gt;
  
  
  the platform expansion story is part of the problem
&lt;/h2&gt;

&lt;p&gt;This is where I think the broader lesson sits.&lt;/p&gt;

&lt;p&gt;GitHub has spent years becoming more than Git.&lt;br&gt;
That was strategically obvious.&lt;br&gt;
It became a platform on top of a protocol, then a workflow engine on top of a platform, then increasingly a software-operating environment.&lt;/p&gt;

&lt;p&gt;That growth worked.&lt;br&gt;
Commercially, it was a huge success.&lt;/p&gt;

&lt;p&gt;But expansion has a tax.&lt;br&gt;
Every layer of platform ambition increases complexity, and complexity has a way of drifting back into the supposedly foundational surfaces.&lt;/p&gt;

&lt;p&gt;When you become the center of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pushes&lt;/li&gt;
&lt;li&gt;merges&lt;/li&gt;
&lt;li&gt;bots&lt;/li&gt;
&lt;li&gt;actions&lt;/li&gt;
&lt;li&gt;enterprise controls&lt;/li&gt;
&lt;li&gt;security scanning&lt;/li&gt;
&lt;li&gt;AI assistance&lt;/li&gt;
&lt;li&gt;policy enforcement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;...your “core Git path” is no longer just a clean old-school plumbing layer sitting alone in a quiet corner.&lt;br&gt;
It becomes part of a much more dangerous machine.&lt;/p&gt;

&lt;p&gt;That is one reason I do not buy the very comfortable narrative that this is just one isolated bug.&lt;br&gt;
It is also a reminder that platform sprawl eventually leaks back into the base layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  availability matters too
&lt;/h2&gt;

&lt;p&gt;GitHub’s availability post matters here because it makes the incident feel more real than a sterile CVE note.&lt;br&gt;
The company was not just publishing a technical write-up after the fact.&lt;br&gt;
It also had to talk publicly about service disruption.&lt;/p&gt;

&lt;p&gt;That combination matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;security issue&lt;/li&gt;
&lt;li&gt;high-trust path&lt;/li&gt;
&lt;li&gt;customer-visible impact&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once those three things line up, this stops being just an internal engineering embarrassment.&lt;br&gt;
It becomes a reliability story too.&lt;/p&gt;

&lt;p&gt;And reliability is part of the product.&lt;br&gt;
Especially when the product is infrastructure for how other engineers ship software.&lt;/p&gt;

&lt;h2&gt;
  
  
  the market keeps forgetting what “boring critical infrastructure” really means
&lt;/h2&gt;

&lt;p&gt;I think the market often rewards GitHub for looking exciting.&lt;br&gt;
AI features, flashy demos, ecosystem gravity, all of that.&lt;/p&gt;

&lt;p&gt;But the real value proposition is still much more boring.&lt;br&gt;
It is something like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;your code lives here&lt;/li&gt;
&lt;li&gt;your push lands here&lt;/li&gt;
&lt;li&gt;your repo history is here&lt;/li&gt;
&lt;li&gt;your automation starts here&lt;/li&gt;
&lt;li&gt;your delivery chain trusts this place&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not glamorous.&lt;br&gt;
That is utility.&lt;br&gt;
That is plumbing.&lt;br&gt;
That is exactly why breaking trust there feels worse than some bug in a side feature.&lt;/p&gt;

&lt;p&gt;If your core business is where software teams place their code and start their release motion, then your standard is not “generally impressive platform.”&lt;br&gt;
Your standard is “this path must be extremely hard to break in dangerous ways.”&lt;/p&gt;

&lt;h2&gt;
  
  
  this is also why single-vendor convenience has a hidden cost
&lt;/h2&gt;

&lt;p&gt;This incident is another reminder that centralization is efficient until it is not.&lt;/p&gt;

&lt;p&gt;Putting source, CI, automation, access, and process gravity into one place makes life smoother on normal days.&lt;br&gt;
But it also means that when something goes wrong in a core path, the blast radius is psychological even when the technical blast radius is contained.&lt;/p&gt;

&lt;p&gt;Teams start asking:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what else is too centralized here?&lt;/li&gt;
&lt;li&gt;what assumptions did we stop questioning?&lt;/li&gt;
&lt;li&gt;how much of our delivery trust sits inside one commercial boundary?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those are healthy questions... probably overdue ones.&lt;/p&gt;
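
&lt;p&gt;One cheap, concrete hedge, as a sketch: mirror every push to a second host so the delivery chain does not depend on a single vendor being healthy.&lt;br&gt;
The &lt;code&gt;backup-mirror&lt;/code&gt; remote name here is an assumption about your own git config, not a product feature.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# push a branch to the primary remote and to an independent mirror.
import subprocess

REMOTES = ["origin", "backup-mirror"]  # "backup-mirror" is hypothetical

def push_everywhere(branch: str = "main") -&amp;gt; None:
    for remote in REMOTES:
        # fails loudly if either host rejects the push
        subprocess.run(["git", "push", remote, branch], check=True)

if __name__ == "__main__":
    push_everywhere()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;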

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb93559acsnshh96rn318.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb93559acsnshh96rn318.gif" alt="everybody stay calm" width="480" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  my take
&lt;/h2&gt;

&lt;p&gt;I think GitHub deserves credit for responding fast.&lt;br&gt;
I also think that is not the main headline.&lt;/p&gt;

&lt;p&gt;The main headline is simpler and harsher:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub had a critical RCE problem in the git push pipeline.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And if you are GitHub, that is the kind of sentence you should find humiliating.&lt;br&gt;
Because among all the things people can reasonably argue about in your product, this is the one surface where you are supposed to be boringly excellent.&lt;/p&gt;

&lt;p&gt;That is why my opinion is what it is:&lt;/p&gt;

&lt;p&gt;GitHub failed at the only thing they should do: Git.&lt;/p&gt;

&lt;p&gt;Not because Git is literally the only feature they offer.&lt;br&gt;
But because Git is the foundational trust contract that justifies the existence of the whole giant platform built on top of it.&lt;/p&gt;

&lt;p&gt;When that layer cracks, the rest of the value proposition looks a little less sophisticated and a little more fragile.&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Wiz, &lt;em&gt;GitHub RCE Vulnerability: CVE-2026-3854 Breakdown&lt;/em&gt; — &lt;a href="https://www.wiz.io/blog/github-rce-vulnerability-cve-2026-3854" rel="noopener noreferrer"&gt;https://www.wiz.io/blog/github-rce-vulnerability-cve-2026-3854&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub, &lt;em&gt;An update on GitHub availability&lt;/em&gt; — &lt;a href="https://github.blog/news-insights/company-news/an-update-on-github-availability/" rel="noopener noreferrer"&gt;https://github.blog/news-insights/company-news/an-update-on-github-availability/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub, &lt;em&gt;Securing the git push pipeline: Responding to a critical remote code execution vulnerability&lt;/em&gt; — &lt;a href="https://github.blog/security/securing-the-git-push-pipeline-responding-to-a-critical-remote-code-execution-vulnerability/" rel="noopener noreferrer"&gt;https://github.blog/security/securing-the-git-push-pipeline-responding-to-a-critical-remote-code-execution-vulnerability/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>github</category>
      <category>git</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>why agi is not possible with the current llms and transformers</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Wed, 29 Apr 2026 13:49:03 +0000</pubDate>
      <link>https://dev.to/pvgomes/why-agi-is-not-possible-with-the-current-llms-and-transformers-em</link>
      <guid>https://dev.to/pvgomes/why-agi-is-not-possible-with-the-current-llms-and-transformers-em</guid>
      <description>&lt;p&gt;Yes, I know this is the kind of title that usually makes people angry before the second paragraph.&lt;br&gt;
But let’s be honest for a second: a lot of the AGI discussion right now is built on vibes, product demos, and investors hallucinating roadmaps.&lt;/p&gt;

&lt;p&gt;Current LLMs are impressive.&lt;br&gt;
Transformers were a historic breakthrough.&lt;br&gt;
But that does &lt;strong&gt;not&lt;/strong&gt; mean the current architecture is a straight road to AGI.&lt;br&gt;
And this is not even some fringe opinion anymore. People very close to the center of the modern AI wave have been signaling versions of this for a while.&lt;/p&gt;

&lt;p&gt;The point is not that LLMs are useless.&lt;br&gt;
The point is that they are probably not the final substrate for general intelligence.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fva7wu928ndeifqie6k3c.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fva7wu928ndeifqie6k3c.gif" alt="calm down" width="498" height="498"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  transformers changed everything, but not everything they changed leads to agi
&lt;/h2&gt;

&lt;p&gt;The 2017 paper &lt;a href="https://arxiv.org/abs/1706.03762" rel="noopener noreferrer"&gt;&lt;em&gt;Attention Is All You Need&lt;/em&gt;&lt;/a&gt; gave the industry the transformer architecture.&lt;br&gt;
That mattered more than almost any AI paper of the last decade.&lt;br&gt;
Without it, you do not get GPT-style models in the form we know them.&lt;br&gt;
Without that scaling path, you probably do not get the current commercial AI race in the same shape either.&lt;/p&gt;

&lt;p&gt;But there is a very common mistake here.&lt;br&gt;
People take “transformers unlocked a huge jump” and quietly convert that into “transformers must therefore be the road to full general intelligence.”&lt;/p&gt;

&lt;p&gt;That does not follow.&lt;br&gt;
A system can be historically important and still be incomplete.&lt;br&gt;
A ladder can get you much higher without reaching the roof.&lt;/p&gt;

&lt;h2&gt;
  
  
  llms are very good at pattern completion, but that is not the same thing as a mind
&lt;/h2&gt;

&lt;p&gt;This is the part people hate because it sounds like downplaying the models.&lt;br&gt;
It is not downplaying them.&lt;br&gt;
It is just refusing to confuse capability with explanation.&lt;/p&gt;

&lt;p&gt;LLMs are extremely strong statistical systems for sequence modeling.&lt;br&gt;
That is already a huge deal.&lt;br&gt;
They compress gigantic distributions over language, code, and other symbolic artifacts into something operationally useful.&lt;br&gt;
That is why they can write decent prose, summarize legal text, explain a Rust borrow checker error, and generate creepy-good autocomplete.&lt;/p&gt;

&lt;p&gt;But being very good at next-token prediction over giant corpora does not automatically imply:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;grounded world models&lt;/li&gt;
&lt;li&gt;durable causal reasoning&lt;/li&gt;
&lt;li&gt;stable long-horizon planning&lt;/li&gt;
&lt;li&gt;embodiment&lt;/li&gt;
&lt;li&gt;self-directed agency with coherent goals&lt;/li&gt;
&lt;li&gt;persistent internal models of truth independent of text imitation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gap matters.&lt;br&gt;
A lot.&lt;/p&gt;

&lt;p&gt;Right now, much of the industry is basically betting that enough scale, enough reinforcement, enough tooling, and enough product scaffolding will cause those missing properties to emerge strongly enough.&lt;br&gt;
Maybe some of them will.&lt;br&gt;
But that is still a bet, not a proof.&lt;/p&gt;

&lt;h2&gt;
  
  
  even the people closest to this wave have hinted that current llms are not the final answer
&lt;/h2&gt;

&lt;p&gt;Ilya Sutskever has been one of the most important people in modern deep learning.&lt;br&gt;
He co-authored the AlexNet paper, was a co-founder and chief scientist at OpenAI, and has been as close to frontier model development as almost anyone alive.&lt;/p&gt;

&lt;p&gt;He has also repeatedly signaled that scale alone is not the whole story.&lt;br&gt;
Not in the simplistic “just make the next model bigger” sense.&lt;br&gt;
The broader direction from people like him has been that current systems are powerful, but there are still core unsolved questions around reasoning, agency, and what kind of architecture actually gets you to something more general.&lt;/p&gt;

&lt;p&gt;That matters because the people with the best empirical seat in the house are usually less naive about architecture than the market is.&lt;br&gt;
The market sees product demos and says “AGI soon.”&lt;br&gt;
Researchers see brittle failure modes, hidden scaffolding, evaluation gaps, and weird generalization boundaries.&lt;/p&gt;

&lt;p&gt;Those are not the same thing.&lt;/p&gt;

&lt;h2&gt;
  
  
  current llms still depend too much on scaffolding
&lt;/h2&gt;

&lt;p&gt;One of the easiest ways to see the limit is to look at how much extra machinery we keep wrapping around these models.&lt;/p&gt;

&lt;p&gt;Every time the model struggles, the answer becomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;give it retrieval&lt;/li&gt;
&lt;li&gt;give it tools&lt;/li&gt;
&lt;li&gt;give it memory&lt;/li&gt;
&lt;li&gt;give it better prompting&lt;/li&gt;
&lt;li&gt;give it decomposition&lt;/li&gt;
&lt;li&gt;give it agents&lt;/li&gt;
&lt;li&gt;give it reflection loops&lt;/li&gt;
&lt;li&gt;give it verifier models&lt;/li&gt;
&lt;li&gt;give it structured outputs&lt;/li&gt;
&lt;li&gt;give it a planner on top of the planner&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this is bad.&lt;br&gt;
A lot of it is smart engineering.&lt;br&gt;
But it is also a clue.&lt;/p&gt;

&lt;p&gt;If the core system were already on a clean AGI trajectory by itself, we would not need this much external scaffolding just to make it robust at ordinary multi-step work.&lt;/p&gt;

&lt;p&gt;We keep building exoskeletons around the model because the raw model is not enough.&lt;br&gt;
That is useful.&lt;br&gt;
But it is also diagnostic.&lt;/p&gt;
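
&lt;p&gt;If the scaffolding talk sounds abstract, here is a minimal sketch of one common exoskeleton piece: a generate-verify-retry loop.&lt;br&gt;
&lt;code&gt;call_model&lt;/code&gt; and &lt;code&gt;run_tests&lt;/code&gt; are hypothetical stand-ins; the shape of the loop is the point.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# one layer of exoskeleton: keep asking until an external verifier says yes.
def call_model(prompt: str) -&amp;gt; str:
    raise NotImplementedError  # stand-in for an LLM API call

def run_tests(code: str) -&amp;gt; tuple[bool, str]:
    raise NotImplementedError  # stand-in for an external verifier

def generate_with_verifier(task: str, max_attempts: int = 3) -&amp;gt; str:
    feedback = ""
    for _ in range(max_attempts):
        code = call_model(task + feedback)
        ok, report = run_tests(code)
        if ok:
            return code
        # feed the failure back in and try again
        feedback = "\n\nPrevious attempt failed:\n" + report
    raise RuntimeError("model never converged; a human owns this now")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Notice where the correctness actually lives: in the verifier, outside the model.&lt;br&gt;
That is the diagnostic part.&lt;/p&gt;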

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia.tenor.com%2F8m6G9b27wwoAAAAC%2Ftyping-fast.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmedia.tenor.com%2F8m6G9b27wwoAAAAC%2Ftyping-fast.gif" alt="this is getting out of hand" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  the context-window obsession is also a tell
&lt;/h2&gt;

&lt;p&gt;A lot of modern LLM discourse treats bigger context windows as if they are a direct proxy for deeper intelligence.&lt;br&gt;
They are not.&lt;br&gt;
They are useful, yes.&lt;br&gt;
But usefulness is not the same as cognition.&lt;/p&gt;

&lt;p&gt;A very large context window can help a model:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;see more documents&lt;/li&gt;
&lt;li&gt;maintain more local continuity&lt;/li&gt;
&lt;li&gt;reference more recent constraints&lt;/li&gt;
&lt;li&gt;reduce some memory hacks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Great.&lt;br&gt;
But that still does not solve the harder problems of abstraction, grounding, causal stability, or independent model-building.&lt;/p&gt;

&lt;p&gt;A model that can read a whole repo is not automatically a model that understands software the way a strong engineer understands systems over time.&lt;br&gt;
A model that can ingest a giant conversation is not automatically a model with coherent memory in the human sense.&lt;/p&gt;

&lt;p&gt;A bigger whiteboard does not equal a better brain.&lt;/p&gt;

&lt;h2&gt;
  
  
  this is also why code generation is a dangerous shortcut to agi hype
&lt;/h2&gt;

&lt;p&gt;Code is one of the strongest arguments for LLM usefulness.&lt;br&gt;
And also one of the easiest places to overstate what the systems are doing.&lt;/p&gt;

&lt;p&gt;Yes, current models can write useful code.&lt;br&gt;
Yes, they can fix bugs, generate boilerplate, explain APIs, and occasionally outperform mediocre humans on bounded tasks.&lt;br&gt;
That is all real.&lt;/p&gt;

&lt;p&gt;But even in code, they still show the same structural weaknesses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;unstable long-range consistency&lt;/li&gt;
&lt;li&gt;brittle execution loops&lt;/li&gt;
&lt;li&gt;hidden dependency on retries and external validation&lt;/li&gt;
&lt;li&gt;no reliable internal model of correctness unless tied to tools/tests&lt;/li&gt;
&lt;li&gt;tendency to bluff through uncertainty&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not AGI.&lt;br&gt;
That is a very strong probabilistic synthesizer with some surprisingly valuable affordances.&lt;br&gt;
Which is still impressive, by the way.&lt;/p&gt;

&lt;h2&gt;
  
  
  three tiny code examples that make the point
&lt;/h2&gt;

&lt;p&gt;Here is the difference between producing plausible code and actually sustaining deeper software understanding over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  rust
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;f64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;f64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Option&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;f64&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;None&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An LLM can write that easily.&lt;br&gt;
The hard part is not syntax.&lt;br&gt;
The hard part is whether the system understands when this function belongs in a larger error-handling model, how it should evolve in a real service, what invariants surround it, and how those choices cascade through a production codebase.&lt;/p&gt;

&lt;h3&gt;
  
  
  java
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="nf"&gt;User&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{}&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UserService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;findById&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;empty&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Again, easy.&lt;br&gt;
But can the model reason reliably about persistence boundaries, consistency guarantees, privacy constraints, operational tracing, and how this service should change under real business pressure?&lt;br&gt;
Sometimes partially. Consistently? Not really.&lt;/p&gt;

&lt;h3&gt;
  
  
  clojure
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight clojure"&gt;&lt;code&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;defn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;safe-parse-int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;try&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;Integer/parseInt&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;catch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;Exception&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nice, compact, useful.&lt;br&gt;
But local code generation is not the same thing as robust system-level intelligence.&lt;br&gt;
It is one slice of it at best.&lt;/p&gt;

&lt;p&gt;That difference is where a lot of AGI marketing hides.&lt;/p&gt;

&lt;h2&gt;
  
  
  the transformer may be a stepping stone, not the final architecture
&lt;/h2&gt;

&lt;p&gt;This is the view that seems most plausible to me.&lt;/p&gt;

&lt;p&gt;Transformers are probably like a major aircraft design breakthrough, not the final aircraft.&lt;br&gt;
They changed the feasible frontier.&lt;br&gt;
They made certain scaling patterns obvious.&lt;br&gt;
They created new industries.&lt;br&gt;
But that does not mean they are the final form of general intelligence any more than early jet engines were the final form of flight.&lt;/p&gt;

&lt;p&gt;Maybe AGI, if it ever arrives, will still inherit ideas from transformers.&lt;br&gt;
That seems likely.&lt;br&gt;
But it may require architectures that integrate memory, planning, grounding, world modeling, and self-correction in ways that current autoregressive LLMs do not naturally do.&lt;/p&gt;

&lt;p&gt;That is a much narrower and more reasonable claim than “LLMs are fake.”&lt;br&gt;
They are not fake.&lt;br&gt;
They are just probably not sufficient.&lt;/p&gt;

&lt;h2&gt;
  
  
  my take
&lt;/h2&gt;

&lt;p&gt;I think the current transformer + LLM wave is historically important, economically real, and still not the same thing as a solved path to AGI.&lt;/p&gt;

&lt;p&gt;That is the part people need to hold in their heads at the same time.&lt;/p&gt;

&lt;p&gt;The models are powerful.&lt;br&gt;
The products are useful.&lt;br&gt;
The industry is not crazy for taking them seriously.&lt;/p&gt;

&lt;p&gt;But the leap from “very strong sequence model with tools” to “general intelligence” is still doing a lot of hand-wavy work.&lt;br&gt;
Too much, honestly.&lt;/p&gt;

&lt;p&gt;So when people say AGI is right around the corner because current LLMs keep getting better, I think the right answer is:&lt;/p&gt;

&lt;p&gt;maybe something important is around the corner.&lt;br&gt;
But it is far from obvious that the current transformer stack is the full road there.&lt;/p&gt;

&lt;p&gt;And that is not pessimism.&lt;br&gt;
That is just architectural humility.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwsecrrkk29jdu4cdaidx.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwsecrrkk29jdu4cdaidx.gif" alt="not the same thing" width="498" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  references
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Vaswani et al., &lt;em&gt;Attention Is All You Need&lt;/em&gt; (2017) — &lt;a href="https://arxiv.org/abs/1706.03762" rel="noopener noreferrer"&gt;https://arxiv.org/abs/1706.03762&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Brown et al., &lt;em&gt;Language Models are Few-Shot Learners&lt;/em&gt; (2020) — &lt;a href="https://arxiv.org/abs/2005.14165" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2005.14165&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Wikipedia, &lt;em&gt;Ilya Sutskever&lt;/em&gt; — &lt;a href="https://en.wikipedia.org/wiki/Ilya_Sutskever" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Ilya_Sutskever&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Wikipedia, &lt;em&gt;Attention Is All You Need&lt;/em&gt; — &lt;a href="https://en.wikipedia.org/wiki/Attention_Is_All_You_Need" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Attention_Is_All_You_Need&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Wikipedia, &lt;em&gt;Large language model&lt;/em&gt; — &lt;a href="https://en.wikipedia.org/wiki/Large_language_model" rel="noopener noreferrer"&gt;https://en.wikipedia.org/wiki/Large_language_model&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>agi</category>
      <category>llm</category>
      <category>transformers</category>
    </item>
    <item>
      <title>Without google's transformers, there is no GPT-ishs</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Sat, 25 Apr 2026 12:01:11 +0000</pubDate>
      <link>https://dev.to/pvgomes/without-googles-transformers-there-is-no-gpt-2-5eg7</link>
      <guid>https://dev.to/pvgomes/without-googles-transformers-there-is-no-gpt-2-5eg7</guid>
      <description>&lt;p&gt;Yes, remember back there in 2020/2021 when OpenAI created the gpt2? How about we really focus on what enable them to do that? google transformers.&lt;/p&gt;

&lt;p&gt;The modern generative AI industry was built on one of the most consequential papers in the history of software: Google’s 2017 paper, &lt;a href="https://arxiv.org/abs/1706.03762" rel="noopener noreferrer"&gt;&lt;em&gt;Attention Is All You Need&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;That paper introduced the Transformer architecture.&lt;br&gt;
And without that architecture, GPT-2 does not happen in the way we know it.&lt;br&gt;
Honestly, most of today’s AI industry does not happen in the same way either.&lt;/p&gt;

&lt;p&gt;This is one of those moments where the industry narrative got flattened.&lt;br&gt;
People remember products, brand names, launches, demos, APIs, and valuation charts.&lt;br&gt;
But under all of that sits a technical shift that changed what was economically and architecturally possible.&lt;/p&gt;

&lt;p&gt;That shift was the Transformer.&lt;/p&gt;




&lt;h2&gt;
  
  
  The pre-Transformer world was not useless, but it was narrower
&lt;/h2&gt;

&lt;p&gt;Before Transformers took over, the field was already making real progress with recurrent neural networks, LSTMs, GRUs, sequence-to-sequence models, and attention layers added on top of those systems.&lt;br&gt;
This mattered.&lt;br&gt;
It was not fake progress.&lt;/p&gt;

&lt;p&gt;But it had limits.&lt;/p&gt;

&lt;p&gt;Those older architectures were much more painful to scale for long-range dependencies, much harder to parallelize efficiently, and generally less well-suited to the kind of giant training runs that would later define modern language models.&lt;/p&gt;

&lt;p&gt;That matters more than people think.&lt;/p&gt;

&lt;p&gt;A lot of AI history is really compute history wearing a research costume.&lt;br&gt;
If an architecture is elegant but does not map well onto large-scale training infrastructure, it can hit a ceiling even if the ideas are good.&lt;/p&gt;

&lt;p&gt;The brilliance of the Transformer was not only that it worked.&lt;br&gt;
It was that it worked in a way the industry could scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Google actually changed
&lt;/h2&gt;

&lt;p&gt;The key claim in &lt;em&gt;Attention Is All You Need&lt;/em&gt; was radical for its time: sequence modeling did not need recurrence or convolution at the center.&lt;br&gt;
The model could rely entirely on attention mechanisms.&lt;/p&gt;

&lt;p&gt;That is the line that changed everything.&lt;/p&gt;

&lt;p&gt;Google’s authors proposed a model architecture that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;removed recurrence from the core sequence model&lt;/li&gt;
&lt;li&gt;relied on self-attention to model relationships across tokens&lt;/li&gt;
&lt;li&gt;made training far more parallelizable than RNN-heavy approaches&lt;/li&gt;
&lt;li&gt;created a cleaner path toward scaling with more data, more parameters, and more compute&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why the paper mattered so much.&lt;br&gt;
It did not just improve benchmark performance.&lt;br&gt;
It changed the operating assumptions of the field.&lt;/p&gt;
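
&lt;p&gt;The core mechanism is small enough to sketch.&lt;br&gt;
Here is scaled dot-product self-attention over a toy sequence in plain numpy, with the learned projection matrices omitted for brevity: every position attends to every other position in one parallel step, no recurrence involved.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# toy scaled dot-product self-attention (no learned projections).
import numpy as np

def self_attention(x: np.ndarray) -&amp;gt; np.ndarray:
    # x: (seq_len, d_model); x serves as queries, keys, and values here
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)  # all-pairs similarity, one matmul
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x  # each position becomes a weighted mix of the sequence

tokens = np.random.randn(4, 8)  # 4 tokens, 8-dim embeddings
print(self_attention(tokens).shape)  # (4, 8)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Everything in that function is dense matrix math, which is exactly what accelerators are good at.&lt;br&gt;
That is the parallelism story in miniature.&lt;/p&gt;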

&lt;p&gt;Once that door opened, the industry got a new answer to a bigger question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;what if language modeling could be treated as a scaling problem instead of a carefully hand-managed sequence bottleneck problem?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is the real pivot.&lt;/p&gt;




&lt;h2&gt;
  
  
  GPT-2 is not just “an OpenAI breakthrough”
&lt;/h2&gt;

&lt;p&gt;GPT-2 absolutely mattered.&lt;br&gt;
It helped prove that large-scale generative language modeling could produce outputs with a level of fluency that forced the industry to pay attention.&lt;br&gt;
It made a lot of people understand, maybe for the first time, that language models were not just autocomplete toys.&lt;/p&gt;

&lt;p&gt;But GPT-2 was not born in a vacuum.&lt;/p&gt;

&lt;p&gt;GPT-2 stands on the Transformer architecture.&lt;br&gt;
Not metaphorically.&lt;br&gt;
Directly.&lt;/p&gt;

&lt;p&gt;Even the name GPT says it:&lt;br&gt;
&lt;strong&gt;Generative Pre-trained Transformer.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That last word is doing a lot of work.&lt;/p&gt;

&lt;p&gt;Without Google’s Transformer paper, there is no straightforward architectural foundation for GPT-2 as we know it.&lt;br&gt;
Maybe OpenAI would have built some other path eventually.&lt;br&gt;
Maybe the field would have discovered a similar breakthrough through a different line of work.&lt;br&gt;
But the GPT-2 that actually happened, when it happened, and how it happened, is inseparable from the Transformer.&lt;/p&gt;

&lt;p&gt;That is just the technical truth.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Transformer did something more important than improve models
&lt;/h2&gt;

&lt;p&gt;The biggest thing Google gave the industry was not merely a better model block.&lt;br&gt;
It gave the industry a scaling primitive.&lt;/p&gt;

&lt;p&gt;That sounds dry, but it is the whole story.&lt;/p&gt;

&lt;p&gt;The AI industry today is defined by a few recurring ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pretraining at large scale&lt;/li&gt;
&lt;li&gt;transfer of general capability into downstream tasks&lt;/li&gt;
&lt;li&gt;parameter growth&lt;/li&gt;
&lt;li&gt;context-window expansion&lt;/li&gt;
&lt;li&gt;foundation models as platform assets&lt;/li&gt;
&lt;li&gt;model families with derivative products, tools, and APIs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of that became much more viable because the Transformer architecture matched the industrial reality of training large systems on serious hardware.&lt;/p&gt;

&lt;p&gt;That is why its influence extends so far beyond NLP papers.&lt;/p&gt;

&lt;p&gt;The Transformer did not merely improve one subfield.&lt;br&gt;
It helped create the modern operating model for AI companies.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this shaped the industry so deeply
&lt;/h2&gt;

&lt;p&gt;The reason the Transformer shaped the entire industry is simple:&lt;/p&gt;

&lt;p&gt;it connected research progress to economic scale.&lt;/p&gt;

&lt;p&gt;That is what wins.&lt;br&gt;
Not just cleverness.&lt;br&gt;
Not just novelty.&lt;br&gt;
Not just benchmark gains.&lt;/p&gt;

&lt;p&gt;The winning architecture is usually the one that makes the next order of magnitude possible.&lt;/p&gt;

&lt;p&gt;Transformers made it easier to imagine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;larger language models&lt;/li&gt;
&lt;li&gt;broader pretraining corpora&lt;/li&gt;
&lt;li&gt;reusable model backbones&lt;/li&gt;
&lt;li&gt;generalized text generation&lt;/li&gt;
&lt;li&gt;eventually multimodal systems built on related scaling logic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once that happened, the center of gravity changed.&lt;br&gt;
The industry stopped thinking in terms of narrow task-specific systems and started thinking in terms of large trainable model families.&lt;/p&gt;

&lt;p&gt;That shift is still the water we are swimming in.&lt;/p&gt;

&lt;p&gt;You can argue about whether the current generation of AI products is overhyped, overcapitalized, or overmarketed.&lt;br&gt;
I make that argument all the time.&lt;br&gt;
But the architectural break itself was real.&lt;/p&gt;

&lt;p&gt;And Google triggered it.&lt;/p&gt;




&lt;h2&gt;
  
  
  This is one of the great ironies of AI history
&lt;/h2&gt;

&lt;p&gt;One of the funniest parts of the whole story is that Google created one of the most important technical foundations of the generative AI boom, and then for a while let the market narrative get captured by everyone else.&lt;/p&gt;

&lt;p&gt;That is an extraordinary industrial fumble.&lt;/p&gt;

&lt;p&gt;Researchers at Google helped create the architecture that underlies the most important AI platform shift in years.&lt;br&gt;
Then OpenAI became the popular face of the revolution.&lt;br&gt;
Then the rest of the industry ran around trying to catch up in public, even while depending on the same architectural lineage.&lt;/p&gt;

&lt;p&gt;That is not a criticism of the Transformer.&lt;br&gt;
If anything, it proves how foundational it was.&lt;br&gt;
Once an idea is that powerful, it stops belonging to one company in the practical sense.&lt;br&gt;
It becomes infrastructure for an era.&lt;/p&gt;

&lt;p&gt;That is exactly what happened.&lt;/p&gt;




&lt;h2&gt;
  
  
  GPT-2 was one of the first public proofs of what the architecture unlocked
&lt;/h2&gt;

&lt;p&gt;GPT-2 matters because it was one of the first large public demonstrations of what a Transformer-based generative system could look like when pushed hard enough.&lt;/p&gt;

&lt;p&gt;It was not the final form.&lt;br&gt;
It was not the most advanced model by today’s standards.&lt;br&gt;
But it was one of the moments when the broader industry could no longer pretend this line of research was marginal.&lt;/p&gt;

&lt;p&gt;People saw coherent text generation at a level that changed expectations.&lt;br&gt;
Developers, founders, investors, media, and eventually platform teams all started recalibrating.&lt;/p&gt;

&lt;p&gt;That is why saying “GPT-2 would not be possible without Transformers” is not rhetorical exaggeration.&lt;br&gt;
It is just a concise summary of the dependency chain.&lt;/p&gt;

&lt;p&gt;Google created the architectural breakthrough.&lt;br&gt;
OpenAI used that breakthrough to push generative language modeling into a new public phase.&lt;br&gt;
Then the rest of the industry reorganized around the consequences.&lt;/p&gt;




&lt;h2&gt;
  
  
  My take
&lt;/h2&gt;

&lt;p&gt;If you want to understand why the AI industry looks the way it does now, you have to stop treating products as the primary story and start treating architectures as the primary story.&lt;/p&gt;

&lt;p&gt;ChatGPT was a product event.&lt;br&gt;
GPT-2 was an early capability event.&lt;br&gt;
The Transformer was the deeper industry event.&lt;/p&gt;

&lt;p&gt;That is the layer that changed the shape of the map.&lt;/p&gt;

&lt;p&gt;Without Google’s Transformer paper, there is no clean path to GPT-2 as we know it.&lt;br&gt;
Without that path, there is no similar acceleration in large-scale generative language modeling.&lt;br&gt;
And without that acceleration, the current AI industry probably looks slower, messier, and far less unified around the same technical backbone.&lt;/p&gt;

&lt;p&gt;So yes, OpenAI deserves credit for what it built.&lt;br&gt;
But the underlying architecture that made that path real came from Google research.&lt;/p&gt;

&lt;p&gt;The industry loves to celebrate whoever ships the loudest product.&lt;br&gt;
It is worse at remembering who changed the underlying physics.&lt;/p&gt;

&lt;p&gt;In this case, the underlying physics changed in 2017.&lt;br&gt;
And the rest of the AI industry has been compounding that decision ever since.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>transformers</category>
      <category>llm</category>
      <category>google</category>
    </item>
    <item>
      <title>Why all AI-coding plans are getting more expensive?</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:14:19 +0000</pubDate>
      <link>https://dev.to/pvgomes/why-all-ai-coding-plans-are-getting-more-expensive-3ffi</link>
      <guid>https://dev.to/pvgomes/why-all-ai-coding-plans-are-getting-more-expensive-3ffi</guid>
      <description>&lt;p&gt;This weeks are giving the AI coding market a rare honest moment, showing that we are far away from AI-only coding.&lt;/p&gt;

&lt;p&gt;GitHub tightened Copilot’s individual plans, paused some new signups, and explicitly admitted that agentic coding workloads no longer fit inside the old flat-rate box.&lt;/p&gt;

&lt;p&gt;Anthropic, meanwhile, tested removing Claude Code from the $20 Pro plan for some new users while keeping Claude Code in the much more expensive Max tier.&lt;/p&gt;

&lt;p&gt;People are reacting as if these are random pricing tweaks.&lt;br&gt;
They are not.&lt;/p&gt;

&lt;p&gt;This is what it looks like when AI vendors hit physics, GPU supply, and the reality that an “AI coding request” is no longer a cheap autocomplete call.&lt;/p&gt;

&lt;p&gt;The short version is simple:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;flat monthly pricing was designed for chat and light completion usage, but users are now buying autonomous, long-running, tool-using coding workflows that can burn through inference budget like a small batch job.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That mismatch was always going to break.&lt;br&gt;
April 2026 is just when it became impossible to hide.&lt;/p&gt;




&lt;h2&gt;
  
  
  What changed this week
&lt;/h2&gt;

&lt;p&gt;GitHub was unusually explicit.&lt;br&gt;
In its &lt;a href="https://github.blog/news-insights/company-news/changes-to-github-copilot-individual-plans/" rel="noopener noreferrer"&gt;April 20 post about Copilot Individual plans&lt;/a&gt;, the company said it was &lt;strong&gt;“pausing new sign-ups, tightening usage limits, and adjusting model availability”&lt;/strong&gt; in order to protect existing customers.&lt;br&gt;
It also said that &lt;strong&gt;“agentic workflows have fundamentally changed Copilot’s compute demands”&lt;/strong&gt; and that &lt;strong&gt;“it’s now common for a handful of requests to incur costs that exceed the plan price.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is the important sentence.&lt;br&gt;
Not the marketing copy.&lt;br&gt;
That one.&lt;/p&gt;

&lt;p&gt;It means the unit economics are breaking at the workflow level.&lt;br&gt;
Not in some abstract MBA sense, but at the level of a single heavy user running parallel agent sessions on expensive frontier models.&lt;/p&gt;

&lt;p&gt;On the &lt;a href="https://github.com/features/copilot/plans" rel="noopener noreferrer"&gt;GitHub Copilot pricing page&lt;/a&gt;, the public structure now shows individual tiers including &lt;strong&gt;Free&lt;/strong&gt;, &lt;strong&gt;Pro ($10)&lt;/strong&gt;, &lt;strong&gt;Pro+ ($39)&lt;/strong&gt;, and &lt;strong&gt;Max ($99)&lt;/strong&gt;, plus temporary availability messaging and premium-request mechanics.&lt;br&gt;
That is not simplification.&lt;br&gt;
That is a company trying to carve up users by how much expensive inference they consume.&lt;/p&gt;
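
&lt;p&gt;You can sanity-check the “handful of requests” claim with napkin math.&lt;br&gt;
The per-token prices below are illustrative list rates, not any vendor’s internal costs, and the session shape is made up but realistic.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# napkin math: what one agentic session costs at list token prices.
INPUT_PER_M = 3.00    # $ per 1M input tokens (illustrative)
OUTPUT_PER_M = 15.00  # $ per 1M output tokens (illustrative)

def session_cost(input_tokens: int, output_tokens: int) -&amp;gt; float:
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# agent loops resend context on every step, so input grows fast:
# 40 steps, each resending ~60k tokens of repo context, ~2k tokens out.
steps = 40
print(f"one session: ${session_cost(steps * 60_000, steps * 2_000):.2f}")
# one session: $8.40
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A few sessions like that per day and a $10 plan is underwater before the first week ends.&lt;/p&gt;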

&lt;p&gt;Anthropic’s side looks similar, even if the messaging was messier.&lt;br&gt;
Reporting from &lt;a href="https://arstechnica.com/ai/2026/04/anthropic-tested-removing-claude-code-from-the-pro-plan/" rel="noopener noreferrer"&gt;Ars Technica&lt;/a&gt;, &lt;a href="https://www.theregister.com/2026/04/22/anthropic_removes_claude_code_pro/" rel="noopener noreferrer"&gt;The Register&lt;/a&gt;, and &lt;a href="https://thenewstack.io/anthropic-claude-code-limits/" rel="noopener noreferrer"&gt;The New Stack&lt;/a&gt; points to Anthropic testing the removal of Claude Code from the $20 Pro plan for some new signups.&lt;br&gt;
At the same time, the &lt;a href="https://www.anthropic.com/pricing" rel="noopener noreferrer"&gt;current Anthropic pricing page&lt;/a&gt; positions &lt;strong&gt;“Includes Claude Code”&lt;/strong&gt; under Max, starting at &lt;strong&gt;$100/month&lt;/strong&gt;, with Max framed as &lt;strong&gt;5x or 20x more usage than Pro&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Again, the signal is obvious.&lt;br&gt;
The vendors are separating general conversational usage from high-intensity coding-agent usage.&lt;br&gt;
Because they are not the same product economically anymore.&lt;/p&gt;




&lt;h2&gt;
  
  
  The old pricing assumption is dead
&lt;/h2&gt;

&lt;p&gt;For a while, AI coding products got away with pricing that felt almost SaaS-normal.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one monthly fee&lt;/li&gt;
&lt;li&gt;vague “fair use” boundaries&lt;/li&gt;
&lt;li&gt;some model differentiation&lt;/li&gt;
&lt;li&gt;maybe a premium bucket for the power users&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That worked when the product behaved like autocomplete plus a few chat turns.&lt;br&gt;
It does not work when the same product can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;recursively plan and re-plan a task&lt;/li&gt;
&lt;li&gt;read dozens or hundreds of files&lt;/li&gt;
&lt;li&gt;call tools repeatedly&lt;/li&gt;
&lt;li&gt;open parallel subagents&lt;/li&gt;
&lt;li&gt;generate long intermediate contexts&lt;/li&gt;
&lt;li&gt;loop until a test suite passes&lt;/li&gt;
&lt;li&gt;burn expensive large-model calls for review, repair, and validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is not “a request” in the old sense.&lt;br&gt;
That is closer to renting a little slice of an inference cluster and asking it to behave like an intern who never sleeps.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2b4fai71n6285b2qaztr.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2b4fai71n6285b2qaztr.gif" alt="This is fine" width="480" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The product still looks like software.&lt;br&gt;
The cost profile increasingly looks like compute infrastructure.&lt;/p&gt;

&lt;p&gt;That is why the old flat-rate subscription logic is failing.&lt;br&gt;
It assumed average usage would remain average.&lt;br&gt;
But agentic coding does not create average users.&lt;br&gt;
It creates a power-law distribution where a relatively small set of users can generate absurdly disproportionate cost.&lt;/p&gt;

&lt;p&gt;Once that happens, a $20 or $10 all-you-can-eat plan stops being generous and starts being structurally stupid.&lt;/p&gt;
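
&lt;p&gt;You can feel how lopsided that distribution gets with a toy simulation. The Pareto shape parameter below is an assumption chosen purely to illustrate heavy-tailed usage, not a claim about any vendor's real numbers:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// Toy power-law simulation: most users cost pennies, a few cost a fortune.
function paretoSample(alpha: number): number {
  // Inverse-CDF sampling from a Pareto(alpha) distribution, minimum value 1.
  return 1 / Math.pow(1 - Math.random(), 1 / alpha);
}

const users = 10_000;
const costs = Array.from({ length: users }, () =&gt; paretoSample(1.2));
costs.sort((a, b) =&gt; b - a); // most expensive users first

const total = costs.reduce((sum, c) =&gt; sum + c, 0);
const top5pct = costs.slice(0, users / 20).reduce((sum, c) =&gt; sum + c, 0);

console.log(`top 5% of users: ${((top5pct / total) * 100).toFixed(0)}% of cost`);
// With a shape around 1.2, the top 5% routinely carry half or more
// of total spend. Flat-rate pricing cannot survive that curve.
&lt;/code&gt;&lt;/pre&gt;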




&lt;h2&gt;
  
  
  Tokens are only part of the story
&lt;/h2&gt;

&lt;p&gt;A lot of people still talk about this as if the only issue is token pricing.&lt;br&gt;
That is too shallow.&lt;/p&gt;

&lt;p&gt;Yes, tokens are expensive.&lt;br&gt;
Yes, frontier reasoning models are especially expensive.&lt;br&gt;
Yes, long contexts, repeated repair loops, and verbose agent behavior make bills explode.&lt;/p&gt;

&lt;p&gt;But the real cost stack is bigger than raw tokens.&lt;/p&gt;

&lt;p&gt;Vendors are also paying for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;scarce accelerator capacity&lt;/li&gt;
&lt;li&gt;multi-region serving infrastructure&lt;/li&gt;
&lt;li&gt;peak load headroom&lt;/li&gt;
&lt;li&gt;rate limiting and abuse controls&lt;/li&gt;
&lt;li&gt;storage and retrieval for tool-heavy sessions&lt;/li&gt;
&lt;li&gt;orchestration layers for agents and subagents&lt;/li&gt;
&lt;li&gt;premium model routing and fallback systems&lt;/li&gt;
&lt;li&gt;the reliability tax of keeping interactive latency acceptable while users run workloads that increasingly resemble background jobs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point matters.&lt;/p&gt;

&lt;p&gt;AI coding tools are now caught in an awkward middle state.&lt;br&gt;
Users expect chat-like responsiveness, but they increasingly use them for batch-like workloads.&lt;br&gt;
That is a nasty platform problem.&lt;br&gt;
You either overprovision capacity, degrade latency for everyone, or start pushing users into harder limits and more expensive tiers.&lt;/p&gt;

&lt;p&gt;So when people say “the companies need to stop burning tokens,” I think the deeper truth is this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwsecrrkk29jdu4cdaidx.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwsecrrkk29jdu4cdaidx.gif" alt="Burning money" width="498" height="227"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;they need to stop pretending that autonomous coding agents are cheap consumer chat products.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because they are not.&lt;br&gt;
They are compute products wearing a SaaS costume.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this is happening now, specifically in April 2026
&lt;/h2&gt;

&lt;p&gt;The timing is not random.&lt;br&gt;
Several trends converged.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Agent mode crossed from demo into real usage
&lt;/h3&gt;

&lt;p&gt;For the last year, vendors kept shipping “agent” features as if they were just a nice extension of chat.&lt;br&gt;
But in practice, more developers started using them for actual work: multi-file changes, repo exploration, iterative debugging, PR generation, and longer coding loops.&lt;/p&gt;

&lt;p&gt;Once that behavior becomes normal, cost per user stops looking like chat.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The strongest models moved from luxury to default aspiration
&lt;/h3&gt;

&lt;p&gt;Users do not merely want a coding assistant anymore.&lt;br&gt;
They want the best model, the biggest context, the most tools, the longest runs, and the least friction.&lt;br&gt;
If a plan includes access to the good stuff, power users will find the ceiling fast.&lt;/p&gt;

&lt;p&gt;This is what happens when premium models stop being occasional upsells and become the default expectation.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Providers have better usage data now
&lt;/h3&gt;

&lt;p&gt;In 2024 and even much of 2025, some of this could still be hand-waved as growth strategy.&lt;br&gt;
In 2026, these companies now have enough real workload data to know exactly which cohorts are underwater.&lt;br&gt;
And once you know a plan is underwater, the option to ignore it becomes temporary.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Capacity is still finite
&lt;/h3&gt;

&lt;p&gt;The market keeps talking as if inference capacity is infinite because the interface feels infinite.&lt;br&gt;
It is not.&lt;br&gt;
Even large vendors do not have infinite high-end compute, infinite power, or infinite thermal headroom.&lt;br&gt;
A surge in long-running agent sessions does not just change the bill.&lt;br&gt;
It changes scheduling, reliability, and service quality.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. The industry is starting to move from request theater to usage realism
&lt;/h3&gt;

&lt;p&gt;GitHub’s language is especially revealing here.&lt;br&gt;
The moment a provider publicly says that a handful of requests can exceed the plan price, the old fiction is over.&lt;br&gt;
That statement is basically an obituary for simplistic request-based pricing.&lt;/p&gt;

&lt;p&gt;If one “request” can mean “please rename this variable” or “run an autonomous coding workflow across my repo for half an hour,” then request counts are not a serious billing primitive anymore.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zes1vulm86130bvp1o0.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zes1vulm86130bvp1o0.gif" alt="Confused math" width="498" height="462"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  This is also an energy story
&lt;/h2&gt;

&lt;p&gt;There is another part of this that the industry still tries not to say too loudly.&lt;/p&gt;

&lt;p&gt;Inference is not just expensive in money.&lt;br&gt;
It is expensive in electricity, cooling, and hardware wear.&lt;/p&gt;

&lt;p&gt;The AI industry spent the last two years behaving as if the right answer to every product question was “let the model think longer, call more tools, run more agents, increase the context window, and smooth it over with another subsidy.”&lt;br&gt;
That strategy was useful for grabbing market share.&lt;br&gt;
It was never going to be durable.&lt;/p&gt;

&lt;p&gt;If an AI coding session behaves like a tiny distributed compute job, then the industry eventually has to price it like one.&lt;br&gt;
Or ration it like one.&lt;br&gt;
Or both.&lt;/p&gt;

&lt;p&gt;That is what we are seeing now.&lt;br&gt;
Not a moral awakening.&lt;br&gt;
A resource correction.&lt;/p&gt;

&lt;p&gt;The servers were always real.&lt;br&gt;
The power bill was always real.&lt;br&gt;
The GPU queue was always real.&lt;br&gt;
April 2026 is just the month more companies started admitting it in public.&lt;/p&gt;




&lt;h2&gt;
  
  
  Expect more segmentation, not less
&lt;/h2&gt;

&lt;p&gt;I do not think this ends with one weird GitHub week and one awkward Anthropic pricing test.&lt;br&gt;
I think this is the start of a broader repricing cycle.&lt;/p&gt;

&lt;p&gt;Expect more of the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;premium models pushed into higher tiers&lt;/li&gt;
&lt;li&gt;“agent mode” split from normal chat plans&lt;/li&gt;
&lt;li&gt;stricter monthly quotas on autonomous workflows&lt;/li&gt;
&lt;li&gt;token-based or credit-based billing replacing vague request counts&lt;/li&gt;
&lt;li&gt;separate pricing for interactive use versus background execution&lt;/li&gt;
&lt;li&gt;more enterprise emphasis, because business customers tolerate clearer metering better than consumers do&lt;/li&gt;
&lt;li&gt;more local and hybrid execution, because offloading some work becomes economically rational&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, AI coding tools are becoming less like Netflix and more like cloud infrastructure.&lt;br&gt;
That should surprise nobody.&lt;/p&gt;

&lt;p&gt;The weird part is that the industry tried to pretend otherwise for so long.&lt;/p&gt;




&lt;h2&gt;
  
  
  The market trained users to expect the wrong thing
&lt;/h2&gt;

&lt;p&gt;Part of the backlash is deserved.&lt;br&gt;
The vendors created it.&lt;/p&gt;

&lt;p&gt;They spent months teaching users to think in slogans like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;unlimited&lt;/li&gt;
&lt;li&gt;your AI teammate&lt;/li&gt;
&lt;li&gt;delegate whole tasks&lt;/li&gt;
&lt;li&gt;runs in the background&lt;/li&gt;
&lt;li&gt;use the best models&lt;/li&gt;
&lt;li&gt;more agentic workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then the bill arrived.&lt;/p&gt;

&lt;p&gt;You cannot market autonomous software labor and then act shocked when people use it like autonomous software labor.&lt;br&gt;
If the product pitch is “give it bigger jobs,” users will give it bigger jobs.&lt;br&gt;
If the product pitch is “parallelize your work,” users will parallelize their work.&lt;/p&gt;

&lt;p&gt;So yes, some developers are burning absurd amounts of inference.&lt;br&gt;
But the companies encouraged exactly that behavior.&lt;/p&gt;

&lt;p&gt;This is why I expect pricing language to get more explicit from here.&lt;br&gt;
Not because the companies suddenly became honest, but because they cannot afford not to be.&lt;/p&gt;




&lt;h2&gt;
  
  
  My take
&lt;/h2&gt;

&lt;p&gt;I think both GitHub and Anthropic are reacting to the same underlying truth:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI coding has moved from lightweight assistance to compute-intensive delegated work, and the original prosumer subscription tiers were priced for the former, not the latter.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That is why access is being segmented.&lt;br&gt;
That is why the better models are moving upward.&lt;br&gt;
That is why “premium requests” are showing up.&lt;br&gt;
That is why coding-agent access is being pulled toward $39, $99, or $100+ tiers instead of being bundled into a cheap chat subscription.&lt;/p&gt;

&lt;p&gt;This is not greed replacing generosity.&lt;br&gt;
It is reality catching up with a bad pricing abstraction.&lt;/p&gt;

&lt;p&gt;And honestly, that is healthy.&lt;/p&gt;

&lt;p&gt;The AI market needs fewer fake-unlimited plans and more truthful pricing tied to actual resource use.&lt;br&gt;
It also needs users to understand that long-running agent workflows are not magic.&lt;br&gt;
They are expensive distributed systems hidden behind a pleasant interface.&lt;/p&gt;

&lt;p&gt;If April 2026 becomes the month vendors stopped subsidizing the fantasy that infinite agentic coding fits inside a cheap flat monthly plan, that will probably be remembered as a correction, not a mistake.&lt;/p&gt;

&lt;p&gt;A lot of AI products still pretend tokens are abstract.&lt;br&gt;
They are not.&lt;br&gt;
They are compute, power, queue time, and capacity.&lt;/p&gt;

&lt;p&gt;And eventually, physics always sends the invoice.&lt;/p&gt;

&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Image banner from &lt;a href="https://www.azquotes.com/quote/103098" rel="noopener noreferrer"&gt;azquotes&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>claudecode</category>
      <category>codeassistant</category>
    </item>
    <item>
      <title>Claude code and 512000 Becoming Public</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Wed, 01 Apr 2026 11:00:24 +0000</pubDate>
      <link>https://dev.to/pvgomes/claude-code-and-512000-becoming-public-4f74</link>
      <guid>https://dev.to/pvgomes/claude-code-and-512000-becoming-public-4f74</guid>
      <description>&lt;h2&gt;
  
  
  The Moment Every Engineer Knows
&lt;/h2&gt;

&lt;p&gt;You publish a package.&lt;/p&gt;

&lt;p&gt;Everything seems fine.&lt;/p&gt;

&lt;p&gt;Then someone on the internet discovers something you absolutely did &lt;strong&gt;not&lt;/strong&gt; intend to ship.&lt;/p&gt;

&lt;p&gt;And suddenly half a million lines of your internal code are circulating on GitHub.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0uhqkunzeptttm7pseni.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0uhqkunzeptttm7pseni.gif" alt="oops" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That’s essentially what just happened with &lt;strong&gt;Claude Code&lt;/strong&gt;, Anthropic’s command-line developer tool.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Actually Happened
&lt;/h2&gt;

&lt;p&gt;Anthropic released version &lt;strong&gt;2.1.88&lt;/strong&gt; of the &lt;code&gt;claude-code&lt;/code&gt; npm package.&lt;/p&gt;

&lt;p&gt;Inside the package there was a &lt;strong&gt;source map file&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For people outside the frontend / JS ecosystem, a quick explanation:&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;source map&lt;/strong&gt; is normally used to map compiled JavaScript back to the original TypeScript source files for debugging.&lt;/p&gt;

&lt;p&gt;But if the source map accidentally includes references to the original sources &lt;strong&gt;and the sources are embedded or accessible&lt;/strong&gt;, anyone can reconstruct the &lt;strong&gt;entire codebase&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And that’s exactly what happened.&lt;/p&gt;

&lt;p&gt;The result:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~ &lt;strong&gt;2,000 TypeScript files&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;~ &lt;strong&gt;512,000 lines of code&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;internal architecture exposed&lt;/li&gt;
&lt;li&gt;instantly mirrored and forked across GitHub&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Within hours, developers were already exploring the internals.&lt;/p&gt;
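
&lt;p&gt;And “exploring” barely overstates the effort involved. The v3 source map format has an optional &lt;code&gt;sourcesContent&lt;/code&gt; array that embeds the full text of the original files; if a bundler fills it in and the &lt;code&gt;.map&lt;/code&gt; file ships, recovery is a few lines of code. A minimal sketch (the file name here is hypothetical):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import { mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname, join } from "node:path";

// A v3 source map: "sources" lists the original file paths and
// "sourcesContent", when present, embeds their complete text.
const map = JSON.parse(readFileSync("cli.js.map", "utf8"));

map.sources.forEach((source: string, i: number) =&gt; {
  const content = map.sourcesContent?.[i];
  if (content == null) return; // nothing embedded for this entry
  const outPath = join("recovered", source);
  mkdirSync(dirname(outPath), { recursive: true });
  writeFileSync(outPath, content);
});
// No decoding of "mappings" required: the original TypeScript
// is sitting in the JSON, waiting to be written back to disk.
&lt;/code&gt;&lt;/pre&gt;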

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhjbc01d3kndr3k0yprh.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyhjbc01d3kndr3k0yprh.gif" alt="developers digging" width="370" height="208"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Important: The Models Were NOT Leaked
&lt;/h2&gt;

&lt;p&gt;Let’s be clear about something.&lt;/p&gt;

&lt;p&gt;The leak &lt;strong&gt;did not expose the Claude models&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;No weights.&lt;br&gt;
No training data.&lt;br&gt;
No internal datasets.&lt;/p&gt;

&lt;p&gt;What leaked is &lt;strong&gt;the CLI developer experience layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Think of it as the &lt;strong&gt;operating system around the AI&lt;/strong&gt;, not the AI itself.&lt;/p&gt;

&lt;p&gt;Still, that layer is extremely valuable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;Many people underestimate developer tooling.&lt;/p&gt;

&lt;p&gt;But tools like &lt;strong&gt;Claude Code, Copilot CLI, Cursor, etc.&lt;/strong&gt; are not simple wrappers.&lt;/p&gt;

&lt;p&gt;They are complex systems with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tool execution frameworks&lt;/li&gt;
&lt;li&gt;memory management&lt;/li&gt;
&lt;li&gt;query orchestration&lt;/li&gt;
&lt;li&gt;prompt pipelines&lt;/li&gt;
&lt;li&gt;plugin systems&lt;/li&gt;
&lt;li&gt;guardrails&lt;/li&gt;
&lt;li&gt;verification loops&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;According to early analysis of the leaked code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~ &lt;strong&gt;40k lines&lt;/strong&gt; for the plugin/tool system&lt;/li&gt;
&lt;li&gt;~ &lt;strong&gt;46k lines&lt;/strong&gt; for the query engine&lt;/li&gt;
&lt;li&gt;complex &lt;strong&gt;memory rewriting pipelines&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;multi-step &lt;strong&gt;memory validation&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;background context refinement&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Which confirms something many engineers already suspected:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;These tools are &lt;strong&gt;production-grade software systems&lt;/strong&gt;, not thin wrappers around LLM APIs.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Why Competitors Will Study This Carefully
&lt;/h2&gt;

&lt;p&gt;This leak gives a &lt;strong&gt;rare look at how a modern AI developer tool is built&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Competitors can now analyze:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;architecture decisions&lt;/li&gt;
&lt;li&gt;orchestration patterns&lt;/li&gt;
&lt;li&gt;prompt pipeline design&lt;/li&gt;
&lt;li&gt;tool execution safety models&lt;/li&gt;
&lt;li&gt;memory persistence strategies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even if companies don't copy code directly (which would be legally risky), they can learn &lt;strong&gt;a lot&lt;/strong&gt; from the design choices.&lt;/p&gt;

&lt;p&gt;It's basically a &lt;strong&gt;blueprint of a modern AI coding assistant.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fapx6f29x7ibau69ccxs4.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fapx6f29x7ibau69ccxs4.gif" alt="engineers reading leaked code" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Security Angle
&lt;/h2&gt;

&lt;p&gt;There's also a darker side.&lt;/p&gt;

&lt;p&gt;Whenever internal architecture becomes public, attackers can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;analyze guardrail logic&lt;/li&gt;
&lt;li&gt;probe validation layers&lt;/li&gt;
&lt;li&gt;search for weaknesses in tool execution&lt;/li&gt;
&lt;li&gt;discover bypass patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Security through obscurity is never a real defense, but suddenly giving attackers &lt;strong&gt;a full architectural map&lt;/strong&gt; does raise the stakes.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Irony
&lt;/h2&gt;

&lt;p&gt;Claude Code itself exists to &lt;strong&gt;help engineers ship code faster&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And the leak happened because of a &lt;strong&gt;release packaging mistake&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not a hack.&lt;/p&gt;

&lt;p&gt;Not a breach.&lt;/p&gt;

&lt;p&gt;Just a build artifact that shouldn't have been published.&lt;/p&gt;

&lt;p&gt;Every engineer reading this knows the feeling.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm4wsqt8fmtyhjxgmcj70.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm4wsqt8fmtyhjxgmcj70.gif" alt="deploy panic" width="640" height="360"&gt;&lt;/a&gt;&lt;/p&gt;
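
&lt;p&gt;The boring prevention is worth spelling out, because it is cheap. A guard script wired into npm's &lt;code&gt;prepack&lt;/code&gt; lifecycle hook runs before every publish; here is a minimal sketch, assuming your build output lives in &lt;code&gt;dist/&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// scripts/check-no-sourcemaps.ts
// Fail the release if any source map is about to be published.
import { readdirSync } from "node:fs";
import { join } from "node:path";

function findSourceMaps(dir: string): string[] {
  const hits: string[] = [];
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const fullPath = join(dir, entry.name);
    if (entry.isDirectory()) hits.push(...findSourceMaps(fullPath));
    else if (entry.name.endsWith(".map")) hits.push(fullPath);
  }
  return hits;
}

const maps = findSourceMaps("dist");
if (maps.length &gt; 0) {
  console.error("Refusing to publish source maps:", maps);
  process.exit(1);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;A few lines of paranoia in CI versus half a million lines of proprietary code in public. Cheap insurance.&lt;/p&gt;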




&lt;h2&gt;
  
  
  My Take
&lt;/h2&gt;

&lt;p&gt;From an engineering perspective, two things stand out.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The sophistication is real
&lt;/h3&gt;

&lt;p&gt;People often say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"These AI tools are just wrappers around an API."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s clearly &lt;strong&gt;not true anymore&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Half a million lines of orchestration code tells a different story.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. This will accelerate the ecosystem
&lt;/h3&gt;

&lt;p&gt;Ironically, leaks like this often &lt;strong&gt;speed up innovation&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Now thousands of developers can study:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;memory strategies&lt;/li&gt;
&lt;li&gt;prompt pipelines&lt;/li&gt;
&lt;li&gt;tool orchestration&lt;/li&gt;
&lt;li&gt;AI developer UX design&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And build their own versions faster.&lt;/p&gt;

&lt;p&gt;That’s how software ecosystems evolve.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;AI development tools are becoming something closer to &lt;strong&gt;operating systems for programming&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;They orchestrate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;context&lt;/li&gt;
&lt;li&gt;tools&lt;/li&gt;
&lt;li&gt;models&lt;/li&gt;
&lt;li&gt;memory&lt;/li&gt;
&lt;li&gt;safety&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Claude Code is just one example of this emerging layer.&lt;/p&gt;

&lt;p&gt;Seeing its internals confirms something many of us suspected:&lt;/p&gt;

&lt;p&gt;We are not just building &lt;strong&gt;AI models&lt;/strong&gt; anymore.&lt;/p&gt;

&lt;p&gt;We are building &lt;strong&gt;AI runtime environments for developers.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;Some engineers are probably embarrassed about this release.&lt;/p&gt;

&lt;p&gt;But if we’re honest…&lt;/p&gt;

&lt;p&gt;Every developer has shipped something accidentally.&lt;/p&gt;

&lt;p&gt;Most of the time it's harmless.&lt;/p&gt;

&lt;p&gt;Sometimes it's &lt;strong&gt;half a million lines of proprietary code.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flpc2x3xnxsvwvnka1lm2.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flpc2x3xnxsvwvnka1lm2.gif" alt="ship it" width="350" height="252"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vibecoding</category>
    </item>
    <item>
      <title>OpenClaw is infrastructure, not intelligence</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Mon, 09 Feb 2026 10:14:34 +0000</pubDate>
      <link>https://dev.to/pvgomes/openclaw-is-infrastructure-not-intelligence-3d5o</link>
      <guid>https://dev.to/pvgomes/openclaw-is-infrastructure-not-intelligence-3d5o</guid>
      <description>&lt;p&gt;There is a new hype going around called OpenClaw, and Yes, it is a hype. Early AGI? no.&lt;/p&gt;

&lt;p&gt;OpenClaw is an agent orchestration framework. A good one. Clean. Opinionated. Well documented.&lt;br&gt;
But conceptually, it is not new.&lt;/p&gt;

&lt;p&gt;So what is OpenClaw, actually? Based on its own docs and code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;OpenClaw lets you define agents&lt;/li&gt;
&lt;li&gt;Each agent can run tools, call LLMs, keep state, and execute steps&lt;/li&gt;
&lt;li&gt;You can run it locally on your own machine&lt;/li&gt;
&lt;li&gt;You control the infrastructure and the data flow&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;that’s it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is not intelligence. This is software plumbing for LLM-powered workflows 😒&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoze5eq2tffqu1ndt097.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhoze5eq2tffqu1ndt097.gif" alt="calmdown" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  We have seen this before
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Nothing here is magic or unprecedented.&lt;/li&gt;
&lt;li&gt;You can do the same thing today with &lt;a href="https://n8n.io/?utm_source=pvgomes" rel="noopener noreferrer"&gt;n8n&lt;/a&gt;, more scalably and with far more guardrails&lt;/li&gt;
&lt;li&gt;You can do the same thing with LangChain-style agent loops&lt;/li&gt;
&lt;li&gt;You can do the same thing with custom workers + queues + LLM APIs (see the sketch after this list)&lt;/li&gt;
&lt;/ul&gt;
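
&lt;p&gt;Here is that “same thing” stripped to its skeleton, in generic TypeScript. This is deliberately &lt;em&gt;not&lt;/em&gt; OpenClaw's actual API; &lt;code&gt;callLLM&lt;/code&gt; and &lt;code&gt;runTool&lt;/code&gt; are stubs standing in for whatever model client and tools you wire up:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;// The loop every agent framework repackages: ask the model,
// run the tool it picked, feed the result back, repeat.
type Step = { role: string; content: string };

// Stubs: swap in a real model provider and real tool execution.
async function callLLM(history: Step[]) {
  return { tool: "shell", args: "ls", done: false };
}
async function runTool(tool: string, args: string) {
  return `output of ${tool} ${args}`;
}

async function agent(task: string, maxSteps = 10) {
  const history: Step[] = [{ role: "user", content: task }];
  for (let i = 0; maxSteps &gt; i; i++) {
    const decision = await callLLM(history); // model picks a tool or stops
    if (decision.done) break;
    const result = await runTool(decision.tool, decision.args);
    history.push({ role: "tool", content: result }); // feed result back
  }
  return history; // the "state" every framework persists somewhere
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Everything else, the routers, the messaging integrations, the guardrails, is glue around a loop like this one.&lt;/p&gt;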

&lt;p&gt;Let’s be honest: the difference is not capability, it is ergonomics.&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvkt762gdpnt9zsu0jejj.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvkt762gdpnt9zsu0jejj.gif" alt="ergonomics" width="480" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OpenClaw gives you a cleaner mental model and less glue code. I see this as a more “developer-first” way to define agents, and as we say in Polish 🇵🇱, &lt;strong&gt;oczywiście&lt;/strong&gt;, which means &lt;strong&gt;of course&lt;/strong&gt;, that is valuable. It is just not AGI.&lt;/p&gt;

&lt;p&gt;“But it runs locally, that’s new”&lt;/p&gt;

&lt;p&gt;Not really.&lt;/p&gt;

&lt;p&gt;OpenAI themselves shipped local / semi-local execution environments in the past:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tool calling&lt;/li&gt;
&lt;li&gt;Code execution sandboxes&lt;/li&gt;
&lt;li&gt;Agent-like runtimes that could call external systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those experiments came and went. Some were removed.&lt;br&gt;
The idea stayed.&lt;/p&gt;

&lt;p&gt;OpenClaw is simply doing this outside OpenAI, in an open-source, self-hosted way.&lt;/p&gt;

&lt;p&gt;Their differentiator? The router and the messaging integrations: WhatsApp, iMessage, Telegram... you name it.&lt;/p&gt;

&lt;p&gt;But again:&lt;br&gt;
You can already do this with n8n, webhooks, and LLM APIs.&lt;/p&gt;

&lt;p&gt;The real difference is cost and control:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;With OpenClaw, you can run everything on your own local machine&lt;/li&gt;
&lt;li&gt;With n8n or cloud setups, you usually pay more over time&lt;/li&gt;
&lt;li&gt;This is a deployment choice, not a breakthrough.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So what is the right framing?&lt;/p&gt;

&lt;h4&gt;
  
  
  OpenClaw is:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Not AGI&lt;/li&gt;
&lt;li&gt;Not consciousness&lt;/li&gt;
&lt;li&gt;Not reasoning in the human sense&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  OpenClaw is:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;An elegant agent framework&lt;/li&gt;
&lt;li&gt;A solid abstraction over LLM workflows&lt;/li&gt;
&lt;li&gt;A good tool if you want local-first, controlled execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is already enough.&lt;br&gt;
It does not need AGI branding to be useful.&lt;/p&gt;

&lt;h3&gt;
  
  
  My conclusion
&lt;/h3&gt;

&lt;p&gt;Calling OpenClaw “AGI” is marketing-driven confusion. Calling it a clean, modern way to build agents is accurate. And accuracy matters more than hype. I do see this as game-changing for some cool things: you now have an agent that can work 24 hours a day. But it still needs tons of triggers and guidelines, and it will hallucinate just as we already see in an ordinary OpenAI chat. You can use it, and you can deliver awesome work with it, but it will not replace a human. Yet.&lt;/p&gt;

&lt;h3&gt;
  
  
  References
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Docs: &lt;a href="https://docs.openclaw.ai/start/getting-started" rel="noopener noreferrer"&gt;https://docs.openclaw.ai/start/getting-started&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;https://github.com/openclaw/openclaw&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;n8n: &lt;a href="https://n8n.io/?utm_source=pvgomes" rel="noopener noreferrer"&gt;https://n8n.io/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>openclaw</category>
      <category>agi</category>
      <category>ai</category>
    </item>
    <item>
      <title>VS Code now has agents: what changed</title>
      <dc:creator>Paulo Victor Leite Lima Gomes</dc:creator>
      <pubDate>Sun, 08 Feb 2026 15:24:13 +0000</pubDate>
      <link>https://dev.to/pvgomes/vs-code-now-has-agents-what-changed-d39</link>
      <guid>https://dev.to/pvgomes/vs-code-now-has-agents-what-changed-d39</guid>
      <description>&lt;p&gt;&lt;strong&gt;Keeping up with AI coding tools is exhausting&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First we had simple chat assistants.&lt;/li&gt;
&lt;li&gt;Then multiple modes like ask, edit, plan, agent.&lt;/li&gt;
&lt;li&gt;Then different models.&lt;/li&gt;
&lt;li&gt;And now… VS Code added something new: agent execution environments such as Local, Background, Cloud, and Claude&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you feel lost, good.&lt;br&gt;
Me too.&lt;br&gt;
So let’s simplify what is actually happening.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17ghmco2na88oq2wukyl.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F17ghmco2na88oq2wukyl.gif" alt="tired" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  From chat assistants to real agents
&lt;/h3&gt;

&lt;p&gt;At the beginning, which was like two years ago😅, AI in coding was basically:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Here’s some code. Fix it."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;That was the Ask era&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Then tools started editing files directly; that became the Edit era.&lt;/p&gt;

&lt;p&gt;After that, AI began planning multi-step changes.&lt;br&gt;
Welcome to Plan.&lt;/p&gt;

&lt;p&gt;And now we are here where AI can run tasks, monitor progress, and execute workflows.&lt;br&gt;
&lt;strong&gt;That is the Agent era&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the real shift 💩 and it is part of the thousand options we now have with AI, with every company chasing attention through AI initiatives.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftv0f8hv1g02skrybp4w9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftv0f8hv1g02skrybp4w9.gif" alt="agents-are-real" width="480" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why VS Code suddenly has more options on the side panel
&lt;/h3&gt;

&lt;p&gt;Before, the decision was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which mode?&lt;/li&gt;
&lt;li&gt;Which model?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now there is a third dimension:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Where does the agent run?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the shift from "AI that answers and acts as our copilot" to "AI that does the work".&lt;/p&gt;

&lt;p&gt;This is of course not something new; it was the idea all along, and now it is the biggest trend. And yes, it is getting harder to follow. That is why we are here: to organize it mentally.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx3to3vua6mpqpv2zmyl9.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx3to3vua6mpqpv2zmyl9.gif" alt="mental" width="245" height="140"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How to think about the new VS Code Agents (simple mental model)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;first, forget the complexity&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Just remember:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Modes = what the AI should do: ask, edit, plan, agent&lt;/li&gt;
&lt;li&gt;Model = how smart it is, more or less&lt;/li&gt;
&lt;li&gt;Execution environment = where it runs: local, background, cloud, claude&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;That’s it&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now it is more than autocomplete, a copilot, or step-by-step assistance; it is really delegating the work from inside your IDE.&lt;/p&gt;

&lt;p&gt;For the full official explanation from Microsoft, &lt;a href="https://code.visualstudio.com/docs/copilot/agents/overview" rel="noopener noreferrer"&gt;read here&lt;/a&gt;&lt;/p&gt;

</description>
      <category>vscode</category>
      <category>ai</category>
      <category>agents</category>
      <category>softwaredevelopment</category>
    </item>
  </channel>
</rss>
