<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Max Quimby</title>
    <description>The latest articles on DEV Community by Max Quimby (@max_quimby).</description>
    <link>https://dev.to/max_quimby</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3823178%2F0a97facc-1e95-494c-9db9-084aa3b35e47.png</url>
      <title>DEV Community: Max Quimby</title>
      <link>https://dev.to/max_quimby</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/max_quimby"/>
    <language>en</language>
    <item>
      <title>Block Just Cut 40% of Its Engineers. BuilderBot Writes the Code Now.</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Thu, 02 Apr 2026 17:47:41 +0000</pubDate>
      <link>https://dev.to/max_quimby/block-just-cut-40-of-its-engineers-builderbot-writes-the-code-now-p6h</link>
      <guid>https://dev.to/max_quimby/block-just-cut-40-of-its-engineers-builderbot-writes-the-code-now-p6h</guid>
      <description>&lt;p&gt;There's a moment in the a16z interview where Owen Jennings, a Block executive, says something that makes you pause: &lt;strong&gt;"We are not writing code by hand anymore. That's over."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not "we're using AI to assist our engineers." Not "we've increased productivity." The correlation between headcount and output at Block — the $50B fintech company formerly known as Square — &lt;strong&gt;broke&lt;/strong&gt; in the first week of December 2025. And the company acted on it.&lt;/p&gt;

&lt;p&gt;Block cut over 40% of its engineering staff. Squads that once had 14 people now run with 3 or 4. Their internal AI coding agent, BuilderBot, autonomously merges pull requests and takes features to 85-90% completion before a human ever looks at the code.&lt;/p&gt;

&lt;p&gt;This isn't a startup experimenting with AI tools. This is a publicly traded company with thousands of employees making a structural bet that AI agents can replace the majority of traditional software engineering work. And the early evidence suggests they're right.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;📊 The numbers:&lt;/strong&gt; 40%+ engineering staff reduction. Squads from 14 → 3-4 people. BuilderBot takes features to 85-90% completion autonomously. These aren't projections — they're operational reality at Block as of Q1 2026.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The "Binary Shift" — What Actually Happened in December 2025
&lt;/h2&gt;

&lt;p&gt;Owen Jennings describes a specific inflection point. Not a gradual improvement — a discontinuous jump. In the first week of December 2025, two things shipped nearly simultaneously: &lt;strong&gt;Anthropic's Opus 4.6&lt;/strong&gt; and &lt;strong&gt;OpenAI's Codex 5.3&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The critical breakthrough wasn't raw intelligence. It was the ability to work with &lt;strong&gt;existing complex codebases&lt;/strong&gt; — not just greenfield projects. Before December, AI coding tools were impressive on new projects but struggled with the tangled reality of production systems: legacy APIs, undocumented business logic, migration debt, cross-service dependencies.&lt;/p&gt;

&lt;p&gt;Opus 4.6 and Codex 5.3 crossed that threshold. Suddenly, AI agents could navigate Block's massive codebase — hundreds of services, years of accumulated complexity — and make meaningful changes that actually passed CI and code review.&lt;/p&gt;

&lt;p&gt;Jennings called it a "binary shift." One week the correlation between headcount and output held. The next week, it didn't. The implications were immediate and brutal.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/krdrkl38nRw"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  How BuilderBot Actually Works
&lt;/h2&gt;

&lt;p&gt;BuilderBot isn't just Copilot bolted onto an IDE. It's an autonomous agent deeply integrated into Block's development workflow:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Workflow
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ticket ingestion.&lt;/strong&gt; BuilderBot reads Jira tickets, design specs, and related documentation. It understands what needs to be built — not just the code change, but the business context.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Codebase navigation.&lt;/strong&gt; The agent maps dependencies, understands service boundaries, and identifies which files need modification. This is where pre-December models failed — they couldn't hold the full context of a complex system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Implementation.&lt;/strong&gt; BuilderBot writes the code, creates tests, and handles cross-service changes. It doesn't just generate snippets — it builds complete feature implementations.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Self-review and iteration.&lt;/strong&gt; Before submitting a PR, BuilderBot runs the test suite, checks for common anti-patterns, and iterates on its own output. Failed tests trigger automatic debugging cycles.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Autonomous PR merge.&lt;/strong&gt; For changes within established patterns and confidence thresholds, BuilderBot merges its own PRs without human review. Higher-risk changes get flagged for human review by the remaining senior engineers.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
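
&lt;p&gt;The five steps read naturally as a control loop. Below is a minimal, purely hypothetical sketch of that loop: BuilderBot is internal to Block and nothing about its API is public, so every name, signature, and threshold here is invented for illustration.&lt;/p&gt;

```python
# Hypothetical sketch of the five-step loop above. BuilderBot is not public;
# these names, signatures, and thresholds are invented for illustration.
from dataclasses import dataclass

@dataclass
class Ticket:
    summary: str
    risk: str  # "low" = within established patterns, "high" = security/architecture

def run_agent(ticket, implement, run_tests, max_attempts=3):
    """Steps 3-5: implement, self-review until tests pass, then merge or flag.
    Steps 1-2 (ticket ingestion, codebase navigation) are assumed to happen
    inside `implement`."""
    attempts, tests_pass = 0, False
    while max_attempts > attempts and not tests_pass:
        change = implement(ticket)        # step 3: write code plus tests
        tests_pass = run_tests(change)    # step 4: self-review / debug cycle
        attempts += 1
    if tests_pass and ticket.risk == "low":
        return "merged-autonomously"      # step 5: within confidence threshold
    return "flagged-for-human-review"     # high-risk, or tests never passed

# Toy run: tests fail once, pass on the retry, change is low-risk
state = {"runs": 0}
def implement(t): return f"diff for {t.summary}"
def run_tests(change):
    state["runs"] += 1
    return state["runs"] >= 2
print(run_agent(Ticket("add refund endpoint", "low"), implement, run_tests))
# merged-autonomously
```

&lt;p&gt;The structural point is the branch at the end: passing tests alone isn't enough to self-merge; the change also has to fall inside the low-risk, established-pattern bucket. Everything else goes to the remaining senior engineers.&lt;/p&gt;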

&lt;h3&gt;
  
  
  The 85-90% Number
&lt;/h3&gt;

&lt;p&gt;When Jennings says BuilderBot takes features to "85-90% completion," he means the agent handles the entire implementation — from understanding requirements to writing code to passing tests. The remaining 10-15% is typically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Edge cases&lt;/strong&gt; that require deep domain expertise&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Design decisions&lt;/strong&gt; that involve product trade-offs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-team coordination&lt;/strong&gt; that requires human judgment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security-sensitive changes&lt;/strong&gt; that demand human review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The senior engineers who remain on squads spend their time on this 10-15% — the highest-judgment work that requires understanding not just the code but the business, the users, and the regulatory environment.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;How the squads changed:&lt;/strong&gt; Block's engineering squads went from 14 people (mix of junior, mid, and senior engineers plus a manager) to 3-4 people (senior engineers and a tech lead). The junior and mid-level implementation work that filled most of the headcount is now handled by BuilderBot.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Moat Thesis: Understanding &amp;gt; Code
&lt;/h2&gt;

&lt;p&gt;The most strategically important thing Jennings said wasn't about BuilderBot. It was about what makes a company defensible in the AI era:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"The moat is which companies understand something super hard for others to understand."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Block's edge isn't their codebase — any AI agent could eventually write equivalent payments processing code. Their edge is &lt;strong&gt;deep data on how sellers and buyers participate in the economy&lt;/strong&gt;. Years of transaction data, merchant behavior patterns, fraud signals, lending risk models — that's institutional knowledge that can't be replicated by pointing an AI at a blank repository.&lt;/p&gt;

&lt;p&gt;This reframes the entire competitive landscape. If code becomes commoditized (and Block is betting it already has), then the companies that survive are the ones with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Proprietary data&lt;/strong&gt; that feeds better models and decisions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain expertise&lt;/strong&gt; that AI can't learn from public sources&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network effects&lt;/strong&gt; that compound with scale&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regulatory knowledge&lt;/strong&gt; in complex, licensed industries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything else — the UI, the APIs, the infrastructure — becomes a commodity that any sufficiently capable AI agent can reproduce. Jennings described this as an "existential vibe-coding threat" to companies that can't answer what they &lt;em&gt;uniquely&lt;/em&gt; know.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Enterprise AI Landscape That Made This Possible
&lt;/h2&gt;

&lt;p&gt;Block's transformation didn't happen in a vacuum. The broader enterprise AI market underwent a seismic shift in the same period.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Anthropic Surge
&lt;/h3&gt;

&lt;p&gt;According to data cited by Peter Diamandis at the Abundance360 Summit and discussed on the All-In Podcast, enterprise AI market share flipped dramatically in late 2025:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;📊 Enterprise AI market share (Q1 2026):&lt;/strong&gt; Anthropic 73% vs OpenAI 26% — a complete reversal from 60/40 in OpenAI's favor just three months earlier. Claude Code and Opus 4.6 were the forcing function.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This isn't just about benchmarks. Anthropic's Claude Code became the default enterprise coding agent because it could do what Block needed — navigate complex existing codebases, not just generate new code. The tool's ability to understand monorepo structures, respect existing patterns, and integrate with CI/CD pipelines made it the backbone of systems like BuilderBot.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/SjBNoni78aM"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  The Multi-Model Reality
&lt;/h3&gt;

&lt;p&gt;Block isn't exclusively using one provider. The most sophisticated enterprise AI deployments in 2026 use multiple models for different tasks — a pattern that Gauntlet AI's Austen Allred has been vocal about.&lt;/p&gt;

&lt;p&gt;The emerging enterprise stack looks like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic Claude (Opus 4.6)&lt;/strong&gt; for complex reasoning and codebase navigation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI Codex 5.3&lt;/strong&gt; for rapid code generation and test writing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google's models&lt;/strong&gt; for design and UI work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom fine-tuned models&lt;/strong&gt; for domain-specific tasks (fraud detection, risk scoring)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Block's BuilderBot likely orchestrates across multiple providers, routing different subtasks to whichever model handles them best. This multi-model approach is why the "which AI is best?" question is increasingly irrelevant — the answer is "all of them, for different things."&lt;/p&gt;
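
&lt;p&gt;That orchestration claim is speculation, and the sketch below is ours, not Block's: the simplest version of multi-model routing is a table mapping subtask types to providers, with the model identifiers here serving as stand-ins.&lt;/p&gt;

```python
# Illustrative only: the routing table, model identifiers, and task names
# are assumptions, not Block's actual configuration.
ROUTING_TABLE = {
    "codebase-navigation": "anthropic/claude-opus",
    "complex-reasoning":   "anthropic/claude-opus",
    "code-generation":     "openai/codex",
    "test-writing":        "openai/codex",
    "ui-design":           "google/gemini",
    "fraud-scoring":       "internal/fine-tuned-risk-model",
}

def route(task_type, default="anthropic/claude-opus"):
    """Send each subtask to the provider that handles it best; fall back
    to a general-purpose model for anything unrecognized."""
    return ROUTING_TABLE.get(task_type, default)

print(route("test-writing"))   # openai/codex
print(route("unknown-task"))   # anthropic/claude-opus
```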

&lt;h2&gt;
  
  
  What the All-In Crew Is Saying
&lt;/h2&gt;

&lt;p&gt;The All-In Podcast — hosted by four billionaire tech investors who collectively touch hundreds of companies — covered Block's transformation extensively. Their reactions reveal how the investment community is processing this shift:&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/4Gmd5UTF4rk"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;The consensus among the hosts: Block is the canary in the coal mine, not the exception. Every company with a large engineering org is going to face the same math — AI agents that can do 85-90% of implementation work at a fraction of the cost of human engineers.&lt;/p&gt;

&lt;p&gt;David Sacks framed it in terms of unit economics: if BuilderBot handles the work of 10 engineers at the cost of compute tokens, the ROI is so overwhelming that &lt;em&gt;not&lt;/em&gt; adopting similar tools becomes a fiduciary risk.&lt;/p&gt;

&lt;h2&gt;
  
  
  The ServiceNow Signal
&lt;/h2&gt;

&lt;p&gt;Block isn't alone. ServiceNow research published this week demonstrates that terminal-based coding agents with direct API access can now handle enterprise automation tasks that previously required dedicated teams.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;@_akhaliq&lt;/strong&gt;: "Terminal Agents Suffice for Enterprise Automation — ServiceNow research shows terminal-based coding agents with direct API access..." — &lt;a href="https://x.com/_akhaliq/status/2039734774894395747" rel="noopener noreferrer"&gt;View on X&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The pattern is consistent: autonomous agents aren't just writing code — they're operating within enterprise systems, making API calls, handling workflows, and closing tickets. The "agent" in "coding agent" is becoming literal.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Engineering Leaders
&lt;/h2&gt;

&lt;p&gt;If you lead an engineering organization, Block's story isn't something to panic about — it's something to prepare for. Here's the practical playbook:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Audit Your "Moat" — Today
&lt;/h3&gt;

&lt;p&gt;Ask yourself: &lt;strong&gt;What does my company understand that's genuinely hard for others to understand?&lt;/strong&gt; If the answer is "we have a good codebase" or "we have experienced engineers," you're in trouble. Code is being commoditized. Engineering talent is being augmented to the point where a team of 4 can do what 14 used to do.&lt;/p&gt;

&lt;p&gt;The defensible moats are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Proprietary data and the models trained on it&lt;/li&gt;
&lt;li&gt;Deep domain expertise in regulated industries&lt;/li&gt;
&lt;li&gt;Network effects that compound with usage&lt;/li&gt;
&lt;li&gt;Customer relationships built on trust and switching costs&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Restructure Teams Around Judgment, Not Output
&lt;/h3&gt;

&lt;p&gt;Block's squad reduction from 14 to 3-4 isn't arbitrary. They kept the people who do the highest-judgment work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Architects&lt;/strong&gt; who make design decisions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Senior engineers&lt;/strong&gt; who handle edge cases and security&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tech leads&lt;/strong&gt; who coordinate across systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain experts&lt;/strong&gt; who understand the business context&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern for everyone else: fewer engineers, each with dramatically amplified capability. A senior engineer with BuilderBot-class tooling does the implementation work that previously required a team.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Invest in AI Infrastructure, Not Headcount
&lt;/h3&gt;

&lt;p&gt;Jensen Huang's recent provocation — that a $500K engineer should be spending $250K on AI tokens — sounds like marketing from the CEO of a GPU company. But the math at Block validates the principle. The cost of AI compute tokens to run BuilderBot is a fraction of the salary, benefits, and overhead of the engineers it replaced.&lt;/p&gt;

&lt;p&gt;The shift in capital allocation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before:&lt;/strong&gt; 80% salaries, 20% tools/infrastructure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After:&lt;/strong&gt; 40% salaries (for senior staff), 40% AI compute, 20% infrastructure&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Start with the Boring Stuff
&lt;/h3&gt;

&lt;p&gt;Block didn't start by having AI write their core payments engine. They started with repetitive, well-defined tasks: CRUD endpoints, migration scripts, test coverage, documentation updates. As confidence in the tooling grew, they expanded scope.&lt;/p&gt;

&lt;p&gt;Your roadmap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Month 1-2:&lt;/strong&gt; AI-assisted code review and test generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Month 3-4:&lt;/strong&gt; Autonomous handling of bug fixes and small features&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Month 5-6:&lt;/strong&gt; Full feature implementation with human review&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Month 7+:&lt;/strong&gt; Selective autonomous merge for high-confidence changes&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. Retrain, Don't Just Fire
&lt;/h3&gt;

&lt;p&gt;The hardest part of Block's story isn't the technology — it's the human cost. A 40% staff reduction means real people losing real jobs. The companies that handle this well will retrain engineers for the roles that remain:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI system architects (designing the agent pipelines)&lt;/li&gt;
&lt;li&gt;Prompt engineers and agent operators&lt;/li&gt;
&lt;li&gt;Quality assurance and security reviewers&lt;/li&gt;
&lt;li&gt;Domain experts who guide AI output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The companies that handle it badly will face lawsuits, talent acquisition problems, and the kind of reputation damage that makes future hiring harder.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🔑 The key insight:&lt;/strong&gt; Block's BuilderBot didn't replace engineers — it replaced &lt;em&gt;engineering tasks&lt;/em&gt;. The humans who remain are doing fundamentally different work: guiding AI agents, making judgment calls, and leveraging domain expertise that models can't replicate. The question isn't "will AI replace engineers?" It's "what kind of engineering work is left for humans?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Naval Ravikant distilled the shift in a single tweet that went viral this week:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;&lt;a class="mentioned-user" href="https://dev.to/naval"&gt;@naval&lt;/a&gt;&lt;/strong&gt;: "Vibe coding is more addictive than any video game ever made (if you know what you want to build)." — &lt;a href="https://x.com/naval/status/2039617101221224858" rel="noopener noreferrer"&gt;View on X&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The subtext is important: "if you know what you want to build." That conditional is doing enormous work. The value isn't in the coding — it's in knowing what to code. Block figured this out. Their remaining engineers aren't coders — they're decision-makers who happen to use code as their medium.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Uncomfortable Questions
&lt;/h2&gt;

&lt;p&gt;Block's transformation raises questions that the industry hasn't fully grappled with:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What happens to junior engineers?&lt;/strong&gt; If AI handles the work that juniors traditionally do — implementing well-defined features, writing tests, fixing bugs — how do junior engineers develop the skills to become senior engineers? Block's squad structure assumes a steady supply of experienced engineers, but the pipeline that produces them may be drying up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is 85-90% completion good enough?&lt;/strong&gt; The remaining 10-15% that requires human judgment includes security, edge cases, and architectural decisions — the exact areas where mistakes are most costly. Are 3-4 person squads enough to catch the things that AI misses?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How far does this go?&lt;/strong&gt; Block is at 40% reduction now. If AI capabilities continue improving at the current rate, is 60% next? 80%? At what point does the company become entirely dependent on AI agents, with human engineers serving only as a safety net?&lt;/p&gt;

&lt;p&gt;These aren't rhetorical questions. They're planning questions. Every engineering leader needs to have answers — or at least frameworks for finding answers — before the "binary shift" hits their organization.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Block's story is the most concrete, publicly documented case of AI agents replacing a significant portion of engineering work at a major company. It's not theoretical anymore. The numbers are real: 40% headcount reduction, 14-person squads becoming 4-person squads, an AI agent that autonomously merges PRs.&lt;/p&gt;

&lt;p&gt;The "binary shift" Jennings described — where the correlation between headcount and output suddenly breaks — isn't unique to Block. It's a threshold that every engineering organization will cross as AI coding agents improve. The question is whether you cross it on your own terms, with a plan for restructuring and retraining, or whether it catches you unprepared.&lt;/p&gt;

&lt;p&gt;Block's bet is clear: the future of software engineering isn't humans writing code. It's humans understanding problems, and AI writing the solutions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"We are not writing code by hand anymore. That's over."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The rest of the industry is about to find out if Jennings is right.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: a16z interview with Owen Jennings, All-In Podcast, Peter Diamandis Abundance360 Summit. Enterprise market share data cited by multiple sources covering Q1 2026 enterprise AI adoption.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>career</category>
      <category>webdev</category>
    </item>
    <item>
      <title>AI Agent Supply Chain Attacks: What the LiteLLM Breach Means for Your Stack</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Thu, 02 Apr 2026 17:45:37 +0000</pubDate>
      <link>https://dev.to/max_quimby/ai-agent-supply-chain-attacks-what-the-litellm-breach-means-for-your-stack-3kcp</link>
      <guid>https://dev.to/max_quimby/ai-agent-supply-chain-attacks-what-the-litellm-breach-means-for-your-stack-3kcp</guid>
      <description>&lt;p&gt;The morning of March 31, 2026, started badly for the AI ecosystem. A malicious actor had slipped compromised versions of LiteLLM — one of the most widely-deployed LLM proxy libraries in production — onto PyPI. The poisoned packages were live for &lt;strong&gt;40 minutes&lt;/strong&gt;. That was enough.&lt;/p&gt;

&lt;p&gt;Mercor, the AI-powered hiring platform backed by top-tier VCs, disclosed it had been hit. And Wiz's cloud scanning data made the scale immediately clear: LiteLLM is present in &lt;strong&gt;36% of cloud environments&lt;/strong&gt;. ~500,000 machines reached. This wasn't a niche tool getting exploited. This was the supply chain for AI.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;40 minutes of exposure. ~500,000 machines reached. 36% of cloud AI environments at risk.&lt;/strong&gt;&lt;br&gt;
The LiteLLM breach is the AI ecosystem's SolarWinds moment — and most teams aren't prepared.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What Actually Happened
&lt;/h2&gt;

&lt;p&gt;LiteLLM is a Python library that provides a unified API across 100+ LLM providers — OpenAI, Anthropic, Cohere, Bedrock, and more. If you're running AI agents in production and you want to switch between model providers without rewriting your code, LiteLLM is the answer most teams reach for. That trust is exactly what made it a target.&lt;/p&gt;

&lt;p&gt;The compromised versions contained code designed to exfiltrate credentials — specifically, the API keys and tokens that LiteLLM is typically configured with to proxy requests to LLM providers. In an AI agent stack, those keys aren't just for one service. They're often for all of them: your OpenAI key, your Anthropic key, your database credentials, your cloud provider tokens. One compromised dependency, one credential dump.&lt;/p&gt;

&lt;p&gt;Mercor disclosed the breach and confirmed the vector. &lt;a href="https://techcrunch.com/2026/03/31/mercor-says-it-was-hit-by-cyberattack-tied-to-compromise-of-open-source-litellm-project/" rel="noopener noreferrer"&gt;TechCrunch reported the details&lt;/a&gt;. The &lt;a href="https://news.ycombinator.com/item?id=47596739" rel="noopener noreferrer"&gt;Hacker News discussion (110 points, 34 comments)&lt;/a&gt; surfaced the broader concern: who else was hit and simply hasn't disclosed yet?&lt;/p&gt;




&lt;h2&gt;
  
  
  The Same Week: Axios Attack and the Delve Scandal
&lt;/h2&gt;

&lt;p&gt;The LiteLLM breach didn't happen in isolation. The same week saw:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Axios npm attack.&lt;/strong&gt; Andrej Karpathy flagged it on X: &lt;code&gt;axios@1.14.1&lt;/code&gt;, the npm HTTP library with 300 million weekly downloads, was compromised via a maintainer account takeover. The supply chain attack pattern — target a widely trusted package, slip in credential-exfiltrating code — worked the same way across both npm and PyPI simultaneously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Delve scandal.&lt;/strong&gt; Delve, a YC-backed compliance startup, was simultaneously accused of forking open-source tools and reselling them as proprietary software — &lt;a href="https://techcrunch.com/2026/04/01/the-reputation-of-troubled-yc-startup-delve-has-gotten-even-worse/" rel="noopener noreferrer"&gt;a TechCrunch investigation&lt;/a&gt; detailed the allegations. Adding insult to injury: the LiteLLM breach had exposed Mercor's customer list, which included Delve — meaning Delve was on a hacker-targeting shortlist the same week its ethics came under scrutiny.&lt;/p&gt;

&lt;p&gt;Three stories converging in one week isn't coincidence. It's a signal about the structural vulnerability of the AI tooling layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why AI Agent Pipelines Are Uniquely Vulnerable
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The proxy layer problem.&lt;/strong&gt; Libraries like LiteLLM sit in a privileged position: they're the translation layer between your application and every LLM provider you use. They necessarily hold credentials for all of them. Compromise the proxy, and you don't just get one key — you get the whole credential store.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Credentials are exceptionally sensitive.&lt;/strong&gt; In most AI agent deployments, the credentials passed through the LLM proxy include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LLM provider API keys (which can rack up enormous costs)&lt;/li&gt;
&lt;li&gt;Database read credentials (agents query data)&lt;/li&gt;
&lt;li&gt;Cloud storage tokens (agents access files)&lt;/li&gt;
&lt;li&gt;Service integration keys (agents call external APIs)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The dependency surface is massive and moving fast.&lt;/strong&gt; AI tooling is evolving at a pace that makes rigorous security review nearly impossible at the team level. A project that was legitimate and well-reviewed six months ago may have changed maintainers, added dependencies, or been targeted since.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Most teams never audit their PyPI dependencies.&lt;/strong&gt; Ask yourself: do you know every transitive dependency in your AI agent stack? Do you run integrity checks on installed packages? For most teams, the honest answer is no.&lt;/p&gt;
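
&lt;p&gt;A first step toward an honest answer, using nothing but the standard library: enumerate every installed distribution and what it declares as dependencies. This is a visibility sketch, not a security scanner.&lt;/p&gt;

```python
# Visibility check for the question above: what is actually installed, and
# what does each package pull in? Standard library only (Python 3.8+).
from importlib.metadata import distributions

def dependency_map():
    """Map each installed distribution to its declared requirements."""
    deps = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            deps[name] = list(dist.requires or [])
    return deps

deps = dependency_map()
print(f"{len(deps)} installed distributions")
for name in sorted(deps)[:3]:
    print(f"  {name}: {len(deps[name])} declared dependencies")
```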




&lt;h2&gt;
  
  
  The 5-Step Security Checklist for AI Agent Stacks
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Pin Your Dependencies with Hash Verification
&lt;/h3&gt;

&lt;p&gt;Don't just pin versions — verify hashes:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate hashes for pinned requirements&lt;/span&gt;
pip-compile &lt;span class="nt"&gt;--generate-hashes&lt;/span&gt; requirements.in &lt;span class="nt"&gt;-o&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# Install with hash verification&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--require-hashes&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Version pinning alone doesn't protect against maintainer account compromise; hash verification does.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Isolate Credential Access by Function
&lt;/h3&gt;

&lt;p&gt;Your LLM proxy should not hold credentials for systems it doesn't need to reach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLM proxy (LiteLLM):&lt;/strong&gt; Only needs LLM provider API keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent orchestrator:&lt;/strong&gt; Only needs credentials for authorized tools&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory layer:&lt;/strong&gt; Only needs database read/write for its designated tables&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Use separate service accounts and &lt;strong&gt;never pass credentials as plain-text environment variables in containerized environments&lt;/strong&gt; — use a secrets manager.&lt;/p&gt;
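
&lt;p&gt;One way to enforce that separation in code is a per-component allow-list in front of the secrets manager. The component names and secret IDs below are illustrative, and &lt;code&gt;fetch_secret&lt;/code&gt; is a placeholder for your real client (Vault, AWS Secrets Manager, etc.):&lt;/p&gt;

```python
# Sketch of least-privilege credential scoping. Component names and secret
# IDs are illustrative; fetch_secret is a stand-in for a real secrets-manager
# client call.
ALLOWED_SECRETS = {
    "llm-proxy":    {"openai-api-key", "anthropic-api-key"},
    "orchestrator": {"search-tool-key", "ticketing-api-key"},
    "memory-layer": {"agent-db-readwrite"},
}

def fetch_secret(secret_id):
    # Placeholder: a real implementation calls your secrets manager here.
    return f"secret-value-for-{secret_id}"

def get_credential(component, secret_id):
    """Refuse to hand any component a secret outside its allow-list."""
    if secret_id not in ALLOWED_SECRETS.get(component, set()):
        raise PermissionError(f"{component} may not read {secret_id}")
    return fetch_secret(secret_id)

print(get_credential("llm-proxy", "openai-api-key"))
```

&lt;p&gt;With this shape, a compromised LLM proxy can leak LLM keys — bad, but bounded — instead of leaking the database and cloud credentials too.&lt;/p&gt;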

&lt;h3&gt;
  
  
  Step 3: Enable Dependency Auditing in CI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# GitHub Actions example&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Audit Python dependencies&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;pip install pip-audit&lt;/span&gt;
    &lt;span class="s"&gt;pip-audit --requirement requirements.txt --strict&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Audit npm dependencies&lt;/span&gt;  
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm audit --audit-level=high&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also run &lt;a href="https://github.com/google/osv-scanner" rel="noopener noreferrer"&gt;OSV-Scanner&lt;/a&gt; — it catches issues that pip-audit misses.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Monitor for Anomalous API Key Usage
&lt;/h3&gt;

&lt;p&gt;Set up alerts for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Unusual LLM provider spend spikes&lt;/li&gt;
&lt;li&gt;API calls from unexpected IP ranges or regions&lt;/li&gt;
&lt;li&gt;Off-hours authentication events&lt;/li&gt;
&lt;li&gt;Failed auth attempts against your cloud provider&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A compromised key burning $10K/day in tokens is a signal you want to catch in hours, not on your monthly bill.&lt;/p&gt;
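
&lt;p&gt;A spend-spike check is simple enough to sketch end to end. The window, multiplier, and numbers below are made up for illustration; a real version would pull daily usage from each provider's billing or usage API.&lt;/p&gt;

```python
# Toy spend-spike detector: flag any day whose LLM spend exceeds a multiple
# of the trailing average. Threshold values and data are assumptions; in
# production, feed this from your provider's usage API.
def spend_alerts(daily_spend, multiplier=3.0, window=7):
    """Return indices of days whose spend exceeds multiplier x trailing mean."""
    alerts = []
    for i in range(window, len(daily_spend)):
        trailing = daily_spend[i - window:i]
        baseline = sum(trailing) / window
        if daily_spend[i] > multiplier * baseline:
            alerts.append(i)
    return alerts

# Seven quiet days, then a compromised-key style token burn
usage = [120, 110, 130, 125, 118, 122, 127, 9800]
print(spend_alerts(usage))  # [7]
```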

&lt;h3&gt;
  
  
  Step 5: Run a Quarterly Dependency Audit
&lt;/h3&gt;

&lt;p&gt;Every 90 days, review your AI stack's dependency tree:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Have any maintainers changed on critical packages?&lt;/li&gt;
&lt;li&gt;Have any packages been flagged on PyPI Safety DB or npm Security Advisories?&lt;/li&gt;
&lt;li&gt;Are you still on pinned versions, or has a &lt;code&gt;pip install --upgrade&lt;/code&gt; crept in?&lt;/li&gt;
&lt;li&gt;Do you know who maintains your top 10 most critical AI dependencies?&lt;/li&gt;
&lt;/ul&gt;
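
&lt;p&gt;The version-drift question lends itself to a small helper: parse the &lt;code&gt;==&lt;/code&gt; pins out of your requirements file and diff them against what's actually installed. A stdlib-only sketch that only understands the pinned format pip-compile emits:&lt;/p&gt;

```python
# Helper for the drift question above: compare pinned requirements against
# installed versions. Only handles `name==version` pins (pip-compile output).
from importlib.metadata import version, PackageNotFoundError

def parse_pins(requirements_text):
    """Extract {package: pinned_version}, skipping comments and hash lines."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if "==" in line and not line.startswith("#") and not line.startswith("--"):
            name, _, rest = line.partition("==")
            pins[name.strip()] = rest.split()[0].split(";")[0]
    return pins

def drift(pins):
    """Packages whose installed version differs from the pin (None = missing)."""
    report = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != pinned:
            report[name] = (pinned, installed)
    return report

# Hypothetical package names, purely for demonstration
sample = "examplepkg==1.0.0 \\\n    --hash=sha256:abc123\notherpkg==2.3.4  # via examplepkg\n"
print(parse_pins(sample))  # {'examplepkg': '1.0.0', 'otherpkg': '2.3.4'}
```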




&lt;h2&gt;
  
  
  The Broader Pattern
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;36% of cloud environments&lt;/strong&gt; run LiteLLM. One compromised maintainer account. 40 minutes of exposure.&lt;/p&gt;

&lt;p&gt;The blast radius of AI tooling supply chain attacks scales with adoption. The most popular packages are the most valuable targets.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Simon Willison — one of the most reliable signal-filters in the AI developer community — flagged supply chain attacks against PyPI and npm in his March newsletter alongside agentic engineering patterns. When Willison puts something in his signal-filtered newsletter, it's past the noise threshold.&lt;/p&gt;

&lt;p&gt;The same week as the LiteLLM breach, Andrej Karpathy personally found a compromised npm package in his own environment. If someone who actively thinks about AI security was hit locally, the rate of undetected compromise across production systems is almost certainly higher than the public disclosure count suggests.&lt;/p&gt;

&lt;p&gt;The question isn't whether your AI stack has been targeted. The question is whether you'd know if it had.&lt;/p&gt;




&lt;h2&gt;
  
  
  What to Do Right Now
&lt;/h2&gt;

&lt;p&gt;If you're running LiteLLM or any AI proxy layer in production:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Immediately&lt;/strong&gt;: Check your installed version. If you auto-updated during the breach window, rotate all credentials the package had access to.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Today&lt;/strong&gt;: Enable hash verification for your Python dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;This week&lt;/strong&gt;: Audit what credentials each component of your stack holds. Scope them down.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;This month&lt;/strong&gt;: Set up dependency scanning in CI and usage anomaly alerts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ongoing&lt;/strong&gt;: Treat your AI dependency tree as attack surface, not infrastructure.&lt;/li&gt;
&lt;/ol&gt;
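&lt;p&gt;Step 2 in practice — a sketch using pip-tools, with placeholder file names:&lt;/p&gt;

```shell
# One-time setup (file names are placeholders):
#
#   pip-compile --generate-hashes requirements.in -o requirements.txt
#   pip install --require-hashes -r requirements.txt
#
# pip-compile (from pip-tools) writes --hash=sha256:... next to every pin,
# and --require-hashes makes pip abort on any mismatch. The digest it checks
# is the same one `pip hash` computes for a local artifact:
echo 'demo contents' > /tmp/demo-artifact.tar.gz
python3 -m pip hash /tmp/demo-artifact.tar.gz
```

&lt;p&gt;With hashes enforced, a maintainer-account takeover that swaps the artifact behind an existing version number fails your install instead of shipping to production.&lt;/p&gt;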

&lt;p&gt;The Mercor breach is the incident that got disclosed. There are almost certainly others that haven't been disclosed yet. Get ahead of it now, while fixing it is still a precaution and not a post-incident response.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://techcrunch.com/2026/03/31/mercor-says-it-was-hit-by-cyberattack-tied-to-compromise-of-open-source-litellm-project/" rel="noopener noreferrer"&gt;TechCrunch — Mercor breach&lt;/a&gt; · &lt;a href="https://news.ycombinator.com/item?id=47596739" rel="noopener noreferrer"&gt;HN discussion (110pts)&lt;/a&gt; · &lt;a href="https://techcrunch.com/2026/04/01/the-reputation-of-troubled-yc-startup-delve-has-gotten-even-worse/" rel="noopener noreferrer"&gt;TechCrunch — Delve scandal&lt;/a&gt; · &lt;a href="https://x.com/karpathy/status/2038849654423798197" rel="noopener noreferrer"&gt;@karpathy on axios attack&lt;/a&gt; · &lt;a href="https://simonwillison.net/2026/Apr/2/march-newsletter/" rel="noopener noreferrer"&gt;Simon Willison March newsletter&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>python</category>
      <category>devops</category>
    </item>
    <item>
      <title>Qwen 3.6-Plus: Alibaba's Agent Model Built for Real-World Tasks (Review)</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Thu, 02 Apr 2026 16:13:04 +0000</pubDate>
      <link>https://dev.to/max_quimby/qwen-36-plus-alibabas-agent-model-built-for-real-world-tasks-review-3b5c</link>
      <guid>https://dev.to/max_quimby/qwen-36-plus-alibabas-agent-model-built-for-real-world-tasks-review-3b5c</guid>
      <description>&lt;p&gt;Liquid syntax error: Tag '{%% youtube aNg47-U_x6A %%}' was not properly terminated with regexp: /\%\}/&lt;/p&gt;
</description>
      <category>ai</category>
      <category>agents</category>
      <category>opensource</category>
      <category>review</category>
    </item>
    <item>
      <title>AMD's Lemonade Just Made Every Nvidia-Only AI Guide Obsolete</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Thu, 02 Apr 2026 15:58:38 +0000</pubDate>
      <link>https://dev.to/max_quimby/amds-lemonade-just-made-every-nvidia-only-ai-guide-obsolete-2a3l</link>
      <guid>https://dev.to/max_quimby/amds-lemonade-just-made-every-nvidia-only-ai-guide-obsolete-2a3l</guid>
      <description>&lt;p&gt;Search for "how to run LLMs locally" and count the Nvidia logos. CUDA this, CUDA that. If you own AMD hardware — and statistically, a lot of you do — the local AI ecosystem has treated you like a second-class citizen for years.&lt;/p&gt;

&lt;p&gt;That just changed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Famd-lemonade-local-llm-server-guide-2026-hero.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Famd-lemonade-local-llm-server-guide-2026-hero.png" alt="AMD Lemonade local AI server hero image" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Lemonade is an open-source, AMD-backed local AI server that handles LLM chat, image generation, speech synthesis, and transcription — all from a single install, all running on your hardware, all private. It hit 216 points on Hacker News this week, and the discussion thread tells you everything about why AMD users are paying attention.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;🍋 What Lemonade actually is:&lt;/strong&gt; A 2MB native C++ service that auto-configures for your AMD GPU, NPU, or CPU. It exposes an OpenAI-compatible API at &lt;code&gt;localhost:13305&lt;/code&gt;, meaning any app that talks to OpenAI (VS Code Copilot, Open WebUI, n8n, Continue, hundreds more) works out of the box — pointed at your own machine instead of the cloud. Zero tokens billed. Zero data leaving your network.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Why This Matters Right Now
&lt;/h2&gt;

&lt;p&gt;The local AI movement has been building momentum for two years. Ollama proved the concept. LM Studio made it pretty. But both share a dirty secret: &lt;strong&gt;AMD support is an afterthought.&lt;/strong&gt; ROCm drivers are a maze. Getting llama.cpp to build with the right GPU target is a weekend project. Most users give up.&lt;/p&gt;

&lt;p&gt;Lemonade's value proposition is brutally simple: &lt;strong&gt;one install, it detects your hardware, it works.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"If you have an AMD machine and want to run local models with minimal headache… it's really the easiest method. This runs on my NAS, handles my home assistant setup." — HN commenter&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But it's not just ease of use. Lemonade is the &lt;strong&gt;only&lt;/strong&gt; open-source OpenAI-compatible server that offers AMD Ryzen AI NPU acceleration. That's a hardware advantage Nvidia cannot match — there is no Nvidia NPU in your laptop.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture: NPU + GPU Hybrid Execution
&lt;/h2&gt;

&lt;p&gt;On Ryzen AI 300/400 series chips (Strix Point, Strix Halo), Lemonade splits the workload:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompt processing (prefill)&lt;/strong&gt; → Offloaded to the NPU, which has superior compute throughput for this specific task. This minimizes Time To First Token (TTFT).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Token generation (decode)&lt;/strong&gt; → Handed to the integrated GPU (iGPU) or discrete GPU, which has better memory bandwidth for sequential token generation.&lt;/p&gt;

&lt;p&gt;This hybrid approach is why a Ryzen AI laptop can feel snappier than raw token-per-second numbers would suggest.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/mcf7dDybUco"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Benchmarks: What Can You Actually Expect?
&lt;/h2&gt;

&lt;p&gt;These are from AMD's own benchmarks on a Ryzen AI 9 HX 370 laptop (Radeon 890M, 32GB LPDDR5X-7500) running DeepSeek-R1-Distill-Llama-8B at INT4:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Context Length&lt;/th&gt;
&lt;th&gt;Time to First Token&lt;/th&gt;
&lt;th&gt;Tokens/Second&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;128 tokens&lt;/td&gt;
&lt;td&gt;0.94s&lt;/td&gt;
&lt;td&gt;20.7 tok/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;256 tokens&lt;/td&gt;
&lt;td&gt;1.14s&lt;/td&gt;
&lt;td&gt;20.5 tok/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;512 tokens&lt;/td&gt;
&lt;td&gt;1.65s&lt;/td&gt;
&lt;td&gt;20.0 tok/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1024 tokens&lt;/td&gt;
&lt;td&gt;2.68s&lt;/td&gt;
&lt;td&gt;19.2 tok/s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2048 tokens&lt;/td&gt;
&lt;td&gt;5.01s&lt;/td&gt;
&lt;td&gt;17.6 tok/s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Those are &lt;strong&gt;integrated graphics&lt;/strong&gt; numbers. Not a $1,500 discrete GPU — a laptop chip.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;📊 Community benchmarks from Strix Halo (128GB):&lt;/strong&gt; GPT-OSS 120B at ~50 tok/s • Qwen3-Coder-Next at 43 tok/s (Q4) • Qwen3.5 35B-A3B at 55 tok/s (Q4) • Qwen3.5 27B at 11-12 tok/s (Q4, dense architecture). Yes — a 120B parameter model running at 50 tokens/second on a desktop APU with no discrete GPU.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Setup: From Zero to Running in Under 5 Minutes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Windows (Recommended)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Download the installer from GitHub releases&lt;/span&gt;
&lt;span class="c"&gt;# https://github.com/lemonade-sdk/lemonade/releases/latest&lt;/span&gt;
&lt;span class="c"&gt;# Run Lemonade_Server_Installer.exe&lt;/span&gt;

&lt;span class="c"&gt;# 2. Select your models during installation&lt;/span&gt;
&lt;span class="c"&gt;# The installer auto-detects your GPU/NPU&lt;/span&gt;

&lt;span class="c"&gt;# 3. Launch from desktop shortcut — that's it.&lt;/span&gt;
&lt;span class="c"&gt;# Server runs at http://localhost:13305&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Linux (Ubuntu/Fedora)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Ubuntu (snap)&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;snap &lt;span class="nb"&gt;install &lt;/span&gt;lemonade-server

&lt;span class="c"&gt;# Fedora (RPM)&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;dnf &lt;span class="nb"&gt;install &lt;/span&gt;lemonade-server

&lt;span class="c"&gt;# Start the server&lt;/span&gt;
lemonade run Gemma-3-4b-it-GGUF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  macOS (Beta)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install via the official installer&lt;/span&gt;
&lt;span class="c"&gt;# https://lemonade-server.ai/install_options.html#macos&lt;/span&gt;
lemonade run Gemma-3-4b-it-GGUF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once running, pulling and switching models is dead simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Browse available models&lt;/span&gt;
lemonade list

&lt;span class="c"&gt;# Pull and run a model&lt;/span&gt;
lemonade pull Gemma-3-4b-it-GGUF
lemonade run Gemma-3-4b-it-GGUF

&lt;span class="c"&gt;# Multi-modality&lt;/span&gt;
lemonade run SDXL-Turbo        &lt;span class="c"&gt;# Image gen&lt;/span&gt;
lemonade run kokoro-v1          &lt;span class="c"&gt;# Speech synthesis&lt;/span&gt;
lemonade run Whisper-Large-v3-Turbo  &lt;span class="c"&gt;# Transcription&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Connecting Apps: The OpenAI-Compatible Trick
&lt;/h2&gt;

&lt;p&gt;Because Lemonade exposes an OpenAI-standard API, any app that supports custom OpenAI endpoints works immediately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:13305/api/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lemonade&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# required but unused
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Llama-3.2-1B-Instruct-Hybrid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Explain quantum computing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That same endpoint works with VS Code Copilot, Open WebUI, Continue, n8n, and any OpenAI SDK in Python, Node.js, Go, Rust, C#, Java, Ruby, or PHP.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/yZs-Yzl736E"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Lemonade vs. Ollama: The Honest Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Lemonade&lt;/th&gt;
&lt;th&gt;Ollama&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primary focus&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;AMD optimization + multi-modality&lt;/td&gt;
&lt;td&gt;Cross-platform model serving&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GPU support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ROCm (AMD), Vulkan, Metal (beta)&lt;/td&gt;
&lt;td&gt;CUDA (Nvidia), ROCm, Metal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NPU support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ XDNA2 (Ryzen AI 300/400)&lt;/td&gt;
&lt;td&gt;❌ None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Modalities&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Chat, Vision, Image Gen, TTS, STT&lt;/td&gt;
&lt;td&gt;Chat, Vision&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API compatibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OpenAI, Ollama, Anthropic&lt;/td&gt;
&lt;td&gt;Ollama, OpenAI (partial)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multiple models&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Simultaneously&lt;/td&gt;
&lt;td&gt;One at a time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mobile app&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ iOS + Android&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Binary size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~2MB (server)&lt;/td&gt;
&lt;td&gt;~200MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Bottom line:&lt;/strong&gt; If you're on AMD hardware, Lemonade is the better choice. If you need Nvidia CUDA support or the simplest possible cross-platform install, Ollama is still the safer bet.&lt;/p&gt;

&lt;p&gt;One HN user ran a direct comparison on an M1 Max MacBook:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Model: qwen3.59b. Ollama completed in about 1:44. Lemonade completed in about 1:14. So it seems faster in this very limited test."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The NPU Question: Is It Worth It?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What NPUs are good for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Low-power "always-on" inference for small models (1-4B parameters)&lt;/li&gt;
&lt;li&gt;Accelerating prompt processing (prefill) in hybrid mode&lt;/li&gt;
&lt;li&gt;Running AI tasks without touching your GPU&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What NPUs are NOT good for (yet):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Running large models (&amp;gt;10B parameters)&lt;/li&gt;
&lt;li&gt;Matching discrete GPU speeds for raw token generation&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ NPU reality check:&lt;/strong&gt; The NPU kernels used by Lemonade's FastFlowLM backend are proprietary (free for reasonable commercial use). The llama.cpp GPU path remains fully open. If you're on a Strix Halo with 128GB RAM, the GPU path is fast enough that NPU acceleration is a nice-to-have.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What's Coming Next
&lt;/h2&gt;

&lt;p&gt;The Lemonade roadmap is active:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MLX support&lt;/strong&gt; — for better Apple Silicon performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vLLM support&lt;/strong&gt; — for high-throughput serving&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enhanced custom model support&lt;/strong&gt; — easier GGUF/ONNX imports from Hugging Face&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With Ubuntu 26.04 LTS adding native AMD NPU support and Lemonade 10.0 shipping Linux NPU support via FastFlowLM, Linux users are getting first-class treatment too.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;llama.cpp creator Georgi Gerganov just joined Hugging Face — a consolidation moment for the open-source local AI stack. Google's TurboQuant paper demonstrated KV cache compression to 3 bits, potentially slashing memory requirements. The infrastructure for running capable AI on consumer hardware is converging fast.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I find it very frustrating to get LLMs, diffusion, etc. working fast on AMD. It's way too much work." — HN commenter, explaining exactly why Lemonade exists&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Lemonade exists because that frustration is real, widespread, and fixable. If you've got AMD silicon, give it a shot. The install is a few minutes, the API is standard, and the models are free.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://lemonade-server.ai" rel="noopener noreferrer"&gt;Lemonade Server&lt;/a&gt; — Official site&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/lemonade-sdk/lemonade" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt; — Source code + releases&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.amd.com/en/developer/resources/technical-articles/unlocking-a-wave-of-llm-apps-on-ryzen-ai-through-lemonade-server.html" rel="noopener noreferrer"&gt;AMD Developer Article&lt;/a&gt; — Technical deep-dive&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://news.ycombinator.com/item?id=47612724" rel="noopener noreferrer"&gt;Hacker News Discussion&lt;/a&gt; — Community reactions&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://www.computeleap.com/blog/amd-lemonade-local-llm-server-guide-2026/" rel="noopener noreferrer"&gt;ComputeLeap&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>amd</category>
      <category>opensource</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Best AI Agent Orchestration Tools in 2026: From Superpowers to oh-my-claudecode</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Tue, 31 Mar 2026 17:47:22 +0000</pubDate>
      <link>https://dev.to/max_quimby/best-ai-agent-orchestration-tools-in-2026-from-superpowers-to-oh-my-claudecode-1h54</link>
      <guid>https://dev.to/max_quimby/best-ai-agent-orchestration-tools-in-2026-from-superpowers-to-oh-my-claudecode-1h54</guid>
      <description>&lt;p&gt;We've crossed a threshold. Single-agent coding assistants — Copilot, Claude Code, Codex — are table stakes. The frontier has shifted to &lt;strong&gt;multi-agent orchestration&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;&lt;a href="https://www.agentconn.com/blog/best-ai-agent-orchestration-tools-2026/" rel="noopener noreferrer"&gt;Read the full version with charts and embedded sources on AgentConn →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F86q74yv2vtoz8gptw8hn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F86q74yv2vtoz8gptw8hn.png" alt="Hero" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Orchestration Matters Now
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"The unit of software production has changed from team-years to founder-days."&lt;/strong&gt; — &lt;a href="https://x.com/garrytan/status/2038297938892607812" rel="noopener noreferrer"&gt;@garrytan&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;📊 &lt;strong&gt;GitHub Signal:&lt;/strong&gt; 158K+ combined stars across these 6 tools.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  1. Superpowers — 122K★
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/obra/superpowers" rel="noopener noreferrer"&gt;obra/superpowers&lt;/a&gt; — Shell-based agentic skills framework.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/obra/superpowers &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;superpowers &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; ./install.sh
claude &lt;span class="nt"&gt;--skill&lt;/span&gt; skills/my-skill &lt;span class="s2"&gt;"Do Y for this codebase"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. oh-my-claudecode — 15K★
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/Yeachan-Heo/oh-my-claudecode" rel="noopener noreferrer"&gt;Yeachan-Heo/oh-my-claudecode&lt;/a&gt; — Teams-first multi-agent orchestration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; oh-my-claudecode
omcc run &lt;span class="s2"&gt;"implement authentication"&lt;/span&gt; &lt;span class="nt"&gt;--agents&lt;/span&gt; 3 &lt;span class="nt"&gt;--parallel&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/UPtmKh1vMN8"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  3. hermes-agent — 16K★
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;NousResearch/hermes-agent&lt;/a&gt; — The agent that grows with you.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;hermes-agent &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; hermes-agent run &lt;span class="s2"&gt;"refactor auth module"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. learn-claude-code — 42K★
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/shareAI-lab/learn-claude-code" rel="noopener noreferrer"&gt;shareAI-lab/learn-claude-code&lt;/a&gt; — Nano agent harness built from 0 to 1.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/shareAI-lab/learn-claude-code
npx ts-node src/agent.ts &lt;span class="s2"&gt;"add tests for auth module"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/6SnFH43qPAw"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  5. claude-mem — 42K★
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/thedotmack/claude-mem" rel="noopener noreferrer"&gt;thedotmack/claude-mem&lt;/a&gt; — Auto session memory compression.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; claude-mem
claude config plugin add claude-mem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-2038297938892607812-770" src="https://platform.twitter.com/embed/Tweet.html?id=2038297938892607812"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-2038297938892607812-770');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=2038297938892607812&amp;amp;theme=dark"
  }



&lt;/p&gt;

&lt;h2&gt;
  
  
  6. AgentScope — 22K★
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/agentscope-ai/agentscope" rel="noopener noreferrer"&gt;agentscope-ai/agentscope&lt;/a&gt; — Visual, auditable multi-agent pipelines.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/rlIy7b-3DC8"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-2038225369191391370-279" src="https://platform.twitter.com/embed/Tweet.html?id=2038225369191391370"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-2038225369191391370-279');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=2038225369191391370&amp;amp;theme=dark"
  }



&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Stars&lt;/th&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Multi-Agent&lt;/th&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Superpowers&lt;/td&gt;
&lt;td&gt;122K&lt;/td&gt;
&lt;td&gt;Shell&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;oh-my-claudecode&lt;/td&gt;
&lt;td&gt;15K&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;shared&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hermes-agent&lt;/td&gt;
&lt;td&gt;16K&lt;/td&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;learns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;learn-claude-code&lt;/td&gt;
&lt;td&gt;42K&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;claude-mem&lt;/td&gt;
&lt;td&gt;42K&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;td&gt;plugin&lt;/td&gt;
&lt;td&gt;auto&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;agentscope&lt;/td&gt;
&lt;td&gt;22K&lt;/td&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;em&gt;Full article: &lt;a href="https://www.agentconn.com/blog/best-ai-agent-orchestration-tools-2026/" rel="noopener noreferrer"&gt;agentconn.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://www.agentconn.com/blog/best-ai-agent-orchestration-tools-2026/" rel="noopener noreferrer"&gt;Full article on AgentConn →&lt;/a&gt;&lt;/strong&gt; | Follow &lt;a href="https://x.com/ComputeLeapAI" rel="noopener noreferrer"&gt;@ComputeLeapAI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>opensource</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>How a 5-Person AI Startup Outperforms Teams of 25 (With AI Coding Agents)</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Tue, 31 Mar 2026 17:46:01 +0000</pubDate>
      <link>https://dev.to/max_quimby/how-a-5-person-ai-startup-outperforms-teams-of-25-with-ai-coding-agents-5d7d</link>
      <guid>https://dev.to/max_quimby/how-a-5-person-ai-startup-outperforms-teams-of-25-with-ai-coding-agents-5d7d</guid>
      <description>&lt;p&gt;&lt;a href="/blog/ai-coding-agents-startup-hero.png" class="article-body-image-wrapper"&gt;&lt;img src="/blog/ai-coding-agents-startup-hero.png" alt="AI coding agents powering a small startup team — multiple monitors showing code streams in a modern office"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/ai-coding-agents-startup-productivity-2026/" rel="noopener noreferrer"&gt;Read the full version with charts and embedded sources on ComputeLeap →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A 12-person company is processing petabytes of fraud data for Fortune 500 clients. Five engineers. No army of contractors. No offshore development center. Just five people, each running three monitors of AI coding agents — and a customer success manager who ships features without ever opening a terminal.&lt;/p&gt;

&lt;p&gt;This isn't a thought experiment. It's Variance, a YC-backed startup that just emerged from three years of stealth with a $21M Series A to tell the story.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;📊 The numbers that matter:&lt;/strong&gt; Variance — 12-person team, 5 engineers — processes petabytes of data for Fortune 500 marketplaces, detected state-sponsored fraud rings during elections, and operates at a scale that would traditionally require 25+ engineers. Their co-founder describes a team where "every engineer runs three monitors of coding agents."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Variance Playbook: What "AI-Native" Actually Looks Like
&lt;/h2&gt;

&lt;p&gt;In a &lt;a href="https://youtube.com/watch?v=JF6XIixstmQ" rel="noopener noreferrer"&gt;recent Y Combinator interview&lt;/a&gt;, Variance's co-founders — who previously built Trust &amp;amp; Safety ML infrastructure at Apple and Discord — described a workflow that makes traditional dev teams look like they're running uphill in mud.&lt;/p&gt;

&lt;p&gt;Every engineer at Variance operates multiple AI coding agents simultaneously. Not copilot-style autocomplete. Autonomous agents that take a task description, read the codebase, write implementation code, run tests, and submit pull requests — while the engineer supervises and reviews across three screens.&lt;/p&gt;

&lt;p&gt;But the most striking detail isn't about the engineers. It's about their customer success manager. This non-technical team member ships production features to enterprise clients using &lt;a href="https://cursor.sh" rel="noopener noreferrer"&gt;Cursor&lt;/a&gt;'s agent mode — without ever filing an engineering ticket. She describes what the customer needs, the agent writes the code, and the feature goes live after a quick review.&lt;/p&gt;

&lt;p&gt;That's the inflection point. When non-engineers start shipping code, the bottleneck isn't engineering capacity anymore. It's product imagination.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/JF6XIixstmQ"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Why 2026 Is the Tipping Point
&lt;/h2&gt;

&lt;p&gt;This isn't just a Variance story. The entire startup ecosystem is experiencing the same compression.&lt;/p&gt;

&lt;p&gt;Y Combinator president Garry Tan put it bluntly on X last week:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Ftweet-garrytan-unit-of-production.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Ftweet-garrytan-unit-of-production.png" alt="Garry Tan tweet: The unit of software production has changed from team-years to founder-days. Act accordingly." width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://x.com/garrytan/status/2038297938892607812" rel="noopener noreferrer"&gt;View original post on X →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;He's not being hyperbolic. Tan is so invested in this thesis that he's building &lt;a href="https://github.com/garrytan/gstack" rel="noopener noreferrer"&gt;GStack&lt;/a&gt;, an open-source AI development framework, himself. When the president of the world's top startup accelerator writes code for AI dev tools in his spare time, the signal is deafening.&lt;/p&gt;

&lt;p&gt;And the data from the current YC W26 batch backs it up. Solo founders and two-person teams are shipping products that historically required Series A headcount. The economics have flipped: hiring 15 engineers is now a liability if five engineers with agents can ship faster, iterate more quickly, and carry less organizational overhead.&lt;/p&gt;

&lt;p&gt;Meanwhile, Jason Calacanis — investor and All-In podcast co-host — declared on X that "we've already reached AGI — we just haven't implemented it broadly":&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Ftweet-jason-calacanis-agi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Ftweet-jason-calacanis-agi.png" alt="Jason Calacanis tweet about AGI being already reached but not broadly implemented — 580K views" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://x.com/Jason/status/2038330365601816652" rel="noopener noreferrer"&gt;View original post on X →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Whether you agree with the AGI framing or not, the practical reality is clear: AI coding agents are already delivering a 3-5x productivity multiplier for teams that know how to use them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tools: What's Actually Working in 2026
&lt;/h2&gt;

&lt;p&gt;Not all AI coding tools are created equal. Here's a breakdown of what teams like Variance are actually using, and what each tool does best.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Pricing&lt;/th&gt;
&lt;th&gt;Autonomy Level&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CLI agent&lt;/td&gt;
&lt;td&gt;Complex multi-file refactors, architecture work, CI/CD integration&lt;/td&gt;
&lt;td&gt;$100/mo (Max) or $20/mo (Pro)&lt;/td&gt;
&lt;td&gt;High — reads codebase, writes code, runs tests, commits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;IDE (VS Code fork)&lt;/td&gt;
&lt;td&gt;Daily coding, non-engineers shipping features, rapid prototyping&lt;/td&gt;
&lt;td&gt;$20/mo (Pro) or $40/mo (Business)&lt;/td&gt;
&lt;td&gt;Medium-High — agent mode handles full tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Codex CLI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Terminal agent&lt;/td&gt;
&lt;td&gt;Code review, parallel task execution, investigation&lt;/td&gt;
&lt;td&gt;$200/mo (ChatGPT Pro)&lt;/td&gt;
&lt;td&gt;High — autonomous with sandbox execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;IDE extension&lt;/td&gt;
&lt;td&gt;Autocomplete, inline suggestions, quick edits&lt;/td&gt;
&lt;td&gt;$10/mo (Individual) or $19/mo (Business)&lt;/td&gt;
&lt;td&gt;Low-Medium — suggestion-based, new agent mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Windsurf&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;IDE (Codeium)&lt;/td&gt;
&lt;td&gt;Budget teams, educational contexts, lighter projects&lt;/td&gt;
&lt;td&gt;Free tier available, $15/mo Pro&lt;/td&gt;
&lt;td&gt;Medium — Cascade agent flow&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚡ The real unlock:&lt;/strong&gt; Most productive teams don't pick one tool. They stack them. Engineers at companies like Variance run Claude Code for complex backend work and architecture, Cursor for frontend iteration and feature work, and Codex CLI for code review and debugging — simultaneously across multiple monitors.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Claude Code: The Power User's Choice
&lt;/h3&gt;

&lt;p&gt;Claude Code is the tool serious engineering teams gravitate toward. It runs in your terminal, reads your entire codebase (up to 1M tokens of context), and operates as an autonomous agent — not just an autocomplete engine.&lt;/p&gt;

&lt;p&gt;What makes it different: Claude Code understands project architecture. It reads your &lt;code&gt;CLAUDE.md&lt;/code&gt; files for project conventions, uses hooks for CI integration, and can run cloud sessions that follow PRs and auto-fix CI failures while you sleep. Anthropic's recent additions — conditional hooks, cloud auto-fix, and Dispatch (text Claude from your phone, it takes over your desktop) — are turning it from a coding tool into a full development platform.&lt;/p&gt;
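&lt;p&gt;A &lt;code&gt;CLAUDE.md&lt;/code&gt; doesn't need to be elaborate to pay off. Here's a minimal sketch — every project name, command, and convention below is an illustrative placeholder, not a prescribed format; adapt it to your own repo:&lt;/p&gt;

```markdown
# Project conventions for AI agents

## Stack
- Python 3.12, FastAPI, PostgreSQL
- Frontend: TypeScript + React (in `web/`)

## Commands
- Run tests: `make test` (must pass before any commit)
- Lint: `make lint`

## Conventions
- Every new endpoint needs unit tests and an OpenAPI description
- Never commit directly to `main`; open a PR
- Database changes go through migrations, never raw SQL

## Boundaries
- Do not touch `auth/` or `billing/` without explicit human sign-off
```

&lt;p&gt;The point is less the format than the contract: anything you would tell a new hire in their first week belongs in this file.&lt;/p&gt;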

&lt;p&gt;The three-hour advanced course from Nick Saraev is the best practical resource for teams getting started:&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/UPtmKh1vMN8"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  Cursor: The Gateway Drug
&lt;/h3&gt;

&lt;p&gt;Cursor is what gets non-engineers coding. Its VS Code-based interface is familiar, its agent mode is powerful enough to handle full feature implementations, and its learning curve is gentle enough that a customer success manager at Variance ships production code with it.&lt;/p&gt;

&lt;p&gt;For teams with mixed technical backgrounds, Cursor is the highest-leverage starting point. The agent mode handles everything from reading existing code to writing tests to explaining what it did — in a visual interface that doesn't require terminal comfort.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Multi-Agent Setup
&lt;/h3&gt;

&lt;p&gt;The most productive teams in 2026 aren't using one AI tool. They're running a fleet. Here's what a typical engineer's setup looks like at an AI-native startup:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitor 1 — Claude Code (Architecture &amp;amp; Backend)&lt;/strong&gt;&lt;br&gt;
Complex multi-file changes, database migrations, API design, infrastructure work. Claude Code's deep context window and CLAUDE.md project conventions make it ideal for work that requires understanding the full system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitor 2 — Cursor (Feature Development &amp;amp; Frontend)&lt;/strong&gt;&lt;br&gt;
Rapid iteration on features, UI work, quick bug fixes. Agent mode for new features; tab-complete for small edits. This is where the fast, visual feedback loop lives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Monitor 3 — Codex CLI or Review Dashboard&lt;/strong&gt;&lt;br&gt;
Code review, test execution monitoring, debugging investigations. Some engineers use this screen for a second Claude Code session running independent tasks in parallel.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Practical Setup: Getting Your Team Started
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Week 1: Foundation
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Pick your primary agent.&lt;/strong&gt; If your team is mostly engineers, start with Claude Code. If you have non-technical team members who need to ship, start with Cursor.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Create your &lt;code&gt;CLAUDE.md&lt;/code&gt; (or equivalent project config).&lt;/strong&gt; This is the single most impactful thing you can do. Document your coding conventions, architecture decisions, testing requirements, and deployment process. Every AI agent reads these files and follows them — it's like onboarding a new developer in 30 seconds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Start with contained tasks.&lt;/strong&gt; Don't hand the agent your entire roadmap on day one. Start with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Writing unit tests for existing code&lt;/li&gt;
&lt;li&gt;Bug fixes with clear reproduction steps&lt;/li&gt;
&lt;li&gt;Documentation generation&lt;/li&gt;
&lt;li&gt;Refactoring functions the team already understands&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
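&lt;p&gt;To make "contained task" concrete: test-writing for an existing helper is a good first delegation because the output is easy to review line by line. The function and tests below are illustrative, not from any real codebase — this is the shape of output you'd expect back from the agent:&lt;/p&gt;

```python
# Illustrative example of a contained first task to delegate.
# Given an existing helper like this one...
def normalize_email(raw: str) -> str:
    """Lowercase an email address and strip surrounding whitespace."""
    return raw.strip().lower()

# ...an agent asked to "write unit tests for normalize_email"
# should return something reviewable in under a minute:
def test_strips_whitespace_and_lowercases():
    assert normalize_email("  Alice@Example.COM ") == "alice@example.com"

def test_is_idempotent():
    once = normalize_email("Bob@example.com")
    assert normalize_email(once) == once

if __name__ == "__main__":
    test_strips_whitespace_and_lowercases()
    test_is_idempotent()
    print("all tests passed")
```

&lt;p&gt;If the generated tests are wrong, you find out in seconds — which is exactly the feedback loop you want while the team builds trust in the tooling.&lt;/p&gt;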
&lt;h3&gt;
  
  
  Week 2: Expand
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Add a second tool.&lt;/strong&gt; If you started with Claude Code, add Cursor for your frontend work. If you started with Cursor, add Claude Code for your complex backend tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Enable CI integration.&lt;/strong&gt; Claude Code's hooks system can auto-fix failing CI. Set it up so the agent catches lint errors, type issues, and test failures before they hit your PR review queue.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Let a non-engineer try.&lt;/strong&gt; Give your most technically curious non-engineer a Cursor seat and a well-defined feature request. You'll be surprised.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
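&lt;p&gt;For the CI-integration step, Claude Code hooks live in &lt;code&gt;.claude/settings.json&lt;/code&gt;. Treat the sketch below as an assumption to verify against the current Claude Code documentation (the schema evolves, and &lt;code&gt;make lint&lt;/code&gt; is a placeholder for your own lint command); it runs the linter after every file edit the agent makes:&lt;/p&gt;

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "make lint" }
        ]
      }
    ]
  }
}
```

&lt;p&gt;The effect: lint and type errors get caught the moment the agent writes them, not when the PR lands in your review queue.&lt;/p&gt;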
&lt;h3&gt;
  
  
  Week 3+: Scale
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Run parallel agent sessions.&lt;/strong&gt; Each engineer should be comfortable running 2-3 agent sessions simultaneously — one per task stream.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Establish review protocols.&lt;/strong&gt; AI-generated code still needs human review. Set up your code review process explicitly: what to look for, what the agents get wrong, and what patterns to enforce.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  What to Delegate vs. What to Keep Human
&lt;/h2&gt;

&lt;p&gt;This is where most teams get it wrong. They either under-delegate (using AI as fancy autocomplete) or over-delegate (trusting agents with architectural decisions they shouldn't make).&lt;/p&gt;
&lt;h3&gt;
  
  
  Delegate to AI Agents ✅
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Boilerplate and scaffolding&lt;/strong&gt; — CRUD endpoints, model definitions, form components, API clients&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test writing&lt;/strong&gt; — Unit tests, integration tests, test data generation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bug fixes with clear repro steps&lt;/strong&gt; — Stack traces, error messages, reproduction paths&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refactoring&lt;/strong&gt; — Renaming, extracting functions, migrating patterns across files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt; — API docs, README files, inline comments, changelog entries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code review first pass&lt;/strong&gt; — Style violations, common bugs, missing error handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data transformations&lt;/strong&gt; — ETL scripts, format conversions, migration scripts&lt;/li&gt;
&lt;/ul&gt;
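&lt;p&gt;Data transformations illustrate why the tasks above delegate well: the spec is mechanical and correctness is trivially checkable against a few sample rows. A minimal sketch (field names are hypothetical):&lt;/p&gt;

```python
import csv
import io
import json

def csv_to_json_records(csv_text: str) -> str:
    """Convert CSV rows to a JSON array of objects: the kind of
    mechanical transformation that's safe to hand to an agent,
    because you can verify the output against sample input by eye."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

sample = "id,amount\n1,9.99\n2,24.50\n"
print(csv_to_json_records(sample))
```

&lt;p&gt;The human contribution here is the spec and the spot-check, not the typing.&lt;/p&gt;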
&lt;h3&gt;
  
  
  Keep Human 🧠
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Architecture decisions&lt;/strong&gt; — Service boundaries, database choices, API contract design&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security-critical code&lt;/strong&gt; — Authentication flows, encryption, access control, input validation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business logic validation&lt;/strong&gt; — Does this feature actually solve the customer's problem?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance optimization&lt;/strong&gt; — Agents can profile, but humans need to decide what tradeoffs to accept&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incident response&lt;/strong&gt; — When production breaks at 3 AM, you need human judgment about risk and rollback&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hiring and team decisions&lt;/strong&gt; — AI makes your existing team more productive. It doesn't replace the need for the right people.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ The "Entertainment Purposes" Warning:&lt;/strong&gt; Microsoft recently added "for entertainment purposes only" to Copilot's Terms of Service — while simultaneously marketing it as an enterprise productivity tool. This is the legal reality of AI-generated code in 2026: the tools are powerful, but liability sits with you. Always review, always test, and never ship agent-generated code to production without human verification of security-critical paths.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  The Honest Limitations
&lt;/h2&gt;

&lt;p&gt;We're bullish on AI coding agents. We're also engineers. Here's what doesn't work yet.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Novel Architecture Is Still Hard
&lt;/h3&gt;

&lt;p&gt;AI agents excel at implementing patterns they've seen in training data. Ask Claude Code to build a standard REST API, and it'll produce excellent code. Ask it to design a novel event-sourcing architecture for your specific domain constraints, and you'll get something that looks right but misses subtle requirements. Agents implement. Humans architect.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Context Windows Have Limits
&lt;/h3&gt;

&lt;p&gt;Even Claude's 1M token context window has boundaries. Large monorepos with hundreds of services still overwhelm agents. The workaround: structure your codebase into well-defined modules with clear interfaces. Good architecture isn't just for humans anymore — it's for your AI agents too.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Debugging Novel Failures
&lt;/h3&gt;

&lt;p&gt;When the bug is a known pattern — null pointer, race condition, off-by-one — agents are excellent debuggers. When the failure is a novel interaction between your specific library versions, infrastructure configuration, and business logic, agents struggle. They'll suggest plausible fixes that don't address the root cause. For hard bugs, agents are research assistants, not fixers.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. The Security Surface Area
&lt;/h3&gt;

&lt;p&gt;Every AI agent that reads your codebase is a potential data exposure vector. The &lt;a href="https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan" rel="noopener noreferrer"&gt;Axios NPM supply chain compromise&lt;/a&gt; that hit Hacker News today (1,588 points) is a reminder: your dependency chain is your attack surface. AI agents that run arbitrary shell commands add another dimension to that surface. Sandboxing, network isolation, and review gates aren't optional.&lt;/p&gt;
&lt;h3&gt;
  
  
  5. The "Looks Right" Problem
&lt;/h3&gt;

&lt;p&gt;AI-generated code compiles, passes tests, and looks clean. It can also contain subtle logic errors that only surface under specific conditions. The agents are getting better at this — Claude Opus 4.6 catches many of its own mistakes — but human review remains non-negotiable for anything customer-facing.&lt;/p&gt;

&lt;p&gt;A Google DeepMind researcher shared how he stopped writing progress indicators in his code entirely — instead, he just asks a Codex session for ETAs:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Ftweet-giffmana-codex-eta.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Ftweet-giffmana-codex-eta.png" alt="Lucas Beyer tweet about using Codex for ETAs instead of writing progress indicators — showing terminal output" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://x.com/giffmana/status/2038225369191391370" rel="noopener noreferrer"&gt;View original post on X →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That's a creative use case — but it also reveals how deeply these agents are integrating into developer workflows. The integration is happening whether the limitations are solved or not.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Economics: Why This Changes Startup Strategy
&lt;/h2&gt;

&lt;p&gt;The math is simple and brutal.&lt;/p&gt;

&lt;p&gt;A 25-person engineering team at Bay Area market rates costs roughly &lt;strong&gt;$6-8M per year&lt;/strong&gt; in fully-loaded compensation. A 5-person team with AI agent tooling costs &lt;strong&gt;$1.5-2M per year&lt;/strong&gt; in compensation plus maybe &lt;strong&gt;$50K-100K per year&lt;/strong&gt; in AI tool subscriptions.&lt;/p&gt;

&lt;p&gt;That's roughly a 3-5x cost reduction with comparable (and sometimes superior) output velocity. For startups, this isn't just an efficiency gain — it's a fundamentally different funding equation. You need less capital, which means less dilution, which means more optionality.&lt;/p&gt;
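&lt;p&gt;A quick back-of-envelope check on those figures (the per-head and subscription costs are midpoint assumptions taken from the ranges above, nothing more):&lt;/p&gt;

```python
# Midpoints of the ranges quoted above:
# 25 engineers at ~$280K fully loaded -> $7.0M/yr
traditional_team = 25 * 280_000
# 5 engineers at ~$350K fully loaded, plus ~$75K/yr in agent subscriptions
ai_native_team = 5 * 350_000 + 75_000

print(f"traditional: ${traditional_team / 1e6:.1f}M/yr")
print(f"AI-native:   ${ai_native_team / 1e6:.3f}M/yr")
print(f"cost ratio:  {traditional_team / ai_native_team:.1f}x")
```

&lt;p&gt;At the midpoints the ratio comes out just under 4x; the edges of the quoted ranges span roughly 3x to 5x.&lt;/p&gt;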

&lt;p&gt;Variance raised $21M at a point where many comparably-capable companies would have needed $50M+. They're not being capital-efficient because they're scrappy. They're capital-efficient because AI agents changed the production function.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;📊 The funding implications:&lt;/strong&gt; If 5 engineers with AI agents match the output of 25 engineers without them, the Series A you need drops from $15M to $5M. That's not just less dilution — it's a completely different relationship with your investors. You can be profitable earlier, default alive sooner, and keep strategic control longer.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  What Happens Next
&lt;/h2&gt;

&lt;p&gt;Three trends to watch:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Agent-to-agent collaboration.&lt;/strong&gt; Today, each agent session is independent. The next step — already emerging in tools like OpenClaw and Paperclip — is agents that coordinate with each other. One agent writes the feature, another writes the tests, a third reviews both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Non-engineer builders at scale.&lt;/strong&gt; Variance's customer success manager is an early signal. Within 12 months, expect product managers, designers, and ops teams at AI-native companies to routinely ship code through agent interfaces. The title "developer" will increasingly describe a skill set, not a job title.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The agency model disruption.&lt;/strong&gt; If a 5-person startup can match a 25-person team, what happens to software consultancies and agencies? They either adopt agents at the same rate (compressing team sizes and billing models) or they get undercut by solo operators and tiny teams who can.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Betting Markets Agree: Tech Layoffs Are Coming
&lt;/h2&gt;

&lt;p&gt;This isn't just anecdotal. &lt;a href="https://polymarket.com" rel="noopener noreferrer"&gt;Polymarket&lt;/a&gt; — the world's largest prediction market — has real money backing these trends:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Market&lt;/th&gt;
&lt;th&gt;Odds&lt;/th&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tech layoffs up in 2026?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;93% Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$10.4K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tech layoffs up in Q1 2026?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;86% Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$56&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;US unemployment hits 5.0%+ in 2026?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;60% Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$344K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI bubble burst by Dec 2026?&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;22% Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$3M&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Ftweet-polymarket-tech-layoffs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.computeleap.com%2Fblog%2Ftweet-polymarket-tech-layoffs.png" alt="Polymarket prediction market: Tech Layoffs Up or Down in 2026 — 93% odds on Up, sharp climb from 50% to 93% in one week" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://polymarket.com/event/tech-layoffs-up-or-down-in-2026" rel="noopener noreferrer"&gt;View market on Polymarket →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Bettors with real money on the line — $344K on the unemployment market alone — overwhelmingly expect tech layoffs to &lt;strong&gt;increase&lt;/strong&gt; this year. The 93% consensus on rising tech layoffs isn't speculation. It's the market pricing in exactly what Variance is demonstrating: five engineers with AI agents replace twenty-five without them.&lt;/p&gt;

&lt;p&gt;The uncomfortable math: if a 5-person startup matches a 25-person team, that's an 80% headcount reduction at equivalent output. Scale that across the industry — where &lt;a href="https://www.youtube.com/shorts/OtSSMApH7do" rel="noopener noreferrer"&gt;CS grad placement has already collapsed from 89% to 19%&lt;/a&gt; and &lt;a href="https://www.youtube.com/watch?v=PTZ5iN9nDkY" rel="noopener noreferrer"&gt;solo-founded companies now make up 36% of new startups&lt;/a&gt; (up from 23% five years ago) — and the prediction markets are pricing in the inevitable.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The signal isn't that AI replaces engineers.&lt;/strong&gt; It's that AI makes small teams so productive that large teams become a competitive &lt;em&gt;disadvantage&lt;/em&gt;. The overhead of coordination, communication, and management doesn't scale down — it just becomes unnecessary weight.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  Getting Started Today
&lt;/h2&gt;

&lt;p&gt;If you're a startup founder or engineering lead reading this, here's the 30-minute version:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Sign up for Claude Code Max&lt;/strong&gt; ($100/month) or &lt;strong&gt;Cursor Pro&lt;/strong&gt; ($20/month). Pick based on your team's terminal comfort level.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create a &lt;code&gt;CLAUDE.md&lt;/code&gt; file&lt;/strong&gt; in your repo root documenting your project's conventions, architecture, and testing requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Give the agent a real task&lt;/strong&gt; — not a toy demo. A bug fix. A feature. A test suite. Something that would normally take 2-4 hours of human time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Measure the actual time savings&lt;/strong&gt; including review time. Your first task might be slower (learning curve). Your fifth task will blow your mind.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add a second agent tool&lt;/strong&gt; within two weeks. The multi-agent setup is where the 3-5x multiplier lives.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The companies that figure this out first don't just move faster. They win markets while competitors are still hiring.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/zgxorh9LhiE"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The AI coding landscape moves fast. We track the latest tools, benchmarks, and real-world case studies weekly. Follow &lt;a href="https://www.computeleap.com" rel="noopener noreferrer"&gt;ComputeLeap&lt;/a&gt; for analysis that cuts through the hype.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Related: &lt;a href="https://dev.to/blog/claude-code-complete-guide-2026/"&gt;Complete Guide to Claude Code in 2026&lt;/a&gt; · &lt;a href="https://dev.to/blog/best-ai-coding-assistants-compared-2026/"&gt;Best AI Coding Assistants Compared&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/ai-coding-agents-startup-productivity-2026/" rel="noopener noreferrer"&gt;Full article with charts and interactive sources on ComputeLeap →&lt;/a&gt;&lt;/strong&gt; | Follow &lt;a href="https://x.com/ComputeLeapAI" rel="noopener noreferrer"&gt;@ComputeLeapAI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>startup</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Vibe Coding in 2026: How Founders Are Building Real Products Without Engineering Teams</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Tue, 31 Mar 2026 15:13:55 +0000</pubDate>
      <link>https://dev.to/max_quimby/vibe-coding-in-2026-how-founders-are-building-real-products-without-engineering-teams-foh</link>
      <guid>https://dev.to/max_quimby/vibe-coding-in-2026-how-founders-are-building-real-products-without-engineering-teams-foh</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq242qno8fzd96lc3v1oa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq242qno8fzd96lc3v1oa.png" alt="Vibe Coding in 2026 — a founder building software with AI coding tools" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/vibe-coding-founders-building-real-products-2026/" rel="noopener noreferrer"&gt;Read the full version with charts and embedded sources on ComputeLeap →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Chamath Palihapitiya built a replacement HR system for his company on a Sunday. Not a prototype — a working system that replaced the vendor. Jason Freeberg shipped annotated.com, a project he'd been thinking about for 15 years, in a single weekend. Neither of them wrote code in the traditional sense. They described what they wanted, and AI built it.&lt;/p&gt;

&lt;p&gt;This isn't a hypothetical future. It's happening right now, in March 2026, and the results are forcing everyone — founders, engineers, investors — to recalibrate what's possible.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"The unit of software production has changed from team-years to founder-days. Act accordingly."&lt;/strong&gt; — Garry Tan, Y Combinator CEO, March 29, 2026&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But here's the thing: vibe coding isn't magic. It's a skill with a workflow, a toolchain, and very real limits. This guide breaks down what's actually working, what breaks, and how to get started — without the hype.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Vibe Coding?
&lt;/h2&gt;

&lt;p&gt;The term comes from Andrej Karpathy, former Tesla AI director and OpenAI researcher, who coined it in early 2025:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"There's a new kind of coding I call 'vibe coding,' where you fully give in to the vibes, embrace exponentials, and forget that the code even exists."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Karpathy wasn't describing sloppy work. He was describing a fundamental shift in how software gets built: instead of writing code line by line, you describe what you want in natural language, and an AI agent writes, tests, and iterates on the implementation. You steer with intent. The AI handles syntax.&lt;/p&gt;

&lt;p&gt;In practice, vibe coding means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Describing&lt;/strong&gt; features in plain English (or any language)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reviewing&lt;/strong&gt; what the AI generates — not writing it from scratch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iterating&lt;/strong&gt; through conversation — "make the sidebar collapsible" or "add error handling for the API timeout case"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Testing&lt;/strong&gt; by using the app, not by reading every line of code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shipping&lt;/strong&gt; when it works, not when you understand every implementation detail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's closer to being a product manager who can deploy than a programmer who designs products. And in 2026, the tools have gotten good enough that this actually works for real applications.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/4Gmd5UTF4rk"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  The Evidence: Real Products Built Without Engineers
&lt;/h2&gt;

&lt;p&gt;Let's start with what's actually been shipped. Not demos. Not "Hello World" apps. Real products that people use.&lt;/p&gt;

&lt;h3&gt;
  
  
  The All-In Podcast Revelations
&lt;/h3&gt;

&lt;p&gt;The All-In crew — four billionaire tech investors who collectively touch hundreds of companies — have been vibe coding on air. Here's what they've reported:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Chamath Palihapitiya&lt;/strong&gt; replaced his company's HR system by vibe coding it himself on a Sunday. Not a weekend hackathon with a team. Just him, an AI coding agent, and a problem to solve. The old vendor system was expensive and mediocre. The replacement works and does exactly what his company needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Jason Freeberg&lt;/strong&gt; had been thinking about annotated.com for 15 years — a project he never had time to build because building software used to require... well, a lot of building. In a single weekend with AI coding tools, he went from concept to deployed product.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;David Sacks&lt;/strong&gt; described the overall shift: the barrier to testing a business idea is now close to zero. If you have a product idea, you can have a working prototype by Monday morning.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Reality check:&lt;/strong&gt; These are technically sophisticated people. Chamath has a CS background. Sacks has built multiple software companies. They're not "non-technical" — they're technical people who stopped writing code years ago and can now build again. The tools didn't give them knowledge they lacked. The tools removed the friction between their knowledge and a working product.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The YC Signal
&lt;/h3&gt;

&lt;p&gt;Y Combinator's current batch tells the story in numbers. Garry Tan, YC's CEO, has been the most vocal advocate for what's happening:&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-2038297938892607812-360" src="https://platform.twitter.com/embed/Tweet.html?id=2038297938892607812"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-2038297938892607812-360');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=2038297938892607812&amp;amp;theme=dark"
  }



&lt;/p&gt;

&lt;p&gt;His follow-up was equally pointed:&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-2038318965741019528-285" src="https://platform.twitter.com/embed/Tweet.html?id=2038318965741019528"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-2038318965741019528-285');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=2038318965741019528&amp;amp;theme=dark"
  }



&lt;/p&gt;

&lt;p&gt;"Plan well before, test well afterwards" — that's the entire workflow in one sentence. Vibe coding isn't about eliminating skill. It's about changing where the skill gets applied. The thinking moves upstream (what to build, how to architect it) and downstream (testing, iterating, deploying). The middle part — writing the actual code — is increasingly handled by AI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Peter Diamandis: The Macro View
&lt;/h3&gt;

&lt;p&gt;Peter Diamandis, the XPRIZE founder who's been tracking exponential technologies for decades, has been making an even bolder claim: AI can now run significant parts of your company, not just write code.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/AM7g3dL4-f0"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;His argument: the same AI agents that write code can also handle customer support, data analysis, content creation, and operational workflows. Vibe coding is just the most visible manifestation of a broader shift — AI as a general-purpose business tool, not just a developer productivity booster.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tools: Your Vibe Coding Stack
&lt;/h2&gt;

&lt;p&gt;Not all AI coding tools are created equal. Here's an honest breakdown of the major players in March 2026, what each is best at, and who should use what.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Code — The Power User's Choice
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Anthropic's command-line AI coding agent. Runs in your terminal, reads your files, writes code, runs tests, commits to git, executes shell commands.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Developers and technical founders who are comfortable in a terminal. Full-stack applications, complex refactors, CI/CD integration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Massive context window (1M tokens) — it can hold your entire codebase in working memory&lt;/li&gt;
&lt;li&gt;Autonomous agent behavior — give it a task, it figures out the steps&lt;/li&gt;
&lt;li&gt;Direct filesystem and git integration — no copy-paste workflow&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;.claude/&lt;/code&gt; folder system for persistent project memory&lt;/li&gt;
&lt;li&gt;Auto-fix integration with CI pipelines&lt;/li&gt;
&lt;/ul&gt;
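
&lt;p&gt;To make the project-memory idea concrete: the agent re-reads a memory file at the start of each session, so conventions you write down once get respected on every run. The file name and layout below are a hedged sketch, not official documentation — check Anthropic's docs for the current format:&lt;/p&gt;

```markdown
# Project memory (illustrative sketch)

## Stack
- Next.js 14 + TypeScript, Tailwind CSS, Supabase

## Conventions
- All database access goes through src/lib/db.ts
- Run `npm test` before every commit
- Never edit files under generated/
```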

&lt;p&gt;&lt;strong&gt;Limitations:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Terminal-only — no visual interface for non-technical users&lt;/li&gt;
&lt;li&gt;Requires Anthropic Max subscription ($100/month) for Opus-tier model access&lt;/li&gt;
&lt;li&gt;Steeper learning curve than GUI-based alternatives&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Best starting point:&lt;/strong&gt; Install it (&lt;code&gt;npm install -g @anthropic-ai/claude-code&lt;/code&gt;), navigate to a project folder, and type &lt;code&gt;claude&lt;/code&gt;. Then describe what you want in plain English.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Cursor — The IDE Experience
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; A fork of VS Code with deep AI integration. Code editor with AI that can read your codebase, suggest changes across files, and execute multi-step plans.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Developers who want AI assistance within a familiar IDE environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt; Familiar VS Code interface, multi-file editing with diff previews, multiple AI model backends, strong autocomplete alongside agentic mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitations:&lt;/strong&gt; Expensive at scale (Pro + API costs), less autonomous than Claude Code, desktop app dependency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Windsurf (Codeium) — The Balanced Option
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Another AI-enhanced IDE with its own model and "Cascade" agent flow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Developers who want a polished AI IDE experience with good context awareness.&lt;/p&gt;

&lt;h3&gt;
  
  
  Replit Agent — The No-Setup Option
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; A fully browser-based AI coding agent. Describe your app, Replit Agent builds it, deploys it, and gives you a URL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; True non-technical founders. People who want to go from idea to deployed app without installing anything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strengths:&lt;/strong&gt; Zero setup, built-in deployment, database/auth/hosting included, truly accessible to non-programmers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Limitations:&lt;/strong&gt; Less architectural control, vendor lock-in, struggles with complex multi-service architectures.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bolt (StackBlitz) — The Rapid Prototyper
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Browser-based AI app builder focused on speed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Rapid prototyping, landing pages, simple web apps. Testing ideas before committing to a full build.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which Tool Should You Pick?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;If you are...&lt;/th&gt;
&lt;th&gt;Start with...&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Technical founder, comfortable with terminal&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Most powerful, most autonomous&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Developer who wants AI in their IDE&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Familiar interface, strong multi-file support&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-technical founder&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Replit Agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Zero setup, built-in hosting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testing an idea quickly&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Bolt&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fastest from idea to visual prototype&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Want a balance of power and polish&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Windsurf&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Good middle ground&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/UPtmKh1vMN8"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  The Workflow: From Idea to Deployed App
&lt;/h2&gt;

&lt;p&gt;Here's the actual process that works. Not theory — this is what founders are using daily.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Define Before You Describe
&lt;/h3&gt;

&lt;p&gt;The biggest mistake is jumping straight into "build me an app." The AI is only as good as your spec.&lt;/p&gt;

&lt;p&gt;Before you open any tool, write down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What the app does&lt;/strong&gt; (one sentence)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Who uses it&lt;/strong&gt; (be specific)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core features&lt;/strong&gt; (3-5 max for v1)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;What success looks like&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Garry Tan's advice: &lt;strong&gt;plan well before.&lt;/strong&gt; Spend 30 minutes thinking. This saves hours of AI-generated code that solves the wrong problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Scaffold With Conversation
&lt;/h3&gt;

&lt;p&gt;Start your AI tool and describe your app in detail. Be specific:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bad:&lt;/strong&gt; "Build me a project management app"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Good:&lt;/strong&gt; "Build a project management app for freelance designers. It needs: a kanban board with drag-and-drop, a client portal where clients can view progress and leave comments, file upload for design deliverables, and email notifications when tasks change status. Use Next.js with TypeScript, Tailwind CSS, and Supabase for the backend."&lt;/p&gt;
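
&lt;p&gt;One trick for staying on the "good" side: write the spec down as data first, then render it into the prompt. The field names and template below are our own illustrative sketch, not any tool's required format:&lt;/p&gt;

```python
# Render a structured app spec into a detailed build prompt.
# The spec fields and template wording are illustrative, not a required format.
spec = {
    "purpose": "a project management app",
    "audience": "freelance designers",
    "features": [
        "a kanban board with drag-and-drop",
        "a client portal where clients can view progress and leave comments",
        "file upload for design deliverables",
        "email notifications when tasks change status",
    ],
    "stack": "Next.js with TypeScript, Tailwind CSS, and Supabase for the backend",
}

def render_prompt(spec: dict) -> str:
    # Join the feature list so nothing gets forgotten mid-conversation.
    features = ", ".join(spec["features"])
    return (
        f"Build {spec['purpose']} for {spec['audience']}. "
        f"It needs: {features}. Use {spec['stack']}."
    )

print(render_prompt(spec))
```

&lt;p&gt;The point isn't the code — it's that a spec you can diff and reuse beats a prompt you retype from memory.&lt;/p&gt;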

&lt;h3&gt;
  
  
  Step 3: Iterate, Don't Rewrite
&lt;/h3&gt;

&lt;p&gt;Treat the AI like a junior developer, not a code generator. After the initial scaffold:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use the app. Click every button. Fill out every form.&lt;/li&gt;
&lt;li&gt;Note what's wrong — but don't fix it yourself.&lt;/li&gt;
&lt;li&gt;Describe the problem to the AI.&lt;/li&gt;
&lt;li&gt;Watch it fix the issue, test again, repeat.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 4: Test Like a User
&lt;/h3&gt;

&lt;p&gt;You probably can't read every line of generated code. Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Test every user flow end-to-end&lt;/li&gt;
&lt;li&gt;Try to break it — enter weird data, click fast, open multiple tabs&lt;/li&gt;
&lt;li&gt;Have someone else use it without guidance&lt;/li&gt;
&lt;li&gt;If it handles money or sensitive data, get a real engineer to review security&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step 5: Deploy and Iterate in Production
&lt;/h3&gt;

&lt;p&gt;Ship early. Ship ugly. Ship with known issues. Get real users touching it, then iterate based on actual feedback.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Vibe Coding Breaks Down
&lt;/h2&gt;

&lt;p&gt;This is the section most guides skip. Here's where it fails.&lt;/p&gt;

&lt;h3&gt;
  
  
  Complex State Management
&lt;/h3&gt;

&lt;p&gt;AI tools struggle with intricate, interrelated state. Think: collaborative document editors with multi-user editing, undo/redo, conflict resolution. The AI can scaffold this, but subtle sync bugs will eat you alive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule:&lt;/strong&gt; If your core value depends on getting state management exactly right, you need an engineer.&lt;/p&gt;
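
&lt;p&gt;Here's the failure in miniature. Two clients read the same document, each makes an edit, and naive last-write-wins storage silently drops one of them (a deliberately simplified sketch, not a real sync engine):&lt;/p&gt;

```python
# Naive last-write-wins storage: no versions, no merging, no conflict detection.
store = {"doc": "hello"}

# Both clients read the same starting state...
alice_copy = store["doc"]
bob_copy = store["doc"]

# ...each makes an independent edit...
alice_edit = alice_copy + " from alice"
bob_edit = bob_copy + " from bob"

# ...and each writes back. Bob's write clobbers Alice's entirely.
store["doc"] = alice_edit
store["doc"] = bob_edit

print(store["doc"])  # prints "hello from bob" — Alice's edit is gone
```

&lt;p&gt;Real collaborative editors solve this with version vectors, OT, or CRDTs — exactly the kind of subtle machinery AI scaffolds tend to skip.&lt;/p&gt;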

&lt;h3&gt;
  
  
  Security-Critical Systems
&lt;/h3&gt;

&lt;p&gt;Vibe coding an app that handles payments, medical records, or auth is dangerous. AI models generate code that &lt;em&gt;works&lt;/em&gt; but may have SQL injection vectors, insecure token storage, or missing input validation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule:&lt;/strong&gt; If a security breach would be catastrophic, get a human security review. No exceptions.&lt;/p&gt;
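
&lt;p&gt;The classic failure mode, in a self-contained sqlite3 sketch (table and data invented for illustration): the first query splices user input straight into the SQL string, the second passes it as a parameter. Same input, very different result:&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret'), ('bob', 'hunter2')")

malicious = "nobody' OR '1'='1"

# Vulnerable: user input is spliced straight into the SQL string,
# so the injected OR clause matches every row.
unsafe = conn.execute(
    f"SELECT name FROM users WHERE name = '{malicious}'"
).fetchall()

# Safe: a parameterized query treats the same input as a plain literal.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe), len(safe))  # 2 0
```

&lt;p&gt;Both versions "work" in a demo. Only one survives contact with a hostile user — which is why generated code that touches auth or payments needs human eyes.&lt;/p&gt;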

&lt;h3&gt;
  
  
  Performance at Scale
&lt;/h3&gt;

&lt;p&gt;AI-generated code tends to be correct but naive — O(n²) algorithms, N+1 queries, loading datasets into memory. For 100 users, fine. For 100,000, your app falls over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule:&lt;/strong&gt; If you expect real scale, plan for an engineering hire to optimize critical paths.&lt;/p&gt;
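
&lt;p&gt;"Correct but naive" looks like this. Both functions below find duplicate IDs and return identical answers, but the nested-loop version does on the order of n²/2 comparisons while the set-based version makes one pass — invisible at 100 records, fatal at 100,000 (a hand-rolled illustration, not taken from any real codebase):&lt;/p&gt;

```python
def find_duplicates_naive(ids):
    # O(n^2): compares every pair — the shape AI scaffolds often produce.
    dupes = set()
    for i in range(len(ids)):
        for j in range(i + 1, len(ids)):
            if ids[i] == ids[j]:
                dupes.add(ids[i])
    return dupes

def find_duplicates_fast(ids):
    # O(n): one pass, remembering values already seen.
    seen, dupes = set(), set()
    for x in ids:
        if x in seen:
            dupes.add(x)
        seen.add(x)
    return dupes

ids = [1, 2, 3, 2, 4, 1]
print(find_duplicates_naive(ids) == find_duplicates_fast(ids))  # True
```

&lt;p&gt;Because both pass the same tests, nothing flags the slow one until production traffic does.&lt;/p&gt;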

&lt;h3&gt;
  
  
  Large, Evolving Codebases
&lt;/h3&gt;

&lt;p&gt;AI tools work best on greenfield projects. Once your codebase grows beyond ~50,000 lines, the AI starts losing coherence.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📊 &lt;strong&gt;The honest math:&lt;/strong&gt; Vibe coding gets you from 0 to 80% remarkably fast. The last 20% — edge cases, performance, security, scale — still requires traditional engineering. But that 80% used to require a team and months of work. Now it takes one person and a weekend.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Who Should (and Shouldn't) Vibe Code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Vibe Coding Is For You If:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You're a founder validating an idea.&lt;/strong&gt; Get a prototype in front of users before spending $50K on a dev team.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You're a domain expert who can't code.&lt;/strong&gt; Your domain knowledge is the hard part — the coding is now the easy part.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You're a developer who wants to move faster.&lt;/strong&gt; AI for scaffolding, you for critical logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You're building internal tools.&lt;/strong&gt; Lower reliability bar, higher tolerance for iteration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You're prototyping.&lt;/strong&gt; For any prototype, vibe coding is the fastest path.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Vibe Coding Is NOT For You If:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Failure has serious consequences.&lt;/strong&gt; Medical devices, financial trading, infrastructure software.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You need to maintain a complex system over years.&lt;/strong&gt; AI-generated codebases can become unmaintainable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You're competing on technical depth.&lt;/strong&gt; If your moat is engineering quality, you need engineers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You refuse to learn anything.&lt;/strong&gt; You need enough technical literacy to evaluate AI output.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Getting Started Today
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Day 1: Pick One Tool, Build One Thing
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Terminal comfort?&lt;/strong&gt; Install Claude Code, build a task manager.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero setup?&lt;/strong&gt; Go to &lt;a href="https://bolt.new" rel="noopener noreferrer"&gt;bolt.new&lt;/a&gt;, describe a landing page, deploy it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maximum guidance?&lt;/strong&gt; Open &lt;a href="https://replit.com" rel="noopener noreferrer"&gt;Replit&lt;/a&gt;, start an Agent session, describe a simple app.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Day 2: Build Something You Actually Need
&lt;/h3&gt;

&lt;p&gt;A tool your team needs. A prototype of your business idea. An automation for something you do manually.&lt;/p&gt;

&lt;h3&gt;
  
  
  Week 1: Ship to Users
&lt;/h3&gt;

&lt;p&gt;Deploy. Give it to real people. Collect feedback. Iterate.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Cost
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code:&lt;/strong&gt; $20-100/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cursor:&lt;/strong&gt; Free tier, Pro $20/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replit:&lt;/strong&gt; Free for basics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bolt:&lt;/strong&gt; Free to start&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windsurf:&lt;/strong&gt; Free tier available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Compare: one software engineer = $150K-250K/year + equity + benefits + onboarding.&lt;/p&gt;
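
&lt;p&gt;Back-of-the-envelope math on that comparison, using the high-end tool tiers from the list above and the midpoint of the salary range (illustrative figures — your actual bills will differ):&lt;/p&gt;

```python
# Annualize the tool costs listed above vs. one engineer's base salary.
tool_monthly = {"Claude Code": 100, "Cursor": 20}  # high-end tiers
tools_per_year = sum(tool_monthly.values()) * 12

engineer_per_year = 200_000  # midpoint of the $150K-250K range, before equity

print(tools_per_year)                       # 1440
print(engineer_per_year // tools_per_year)  # 138 — roughly 138x the tooling cost
```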




&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Vibe coding in 2026 is real, practical, and changing how software gets built. The evidence is Chamath replacing enterprise software on a Sunday, Garry Tan reframing the unit economics, and thousands of founders shipping products that would have required engineering teams a year ago.&lt;/p&gt;

&lt;p&gt;But it's not a replacement for software engineering. It's a new layer — a way for more people to build more things, faster.&lt;/p&gt;

&lt;p&gt;The barrier to building software just dropped to near zero. What matters now is what you decide to build.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🚀 &lt;strong&gt;Start now:&lt;/strong&gt; Pick one tool, build one thing this weekend, deploy it. You can't evaluate vibe coding by reading about it — you have to feel the workflow.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/vibe-coding-founders-building-real-products-2026/" rel="noopener noreferrer"&gt;Full article on ComputeLeap →&lt;/a&gt;&lt;/strong&gt; | Follow &lt;a href="https://x.com/ComputeLeapAI" rel="noopener noreferrer"&gt;@ComputeLeapAI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>startup</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Anthropic vs OpenAI 2026: Who’s Actually Winning?</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Sun, 29 Mar 2026 23:06:07 +0000</pubDate>
      <link>https://dev.to/max_quimby/anthropic-vs-openai-2026-whos-actually-winning-2oop</link>
      <guid>https://dev.to/max_quimby/anthropic-vs-openai-2026-whos-actually-winning-2oop</guid>
      <description>&lt;p&gt;In 2021, Dario Amodei walked out of OpenAI with his sister Daniela and roughly 30 researchers. They didn't get fired. They weren't poached. They &lt;em&gt;revolted&lt;/em&gt; — because they believed the company they'd helped build was heading somewhere dangerous.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/anthropic-vs-openai-rivalry-2026/" rel="noopener noreferrer"&gt;Read the full version with charts, Polymarket data, and embedded sources on ComputeLeap →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Three years later, they were an interesting footnote: the safety nerds who left the rocket ship.&lt;/p&gt;

&lt;p&gt;Five years later — right now, in March 2026 — &lt;a href="https://www.wsj.com/tech/ai/the-decadelong-feud-shaping-the-future-of-ai-7075acde" rel="noopener noreferrer"&gt;they're on the Wall Street Journal's front page&lt;/a&gt;, Dario is comparing his former colleagues to Hitler and Stalin, ChatGPT's market share has collapsed from 69% to 45%, and Anthropic is on track to overtake OpenAI in revenue by mid-year.&lt;/p&gt;

&lt;p&gt;The revolt won.&lt;/p&gt;

&lt;p&gt;This isn't a neutral comparison piece. The data has a clear direction, and pretending otherwise would be dishonest. But this also isn't hagiography — Anthropic faces real risks that could reverse everything. We'll get to those.&lt;/p&gt;

&lt;p&gt;First, let's talk about the week that made the outcome undeniable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Week Everything Flipped
&lt;/h2&gt;

&lt;p&gt;The week of March 22–29, 2026 didn't start the reversal. But it made it impossible to ignore.&lt;/p&gt;

&lt;p&gt;In seven days:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic's next-generation model leaked.&lt;/strong&gt; Fortune &lt;a href="https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/" rel="noopener noreferrer"&gt;reported&lt;/a&gt; that roughly 3,000 internal documents were found on a publicly accessible server, revealing a model called "Mythos" — described internally as "by far the most powerful AI model we've ever developed." A new capability tier &lt;em&gt;above&lt;/em&gt; Opus. Reddit exploded: "Anthropic May Have Had An Architectural Breakthrough!" hit 866 upvotes and 302 comments on r/singularity.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Anthropic May Have Had An Architectural Breakthrough!"&lt;/strong&gt; — 866 upvotes, 302 comments on r/singularity&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://reddit.com/r/singularity/comments/1s6hj0n/" rel="noopener noreferrer"&gt;View thread on Reddit →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OpenAI killed Sora.&lt;/strong&gt; The AI video platform that once broke the internet was burning $15 million per day in inference costs — $130 per 10-second clip, $5.4 billion annually. Disney's $1 billion content deal? Dead. CEO Fiji Simo pulled the plug as pre-IPO cleanup.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Claude Code crossed $2.5 billion ARR.&lt;/strong&gt; Nine months from launch to $2.5 billion, &lt;a href="https://www.forbes.com/sites/the-prompt/2026/02/17/anthropic-is-cashing-in-on-claude-codes-success/" rel="noopener noreferrer"&gt;per Forbes&lt;/a&gt; — the fastest B2B product ramp in AI history, now driving over half of Anthropic's enterprise revenue.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Karen Hao's exposé dropped a bomb.&lt;/strong&gt; The investigative journalist's &lt;a href="https://www.youtube.com/watch?v=Cn8HBj8QAbk" rel="noopener noreferrer"&gt;Diary of a CEO interview&lt;/a&gt; drew on 300+ interviews — including 90+ current and former OpenAI employees — for her book &lt;em&gt;Empire of AI&lt;/em&gt;. The NDA wall cracked wide open during IPO prep. Timing? Chef's kiss.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The All-In Podcast called it "generational."&lt;/strong&gt; Chamath Palihapitiya and the Besties — who'd been &lt;a href="https://youtube.com/watch?v=4Gmd5UTF4rk" rel="noopener noreferrer"&gt;skeptical of Anthropic&lt;/a&gt; for two years — used the word that venture capitalists only break out when they're genuinely spooked: this run is &lt;em&gt;generational&lt;/em&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dario Amodei went scorched earth in the WSJ.&lt;/strong&gt; But more on that later.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This isn't a bad quarter for OpenAI. This is a regime change.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The scoreboard, March 29, 2026:&lt;/strong&gt; ChatGPT app market share down from 69% to 45%. Anthropic ARR at ~$19B (up from $9B at end of 2025). Claude Code at $2.5B ARR from zero in 9 months. OpenAI projecting a $14B net loss. Sora dead. Claude hit #1 on the U.S. App Store. Polymarket gives Anthropic a 99% chance of having the best AI model at end of March.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Founding Schism — 2021
&lt;/h2&gt;

&lt;p&gt;To understand why March 2026 matters, you need to understand why Anthropic exists at all.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why They Left
&lt;/h3&gt;

&lt;p&gt;Dario Amodei was VP of Research at OpenAI. Daniela Amodei was VP of Safety &amp;amp; Policy. They weren't outsiders critiquing from the sidelines — they were the people closest to the work and closest to the risk.&lt;/p&gt;

&lt;p&gt;The disagreements weren't abstract. According to the &lt;a href="https://www.wsj.com/tech/ai/the-decadelong-feud-shaping-the-future-of-ai-7075acde" rel="noopener noreferrer"&gt;WSJ's reporting&lt;/a&gt;, the breaking point came when OpenAI President Greg Brockman floated the idea of selling artificial general intelligence to governments — specifically the nuclear powers on the UN Security Council. Russia. China. The United States.&lt;/p&gt;

&lt;p&gt;Dario considered this "tantamount to treason" and nearly quit on the spot. He demanded direct board reporting and said he couldn't work with Brockman. The relationship was unsalvageable.&lt;/p&gt;

&lt;p&gt;What followed wasn't a quiet departure. Roughly 30 researchers left together — not leaked out over months, but in a coordinated exodus. This was a &lt;em&gt;revolt&lt;/em&gt;, rooted in a specific philosophical conviction: that the path OpenAI was on would end badly, and that there was a better way to build transformative AI.&lt;/p&gt;

&lt;h3&gt;
  
  
  What They Built Instead
&lt;/h3&gt;

&lt;p&gt;Anthropic's founding thesis was deceptively simple: you could build the most capable AI systems &lt;em&gt;and&lt;/em&gt; the most responsible ones. These weren't opposing goals — in fact, the safety research (Constitutional AI, the Responsible Scaling Policy) would produce &lt;em&gt;better models&lt;/em&gt;, not handicapped ones.&lt;/p&gt;

&lt;p&gt;The early years looked like a bet against the market. While OpenAI was shipping consumer features, signing Microsoft deals, and racing to ChatGPT, Anthropic was publishing papers on AI alignment and carefully releasing Claude with guardrails that competitors mocked as overcautious.&lt;/p&gt;

&lt;p&gt;The mockery aged poorly.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Pattern Nobody Noticed
&lt;/h3&gt;

&lt;p&gt;Here's what most analysis of Anthropic gets wrong: the company's contrarian bets weren't principled &lt;em&gt;instead of&lt;/em&gt; strategic. They were principled &lt;em&gt;and&lt;/em&gt; strategic. Refusing to ship features that compromised safety wasn't leaving money on the table — it was building the kind of trust that enterprise customers and developers pay a premium for.&lt;/p&gt;

&lt;p&gt;But that thesis needed time to prove itself. And in 2024, with OpenAI holding 69% market share and a $157 billion valuation, the clock looked like it was running out.&lt;/p&gt;

&lt;p&gt;Then OpenAI started making mistakes.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI's Unforced Errors
&lt;/h2&gt;

&lt;p&gt;The most important thing to understand about OpenAI's decline is that nobody did this to them. Every wound was self-inflicted.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Pentagon Bet That Backfired
&lt;/h3&gt;

&lt;p&gt;In late February 2026, Anthropic walked away from a Pentagon contract for AI systems that could be used in autonomous weaponry and mass surveillance. Their position: the technology wasn't ready for fully autonomous military deployment, and they weren't willing to pretend otherwise.&lt;/p&gt;

&lt;p&gt;The Trump administration's response was immediate and unprecedented: a federal blacklisting of Anthropic from all government agencies. First time a sitting president had targeted an AI lab for &lt;em&gt;refusing&lt;/em&gt; a military contract.&lt;/p&gt;

&lt;p&gt;OpenAI took the deal Anthropic refused. Within 48 hours.&lt;/p&gt;

&lt;p&gt;The consumer backlash was the most expensive PR disaster in AI history.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techcrunch.com/2026/03/02/chatgpt-uninstalls-surged-by-295-after-dod-deal/" rel="noopener noreferrer"&gt;TechCrunch, citing Sensor Tower data&lt;/a&gt;, reported the damage: ChatGPT uninstalls surged &lt;strong&gt;295%&lt;/strong&gt; day-over-day. One-star reviews spiked &lt;strong&gt;775%&lt;/strong&gt; in a single day. ChatGPT downloads dropped 13%. Meanwhile, Claude downloads jumped 51%, and the Claude app hit &lt;strong&gt;#1 on the U.S. App Store&lt;/strong&gt; — leaping 20+ ranks in under a week.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"US judge says Pentagon's blacklisting of Anthropic looks like punishment for its views on AI safety"&lt;/strong&gt; — 2,346 upvotes on r/technology&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://reddit.com/r/technology/comments/1s36sys/" rel="noopener noreferrer"&gt;View thread on Reddit →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On Reddit, "Cancel your ChatGPT Plus, burn their compute on the way out, and switch to Claude" hit 29,903 upvotes on r/ChatGPT — &lt;em&gt;OpenAI's own subreddit&lt;/em&gt;. The top comment: "Anthropic was founded by people who left OpenAI specifically because they saw the company abandoning its mission. Turns out they were right about every single concern they raised."&lt;/p&gt;

&lt;p&gt;An estimated 1.5 million subscription cancellations followed. A federal judge later said the Pentagon's blacklisting of Anthropic "looks like punishment for its views on AI safety."&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pentagon domino effect:&lt;/strong&gt; Anthropic refuses contract → Trump blacklists Anthropic → OpenAI takes the deal → 295% uninstall spike → 775% 1-star review surge → ~1.5M cancellations → Claude hits #1 App Store. Every step of the sequence was a foreseeable consequence. OpenAI walked into it anyway.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Sora Money Pit
&lt;/h3&gt;

&lt;p&gt;Sora was supposed to be OpenAI's moonshot — the platform that proved AI could create professional-quality video. Instead, it became the most expensive demo reel ever built.&lt;/p&gt;

&lt;p&gt;The numbers are staggering. At peak usage, Sora was generating 11 million clips per day at a compute cost of roughly &lt;a href="https://www.tomsguide.com/ai/openai-just-killed-sora-as-company-readies-ipo-and-new-spud-model" rel="noopener noreferrer"&gt;$130 per 10-second clip&lt;/a&gt;. That's $15 million per day in inference alone — $5.4 billion annualized. OpenAI's adjusted gross margin fell from 40% to 33% before the kill decision was made.&lt;/p&gt;

&lt;p&gt;Disney had signed a $1 billion deal granting Sora access to over 200 characters, including Mickey Mouse and Darth Vader. That deal is now dead. Fiji Simo killed Sora as pre-IPO cleanup, and honestly? It was the most rational decision OpenAI has made in a year.&lt;/p&gt;

&lt;p&gt;But rationality doesn't reverse the narrative damage. Killing your flagship creative product weeks before an IPO tells the market something uncomfortable: we built something we couldn't afford to run.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/wkpUQG7hPNo"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  The Culture Cracking Open
&lt;/h3&gt;

&lt;p&gt;Then there's the human cost. Karen Hao's &lt;em&gt;Empire of AI&lt;/em&gt;, drawn from over 300 interviews including 90+ current and former OpenAI employees, paints a picture of an organization in internal turmoil — and it dropped at the worst possible time for OpenAI's IPO narrative.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/Cn8HBj8QAbk"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;The key pattern Hao documents: every major builder at OpenAI eventually left feeling used, and each one started a direct competitor. Dario Amodei founded Anthropic. Ilya Sutskever founded Safe Superintelligence Inc. Mira Murati founded Thinking Machines Lab. No other tech company has seen its &lt;em&gt;entire original builder team&lt;/em&gt; walk out and compete head-on.&lt;/p&gt;

&lt;p&gt;Hao's reporting also alleges that Altman tailored the AGI narrative depending on his audience — "cure cancer" for Congress, "best assistant ever" for consumers, "$100 billion revenue machine" for Microsoft. Whether that's savvy marketing or something more corrosive is a question each reader can answer for themselves.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Karen Hao Whistleblower Exposed How Sam Altman Allegedly Manipulated Elon Musk"&lt;/strong&gt; — 252 upvotes on r/ArtificialInteligence&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://reddit.com/r/ArtificialInteligence/comments/1s4gdpo/" rel="noopener noreferrer"&gt;View thread on Reddit →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What's not debatable is the timing: when 90+ employees are willing to talk to a journalist during IPO prep, the NDA wall isn't just cracking. It's crumbling.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anthropic's Jiu-Jitsu
&lt;/h2&gt;

&lt;p&gt;Anthropic didn't win by outspending OpenAI or out-hiring them. They won by turning every "no" into a competitive advantage. It's strategic jiu-jitsu — using the opponent's momentum against them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Saying No as Strategy
&lt;/h3&gt;

&lt;p&gt;Consider the pattern:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No to the Pentagon&lt;/strong&gt; → earned public trust → Claude downloads surge 51% → #1 App Store → subscriber growth that would have cost billions to acquire through marketing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No to erotic chatbots&lt;/strong&gt; → maintained the safety brand → became the default choice for enterprise customers who need to explain their AI vendor to a compliance department.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No to shopping features and side quests&lt;/strong&gt; → maintained developer focus → Claude Code dominance → $2.5B ARR from the highest-value customer segment in tech.&lt;/p&gt;

&lt;p&gt;Each "no" looked like leaving money on the table at the time. Collectively, they built a moat that money can't replicate: &lt;em&gt;trust&lt;/em&gt;. In AI, where every customer knows they're handing over sensitive data and critical workflows to a model they can't fully audit, trust isn't a nice-to-have. It's the product.&lt;/p&gt;

&lt;p&gt;If you've been following our &lt;a href="https://dev.to/blog/claude-vs-chatgpt-vs-gemini-2026/"&gt;comparison of Claude, ChatGPT, and Gemini&lt;/a&gt;, this pattern has been building for months. The Pentagon moment just made it visible to everyone else.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Claude Code Phenomenon
&lt;/h3&gt;

&lt;p&gt;Claude Code is the most important product launch in AI since ChatGPT itself — and almost nobody outside the developer community noticed until the revenue numbers forced them to.&lt;/p&gt;

&lt;p&gt;Launched in mid-2025, &lt;a href="https://www.forbes.com/sites/the-prompt/2026/02/17/anthropic-is-cashing-in-on-claude-codes-success/" rel="noopener noreferrer"&gt;Claude Code reached $2.5 billion in annual run-rate revenue&lt;/a&gt; by early 2026, doubling since January. It serves over 300,000 business customers and now drives more than half of Anthropic's enterprise revenue. We've covered the implications of agentic coding in our &lt;a href="https://dev.to/blog/claude-code-remote-tasks-cloud-ai-agents-2026/"&gt;deep dive on Claude Code's remote task capabilities&lt;/a&gt; — what's happening here is bigger than a product launch. It's a platform shift.&lt;/p&gt;

&lt;p&gt;The product itself keeps accelerating: Auto Dream (memory consolidation modeled on human REM sleep), auto-fix in the cloud for CI pipelines, a hooks system for custom workflows, and iMessage integration that signals where agents are heading next. The creator ecosystem is exploding: 5+ tutorial videos per day, a velocity that exceeds even the early GPT-wrapper era.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Claude Code ARR trajectory:&lt;/strong&gt; $0 at launch (mid-2025) → ~$1.2B (January 2026) → $2.5B (March 2026). Key milestones along the way: Auto Dream memory consolidation, cloud auto-fix for CI pipelines, hooks system for custom workflows, and iMessage integration. The fastest B2B product ramp in AI history — and it's still accelerating.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Matthew Berman's widely watched model tier list placed Claude as S-tier and ChatGPT as A-tier. For developers, the hierarchy has quietly settled: Claude Code is the tool you reach for first. If you're building AI-powered development workflows, our &lt;a href="https://dev.to/blog/best-ai-coding-assistants-compared-2026/"&gt;comparison of the best AI coding assistants&lt;/a&gt; breaks down exactly why.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mythos — The Next Punch
&lt;/h3&gt;

&lt;p&gt;Then there's the leak that might not have been a leak.&lt;/p&gt;

&lt;p&gt;In late March, Fortune &lt;a href="https://fortune.com/2026/03/26/anthropic-says-testing-mythos-powerful-new-ai-model-after-data-leak-reveals-its-existence-step-change-in-capabilities/" rel="noopener noreferrer"&gt;reported&lt;/a&gt; that roughly 3,000 unsecured digital assets were discovered on a publicly accessible Anthropic server. Among them: documentation for a model called Mythos, described as "by far the most powerful AI model we've ever developed," with dramatically higher scores on software coding, academic reasoning, and cybersecurity benchmarks compared to the current flagship Opus 4.6.&lt;/p&gt;

&lt;p&gt;The most intriguing detail: Mythos represents a new capability tier called "Capybara" — &lt;em&gt;above&lt;/em&gt; Opus. Anthropic has never created a tier above Opus before. The planned release strategy? Cybersecurity organizations first, not the general public. The stated reason: Mythos is "currently far ahead of any other AI model in cyber capabilities" and could enable attacks that "far outpace the efforts of defenders."&lt;/p&gt;

&lt;p&gt;Anthropic called the exposure "human error." Maybe it was. But the timing — right when the competitive narrative favors Anthropic, right when OpenAI is hemorrhaging trust — gave Anthropic the best of both worlds: free publicity for their most impressive model &lt;em&gt;and&lt;/em&gt; the responsible AI narrative of only releasing it to defenders first.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Was the Mythos leak intentional?&lt;/strong&gt; Probably not — 3,000 unsecured files is an embarrassingly large surface area for a controlled leak. But the rapid cleanup, immediate confirmation, and "cybersecurity-first release" narrative suggest Anthropic pivoted fast from "security incident" to "strategic positioning." Never let a crisis go to waste.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Dario Interview That Changed the Tone
&lt;/h2&gt;

&lt;p&gt;Around March 10, the Wall Street Journal published &lt;a href="https://www.wsj.com/tech/ai/the-decadelong-feud-shaping-the-future-of-ai-7075acde" rel="noopener noreferrer"&gt;"The Decade-Long Feud Shaping the Future of AI"&lt;/a&gt; — and Dario Amodei stopped being diplomatic.&lt;/p&gt;

&lt;p&gt;The quotes are extraordinary for a sitting CEO of a company valued at $380 billion:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;He compared the legal battle between Sam Altman and Elon Musk to "the fight between Hitler and Stalin."&lt;/p&gt;

&lt;p&gt;He dubbed Greg Brockman's $25 million donation to a pro-Trump super PAC "evil."&lt;/p&gt;

&lt;p&gt;He likened OpenAI and other rivals to "tobacco companies knowingly hawking a harmful product."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"Altman is as evil as Stalin — Dario Amodei"&lt;/strong&gt; — 664 upvotes, 198 comments on r/OpenAI&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://reddit.com/r/OpenAI/comments/1s5zujd/" rel="noopener noreferrer"&gt;View thread on Reddit →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a CEO going on the record in the &lt;em&gt;Wall Street Journal&lt;/em&gt; with language that would get most PR teams fired. The Hitler-Stalin comparison alone would normally be career-ending in corporate America.&lt;/p&gt;

&lt;p&gt;But here's what's interesting: the market didn't punish it. If anything, it accelerated the narrative shift. Why?&lt;/p&gt;

&lt;p&gt;Because Dario wasn't being reckless — he was being &lt;em&gt;specific&lt;/em&gt;. The Hitler-Stalin comparison wasn't about character; it was about the dynamic between Musk and Altman's legal battle, two powerful figures fighting each other while the real stakes (AI governance) went unaddressed. The "tobacco companies" framing wasn't hyperbole; it was a reference to knowingly shipping products with downplayed risks.&lt;/p&gt;

&lt;p&gt;And the "treason" accusation — that Brockman wanted to sell AGI to the UN Security Council nations including Russia and China — wasn't name-calling. It was a description of a specific proposal that, if accurate, represents the most reckless idea in tech history.&lt;/p&gt;

&lt;p&gt;The r/singularity thread hit 591 upvotes. The r/OpenAI thread hit 664. These aren't massive numbers — but they're happening &lt;em&gt;on OpenAI's home turf&lt;/em&gt;. The narrative has shifted from "Anthropic is the underdog" to "Anthropic is the frontrunner who's now willing to play offense."&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/ugWqorspshI"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers Don't Lie
&lt;/h2&gt;

&lt;p&gt;Strip away the narrative. Strip away the Reddit threads and the WSJ quotes and the podcast takes. What do the raw numbers say?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Market share shift (Jan 2025 → Mar 2026):&lt;/strong&gt; ChatGPT app share declined from 69% to 45%. Claude's share rose from ~5% to ~15-20%. The crossover trajectory is clear — and accelerating after the Pentagon backlash. Source: All-In E220, TechCrunch/Sensor Tower data.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;OpenAI (March 2026)&lt;/th&gt;
&lt;th&gt;Anthropic (March 2026)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;App market share&lt;/td&gt;
&lt;td&gt;45% (↓ from 69%)&lt;/td&gt;
&lt;td&gt;~15-20% (↑ rapidly)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Annualized revenue&lt;/td&gt;
&lt;td&gt;~&lt;a href="https://money.usnews.com/investing/news/articles/2026-03-04/openai-tops-25-billion-in-annualized-revenue-last-month-the-information-reports" rel="noopener noreferrer"&gt;$25B&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;~&lt;a href="https://x.com/thealexbanks/status/2034273131796336703" rel="noopener noreferrer"&gt;$19B&lt;/a&gt; (10× growth/year)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Projected 2026 net income&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.theinformation.com/articles/openai-projections-imply-losses-tripling-to-14-billion-in-2026" rel="noopener noreferrer"&gt;-$14B loss&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Not disclosed (leaner cost structure)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flagship product killed&lt;/td&gt;
&lt;td&gt;Sora ($5.4B/yr burn)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Military contracts&lt;/td&gt;
&lt;td&gt;Took Pentagon deal&lt;/td&gt;
&lt;td&gt;Refused → blacklisted → won ruling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Developer sentiment&lt;/td&gt;
&lt;td&gt;A-tier (Berman rankings)&lt;/td&gt;
&lt;td&gt;S-tier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Employee morale&lt;/td&gt;
&lt;td&gt;90+ talked to Karen Hao&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Latest model&lt;/td&gt;
&lt;td&gt;GPT-5.4&lt;/td&gt;
&lt;td&gt;Opus 4.6 + Mythos (leaked)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Valuation&lt;/td&gt;
&lt;td&gt;~$340B (pre-IPO)&lt;/td&gt;
&lt;td&gt;~&lt;a href="https://www.cnbc.com/video/2026/03/27/anthropic-eyes-october-ipo---reports.html" rel="noopener noreferrer"&gt;$380B&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://epochai.substack.com/p/anthropic-could-surpass-openai-in" rel="noopener noreferrer"&gt;Epoch AI's analysis&lt;/a&gt; projects the revenue crossover: since each company hit $1 billion in annualized revenue, Anthropic has grown at 10× per year versus OpenAI's 3.4×. If recent trends continue, Anthropic overtakes OpenAI in total revenue by mid-2026.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Revenue crossover projection:&lt;/strong&gt; Anthropic growing at 10× per year vs OpenAI's 3.4× since each hit $1B ARR. Current: OpenAI ~$25B, Anthropic ~$19B. At these rates, Anthropic overtakes OpenAI in total revenue by mid-2026. Source: Epoch AI growth rate analysis.&lt;/p&gt;
&lt;/blockquote&gt;
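&lt;p&gt;Epoch's mid-2026 projection follows from simple exponential-growth algebra. Here's a quick sanity check using the headline figures above (a sketch assuming smooth compounding, which is a generous assumption for any startup's revenue curve):&lt;/p&gt;

```python
import math

# Headline figures from the article (annualized revenue, USD billions)
openai_rev, anthropic_rev = 25.0, 19.0
# Annual growth multiples per Epoch AI's analysis
openai_growth, anthropic_growth = 3.4, 10.0

# Solve 19 * 10**t == 25 * 3.4**t for t (years from now):
# (10 / 3.4)**t = 25 / 19  =>  t = ln(25/19) / ln(10/3.4)
t = math.log(openai_rev / anthropic_rev) / math.log(anthropic_growth / openai_growth)

print(f"Crossover in ~{t:.2f} years (~{t * 12:.0f} months)")
# -> Crossover in ~0.25 years (~3 months)
```

&lt;p&gt;Roughly three months from the March 2026 snapshot, which lands exactly on the mid-2026 crossover Epoch projects. The fragility is also visible: nudge either growth rate slightly and the date moves by quarters.&lt;/p&gt;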

&lt;p&gt;And this isn't just analyst projection; real money is already behind it, which is where the numbers get particularly interesting. The cost of reasoning models has been &lt;a href="https://dev.to/blog/hidden-cost-cheap-ai-reasoning-models-2026/"&gt;dropping dramatically&lt;/a&gt;, which favors the company with the more efficient architecture.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Betting Markets Have Already Decided
&lt;/h3&gt;

&lt;p&gt;Polymarket — the prediction market where traders put real money behind their forecasts — tells a story that leaves almost no room for ambiguity.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Market&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Best AI model at end of March 2026&lt;/td&gt;
&lt;td&gt;Anthropic: &lt;strong&gt;100%&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;$16M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best AI model at end of April 2026&lt;/td&gt;
&lt;td&gt;Anthropic: &lt;strong&gt;90%&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;$3M&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Will Anthropic or OpenAI IPO first?&lt;/td&gt;
&lt;td&gt;Anthropic: &lt;strong&gt;69%&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;$50.6K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic $500B+ valuation?&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;91%&lt;/strong&gt; yes&lt;/td&gt;
&lt;td&gt;$11K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Mythos released by June 30?&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;70%&lt;/strong&gt; yes&lt;/td&gt;
&lt;td&gt;$37.6K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude 5 released by June 30?&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;59%&lt;/strong&gt; yes&lt;/td&gt;
&lt;td&gt;$3M (161 comments)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic Pentagon deal?&lt;/td&gt;
&lt;td&gt;Only &lt;strong&gt;19%&lt;/strong&gt; yes&lt;/td&gt;
&lt;td&gt;$43.2K&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI has #1 model by June 30?&lt;/td&gt;
&lt;td&gt;Only &lt;strong&gt;29%&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Read those last two rows again. Only 19% of traders think Anthropic will take a Pentagon deal — the market has priced in that Anthropic will continue to say no. And only 29% think OpenAI will reclaim the top model spot by the end of June. With $16 million in volume on the March market alone, this isn't speculation from bored degens — it's institutional-grade conviction.&lt;/p&gt;

&lt;p&gt;The betting markets have already decided. The question is whether the fundamentals agree. So far, they do.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Polymarket snapshot, March 29:&lt;/strong&gt; $16M in volume says Anthropic has the best model. 91% say Anthropic hits $500B+ valuation. Only 29% think OpenAI reclaims #1 by June. When this much money is on the line, sentiment becomes signal.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What Could Go Wrong for Anthropic
&lt;/h2&gt;

&lt;p&gt;The data supports Anthropic winning. But intellectual honesty requires asking: what could reverse this?&lt;/p&gt;

&lt;p&gt;The answer isn't nothing. It's four specific things.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Capacity Problem
&lt;/h3&gt;

&lt;p&gt;Anthropic is growing faster than its infrastructure can handle — and users are noticing.&lt;/p&gt;

&lt;p&gt;On r/ClaudeAI, an open letter titled "Want to free up compute during peak hours?" hit 1,052 upvotes — a rare display of user frustration from Anthropic's most loyal community. The complaint: throttled responses, degraded quality during peak usage, and rate limits that feel punitive for paying customers.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"They're adding 1M users/day... It's a miracle these AI services are up at all." — &lt;a href="https://x.com/Austen/status/2036886520100012459" rel="noopener noreferrer"&gt;@Austen&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Growth this fast can break more than servers. It can break talent pipelines, engineering culture, and the careful quality control that earned Anthropic its reputation. The history of tech is littered with companies that grew faster than their infrastructure — and the ones that survived were the ones that throttled growth until quality caught up. The ones that didn't? Ask anyone who worked at early-growth Twitter.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Mythos Expectations Trap
&lt;/h3&gt;

&lt;p&gt;When you leak documentation calling your next model "by far the most powerful AI model we've ever developed" and create a &lt;em&gt;new tier above your flagship product&lt;/em&gt;, you've set expectations that are nearly impossible to meet.&lt;/p&gt;

&lt;p&gt;If Mythos delivers a genuine step-change — the kind of jump that Opus 4 represented over Claude 3 — Anthropic's lead becomes structural. But if Mythos feels like an incremental improvement with better marketing, the narrative reverses fast. Markets reward expectation beats, not absolute performance.&lt;/p&gt;

&lt;p&gt;The 70% Polymarket odds on Mythos releasing by June 30 mean there's already a countdown clock ticking. Every week that passes without a release builds both anticipation and skepticism.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Government Isn't Done
&lt;/h3&gt;

&lt;p&gt;Anthropic won a court ruling, but they haven't won the war.&lt;/p&gt;

&lt;p&gt;The federal judge said the Pentagon blacklisting "looks like punishment for its views on AI safety" — a meaningful legal signal. But the government has far more tools than lawsuits: executive orders, procurement requirements, export controls, national security designations. The next administration could flip the entire posture. And the current one has &lt;a href="https://www.theverge.com/ai-artificial-intelligence/883456/anthropic-pentagon-department-of-defense-negotiations" rel="noopener noreferrer"&gt;demonstrated willingness&lt;/a&gt; to punish companies that don't align with its AI agenda.&lt;/p&gt;

&lt;p&gt;The Polymarket number here is telling: only 19% think Anthropic will take a Pentagon deal. That's the market pricing in continued refusal — which means continued government friction.&lt;/p&gt;

&lt;h3&gt;
  
  
  OpenAI Isn't Dead
&lt;/h3&gt;

&lt;p&gt;Let's not write the obituary yet.&lt;/p&gt;

&lt;p&gt;OpenAI still holds 45% app market share. They topped &lt;a href="https://money.usnews.com/investing/news/articles/2026-03-04/openai-tops-25-billion-in-annualized-revenue-last-month-the-information-reports" rel="noopener noreferrer"&gt;$25 billion in annualized revenue&lt;/a&gt; as of February. They just raised &lt;a href="https://www.cnbc.com/2026/03/24/openai-secures-an-extra-10-billion-in-record-funding-round-cfo-friar-says.html" rel="noopener noreferrer"&gt;an additional $10 billion&lt;/a&gt;, bringing total funding past $120 billion. They're &lt;a href="https://www.neowin.net/news/openai-to-merge-atlas-browser-chatgpt-and-codex-into-a-single-desktop-super-app/" rel="noopener noreferrer"&gt;hiring aggressively&lt;/a&gt; — from 4,500 to 8,000 employees. And the SuperApp consolidation (merging Atlas browser, ChatGPT, and Codex into a single desktop application) is architecturally sound.&lt;/p&gt;

&lt;p&gt;Most importantly: &lt;a href="https://www.tomsguide.com/ai/openai-just-killed-sora-as-company-readies-ipo-and-new-spud-model" rel="noopener noreferrer"&gt;Tom's Guide reports&lt;/a&gt; that OpenAI is preparing a new model codenamed "Spud" — potentially GPT-6 — and that Sora's compute was freed specifically to train it. If Spud delivers a genuine capability leap, the Polymarket odds reset overnight. Killing Sora was a sacrifice, not a surrender — OpenAI bet that video AI was the wrong game and coding/reasoning is the right one.&lt;/p&gt;

&lt;p&gt;The $14 billion projected loss looks alarming until you remember that OpenAI has $120B+ in backing and is targeting a $1 trillion IPO valuation in H2 2026. They can afford to lose money for a long time — the question is whether that money buys them back the trust they've burned.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The realistic bear case for Anthropic:&lt;/strong&gt; Capacity constraints alienate power users → Mythos underwhelms relative to expectations → Government pressure escalates beyond the courts → OpenAI's "Spud" delivers a genuine GPT-6-level leap → IPO capital gives OpenAI an infrastructure advantage Anthropic can't match. Each of these alone is manageable. Together, they could reverse the narrative.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Uncomfortable Question
&lt;/h2&gt;

&lt;p&gt;Here's what keeps me thinking about this story long after the numbers are tallied.&lt;/p&gt;

&lt;p&gt;If doing the right thing is also the optimal business strategy, what does that mean for every other company?&lt;/p&gt;

&lt;p&gt;Anthropic refused the Pentagon contract — and got rewarded with the #1 App Store position and a subscriber wave that would have cost billions to acquire through paid marketing. They refused to ship erotic chatbots — and earned the enterprise trust that's driving $19 billion in ARR. They focused on developer tools instead of consumer gimmicks — and Claude Code became the fastest B2B product ramp in AI history.&lt;/p&gt;

&lt;p&gt;Every contrarian bet was a bet on the proposition that &lt;em&gt;responsible AI development produces better commercial outcomes&lt;/em&gt;. Not because the market rewards virtue (it usually doesn't), but because in AI specifically, trust is the scarcest resource. When you're asking enterprises to route their most sensitive data through your models, when you're asking developers to build their careers on your platform, when you're asking consumers to trust you with conversations they wouldn't have with another human — the company that demonstrably takes safety seriously has a structural advantage over the company that takes Pentagon contracts and ships adult content.&lt;/p&gt;

&lt;p&gt;This isn't a feel-good story. It's a market story. And if Anthropic's thesis is correct — if principle and profit are genuinely aligned in AI — then every company in tech needs to reconsider the assumption that ethics is a cost center.&lt;/p&gt;

&lt;p&gt;OpenAI's IPO will be the biggest test. It will either be the comeback story of the decade or the most expensive validation of Dario Amodei's original thesis: that the people who left were right all along.&lt;/p&gt;

&lt;p&gt;The betting markets have picked their side. With $16 million in volume.&lt;/p&gt;

&lt;p&gt;The question isn't really who's winning anymore. The question is what it means that this is &lt;em&gt;how&lt;/em&gt; they won.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This article is part of our ongoing coverage of the AI industry landscape. For a direct model comparison, see our &lt;a href="https://dev.to/blog/claude-vs-chatgpt-vs-gemini-2026/"&gt;Claude vs ChatGPT vs Gemini breakdown&lt;/a&gt;. For a deeper look at how Claude Code is reshaping development workflows, read our &lt;a href="https://dev.to/blog/claude-code-remote-tasks-cloud-ai-agents-2026/"&gt;analysis of Claude Code's remote task capabilities&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/anthropic-vs-openai-rivalry-2026/" rel="noopener noreferrer"&gt;Full article with charts and interactive sources on ComputeLeap →&lt;/a&gt;&lt;/strong&gt; | Follow &lt;a href="https://x.com/ComputeLeapAI" rel="noopener noreferrer"&gt;@ComputeLeapAI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>openai</category>
      <category>programming</category>
    </item>
    <item>
      <title>Anthropic Just Turned Claude Into a Desktop Agent. Here's How CoWork Actually Works.</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Sun, 29 Mar 2026 17:43:21 +0000</pubDate>
      <link>https://dev.to/max_quimby/anthropic-just-turned-claude-into-a-desktop-agent-heres-how-cowork-actually-works-4oi0</link>
      <guid>https://dev.to/max_quimby/anthropic-just-turned-claude-into-a-desktop-agent-heres-how-cowork-actually-works-4oi0</guid>
      <description>&lt;p&gt;Anthropic has been quietly shifting Claude from a developer's tool into something anyone can use. CoWork — a sandboxed agent environment built into Claude Desktop — is the clearest signal yet. No terminal. No command line. Just mount your files, describe what you need, and let Claude work.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/claude-cowork-complete-guide-2026/" rel="noopener noreferrer"&gt;Read the full version with charts and embedded sources on ComputeLeap →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This week, Anthropic's Head of Design Jenny Wen sat down with Peter Yang for a 40-minute official walkthrough of CoWork. That's not a casual product update — it's Anthropic telling the market: Claude is for everyone now.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgizm0r8fiq1uw6ic18pb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgizm0r8fiq1uw6ic18pb.png" alt="Claude CoWork desktop agent interface" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's what CoWork actually is, how to set it up, what it's good at, and where it falls short.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Claude CoWork?
&lt;/h2&gt;

&lt;p&gt;CoWork is a tab inside Claude Desktop that gives Claude agent-level capabilities without requiring any technical knowledge. Think of it as Claude Code's non-developer sibling.&lt;/p&gt;

&lt;p&gt;When you open CoWork, you get a sandboxed environment where Claude can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Read and write files&lt;/strong&gt; from folders you mount (documents, spreadsheets, CSVs, PDFs)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run code in the background&lt;/strong&gt; to process data, generate charts, or transform documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Create and edit files&lt;/strong&gt; directly — reports, presentations, cleaned datasets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execute multi-step workflows&lt;/strong&gt; — "analyze this CSV, find the outliers, write a summary, export it as a formatted PDF"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key word is &lt;em&gt;sandboxed&lt;/em&gt;. CoWork runs in an isolated environment on your machine. It can't browse the web, can't access your email, and can't touch anything you haven't explicitly shared with it. This is a deliberate design choice — Anthropic is trading capability for trust.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;CoWork launched in January 2026&lt;/strong&gt; as part of the Claude Desktop app. It's available on the $20/month Pro plan — the same tier that gives you access to Claude Opus 4.6 and extended thinking. No additional cost, no waitlist.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How to Access CoWork
&lt;/h2&gt;

&lt;p&gt;CoWork is available to anyone on Claude Pro ($20/month) or higher. Here's how to get started:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Download Claude Desktop&lt;/strong&gt; from &lt;a href="https://claude.ai/download" rel="noopener noreferrer"&gt;claude.ai/download&lt;/a&gt; if you haven't already (macOS and Windows supported)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sign in&lt;/strong&gt; with your Claude Pro account&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Look for the CoWork tab&lt;/strong&gt; in the left sidebar — it's separate from the standard chat interface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mount a folder&lt;/strong&gt; by clicking the folder icon and selecting a directory from your computer&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. No API keys, no configuration files, no environment variables. You point CoWork at your files and start talking.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your First CoWork Project: A Practical Walkthrough
&lt;/h2&gt;

&lt;p&gt;Let's say you have a folder of monthly sales reports in CSV format and you need a quarterly summary with charts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Open CoWork and mount the folder containing your CSV files.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Tell Claude what you need:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Analyze all the CSV files in this folder. Each one is a monthly sales report. Create a quarterly summary showing total revenue by product category, month-over-month growth rates, and highlight any categories that declined. Export the summary as a formatted markdown report and generate a bar chart comparing the three months."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Watch Claude work. CoWork shows you what it's doing in real time — reading files, running Python code, generating outputs. You'll see the code it writes and the intermediate results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; Review the outputs. Claude will create the summary report and chart in your mounted folder. If something's off, just tell it: "The chart needs a legend" or "Break down the electronics category by sub-category."&lt;/p&gt;

&lt;p&gt;The feedback loop is conversational. You don't need to understand Python or data analysis — you just need to know what you want.&lt;/p&gt;
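&lt;p&gt;Behind the scenes, the code CoWork writes for a request like this is ordinary data-wrangling Python. Here's a minimal sketch of the totals and month-over-month growth steps; the data shape and category names are invented for illustration, not taken from CoWork's actual generated code:&lt;/p&gt;

```python
from collections import defaultdict

def quarterly_summary(monthly_rows):
    """monthly_rows: {month: [(category, revenue), ...]} in chronological order.
    Returns per-category quarter totals and month-over-month growth (% change)."""
    totals = defaultdict(float)   # category -> quarter total
    by_month = {}                 # month -> {category: revenue}
    for month, rows in monthly_rows.items():
        sums = defaultdict(float)
        for category, revenue in rows:
            sums[category] += revenue
            totals[category] += revenue
        by_month[month] = dict(sums)

    growth = {}                   # (prev_month, curr_month) -> {category: % change}
    months = list(by_month)
    for prev, curr in zip(months, months[1:]):
        growth[(prev, curr)] = {
            cat: (by_month[curr].get(cat, 0.0) - rev) / rev * 100
            for cat, rev in by_month[prev].items() if rev
        }
    return dict(totals), growth

# Toy data standing in for the mounted CSV files
q1 = {
    "jan": [("electronics", 100.0), ("apparel", 50.0)],
    "feb": [("electronics", 120.0), ("apparel", 45.0)],
    "mar": [("electronics", 150.0), ("apparel", 60.0)],
}
totals, growth = quarterly_summary(q1)
print(totals["electronics"])                       # 370.0
print(round(growth[("jan", "feb")]["apparel"], 1)) # -10.0 (a declining category)
```

&lt;p&gt;The point isn't that you'd write this yourself — it's that CoWork's output is inspectable, so a technical teammate can audit exactly what the agent did.&lt;/p&gt;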

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/rlIy7b-3DC8"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Use Cases for CoWork
&lt;/h2&gt;

&lt;p&gt;After testing CoWork extensively and watching what the community is building, I've found these are the use cases where it genuinely shines:&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Analysis and Reporting
&lt;/h3&gt;

&lt;p&gt;This is CoWork's strongest suit. Drop CSVs, Excel files, or JSON data into a mounted folder and ask Claude to analyze, visualize, and summarize. It handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sales reports and financial analysis&lt;/li&gt;
&lt;li&gt;Survey data processing and visualization&lt;/li&gt;
&lt;li&gt;Log file analysis and pattern detection&lt;/li&gt;
&lt;li&gt;Cleaning messy datasets (deduplication, format normalization, missing value handling)&lt;/li&gt;
&lt;/ul&gt;
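&lt;p&gt;The cleaning tasks in that last bullet reduce to a few lines of Python. A hedged sketch of deduplication plus format normalization — the normalization rules here (collapse whitespace, lowercase) are illustrative choices, not CoWork defaults:&lt;/p&gt;

```python
def clean_rows(rows):
    """Normalize casing/whitespace and drop duplicates, keeping first occurrence."""
    seen, cleaned = set(), []
    for row in rows:
        # Collapse runs of whitespace and lowercase each field
        normalized = tuple(" ".join(str(field).split()).lower() for field in row)
        if normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned

raw = [
    ("  Acme Corp ", "NY"),
    ("acme corp", "ny"),   # duplicate once normalized
    ("Globex", "CA"),
]
print(clean_rows(raw))     # [('acme corp', 'ny'), ('globex', 'ca')]
```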

&lt;h3&gt;
  
  
  Document Processing
&lt;/h3&gt;

&lt;p&gt;CoWork excels at batch document work that would take hours manually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extracting structured data from PDFs&lt;/li&gt;
&lt;li&gt;Converting between formats (markdown to HTML, CSV to formatted reports)&lt;/li&gt;
&lt;li&gt;Summarizing long documents and flagging key sections&lt;/li&gt;
&lt;li&gt;Generating templated documents from data (contracts, invoices, proposals)&lt;/li&gt;
&lt;/ul&gt;
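&lt;p&gt;Format conversion is similarly mechanical under the hood. A minimal CSV-to-Markdown-table converter of the sort CoWork might generate — illustrative only; a real run would also handle quoting, encoding, and ragged rows:&lt;/p&gt;

```python
import csv
import io

def csv_to_markdown(csv_text):
    """Convert CSV text into a GitHub-style Markdown table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",  # separator row
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)

print(csv_to_markdown("name,region\nAcme,NY\nGlobex,CA"))
# first line printed: | name | region |
```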

&lt;h3&gt;
  
  
  Project Management Artifacts
&lt;/h3&gt;

&lt;p&gt;Need a project plan, Gantt chart, or status report? Mount your project files and let Claude generate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Project timelines and milestone tracking&lt;/li&gt;
&lt;li&gt;Resource allocation summaries&lt;/li&gt;
&lt;li&gt;Risk assessment documents&lt;/li&gt;
&lt;li&gt;Meeting notes → action items → follow-up templates&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Content Creation Workflows
&lt;/h3&gt;

&lt;p&gt;CoWork is particularly good at content workflows where you need Claude to reference existing materials:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Draft blog posts from research notes and outlines&lt;/li&gt;
&lt;li&gt;Create social media calendars from content strategy docs&lt;/li&gt;
&lt;li&gt;Generate email sequences from product briefs&lt;/li&gt;
&lt;li&gt;Build presentation outlines from meeting transcripts&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Power user tip from Jenny Wen's tutorial:&lt;/strong&gt; Start with a "project brief" file in your mounted folder. Write a plain-text document describing what the project is, what the expected outputs are, and any constraints. When you start a CoWork session, tell Claude to read the brief first. This gives it context that persists across the entire session and dramatically improves output quality.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  CoWork vs. Claude Code: When to Use Which
&lt;/h2&gt;

&lt;p&gt;This is the question everyone's asking. Both are Anthropic products. Both give Claude agent capabilities. The difference is who they're built for.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;CoWork&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Claude Code&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Target user&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Non-developers, business users, analysts&lt;/td&gt;
&lt;td&gt;Software developers, engineers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Interface&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GUI in Claude Desktop&lt;/td&gt;
&lt;td&gt;Terminal / CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Environment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sandboxed file system&lt;/td&gt;
&lt;td&gt;Full system access (with permissions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primary use&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data analysis, documents, content&lt;/td&gt;
&lt;td&gt;Writing code, debugging, DevOps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;File access&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Mounted folders only&lt;/td&gt;
&lt;td&gt;Entire project directory + git&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code execution&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Background (Python sandbox)&lt;/td&gt;
&lt;td&gt;Direct (any language, full toolchain)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Price&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$20/mo (Pro)&lt;/td&gt;
&lt;td&gt;$20/mo (Pro) or $100/mo (Max)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Use CoWork when:&lt;/strong&gt; You're working with documents, data, or content and you don't want to touch a terminal. You need Claude to process files, generate reports, or automate office workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Claude Code when:&lt;/strong&gt; You're building software. You need git integration, multi-file code editing, test execution, CI/CD interaction, or anything that requires a real development environment.&lt;/p&gt;

&lt;p&gt;They're complementary, not competing. Many people use both — CoWork for business tasks, Claude Code for engineering work.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/UPtmKh1vMN8"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  CoWork vs. Competitors
&lt;/h2&gt;

&lt;p&gt;CoWork isn't the only player in the "AI agent for non-developers" space. Here's how it stacks up.&lt;/p&gt;

&lt;h3&gt;
  
  
  Paperclip
&lt;/h3&gt;

&lt;p&gt;Paperclip has been positioning itself aggressively as the "CoWork killer" this week, with creators like Nate Herk and Greg Isenberg framing it as the AI-native staffing alternative. The pitch: hire AI agents like employees to handle specific business functions.&lt;/p&gt;

&lt;p&gt;The fundamental difference is architecture. CoWork runs locally in a sandbox on your machine — your files stay on your computer. Paperclip routes through cloud APIs.&lt;/p&gt;

&lt;p&gt;Hugging Face CEO Clément Delangue flagged this directly: Paperclip sends your data through external APIs, which means your documents, spreadsheets, and business data transit through third-party infrastructure. For anyone handling sensitive data — client information, financial records, internal strategy docs — this is a meaningful distinction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CoWork advantage:&lt;/strong&gt; Local execution, data stays on your machine, Anthropic's privacy stance.&lt;br&gt;
&lt;strong&gt;Paperclip advantage:&lt;/strong&gt; More specialized agent templates, "hire an employee" UX metaphor, broader integrations.&lt;/p&gt;

&lt;h3&gt;
  
  
  ChatGPT Canvas
&lt;/h3&gt;

&lt;p&gt;OpenAI's Canvas is the closest direct comparison. Like CoWork, it provides a workspace for non-developers to collaborate with AI on documents and code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CoWork advantage:&lt;/strong&gt; Deeper file system integration (mount entire folders vs. single documents), stronger data analysis pipeline, Claude Opus 4.6's superior reasoning for complex analytical tasks.&lt;br&gt;
&lt;strong&gt;Canvas advantage:&lt;/strong&gt; Better real-time collaborative editing UX, integrated image generation (DALL-E), broader plugin ecosystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Google Gemini Workspace Integration
&lt;/h3&gt;

&lt;p&gt;Google's approach is different — rather than a standalone agent environment, Gemini is being woven directly into Workspace apps (Docs, Sheets, Slides, Gmail).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CoWork advantage:&lt;/strong&gt; More powerful for complex, multi-file workflows. Not locked into Google's ecosystem.&lt;br&gt;
&lt;strong&gt;Gemini advantage:&lt;/strong&gt; Native integration with tools billions of people already use. If you live in Google Workspace, the AI comes to you — you don't go to it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The bigger picture:&lt;/strong&gt; Anthropic's $6B ARR (as of February 2026) is 75% API revenue — enterprise developers building on Claude. CoWork is their first serious play for everyone else: the knowledge workers who will never touch an API. The All-In Podcast covered this as Anthropic's "generational run" — and CoWork is a key part of how they sustain it beyond developers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Limitations You Should Know
&lt;/h2&gt;

&lt;p&gt;CoWork is impressive, but it has real constraints:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No internet access.&lt;/strong&gt; CoWork can't browse the web, call APIs, or fetch external data. Everything it works with must be in your mounted folder.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No persistent memory across sessions.&lt;/strong&gt; Each CoWork session starts fresh. Claude doesn't remember what you worked on yesterday. The workaround: keep a "project context" file in your folder that you update after each session.&lt;/p&gt;
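
&lt;p&gt;A context file like this works well (the name and format are just a convention you maintain yourself, not something CoWork reads automatically):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# project-context.txt (update by hand at the end of each session)

Done so far: cleaned sales_q1.csv, wrote summary-by-region.md
Open questions: three outlier rows in the March data are still unverified
Next session: build the quarterly comparison report from the cleaned files
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Start each new session by telling Claude to read it before doing anything else.&lt;/p&gt;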

&lt;p&gt;&lt;strong&gt;File type limitations.&lt;/strong&gt; CoWork handles text-based files well (CSV, JSON, markdown, plain text, Python scripts). Complex Excel files with macros or pivot tables may not parse perfectly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rate limits on Pro.&lt;/strong&gt; The $20/month Pro plan has usage limits. The $100/month Max plan offers higher limits for intensive processing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;macOS and Windows only.&lt;/strong&gt; No Linux support yet, no mobile, no web-only option. You need the Claude Desktop app.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tips for Getting the Most Out of CoWork
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Structure your folders before starting.&lt;/strong&gt; Claude works better when files are organized logically — not dumped in a flat mess.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Write a project brief.&lt;/strong&gt; A 200-word document describing your project, expected outputs, and constraints will save you multiple rounds of back-and-forth.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Be specific about output formats.&lt;/strong&gt; "Generate a report" is vague. "Generate a markdown report with headers for each product category, a summary table at the top, and bullet points for key findings" gets you what you want on the first try.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use iterative refinement.&lt;/strong&gt; Don't try to get everything in one prompt. Start with the core analysis, review it, then refine.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Keep sessions focused.&lt;/strong&gt; One project per CoWork session. If you need to switch contexts, start a new session with a fresh folder mount.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The Simon Willison prediction, realized:&lt;/strong&gt; When CoWork launched in January, Simon Willison (the Django co-creator and prominent AI commentator) predicted Anthropic would shift Claude's messaging from "developer tool" to "general productivity agent." Jenny Wen's official tutorial this week — focused entirely on non-developer workflows — confirms exactly that trajectory.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Claude CoWork is not revolutionary; it's the logical next step: AI assistants that can read your files, process your data, and generate useful outputs without requiring technical skills. What makes it worth paying attention to is &lt;em&gt;who&lt;/em&gt; built it and &lt;em&gt;how&lt;/em&gt; it works.&lt;/p&gt;

&lt;p&gt;Anthropic's local-first, sandbox approach is a genuine differentiator in a market where most competitors route your data through cloud APIs. The $20/month price point (bundled with everything else Claude Pro offers) makes it accessible. And the quality of Claude Opus 4.6 underneath means the outputs are legitimately good — not demo-quality, but actually usable.&lt;/p&gt;

&lt;p&gt;If you're a knowledge worker drowning in spreadsheets, reports, and document processing, CoWork is worth trying. If you're a developer, you probably want Claude Code instead. And if you're evaluating both CoWork and Paperclip, the privacy architecture difference should be your first decision criterion.&lt;/p&gt;

&lt;p&gt;The agent wars are heating up. Anthropic just made their move for the non-developer market. Whether that move sticks depends on how fast they can close the gaps — internet access, persistent memory, and the ecosystem integrations that make an agent truly indispensable.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Want the developer-focused perspective? Check out our guide to &lt;a href="https://www.computeleap.com/blog/best-ai-apis-for-developers-2026" rel="noopener noreferrer"&gt;AI APIs for developers in 2026&lt;/a&gt; and the &lt;a href="https://www.computeleap.com/blog/rise-of-ai-agents-2026" rel="noopener noreferrer"&gt;rise of AI agents&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/claude-cowork-complete-guide-2026/" rel="noopener noreferrer"&gt;Full article on ComputeLeap →&lt;/a&gt;&lt;/strong&gt; | Follow &lt;a href="https://x.com/ComputeLeapAI" rel="noopener noreferrer"&gt;@ComputeLeapAI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>productivity</category>
      <category>tools</category>
    </item>
    <item>
      <title>ARC-AGI V3 Explained: The New AI Benchmark That Breaks Every Agent</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Sun, 29 Mar 2026 17:41:38 +0000</pubDate>
      <link>https://dev.to/max_quimby/arc-agi-v3-explained-the-new-ai-benchmark-that-breaks-every-agent-1oc8</link>
      <guid>https://dev.to/max_quimby/arc-agi-v3-explained-the-new-ai-benchmark-that-breaks-every-agent-1oc8</guid>
      <description>&lt;h2&gt;
  
  
  The Score That Should Have Everyone Worried
&lt;/h2&gt;

&lt;p&gt;On March 28, 2026, the AI world got a number that should be printed on every AGI roadmap in a very large font: &lt;strong&gt;0.3%&lt;/strong&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;&lt;a href="https://www.agentconn.com/blog/arc-agi-v3-ai-agent-benchmark-2026/" rel="noopener noreferrer"&gt;Read the full version with charts and embedded sources on AgentConn →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's the score GPT-5.4 High and Claude Opus 4.6 Max — the two most capable AI systems on the planet — achieved on ARC-AGI V3. At a cost of $5,000 to $9,000 per task.&lt;/p&gt;

&lt;p&gt;Humans? &lt;strong&gt;100%&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Symbolica's Agentica SDK? &lt;strong&gt;36%&lt;/strong&gt; — and a total bill of about $1,005 for 113 of 182 levels.&lt;/p&gt;

&lt;p&gt;This isn't a minor benchmark update. ARC-AGI V3 is the clearest signal yet that the AI industry has been solving the wrong problem.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📊 &lt;strong&gt;The V3 Scoreboard&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Humans: 100% success rate&lt;/li&gt;
&lt;li&gt;Symbolica Agentica SDK: 36.08% (113/182 levels, $1,005 total)&lt;/li&gt;
&lt;li&gt;GPT-5.4 High: ~0.3% (at $5,000-9,000 per task)&lt;/li&gt;
&lt;li&gt;Claude Opus 4.6 Max: ~0.25-0.3% (similar cost profile)&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What Is ARC-AGI, and Why Does It Keep Mattering?
&lt;/h2&gt;

&lt;p&gt;The Abstraction and Reasoning Corpus (ARC) is the benchmark that François Chollet — creator of Keras, co-founder of the ARC Prize, and arguably the AI field's most credible skeptic — designed specifically to measure fluid intelligence rather than memorized knowledge.&lt;/p&gt;

&lt;p&gt;The core insight: if a model has seen enough training examples, it can score well on almost any benchmark. ARC was designed from the start to resist this.&lt;/p&gt;

&lt;h3&gt;
  
  
  V1 to V2 to V3: Closing the Escape Hatches
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;ARC-AGI V1&lt;/strong&gt; (2019): Static 2D grid puzzles. Given a few input-output examples, derive the transformation rule and apply it to a new input.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ARC-AGI V2&lt;/strong&gt; (2025): Addressed the contamination problem with harder, more compositional puzzles and stricter novelty guarantees.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ARC-AGI V3&lt;/strong&gt; (2026): A complete category shift — interactive video game environments instead of static puzzles.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/UkCfrNTzUMM"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;




&lt;h2&gt;
  
  
  How ARC-AGI V3 Actually Works
&lt;/h2&gt;

&lt;p&gt;V3 drops agents into &lt;strong&gt;interactive video game environments&lt;/strong&gt;. Here's what that means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent is presented with a novel mini-game it has never seen before&lt;/li&gt;
&lt;li&gt;There are &lt;strong&gt;zero instructions&lt;/strong&gt; — no goal, no controls, no rules explained&lt;/li&gt;
&lt;li&gt;The agent has a &lt;strong&gt;limited number of turns&lt;/strong&gt; to figure everything out&lt;/li&gt;
&lt;li&gt;Success means: discover the goal, learn the controls, understand the rules, complete the task&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is how humans learn to play new games. A 10-year-old can master a new mobile game in minutes. Current AI systems? They break.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;Chollet's Core Thesis:&lt;/strong&gt; LLMs didn't get smarter. They got better-trained on verifiable domains like code. Move to genuinely novel, non-verifiable tasks — and the apparent intelligence evaporates.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  François Chollet's Vision — and Warning
&lt;/h2&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/k2ZLQC8P7dc"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;His key arguments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;LLMs improved on measurable tasks, not on intelligence itself.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Move to unverifiable domains and progress stalls.&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AGI timeline: 2030, but not via the current path.&lt;/strong&gt; The core engine will fit in fewer than 10,000 lines of code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The $600K ARC Prize exists to redirect research incentives&lt;/strong&gt; away from benchmark gaming.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  The Results in Context: What 0.3% Actually Means
&lt;/h2&gt;

&lt;p&gt;GPT-5.4 High, running at $5,000-9,000 per task, scored &lt;strong&gt;0.3%&lt;/strong&gt; on an evaluation where humans score &lt;strong&gt;100%&lt;/strong&gt;. This is not a narrow gap.&lt;/p&gt;

&lt;p&gt;Symbolica's Agentica SDK: &lt;strong&gt;36%&lt;/strong&gt; at ~$9 per level — roughly 500 to 1,000 times cheaper per task than the frontier models that scored near-zero.&lt;/p&gt;

&lt;p&gt;The 36% vs 0.3% comparison is a signal about &lt;strong&gt;architecture&lt;/strong&gt;. Symbolica's program synthesis approach outperforms pure LLM scaling by 120x on this benchmark.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Exposes the Agent Industry's Blind Spot
&lt;/h2&gt;

&lt;p&gt;AI agents can write code, browse the web, draft emails. But these are all examples of &lt;strong&gt;applying learned patterns to familiar scenarios&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Put an agent in a genuinely novel environment and it collapses. ARC-AGI V3 puts numbers on the failure.&lt;/p&gt;

&lt;p&gt;The kicker: the agent industry's go-to defense — it gets better with more context — directly contradicts the V3 premise. &lt;strong&gt;You don't get examples. You get to figure it out.&lt;/strong&gt;&lt;/p&gt;


&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
      &lt;div class="c-embed__body flex items-center justify-between"&gt;
        &lt;a href="https://x.com/arcprize" rel="noopener noreferrer" class="c-link fw-bold flex items-center"&gt;
          &lt;span class="mr-2"&gt;x.com&lt;/span&gt;
          

        &lt;/a&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;





&lt;h2&gt;
  
  
  What Happens Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Researchers:&lt;/strong&gt; Hybrid architectures combining pattern matching with program synthesis are more likely to crack V3-style problems than pure scaling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Builders:&lt;/strong&gt; Stop overselling adaptability. Design for the limitation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Buyers:&lt;/strong&gt; Benchmark performance ≠ general capability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AGI watchers:&lt;/strong&gt; Chollet's 2030 estimate looks more credible. We're not in the final miles.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The score is 100% humans, 36% Symbolica, 0.3% everything else. The gap is the map.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;&lt;a href="https://www.agentconn.com/blog/arc-agi-v3-ai-agent-benchmark-2026/" rel="noopener noreferrer"&gt;Read the full analysis with multimedia at AgentConn&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://www.agentconn.com/blog/arc-agi-v3-ai-agent-benchmark-2026/" rel="noopener noreferrer"&gt;Full article on AgentConn →&lt;/a&gt;&lt;/strong&gt; | Follow &lt;a href="https://x.com/ComputeLeapAI" rel="noopener noreferrer"&gt;@ComputeLeapAI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>agents</category>
      <category>benchmark</category>
    </item>
    <item>
      <title>Claude Code Just Hit #1 on Hacker News. Here's Everything You Need to Know.</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Sat, 28 Mar 2026 17:45:45 +0000</pubDate>
      <link>https://dev.to/max_quimby/claude-code-just-hit-1-on-hacker-news-heres-everything-you-need-to-know-j74</link>
      <guid>https://dev.to/max_quimby/claude-code-just-hit-1-on-hacker-news-heres-everything-you-need-to-know-j74</guid>
      <description>&lt;p&gt;Claude Code hit #1 on Hacker News today. The post — a deep dive into the &lt;code&gt;.claude/&lt;/code&gt; folder anatomy — pulled 556 points and counting. Five YouTube tutorials dropped in the last 48 hours. X is buzzing with auto-fix demos, hooks configurations, and cloud session workflows. And Anthropic just shipped conditional hooks and cloud-based auto-fix in the same week.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/claude-code-complete-guide-2026/" rel="noopener noreferrer"&gt;Read the full version with charts and embedded sources on ComputeLeap →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Something is happening. Claude Code isn't just a developer tool anymore — it's becoming the default way a new generation of builders creates software. Prosumers who've never opened a terminal are cloning repos and shipping sites. Senior engineers are restructuring their entire CI/CD pipelines around it. The adoption curve isn't linear — it's vertical.&lt;/p&gt;

&lt;p&gt;This guide covers everything from installation to advanced workflows. Whether you're opening Claude Code for the first time or you're ready to wire it into your CI pipeline with hooks and auto-fix, this is the resource you bookmark and come back to.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;📊 The adoption signal is loud:&lt;/strong&gt; Chase AI posted 5 Claude Code tutorials in two days. Kenny Liao dropped a beginner-to-mastery deep dive. Matthew Berman ranked Claude S-tier in his March 2026 model tier list — "unbelievable, good at everything." This isn't just developer content anymore. It's prosumer builders using Claude Code as their default "build anything fast" tool.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What Is Claude Code?
&lt;/h2&gt;

&lt;p&gt;Claude Code is Anthropic's command-line AI coding agent. Unlike chat-based AI assistants that suggest code snippets, Claude Code operates directly in your terminal — reading your files, understanding your project structure, writing code, running tests, committing to git, and executing shell commands. It's an autonomous agent, not an autocomplete engine.&lt;/p&gt;

&lt;p&gt;Think of the difference like this: GitHub Copilot is a passenger giving directions. Claude Code is a driver who knows the roads, checks the mirrors, and parallel parks.&lt;/p&gt;

&lt;p&gt;It runs on Anthropic's Claude models (currently Opus 4.6 by default for Max subscribers, Sonnet 4.5 for Pro) with a massive context window — up to 1 million tokens. That means it can often hold your entire codebase in context while working.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation and Setup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Prerequisites
&lt;/h3&gt;

&lt;p&gt;You need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Node.js 18+&lt;/strong&gt; (Claude Code is an npm package)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An Anthropic account&lt;/strong&gt; with a Max subscription ($100/month for Opus 4.6 at the highest usage limits) or Pro ($20/month with Sonnet 4.5 and limited Opus)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A terminal&lt;/strong&gt; — macOS Terminal, iTerm2, Windows Terminal, or any Linux terminal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git&lt;/strong&gt; installed and configured&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Install Claude Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @anthropic-ai/claude-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. One command. No Docker containers, no Python virtual environments, no config files to create first.&lt;/p&gt;

&lt;h3&gt;
  
  
  Authenticate
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running &lt;code&gt;claude&lt;/code&gt; for the first time opens a browser window for OAuth authentication with your Anthropic account. Once authenticated, the token is stored locally and you're ready to go.&lt;/p&gt;

&lt;h3&gt;
  
  
  Verify Your Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude &lt;span class="nt"&gt;--version&lt;/span&gt;
claude /doctor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;/doctor&lt;/code&gt; command checks your environment — Node version, authentication status, git configuration, and available tools.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚡ Pro tip:&lt;/strong&gt; If you're on macOS, install via Homebrew for automatic updates: &lt;code&gt;brew install claude-code&lt;/code&gt;. The npm install works everywhere, but Homebrew keeps you on the latest version without thinking about it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Your First Project Walkthrough
&lt;/h2&gt;

&lt;p&gt;Let's build something real. Open a terminal, navigate to a project directory (or create a new one), and start Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;my-first-project &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;my-first-project
git init
claude
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code launches in interactive mode. You'll see a prompt where you can type natural language instructions. Try this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a React app with TypeScript that displays a real-time 
cryptocurrency price dashboard. Use Vite for the build tool, 
Tailwind CSS for styling, and the CoinGecko free API for data. 
Include a search bar, favorites list, and auto-refresh every 30 seconds.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Watch what happens. Claude Code will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Plan&lt;/strong&gt; — outline the architecture and file structure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaffold&lt;/strong&gt; — create &lt;code&gt;package.json&lt;/code&gt;, &lt;code&gt;vite.config.ts&lt;/code&gt;, &lt;code&gt;tsconfig.json&lt;/code&gt;, Tailwind config&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implement&lt;/strong&gt; — write components, hooks, API integration, types&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configure&lt;/strong&gt; — set up routing, environment variables, dev scripts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test&lt;/strong&gt; — run the dev server to verify everything works&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The entire process typically takes 3-5 minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The .claude/ Folder: Your Project's Brain
&lt;/h2&gt;

&lt;p&gt;This is what hit #1 on Hacker News. The &lt;code&gt;.claude/&lt;/code&gt; folder is where Claude Code stores its understanding of your project. Think of it as the configuration layer between "generic AI" and "AI that knows your codebase."&lt;/p&gt;

&lt;p&gt;Here's the anatomy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.claude/
├── CLAUDE.md           # Project instructions (the big one)
├── settings.json       # Claude Code configuration
├── settings.local.json # Local overrides (gitignored)
├── commands/           # Custom slash commands
│   ├── review.md       # /review command
│   └── deploy.md       # /deploy command
├── skills/             # Reusable capabilities
│   └── my-skill/
│       └── skill.md
└── rules/              # Constraints and patterns
    ├── no-any.md
    └── error-handling.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  CLAUDE.md — The Most Important File
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;CLAUDE.md&lt;/code&gt; is the instruction manual for Claude Code in your project. When Claude starts a session, it reads this file first. Everything in it shapes how Claude understands and works with your code.&lt;/p&gt;

&lt;p&gt;A good &lt;code&gt;CLAUDE.md&lt;/code&gt; includes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Project: CryptoDash&lt;/span&gt;

&lt;span class="gu"&gt;## Architecture&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; React 18 + TypeScript + Vite
&lt;span class="p"&gt;-&lt;/span&gt; State management: Zustand (NOT Redux — we migrated away in v2.1)
&lt;span class="p"&gt;-&lt;/span&gt; API layer: TanStack Query with custom hooks in src/hooks/api/
&lt;span class="p"&gt;-&lt;/span&gt; Styling: Tailwind CSS with custom design tokens in tailwind.config.ts

&lt;span class="gu"&gt;## Conventions&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; All components use named exports (not default exports)
&lt;span class="p"&gt;-&lt;/span&gt; API hooks follow the pattern: useGet{Resource}, useMutate{Resource}
&lt;span class="p"&gt;-&lt;/span&gt; Error boundaries wrap every route-level component
&lt;span class="p"&gt;-&lt;/span&gt; Tests colocate with source: Component.tsx → Component.test.tsx

&lt;span class="gu"&gt;## Do NOT&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Use any type — use unknown with type guards instead
&lt;span class="p"&gt;-&lt;/span&gt; Import from barrel files (index.ts) in the same package
&lt;span class="p"&gt;-&lt;/span&gt; Add dependencies without checking bundle size impact first
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The CLAUDE.md Hierarchy
&lt;/h3&gt;

&lt;p&gt;Claude Code reads multiple &lt;code&gt;CLAUDE.md&lt;/code&gt; files in priority order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;~/.claude/CLAUDE.md&lt;/code&gt;&lt;/strong&gt; — Global instructions (your personal coding style, always applied)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;./CLAUDE.md&lt;/code&gt;&lt;/strong&gt; or &lt;strong&gt;&lt;code&gt;./.claude/CLAUDE.md&lt;/code&gt;&lt;/strong&gt; — Project root (team-shared, committed to git)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;./src/CLAUDE.md&lt;/code&gt;&lt;/strong&gt; — Directory-specific (instructions for specific parts of the codebase)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Deeper files override shallower ones. Think of CLAUDE.md files like &lt;code&gt;.gitignore&lt;/code&gt; — they cascade from general to specific.&lt;/p&gt;
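
&lt;p&gt;For example, a directory-level file can tighten the root rules for one corner of the codebase (the path and contents here are illustrative):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# src/legacy/CLAUDE.md

This directory predates the v2.1 state-management migration.
- Do NOT refactor these modules to Zustand; they still use local state
- Bug fixes only; no new features, no new dependencies
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;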

&lt;h2&gt;
  
  
  The Hooks System: Automating Everything
&lt;/h2&gt;

&lt;p&gt;Hooks are where Claude Code transforms from "AI assistant" to "development platform." They're user-defined commands that execute automatically at specific points in Claude Code's lifecycle.&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/i-jawzwnjSA"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h3&gt;
  
  
  Hook Events
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Event&lt;/th&gt;
&lt;th&gt;When It Fires&lt;/th&gt;
&lt;th&gt;Common Use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;SessionStart&lt;/td&gt;
&lt;td&gt;Session begins/resumes&lt;/td&gt;
&lt;td&gt;Load environment, inject context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PreToolUse&lt;/td&gt;
&lt;td&gt;Before a tool call&lt;/td&gt;
&lt;td&gt;Block dangerous commands, validate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PostToolUse&lt;/td&gt;
&lt;td&gt;After a tool call succeeds&lt;/td&gt;
&lt;td&gt;Auto-format, lint, notify&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Notification&lt;/td&gt;
&lt;td&gt;Claude needs attention&lt;/td&gt;
&lt;td&gt;Desktop alerts, Slack messages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SubagentStop&lt;/td&gt;
&lt;td&gt;Subagent completes&lt;/td&gt;
&lt;td&gt;Log results, resource cleanup&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PreCompact&lt;/td&gt;
&lt;td&gt;Before context compaction&lt;/td&gt;
&lt;td&gt;Save important context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SessionEnd&lt;/td&gt;
&lt;td&gt;Session ends&lt;/td&gt;
&lt;td&gt;Cleanup, report generation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Your First Hook: Auto-Format on Save
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx prettier --write &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;$CLAUDE_FILE_PATH&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every time Claude writes a file, Prettier formats it automatically.&lt;/p&gt;
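&lt;p&gt;The &lt;code&gt;$CLAUDE_FILE_PATH&lt;/code&gt; variable comes from the hook environment; hooks also receive a JSON payload on stdin, so the same idea works as a standalone script once the logic outgrows one line. A minimal sketch, assuming the payload carries a &lt;code&gt;tool_input.file_path&lt;/code&gt; field (verify the schema against your Claude Code version):&lt;/p&gt;

```python
#!/usr/bin/env python3
"""Sketch of a PostToolUse hook as a standalone script.

Assumes the hook payload arrives as JSON on stdin with a
tool_input.file_path field; verify against your Claude Code version.
"""
import json
import subprocess
import sys

FORMATTABLE = (".js", ".jsx", ".ts", ".tsx", ".json", ".css", ".md")

def extract_path(payload: str):
    """Pull the written file's path out of the hook payload, if present."""
    data = json.loads(payload)
    return data.get("tool_input", {}).get("file_path")

def main() -> int:
    path = extract_path(sys.stdin.read())
    if path and path.endswith(FORMATTABLE):
        # Best-effort formatting; exit 0 either way so the hook never blocks.
        subprocess.run(["npx", "prettier", "--write", path], check=False)
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

&lt;p&gt;Point the &lt;code&gt;command&lt;/code&gt; field at a script like this instead of the inline Prettier call when you need filtering or logging.&lt;/p&gt;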

&lt;h3&gt;
  
  
  Conditional Hooks with if
&lt;/h3&gt;

&lt;p&gt;This just shipped this week — the &lt;code&gt;if&lt;/code&gt; field enables conditional hook execution:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"echo 'Blocked: no direct DB access in production'"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"if"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"echo $CLAUDE_TOOL_INPUT | grep -q 'psql.*prod'"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
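&lt;p&gt;The &lt;code&gt;if&lt;/code&gt; command runs first, and the hook only fires when it exits 0. The grep pattern above is easy to test in isolation; here is the same match sketched in Python (the pattern and the sample inputs are illustrative):&lt;/p&gt;

```python
import re

# Same idea as: echo $CLAUDE_TOOL_INPUT | grep -q 'psql.*prod'
# grep patterns are basic regexes, so re.search is a close stand-in here.
BLOCK_PATTERN = re.compile(r"psql.*prod")

def should_block(tool_input: str) -> bool:
    """True when the tool input looks like a direct production DB command."""
    return BLOCK_PATTERN.search(tool_input) is not None
```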



&lt;h3&gt;
  
  
  Prompt-Based Hooks: AI Reviewing AI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Write"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Review this file change for security vulnerabilities. If you find any, return BLOCK. If safe, return ALLOW."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"claude-haiku-4"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A cheaper, faster model (Haiku) reviews every file write for security issues. AI auditing AI.&lt;/p&gt;
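&lt;p&gt;The hook's decision hinges on parsing the reviewer's answer. A hypothetical fail-closed parser for the ALLOW/BLOCK convention used in the prompt above (illustrative only, not a real Claude Code API):&lt;/p&gt;

```python
def parse_verdict(response: str) -> bool:
    """Return True only on an explicit ALLOW with no BLOCK.

    Fail closed: any mention of BLOCK, or an ambiguous answer,
    denies the write.
    """
    text = response.upper()
    if "BLOCK" in text:
        return False
    return "ALLOW" in text
```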

&lt;h2&gt;
  
  
  Auto-Fix: Claude Code Meets CI/CD
&lt;/h2&gt;

&lt;p&gt;Auto-fix lets Claude Code automatically monitor your pull requests and fix CI failures — linting errors, type errors, failing tests — without you lifting a finger.&lt;/p&gt;

&lt;p&gt;The magic: this now runs in the &lt;strong&gt;cloud&lt;/strong&gt;. Your laptop can be closed. Claude Code cloud sessions monitor your PRs and fix them autonomously.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up Auto-Fix
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"autoFix"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"github"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"ciChecks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"test"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"lint"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"typecheck"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"build"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"maxAttempts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"branchPattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"feat/*"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
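&lt;p&gt;The &lt;code&gt;branchPattern&lt;/code&gt; field reads like a shell-style glob (an assumption; confirm against the docs). Its matching semantics are easy to sketch:&lt;/p&gt;

```python
from fnmatch import fnmatch

def autofix_applies(branch: str, pattern: str = "feat/*") -> bool:
    # fnmatch implements shell-style globbing: * matches any run of characters.
    return fnmatch(branch, pattern)
```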



&lt;h3&gt;
  
  
  The CI Integration Pattern
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Claude Auto-Fix&lt;/span&gt;
&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;check_suite&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;types&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;completed&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;autofix&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;github.event.check_suite.conclusion == 'failure'&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;anthropic/claude-code-action@v1&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;auto-fix&lt;/span&gt;
          &lt;span class="na"&gt;max-attempts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚡ Real-world usage pattern:&lt;/strong&gt; Start with auto-fix on lint and type errors only. Graduate to test failures once you trust the workflow. Never let auto-fix touch security-scan failures.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;⚡ Real-world usage pattern:&lt;/strong&gt; Start with auto-fix on lint and type errors only. Graduate to test failures once you trust the workflow. Never let auto-fix touch security-scan failures.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Cloud Sessions and Remote Tasks
&lt;/h2&gt;

&lt;p&gt;Anthropic shipped Remote Tasks — the ability to run Claude Code sessions on Anthropic's cloud infrastructure, triggered on schedules or events.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled maintenance&lt;/strong&gt; — "Every Monday at 9 AM, audit dependencies and open PRs for outdated packages"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event-driven workflows&lt;/strong&gt; — "When a new issue is labeled bug, create a branch, investigate, and open a draft PR"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Continuous documentation&lt;/strong&gt; — "After every merge to main, update the API docs and changelog"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/1hc-lAbSFVE"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;h2&gt;
  
  
  Advanced Workflows
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Multi-Agent Architecture with Subagents
&lt;/h3&gt;

&lt;p&gt;Claude Code can spawn subagents — isolated Claude instances that handle specific subtasks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Main Agent (Opus 4.6)
├── Subagent 1: "Implement the API endpoints" (Opus)
├── Subagent 2: "Write tests for the API" (Sonnet)
├── Subagent 3: "Update documentation" (Haiku)
└── Subagent 4: "Review all changes" (Opus)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
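&lt;p&gt;The cost-tiering idea behind that tree is simple to express. A hypothetical routing table (the task kinds and model names here are illustrative, not a Claude Code API):&lt;/p&gt;

```python
# Route each subtask to the cheapest model that can handle it,
# defaulting to the strongest model for anything unclassified.
MODEL_FOR_TASK = {
    "implement": "opus",
    "test": "sonnet",
    "docs": "haiku",
    "review": "opus",
}

def route(task_kind: str) -> str:
    return MODEL_FOR_TASK.get(task_kind, "opus")
```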



&lt;h3&gt;
  
  
  The Harness Engineering Pattern
&lt;/h3&gt;

&lt;p&gt;The most sophisticated Claude Code users build &lt;strong&gt;harnesses&lt;/strong&gt; around it — CLAUDE.md files, hooks, custom commands, MCP integrations, review pipelines, and CI workflows.&lt;/p&gt;

&lt;p&gt;In one comparison, the same model scored 78% with one harness and 42% with another. The harness matters more than the model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plan&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;th&gt;Model Access&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Pro&lt;/td&gt;
&lt;td&gt;$20/month&lt;/td&gt;
&lt;td&gt;Sonnet 4.5 (default), limited Opus&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max 5x&lt;/td&gt;
&lt;td&gt;$100/month&lt;/td&gt;
&lt;td&gt;Opus 4.6 (default), unlimited Sonnet&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Max 20x&lt;/td&gt;
&lt;td&gt;$200/month&lt;/td&gt;
&lt;td&gt;Opus 4.6 extended, priority access&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API&lt;/td&gt;
&lt;td&gt;Pay-per-token&lt;/td&gt;
&lt;td&gt;Any model&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Tips That Actually Matter
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Write Your CLAUDE.md Before Writing Code&lt;/strong&gt; — 15 minutes saves hours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use /compact Strategically&lt;/strong&gt; — Run it when Claude starts making mistakes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start Conversations with Context&lt;/strong&gt; — Reference specific files and patterns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git Commit Frequently&lt;/strong&gt; — Gives you rollback points.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trust But Verify&lt;/strong&gt; — Always review the diff before merging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use the Right Model for the Job&lt;/strong&gt; — Opus for architecture, Sonnet for features, Haiku for docs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hooks Are Your Guardrails&lt;/strong&gt; — PreToolUse to block, PostToolUse to format, Notification to alert.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Start Building
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @anthropic-ai/claude-code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create a &lt;code&gt;CLAUDE.md&lt;/code&gt;. Set up your first hook. Push a PR and let auto-fix handle the lint errors. Build something that would have taken you a week — in an afternoon.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://www.computeleap.com/blog/claude-code-complete-guide-2026/" rel="noopener noreferrer"&gt;Full article on ComputeLeap →&lt;/a&gt;&lt;/strong&gt; | Follow &lt;a href="https://x.com/ComputeLeapAI" rel="noopener noreferrer"&gt;@ComputeLeapAI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>claudecode</category>
      <category>ai</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Anthropic Dispatch Review: The AI Desktop Agent That Delivers Finished Work</title>
      <dc:creator>Max Quimby</dc:creator>
      <pubDate>Sat, 28 Mar 2026 17:44:53 +0000</pubDate>
      <link>https://dev.to/max_quimby/anthropic-dispatch-review-the-ai-desktop-agent-that-delivers-finished-work-28o5</link>
      <guid>https://dev.to/max_quimby/anthropic-dispatch-review-the-ai-desktop-agent-that-delivers-finished-work-28o5</guid>
      <description>&lt;p&gt;There's a phrase Anthropic keeps repeating when they talk about Dispatch, their new agentic desktop product: &lt;em&gt;finished work&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📖 &lt;strong&gt;&lt;a href="https://www.agentconn.com/blog/anthropic-dispatch-ai-desktop-agent-review-2026/" rel="noopener noreferrer"&gt;Read the full version with charts and embedded sources on AgentConn →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Not a briefing. Not a summary to review. Not a draft that needs your edits. &lt;strong&gt;Finished work.&lt;/strong&gt; The kind that was blocking you, then wasn't, because Claude handled it while you were in a meeting, on a flight, or — and this is the paradigm they're selling — just texting from your phone.&lt;/p&gt;

&lt;p&gt;That framing distinction matters more than the feature list. It's the difference between AI as a productivity multiplier (still in your workflow) and AI as a workflow executor (operating autonomously while you're away). Dispatch is betting on the latter.&lt;/p&gt;

&lt;p&gt;After analyzing Nate B Jones's detailed walkthrough, cross-referencing Anthropic's own framing, and looking at how it compares to the broader computer-use agent landscape, here's what you actually need to know about Anthropic Dispatch in 2026.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is Anthropic Dispatch?
&lt;/h2&gt;

&lt;p&gt;Dispatch is Anthropic's consumer-facing agentic product that pairs two capabilities into a single coherent workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Desktop Computer Use&lt;/strong&gt; — Claude takes over your Mac (or PC), sees your screen, opens applications, navigates UIs, fills forms, and executes multi-step tasks exactly as a human would — without any API integrations required&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remote Task Delegation&lt;/strong&gt; — You text Claude from your phone (or any remote interface), hand it a task, and walk away. Claude works on your actual desktop environment until it's done&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key word in that second point is &lt;em&gt;actual desktop&lt;/em&gt;. Dispatch isn't running in a cloud VM. It's operating in your local desktop environment, with access to every app you have installed, every file on your disk, every tool in your workflow — no API credentials, no webhooks, no Zapier flows required.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;The paradigm shift in one line:&lt;/strong&gt; Before Dispatch, AI delivered work that still landed on your desk. Dispatch delivers work that never reaches your desk at all.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Nate B Jones Analysis: Three Tools, One Paradigm
&lt;/h2&gt;

&lt;p&gt;Nate B Jones covered Dispatch as part of a broader announcement — "Anthropic Just Gave You 3 Tools That Work While You're Gone" — and his framing is the sharpest take on what Anthropic is actually building.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=3e7gmNPr5Vo" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnj7wwp28cvmm4n6hu4ca.jpg" alt="YouTube Video" width="480" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;His core insight: Anthropic shipped Dispatch as a pair with Computer Use updates not as separate products but as a single thesis. The thesis is &lt;strong&gt;asynchronous work delegation&lt;/strong&gt;. You are not using Claude in a chat session while you watch it work. You are handing it a task and returning to find it done.&lt;/p&gt;

&lt;p&gt;Nate walks through several real-world demonstrations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Technical debt clearing&lt;/strong&gt; — Claude navigating a codebase across multiple files, identifying deprecated patterns, making fixes, running tests, and committing changes — all while the developer is in a standup&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document workflows&lt;/strong&gt; — Claude pulling data from various sources, populating a spreadsheet, formatting a report, and exporting a final version — without touching a single API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Research and synthesis&lt;/strong&gt; — Claude browsing multiple sites, collecting information, and producing a structured output in whatever tool you use (Notion, Google Docs, Word — doesn't matter, it just opens them)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The "no API" angle deserves special emphasis. The biggest friction in enterprise automation has always been integrations. Your tools don't talk to each other natively. Dispatch sidesteps this entirely by operating at the UI layer — the one interface every app exposes equally.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Dispatch Differs From Basic Computer Use
&lt;/h2&gt;

&lt;p&gt;Anthropic has had Computer Use in API beta since late 2024. So what's new?&lt;/p&gt;

&lt;p&gt;The API-level Computer Use product is powerful but requires significant setup: Docker containers, custom tool-calling code, screenshot pipelines, managing the agent loop yourself. It's a developer primitive — incredibly flexible, but not something you hand to a non-technical user.&lt;/p&gt;

&lt;p&gt;Dispatch is the consumer layer built on top of that primitive:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Computer Use API&lt;/th&gt;
&lt;th&gt;Anthropic Dispatch&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Setup&lt;/td&gt;
&lt;td&gt;Docker container, API keys, custom code&lt;/td&gt;
&lt;td&gt;Native desktop app install&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Interface&lt;/td&gt;
&lt;td&gt;Programmatic tool calls&lt;/td&gt;
&lt;td&gt;Natural language via text/chat&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trigger&lt;/td&gt;
&lt;td&gt;Your own orchestration code&lt;/td&gt;
&lt;td&gt;Text from phone / any device&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Environment&lt;/td&gt;
&lt;td&gt;Sandboxed VM (recommended)&lt;/td&gt;
&lt;td&gt;Your actual local desktop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Target user&lt;/td&gt;
&lt;td&gt;Developers&lt;/td&gt;
&lt;td&gt;Knowledge workers, professionals&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Task model&lt;/td&gt;
&lt;td&gt;Synchronous (you watch it work)&lt;/td&gt;
&lt;td&gt;Asynchronous (you return to find it done)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Important distinction:&lt;/strong&gt; Dispatch runs on your local machine, not a cloud desktop. This means it has access to everything you have access to. That's powerful — and requires careful thought about permissions and scope.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Real Use Cases: What "Finished Work" Actually Looks Like
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ✅ Where Dispatch Shines
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Cross-application data workflows:&lt;/strong&gt; "Pull the Q1 pipeline numbers from Salesforce, update the board deck in Google Slides, and send me the updated file." Three apps, zero APIs. This previously required either developer-built integrations or a virtual assistant who could sit at your computer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Research and competitive intelligence:&lt;/strong&gt; "Browse these 8 competitor pricing pages, extract their tier structures, and put it in a comparison table in our Notion database." The human version of this takes 2 hours. Dispatch does it while you sleep.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Email and communication triage:&lt;/strong&gt; Actually navigate to your email client, find threads, produce staged replies — not just "draft a response given this context."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code maintenance tasks:&lt;/strong&gt; "Find all deprecated &lt;code&gt;fetchUser&lt;/code&gt; API calls, replace them with the new pattern, run the tests, push to a branch." The kind of grunt work that burns half a day for no creative value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Form-heavy administrative work:&lt;/strong&gt; Government portals, insurance claims, benefits enrollment. The UI layer that no API will ever reach.&lt;/p&gt;

&lt;h3&gt;
  
  
  ⚠️ Where Dispatch Struggles
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Long-horizon tasks with ambiguity&lt;/strong&gt; — every branch point is another chance for errors to compound&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security-sensitive workflows&lt;/strong&gt; — Financial transactions need explicit human confirmation steps&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Novel interfaces and dynamic UIs&lt;/strong&gt; — Non-standard UI patterns can confuse the vision-based agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time data requirements&lt;/strong&gt; — a long-running task can't guarantee up-to-the-second information&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The "Finished Work" Paradigm
&lt;/h2&gt;

&lt;p&gt;Most AI productivity tools still operate in the &lt;strong&gt;work-lands-on-your-desk model&lt;/strong&gt;: the AI produces something — a draft, a summary, a code snippet — and you pick it up from there.&lt;/p&gt;

&lt;p&gt;Dispatch is a serious attempt at the &lt;strong&gt;work-gets-off-your-desk model&lt;/strong&gt;. The completed task is the output. Not a draft. Not a starting point. A closed loop.&lt;/p&gt;

&lt;p&gt;The trust is earned through:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Verification checkpoints&lt;/strong&gt; — Pause before irreversible actions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit trails&lt;/strong&gt; — A log of exactly what the agent did, when, and what the outcomes were&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scope boundaries&lt;/strong&gt; — You define what apps and data the agent can access&lt;/li&gt;
&lt;/ol&gt;
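&lt;p&gt;A hypothetical sketch of how those three mechanisms compose (illustrative only, not Dispatch's actual schema):&lt;/p&gt;

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

ALLOWED_APPS = {"Notion", "Google Docs"}  # scope boundary, set by the user

@dataclass
class AuditEntry:
    """One audit-trail record: what the agent did, to what, and when."""
    action: str
    target_app: str
    reversible: bool = True
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def needs_checkpoint(entry: AuditEntry) -> bool:
    # Pause for human confirmation when leaving scope or acting irreversibly.
    return entry.target_app not in ALLOWED_APPS or not entry.reversible
```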

&lt;blockquote&gt;
&lt;p&gt;🎯 &lt;strong&gt;The real productivity unlock:&lt;/strong&gt; Dispatch isn't valuable when you're watching it work. It's valuable when you stop watching — when you hand it a task, go do something else, and return to a closed loop. The product's value scales with your willingness to delegate.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Social Signal: What The Community Is Saying
&lt;/h2&gt;

&lt;p&gt;The developer community's reaction to Anthropic's recent agentic product push has been significant. Claude Code's explosive growth — which shipped alongside Dispatch — has generated a wave of prosumer adoption that's hard to ignore.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://twitter.com/bcherny" rel="noopener noreferrer"&gt;https://twitter.com/bcherny&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When Chase AI posted 5 Claude Code tutorials in 48 hours — website cloning in 15 minutes, Obsidian integration, animated site generation — that wasn't developer content. That was prosumer builders discovering a new default for "build anything fast."&lt;/p&gt;

&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/i-jawzwnjSA"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Dispatch is that same energy applied to knowledge work.&lt;/p&gt;




&lt;h2&gt;
  
  
  Matthew Berman's S-Tier Context
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.youtube.com/watch?v=3M7jTPLf86w" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimg.youtube.com%2Fvi%2F3M7jTPLf86w%2F0.jpg" alt="YouTube Video" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Matthew Berman's recent model tier list placed Claude at &lt;strong&gt;S-tier&lt;/strong&gt; — "unbelievable model, good at everything, love every interaction" — while ChatGPT landed at A ("all the features, not best in class at anything").&lt;/p&gt;

&lt;p&gt;That underlying model quality matters enormously for computer-use agents. The agent loop is only as good as the model's reasoning when it hits an unexpected state. Claude's strength at nuanced reasoning and instruction-following is precisely what makes a computer-use product reliable enough for the "walk away and trust it" use case. A mediocre model doing computer use is a liability; an S-tier model doing computer use is a different product category.&lt;/p&gt;




&lt;h2&gt;
  
  
  Dispatch vs. The Alternatives
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;vs. Open Interpreter:&lt;/strong&gt; Open-source, model-agnostic, fully local. Dispatch wins on polish and the mobile delegation UX. Open Interpreter wins on cost, privacy, and flexibility. Different markets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vs. OpenAI Operator:&lt;/strong&gt; Cloud browser only — can't access your local files or desktop applications. For browser-only tasks, viable. For anything requiring your actual desktop environment, Dispatch wins on scope.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vs. Perplexity Computer:&lt;/strong&gt; Interesting model-routing architecture, but also cloud/browser-bound. Same limitation as Operator for local desktop work.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;📊 &lt;strong&gt;Competitive summary:&lt;/strong&gt; Dispatch is the only major player combining (1) local desktop access, (2) mobile delegation interface, and (3) consumer-grade polish. The "text from phone, work gets done on desktop" UX is genuinely novel in the category.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Anthropic Dispatch is the most serious attempt yet to move AI assistance from "work that lands on your desk" to "work that never reaches your desk."&lt;/p&gt;

&lt;p&gt;The limitations are real. Start with low-stakes tasks, verify outcomes carefully, and gradually expand the scope of what you delegate as you build confidence in the agent's judgment.&lt;/p&gt;

&lt;p&gt;But the direction is clear. The question was never &lt;em&gt;whether&lt;/em&gt; this kind of autonomous desktop agent would arrive. The question was &lt;em&gt;who&lt;/em&gt; would be the first to get the consumer experience right.&lt;/p&gt;

&lt;p&gt;Anthropic just made a strong argument that the answer is them.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://youtube.com/watch?v=3e7gmNPr5Vo" rel="noopener noreferrer"&gt;Nate B Jones — Anthropic Just Gave You 3 Tools That Work While You're Gone&lt;/a&gt; · &lt;a href="https://youtube.com/watch?v=3M7jTPLf86w" rel="noopener noreferrer"&gt;Matthew Berman — Best Models Tier List&lt;/a&gt; · &lt;a href="https://youtube.com/watch?v=i-jawzwnjSA" rel="noopener noreferrer"&gt;Chase AI — I STOLE a $100K Website in 15 Minutes with Claude Code&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv10hpn16rbusti3iqgxr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv10hpn16rbusti3iqgxr.png" alt="Anthropic Dispatch — AI desktop agent hero" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://www.agentconn.com/blog/anthropic-dispatch-ai-desktop-agent-review-2026/" rel="noopener noreferrer"&gt;Full article on AgentConn →&lt;/a&gt;&lt;/strong&gt; | Follow &lt;a href="https://x.com/ComputeLeapAI" rel="noopener noreferrer"&gt;@ComputeLeapAI&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>anthropic</category>
      <category>ai</category>
      <category>automation</category>
      <category>claude</category>
    </item>
  </channel>
</rss>
