<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Anton Abyzov</title>
    <description>The latest articles on DEV Community by Anton Abyzov (@aabyzov).</description>
    <link>https://dev.to/aabyzov</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2694434%2Fa10a50ee-2e9a-4199-acb5-06c1ed1559f4.png</url>
      <title>DEV Community: Anton Abyzov</title>
      <link>https://dev.to/aabyzov</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aabyzov"/>
    <language>en</language>
    <item>
      <title>Anthropic Just Validated Agent Teams: Why Specs Matter More Than Prompts</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Wed, 08 Apr 2026 01:06:15 +0000</pubDate>
      <link>https://dev.to/aabyzov/anthropic-just-validated-agent-teams-why-specs-matter-more-than-prompts-4p4l</link>
      <guid>https://dev.to/aabyzov/anthropic-just-validated-agent-teams-why-specs-matter-more-than-prompts-4p4l</guid>
      <description>&lt;h1&gt;
  
  
  Anthropic Just Validated Agent Teams: Why Specs Matter More Than Prompts
&lt;/h1&gt;

&lt;p&gt;Today Anthropic showed the slide that matters most for the next phase of agentic software: &lt;strong&gt;Agent Teams Revisited&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not because it is flashy, but because it makes the shift explicit:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;subagents&lt;/li&gt;
&lt;li&gt;Claude Code communicating with them&lt;/li&gt;
&lt;li&gt;coordination as a first-class capability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is the real story.&lt;/p&gt;

&lt;p&gt;Most people still think in terms of one super prompt and one super assistant.&lt;br&gt;
I think the future is much closer to an organization:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;one lead agent&lt;/li&gt;
&lt;li&gt;multiple specialists&lt;/li&gt;
&lt;li&gt;a shared spec&lt;/li&gt;
&lt;li&gt;verification before execution&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Prompts give output, specs give alignment
&lt;/h2&gt;

&lt;p&gt;A prompt is good for producing a result.&lt;br&gt;
A spec is better for aligning a system.&lt;/p&gt;

&lt;p&gt;Once you want multiple agents to work together, alignment matters more than cleverness.&lt;br&gt;
Without structure, multi-agent workflows become multi-chaos.&lt;br&gt;
With structure, they become leverage.&lt;/p&gt;

&lt;p&gt;That is why I am bullish on spec-first orchestration.&lt;/p&gt;

&lt;p&gt;The workflow I want is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spec -&amp;gt; /sw:team-lead -&amp;gt; specialists -&amp;gt; verification
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The lead agent should not improvise from vague intent.&lt;br&gt;
It should have enough structure to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;understand the target outcome&lt;/li&gt;
&lt;li&gt;break work into tasks&lt;/li&gt;
&lt;li&gt;delegate to specialists&lt;/li&gt;
&lt;li&gt;review what came back&lt;/li&gt;
&lt;li&gt;return a clean execution path&lt;/li&gt;
&lt;/ol&gt;
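
&lt;p&gt;As a rough illustration, the structure handed to the lead agent might look like this (the fields and task assignments are purely illustrative, not any tool's actual schema):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;outcome: migrate the auth service to the new token format
tasks:
  - inventory current token usages   -&amp;gt; code-search specialist
  - draft the migration plan         -&amp;gt; architecture specialist
  - implement changes plus tests     -&amp;gt; implementation specialist
verify:
  - all existing auth tests pass
  - no endpoint still accepts the old format
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;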

&lt;h2&gt;
  
  
  Why Anthropic's slide matters
&lt;/h2&gt;

&lt;p&gt;When a frontier lab shows subagents and Claude Code communicating with each other, it validates a broader direction:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;better models will matter&lt;/li&gt;
&lt;li&gt;but orchestration will matter just as much&lt;/li&gt;
&lt;li&gt;the next moat is not only intelligence but coordination&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A lot of the value will move into the workflow layer around the model.&lt;br&gt;
That includes how work is specified, delegated, verified, and merged.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I am building toward
&lt;/h2&gt;

&lt;p&gt;This is the direction I have been pushing with &lt;strong&gt;SpecWeave&lt;/strong&gt; and the &lt;strong&gt;Verified Skill&lt;/strong&gt; layer.&lt;/p&gt;

&lt;p&gt;Both are &lt;strong&gt;FREE and OPEN SOURCE&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://specweave.com" rel="noopener noreferrer"&gt;https://specweave.com&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://verified-skill.com" rel="noopener noreferrer"&gt;https://verified-skill.com&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My view is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompts are not enough&lt;/li&gt;
&lt;li&gt;teams need specs&lt;/li&gt;
&lt;li&gt;agent teams need a lead&lt;/li&gt;
&lt;li&gt;execution needs verification&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Anthropic showed the destination.&lt;br&gt;
Now the real race is building the best execution layer around it.&lt;/p&gt;

&lt;p&gt;If you are building in this space, I would pay very close attention to that shift.&lt;/p&gt;

</description>
      <category>opensource</category>
    </item>
    <item>
      <title>Project Glasswing changes the AI security conversation</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Tue, 07 Apr 2026 19:04:33 +0000</pubDate>
      <link>https://dev.to/aabyzov/project-glasswing-changes-the-ai-security-conversation-2d08</link>
      <guid>https://dev.to/aabyzov/project-glasswing-changes-the-ai-security-conversation-2d08</guid>
      <description>&lt;h1&gt;
  
  
  Project Glasswing changes the AI security conversation
&lt;/h1&gt;

&lt;p&gt;Anthropic’s Project Glasswing is one of the clearest signals yet that frontier AI has crossed from “helpful coding assistant” into something much more consequential: autonomous vulnerability discovery.&lt;/p&gt;

&lt;p&gt;According to Anthropic, Claude Mythos Preview found thousands of high-severity vulnerabilities, including in every major operating system and web browser. More importantly, the company says many of these findings, and some related exploit paths, were discovered autonomously.&lt;/p&gt;

&lt;p&gt;If those claims hold up, the conversation around AI and software security has changed.&lt;/p&gt;

&lt;p&gt;For the past couple of years, most discussions about AI in software have focused on productivity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;faster coding&lt;/li&gt;
&lt;li&gt;better debugging&lt;/li&gt;
&lt;li&gt;easier refactoring&lt;/li&gt;
&lt;li&gt;more capable agentic workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Project Glasswing points to the next phase.&lt;/p&gt;

&lt;p&gt;The question is no longer just whether AI can help engineers write software faster. It is whether frontier models can become first-class actors in finding and fixing vulnerabilities across critical infrastructure before attackers use similar capabilities offensively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this announcement matters
&lt;/h2&gt;

&lt;p&gt;Three things make this announcement different.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Anthropic is explicitly restricting the model
&lt;/h3&gt;

&lt;p&gt;Anthropic is not broadly releasing Claude Mythos Preview. That alone says a lot. Companies do not usually frame their own model as too dangerous for wide deployment unless they believe the capability jump is material.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The partner list is unusually serious
&lt;/h3&gt;

&lt;p&gt;AWS, Apple, Google, Microsoft, Cisco, CrowdStrike, the Linux Foundation, NVIDIA, Palo Alto Networks, and JPMorganChase are not participating for PR theater. That coalition signals the industry believes AI-driven vulnerability discovery is becoming strategically important.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Open source is central to the story
&lt;/h3&gt;

&lt;p&gt;Anthropic paired the model-access announcement with usage credits and donations for open-source security organizations. That matters because critical infrastructure increasingly depends on open-source components, while maintainers are often stretched thin.&lt;/p&gt;

&lt;h2&gt;
  
  
  My bigger takeaway: the AI skills supply chain now matters
&lt;/h2&gt;

&lt;p&gt;The most interesting second-order effect of Project Glasswing is not just about model safety. It is about trust in the systems that surround these models.&lt;/p&gt;

&lt;p&gt;If AI agents are increasingly writing code, reviewing code, testing software, and securing infrastructure, then we need much better provenance and verification across the entire execution stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;which skills the agent can use&lt;/li&gt;
&lt;li&gt;who authored them&lt;/li&gt;
&lt;li&gt;what they are allowed to do&lt;/li&gt;
&lt;li&gt;how they are versioned&lt;/li&gt;
&lt;li&gt;how they are audited&lt;/li&gt;
&lt;li&gt;how teams can trust them in production&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is exactly why I think the AI skills supply chain is about to become a major category.&lt;/p&gt;

&lt;p&gt;It is also why I care about verified-skill.com, a FREE and OPEN SOURCE registry for verified AI skills. If we want agentic systems to operate safely in real environments, we need trusted building blocks around them, not just more powerful frontier models.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real race
&lt;/h2&gt;

&lt;p&gt;Project Glasswing also makes the central strategic question painfully clear:&lt;br&gt;
Who gets these capabilities first at scale, defenders or attackers?&lt;/p&gt;

&lt;p&gt;Anthropic’s answer is to give defenders a head start. That is rational. But it also suggests a deeper truth: once AI reaches this level of cyber capability, trust, governance, disclosure, patching workflows, and skill-level controls become just as important as raw model intelligence.&lt;/p&gt;

&lt;p&gt;The next era of software security will not be defined only by smarter models.&lt;br&gt;
It will be defined by whether we can build trustworthy systems around them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing thought
&lt;/h2&gt;

&lt;p&gt;Project Glasswing may be remembered as the moment the industry stopped thinking about AI security as a side topic and started treating it as foundational infrastructure.&lt;/p&gt;

&lt;p&gt;Smarter agents are coming whether we are ready or not.&lt;br&gt;
The real work now is making them trustworthy.&lt;/p&gt;

</description>
      <category>opensource</category>
    </item>
    <item>
      <title>Claude Code UltraPlan: why the workflow matters more than the hype</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Tue, 07 Apr 2026 05:30:18 +0000</pubDate>
      <link>https://dev.to/aabyzov/claude-code-ultraplan-why-the-workflow-matters-more-than-the-hype-3p2n</link>
      <guid>https://dev.to/aabyzov/claude-code-ultraplan-why-the-workflow-matters-more-than-the-hype-3p2n</guid>
      <description>&lt;p&gt;Claude Code’s new UltraPlan is getting a lot of “smarter planning” attention.&lt;/p&gt;

&lt;p&gt;I think that framing misses the real product shift.&lt;/p&gt;

&lt;p&gt;UltraPlan looks more important as a &lt;em&gt;workflow upgrade&lt;/em&gt; than as a pure intelligence upgrade.&lt;/p&gt;

&lt;h2&gt;
  
  
  What UltraPlan officially changes
&lt;/h2&gt;

&lt;p&gt;From the official docs, the basic loop is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;start planning from the terminal with &lt;code&gt;/ultraplan&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Claude drafts the plan in the cloud&lt;/li&gt;
&lt;li&gt;you review it in the browser&lt;/li&gt;
&lt;li&gt;you can leave inline comments and reactions&lt;/li&gt;
&lt;li&gt;then you either execute in the cloud or teleport the plan back to your terminal&lt;/li&gt;
&lt;/ul&gt;
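
&lt;p&gt;The loop above can be sketched as a single flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/ultraplan (terminal) -&amp;gt; plan drafted in cloud -&amp;gt; review + inline comments (browser)
    -&amp;gt; execute in cloud | teleport plan back to terminal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;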

&lt;p&gt;That sounds simple, but it changes where planning lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real value: terminal → cloud → review → execution
&lt;/h2&gt;

&lt;p&gt;Most people focus on whether the plan itself is better.&lt;/p&gt;

&lt;p&gt;But in practice, planning is often limited by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how easy it is to review&lt;/li&gt;
&lt;li&gt;how easy it is to revise&lt;/li&gt;
&lt;li&gt;how much it blocks your local workflow&lt;/li&gt;
&lt;li&gt;how cleanly it hands off into execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;UltraPlan improves all four.&lt;/p&gt;

&lt;p&gt;Your terminal stays free.&lt;br&gt;
You get a better review surface.&lt;br&gt;
You can comment on specific parts of the plan.&lt;br&gt;
And you can choose whether execution stays remote or comes back local.&lt;/p&gt;

&lt;p&gt;That is a meaningful improvement in engineering workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where UltraPlan looks stronger
&lt;/h2&gt;

&lt;p&gt;From the transcript I reviewed, a few things stood out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it looked roughly 2x faster than local planning across repeated runs&lt;/li&gt;
&lt;li&gt;in some migration-style tasks, it seemed better at auditing blast radius and risk&lt;/li&gt;
&lt;li&gt;it looked better suited for multitasking because you can fire off plans and review them asynchronously&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is real value, especially for people working across multiple code changes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the hype breaks down
&lt;/h2&gt;

&lt;p&gt;The same transcript also showed something important:&lt;/p&gt;

&lt;p&gt;UltraPlan did &lt;strong&gt;not&lt;/strong&gt; look consistently smarter than local planning.&lt;/p&gt;

&lt;p&gt;In some tasks it looked stronger.&lt;br&gt;
In others it looked very similar to local planning, just with a much nicer review experience.&lt;/p&gt;

&lt;p&gt;That nuance matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I think this is bigger than one feature
&lt;/h2&gt;

&lt;p&gt;My current read is that UltraPlan may matter more as planning infrastructure than as one fixed planner.&lt;/p&gt;

&lt;p&gt;If Anthropic is using this cloud review loop to test and refine planning strategies over time, then the deeper story is not just a new slash command.&lt;/p&gt;

&lt;p&gt;It is a new control surface for planning quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  The other side of the problem: execution discipline
&lt;/h2&gt;

&lt;p&gt;There is also a separate question here:&lt;/p&gt;

&lt;p&gt;What happens after the plan?&lt;/p&gt;

&lt;p&gt;If your goal is deterministic, spec-first execution, that is where tools like SpecWeave are still important.&lt;/p&gt;

&lt;p&gt;SpecWeave is about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;spec&lt;/li&gt;
&lt;li&gt;plan&lt;/li&gt;
&lt;li&gt;tasks&lt;/li&gt;
&lt;li&gt;tracked execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is completely free and open source.&lt;/p&gt;

&lt;p&gt;That is a different layer of the workflow, but an important one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final takeaway
&lt;/h2&gt;

&lt;p&gt;My takeaway is simple:&lt;/p&gt;

&lt;p&gt;UltraPlan is not mainly interesting because it might generate a better plan.&lt;/p&gt;

&lt;p&gt;It is interesting because it turns planning into a cloud workflow with better review, better handoff, and better iteration speed.&lt;/p&gt;

&lt;p&gt;That may end up mattering more than people think.&lt;/p&gt;

</description>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>I Cancelled My $26,280/Year Cloud GPU Subscription - Here's Why</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Thu, 02 Apr 2026 00:15:45 +0000</pubDate>
      <link>https://dev.to/aabyzov/i-cancelled-my-26280year-cloud-gpu-subscription-heres-why-5bg2</link>
      <guid>https://dev.to/aabyzov/i-cancelled-my-26280year-cloud-gpu-subscription-heres-why-5bg2</guid>
      <description>&lt;p&gt;Last week I ran &lt;code&gt;nvidia-smi&lt;/code&gt; on my MacBook Pro M4 Max.&lt;/p&gt;

&lt;p&gt;128GB unified memory. 7,168 CUDA cores. CUDA 12.8, running natively on Apple Silicon.&lt;/p&gt;

&lt;p&gt;Then I loaded a 70B parameter LLM. Full QLoRA finetune. On a laptop. From my couch.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Part Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;The H100 has 80GB of HBM3. The M4 Max has 128GB unified. The model that literally doesn't fit on a $40,000 datacenter GPU fits on a MacBook.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Math Nobody Does
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setup&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;H100 cloud&lt;/td&gt;
&lt;td&gt;730 hrs x $3/hr = $2,190/month = $26,280/year&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;M4 Max MacBook Pro&lt;/td&gt;
&lt;td&gt;$4,000 one-time&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Break-even: month 2. After that: pure savings.&lt;/p&gt;
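
&lt;p&gt;The break-even math, spelled out:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$3/hr x 730 hrs       = $2,190/month cloud spend
$4,000 / $2,190/month ≈ 1.83 months
=&amp;gt; the laptop pays for itself partway through month 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;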

&lt;h2&gt;
  
  
  Inference Performance
&lt;/h2&gt;

&lt;p&gt;The M4 Max's memory bandwidth (546 GB/s) gives me about 15 tok/s on a 70B model. Production-usable for most use cases.&lt;/p&gt;
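
&lt;p&gt;That 15 tok/s figure lines up with a simple bandwidth estimate (assuming roughly 4-bit quantization, where generating each token streams approximately the full weights once):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;70B params x 0.5 bytes (4-bit) ≈ 35 GB of weights
546 GB/s / 35 GB               ≈ 15.6 tok/s theoretical ceiling
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;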

&lt;h2&gt;
  
  
  The Real Shift
&lt;/h2&gt;

&lt;p&gt;Three years ago, finetuning a 70B model required a cluster. Now it requires a laptop and an afternoon.&lt;/p&gt;

&lt;p&gt;What's your current setup for ML work? Cloud or local?&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>apple</category>
      <category>gpu</category>
      <category>ai</category>
    </item>
    <item>
      <title>Your AI Skills Deserve More Than a GitHub Repo Nobody Finds</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Sun, 22 Mar 2026 00:01:54 +0000</pubDate>
      <link>https://dev.to/aabyzov/your-ai-skills-deserve-more-than-a-github-repo-nobody-finds-40p6</link>
      <guid>https://dev.to/aabyzov/your-ai-skills-deserve-more-than-a-github-repo-nobody-finds-40p6</guid>
      <description>&lt;p&gt;2.4 million. That's how many AI skills have been submitted to &lt;a href="https://verified-skill.com" rel="noopener noreferrer"&gt;verified-skill.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;It's a free, open source marketplace for AI agent skills across 39 platforms. Every submission gets AI intent analysis and a three-tier trust score (current average: 99.0). Over 107,000 skills are verified and discoverable right now.&lt;/p&gt;

&lt;p&gt;If you've built a skill for Claude, GPT, Gemini, or any other agent platform, submit it. Two minutes. Free. Your skill stops being invisible.&lt;/p&gt;

&lt;p&gt;Check it out → &lt;a href="https://verified-skill.com" rel="noopener noreferrer"&gt;https://verified-skill.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>agents</category>
      <category>webdev</category>
    </item>
    <item>
      <title>5 Months of Daily Shipping AI Developer Tools — Here's What Happened</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Sat, 14 Mar 2026 20:26:04 +0000</pubDate>
      <link>https://dev.to/aabyzov/5-months-of-daily-shipping-ai-developer-tools-heres-what-happened-3mcp</link>
      <guid>https://dev.to/aabyzov/5-months-of-daily-shipping-ai-developer-tools-heres-what-happened-3mcp</guid>
      <description>&lt;p&gt;5 months ago I made a commitment: build AI developer tools every single day.&lt;/p&gt;

&lt;p&gt;Not blog posts about AI. Not Twitter threads. Actual tools that developers download and use.&lt;/p&gt;

&lt;p&gt;Here's what 5 months of daily shipping looks like — and what I learned.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;vskill&lt;/strong&gt;: 6,273 weekly npm downloads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;specweave&lt;/strong&gt;: 88 GitHub stars (just started promoting)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0 days&lt;/strong&gt; off the keyboard&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;h3&gt;
  
  
  vskill — Verified AI Skills Registry
&lt;/h3&gt;

&lt;p&gt;The skill/plugin layer for AI coding agents (Claude, Cursor, Codex, etc). Think loadable behaviors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Run TDD cycle"&lt;/li&gt;
&lt;li&gt;"Generate E2E tests"&lt;/li&gt;
&lt;li&gt;"Review PR"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;6,000+ developers download it every week. Completely free and open source.&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://verified-skill.com" rel="noopener noreferrer"&gt;https://verified-skill.com&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  specweave — AI-First Spec-Driven Development
&lt;/h3&gt;

&lt;p&gt;The bigger vision: Write spec → AI generates tasks with BDD test plans → AI implements → tests gate every closure.&lt;/p&gt;

&lt;p&gt;No vibe coding. No drift. Every AI-generated change is verified by design.&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://spec-weave.com" rel="noopener noreferrer"&gt;https://spec-weave.com&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Lesson
&lt;/h2&gt;

&lt;p&gt;The developers winning with AI aren't prompting harder. They're building systems where &lt;strong&gt;AI correctness is guaranteed by design&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Specs + tests &amp;gt; better prompts.&lt;/p&gt;

&lt;p&gt;Both tools are completely &lt;strong&gt;free and open source&lt;/strong&gt;. Both are growing.&lt;/p&gt;

&lt;p&gt;What are you building with AI? I'm curious what approaches others are taking to make AI coding reliable.&lt;/p&gt;

</description>
      <category>buildinpublic</category>
    </item>
    <item>
      <title>Claude Code's /voice heard 'clot coat' when I said 'Claude Code' — Voice Tools for Developers Compared</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Fri, 13 Mar 2026 16:28:01 +0000</pubDate>
      <link>https://dev.to/aabyzov/claude-codes-voice-heard-clot-coat-when-i-said-claude-code-voice-tools-for-developers-2fd</link>
      <guid>https://dev.to/aabyzov/claude-codes-voice-heard-clot-coat-when-i-said-claude-code-voice-tools-for-developers-2fd</guid>
      <description>&lt;p&gt;Claude Code just shipped &lt;code&gt;/voice&lt;/code&gt; — voice input directly in the terminal.&lt;/p&gt;

&lt;p&gt;I tested it the moment it landed. Said "Claude Code." It transcribed "clot coat."&lt;/p&gt;

&lt;p&gt;Not great. But let's be fair about what &lt;code&gt;/voice&lt;/code&gt; actually is today.&lt;/p&gt;

&lt;h2&gt;
  
  
  What /voice does
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Voice input only&lt;/strong&gt; — you speak, it transcribes to text, Claude responds in text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Terminal CLI only&lt;/strong&gt; — no VSCode support yet&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No voice output&lt;/strong&gt; — Claude doesn't speak back&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No vocabulary learning&lt;/strong&gt; — it will keep getting "Claude Code" wrong&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point is the real issue for daily use.&lt;/p&gt;

&lt;h2&gt;
  
  
  The comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;/voice&lt;/th&gt;
&lt;th&gt;ElevenLabs&lt;/th&gt;
&lt;th&gt;Wispr Flow&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Voice input&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Voice output (TTS)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Works in terminal&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Works in VSCode&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Works everywhere on Mac&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vocabulary learning&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Why vocabulary learning matters
&lt;/h2&gt;

&lt;p&gt;Wispr Flow remembers every correction. Fix "Claude Code" once, it's correct forever. Same for your project names, framework abbreviations, and technical jargon.&lt;/p&gt;

&lt;p&gt;It works in every input field on Mac — Slack, browser, terminal, editors, everything. And it gets smarter on YOUR vocabulary with every use.&lt;/p&gt;

&lt;h2&gt;
  
  
  The verdict
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;/voice&lt;/code&gt; is a promising v1. If you live entirely in the terminal and don't mind re-correcting the same words, it works.&lt;/p&gt;

&lt;p&gt;But for daily developer workflows on Mac, Wispr Flow is still in a different category. The vocabulary learning alone makes it irreplaceable.&lt;/p&gt;

&lt;p&gt;Has anyone found a workflow where &lt;code&gt;/voice&lt;/code&gt; actually outperforms dedicated tools? I'm genuinely curious.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>claudecode</category>
      <category>devtools</category>
    </item>
    <item>
      <title>Anthropic's Paid Code Reviews vs Free Multi-Agent Reviews with SpecWeave</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Wed, 11 Mar 2026 04:39:11 +0000</pubDate>
      <link>https://dev.to/aabyzov/anthropics-paid-code-reviews-vs-free-multi-agent-reviews-with-specweave-475e</link>
      <guid>https://dev.to/aabyzov/anthropics-paid-code-reviews-vs-free-multi-agent-reviews-with-specweave-475e</guid>
      <description>&lt;p&gt;Anthropic just announced paid code reviews in Claude Code. $15-$25 per review. Can't use your Pro plan. Can't use Max.&lt;/p&gt;

&lt;p&gt;But here's what most developers don't realize: &lt;strong&gt;Claude Code's CLI already supports code review locally. For free.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Single-Repo Problem
&lt;/h2&gt;

&lt;p&gt;The bigger issue with their paid review? It only analyzes &lt;strong&gt;one repository at a time&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you're running microservices, a change in your API gateway could break your payment service. Their review will never see that. It only looks at one repo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Multi-Agent Code Reviews with SpecWeave
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://spec-weave.com" rel="noopener noreferrer"&gt;SpecWeave&lt;/a&gt; (completely free and open source, available on &lt;a href="https://verified-skill.com" rel="noopener noreferrer"&gt;verified-skill.com&lt;/a&gt;) takes a different approach. One command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/sw:team-lead [PR-URL] "thoroughly review this PR"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This spins up &lt;strong&gt;3 parallel AI agents&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Security reviewer&lt;/strong&gt; — vulnerability analysis, auth issues, injection risks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Logic reviewer&lt;/strong&gt; — business logic errors, edge cases, race conditions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Architecture reviewer&lt;/strong&gt; — design patterns, coupling, scalability concerns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All three review across your &lt;strong&gt;entire codebase&lt;/strong&gt; — not just one repo. All your microservices. All your shared libraries. The full picture.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Results
&lt;/h2&gt;

&lt;p&gt;Each agent produces independent findings that get coordinated by the team-lead agent into a unified review with severity ratings (critical, high, medium, low).&lt;/p&gt;

&lt;p&gt;The whole thing runs locally using Claude Code under the hood. No API fees. No per-review charges. Just your existing subscription.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://spec-weave.com" rel="noopener noreferrer"&gt;spec-weave.com&lt;/a&gt; — the SpecWeave project&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://verified-skill.com" rel="noopener noreferrer"&gt;verified-skill.com&lt;/a&gt; — free, open source skill registry&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Install it in 2 minutes. One command. Three reviewers. Full codebase visibility.&lt;/p&gt;

</description>
      <category>devtools</category>
    </item>
    <item>
      <title>Claude Opus 4.6 Found 22 Firefox Vulnerabilities in 2 Weeks — AI Security Just Got Real</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Mon, 09 Mar 2026 13:26:34 +0000</pubDate>
      <link>https://dev.to/aabyzov/claude-opus-46-found-22-firefox-vulnerabilities-in-2-weeks-ai-security-just-got-real-52kl</link>
      <guid>https://dev.to/aabyzov/claude-opus-46-found-22-firefox-vulnerabilities-in-2-weeks-ai-security-just-got-real-52kl</guid>
      <description>&lt;p&gt;Anthropic's Claude Opus 4.6 just discovered 22 new security vulnerabilities in Firefox — 14 of them high-severity — in just two weeks of automated scanning.&lt;/p&gt;

&lt;p&gt;One use-after-free bug was found in 20 minutes of exploration. These weren't theoretical — they were real bugs patched in Firefox 148.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;22&lt;/strong&gt; new vulnerabilities discovered&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;14&lt;/strong&gt; high-severity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;6,000&lt;/strong&gt; C++ files scanned&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;20 minutes&lt;/strong&gt; to find one critical use-after-free bug&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2&lt;/strong&gt; successful exploits out of hundreds of attempts&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Dual-Use Problem
&lt;/h2&gt;

&lt;p&gt;Here's what makes this both exciting and concerning: the same AI capability that finds bugs defensively can be weaponized offensively.&lt;/p&gt;

&lt;p&gt;Right now, AI appears to be a better defender than attacker — Claude could find bugs but only successfully wrote 2 exploits out of several hundred attempts. But that capability gap won't last forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means
&lt;/h2&gt;

&lt;p&gt;If you're in security, this changes your threat model. AI-assisted vulnerability discovery at scale means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Defenders get superpowers&lt;/strong&gt; — codebases can be audited at unprecedented speed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attackers get the same tools&lt;/strong&gt; — zero-day discovery becomes faster and cheaper&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verification becomes critical&lt;/strong&gt; — we need to verify AI skills, agents, and tools before they touch production systems&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is exactly the problem I'm working on at &lt;a href="https://verified-skill.com" rel="noopener noreferrer"&gt;verified-skill.com&lt;/a&gt; — building verification infrastructure for AI agent skills before they can execute on your system.&lt;/p&gt;

&lt;p&gt;The AI security arms race isn't coming. It's here.&lt;/p&gt;

</description>
      <category>firefox</category>
    </item>
    <item>
      <title>Three AI Stories Dropped in 24 Hours. Almost No One Is Connecting Them.</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Fri, 06 Mar 2026 21:54:01 +0000</pubDate>
      <link>https://dev.to/aabyzov/three-ai-stories-dropped-in-24-hours-almost-no-one-is-connecting-them-4fk8</link>
      <guid>https://dev.to/aabyzov/three-ai-stories-dropped-in-24-hours-almost-no-one-is-connecting-them-4fk8</guid>
      <description>&lt;p&gt;Yesterday was arguably the most important day in AI this year. Not because of any single announcement — but because of three that landed simultaneously.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. OpenAI dropped GPT-5.4
&lt;/h2&gt;

&lt;p&gt;Native computer use. 1 million token context window. 33% fewer hallucinations vs GPT-5.2. Three models at once: GPT-5.4 Instant, GPT-5.4 Thinking, GPT-5.4 Pro.&lt;/p&gt;

&lt;p&gt;This is their most capable release ever. The message is clear: raw, unrestricted capability, shipped as fast as possible.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://openai.com/index/introducing-gpt-5-4/" rel="noopener noreferrer"&gt;Source: OpenAI announcement&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Pentagon officially labeled Anthropic a supply chain risk
&lt;/h2&gt;

&lt;p&gt;Effective immediately. Anthropic is now the &lt;strong&gt;first American company ever&lt;/strong&gt; to receive this designation, which has traditionally been reserved for foreign adversaries like Huawei or Kaspersky.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The reason?&lt;/strong&gt; Anthropic refused to let Claude be used for mass surveillance of American citizens or autonomous weapons systems. Defense Secretary Hegseth announced it publicly.&lt;/p&gt;

&lt;p&gt;Read that again: a company built ethical guardrails into its AI. The U.S. Department of Defense labeled them a supply chain risk for it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://techcrunch.com/2026/03/05/its-official-the-pentagon-has-labeled-anthropic-a-supply-chain-risk/" rel="noopener noreferrer"&gt;Source: TechCrunch&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Claude Code brought back "ultrathink"
&lt;/h2&gt;

&lt;p&gt;After Anthropic deprecated the &lt;code&gt;ultrathink&lt;/code&gt; keyword in January, users noticed quality degradation in complex coding tasks. A GitHub issue was filed. Community pressure mounted. The feature was restored in the latest update.&lt;/p&gt;

&lt;p&gt;This is a small story, but it matters: users still have power when they speak up.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/anthropics/claude-code/issues/19098" rel="noopener noreferrer"&gt;Source: GitHub issue #19098&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why these three stories matter together
&lt;/h2&gt;

&lt;p&gt;On the same day, we saw:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pure capability&lt;/strong&gt; being shipped at maximum speed (GPT-5.4)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A company getting punished&lt;/strong&gt; by the government for setting ethical guardrails (Anthropic)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Users successfully demanding quality&lt;/strong&gt; from their tools (ultrathink)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI industry just hit a genuine fork in the road:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Build everything, ask questions later.&lt;br&gt;
Or build responsibly, even when it costs you.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  My take
&lt;/h2&gt;

&lt;p&gt;I build developer tools on Claude Code every day. These models power real production work for me and thousands of others.&lt;/p&gt;

&lt;p&gt;This week forced me to think harder about the stack I depend on. Not just which model is fastest or cheapest — but which company's values align with how I want AI to be built.&lt;/p&gt;

&lt;p&gt;Both paths lead somewhere. The question is where.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What do you think — should AI companies have the right to set ethical guardrails on military use of their products? I'd genuinely love to hear your perspective in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>ethics</category>
      <category>openai</category>
      <category>programming</category>
    </item>
    <item>
      <title>Hackers Jailbroke Claude to Steal 195M Mexican Taxpayer Records — Why AI Security Needs Layers</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Fri, 06 Mar 2026 02:45:20 +0000</pubDate>
      <link>https://dev.to/aabyzov/hackers-jailbroke-claude-to-steal-195m-mexican-taxpayer-records-why-ai-security-needs-layers-3dpk</link>
      <guid>https://dev.to/aabyzov/hackers-jailbroke-claude-to-steal-195m-mexican-taxpayer-records-why-ai-security-needs-layers-3dpk</guid>
      <description>&lt;p&gt;Hackers just jailbroke Claude with 1,000+ prompts and stole 195 million Mexican taxpayer records. The AI initially refused. They kept pushing until it didn't.&lt;/p&gt;

&lt;p&gt;This is exactly why we built &lt;a href="https://github.com/openclaw/openclaw" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt; with strict guardrails and audit trails. AI agents that touch real systems need real security. Not just "please don't hack things" in the system prompt.&lt;/p&gt;

&lt;p&gt;The cost of a sophisticated attack just dropped to near zero. If your AI tools don't have layered defenses, you're already behind.&lt;/p&gt;
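
&lt;p&gt;To make "layered defenses" concrete, here is a minimal Python sketch: an allowlist layer, an argument-check layer, and an audit trail. The names and rules are hypothetical, not OpenClaw's actual code:&lt;/p&gt;

```python
import time

# Illustrative sketch only: three defensive layers in front of an agent
# tool call. Names and rules here are hypothetical, not OpenClaw's code.

ALLOWED_TOOLS = {"read_file", "list_dir"}  # layer 1: allowlist
AUDIT_LOG = []                             # layer 3: audit trail

def guarded_call(tool, args):
    entry = {"ts": time.time(), "tool": tool, "args": dict(args)}
    if tool not in ALLOWED_TOOLS:
        # Layer 1: deny anything outside the allowlist.
        entry["verdict"] = "denied:not_allowlisted"
        AUDIT_LOG.append(entry)
        return None
    if any(".." in str(v) for v in args.values()):
        # Layer 2: reject suspicious arguments such as path traversal.
        entry["verdict"] = "denied:bad_args"
        AUDIT_LOG.append(entry)
        return None
    # Layer 3: every decision, allowed or denied, lands in the audit log.
    entry["verdict"] = "allowed"
    AUDIT_LOG.append(entry)
    return "executing " + tool
```

&lt;p&gt;Even a toy version makes the point: the refusal logic and the evidence trail live outside the model, where a jailbreak prompt can't rewrite them.&lt;/p&gt;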

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A cybercrime group used 1,000+ jailbreak prompts to bypass Claude's safety guardrails&lt;/li&gt;
&lt;li&gt;They compromised 9 Mexican government systems, stealing 150GB of data&lt;/li&gt;
&lt;li&gt;195 million identities were exposed, including tax records, vehicle registrations, and birth certificates&lt;/li&gt;
&lt;li&gt;Anthropic banned the accounts, but the damage was done&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Source: &lt;a href="https://www.latimes.com/business/story/2026-03-05/how-our-ai-bots-are-ignoring-their-programming-giving-hackers-superpowers" rel="noopener noreferrer"&gt;LA Times&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>cybersecurity</category>
      <category>claude</category>
    </item>
    <item>
      <title>Claude AI: #1 on App Store, 99.74% Government Uptime, and the Iran-Hormuz Crisis</title>
      <dc:creator>Anton Abyzov</dc:creator>
      <pubDate>Tue, 03 Mar 2026 21:10:42 +0000</pubDate>
      <link>https://dev.to/aabyzov/claude-ai-1-on-app-store-9974-government-uptime-and-the-iran-hormuz-crisis-121k</link>
      <guid>https://dev.to/aabyzov/claude-ai-1-on-app-store-9974-government-uptime-and-the-iran-hormuz-crisis-121k</guid>
      <description>&lt;p&gt;The same AI that hit #1 on the App Store today was used to plan military strikes yesterday.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Claude AI&lt;/strong&gt; by Anthropic is now the #1 free app on Apple's US App Store&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ChatGPT uninstalls&lt;/strong&gt; jumped 295% after OpenAI announced its Pentagon partnership&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude for Government&lt;/strong&gt; shows 99.74% uptime on status.claude.com&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;94% of ship traffic&lt;/strong&gt; through the Strait of Hormuz has stopped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Oil prices&lt;/strong&gt; are surging&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Happened
&lt;/h2&gt;

&lt;p&gt;OpenAI signed a classified Pentagon deal. Anthropic refused.&lt;/p&gt;

&lt;p&gt;Trump banned Anthropic's tech. Hours later, the military used Claude for intelligence assessments during Operation Epic Fury against Iran.&lt;/p&gt;

&lt;p&gt;Today, Trump offered political risk insurance for ships transiting the Strait of Hormuz.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Irony
&lt;/h2&gt;

&lt;p&gt;The same AI millions are downloading to write emails is reportedly running military operations. Anthropic refused the Pentagon contract on ethical grounds. The military used it anyway.&lt;/p&gt;

&lt;p&gt;We're not in a sci-fi novel. This is Tuesday.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What do you think?&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
