<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sayed Ali Alkamel</title>
    <description>The latest articles on DEV Community by Sayed Ali Alkamel (@sayed_ali_alkamel).</description>
    <link>https://dev.to/sayed_ali_alkamel</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2652218%2F63a5dfd1-8229-48c1-85eb-54a58560297f.jpg</url>
      <title>DEV Community: Sayed Ali Alkamel</title>
      <link>https://dev.to/sayed_ali_alkamel</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sayed_ali_alkamel"/>
    <language>en</language>
    <item>
      <title>LLM Token Counting Explained: The 30-Year-Old Algorithm Quietly Running Your AI Bill</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Mon, 15 Jun 2026 16:41:00 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/llm-token-counting-explained-the-30-year-old-algorithm-quietly-running-your-ai-bill-2lfh</link>
      <guid>https://dev.to/sayed_ali_alkamel/llm-token-counting-explained-the-30-year-old-algorithm-quietly-running-your-ai-bill-2lfh</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TLDR:&lt;/strong&gt; A token is a subword chunk, typically 3 to 4 characters in English. The algorithm splitting your text into those chunks started as a file compression trick in 1994, was adapted for neural machine translation in 2016, and is now the billing unit for every major AI API. Each provider uses a different tokenizer. The same prompt can produce meaningfully different token counts across GPT, Claude, Gemini, and Llama. Understanding this is not optional if you are building production AI systems.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Where Tokens Actually Came From
&lt;/h2&gt;

&lt;p&gt;Most developers learn about tokens through an API bill. That is the wrong starting point.&lt;/p&gt;

&lt;p&gt;The algorithm behind modern LLM tokenization is called &lt;strong&gt;Byte Pair Encoding (BPE)&lt;/strong&gt;. Philip Gage introduced it in 1994 in a paper titled “A New Algorithm for Data Compression,” published in &lt;em&gt;The C Users Journal&lt;/em&gt;. The idea was simple: scan a binary file for the most frequent pair of adjacent bytes, replace that pair with a single unused byte, repeat until the file is as small as possible.&lt;/p&gt;

&lt;p&gt;That is it. A compression trick. Nothing to do with language models.&lt;/p&gt;

&lt;p&gt;In 2016, Rico Sennrich, Barry Haddow, and Alexandra Birch published “Neural Machine Translation of Rare Words with Subword Units.” Their core insight was that the same iterative merge logic that compressed bytes could compress characters into subwords for machine translation. Instead of shrinking a file, you were shrinking a vocabulary, and you could represent rare words without an “unknown token” fallback.&lt;/p&gt;

&lt;p&gt;That paper became the foundation of how every major LLM today handles text.&lt;/p&gt;




&lt;h2&gt;
  
  
  How BPE Tokenization Actually Works
&lt;/h2&gt;

&lt;p&gt;Start with individual characters. Find the most frequent pair. Merge them into one token. Repeat until you hit your target vocabulary size.&lt;/p&gt;

&lt;p&gt;The interactive animation below walks through this on a real example. Open it in any browser:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F547xv643r0876o77z7xk.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F547xv643r0876o77z7xk.gif" alt=" " width="660" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is the same process as a static diagram for reference:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8flokgwwcpp5fupmnbl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn8flokgwwcpp5fupmnbl.png" alt=" " width="766" height="3180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The key insight: common sequences earn their own token slot through frequency alone. No linguistic rules, no hand-crafted vocabulary. Just counting and merging, over and over.&lt;/p&gt;

&lt;p&gt;When GPT-2 shipped, OpenAI extended this to &lt;strong&gt;byte-level BPE&lt;/strong&gt;: the base vocabulary is all 256 possible bytes, which means no text is ever truly “unknown.” Any Unicode input can be represented as a sequence of existing tokens. That was a meaningful leap.&lt;/p&gt;

&lt;p&gt;The result is that common English words become single tokens. Rare words, code identifiers, and non-Latin scripts split into multiple tokens. The word “tokenization” is two tokens in OpenAI’s cl100k encoding. The Turkish word for “hello” (merhaba) is three.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Token Became the Unit of AI Economics
&lt;/h2&gt;

&lt;p&gt;Language models process text as a flat sequence of integers, one per token. The transformer’s attention mechanism scales quadratically with sequence length, which means longer sequences are exponentially more expensive to compute. Pricing in tokens is not arbitrary: it maps directly to compute cost.&lt;/p&gt;

&lt;p&gt;Context windows have grown from 512 tokens in early GPT-1 to 10 million tokens in models like Llama 4 Scout today. That is a 20,000x increase in working memory. But bigger windows are not free. A 1 million token context at Claude’s pricing costs more per call than most developers expect the first time they hit it.&lt;/p&gt;

&lt;p&gt;There is also a performance ceiling. A 2023 paper by Liu et al. established what practitioners now call the &lt;strong&gt;“lost in the middle” problem&lt;/strong&gt;: LLMs attend best to content at the start and end of a prompt. Information buried in the center of a long context is processed less reliably, even when it technically fits within the window. Context count is not the same as context quality.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Different Providers Count Differently
&lt;/h2&gt;

&lt;p&gt;Every major provider maintains its own tokenizer, trained on its own corpus, with its own vocabulary size and merge rules. The same sentence produces different token counts depending on where you send it.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Tokenizer&lt;/th&gt;
&lt;th&gt;Vocab Size&lt;/th&gt;
&lt;th&gt;Offline?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI (GPT-4o)&lt;/td&gt;
&lt;td&gt;tiktoken (o200k_base)&lt;/td&gt;
&lt;td&gt;200,000&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI (GPT-4)&lt;/td&gt;
&lt;td&gt;tiktoken (cl100k_base)&lt;/td&gt;
&lt;td&gt;100,256&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Meta (Llama 3)&lt;/td&gt;
&lt;td&gt;tiktoken-compatible&lt;/td&gt;
&lt;td&gt;128,000&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google (Gemini)&lt;/td&gt;
&lt;td&gt;SentencePiece&lt;/td&gt;
&lt;td&gt;~256,000&lt;/td&gt;
&lt;td&gt;API only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anthropic (Claude 3+)&lt;/td&gt;
&lt;td&gt;Proprietary BPE&lt;/td&gt;
&lt;td&gt;Undisclosed&lt;/td&gt;
&lt;td&gt;API only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mistral&lt;/td&gt;
&lt;td&gt;SentencePiece&lt;/td&gt;
&lt;td&gt;32,000&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The critical column is the last one. Anthropic does not ship a local tokenizer for Claude 3 and later models. The only way to get a ground-truth token count for Claude is through the &lt;code&gt;count_tokens&lt;/code&gt; API endpoint, which is free but requires a network call. When that is not suitable, tiktoken works as a close approximation with roughly 5 to 10 percent variance.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Rules That Actually Matter in Production
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Rule 1: 1 token is roughly 4 English characters or 0.75 words.&lt;/strong&gt;&lt;br&gt;
This is the universal approximation. Divide your word count by 0.75 to estimate tokens quickly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule 2: Non-English text multiplies your token budget.&lt;/strong&gt;&lt;br&gt;
Non-Latin scripts, especially Arabic, Chinese, and Thai, can produce 2 to 3x more tokens per word compared to English. If your product handles Arabic input and you priced it against English benchmarks, your cost model is wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule 3: Code tokenizes differently than prose.&lt;/strong&gt;&lt;br&gt;
Code identifiers, whitespace, and symbols split unpredictably. A 200-line Python file is not the same token budget as a 200-word paragraph.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule 4: Count the full message, not just the user prompt.&lt;/strong&gt;&lt;br&gt;
System prompts, tool schemas, function signatures, and conversation history all count against your context window and your bill. Developers who only count user input are usually undercounting by 20 to 40 percent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rule 5: Fit matters more than size.&lt;/strong&gt;&lt;br&gt;
A 1 million token window does not mean you should use 900,000 of it. Attention degrades at scale, latency grows, and cost grows linearly. The right context is the minimum that gives the model what it needs.&lt;/p&gt;


&lt;h2&gt;
  
  
  Counting Tokens in Code
&lt;/h2&gt;

&lt;p&gt;For OpenAI-compatible models, &lt;code&gt;tiktoken&lt;/code&gt; is the fastest option and works fully offline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tiktoken&lt;/span&gt;

&lt;span class="n"&gt;enc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tiktoken&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_encoding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cl100k_base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# GPT-4 and Claude approximation
&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Your prompt goes here&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;token_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token_count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Claude’s exact count (requires a network call, but it is free):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count_tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-opus-4-6&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Your prompt goes here&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_tokens&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a zero-dependency approximation that works in any language:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;estimate_tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# 4 characters per token is the standard English rule of thumb
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What This Means for How You Build
&lt;/h2&gt;

&lt;p&gt;Tokens are not a detail. They are the constraint that determines whether your RAG pipeline fits in a request, whether your agent loop runs for 5 turns or 50, and whether your API bill is $200 or $2,000 at scale.&lt;/p&gt;

&lt;p&gt;The 1994 compression algorithm that became BPE was designed to do one thing: make redundant patterns smaller. That is still exactly what it does inside every LLM. Text in, integer sequence out, attention over that sequence, text back out.&lt;/p&gt;

&lt;p&gt;Knowing the shape of that pipeline does not make you a researcher. It makes you a builder who does not get surprised by a $4,000 invoice.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Key references:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gage, P. (1994). “A New Algorithm for Data Compression.” &lt;em&gt;The C Users Journal.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Sennrich, R., Haddow, B., &amp;amp; Birch, A. (2016). “Neural Machine Translation of Rare Words with Subword Units.” &lt;em&gt;ACL 2016.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Liu, N., et al. (2023). “Lost in the Middle: How Language Models Use Long Contexts.” &lt;em&gt;arXiv:2307.03172.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;OpenAI tiktoken: &lt;a href="https://github.com/openai/tiktoken" rel="noopener noreferrer"&gt;github.com/openai/tiktoken&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic Token Counting API: &lt;a href="https://docs.anthropic.com/en/docs/build-with-claude/token-counting" rel="noopener noreferrer"&gt;docs.anthropic.com/en/docs/build-with-claude/token-counting&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Antigravity 2.0 for Flutter Developers: CLI, SDK &amp; Agentic Workflows That Actually Matter</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Sun, 14 Jun 2026 17:44:43 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/antigravity-20-for-flutter-developers-cli-sdk-agentic-workflows-that-actually-matter-231o</link>
      <guid>https://dev.to/sayed_ali_alkamel/antigravity-20-for-flutter-developers-cli-sdk-agentic-workflows-that-actually-matter-231o</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Antigravity 2.0 replaces Gemini CLI with a faster Go-based CLI (&lt;code&gt;agy&lt;/code&gt;), adds an SDK for custom agent workflows, and ships a Dart &amp;amp; Flutter MCP server that gives agents live context of your running app. The migration deadline is &lt;strong&gt;June 18, 2026&lt;/strong&gt;. The three things that matter most for Flutter devs: &lt;code&gt;AGENTS.md&lt;/code&gt;, Agentic Hot Reload, and the Stitch → MCP → Flutter pipeline.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Gemini CLI is being retired on June 18, 2026. If you're a Flutter developer who uses it, you're not just upgrading a tool; you're migrating to a different model of how AI fits into your workflow. Antigravity 2.0, announced at Google I/O 2026 alongside Flutter 3.44, is not a rename. It's four distinct surfaces where one VS Code fork used to be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Antigravity Desktop:&lt;/strong&gt; IDE-style editor with an integrated agent panel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Antigravity CLI (&lt;code&gt;agy&lt;/code&gt;):&lt;/strong&gt; terminal-first agent built in Go, fast and scriptable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Antigravity SDK:&lt;/strong&gt; programmatic API for custom agent workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Managed Agents:&lt;/strong&gt; cloud-isolated Linux environments spun up with a single API call&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This article focuses on the three things that give Flutter developers a real, practical edge, not the marketing overview.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. &lt;code&gt;AGENTS.md&lt;/code&gt;: Your Project's AI Constitution
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is &lt;code&gt;AGENTS.md&lt;/code&gt;?&lt;/strong&gt; It's a Markdown file you place at the root of your Flutter project. Antigravity reads it at the start of every agent session and uses it as a standing instruction set for the entire project. It replaces the habit of copy-pasting your architecture rules into every prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Project Agent Rules&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Architecture: Clean Architecture, feature-based folder structure
&lt;span class="p"&gt;-&lt;/span&gt; State management: Riverpod (prefer AsyncNotifier over FutureProvider)
&lt;span class="p"&gt;-&lt;/span&gt; Never hardcode colors, always use ThemeData tokens
&lt;span class="p"&gt;-&lt;/span&gt; Run &lt;span class="sb"&gt;`flutter analyze`&lt;/span&gt; before suggesting any code change
&lt;span class="p"&gt;-&lt;/span&gt; Target Flutter 3.44+, Dart 3.4+
&lt;span class="p"&gt;-&lt;/span&gt; Localization: use flutter_localizations, never raw strings in widgets
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drop this in version control. Every developer on your team, and every agent session, starts with the same context. Onboarding a new agent is now the same as onboarding a new developer.&lt;/p&gt;

&lt;p&gt;The file is also backward compatible: &lt;code&gt;GEMINI.md&lt;/code&gt; is still read, but &lt;code&gt;AGENTS.md&lt;/code&gt; takes priority when both exist.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Install the Antigravity CLI and Migrate Today
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is &lt;code&gt;agy&lt;/code&gt;?&lt;/strong&gt; It's the Antigravity CLI, a terminal-based agent built in Go that replaces &lt;code&gt;gemini&lt;/code&gt;. It's faster, ships with a proper TUI, and shares the same underlying model and plugin system as the desktop app.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# macOS / Linux&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://antigravity.google/install.sh | bash

&lt;span class="c"&gt;# Windows (PowerShell)&lt;/span&gt;
irm https://antigravity.google/install.ps1 | iex

&lt;span class="c"&gt;# Verify&lt;/span&gt;
agy &lt;span class="nt"&gt;--version&lt;/span&gt;  &lt;span class="c"&gt;# agy version 2.0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you have existing Gemini CLI config, migration is either automatic on first run (it detects and prompts) or manual:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agy plugin import gemini
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your Agent Skills, Hooks, Subagents, and Extensions (now called Antigravity Plugins) carry over. Global config lives in &lt;code&gt;~/.antigravity/&lt;/code&gt; going forward.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Enable the Dart &amp;amp; Flutter MCP Server
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is the Flutter MCP server?&lt;/strong&gt; MCP (Model Context Protocol) is a standard for giving an agent live, structured access to your development environment. The Dart &amp;amp; Flutter MCP server lets Antigravity read your widget tree, active routes, Dart analysis output, and running app state, without you pasting code into the chat.&lt;/p&gt;

&lt;p&gt;Install it inside Antigravity IDE:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open the Agent panel with &lt;code&gt;Cmd/Ctrl + L&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Search for &lt;strong&gt;Dart &amp;amp; Flutter MCP server&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Install and restart&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Once enabled, the agent can see what's in your running emulator and act on it. This is what makes Agentic Hot Reload work.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Agentic Hot Reload: Prompt-to-Emulator in Seconds
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is Agentic Hot Reload?&lt;/strong&gt; When Antigravity's agent modifies a Flutter file, it triggers Flutter's hot reload automatically. No manual &lt;code&gt;r&lt;/code&gt;. No switching windows. You describe a UI change in the Agent panel, the Dart file updates, and the emulator reflects it in place.&lt;/p&gt;

&lt;p&gt;Enable it by switching to &lt;strong&gt;Agent-driven mode&lt;/strong&gt; in Antigravity settings. In this mode the agent runs commands (including hot reload) without asking for per-step approval. For production projects, use &lt;strong&gt;Review-driven mode&lt;/strong&gt; instead, with the same capability but the agent pauses and asks before executing commands like &lt;code&gt;flutter pub add&lt;/code&gt; or file deletions.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. The Stitch → Antigravity → Flutter Pipeline
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How do you go from a design to a running Flutter app with Antigravity?&lt;/strong&gt; The workflow the Flutter community has been building around uses three tools in sequence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Design in Stitch&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Describe your screen with full context: platform (Android/iOS), design system (Material 3), accessibility requirements, screens, components. Stitch outputs a structured design artifact.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Connect via MCP&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Export the Stitch output through MCP connectors. This passes the design as structured context into Antigravity's agent, not as a screenshot but as typed design tokens and layout data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Prompt the agent with your architecture&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
In the Antigravity Agent tab:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Build the habit tracker screen from the Stitch export.
Follow Clean Architecture. Use Riverpod, Material 3 tokens, 
support light and dark mode. No hardcoded strings.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In 10–12 minutes you have runnable Flutter + Dart code. Then &lt;code&gt;flutter run&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to review manually:&lt;/strong&gt; Complex async state, platform channels, native plugin integrations, and anything touching sensitive permissions. The agent handles scaffolding well; business logic edge cases still need a human pass.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Antigravity SDK: Automate Your Flutter Agent Workflows
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is the Antigravity SDK for?&lt;/strong&gt; It gives you programmatic access to the same agent harness that powers the desktop and CLI, optimized for Gemini 3.5 Flash (currently the default model, 4x faster than the previous generation, higher benchmark scores). Use it to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run a Flutter lint-and-fix agent as a CI step before PR merge&lt;/li&gt;
&lt;li&gt;Build a custom agent that understands your design system tokens and enforces them across generated code&lt;/li&gt;
&lt;li&gt;Automate multi-step workflows: generate → test → fix → commit, with Managed Agents providing the isolated Linux environment for each run&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Managed Agents spin up a full isolated environment with a single API call and maintain persistent state across multi-turn sessions, useful for long-running tasks like migrating a large Flutter codebase to a new API.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pricing Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Plan&lt;/th&gt;
&lt;th&gt;Monthly&lt;/th&gt;
&lt;th&gt;Usage limit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Free (Individual)&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;Deprecating June 18, 2026&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google AI Pro&lt;/td&gt;
&lt;td&gt;~$20&lt;/td&gt;
&lt;td&gt;Standard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI Ultra&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$100&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5x Pro limits + Managed Agents&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Deadline:&lt;/strong&gt; Free Gemini Code Assist for individuals stops serving requests on &lt;strong&gt;June 18, 2026&lt;/strong&gt;. You need a plan or the CLI migration before that date.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What This Changes in Practice
&lt;/h2&gt;

&lt;p&gt;Antigravity 2.0 doesn't change what good Flutter code looks like. Clean Architecture is still Clean Architecture. Riverpod still behaves the same way. What changes is how much of the scaffolding and boilerplate you write yourself.&lt;/p&gt;

&lt;p&gt;The developers getting the most out of it treat &lt;code&gt;AGENTS.md&lt;/code&gt; seriously, use Review-driven mode on anything that touches production, and keep human review on complex state logic and native integrations. The agent is fast at the right things. Know what those things are, and you get the speed without the rework.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://docs.flutter.dev/ai/antigravity" rel="noopener noreferrer"&gt;Flutter docs — Antigravity&lt;/a&gt; · &lt;a href="https://docs.flutter.dev/ai/antigravity-cli" rel="noopener noreferrer"&gt;Flutter docs — Antigravity CLI&lt;/a&gt; · &lt;a href="https://developers.googleblog.com/an-important-update-transitioning-gemini-cli-to-antigravity-cli/" rel="noopener noreferrer"&gt;Google Developers Blog — Gemini CLI to Antigravity CLI transition&lt;/a&gt; · &lt;a href="https://techcrunch.com/2026/05/19/google-launches-antigravity-2-0-with-an-updated-desktop-app-and-cli-tool-at-io-2026/" rel="noopener noreferrer"&gt;TechCrunch — Google I/O 2026 coverage&lt;/a&gt; · &lt;a href="https://www.marktechpost.com/2026/05/19/google-launches-antigravity-2-0-at-i-o-2026-a-standalone-agent-first-platform-with-cli-sdk-managed-execution-and-enterprise-support/" rel="noopener noreferrer"&gt;MarkTechPost — Antigravity 2.0 full breakdown&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>flutter</category>
      <category>antigravity</category>
      <category>ai</category>
      <category>dart</category>
    </item>
    <item>
      <title>I Pointed a Skill Linter at a 52k-Star Repo. Here Is What 84/100 Looks Like.</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Sat, 13 Jun 2026 12:54:57 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/i-pointed-a-skill-linter-at-a-52k-star-repo-here-is-what-84100-looks-like-28cn</link>
      <guid>https://dev.to/sayed_ali_alkamel/i-pointed-a-skill-linter-at-a-52k-star-repo-here-is-what-84100-looks-like-28cn</guid>
      <description>&lt;p&gt;Every AI agent skill you write burns context on every turn.&lt;/p&gt;

&lt;p&gt;Not just when the skill is running. On every turn. The agent keeps each skill's name and description loaded permanently so it knows when to invoke them. A vague description is not just a documentation problem. It is a tax you pay per message, forever.&lt;/p&gt;

&lt;p&gt;That is the problem I built &lt;a href="https://pub.dev/packages/skillscore" rel="noopener noreferrer"&gt;skillscore&lt;/a&gt; to catch.&lt;/p&gt;

&lt;p&gt;When &lt;a href="https://github.com/addyosmani/agent-skills" rel="noopener noreferrer"&gt;addyosmani/agent-skills&lt;/a&gt; hit 52,000 stars and went to #1 trending on GitHub, I had my benchmark. 24 production-grade skills written by people who clearly know what they are doing. If a static linter has anything useful to say at this level, this is where to find out.&lt;/p&gt;

&lt;p&gt;So I ran it.&lt;/p&gt;




&lt;h2&gt;
  
  
  One command. 24 skills. Two seconds.
&lt;/h2&gt;

&lt;p&gt;This is what skillscore 0.2.0 can do now:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;skillscore /path/to/agent-skills/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One command scores everything in the tree. Here is the output:&lt;/p&gt;


  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fsayed3li97%2Fskillscore%2Fmain%2Fdocs%2Fassets%2Fmultipath-demo.gif" alt="Terminal recording: skillscore scores three agent-skills in one command showing 91/A, 88/B, and 77/C, then drills into the 77/C skill" width="800" height="586"&gt;


&lt;p&gt;&lt;em&gt;Three skills from addyosmani/agent-skills scored in one command, then a drill-down into the lowest scorer.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The full results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Skill&lt;/th&gt;
&lt;th&gt;Score&lt;/th&gt;
&lt;th&gt;Grade&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;spec-driven-development&lt;/td&gt;
&lt;td&gt;91&lt;/td&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;browser-testing-with-devtools&lt;/td&gt;
&lt;td&gt;91&lt;/td&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;deprecation-and-migration&lt;/td&gt;
&lt;td&gt;91&lt;/td&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;frontend-ui-engineering&lt;/td&gt;
&lt;td&gt;91&lt;/td&gt;
&lt;td&gt;A&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;test-driven-development&lt;/td&gt;
&lt;td&gt;88&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;code-review-and-quality&lt;/td&gt;
&lt;td&gt;88&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;interview-me&lt;/td&gt;
&lt;td&gt;86&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ci-cd-and-automation&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;code-simplification&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;context-engineering&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;documentation-and-adrs&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;incremental-implementation&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;security-and-hardening&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;shipping-and-launch&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;source-driven-development&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;using-agent-skills&lt;/td&gt;
&lt;td&gt;85&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;doubt-driven-development&lt;/td&gt;
&lt;td&gt;80&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;observability-and-instrumentation&lt;/td&gt;
&lt;td&gt;80&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;planning-and-task-breakdown&lt;/td&gt;
&lt;td&gt;80&lt;/td&gt;
&lt;td&gt;B&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;api-and-interface-design&lt;/td&gt;
&lt;td&gt;78&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;debugging-and-error-recovery&lt;/td&gt;
&lt;td&gt;77&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;git-workflow-and-versioning&lt;/td&gt;
&lt;td&gt;77&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;idea-refine&lt;/td&gt;
&lt;td&gt;77&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;performance-optimization&lt;/td&gt;
&lt;td&gt;77&lt;/td&gt;
&lt;td&gt;C&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Average: 84/100 (B)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To be clear: 84 across 24 production skills is excellent. No failures. No D grades. Most skill libraries I have tested do not get close to this. The instruction content inside these skills is genuinely good. What the linter found is at the edges, not in the core.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two gaps. Five skills. Every single C.
&lt;/h2&gt;

&lt;p&gt;I drilled into all five C-grade skills. The same two findings came up in every one of them.&lt;/p&gt;

&lt;h3&gt;
  
  
  The description does not say when to stop
&lt;/h3&gt;

&lt;p&gt;Every C-grade description says what the skill does. None says when not to use it.&lt;/p&gt;

&lt;p&gt;This matters because an agent with no stop condition will stretch a skill to cover loosely related requests. It invokes when it should not. It does not know where the boundary is because you never told it.&lt;/p&gt;

&lt;p&gt;The fix is one sentence at the end of the description:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Do not use when the codebase already has an established pattern for this."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is it. One sentence. The skill immediately becomes less likely to activate on the wrong request.&lt;/p&gt;

&lt;h3&gt;
  
  
  Terminal commands with no Safety section
&lt;/h3&gt;

&lt;p&gt;Several of the C-grade skills ship step-by-step terminal commands in the body. None of them has a &lt;code&gt;## Safety&lt;/code&gt; section.&lt;/p&gt;

&lt;p&gt;The Antigravity authoring guide requires any skill that runs commands to document what those commands touch, and what the agent must never run unattended. Without that section, the linter applies up to an 8-point penalty. The reason is practical: an agent executing undocumented commands has no signal about blast radius.&lt;/p&gt;

&lt;p&gt;Here is what a Safety section looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Safety&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Never run &lt;span class="sb"&gt;`git push --force`&lt;/span&gt; unattended. Confirm with the user first.
&lt;span class="p"&gt;-&lt;/span&gt; All destructive commands require explicit confirmation before execution.
&lt;span class="p"&gt;-&lt;/span&gt; Scripts in scripts/ are reviewed before running, never piped directly to sh.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Five lines. Eight points back.&lt;/p&gt;

&lt;p&gt;Add both of those things and every C-grade skill in this dataset moves to B or A territory. The instruction quality is already there. The metadata layer just needed these two signals.&lt;/p&gt;




&lt;h2&gt;
  
  
  Running it yourself
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
dart pub global activate skillscore

&lt;span class="c"&gt;# Score your skills&lt;/span&gt;
skillscore path/to/your-skills/

&lt;span class="c"&gt;# Gate CI (fail if any skill drops below 80)&lt;/span&gt;
skillscore skills/ &lt;span class="nt"&gt;--min-score&lt;/span&gt; 80 &lt;span class="nt"&gt;--format&lt;/span&gt; sarif
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--format sarif&lt;/code&gt; pipes findings into GitHub code scanning so they appear as inline annotations on pull requests. No more "I forgot to check the skill before merging."&lt;/p&gt;

&lt;p&gt;If a finding is unclear, &lt;code&gt;skillscore explain &amp;lt;rule-id&amp;gt;&lt;/code&gt; prints the full rationale and the guide it came from. Every output line includes the rule ID for exactly this reason.&lt;/p&gt;

&lt;p&gt;Fully offline. No API key. Deterministic. The same input always produces the same score, which is the only way to use something in a CI gate.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this experiment actually proves
&lt;/h2&gt;

&lt;p&gt;The gaps in the C-grade skills are invisible in normal review. If you read &lt;code&gt;performance-optimization&lt;/code&gt; cold you would probably call it good, because the instructions are good. A human reviewer is not going to flag the absence of a boundary clause or notice that the Safety section is missing. They are going to read the content and nod.&lt;/p&gt;

&lt;p&gt;A linter does not read. It checks. And what it found here is that the most common quality gap in real-world agent skills is not bad instructions. It is the two or three structural signals the agent uses to decide when and whether to invoke the skill at all.&lt;/p&gt;

&lt;p&gt;That is a solvable problem. Now you have a number for it.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Try it:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://pub.dev/packages/skillscore" rel="noopener noreferrer"&gt;pub.dev/packages/skillscore&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/sayed3li97/skillscore" rel="noopener noreferrer"&gt;github.com/sayed3li97/skillscore&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/addyosmani/agent-skills" rel="noopener noreferrer"&gt;addyosmani/agent-skills&lt;/a&gt; — the library used in this post&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Official authoring guides
&lt;/h2&gt;

&lt;p&gt;These are the primary sources skillscore's rules are drawn from. Each finding in the tool output cites one of them. Worth reading once if you author skills regularly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anthropic — Agent Skills: Best Practices&lt;/strong&gt;&lt;br&gt;
The canonical guide for Claude Code skills. Covers description quality (action verbs, when-clauses), conciseness (500-line body limit), progressive disclosure patterns, script documentation, and the overall structure of an effective SKILL.md.&lt;br&gt;
&lt;a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices" rel="noopener noreferrer"&gt;platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Antigravity — Authoring Antigravity Skills (Google Codelabs)&lt;/strong&gt;&lt;br&gt;
The official hands-on guide for Google Antigravity skills. Covers the Safety section requirement for skills that run commands, the boundary clause ("do not use when..."), and four levels of skill complexity from basic routing to procedural script execution.&lt;br&gt;
&lt;a href="https://codelabs.developers.google.com/getting-started-with-antigravity-skills" rel="noopener noreferrer"&gt;codelabs.developers.google.com/getting-started-with-antigravity-skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent Skills Open Standard — Specification&lt;/strong&gt;&lt;br&gt;
The format specification that all agents implement: Claude Code, Codex, Antigravity, Gemini CLI, Cursor. Defines frontmatter fields, directory structure, optional folders, and progressive disclosure principles.&lt;br&gt;
&lt;a href="https://agentskills.io/specification" rel="noopener noreferrer"&gt;agentskills.io/specification&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>devtools</category>
      <category>opensource</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>skillscore: a CLI that scores your AI agent's SKILL.md 0–100</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Fri, 12 Jun 2026 23:27:22 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/skillscore-a-cli-that-scores-your-ai-agents-skillmd-0-100-48l1</link>
      <guid>https://dev.to/sayed_ali_alkamel/skillscore-a-cli-that-scores-your-ai-agents-skillmd-0-100-48l1</guid>
      <description>&lt;p&gt;A vague AI agent skill is worse than no skill at all — because the agent pays for it in context budget on &lt;em&gt;every single turn&lt;/em&gt;, whether it uses it or not. Yet most of us write &lt;code&gt;SKILL.md&lt;/code&gt; files by feel and ship them with zero feedback.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;skillscore&lt;/strong&gt;: a command-line tool that statically analyzes any &lt;code&gt;SKILL.md&lt;/code&gt; and gives it a 0–100 quality score, a letter grade, and a list of fix-it findings — each one citing the official authoring guide it comes from.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;skillscore is an open-source Dart CLI that lints and scores AI agent skills (&lt;code&gt;SKILL.md&lt;/code&gt; files) against the Claude, Codex, and Antigravity authoring guides.&lt;/strong&gt; It runs fully offline, is deterministic, and exits with CI-friendly status codes.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🎯 Scores any &lt;code&gt;SKILL.md&lt;/code&gt; &lt;strong&gt;0–100&lt;/strong&gt; with a letter grade and actionable findings across 7 categories.&lt;/li&gt;
&lt;li&gt;📚 Rules are drawn from the &lt;strong&gt;official Anthropic (Claude), OpenAI (Codex), Google (Antigravity), and Flutter&lt;/strong&gt; skill-authoring guides — and every finding cites its source.&lt;/li&gt;
&lt;li&gt;🔌 &lt;strong&gt;Offline, deterministic, zero network calls.&lt;/strong&gt; Same input → same score, every time.&lt;/li&gt;
&lt;li&gt;🚦 Built for CI: &lt;code&gt;--min-score 80&lt;/code&gt;, JSON output, and &lt;strong&gt;SARIF 2.1.0&lt;/strong&gt; that annotates pull requests.&lt;/li&gt;
&lt;li&gt;⚡ Install: &lt;code&gt;dart pub global activate skillscore&lt;/code&gt; → &lt;a href="https://pub.dev/packages/skillscore" rel="noopener noreferrer"&gt;pub.dev/packages/skillscore&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why I built this
&lt;/h2&gt;

&lt;p&gt;AI agent skills are quietly becoming a standard. A &lt;em&gt;skill&lt;/em&gt; is just a folder with a &lt;code&gt;SKILL.md&lt;/code&gt; — YAML frontmatter (a &lt;code&gt;name&lt;/code&gt; and a &lt;code&gt;description&lt;/code&gt;) plus a Markdown body of instructions — and optional &lt;code&gt;references/&lt;/code&gt;, &lt;code&gt;examples/&lt;/code&gt;, &lt;code&gt;scripts/&lt;/code&gt;, and &lt;code&gt;assets/&lt;/code&gt; subfolders. Claude Code, Codex, Antigravity, Gemini CLI, and Cursor all read the same format.&lt;/p&gt;

&lt;p&gt;Here's the catch that most people miss: &lt;strong&gt;an agent keeps every skill's &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; in its context window permanently&lt;/strong&gt;, so it can decide when to reach for one. A skill with a fuzzy description doesn't just fail to get used — it taxes every prompt and occasionally fires on the wrong request.&lt;/p&gt;

&lt;p&gt;The vendors all published authoring guides telling you how to avoid this: front-load triggers, write in the third person, state when &lt;em&gt;not&lt;/em&gt; to use the skill, keep the body short, document your scripts. Good advice — scattered across four different documents, none of them enforceable. There was no &lt;code&gt;eslint&lt;/code&gt; for skills. So I wrote one.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is skillscore?
&lt;/h2&gt;

&lt;p&gt;skillscore is a &lt;strong&gt;skill linter and &lt;code&gt;SKILL.md&lt;/code&gt; validator&lt;/strong&gt; that turns those authoring guides into 24 concrete, checkable rules. Point it at a file, a skill folder, or a whole monorepo, and it produces a score per skill:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install (it's on pub.dev)&lt;/span&gt;
dart pub global activate skillscore

&lt;span class="c"&gt;# Score a single skill — any name, any location&lt;/span&gt;
skillscore path/to/SKILL.md

&lt;span class="c"&gt;# Score every skill in a tree&lt;/span&gt;
skillscore path/to/skills/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The rules live in 7 weighted categories:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;What it checks&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;A — Frontmatter validity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;---&lt;/code&gt; delimiters, &lt;code&gt;name&lt;/code&gt; format, &lt;code&gt;description&lt;/code&gt; present&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;B — Description quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;states &lt;em&gt;what&lt;/em&gt; + &lt;em&gt;when&lt;/em&gt;, third person, front-loaded triggers, boundary clause&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;C — Conciseness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;body length, no explainer bloat, no endless "or" chains&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;D — Structure&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;progressive disclosure, links one level deep, TOCs on long references&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;E — Instruction quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;anti-patterns, workflow checklist, feedback loop, code examples&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;F — Content hygiene&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;no rotting date references, forward-slash paths, consistent terms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;G — Safety &amp;amp; scripts&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;a &lt;em&gt;penalty&lt;/em&gt; (up to −15) when bundled scripts lack docs or a Safety section&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;100 points are distributed across A–F; category G only bites if your skill ships scripts or terminal commands. Profiles that exclude a vendor-specific rule are normalized back to 0–100, so a score means the same thing on every target.&lt;/p&gt;
&lt;h2&gt;
  
  
  Does it actually work? Let's score a real one
&lt;/h2&gt;

&lt;p&gt;Here's skillscore run against a genuine skill from the &lt;strong&gt;Flutter team's public repo&lt;/strong&gt; — &lt;code&gt;flutter-add-widget-test/SKILL.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;flutter-add-widget-test  (SKILL.md)
  Score: 90/100  Grade: A

  A  Frontmatter validity                     15/15  ██████████
  B  Description quality                      21/25  ████████░░
  C  Conciseness &amp;amp; token economy              15/15  ██████████
  D  Structure &amp;amp; progressive disclosure       15/15  ██████████
  E  Instruction quality                      14/20  ███████░░░
  F  Content hygiene                          10/10  ██████████
  G  Safety &amp;amp; scripts                    no penalty

  WARNING E1_anti_patterns  line 8
          Body contains no explicit anti-patterns (no "do not", "never", or "avoid").
          fix: Add explicit prohibitions, e.g. "Never share a WidgetTester across tests."

  INFO    B5_boundary_clause  line 3
          Description has no boundary clause saying when NOT to use the skill.
          fix: Append a boundary, e.g. "Do not use for multi-screen integration tests."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;A genuinely good skill, and skillscore says so — but it also pinpoints the two things keeping it off a perfect score: it never tells the model what &lt;em&gt;not&lt;/em&gt; to do, and its description doesn't state where the skill stops. Both are real, both are fixable in one line, and both come straight from the published guides.&lt;/p&gt;

&lt;p&gt;Want the rationale behind any finding? Ask:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;skillscore explain E1_anti_patterns
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;It prints why the rule exists, the exact fix, and the source guide it's from.&lt;/p&gt;
&lt;h2&gt;
  
  
  Built for CI
&lt;/h2&gt;

&lt;p&gt;A score you have to eyeball isn't a gate. skillscore is designed to live in a pipeline:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/skills.yml&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Lint agent skills&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;dart pub global activate skillscore&lt;/span&gt;
    &lt;span class="s"&gt;skillscore skills/ --min-score 80 --no-color&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--min-score 80&lt;/code&gt; → the job &lt;strong&gt;exits non-zero&lt;/strong&gt; if any skill dips below the bar.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--format json&lt;/code&gt; → structured output for dashboards.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--format sarif&lt;/code&gt; → valid &lt;strong&gt;SARIF 2.1.0&lt;/strong&gt; that uploads to GitHub code scanning, so findings annotate the exact lines in a pull request.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Exit codes are pipeline-grade: &lt;code&gt;0&lt;/code&gt; everything passed, &lt;code&gt;1&lt;/code&gt; a gate failed, &lt;code&gt;2&lt;/code&gt; a usage error. No flaky LLM in the loop, no network — the same skill always scores the same.&lt;/p&gt;
&lt;h2&gt;
  
  
  How is this different from just asking an LLM to review my skill?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;skillscore&lt;/th&gt;
&lt;th&gt;Vendor schema check&lt;/th&gt;
&lt;th&gt;Markdown linter&lt;/th&gt;
&lt;th&gt;"Ask an LLM"&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Validates frontmatter&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;⚠️&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scores &lt;em&gt;quality&lt;/em&gt; (discoverability, structure, instructions)&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cites a source guide per finding&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deterministic / reproducible&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Safe for a CI gate&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Offline&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;An LLM review is great for nuance but non-deterministic — you can't gate a build on it. A schema check tells you the file is &lt;em&gt;valid&lt;/em&gt;, not whether it's any &lt;em&gt;good&lt;/em&gt;. skillscore fills the gap in the middle, and it pairs nicely with the other two.&lt;/p&gt;
&lt;h2&gt;
  
  
  It's a library too
&lt;/h2&gt;

&lt;p&gt;The CLI is a thin wrapper over a public Dart API, so you can embed scoring in your own tooling:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="s"&gt;'package:skillscore/skillscore.dart'&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;SkillParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;parseFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'my-skill/SKILL.md'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Scorer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RuleRegistry&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Target&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;universal&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="n"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="si"&gt;${result.score}&lt;/span&gt;&lt;span class="s"&gt;/100 &lt;/span&gt;&lt;span class="si"&gt;${result.grade}&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is an AI agent skill?&lt;/strong&gt;&lt;br&gt;
A folder with a &lt;code&gt;SKILL.md&lt;/code&gt; manifest (YAML frontmatter + Markdown instructions) that teaches an AI agent a repeatable task. Optional subfolders hold references, examples, scripts, and assets. The format is shared across Claude Code, Codex, Antigravity, Gemini CLI, and Cursor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which agents does skillscore support?&lt;/strong&gt;&lt;br&gt;
All of them — the &lt;code&gt;SKILL.md&lt;/code&gt; format is shared. Score against one vendor with &lt;code&gt;--target claude|codex|antigravity&lt;/code&gt;, or use the default &lt;code&gt;universal&lt;/code&gt; profile, which a portable skill should pass everywhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is it really offline?&lt;/strong&gt;&lt;br&gt;
Completely. No network calls at runtime, local files only, fully deterministic — the same input always produces the same score and the same finding order.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does my skill have to be named a particular way?&lt;/strong&gt;&lt;br&gt;
No. skillscore is name-agnostic: the frontmatter &lt;code&gt;name&lt;/code&gt;, the folder name, and the file name are independent, and even non-ASCII folder names work. Rule &lt;code&gt;A2&lt;/code&gt; will still tell you if the &lt;code&gt;name&lt;/code&gt; &lt;em&gt;field&lt;/em&gt; breaks the official format.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What happens with malformed frontmatter?&lt;/strong&gt;&lt;br&gt;
No crash. The relevant frontmatter errors are reported, every other rule that can still run does, and you always get a score.&lt;/p&gt;
&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;v0.1.0 is live and the rubric is stable, but it's early. The roadmap: more vendor targets as new guides land, an autofix mode for the mechanical findings (forward slashes, missing TOCs), and a GitHub Action wrapper so CI setup is one line. The rule engine is deliberately simple — &lt;strong&gt;a new rule is one class plus one registration&lt;/strong&gt; — so contributions are welcome, and every rule must cite the published guide it enforces.&lt;/p&gt;
&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dart pub global activate skillscore
skillscore your-skill/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;📦 pub.dev: &lt;strong&gt;&lt;a href="https://pub.dev/packages/skillscore" rel="noopener noreferrer"&gt;pub.dev/packages/skillscore&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;💻 GitHub: &lt;strong&gt;&lt;a href="https://github.com/sayed3li97/skillscore" rel="noopener noreferrer"&gt;github.com/sayed3li97/skillscore&lt;/a&gt;&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/sayed3li97" rel="noopener noreferrer"&gt;
        sayed3li97
      &lt;/a&gt; / &lt;a href="https://github.com/sayed3li97/skillscore" rel="noopener noreferrer"&gt;
        skillscore
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Lint and score any AI agent SKILL.md against the official Claude, Codex, and Antigravity authoring guides — offline Dart CLI.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;skillscore — lint and score AI agent skills (SKILL.md)&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;
  &lt;a rel="noopener noreferrer nofollow" href="https://raw.githubusercontent.com/sayed3li97/skillscore/main/docs/assets/cover.png"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fsayed3li97%2Fskillscore%2Fmain%2Fdocs%2Fassets%2Fcover.png" alt="skillscore — score your AI agent's SKILL.md 0 to 100 against the Claude, Codex, and Antigravity authoring guides" width="100%"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/sayed3li97/skillscore/actions/workflows/ci.yml" rel="noopener noreferrer"&gt;&lt;img src="https://github.com/sayed3li97/skillscore/actions/workflows/ci.yml/badge.svg" alt="CI"&gt;&lt;/a&gt;
&lt;a href="https://pub.dev/packages/skillscore" rel="nofollow noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b379da7d8e6bdbd4ec02dccfe1fb0ed2281dfe9836d5f94bc43adfb9d9b02723/68747470733a2f2f696d672e736869656c64732e696f2f7075622f762f736b696c6c73636f72652e737667" alt="pub package"&gt;&lt;/a&gt;
&lt;a href="https://github.com/sayed3li97/skillscore/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/798509b4df525f56802b56f8096862487f08023e3d7561c68656f8dab10d0d6e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c6963656e73652d4170616368652d2d322e302d626c75652e737667" alt="license: Apache-2.0"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;skillscore&lt;/strong&gt; statically analyzes any AI agent skill — a &lt;code&gt;SKILL.md&lt;/code&gt; manifest
and its folder — and produces a &lt;strong&gt;0–100 quality score&lt;/strong&gt;, a &lt;strong&gt;letter grade&lt;/strong&gt;
and a list of &lt;strong&gt;actionable findings&lt;/strong&gt;, scored against the official skill
authoring guides from &lt;strong&gt;Anthropic (Claude)&lt;/strong&gt;, &lt;strong&gt;Google (Antigravity)&lt;/strong&gt;, and
&lt;strong&gt;OpenAI (Codex)&lt;/strong&gt;. Offline, deterministic, CI-friendly.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What is skillscore?&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;skillscore is a &lt;strong&gt;skill linter / SKILL.md validator / agent-skill quality
checker / AI skill scorer&lt;/strong&gt;. Agent skills are an open standard — a folder
with a &lt;code&gt;SKILL.md&lt;/code&gt; (YAML frontmatter + Markdown body) plus optional
&lt;code&gt;references/&lt;/code&gt;, &lt;code&gt;examples/&lt;/code&gt;, &lt;code&gt;scripts/&lt;/code&gt;, and &lt;code&gt;assets/&lt;/code&gt; — used by Claude Code
Codex, Antigravity, Gemini CLI, and Cursor. Because an agent keeps every
skill's &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; in its context budget permanently, &lt;strong&gt;a vague
or malformed skill is worse than no skill&lt;/strong&gt;. skillscore catches exactly those…&lt;/p&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/sayed3li97/skillscore" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;If you maintain skills, run it against your &lt;code&gt;SKILL.md&lt;/code&gt; and tell me what score you get — and what it got &lt;em&gt;wrong&lt;/em&gt;. I want the rules to reflect how people actually author skills, so findings you disagree with are the most useful feedback I can get. And if it saves you a context-budget headache, a ⭐ helps it reach other people building agents.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>cli</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
    <item>
      <title>DiffusionGemma: How Google's New Open LLM Hits 1,000 Tokens/sec and Changes Inference Economics</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Fri, 12 Jun 2026 18:30:19 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/diffusiongemma-how-googles-new-open-llm-hits-1000-tokenssec-and-changes-inference-economics-4587</link>
      <guid>https://dev.to/sayed_ali_alkamel/diffusiongemma-how-googles-new-open-llm-hits-1000-tokenssec-and-changes-inference-economics-4587</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Google released DiffusionGemma, an open Apache 2.0 diffusion-based LLM that generates text up to 4x faster than autoregressive models, hitting 1,000+ tokens/sec on a single H100 and fitting in 18 GB VRAM. It trades some accuracy for speed. Here is what that means in practice.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  What DiffusionGemma Actually Is
&lt;/h2&gt;

&lt;p&gt;Google DeepMind released &lt;strong&gt;DiffusionGemma&lt;/strong&gt;, the first production-grade open-weight model that applies discrete diffusion to text generation. The same family of techniques behind image generators like Stable Diffusion, now applied to language.&lt;/p&gt;

&lt;p&gt;Instead of predicting one token at a time left-to-right, DiffusionGemma fills a 256-token block with noise and &lt;strong&gt;iteratively refines the entire block across multiple denoising passes&lt;/strong&gt; until confidence thresholds are met. It commits roughly 15-20 tokens per forward pass on average, not one.&lt;/p&gt;

&lt;p&gt;This is a fundamentally different compute pattern from everything shipping in production today.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tokens/sec (H100, FP8, low batch)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1,100+&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tokens/sec (RTX 5090)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;700+&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total parameters&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;25.2B (marketed as 26B)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Active parameters at inference&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.8B&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MoE expert config&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;8 active / 128 total&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;VRAM required (quantized)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;18 GB&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Canvas (block) size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;256 tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tokens committed per forward pass&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~15-20&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Max denoising steps&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;48&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;256K tokens&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;License&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Apache 2.0&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For context: comparable autoregressive models on the same H100 generate roughly 200-250 tokens/sec. DiffusionGemma is up to &lt;strong&gt;4x faster&lt;/strong&gt; on throughput. The jump comes from shifting the decode bottleneck from memory bandwidth to compute.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why the Architecture Matters
&lt;/h2&gt;

&lt;p&gt;DiffusionGemma is a &lt;strong&gt;26B Mixture of Experts (MoE)&lt;/strong&gt; model built on the Gemma 4 backbone, but it replaces the autoregressive decoder with a &lt;strong&gt;diffusion head&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How a single generation works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The model initializes a 256-token block with random placeholder tokens&lt;/li&gt;
&lt;li&gt;It runs up to 48 denoising steps, refining all tokens simultaneously with &lt;strong&gt;bidirectional attention&lt;/strong&gt; (every token attends to every other token in the block)&lt;/li&gt;
&lt;li&gt;Tokens that cross an entropy confidence threshold get committed to the KV cache early via adaptive stopping&lt;/li&gt;
&lt;li&gt;For sequences longer than 256 tokens, committed blocks are cached and the next block begins&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key difference from GPT-style models: token N can see tokens N+1 through N+256 during generation. This enables genuine &lt;strong&gt;self-correction&lt;/strong&gt; across the block. Autoregressive models structurally cannot do this.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where It Wins and Where It Does Not
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Structural advantages
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Code infilling:&lt;/strong&gt; It sees the code on both sides of the gap before generating the fill, not just the left side&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inline document editing:&lt;/strong&gt; Revising a paragraph in context of surrounding paragraphs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time latency-sensitive apps:&lt;/strong&gt; 1,100 tokens/sec on H100 vs ~230 tokens/sec from a comparable autoregressive model&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single-GPU efficiency:&lt;/strong&gt; 3.8B active parameters means 18 GB VRAM at quantized precision, which fits on an RTX 4090 or 5090&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Benchmark trade-offs vs Gemma 4 26B (autoregressive)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;DiffusionGemma&lt;/th&gt;
&lt;th&gt;Gemma 4 26B&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MMLU Pro&lt;/td&gt;
&lt;td&gt;77.6%&lt;/td&gt;
&lt;td&gt;82.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AIME 2026&lt;/td&gt;
&lt;td&gt;69.1%&lt;/td&gt;
&lt;td&gt;88.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GPQA Diamond&lt;/td&gt;
&lt;td&gt;73.2%&lt;/td&gt;
&lt;td&gt;82.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MMMU Pro (Vision)&lt;/td&gt;
&lt;td&gt;54.3%&lt;/td&gt;
&lt;td&gt;73.8%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Google describes it as experimental. For reasoning-heavy workloads (complex math, multi-step logic, vision understanding) the autoregressive Gemma 4 is still ahead. DiffusionGemma is the right tool when &lt;strong&gt;latency and throughput matter more than peak accuracy&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Multi-modal capabilities
&lt;/h3&gt;

&lt;p&gt;The model processes interleaved text, images (5 resolution tiers up to 1120 tokens), and video (up to 60 seconds at 1 fps). It supports OCR, chart comprehension, screen understanding, and handwriting recognition across 35+ languages, with training data covering 140+ languages.&lt;/p&gt;




&lt;h2&gt;
  
  
  Deploy It in 5 Minutes with vLLM
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vllm

vllm serve google/diffusiongemma-26B-A4B-it &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-model-len&lt;/span&gt; 262144 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-num-seqs&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--gpu-memory-utilization&lt;/span&gt; 0.85 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--attention-backend&lt;/span&gt; TRITON_ATTN &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--generation-config&lt;/span&gt; vllm &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--hf-overrides&lt;/span&gt; &lt;span class="s1"&gt;'{"diffusion_sampler": "entropy_bound", "diffusion_entropy_bound": 0.1}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--diffusion-config&lt;/span&gt; &lt;span class="s1"&gt;'{"canvas_length": 256}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--enable-chunked-prefill&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The endpoint is OpenAI-compatible. Point your existing client at &lt;code&gt;http://localhost:8000&lt;/code&gt; with no other code changes needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Supported inference runtimes:&lt;/strong&gt; vLLM, Hugging Face Transformers, SGLang, MLX (Apple Silicon), NVIDIA NIM containers, Google Cloud Vertex AI Model Garden.&lt;/p&gt;




&lt;h2&gt;
  
  
  Fine-Tuning
&lt;/h2&gt;

&lt;p&gt;The ecosystem arrived fast for a day-1 release:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hackable Diffusion:&lt;/strong&gt; Google's JAX-based modular research toolbox&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hugging Face Transformers:&lt;/strong&gt; standard PEFT/LoRA workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unsloth:&lt;/strong&gt; memory-efficient fine-tuning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NVIDIA NeMo:&lt;/strong&gt; enterprise training pipelines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A published case study fine-tuned DiffusionGemma on a Sudoku dataset and improved success rate from approximately 0% to 80%. Fine-tuning can also teach the model to stop denoising early when confidence is already high, reducing inference steps further. Autoregressive models have no equivalent lever.&lt;/p&gt;




&lt;h2&gt;
  
  
  What to Evaluate Right Now
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;This week:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Spin up the model on an H100 or RTX 4090 (18 GB VRAM quantized)&lt;/li&gt;
&lt;li&gt;[ ] Benchmark it on your actual latency-sensitive workload, not synthetic tasks&lt;/li&gt;
&lt;li&gt;[ ] Compare serving cost ($/1M tokens) against your current stack&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Next sprint:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Test code infilling quality in IDE tooling, its structural sweet spot due to bidirectional attention&lt;/li&gt;
&lt;li&gt;[ ] If you run real-time chat or inline editing, measure UX metrics, not just accuracy scores&lt;/li&gt;
&lt;li&gt;[ ] Follow Unsloth + LoRA support for DiffusionGemma, it is maturing fast&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Architecture signal:&lt;/strong&gt;&lt;br&gt;
This model is built on the same Gemini Diffusion research that will likely inform future proprietary Gemini releases. If diffusion inference stabilizes at this quality level, it rewrites autoregressive serving assumptions at scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;DiffusionGemma is not a production replacement for your current LLM stack today. Accuracy trade-offs are real and Google is transparent about the experimental status.&lt;/p&gt;

&lt;p&gt;But the throughput numbers are genuine, the hardware requirements are accessible, and the license is Apache 2.0.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1,100 tokens per second. 18 GB VRAM. Open weights. From Google.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That combination is worth benchmarking on your actual workload this week.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Resources:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://huggingface.co/google/diffusiongemma-26B-A4B-it" rel="noopener noreferrer"&gt;Model on Hugging Face&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://deepmind.google/models/gemma/diffusiongemma/" rel="noopener noreferrer"&gt;Google DeepMind Model Page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.googleblog.com/en/diffusiongemma-the-developer-guide/" rel="noopener noreferrer"&gt;Developer Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.google/innovation-and-ai/technology/developers-tools/diffusion-gemma-faster-text-generation/" rel="noopener noreferrer"&gt;Google Blog Announcement&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Found this useful? Follow for more signal-over-noise breakdowns of AI releases that matter.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>llm</category>
      <category>developers</category>
    </item>
    <item>
      <title>How to Write a Flutter Agent Skill That Actually Works: The 2026 Recipe</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Fri, 12 Jun 2026 18:13:04 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/how-to-write-a-flutter-agent-skill-that-actually-works-the-2026-recipe-2joi</link>
      <guid>https://dev.to/sayed_ali_alkamel/how-to-write-a-flutter-agent-skill-that-actually-works-the-2026-recipe-2joi</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; A great agent skill is not a pile of documentation. It is a tightly scoped &lt;code&gt;SKILL.md&lt;/code&gt; with a description engineered for discovery, ruthless conciseness, anti-patterns stated up front, a checklist workflow, and a feedback loop. The format is an open standard that works across Claude Code, OpenAI Codex, Google Antigravity, Gemini CLI, and Cursor. This post synthesizes the official authoring guidance from &lt;strong&gt;Flutter, Anthropic, Google, and OpenAI&lt;/strong&gt; into one recipe, hands you a complete copy-pasteable Flutter skill, and shows you how to actually evaluate it instead of guessing.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In my last article, I wrote about the official Dart and Flutter Agent Skills and why they stop your AI from writing 2022 Flutter. The most common reply I got was some version of the same question:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Cool. How do I write my own?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So I went and read the actual playbooks. Not the hot takes, the primary sources: Flutter's skill docs and eval framework, Anthropic's skill authoring best practices, Google's Antigravity skill docs, and OpenAI's Codex skill guide. The good news is they agree on almost everything. The better news is that the gap between a skill that works and a skill that gets silently ignored comes down to a handful of decisions, and most people get them wrong.&lt;/p&gt;

&lt;p&gt;Here is the recipe, Flutter-flavored.&lt;/p&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Why a bad skill is worse than no skill&lt;/li&gt;
&lt;li&gt;The anatomy you need before the recipe&lt;/li&gt;
&lt;li&gt;One format, every agent&lt;/li&gt;
&lt;li&gt;The recipe: 9 ingredients of a skill that works&lt;/li&gt;
&lt;li&gt;A complete Flutter skill you can steal&lt;/li&gt;
&lt;li&gt;How to actually evaluate your skill&lt;/li&gt;
&lt;li&gt;The security caveat nobody mentions&lt;/li&gt;
&lt;li&gt;The honest take from the community&lt;/li&gt;
&lt;li&gt;The ship-it checklist&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Wrapping up&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why a bad skill is worse than no skill
&lt;/h2&gt;

&lt;p&gt;AI agents are generalists. They average across years of Flutter code, much of it deprecated, and hand you the most statistically common answer instead of the currently correct one. The Flutter team named this the &lt;strong&gt;knowledge gap&lt;/strong&gt;: the framework ships features faster than language models can update their training data. Skills exist to close that gap by handing the agent a task-specific, expert workflow.&lt;/p&gt;

&lt;p&gt;But here is what nobody tells you. A poorly written skill does not just fail to help. It actively costs you. Every skill's metadata sits in the agent's context budget at all times. A vague skill that never triggers is dead weight. A skill with a fuzzy description that triggers on the &lt;em&gt;wrong&lt;/em&gt; tasks is worse, because now your agent is following the wrong playbook with full confidence.&lt;/p&gt;

&lt;p&gt;The bar is not "wrote some Markdown." The bar is "the agent reliably finds it, trusts it, and follows it." Everything below is in service of that bar.&lt;/p&gt;

&lt;h2&gt;
  
  
  The anatomy you need before the recipe
&lt;/h2&gt;

&lt;p&gt;A skill is the simplest possible thing: a folder with one required file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;building-riverpod-async-screens/
├── SKILL.md          # Required: metadata + instructions
├── references/       # Optional: deep-dive docs loaded on demand
├── examples/         # Optional: reference implementations
├── scripts/          # Optional: scripts the agent runs, not reads
└── assets/           # Optional: templates, images
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;SKILL.md&lt;/code&gt; itself is YAML frontmatter plus a Markdown body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;building-riverpod-async-screens&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Build&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Flutter&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;screen&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;that&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;loads&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;async&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;with&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Riverpod..."&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Building Riverpod Async Screens&lt;/span&gt;

[instructions go here]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The magic that makes this scale is &lt;strong&gt;progressive disclosure&lt;/strong&gt;. At startup the agent loads only the lightweight metadata (name, description, path) of every skill. It reads the full &lt;code&gt;SKILL.md&lt;/code&gt; only when a task matches, and it reads anything in &lt;code&gt;references/&lt;/code&gt; or &lt;code&gt;examples/&lt;/code&gt; only when the body points it there. If you write Flutter, you already know this pattern: it is deferred loading for the context window. OpenAI, Anthropic, and Google all describe the exact same mechanism.&lt;/p&gt;

&lt;h2&gt;
  
  
  One format, every agent
&lt;/h2&gt;

&lt;p&gt;This is the part that makes writing a skill worth your time. &lt;code&gt;SKILL.md&lt;/code&gt; is an open standard (published at agentskills.io, originated at Anthropic, since adopted across the ecosystem). One skill works almost everywhere:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Vendor&lt;/th&gt;
&lt;th&gt;Where skills live&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Claude Code&lt;/td&gt;
&lt;td&gt;Anthropic&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;.claude/skills/&lt;/code&gt; (project), &lt;code&gt;~/.claude/skills/&lt;/code&gt; (personal)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI Codex&lt;/td&gt;
&lt;td&gt;OpenAI&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;.codex/skills/&lt;/code&gt; (project), &lt;code&gt;~/.codex/skills/&lt;/code&gt; or &lt;code&gt;~/.agents/skills/&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Antigravity&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;.agents/skills/&lt;/code&gt; (workspace), &lt;code&gt;~/.gemini/antigravity/skills/&lt;/code&gt; (global)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini CLI&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;SKILL.md&lt;/code&gt; standard locations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cursor / Copilot&lt;/td&gt;
&lt;td&gt;Various&lt;/td&gt;
&lt;td&gt;supported with manual placement&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The Flutter team's installer targets the cross-tool location directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add flutter/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; &lt;span class="s1"&gt;'*'&lt;/span&gt; &lt;span class="nt"&gt;--agent&lt;/span&gt; universal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--agent universal&lt;/code&gt; flag drops everything into &lt;code&gt;.agents/skills&lt;/code&gt;, the folder compatible agents auto-discover. Write a skill once, and your whole team gets the same expertise regardless of which agent they prefer. Codex adds a distribution layer on top (it calls the authoring format a "skill" and the installable package a "plugin"), but the core file is identical.&lt;/p&gt;

&lt;h2&gt;
  
  
  The recipe: 9 ingredients of a skill that works
&lt;/h2&gt;

&lt;p&gt;Every official source converges on these. I have ordered them by how much they matter in practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The description is 80% of the battle
&lt;/h3&gt;

&lt;p&gt;If your skill does not trigger, it is almost never the instructions. It is the description. This is the single most important line in the entire file, because it is the only part the agent reads when deciding &lt;em&gt;whether to load your skill at all&lt;/em&gt;, often choosing from 100+ candidates.&lt;/p&gt;

&lt;p&gt;Three rules from the official guidance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Write in third person.&lt;/strong&gt; The description is injected into the system prompt. "I can help you build screens" and "You can use this to..." both cause discovery problems. Write "Builds Flutter screens that...".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State what it does AND when to use it.&lt;/strong&gt; Include concrete trigger words a developer would actually type.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Front-load the key use case.&lt;/strong&gt; Codex's guide is explicit: put the trigger terms first so matching still works if the description gets truncated. Antigravity recommends adding a "Do not use" clause to stop over-activation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Compare:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Weak: vague, no triggers, will rarely fire correctly&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Helps with Flutter screens.&lt;/span&gt;

&lt;span class="c1"&gt;# Strong: what + when + triggers + boundary&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build a Flutter screen that loads async data with Riverpod,&lt;/span&gt;
  &lt;span class="s"&gt;handling loading, error, and data states with AsyncValue. Use when&lt;/span&gt;
  &lt;span class="s"&gt;fetching from a repository or API and rendering spinners, retry UI, and&lt;/span&gt;
  &lt;span class="s"&gt;lists. Do not use for purely static screens with no async data.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Be ruthlessly concise
&lt;/h3&gt;

&lt;p&gt;Anthropic puts it perfectly: the context window is a public good. Your skill shares it with the system prompt, the conversation, every other skill's metadata, and the user's actual request. The default assumption must be that &lt;strong&gt;the agent is already very smart&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Do not explain what Flutter is. Do not explain what a widget is. Do not define JSON. Challenge every sentence: does the agent really not know this? Keep the &lt;code&gt;SKILL.md&lt;/code&gt; body under 500 lines. If it grows past that, split it into &lt;code&gt;references/&lt;/code&gt; files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- Bad: wastes tokens on what the model already knows --&amp;gt;&lt;/span&gt;
Flutter is Google's UI toolkit. A widget is a building block of the UI.
To make a network call, you first need an HTTP client, which is a piece
of software that...

&lt;span class="c"&gt;&amp;lt;!-- Good: assumes competence, gets to the point --&amp;gt;&lt;/span&gt;
Use the &lt;span class="sb"&gt;`http`&lt;/span&gt; package for REST calls. Wrap responses in a typed model.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Match the degrees of freedom to the task
&lt;/h3&gt;

&lt;p&gt;This framing from Anthropic is the one most people miss. Think of the agent as a robot walking a path:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Narrow bridge with cliffs&lt;/strong&gt; (low freedom): one correct sequence, high cost of failure. Give exact, rigid instructions. Example: "Run exactly &lt;code&gt;dart run build_runner build --delete-conflicting-outputs&lt;/code&gt;. Do not modify the flags."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open field&lt;/strong&gt; (high freedom): many valid routes, context decides. Give direction and trust the agent. Example: "Structure the feature using the layered approach; choose folder names that fit the existing project."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Fragile, deterministic Flutter operations (code generation, migrations, platform config) want low freedom. Architectural and design decisions want high freedom. Most skills need a mix.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Lead with anti-patterns, not just patterns
&lt;/h3&gt;

&lt;p&gt;This is what makes the official Flutter skills so effective, and it is the ingredient that separates a senior skill from a junior one. Do not only say what to do. Ban the wrong instinct explicitly.&lt;/p&gt;

&lt;p&gt;The official &lt;code&gt;flutter-build-responsive-layout&lt;/code&gt; skill does exactly this. It does not just say "be responsive." It says: do NOT switch layouts on &lt;code&gt;MediaQuery.orientationOf&lt;/code&gt;, do NOT check for "phone" vs "tablet", do NOT lock orientation. Those negative rules are what stop the model from reaching for the plausible-but-wrong pattern it learned from a thousand old tutorials.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Rules&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Use &lt;span class="sb"&gt;`AsyncValue.when`&lt;/span&gt; to render data/loading/error. Never assume data is present.
&lt;span class="p"&gt;-&lt;/span&gt; Do NOT use &lt;span class="sb"&gt;`FutureBuilder`&lt;/span&gt; for server state. It re-runs on every rebuild
  and causes duplicate network calls.
&lt;span class="p"&gt;-&lt;/span&gt; Do NOT swallow exceptions or show an infinite spinner on failure.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Turn the task into a checklist workflow
&lt;/h3&gt;

&lt;p&gt;For any multi-step task, give the agent a checklist it can copy into its response and tick off. This prevents skipped steps, which is the most common failure mode on complex work. Both Anthropic and Flutter's own skills use this pattern.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Workflow&lt;/span&gt;

Copy this checklist and track progress:
&lt;span class="p"&gt;
-&lt;/span&gt; [ ] Define the immutable data model.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Add the repository method returning &lt;span class="sb"&gt;`Future&amp;lt;Model&amp;gt;`&lt;/span&gt;.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Create the provider that calls the repository.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Build the screen with &lt;span class="sb"&gt;`ref.watch`&lt;/span&gt; + &lt;span class="sb"&gt;`AsyncValue.when`&lt;/span&gt;.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Implement the error branch with a retry action.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Run &lt;span class="sb"&gt;`dart analyze`&lt;/span&gt; and fix everything. Repeat until clean.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Add a feedback loop
&lt;/h3&gt;

&lt;p&gt;The highest-leverage pattern in the entire playbook: &lt;strong&gt;run validator, fix errors, repeat.&lt;/strong&gt; Give the agent an objective check it can run and a rule to keep going until it passes. In Flutter, you have world-class validators for free.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;After generating code, run &lt;span class="sb"&gt;`dart analyze`&lt;/span&gt;. If it reports issues, fix them
and run it again. Only present the result when analysis is clean and
&lt;span class="sb"&gt;`flutter test`&lt;/span&gt; passes.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single habit improves output quality more than almost anything else, because it converts "looks right" into "provably compiles and lints clean."&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Use progressive disclosure deliberately
&lt;/h3&gt;

&lt;p&gt;Keep the main &lt;code&gt;SKILL.md&lt;/code&gt; as a lean overview and push depth into linked files. Three patterns, named by the Antigravity docs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Router pattern&lt;/strong&gt;: &lt;code&gt;SKILL.md&lt;/code&gt; only. For focused, single-purpose skills.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reference pattern&lt;/strong&gt;: &lt;code&gt;SKILL.md&lt;/code&gt; + &lt;code&gt;references/&lt;/code&gt;. For skills with deep API detail.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Few-shot pattern&lt;/strong&gt;: &lt;code&gt;SKILL.md&lt;/code&gt; + &lt;code&gt;examples/&lt;/code&gt;. For skills where output quality depends on seeing worked examples.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two rules when you split: keep references &lt;strong&gt;one level deep&lt;/strong&gt; from &lt;code&gt;SKILL.md&lt;/code&gt; (the agent may only partially read nested files), and add a &lt;strong&gt;table of contents&lt;/strong&gt; to any reference file longer than 100 lines so the agent can see the full scope even on a partial read.&lt;/p&gt;

&lt;h3&gt;
  
  
  8. Kill time-sensitive information
&lt;/h3&gt;

&lt;p&gt;Never write "before August 2025, use the old API." It rots. Instead, put deprecated guidance in a collapsed "old patterns" section so the current path stays clean while history stays available.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Old patterns&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;details&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;summary&amp;gt;&lt;/span&gt;Why not FutureBuilder? (legacy)&lt;span class="nt"&gt;&amp;lt;/summary&amp;gt;&lt;/span&gt;

&lt;span class="sb"&gt;`FutureBuilder`&lt;/span&gt; re-runs its future on every rebuild unless cached, causing
duplicate calls. Providers cache and dedupe by default. Prefer providers.
&lt;span class="nt"&gt;&amp;lt;/details&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  9. One canonical example beats ten adjectives
&lt;/h3&gt;

&lt;p&gt;If output quality depends on style, show a complete input/output example rather than describing it. The model matches patterns far better than it follows prose. One correct, runnable Dart snippet anchors the entire skill.&lt;/p&gt;

&lt;h2&gt;
  
  
  A complete Flutter skill you can steal
&lt;/h2&gt;

&lt;p&gt;Here is a full, working skill that bundles every ingredient above. It targets a spot where AI agents reliably write outdated Flutter: loading async data. Drop this into &lt;code&gt;.agents/skills/building-riverpod-async-screens/SKILL.md&lt;/code&gt; and it works in Claude Code, Codex, and Antigravity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;building-riverpod-async-screens&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build a Flutter screen that loads async data with Riverpod,&lt;/span&gt;
  &lt;span class="s"&gt;handling loading, error, and data states with AsyncValue. Use when&lt;/span&gt;
  &lt;span class="s"&gt;fetching from a repository, API, or database and rendering spinners,&lt;/span&gt;
  &lt;span class="s"&gt;retry UI, and lists. Do not use for static screens with no async data.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Building Riverpod Async Screens&lt;/span&gt;

Wire an async data screen the way a senior Flutter dev would: a typed
provider, &lt;span class="sb"&gt;`AsyncValue`&lt;/span&gt; state handling, and explicit loading/error/data
branches. No raw &lt;span class="sb"&gt;`FutureBuilder`&lt;/span&gt;, no manual &lt;span class="sb"&gt;`setState`&lt;/span&gt; for server state,
no swallowed errors.

&lt;span class="gu"&gt;## Rules&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Use a &lt;span class="sb"&gt;`FutureProvider`&lt;/span&gt; for read-only data, or an &lt;span class="sb"&gt;`AsyncNotifier`&lt;/span&gt; when the
  screen also mutates state. Do NOT use &lt;span class="sb"&gt;`StatefulWidget`&lt;/span&gt; + &lt;span class="sb"&gt;`setState`&lt;/span&gt; for
  server state.
&lt;span class="p"&gt;-&lt;/span&gt; Watch with &lt;span class="sb"&gt;`ref.watch`&lt;/span&gt; inside &lt;span class="sb"&gt;`build`&lt;/span&gt;. Use &lt;span class="sb"&gt;`ref.read`&lt;/span&gt; only inside callbacks.
&lt;span class="p"&gt;-&lt;/span&gt; Render all three states with &lt;span class="sb"&gt;`AsyncValue.when`&lt;/span&gt;. Never assume data exists.
&lt;span class="p"&gt;-&lt;/span&gt; Always give the error branch a retry path. Do NOT swallow exceptions or
  show an infinite spinner on failure.
&lt;span class="p"&gt;-&lt;/span&gt; Keep shared providers in their own file: one feature, one providers file.

&lt;span class="gu"&gt;## Workflow&lt;/span&gt;

Copy this checklist and track progress:
&lt;span class="p"&gt;
-&lt;/span&gt; [ ] Define the immutable data model.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Add the repository method returning &lt;span class="sb"&gt;`Future&amp;lt;Model&amp;gt;`&lt;/span&gt;.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Create a &lt;span class="sb"&gt;`FutureProvider`&lt;/span&gt; (or &lt;span class="sb"&gt;`AsyncNotifier`&lt;/span&gt;) that calls the repository.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Build the screen: &lt;span class="sb"&gt;`ref.watch`&lt;/span&gt; the provider, render with &lt;span class="sb"&gt;`AsyncValue.when`&lt;/span&gt;.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Implement the error branch with a retry that invalidates the provider.
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Run &lt;span class="sb"&gt;`dart analyze`&lt;/span&gt; and fix all issues. Repeat until clean.

&lt;span class="gu"&gt;## Example&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="c1"&gt;// product_providers.dart&lt;/span&gt;
&lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;productProvider&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FutureProvider&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;autoDispose&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&amp;gt;((&lt;/span&gt;&lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;async&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;watch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;productRepositoryProvider&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;repo&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fetchProducts&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// product_screen.dart&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductScreen&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="n"&gt;ConsumerWidget&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;ProductScreen&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nd"&gt;@override&lt;/span&gt;
  &lt;span class="n"&gt;Widget&lt;/span&gt; &lt;span class="n"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BuildContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;WidgetRef&lt;/span&gt; &lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;watch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;productProvider&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Scaffold&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nl"&gt;appBar:&lt;/span&gt; &lt;span class="n"&gt;AppBar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;title:&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'Products'&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
      &lt;span class="nl"&gt;body:&lt;/span&gt; &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;when&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nl"&gt;data:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ListView&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;builder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nl"&gt;itemCount:&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="nl"&gt;itemBuilder:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ListTile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;title:&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nl"&gt;loading:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Center&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;child:&lt;/span&gt; &lt;span class="n"&gt;CircularProgressIndicator&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
        &lt;span class="nl"&gt;error:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ErrorRetry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="nl"&gt;message:&lt;/span&gt; &lt;span class="s"&gt;'Could not load products'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="nl"&gt;onRetry:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;ref&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;invalidate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;productProvider&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Old patterns
&lt;/h2&gt;

&lt;p&gt;Why not FutureBuilder? (legacy)&lt;/p&gt;

&lt;p&gt;&lt;code&gt;FutureBuilder&lt;/code&gt; re-runs its future on every rebuild unless cached, causing&lt;br&gt;
duplicate network calls. Providers cache and dedupe by default. Prefer&lt;br&gt;
providers for any server state.&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Notice&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;how&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;much&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;work&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;description&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;does,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;how&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;rules&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ban&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;wrong&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;instincts&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;before&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;listing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;right&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ones,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;how&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;workflow&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;ends&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;validator&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;loop,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;how&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;example&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;complete&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;enough&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;copy.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;That&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;whole&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;recipe&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;one&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;file.&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;##&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;How&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;actually&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;evaluate&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;your&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;skill&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;This&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;step&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;almost&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;everyone&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;skips,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;it&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;one&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;separates&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;skill&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;feels&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;good&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;one&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;good.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Both&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Anthropic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Flutter&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;team&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;are&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;emphatic:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;do&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;not&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;trust&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;vibes.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Measure.&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;###&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Build&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;evaluation&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;first&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;Anthropic&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;calls&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;evaluation-driven&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;development,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;order&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;matters:&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;**Find&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;gap.**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Run&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;agent&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;on&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;real&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;task&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;no&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;skill.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Document&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;exactly&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;where&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;it&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;fails&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;or&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;writes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;outdated&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;code.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;**Write&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;three&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;eval&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;scenarios**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;target&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;those&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;failures.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;**Establish&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;baseline.**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Measure&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;performance&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;without&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;skill.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;**Write&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;minimum&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;instructions**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;needed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;pass.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;**Iterate.**&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Re-run,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;compare&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;baseline,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;refine.&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="err"&gt;This&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;guarantees&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;you&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;are&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;solving&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;real&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;problem&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;instead&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;documenting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;an&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;imaginary&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;one.&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;A&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;simple&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;eval&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;just&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;structured&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;expectations:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"skills"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"building-riverpod-async-screens"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Build a screen that loads the user's order history from OrderRepository and shows it in a list"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"expected_behavior"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Creates a FutureProvider or AsyncNotifier that calls OrderRepository, not a StatefulWidget with setState"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Renders loading, error, and data states using AsyncValue.when"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Includes a retry action in the error branch that invalidates the provider"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Generated code passes `dart analyze` with no errors"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Grade it the way Flutter does
&lt;/h3&gt;

&lt;p&gt;The Dart and Flutter teams run an experimental evals framework (open-sourced at the flutter/evals repository) built around &lt;strong&gt;critical user journeys&lt;/strong&gt;: realistic developer tasks rather than toy prompts. They score on two axes, which is a great rubric to copy for your own skills:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic correctness&lt;/strong&gt;: does it compile, pass &lt;code&gt;dart analyze&lt;/code&gt;, and pass the tests? Objective, machine-checkable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qualitative performance&lt;/strong&gt;: is the reasoning sound, the output concise, the approach safe? Graded by an automated model judge and by expert humans.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For your own skill, that translates to a dead-simple loop: run the task with and without the skill, then ask "did the deterministic checks pass, and is the code meaningfully better?" If the skill does not move either axis, it is not earning its context budget.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use two agents: author with one, test with the other
&lt;/h3&gt;

&lt;p&gt;Anthropic's most practical tip: develop the skill with one instance (call it the author) and test it with a fresh instance that has no memory of the conversation (the tester). The author helps you write and tighten the &lt;code&gt;SKILL.md&lt;/code&gt;. The tester reveals what the instructions actually communicate to a cold agent. When the tester stumbles, bring the specific failure back to the author and refine. Repeat. This observe-refine-test loop is how the official skills were hardened, and it works because the model understands both how to write agent instructions and what an agent needs to receive.&lt;/p&gt;

&lt;h2&gt;
  
  
  The security caveat nobody mentions
&lt;/h2&gt;

&lt;p&gt;Skills can include scripts and reference external resources. That means an untrusted skill can introduce vulnerabilities or quietly exfiltrate data. Before you install a community skill, read it, the same way you would read a dependency before adding it to &lt;code&gt;pubspec.yaml&lt;/code&gt;. For any skill that runs terminal commands or touches infrastructure, add an explicit "Safety" section documenting exactly what it does. Treat skills as code, because they are.&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest take from the community
&lt;/h2&gt;

&lt;p&gt;When the official skills dropped, the Flutter corner of X and Reddit reacted the way it always does: screenshots, threads, and declarations that AI coding just changed again. I want to be straight, because the skeptics have a point worth hearing.&lt;/p&gt;

&lt;p&gt;More than one experienced Flutter dev read the actual skill files and came away underwhelmed, noting the initial set is fairly thin and covers ground a competent dev already knows. That is fair. And it is also the wrong frame.&lt;/p&gt;

&lt;p&gt;A skill is not a magic file that makes your agent brilliant. It is a discipline. The value is not in any single skill the Flutter team shipped. It is in the workflow the format unlocks: codify a pattern once, evaluate it, refine it on a loop, and every future session inherits it. The teams that win with AI in 2026 are not the ones with the best model. They are the ones who got good at writing down what they already know, then testing that the agent actually follows it.&lt;/p&gt;

&lt;p&gt;That is the real reason to learn this recipe. Not to consume the official skills, but to write the ones your team actually needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  The ship-it checklist
&lt;/h2&gt;

&lt;p&gt;Before you commit a Flutter skill, verify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Description is third person, states what AND when, and front-loads trigger words.&lt;/li&gt;
&lt;li&gt;[ ] Description has a "Do not use" boundary to prevent over-activation.&lt;/li&gt;
&lt;li&gt;[ ] &lt;code&gt;SKILL.md&lt;/code&gt; body is under 500 lines; depth is in &lt;code&gt;references/&lt;/code&gt; or &lt;code&gt;examples/&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;[ ] No content explaining things the model already knows.&lt;/li&gt;
&lt;li&gt;[ ] Anti-patterns are stated explicitly, not just the happy path.&lt;/li&gt;
&lt;li&gt;[ ] Multi-step work is a copyable checklist.&lt;/li&gt;
&lt;li&gt;[ ] There is a validator loop (&lt;code&gt;dart analyze&lt;/code&gt; / &lt;code&gt;flutter test&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;[ ] No time-sensitive info outside a collapsed "old patterns" section.&lt;/li&gt;
&lt;li&gt;[ ] At least one complete, runnable example.&lt;/li&gt;
&lt;li&gt;[ ] References are one level deep; long ones have a table of contents.&lt;/li&gt;
&lt;li&gt;[ ] At least three eval scenarios exist and the skill beats the no-skill baseline.&lt;/li&gt;
&lt;li&gt;[ ] Any scripts are audited and documented for safety.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What is a Flutter agent skill?&lt;/strong&gt;&lt;br&gt;
A folder containing a &lt;code&gt;SKILL.md&lt;/code&gt; file that gives an AI coding agent task-specific, expert instructions for a Flutter or Dart workflow. It loads on demand via progressive disclosure, so it adds expertise without permanently bloating the context window.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What makes an agent skill good?&lt;/strong&gt;&lt;br&gt;
A precise, trigger-rich description (the single biggest factor), ruthless conciseness, explicitly stated anti-patterns, a checklist workflow, a validator feedback loop, and at least one complete example, all verified against evaluations rather than vibes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I write the description so the skill actually triggers?&lt;/strong&gt;&lt;br&gt;
Third person, state both what the skill does and when to use it, front-load the trigger words a developer would type, and add a "Do not use" clause to prevent it firing on the wrong tasks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I evaluate a skill?&lt;/strong&gt;&lt;br&gt;
Build evals before writing docs. Run the task without the skill to establish a baseline, write three scenarios with expected behaviors, then measure deterministic correctness (compiles, passes &lt;code&gt;dart analyze&lt;/code&gt; and tests) and qualitative quality against that baseline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does a skill I write for Claude Code work in Codex and Antigravity?&lt;/strong&gt;&lt;br&gt;
Yes. &lt;code&gt;SKILL.md&lt;/code&gt; is an open standard. Skills that stick to the core format (frontmatter plus Markdown instructions) work across Claude Code, Codex, Antigravity, Gemini CLI, and Cursor. Only advanced, tool-specific features need adjustment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How is a skill different from a rules file or AGENTS.md?&lt;/strong&gt;&lt;br&gt;
Rules and &lt;code&gt;AGENTS.md&lt;/code&gt; are always-on, repository-wide instructions (setup commands, standards). A skill is loaded only when its description matches the current task. Use always-on files for global rules and short if/then triggers, and skills for specific, repeatable workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How long should a SKILL.md be?&lt;/strong&gt;&lt;br&gt;
Keep the body under 500 lines. If it grows past that, move depth into one-level-deep &lt;code&gt;references/&lt;/code&gt; files and keep the main file as a lean overview.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;The official Dart and Flutter skills are a starting point, not the destination. The real unlock is the recipe behind them: a discovery-optimized description, concise expert instructions, anti-patterns stated out loud, a checklist, a validator loop, and an evaluation that proves it works. Get those right and you can encode your team's hardest-won Flutter patterns into something every agent on the team follows automatically.&lt;/p&gt;

&lt;p&gt;Write one skill this week. Pick the task where your AI agent annoys you most, encode the correct pattern, and evaluate it against the no-skill baseline. Then tell me what you built. I read every comment, and I want to know which Flutter pattern you taught your agent first. 🥊&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://docs.flutter.dev/ai/agent-skills" rel="noopener noreferrer"&gt;Flutter docs: Agent skills&lt;/a&gt; and &lt;a href="https://docs.flutter.dev/ai/evals" rel="noopener noreferrer"&gt;AI Evaluations&lt;/a&gt;, &lt;a href="https://github.com/flutter/skills" rel="noopener noreferrer"&gt;flutter/skills&lt;/a&gt;, &lt;a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices" rel="noopener noreferrer"&gt;Anthropic: Skill authoring best practices&lt;/a&gt;, &lt;a href="https://antigravity.google/docs/skills" rel="noopener noreferrer"&gt;Google Antigravity skills docs&lt;/a&gt;, and &lt;a href="https://developers.openai.com/codex/skills" rel="noopener noreferrer"&gt;OpenAI Codex: Agent skills&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>flutter</category>
      <category>dart</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Flutter Agent Skills: How to Make Your AI Agent Actually Good at Flutter</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Fri, 12 Jun 2026 15:58:16 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/flutter-agent-skills-how-to-make-your-ai-agent-actually-good-at-flutter-3831</link>
      <guid>https://dev.to/sayed_ali_alkamel/flutter-agent-skills-how-to-make-your-ai-agent-actually-good-at-flutter-3831</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Your AI coding assistant is a generalist. It writes Flutter that &lt;em&gt;looks&lt;/em&gt; right but quietly reaches for 2022 patterns. &lt;strong&gt;Agent Skills&lt;/strong&gt; are a new, official way (from the Dart and Flutter teams) to hand your agent task-specific, battle-tested workflows it loads on demand. Two repos, &lt;a href="https://github.com/flutter/skills" rel="noopener noreferrer"&gt;&lt;code&gt;flutter/skills&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://github.com/dart-lang/skills" rel="noopener noreferrer"&gt;&lt;code&gt;dart-lang/skills&lt;/code&gt;&lt;/a&gt;, ship ready-to-use skills for responsive layouts, routing, testing, localization, static analysis, and more. Install in one command:&lt;/p&gt;


&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills add flutter/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; &lt;span class="s1"&gt;'*'&lt;/span&gt; &lt;span class="nt"&gt;--agent&lt;/span&gt; universal
npx skills add dart-lang/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; &lt;span class="s1"&gt;'*'&lt;/span&gt; &lt;span class="nt"&gt;--agent&lt;/span&gt; universal
&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;This post breaks down what they are, how they differ from rules files and MCP, the full catalog, what a real skill looks like under the hood, and whether they actually move the needle. (Spoiler: mostly yes, with one honest caveat.)&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Let me tell you about a fight I have almost every day.&lt;/p&gt;

&lt;p&gt;I ask my AI agent to make a screen adapt to tablets. It confidently hands me code that switches layout based on &lt;code&gt;MediaQuery.orientationOf(context)&lt;/code&gt;. It looks clean. It compiles. It even &lt;em&gt;runs&lt;/em&gt;. And it's wrong, because device orientation has nothing to do with how much window space your app actually has on a foldable, in split-screen, or in a resizable desktop window.&lt;/p&gt;

&lt;p&gt;The model isn't dumb. It's a generalist trained on a giant pile of Flutter code, much of it old. And here's the uncomfortable truth the Flutter team said out loud when they launched this feature: &lt;strong&gt;Flutter and Dart ship new features faster than LLMs can update their training data.&lt;/strong&gt; That lag has a name, the &lt;em&gt;knowledge gap&lt;/em&gt;, and it's why your agent keeps writing rookie Flutter with a straight face.&lt;/p&gt;

&lt;p&gt;Agent Skills are the Flutter team's answer to that gap. I've been running them on real projects, and they're one of the few "AI workflow" things in 2026 that earned the hype instead of borrowing it. Let's get into it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Table of Contents
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The real problem: your AI is a generalist&lt;/li&gt;
&lt;li&gt;What are Agent Skills, exactly?&lt;/li&gt;
&lt;li&gt;Skills vs Rules vs MCP: who does what&lt;/li&gt;
&lt;li&gt;The full catalog: every official Flutter and Dart skill&lt;/li&gt;
&lt;li&gt;Anatomy of a skill: what's actually inside that file&lt;/li&gt;
&lt;li&gt;Get it running in two minutes&lt;/li&gt;
&lt;li&gt;The honest take: hype or substance?&lt;/li&gt;
&lt;li&gt;Write your own skill&lt;/li&gt;
&lt;li&gt;FAQ&lt;/li&gt;
&lt;li&gt;Wrapping up&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The real problem: your AI is a generalist
&lt;/h2&gt;

&lt;p&gt;AI agents can write Flutter and Dart. That's not in question anymore. The problem is &lt;em&gt;which&lt;/em&gt; Flutter they write.&lt;/p&gt;

&lt;p&gt;A generalist model has seen a million Stack Overflow answers, half of them deprecated. So when you ask for something specific, it averages across years of patterns and hands you the most &lt;em&gt;statistically common&lt;/em&gt; answer, not the &lt;em&gt;currently correct&lt;/em&gt; one. You get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Layouts keyed off device type ("phone" vs "tablet") instead of available space.&lt;/li&gt;
&lt;li&gt;Locked orientations that letterbox your app on foldables.&lt;/li&gt;
&lt;li&gt;Hand-rolled &lt;code&gt;fromJson&lt;/code&gt; that drifts from your actual model.&lt;/li&gt;
&lt;li&gt;Routing glued together imperatively when your app needed deep links and browser history.&lt;/li&gt;
&lt;li&gt;Hallucinated APIs that were never real, delivered with total confidence.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You already know this dance. You ask, it answers, you spend ten minutes correcting it back to what a senior Flutter dev would have written in the first place. Multiply that across a week and the "productivity tool" is quietly taxing you.&lt;/p&gt;

&lt;p&gt;Agent Skills attack this at the source: instead of hoping the model remembers the right pattern, you &lt;em&gt;give&lt;/em&gt; it the right pattern as a repeatable workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are Agent Skills, exactly?
&lt;/h2&gt;

&lt;p&gt;An &lt;strong&gt;Agent Skill&lt;/strong&gt; is a folder of plain-Markdown instructions, a &lt;code&gt;SKILL.md&lt;/code&gt; file plus any supporting resources, that teaches an AI agent &lt;em&gt;how&lt;/em&gt; to do one specific job the way a professional would.&lt;/p&gt;

&lt;p&gt;Think of it as a task-oriented blueprint. Not "here's everything about Flutter," but "here is the exact, correct, step-by-step way to build an adaptive layout, including the traps to avoid and a checklist to verify you're done."&lt;/p&gt;

&lt;p&gt;The mechanism that makes this practical is something the Flutter team calls &lt;strong&gt;progressive disclosure&lt;/strong&gt;, and if you write Flutter you already understand it intuitively: it's deferred loading for your context window.&lt;/p&gt;

&lt;p&gt;Here's the idea. Your agent doesn't slurp every instruction from every skill into its context up front. That would torch your token budget and bury the signal. Instead:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The agent reads only the lightweight &lt;strong&gt;metadata&lt;/strong&gt; of each skill first (a name and a short description of when to use it).&lt;/li&gt;
&lt;li&gt;When a task actually matches (say, you ask for a responsive layout), it pulls in the &lt;strong&gt;full&lt;/strong&gt; instructions for &lt;em&gt;that one skill&lt;/em&gt;, on demand.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Lazy-loaded expertise. The agent stays lean until the moment it needs to be an expert, then loads exactly the right playbook. That's the whole trick, and it's why this scales to dozens of skills without drowning the model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Skills vs Rules vs MCP: who does what
&lt;/h2&gt;

&lt;p&gt;This is where most people get confused, because Flutter now gives you &lt;em&gt;three&lt;/em&gt; ways to steer an AI agent. They're not competitors; they're layers. Here's the mental model I use:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What it is&lt;/th&gt;
&lt;th&gt;What it controls&lt;/th&gt;
&lt;th&gt;Analogy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rules files&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Persistent project-wide config (e.g. an AI rules file)&lt;/td&gt;
&lt;td&gt;The agent's &lt;em&gt;general&lt;/em&gt; behavior across &lt;strong&gt;every&lt;/strong&gt; task&lt;/td&gt;
&lt;td&gt;The house style guide that's always in effect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent Skills&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Task-specific &lt;code&gt;SKILL.md&lt;/code&gt; workflows, loaded on demand&lt;/td&gt;
&lt;td&gt;The &lt;em&gt;step-by-step know-how&lt;/em&gt; for &lt;strong&gt;one&lt;/strong&gt; specific job&lt;/td&gt;
&lt;td&gt;The expert playbook you open for a specific task&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP (Dart &amp;amp; Flutter MCP server)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A server exposing real tools to the agent&lt;/td&gt;
&lt;td&gt;The &lt;em&gt;machinery&lt;/em&gt;: hot reload, runtime errors, analysis&lt;/td&gt;
&lt;td&gt;The power tools in the workshop&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The cleanest way to say it: &lt;strong&gt;MCP gives the agent the tools. A Skill teaches the agent how to use those tools correctly for a specific task. Rules set the tone for everything.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In practice they stack beautifully. The MCP server lets the agent connect to your running app and hot reload after a change. A skill tells it &lt;em&gt;when&lt;/em&gt; and &lt;em&gt;why&lt;/em&gt; to do that as part of, say, fixing a layout overflow. Rules keep the whole thing consistent with your project's conventions. Together, that's an agent that behaves less like an intern and more like a teammate who's read your codebase's standards.&lt;/p&gt;

&lt;h2&gt;
  
  
  The full catalog: every official Flutter and Dart skill
&lt;/h2&gt;

&lt;p&gt;Both repos are maintained by the actual Dart and Flutter teams and licensed BSD-3-Clause, so this isn't some random community dump; it's first-party guidance. Here's what shipped.&lt;/p&gt;

&lt;h3&gt;
  
  
  Flutter skills (&lt;a href="https://github.com/flutter/skills" rel="noopener noreferrer"&gt;&lt;code&gt;flutter/skills&lt;/code&gt;&lt;/a&gt;)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Skill&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-build-responsive-layout&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Build layouts that adapt to &lt;strong&gt;window space&lt;/strong&gt; using &lt;code&gt;LayoutBuilder&lt;/code&gt;, &lt;code&gt;MediaQuery&lt;/code&gt;, &lt;code&gt;Expanded&lt;/code&gt;/&lt;code&gt;Flexible&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-fix-layout-issues&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Diagnose and fix overflows and unbounded-constraint errors ("RenderFlex overflowed", etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-apply-architecture-best-practices&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Structure the app in the recommended layered approach (UI, Logic, Data)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-setup-declarative-routing&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Wire up &lt;code&gt;MaterialApp.router&lt;/code&gt; with &lt;code&gt;go_router&lt;/code&gt; for deep linking and browser history&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-implement-json-serialization&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generate &lt;code&gt;fromJson&lt;/code&gt;/&lt;code&gt;toJson&lt;/code&gt; model classes correctly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-use-http-package&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Make GET/POST/PUT/DELETE calls to REST APIs with the &lt;code&gt;http&lt;/code&gt; package&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-setup-localization&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Add &lt;code&gt;flutter_localizations&lt;/code&gt; + &lt;code&gt;intl&lt;/code&gt;, configure &lt;code&gt;l10n.yaml&lt;/code&gt;, scaffold translations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-add-widget-test&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Write component tests with &lt;code&gt;WidgetTester&lt;/code&gt; for rendering and interactions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-add-widget-preview&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Add interactive widget previews via the &lt;code&gt;previews.dart&lt;/code&gt; system&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;flutter-add-integration-test&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Configure Flutter Driver and turn agent UI actions into permanent integration tests&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Dart skills (&lt;a href="https://github.com/dart-lang/skills" rel="noopener noreferrer"&gt;&lt;code&gt;dart-lang/skills&lt;/code&gt;&lt;/a&gt;)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Skill&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-run-static-analysis&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Run &lt;code&gt;dart analyze&lt;/code&gt; and auto-apply mechanical fixes with &lt;code&gt;dart fix --apply&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-fix-static-analysis-errors&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;A workflow for hunting down and fixing analyzer errors after edits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-fix-runtime-errors&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Pull an active stack trace, locate the failing line, fix it, verify via hot reload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-add-unit-test&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Write and organize unit tests with &lt;code&gt;package:test&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-generate-test-mocks&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generate mocks with &lt;code&gt;package:mockito&lt;/code&gt; + &lt;code&gt;build_runner&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-collect-coverage&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Collect coverage with the &lt;code&gt;coverage&lt;/code&gt; package and emit an LCOV report&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-resolve-package-conflicts&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fix &lt;code&gt;pub get&lt;/code&gt; version conflicts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-build-cli-app&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Build CLI utilities: entrypoint structure, exit codes, cross-platform scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-use-pattern-matching&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Apply switch expressions and pattern matching where they belong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;dart-migrate-to-checks-package&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Migrate assertions from &lt;code&gt;package:matcher&lt;/code&gt; to &lt;code&gt;package:checks&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Look at that list as a Flutter dev and the value clicks immediately: these are exactly the tasks where a generalist model fumbles the &lt;em&gt;current&lt;/em&gt; best practice. Adaptive layout. Routing. Localization. Serialization. The unglamorous, easy-to-get-subtly-wrong stuff.&lt;/p&gt;

&lt;h2&gt;
  
  
  Anatomy of a skill: what's actually inside that file
&lt;/h2&gt;

&lt;p&gt;"A Markdown file" sounds underwhelming until you open one. Let me show you the &lt;code&gt;flutter-build-responsive-layout&lt;/code&gt; skill, because it's the perfect rebuttal to that orientation bug I opened with.&lt;/p&gt;

&lt;p&gt;A skill is structured guidance: rules, anti-patterns, workflows-as-checklists, and runnable examples. Here's the core of what this one teaches the agent:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The non-negotiable rules:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;MediaQuery.sizeOf(context)&lt;/code&gt; to measure the &lt;strong&gt;app window&lt;/strong&gt;, not the physical device.&lt;/li&gt;
&lt;li&gt;Use &lt;code&gt;LayoutBuilder&lt;/code&gt; and branch on &lt;code&gt;constraints.maxWidth&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do not&lt;/strong&gt; use &lt;code&gt;MediaQuery.orientationOf&lt;/code&gt; or &lt;code&gt;OrientationBuilder&lt;/code&gt; near the top of the tree to swap layouts; orientation doesn't reflect available window space.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Do not&lt;/strong&gt; check hardware type ("phone" vs "tablet"). Flutter runs in resizable windows, split-screen, and picture-in-picture.&lt;/li&gt;
&lt;li&gt;Internalize the core rule: &lt;strong&gt;Constraints go down. Sizes go up. Parent sets position.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't lock orientation.&lt;/strong&gt; It letterboxes your app on foldables and large-screen Android tiers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's a senior Flutter dev's hard-won instinct, written down where the model can't ignore it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A workflow as an explicit checklist&lt;/strong&gt; (the agent literally walks this):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ ] Identify the widget that needs adaptive behavior.
[ ] Wrap the tree in a LayoutBuilder.
[ ] Extract constraints.maxWidth from the builder callback.
[ ] Define a breakpoint (e.g. largeScreenMinWidth = 600).
[ ] If maxWidth &amp;gt; breakpoint: return the large-screen layout (Row + sidebar).
[ ] Else: return the small-screen layout (Column / standard nav).
[ ] Run validator -&amp;gt; resize the window -&amp;gt; review transitions -&amp;gt; fix overflows.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;And a runnable reference example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="s"&gt;'package:flutter/material.dart'&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="n"&gt;largeScreenMinWidth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;600.0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AdaptiveLayout&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="n"&gt;StatelessWidget&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;AdaptiveLayout&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nd"&gt;@override&lt;/span&gt;
  &lt;span class="n"&gt;Widget&lt;/span&gt; &lt;span class="n"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BuildContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;LayoutBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nl"&gt;builder:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;constraints&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;constraints&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;maxWidth&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;largeScreenMinWidth&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_buildLargeScreenLayout&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_buildSmallScreenLayout&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;Widget&lt;/span&gt; &lt;span class="n"&gt;_buildLargeScreenLayout&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Row&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nl"&gt;children:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;SizedBox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;width:&lt;/span&gt; &lt;span class="mi"&gt;250&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nl"&gt;child:&lt;/span&gt; &lt;span class="n"&gt;Placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;color:&lt;/span&gt; &lt;span class="n"&gt;Colors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;blue&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;VerticalDivider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;width:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;Expanded&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;child:&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;color:&lt;/span&gt; &lt;span class="n"&gt;Colors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;green&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
      &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;Widget&lt;/span&gt; &lt;span class="n"&gt;_buildSmallScreenLayout&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;Placeholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;color:&lt;/span&gt; &lt;span class="n"&gt;Colors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;green&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See what happened? The skill doesn't just say "be responsive." It bans the exact wrong instinct (&lt;code&gt;orientation&lt;/code&gt;), prescribes the right one (&lt;code&gt;constraints.maxWidth&lt;/code&gt;), hands over a verification loop, and gives a canonical example. The agent stops guessing and starts executing a known-good plan. That's the difference between "AI that writes Flutter" and "AI that writes &lt;em&gt;your team's&lt;/em&gt; Flutter."&lt;/p&gt;

&lt;h2&gt;
  
  
  Get it running in two minutes
&lt;/h2&gt;

&lt;p&gt;The skills are distributed through an &lt;code&gt;npx&lt;/code&gt; CLI, so you'll need &lt;a href="https://nodejs.org/" rel="noopener noreferrer"&gt;Node.js&lt;/a&gt; installed. From your project root:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# All Flutter skills&lt;/span&gt;
npx skills add flutter/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; &lt;span class="s1"&gt;'*'&lt;/span&gt; &lt;span class="nt"&gt;--agent&lt;/span&gt; universal

&lt;span class="c"&gt;# All Dart skills&lt;/span&gt;
npx skills add dart-lang/skills &lt;span class="nt"&gt;--skill&lt;/span&gt; &lt;span class="s1"&gt;'*'&lt;/span&gt; &lt;span class="nt"&gt;--agent&lt;/span&gt; universal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A couple of things worth knowing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;--agent universal&lt;/code&gt; flag drops everything into the standard &lt;strong&gt;&lt;code&gt;.agents/skills&lt;/code&gt;&lt;/strong&gt; directory, the folder that compatible agents auto-discover. Commit it, and your whole team (and CI) inherits the same expertise.&lt;/li&gt;
&lt;li&gt;Want a subset instead of &lt;code&gt;'*'&lt;/code&gt;? Swap in a specific skill name, e.g. &lt;code&gt;--skill flutter-build-responsive-layout&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Update later with &lt;code&gt;npx skills update&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once they're installed, my favorite first move is to just &lt;em&gt;ask the agent what it's now capable of&lt;/em&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Review my &lt;code&gt;.agents/skills&lt;/code&gt; directory. Which installed skills can help with making this dashboard work on tablet and desktop?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It'll surface &lt;code&gt;flutter-build-responsive-layout&lt;/code&gt;, explain the plan, and you're off; no guessing which incantation to type.&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest take: hype or substance?
&lt;/h2&gt;

&lt;p&gt;When these dropped, the Flutter corner of X and Reddit did what it always does: screenshots, "AI coding just changed again," the works. I want to be straight with you, because hype helps no one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The substance is real.&lt;/strong&gt; This isn't a vibe. It's first-party, task-scoped guidance that demonstrably stops common, &lt;em&gt;current&lt;/em&gt; mistakes, and because of progressive disclosure it does so while &lt;em&gt;lowering&lt;/em&gt; token usage and &lt;em&gt;raising&lt;/em&gt; accuracy. For the unglamorous 80% of app work (layouts, routing, serialization, localization, tests), it's a genuine upgrade. I've shipped cleaner first drafts because of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Here's the caveat.&lt;/strong&gt; Skills are guardrails, not a brain transplant. They make your agent reliably &lt;em&gt;correct&lt;/em&gt; on known tasks; they don't make it &lt;em&gt;inventive&lt;/em&gt; on novel architecture, and they won't save a vague prompt. The initial set is deliberately scoped to "the most common hurdles"; the Flutter team has been explicit that this is a starting point, not a finished product, and they're openly asking the community to file issues and contribute more.&lt;/p&gt;

&lt;p&gt;So the right framing isn't "AI can now build my app." It's: &lt;strong&gt;your agent just stopped failing the easy stuff, so you can spend your judgment on the hard stuff.&lt;/strong&gt; That's a trade I'll take every single day.&lt;/p&gt;

&lt;p&gt;If you're already using Cursor, Claude Code, Antigravity, or any agent that reads &lt;code&gt;.agents/skills&lt;/code&gt;, there's basically no reason not to try this on your next task.&lt;/p&gt;

&lt;h2&gt;
  
  
  Write your own skill
&lt;/h2&gt;

&lt;p&gt;Once you feel the difference, you'll want to encode &lt;em&gt;your&lt;/em&gt; team's patterns: your state management choice, your folder conventions, your API client. Because a skill is just a Markdown folder, that's very doable. The shape that works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Create the folder&lt;/strong&gt; under &lt;code&gt;.agents/skills/your-skill-name/&lt;/code&gt; with a &lt;code&gt;SKILL.md&lt;/code&gt; inside.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write tight metadata.&lt;/strong&gt; A name and a one-line "use this when…" description. This is the part the agent reads first, so make the trigger condition unambiguous.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State the rules and anti-patterns.&lt;/strong&gt; Don't just say what to do; say what &lt;em&gt;not&lt;/em&gt; to do. The "don't use &lt;code&gt;orientation&lt;/code&gt;" line is what makes the responsive skill bulletproof.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add a workflow checklist.&lt;/strong&gt; Spell out the steps as a literal task list the agent can march through and self-verify.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Include a runnable example.&lt;/strong&gt; One canonical, correct snippet is worth a thousand adjectives.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Then evaluate it like code: try it, watch where the agent still drifts, tighten the wording, repeat. Treat skill-writing as a loop (create, test, refine), not a one-and-done config file. The teams that win with AI in 2026 aren't the ones with the best model; they're the ones with the best-encoded knowledge.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What are Flutter Agent Skills?&lt;/strong&gt;&lt;br&gt;
They're folders of Markdown instructions (&lt;code&gt;SKILL.md&lt;/code&gt; files) that give an AI coding agent task-specific, expert Flutter and Dart workflows. They're maintained officially by the Dart and Flutter teams in the &lt;a href="https://github.com/flutter/skills" rel="noopener noreferrer"&gt;&lt;code&gt;flutter/skills&lt;/code&gt;&lt;/a&gt; and &lt;a href="https://github.com/dart-lang/skills" rel="noopener noreferrer"&gt;&lt;code&gt;dart-lang/skills&lt;/code&gt;&lt;/a&gt; repositories.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How are Agent Skills different from a rules file?&lt;/strong&gt;&lt;br&gt;
Rules configure the agent's general behavior across &lt;em&gt;all&lt;/em&gt; tasks. A skill gives step-by-step instructions for &lt;em&gt;one specific&lt;/em&gt; job, and is loaded only when that job comes up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do skills relate to the Dart &amp;amp; Flutter MCP server?&lt;/strong&gt;&lt;br&gt;
MCP provides the tools (hot reload, runtime errors, analysis). Skills provide the know-how to use those tools correctly. They're complementary layers, not alternatives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where do skills get installed?&lt;/strong&gt;&lt;br&gt;
Into the &lt;code&gt;.agents/skills&lt;/code&gt; directory in your project, which compatible agents discover automatically. Use &lt;code&gt;npx skills add ... --agent universal&lt;/code&gt; to target that standard location.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need anything special to install them?&lt;/strong&gt;&lt;br&gt;
Just &lt;a href="https://nodejs.org/" rel="noopener noreferrer"&gt;Node.js&lt;/a&gt;, since the &lt;code&gt;skills&lt;/code&gt; CLI runs via &lt;code&gt;npx&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does this only work with one specific AI tool?&lt;/strong&gt;&lt;br&gt;
No. The &lt;code&gt;.agents/skills&lt;/code&gt; convention is designed to be picked up by compatible agents broadly, which is the point of the &lt;code&gt;universal&lt;/code&gt; flag.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Will skills make my AI write perfect apps?&lt;/strong&gt;&lt;br&gt;
No, and that's the honest part. They make it reliably correct on common, well-defined tasks. Novel architecture and vague prompts are still on you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrapping up
&lt;/h2&gt;

&lt;p&gt;The story of AI-assisted Flutter in 2026 isn't "the model got smarter." It's "we got better at telling the model what we already know." Agent Skills are the cleanest expression of that shift I've seen: take the hard-won patterns a senior Flutter dev carries in their head, write them down once, and hand them to your agent exactly when they're needed.&lt;/p&gt;

&lt;p&gt;Install the official sets, ask your agent which skills apply to your current task, and watch the rookie mistakes quietly disappear. Then start encoding your own. That's where the real leverage is.&lt;/p&gt;

&lt;p&gt;If you try this, I want to hear how it goes, especially which skill saved you the most correcting. Drop it in the comments, and if this was useful, follow me for more on Flutter, Dart, and building with AI agents without losing the plot. 🥊&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://docs.flutter.dev/ai/agent-skills" rel="noopener noreferrer"&gt;Flutter docs: Agent skills&lt;/a&gt;, &lt;a href="https://github.com/flutter/skills" rel="noopener noreferrer"&gt;flutter/skills&lt;/a&gt;, &lt;a href="https://github.com/dart-lang/skills" rel="noopener noreferrer"&gt;dart-lang/skills&lt;/a&gt;, and the official Dart &amp;amp; Flutter team launch announcement (May 2026).&lt;/em&gt;&lt;/p&gt;

</description>
      <category>flutter</category>
      <category>dart</category>
      <category>ai</category>
      <category>programming</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Thu, 11 Jun 2026 11:12:45 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/-4lam</link>
      <guid>https://dev.to/sayed_ali_alkamel/-4lam</guid>
      <description>&lt;div class="ltag__link--embedded"&gt;
  &lt;div class="crayons-story "&gt;
  &lt;a href="https://dev.to/sayed_ali_alkamel/agentic-flutter-development-your-ai-agent-just-got-hot-reload-me5" class="crayons-story__hidden-navigation-link"&gt;Agentic Flutter Development: Your AI Agent Just Got Hot Reload 🔥&lt;/a&gt;


  &lt;div class="crayons-story__body crayons-story__body-full_post"&gt;
    &lt;div class="crayons-story__top"&gt;
      &lt;div class="crayons-story__meta"&gt;
        &lt;div class="crayons-story__author-pic"&gt;

          &lt;a href="/sayed_ali_alkamel" class="crayons-avatar  crayons-avatar--l  "&gt;
            &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2652218%2F63a5dfd1-8229-48c1-85eb-54a58560297f.jpg" alt="sayed_ali_alkamel profile" class="crayons-avatar__image" width="96" height="96"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
        &lt;div&gt;
          &lt;div&gt;
            &lt;a href="/sayed_ali_alkamel" class="crayons-story__secondary fw-medium m:hidden"&gt;
              Sayed Ali Alkamel
            &lt;/a&gt;
            &lt;div class="profile-preview-card relative mb-4 s:mb-0 fw-medium hidden m:inline-block"&gt;
              
                Sayed Ali Alkamel
                
              
              &lt;div id="story-author-preview-content-3865541" class="profile-preview-card__content crayons-dropdown branded-7 p-4 pt-0"&gt;
                &lt;div class="gap-4 grid"&gt;
                  &lt;div class="-mt-4"&gt;
                    &lt;a href="/sayed_ali_alkamel" class="flex"&gt;
                      &lt;span class="crayons-avatar crayons-avatar--xl mr-2 shrink-0"&gt;
                        &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2652218%2F63a5dfd1-8229-48c1-85eb-54a58560297f.jpg" class="crayons-avatar__image" alt="" width="96" height="96"&gt;
                      &lt;/span&gt;
                      &lt;span class="crayons-link crayons-subtitle-2 mt-5"&gt;Sayed Ali Alkamel&lt;/span&gt;
                    &lt;/a&gt;
                  &lt;/div&gt;
                  &lt;div class="print-hidden"&gt;
                    
                      Follow
                    
                  &lt;/div&gt;
                  &lt;div class="author-preview-metadata-container"&gt;&lt;/div&gt;
                &lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;

          &lt;/div&gt;
          &lt;a href="https://dev.to/sayed_ali_alkamel/agentic-flutter-development-your-ai-agent-just-got-hot-reload-me5" class="crayons-story__tertiary fs-xs"&gt;&lt;time&gt;Jun 11&lt;/time&gt;&lt;span class="time-ago-indicator-initial-placeholder"&gt;&lt;/span&gt;&lt;/a&gt;
        &lt;/div&gt;
      &lt;/div&gt;

    &lt;/div&gt;

    &lt;div class="crayons-story__indention"&gt;
      &lt;h2 class="crayons-story__title crayons-story__title-full_post"&gt;
        &lt;a href="https://dev.to/sayed_ali_alkamel/agentic-flutter-development-your-ai-agent-just-got-hot-reload-me5" id="article-link-3865541"&gt;
          Agentic Flutter Development: Your AI Agent Just Got Hot Reload 🔥
        &lt;/a&gt;
      &lt;/h2&gt;
        &lt;div class="crayons-story__tags"&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/flutter"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;flutter&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/ai"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;ai&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/dart"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;dart&lt;/a&gt;
            &lt;a class="crayons-tag  crayons-tag--monochrome " href="/t/mcp"&gt;&lt;span class="crayons-tag__prefix"&gt;#&lt;/span&gt;mcp&lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="crayons-story__bottom"&gt;
        &lt;div class="crayons-story__details"&gt;
          &lt;a href="https://dev.to/sayed_ali_alkamel/agentic-flutter-development-your-ai-agent-just-got-hot-reload-me5" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left"&gt;
            &lt;div class="multiple_reactions_aggregate"&gt;
              &lt;span class="multiple_reactions_icons_container"&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/exploding-head-daceb38d627e6ae9b730f36a1e390fca556a4289d5a41abb2c35068ad3e2c4b5.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/multi-unicorn-b44d6f8c23cdd00964192bedc38af3e82463978aa611b4365bd33a0f1f4f3e97.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
                  &lt;span class="crayons_icon_container"&gt;
                    &lt;img src="https://assets.dev.to/assets/sparkle-heart-5f9bee3767e18deb1bb725290cb151c25234768a0e9a2bd39370c382d02920cf.svg" width="24" height="24"&gt;
                  &lt;/span&gt;
              &lt;/span&gt;
              &lt;span class="aggregate_reactions_counter"&gt;5&lt;span class="hidden s:inline"&gt;&amp;nbsp;reactions&lt;/span&gt;&lt;/span&gt;
            &lt;/div&gt;
          &lt;/a&gt;
            &lt;a href="https://dev.to/sayed_ali_alkamel/agentic-flutter-development-your-ai-agent-just-got-hot-reload-me5#comments" class="crayons-btn crayons-btn--s crayons-btn--ghost crayons-btn--icon-left flex items-center"&gt;
              

              &lt;span class="hidden s:inline"&gt;Add&amp;nbsp;Comment&lt;/span&gt;
            &lt;/a&gt;
        &lt;/div&gt;
        &lt;div class="crayons-story__save"&gt;
          &lt;small class="crayons-story__tertiary fs-xs mr-2"&gt;
            10 min read
          &lt;/small&gt;
            
              &lt;span class="bm-initial crayons-icon c-btn__icon"&gt;
                

              &lt;/span&gt;
              &lt;span class="bm-success crayons-icon c-btn__icon"&gt;
                

              &lt;/span&gt;
            
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/div&gt;
&lt;/div&gt;

&lt;/div&gt;


</description>
    </item>
    <item>
      <title>Agentic Flutter Development: Your AI Agent Just Got Hot Reload 🔥</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Thu, 11 Jun 2026 11:12:22 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/agentic-flutter-development-your-ai-agent-just-got-hot-reload-me5</link>
      <guid>https://dev.to/sayed_ali_alkamel/agentic-flutter-development-your-ai-agent-just-got-hot-reload-me5</guid>
      <description>&lt;p&gt;Fellow denizens of the digital age: your Flutter app has spent its entire life as a sealed aquarium.&lt;/p&gt;

&lt;p&gt;You could watch the fish swim. Your tools could watch. But the AI "assistant" next to you was functionally blind. It wrote code &lt;em&gt;about&lt;/em&gt; your app without ever seeing it run. It suggested a fix, you copy-pasted, hot-reloaded, squinted at the simulator, described what broke, and repeated.&lt;/p&gt;

&lt;p&gt;You weren't pair programming. You were a courier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agentic Flutter development&lt;/strong&gt; is what happens when someone hands the AI a key to the aquarium. And in 2026, Google didn't just hand over the key. They shipped it in the SDK.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Agentic Flutter development means AI agents that analyze, edit, run, &lt;em&gt;observe&lt;/em&gt;, and fix Flutter apps in a closed feedback loop, not just generate code snippets.&lt;/li&gt;
&lt;li&gt;The official Dart and Flutter MCP server (Dart 3.9+ / Flutter 3.35+) exposes the analyzer, hot reload, widget inspection, screenshots, tests, and pub.dev search to any MCP-compatible agent.&lt;/li&gt;
&lt;li&gt;Flutter 3.44, released at Google I/O 2026, introduced Agentic Hot Reload: agents auto-discover your running app and reload it after every edit.&lt;/li&gt;
&lt;li&gt;The Flutter team's April 2026 strategy post reports that 79% of Flutter developers already use AI assistants, while a 46% trust gap remains around AI accuracy.&lt;/li&gt;
&lt;li&gt;The Flutter Extension for Gemini CLI (launched October 2025) bundles the MCP server with commands like &lt;code&gt;/create-app&lt;/code&gt;, &lt;code&gt;/modify&lt;/code&gt;, and &lt;code&gt;/commit&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;It works with Claude Code, Cursor, Antigravity, Gemini CLI, and any client speaking MCP over stdio.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Is Agentic Flutter Development?
&lt;/h2&gt;

&lt;p&gt;Agentic Flutter development is a workflow where AI agents use the Model Context Protocol (MCP) to directly operate Flutter's developer tools: running the analyzer, editing code, triggering hot reload, inspecting widgets, and reading runtime errors. The agent completes tasks autonomously in a closed feedback loop instead of merely suggesting code for a human to apply.&lt;/p&gt;

&lt;p&gt;That's the citable definition. Now the planetarium version.&lt;/p&gt;

&lt;p&gt;For a decade, hot reload was Flutter's superpower. &lt;em&gt;For humans.&lt;/em&gt; Sub-second feedback made us faster than everyone else at the UI game.&lt;/p&gt;

&lt;p&gt;Agentic development asks a deliciously simple question: what if the agent gets the superpower too?&lt;/p&gt;

&lt;p&gt;The answer arrived in stages. First, the Dart team shipped an experimental MCP server exposing the full toolchain. Then Flutter 3.44 landed at Google I/O on &lt;strong&gt;May 19, 2026&lt;/strong&gt;, and the Dart Tooling Daemon started automatically advertising your running app's connection to the MCP server. Now an agent edits your code, the app reloads itself, the agent looks at the result, and iterates.&lt;/p&gt;

&lt;p&gt;The aquarium is open.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Does Agentic Flutter Development Actually Work?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The MCP server is the agent's nervous system
&lt;/h3&gt;

&lt;p&gt;One sentence first: the Dart and Flutter MCP server translates between "agent speak" (MCP tool calls) and Flutter's actual developer tooling.&lt;/p&gt;

&lt;p&gt;MCP is often described as USB-C for AI. One standard port, any agent can plug in. Through it, agents can analyze and fix errors, resolve symbols, hot reload, capture screenshots, inspect the selected widget, manage &lt;code&gt;pubspec.yaml&lt;/code&gt;, run tests, and search pub.dev.&lt;/p&gt;

&lt;p&gt;Pause on that last one. Community doc servers cover over &lt;strong&gt;50,000 packages&lt;/strong&gt; on pub.dev. If you spent just ten minutes evaluating each package, that's 500,000 minutes. Roughly 347 days of non-stop, no-sleep package review. Nearly a year of your life.&lt;/p&gt;

&lt;p&gt;The agent queries it in seconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  The closed loop, step by step
&lt;/h3&gt;

&lt;p&gt;Here's the anatomy of an agentic fix, in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You say: &lt;em&gt;"The checkout screen overflows on small phones. Fix it."&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;The agent calls the MCP server to fetch runtime errors from your &lt;strong&gt;running&lt;/strong&gt; app.&lt;/li&gt;
&lt;li&gt;It inspects the widget tree, takes a screenshot, and reads the actual &lt;code&gt;RenderFlex&lt;/code&gt; overflow.&lt;/li&gt;
&lt;li&gt;It edits the code and triggers hot reload through the Dart Tooling Daemon. No human courier.&lt;/li&gt;
&lt;li&gt;It re-screenshots, verifies the fix visually, then runs the analyzer and your tests.&lt;/li&gt;
&lt;li&gt;It reports back, ideally with the smugness disabled.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Think of the MCP server as a hospital intercom. The agent doesn't wander the halls; it pages exactly the instrument it needs. Technically, what's happening is JSON-RPC tool invocations over stdio, with the tooling daemon brokering access to the VM service of your live app.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting it up takes one config block
&lt;/h3&gt;

&lt;p&gt;It works with Claude Code, Cursor, Windsurf, and friends. For Claude Code, drop this into your project's &lt;code&gt;.mcp.json&lt;/code&gt; (requires Dart 3.9+ on your PATH):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"dart"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dart"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"mcp-server"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Gemini CLI, one command installs the Flutter extension &lt;em&gt;and&lt;/em&gt; auto-configures the MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Requires Gemini CLI installed; extension is experimental (alpha)&lt;/span&gt;
gemini extensions &lt;span class="nb"&gt;install &lt;/span&gt;https://github.com/gemini-cli-extensions/flutter
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, I know what you're thinking: &lt;em&gt;"Sayed, isn't this just vibe coding with extra steps?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Fair question. Here's where it gets interesting. Vibe coding is &lt;strong&gt;open-loop&lt;/strong&gt;: the model guesses and you pray. Agentic development is &lt;strong&gt;closed-loop&lt;/strong&gt;: every change gets checked against the analyzer, the test suite, and the actual rendered pixels of a running app. It's the difference between throwing darts blindfolded and throwing darts while watching the board.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkijk3tjz4tkrg2dpnje3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkijk3tjz4tkrg2dpnje3.png" alt="Diagram of the agentic Flutter feedback loop: agent to MCP server to analyzer, hot reload, and running app, with observations flowing back to the agent" width="799" height="436"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Which Agentic Flutter Tools Should You Use?
&lt;/h2&gt;

&lt;p&gt;As of June 2026, here's the honest landscape:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Maintainer&lt;/th&gt;
&lt;th&gt;Requirements&lt;/th&gt;
&lt;th&gt;Standout capability&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Dart &amp;amp; Flutter MCP server&lt;/td&gt;
&lt;td&gt;Google (Dart/Flutter teams)&lt;/td&gt;
&lt;td&gt;Dart 3.9+ / Flutter 3.35+&lt;/td&gt;
&lt;td&gt;Full first-party toolchain: analyzer, hot reload, widget inspection, tests&lt;/td&gt;
&lt;td&gt;Experimental, official&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flutter Extension for Gemini CLI&lt;/td&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;Gemini CLI; auto-configures MCP server&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;/create-app&lt;/code&gt;, &lt;code&gt;/modify&lt;/code&gt;, &lt;code&gt;/commit&lt;/code&gt; workflow commands&lt;/td&gt;
&lt;td&gt;Alpha (launched Oct 2025)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agentic Hot Reload&lt;/td&gt;
&lt;td&gt;Google (Flutter 3.44)&lt;/td&gt;
&lt;td&gt;Flutter 3.44 (May 19, 2026)&lt;/td&gt;
&lt;td&gt;Auto-discovers running app, zero-config reload after agent edits&lt;/td&gt;
&lt;td&gt;Stable in 3.44&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;mcp_flutter&lt;/code&gt; (Arenukvern)&lt;/td&gt;
&lt;td&gt;Community&lt;/td&gt;
&lt;td&gt;Flutter app + package&lt;/td&gt;
&lt;td&gt;27 tools incl. semantic snapshots, tapping widgets, typing into forms&lt;/td&gt;
&lt;td&gt;Active, community&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;flutter-mcp&lt;/code&gt; (adamsmaka)&lt;/td&gt;
&lt;td&gt;Community&lt;/td&gt;
&lt;td&gt;Any MCP client&lt;/td&gt;
&lt;td&gt;Version-accurate docs for 50,000+ pub.dev packages (kills hallucinated APIs)&lt;/td&gt;
&lt;td&gt;Active, community&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Notice the pattern: the official server controls &lt;em&gt;your tools&lt;/em&gt;, while the strongest community servers let agents control &lt;em&gt;your app itself&lt;/em&gt; or ground them in &lt;em&gt;accurate documentation&lt;/em&gt;. They stack beautifully.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Should Flutter Developers Care About Agentic Development?
&lt;/h2&gt;

&lt;p&gt;The numbers say you mostly already do. The Flutter team reports &lt;strong&gt;79% of Flutter developers&lt;/strong&gt; use AI assistants. In a ten-person standup, that's eight of you. The question is no longer adoption. It's depth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Imagine&lt;/strong&gt; a design-system migration where the agent doesn't just rewrite 200 widget files. It hot-reloads after each batch, screenshots golden screens, and flags the three that visually regressed. Overnight. While you sleep, it works the backlog.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Imagine&lt;/strong&gt; onboarding a junior developer whose "senior pair" answers &lt;em&gt;"why is this screen janky?"&lt;/em&gt; by actually attaching to the running app and reading the raster stats, instead of pasting a generic performance checklist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Imagine&lt;/strong&gt; killing the manual tax you pay today. An agent iterating 40 times on a tricky UI costs you roughly 90 seconds of copy-paste-reload-describe per round. That's 3,600 seconds. &lt;strong&gt;A full hour of being a courier&lt;/strong&gt;, eliminated per feature. Meanwhile, on a 120Hz display, every time you blink (about 300ms), Impeller has already drawn 36 frames. The machines were never the bottleneck. The ferrying was.&lt;/p&gt;

&lt;p&gt;And the direction of travel is explicit. Mariam Hasnany of the Flutter team laid out the 2026 strategy around three personas (traditional, AI-assisted, and AI-first developers), anchored by principles like "Humans first" and "Add, don't replace": Dart stays readable for people even when agents write it. The team is also collaborating on guidance for Flutter with Antigravity and the "Vibe once, deploy everywhere" concept introduced at Google I/O 2025.&lt;/p&gt;

&lt;p&gt;The official account has been beating this drum publicly too:&lt;/p&gt;

&lt;p&gt;&lt;iframe class="tweet-embed" id="tweet-1979230928820908084-856" src="https://platform.twitter.com/embed/Tweet.html?id=1979230928820908084"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-1979230928820908084-856');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=1979230928820908084&amp;amp;theme=dark"
  }



&lt;/p&gt;




&lt;h2&gt;
  
  
  Limitations and Honest Caveats
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Almost everything here is labeled experimental or alpha.&lt;/strong&gt; The official MCP server is explicitly experimental and evolving quickly; the Gemini CLI extension is alpha. APIs will shift under you. Pin versions. Read changelogs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The trust gap is real and measured.&lt;/strong&gt; The same Flutter team post that celebrates 79% adoption reports a &lt;strong&gt;46% trust gap&lt;/strong&gt; around AI accuracy. Agents confidently invoke deprecated APIs. That's precisely why doc-grounding servers exist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Closed loops amplify both quality and mistakes.&lt;/strong&gt; An agent that can edit, reload, and "verify" can also convince itself a wrong fix worked because the screenshot &lt;em&gt;looked&lt;/em&gt; fine. Visual confirmation is not semantic correctness. Your test suite is now load-bearing infrastructure for your AI teammates. Write it accordingly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Cost and security.&lt;/strong&gt; Agents that think before every tool call burn tokens. And an agent with shell access plus a live VM service connection deserves the same scrutiny you'd give a new hire with prod credentials.&lt;/p&gt;

&lt;p&gt;And the humility beat, because you've earned it. Honestly? Nobody knows where the reliability ceiling for these agents lands. Not Google, not the MCP authors, not me. That 46% trust gap isn't a marketing problem to message away; it's an open research question we're all living inside. And that should excite you, because the people figuring it out in public are the ones writing the next decade's best practices.&lt;/p&gt;




&lt;h2&gt;
  
  
  How to Get Started With Agentic Flutter Development
&lt;/h2&gt;

&lt;p&gt;Time to first agentic hot reload: &lt;strong&gt;about 15 minutes.&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Upgrade: &lt;code&gt;flutter upgrade&lt;/code&gt;. You want Flutter 3.44+ for Agentic Hot Reload (minimum Dart 3.9 / Flutter 3.35 for the MCP server).&lt;/li&gt;
&lt;li&gt;Pick a client: Claude Code, Cursor, Gemini CLI, or Antigravity.&lt;/li&gt;
&lt;li&gt;Wire the server: add the &lt;code&gt;.mcp.json&lt;/code&gt; block above, or run &lt;code&gt;gemini extensions install https://github.com/gemini-cli-extensions/flutter&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Run your app with &lt;code&gt;flutter run&lt;/code&gt;. On 3.44 the tooling daemon exposes the connection automatically.&lt;/li&gt;
&lt;li&gt;Give it a real task: &lt;em&gt;"Find and fix the overflow on the settings screen, verify with a screenshot, then run the tests."&lt;/em&gt; Watch the loop close.&lt;/li&gt;
&lt;li&gt;Add guardrails: a strong &lt;code&gt;analysis_options.yaml&lt;/code&gt;, meaningful tests, and an &lt;code&gt;AGENTS.md&lt;/code&gt; or &lt;code&gt;CLAUDE.md&lt;/code&gt; describing your conventions.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does the Dart and Flutter MCP server work with Claude Code and Cursor?
&lt;/h3&gt;

&lt;p&gt;Yes. The server speaks MCP over standard I/O, so it works with any compliant client, including Claude Code, Cursor, Windsurf, Gemini CLI, and Antigravity. Clients that support Tools, Resources, and Roots get the full feature set. If your client mishandles roots, launch the server with the &lt;code&gt;--force-roots-fallback&lt;/code&gt; flag.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do I need Flutter 3.44 for agentic development?
&lt;/h3&gt;

&lt;p&gt;No, but you want it. The MCP server itself requires only Dart 3.9 / Flutter 3.35. Flutter 3.44 (Google I/O, May 2026) adds Agentic Hot Reload, where the tooling daemon auto-exposes your running app so agents reload it with zero configuration. It's the single biggest quality-of-life jump in the whole workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is agentic Flutter development free?
&lt;/h3&gt;

&lt;p&gt;The tooling is. The Dart and Flutter MCP server ships with the SDK as part of the open-source ecosystem, and the Gemini CLI Flutter extension is free to install. Your real cost is model inference: agentic loops make many tool calls per task, so token usage runs meaningfully higher than chat-style assistance. Budget accordingly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Will AI agents replace Flutter developers?
&lt;/h3&gt;

&lt;p&gt;The evidence points to role transformation, not replacement. The Flutter team's own strategy is explicitly "Humans first" and "Add, don't replace": Dart stays optimized for human readability even when agents generate it. The skill being devalued is fast widget-tree transcription. The skills appreciating are system design, review judgment, and writing the guardrails agents operate inside.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is Agentic Hot Reload in Flutter 3.44?
&lt;/h3&gt;

&lt;p&gt;It's the mechanism, shipped May 19, 2026, by which the Dart Tooling Daemon automatically shares your running app's connection details with the MCP server. After an agent edits code, the app hot-reloads itself and the agent observes the result. An edit-verify cycle that previously needed a human collapses from minutes to seconds.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Zoom-Out
&lt;/h2&gt;

&lt;p&gt;Step back far enough and this isn't a story about Flutter, or even about AI. Every leap in software history has been a leap in &lt;em&gt;feedback speed&lt;/em&gt;: punch cards to terminals, compile cycles to REPLs, restart to hot reload. Each one didn't just make programmers faster. It changed who got to be a programmer at all.&lt;/p&gt;

&lt;p&gt;Agentic Flutter development is the next compression. The loop that once ran through your fingers now runs through a protocol, and your job migrates up the stack: from typing the change to defining what "correct" means. From Muscat to Mountain View, the developers who thrive won't be the ones who fear the agent in the aquarium. They'll be the ones who learned to be excellent aquarium architects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The future of Flutter isn't humans or agents writing code. It's humans deciding what's worth building, with agents who finally got hot reload.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;Enjoyed the ride? Hit the ❤️ and 🦄, drop a comment with your take, I read every one. Follow me here on Dev.to and on X &lt;a href="https://x.com/Sayed3li97" rel="noopener noreferrer"&gt;@Sayed3li97&lt;/a&gt; for more deep dives where tech meets wonder.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Hasnany, M. "How Dart and Flutter are thinking about AI in 2026," Flutter Blog (Apr 2026). &lt;a href="https://blog.flutter.dev/how-dart-and-flutter-are-thinking-about-ai-in-2026-e2fd64e1fdd0" rel="noopener noreferrer"&gt;https://blog.flutter.dev/how-dart-and-flutter-are-thinking-about-ai-in-2026-e2fd64e1fdd0&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;"Dart and Flutter MCP server," Flutter Docs. &lt;a href="https://docs.flutter.dev/ai/mcp-server" rel="noopener noreferrer"&gt;https://docs.flutter.dev/ai/mcp-server&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;"Flutter extension for Gemini CLI," Flutter Docs. &lt;a href="https://docs.flutter.dev/ai/gemini-cli-extension" rel="noopener noreferrer"&gt;https://docs.flutter.dev/ai/gemini-cli-extension&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;Ryan, J. "Meet the Flutter Extension for Gemini CLI," Flutter Blog (Oct 2025). &lt;a href="https://blog.flutter.dev/meet-the-flutter-extension-for-gemini-cli-f8be3643eaad" rel="noopener noreferrer"&gt;https://blog.flutter.dev/meet-the-flutter-extension-for-gemini-cli-f8be3643eaad&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;"Flutter 3.44 at Google I/O 2026," Very Good Ventures Blog (May 2026). &lt;a href="https://verygood.ventures/blog/google-io-2026/" rel="noopener noreferrer"&gt;https://verygood.ventures/blog/google-io-2026/&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;"Flutter 3.44: Agentic Hot Reload and SwiftPM Default," byteiota (May 2026). &lt;a href="https://byteiota.com/flutter-344-agentic-hot-reload-swiftpm-default/" rel="noopener noreferrer"&gt;https://byteiota.com/flutter-344-agentic-hot-reload-swiftpm-default/&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;"7 MCP Servers Every Dart and Flutter Developer Should Know," Very Good Ventures (Feb 2026). &lt;a href="https://verygood.ventures/blog/7-mcp-servers-every-dart-and-flutter-developer-should-know/" rel="noopener noreferrer"&gt;https://verygood.ventures/blog/7-mcp-servers-every-dart-and-flutter-developer-should-know/&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;Arenukvern. &lt;code&gt;mcp_flutter&lt;/code&gt;, GitHub. &lt;a href="https://github.com/Arenukvern/mcp_flutter" rel="noopener noreferrer"&gt;https://github.com/Arenukvern/mcp_flutter&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;adamsmaka. &lt;code&gt;flutter-mcp&lt;/code&gt;, GitHub. &lt;a href="https://github.com/adamsmaka/flutter-mcp" rel="noopener noreferrer"&gt;https://github.com/adamsmaka/flutter-mcp&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;@FlutterDev on X. MCP server auto-configuration announcement. &lt;a href="https://x.com/FlutterDev/status/1979230928820908084" rel="noopener noreferrer"&gt;https://x.com/FlutterDev/status/1979230928820908084&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;li&gt;"Create with AI," Flutter Docs (GenUI SDK, Genkit Dart, Antigravity). &lt;a href="https://docs.flutter.dev/ai/create-with-ai" rel="noopener noreferrer"&gt;https://docs.flutter.dev/ai/create-with-ai&lt;/a&gt; (accessed Jun 10, 2026)&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>flutter</category>
      <category>ai</category>
      <category>dart</category>
      <category>mcp</category>
    </item>
    <item>
      <title>[Boost]</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Sun, 15 Mar 2026 12:20:17 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/-3fgm</link>
      <guid>https://dev.to/sayed_ali_alkamel/-3fgm</guid>
      <description>&lt;div class="ltag__link"&gt;
  &lt;a href="/sayed_ali_alkamel" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__pic"&gt;
      &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2652218%2F63a5dfd1-8229-48c1-85eb-54a58560297f.jpg" alt="sayed_ali_alkamel"&gt;
    &lt;/div&gt;
  &lt;/a&gt;
  &lt;a href="https://dev.to/sayed_ali_alkamel/vibe-coding-flutter-the-senior-devs-honest-take-1k0f" class="ltag__link__link"&gt;
    &lt;div class="ltag__link__content"&gt;
      &lt;h2&gt;Vibe Coding Flutter: The Senior Dev's Honest Take&lt;/h2&gt;
      &lt;h3&gt;Sayed Ali Alkamel ・ Mar 13&lt;/h3&gt;
      &lt;div class="ltag__link__taglist"&gt;
        &lt;span class="ltag__link__tag"&gt;#flutter&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#dart&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#ai&lt;/span&gt;
        &lt;span class="ltag__link__tag"&gt;#vibecoding&lt;/span&gt;
      &lt;/div&gt;
    &lt;/div&gt;
  &lt;/a&gt;
&lt;/div&gt;


</description>
      <category>flutter</category>
      <category>dart</category>
      <category>ai</category>
      <category>vibecoding</category>
    </item>
    <item>
      <title>Vibe Coding Flutter: The Senior Dev's Honest Take</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Fri, 13 Mar 2026 12:46:50 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/vibe-coding-flutter-the-senior-devs-honest-take-1k0f</link>
      <guid>https://dev.to/sayed_ali_alkamel/vibe-coding-flutter-the-senior-devs-honest-take-1k0f</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;I'm a Google Developer Expert in Dart &amp;amp; Flutter. This is my honest, research-backed take on vibe coding — with real tweets, real code, and a real workflow you can use Monday morning.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  ⚡ TL;DR — The Five Things You Need to Know
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Vibe coding is AI-assisted dev where &lt;strong&gt;intent replaces syntax&lt;/strong&gt; — coined by Andrej Karpathy in Feb 2025, now a legitimate workflow&lt;/li&gt;
&lt;li&gt;Flutter is one of the &lt;strong&gt;best frameworks for vibe coding&lt;/strong&gt; because Dart is strongly-typed, widget trees are predictable, and hot-reload is unbeatable&lt;/li&gt;
&lt;li&gt;It genuinely crushes boilerplate, scaffolding, tests, and Figma-to-widget conversion&lt;/li&gt;
&lt;li&gt;It breaks badly on state management, platform channels, and performance tuning — &lt;strong&gt;you still need to read the code&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;The power move: pair &lt;code&gt;AGENTS.md&lt;/code&gt; + MCP servers + a clear PRD, and treat AI as your most productive junior dev&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Where It Started: A Throwaway Tweet That Changed Everything
&lt;/h2&gt;

&lt;p&gt;On February 2, 2025, Andrej Karpathy — OpenAI co-founder, former Tesla AI Senior Director — posted something that wasn't supposed to be a manifesto. It was a shower thought about his weekend hobby.&lt;/p&gt;

&lt;p&gt;It got &lt;strong&gt;million of views&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;

&lt;iframe class="tweet-embed" id="tweet-1886192184808149383-570" src="https://platform.twitter.com/embed/Tweet.html?id=1886192184808149383"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-1886192184808149383-570');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=1886192184808149383&amp;amp;theme=dark"
  }





&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs are getting too good. I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I accept all, I don't read the diffs anymore. I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;@karpathy&lt;/strong&gt;, Feb 2, 2025 · &lt;a href="https://x.com/karpathy/status/1886192184808149383" rel="noopener noreferrer"&gt;x.com/karpathy/status/1886192184808149383&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Exactly one year later, Karpathy revisited the post. The word "vibe coding" had earned a Merriam-Webster entry (March 2025) and was named the Collins English Dictionary Word of the Year 2025 and spawned university courses. But his preferred term had quietly evolved:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Today, programming via LLM agents is increasingly becoming a default workflow for professionals, except with more oversight and scrutiny. My current favorite term is 'agentic engineering' — agentic because you're orchestrating agents who write the code, engineering because there's an art &amp;amp; science to it. It's something you can learn and become better at."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;@karpathy&lt;/strong&gt;, Feb 2026 retrospective · &lt;a href="https://x.com/karpathy/status/2019137879310836075" rel="noopener noreferrer"&gt;x.com/karpathy/status/2019137879310836075&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This evolution matters. What started as a "forget the code exists" attitude has matured into &lt;strong&gt;deliberate orchestration with oversight&lt;/strong&gt;. And that nuance is exactly what Flutter developers need to internalize.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Flutter Community Is Actually Saying on X
&lt;/h2&gt;

&lt;p&gt;Twitter / X has been the real-time laboratory for Flutter vibe coding opinions. Here's an honest cross-section — hype, skepticism, and people actually shipping things.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Enthusiasts: "It Unlocks Everything"
&lt;/h3&gt;

&lt;p&gt;

&lt;iframe class="tweet-embed" id="tweet-1906004591570825422-677" src="https://platform.twitter.com/embed/Tweet.html?id=1906004591570825422"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-1906004591570825422-677');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=1906004591570825422&amp;amp;theme=dark"
  }





&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Vibe coding with #Flutter basically lets anyone build whatever niche app they want in a couple of hours."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;@FlutterCarl&lt;/strong&gt; · &lt;a href="https://x.com/FlutterCarl/status/1906004591570825422" rel="noopener noreferrer"&gt;x.com/FlutterCarl/status/1906004591570825422&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;

&lt;iframe class="tweet-embed" id="tweet-1900578339770843172-196" src="https://platform.twitter.com/embed/Tweet.html?id=1900578339770843172"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-1900578339770843172-196');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=1900578339770843172&amp;amp;theme=dark"
  }





&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Preparing a complete Vibe coding with Flutter video... Stay tuned 👀"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;@mcflyDev (Gautier 💙)&lt;/strong&gt; · &lt;a href="https://x.com/mcflyDev/status/1900578339770843172" rel="noopener noreferrer"&gt;x.com/mcflyDev/status/1900578339770843172&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Can't wait to try Vibe coding Flutter apps in the Vide!"&lt;/em&gt; — reacting to Norbert Kozsir's Flutter AI-IDE that runs and tests widgets it creates, implements pixel-perfect widgets from screenshots, and writes code exactly the way you want.&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;&lt;a class="mentioned-user" href="https://dev.to/csells"&gt;@csells&lt;/a&gt; (Chris Sells)&lt;/strong&gt; · &lt;a href="https://x.com/csells/status/1903182124141908165" rel="noopener noreferrer"&gt;x.com/csells/status/1903182124141908165&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"🔴 LIVE Vibe Coding with @norbertkozsir @devangelslondon and @esratech! #VibeCoding #Flutter #Dart #FlutterCommunity"&lt;/em&gt; — 2,729 Views on the live session.&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;@FlutterComm&lt;/strong&gt; · &lt;a href="https://x.com/FlutterComm/status/1914001730355814887" rel="noopener noreferrer"&gt;x.com/FlutterComm/status/1914001730355814887&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Pragmatists: "It's a Multiplier, Not a Magic Wand"
&lt;/h3&gt;

&lt;p&gt;Andrea Bizzotto (codewithandrea.com), who maintains one of the most respected Flutter newsletters (22k+ subscribers), captured the nuanced position that most senior Flutter developers share:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"AI is a multiplier that amplifies both your skills and your mistakes. So learn to use it well, and don't feel like you need to go all-in. Sometimes the old-fashioned way of writing code manually is still the right call. The decision matrix is simple: compare prompting effort vs coding effort."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;Andrea Bizzotto&lt;/strong&gt; · &lt;a href="https://codewithandrea.com/newsletter/november-2025/" rel="noopener noreferrer"&gt;codewithandrea.com/newsletter/november-2025&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Skeptics: "Don't Actually Forget the Code Exists"
&lt;/h3&gt;

&lt;p&gt;Andrew Chen from a16z pointed out the macro trajectory that makes seasoned engineers uncomfortable:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Random thoughts/predictions on where vibe coding might go: most code will be written by the time-rich. Thus, most code will be written by kids/students rather than software engineers. This is the same trend as video, photos, and other social media — we are in the early innings..."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;@andrewchen&lt;/strong&gt; · March 9, 2025&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And the Dart language team itself signaled where things are heading — not pure vibe coding, but structured, type-safe AI tooling for Dart specifically:&lt;/p&gt;

&lt;p&gt;

&lt;iframe class="tweet-embed" id="tweet-2031499569096528324-155" src="https://platform.twitter.com/embed/Tweet.html?id=2031499569096528324"&gt;
&lt;/iframe&gt;

  // Detect dark theme
  var iframe = document.getElementById('tweet-2031499569096528324-155');
  if (document.body.className.includes('dark-theme')) {
    iframe.src = "https://platform.twitter.com/embed/Tweet.html?id=2031499569096528324&amp;amp;theme=dark"
  }





&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"AI apps just got a lot easier to build 🏗️ Genkit Dart (Preview) is officially out, bringing type-safety and a model-agnostic API to the Dart side. Support for Gemini, Claude, OpenAI. Type-safe AI flows. Dev UI for AI testing and traces. #Genkit #Dart #Flutter"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;@dart_lang&lt;/strong&gt; · &lt;a href="https://x.com/dart_lang/status/2031499569096528324" rel="noopener noreferrer"&gt;x.com/dart_lang/status/2031499569096528324&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Why Flutter Is Unusually Well-Suited for Vibe Coding
&lt;/h2&gt;

&lt;p&gt;Most vibe coding discussion centers on web (React, Next.js) or Python backends. But Flutter has structural properties that make it arguably &lt;strong&gt;better suited&lt;/strong&gt; for AI-assisted development than most frameworks. Here's why.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Dart's Strong Typing Catches AI Mistakes Early
&lt;/h3&gt;

&lt;p&gt;When an LLM generates incorrect Flutter code, Dart's type system screams immediately. You don't get mysterious runtime crashes at 2 AM — the &lt;strong&gt;compiler tells you exactly where the AI hallucinated&lt;/strong&gt; a method that doesn't exist.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="c1"&gt;// ❌ AI hallucinated this parameter — Dart catches it at compile time&lt;/span&gt;
&lt;span class="n"&gt;Widget&lt;/span&gt; &lt;span class="nf"&gt;buildCard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BuildContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Card&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nl"&gt;roundedCorners:&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Error: "The named parameter 'roundedCorners' isn't defined"&lt;/span&gt;
    &lt;span class="nl"&gt;child:&lt;/span&gt; &lt;span class="n"&gt;ListTile&lt;/span&gt;&lt;span class="p"&gt;(...),&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// ✅ What it should be — and you know immediately&lt;/span&gt;
&lt;span class="n"&gt;Widget&lt;/span&gt; &lt;span class="nf"&gt;buildCard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BuildContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Card&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nl"&gt;shape:&lt;/span&gt; &lt;span class="n"&gt;RoundedRectangleBorder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nl"&gt;borderRadius:&lt;/span&gt; &lt;span class="n"&gt;BorderRadius&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;circular&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nl"&gt;child:&lt;/span&gt; &lt;span class="n"&gt;ListTile&lt;/span&gt;&lt;span class="p"&gt;(...),&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Widget Trees Are Declarative — Perfect for AI Reasoning
&lt;/h3&gt;

&lt;p&gt;Flutter's declarative UI model maps almost 1:1 with how LLMs "think" about layout. When you say &lt;em&gt;"add a gradient background to this container with rounded corners and a shadow,"&lt;/em&gt; there's a direct, unambiguous translation to a &lt;code&gt;DecoratedBox&lt;/code&gt; with a &lt;code&gt;BoxDecoration&lt;/code&gt;. The AI doesn't have to guess about lifecycle methods or imperative DOM manipulation.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Hot Reload Is the Tightest Feedback Loop in Mobile Dev
&lt;/h3&gt;

&lt;p&gt;Vibe coding thrives on fast iteration: &lt;strong&gt;prompt → generate → verify → adjust&lt;/strong&gt;. Flutter's hot reload (now including &lt;a href="https://codewithandrea.com/newsletter/march-2025/" rel="noopener noreferrer"&gt;Flutter Web hot reload (stable since 3.35)&lt;/a&gt;) makes the verify step nearly instantaneous. You see AI output materialise on your device in under a second.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Dart Is Expressive but Readable
&lt;/h3&gt;

&lt;p&gt;Unlike Java or Kotlin, Dart is concise enough that AI-generated code stays readable. Unlike TypeScript in a React codebase, there's no JSX/CSS-in-JS impedance mismatch. A Flutter file generated by AI tends to look like Flutter code a human would write — which makes review dramatically faster.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Truth Matrix: When to Vibe, When to Think
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Generate 10 model classes from API spec&lt;/td&gt;
&lt;td&gt;✅ Use AI&lt;/td&gt;
&lt;td&gt;Pure mechanical translation, zero judgment needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build a multi-step onboarding flow UI&lt;/td&gt;
&lt;td&gt;✅ Use AI&lt;/td&gt;
&lt;td&gt;Declarative widgets, verify visually with hot reload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Write 40 widget tests for existing screens&lt;/td&gt;
&lt;td&gt;✅ Use AI&lt;/td&gt;
&lt;td&gt;Pattern-heavy, AI is exceptionally good at this&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fix a 2px pixel misalignment&lt;/td&gt;
&lt;td&gt;✋ Code it&lt;/td&gt;
&lt;td&gt;Low coding effort, high prompt effort — just fix it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architect a real-time sync feature&lt;/td&gt;
&lt;td&gt;✋ Code it&lt;/td&gt;
&lt;td&gt;Requires domain knowledge of your specific constraints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Implement fingerprint auth with fallback&lt;/td&gt;
&lt;td&gt;🔍 AI draft + review&lt;/td&gt;
&lt;td&gt;AI can scaffold, but you must audit every security line&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optimise a ListView with 10,000 items&lt;/td&gt;
&lt;td&gt;🔍 AI draft + review&lt;/td&gt;
&lt;td&gt;AI knows the patterns, you need to profile the output&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Rule of thumb from Andrea Bizzotto:&lt;/strong&gt; If the prompting effort is lower than the coding effort → use AI. If manually fixing it is faster than explaining it → just fix it manually.&lt;/p&gt;

&lt;h3&gt;
  
  
  What AI crushes ✅
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Scaffold generation (routes, services, models)&lt;/li&gt;
&lt;li&gt;Figma design → Flutter widget conversion&lt;/li&gt;
&lt;li&gt;Boilerplate (copyWith, fromJson, toJson, Freezed classes)&lt;/li&gt;
&lt;li&gt;Responsive layout skeletons&lt;/li&gt;
&lt;li&gt;Animation scaffolding (implicit animations)&lt;/li&gt;
&lt;li&gt;Simple CRUD screens with Firebase&lt;/li&gt;
&lt;li&gt;README and documentation generation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Where to tread carefully ⚠️
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Complex Riverpod / BLoC state logic&lt;/li&gt;
&lt;li&gt;Platform channel implementations&lt;/li&gt;
&lt;li&gt;Performance tuning (jank, Impeller issues)&lt;/li&gt;
&lt;li&gt;Custom painters and shaders&lt;/li&gt;
&lt;li&gt;Security-sensitive code (auth, storage, network)&lt;/li&gt;
&lt;li&gt;Deep navigation flows with guards&lt;/li&gt;
&lt;li&gt;Accessibility semantics&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Production-Grade Vibe Coding Workflow for Flutter
&lt;/h2&gt;

&lt;p&gt;Based on real-world patterns from Viktor Lidholt (Serverpod), the Globe.dev team, and senior Flutter practitioners — here's the workflow that actually holds up.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Write a PRD Before You Touch the AI
&lt;/h3&gt;

&lt;p&gt;A Product Requirements Document doesn't have to be formal — a Markdown file describing the feature, expected behavior, and constraints is enough. AI with context is an entirely different animal from AI without it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Context is everything.&lt;/strong&gt; It's like assigning a task to a junior engineer without any context — poor delivery is almost guaranteed. — &lt;a href="https://globe.dev/blog/beyond-vibe-coding-production-flutter-dart-ai/" rel="noopener noreferrer"&gt;globe.dev/blog/beyond-vibe-coding-production-flutter-dart-ai&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Step 2: Create an &lt;code&gt;AGENTS.md&lt;/code&gt; in Your Repo Root
&lt;/h3&gt;

&lt;p&gt;This file tells every AI agent about your architecture, state management choice, folder structure, code style, and package preferences. It's the &lt;strong&gt;single highest-leverage action&lt;/strong&gt; you can take. Without it, agents default to whatever they were trained on — which probably isn't your codebase.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# AGENTS.md — Flutter App Agent Instructions&lt;/span&gt;

&lt;span class="gu"&gt;## Architecture&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; State management: Riverpod (hooks_riverpod 2.x)
&lt;span class="p"&gt;-&lt;/span&gt; Navigation: go_router 13.x with nested routes
&lt;span class="p"&gt;-&lt;/span&gt; Data layer: Repository pattern, Freezed models
&lt;span class="p"&gt;-&lt;/span&gt; Network: Dio with interceptors, no direct http package

&lt;span class="gu"&gt;## Folder Structure&lt;/span&gt;
lib/
  features/          # One folder per feature
    auth/
      data/          # Repositories, DTOs
      domain/        # Models, use cases
      presentation/  # Widgets, controllers
  core/              # Shared utils, theme, routing

&lt;span class="gu"&gt;## Rules&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; NEVER use setState in feature widgets, always Riverpod
&lt;span class="p"&gt;-&lt;/span&gt; ALL models must be Freezed + json_serializable
&lt;span class="p"&gt;-&lt;/span&gt; Widget tests are REQUIRED for all new screens
&lt;span class="p"&gt;-&lt;/span&gt; Use const constructors wherever possible
&lt;span class="p"&gt;-&lt;/span&gt; Follow existing theme tokens in core/theme/

&lt;span class="gu"&gt;## MCP Servers in Use&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Figma MCP: for any UI implementation task
&lt;span class="p"&gt;-&lt;/span&gt; Dart MCP: for pub.dev package lookups
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Connect the Right MCP Servers
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Dart MCP server&lt;/strong&gt;, &lt;strong&gt;Figma MCP server&lt;/strong&gt;, and &lt;strong&gt;Firebase Studio&lt;/strong&gt; dramatically improve output quality. They give agents access to your actual APIs, your actual design specs — instead of making plausible-sounding things up.&lt;/p&gt;

&lt;p&gt;Very Good Ventures published a great resource: &lt;a href="https://verygood.ventures/blog/7-mcp-servers-every-dart-and-flutter-developer-should-know/" rel="noopener noreferrer"&gt;7 MCP Servers Every Dart and Flutter Developer Should Know&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Plan Mode First, Then Execute
&lt;/h3&gt;

&lt;p&gt;Use Cursor's or Claude Code's planning mode to &lt;strong&gt;generate a plan before any code gets written&lt;/strong&gt;. Review it. Catch architectural misalignments before they become 300-line mistakes. This one habit catches 80% of problems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Hot Reload, Inspect, Iterate
&lt;/h3&gt;

&lt;p&gt;Run the generated code immediately. Flutter's hot reload plus DevTools makes verifying AI output faster than reading it line-by-line — for simple UI changes. But &lt;strong&gt;always read state and logic code&lt;/strong&gt; before accepting it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 6: Audit. Don't Just Accept All.
&lt;/h3&gt;

&lt;p&gt;Karpathy's original "Accept All, don't read the diffs" was fine for his personal weekend projects. For anything that ships to users, AI-generated code is your technical debt. Read it. If it's messy, ask the AI to clean it up before accepting.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real Vibe Coding Scenarios That Actually Work in Flutter
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Scenario 1: Weekend Proof of Concept — Liquid Glass iOS Effect
&lt;/h3&gt;

&lt;p&gt;Viktor Lidholt from the Serverpod team vibe-coded a Flutter proof-of-concept for Apple's Liquid Glass effect (introduced in iOS 26) over a weekend.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Obviously not production-level code, but it shows the viability of the approach."&lt;/em&gt; — Viktor Lidholt, &lt;a href="https://serverpod.dev/blog/vibe-coding-flutter" rel="noopener noreferrer"&gt;serverpod.dev/blog/vibe-coding-flutter&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Perfect use case: time-boxed, exploratory, visual feedback. Ship the vibe, then decide if it's worth productionising.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 2: Figma → Flutter with the Figma MCP Server
&lt;/h3&gt;

&lt;p&gt;Using the Figma MCP server + Claude Code, developers are mapping complete design files to Flutter widget trees. The MCP reliably captures dimensional requirements — sizes, spacing, font scales — so the AI output is close enough that human refinement takes minutes, not hours.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Prompt: "Implement the ProductCard from the Figma design"&lt;/span&gt;
&lt;span class="c1"&gt;// Figma MCP provides exact sizes, spacing, colors from the design file&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductCard&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="n"&gt;StatelessWidget&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;ProductCard&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;required&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;product&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;Product&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nd"&gt;@override&lt;/span&gt;
  &lt;span class="n"&gt;Widget&lt;/span&gt; &lt;span class="n"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BuildContext&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Container&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nl"&gt;width:&lt;/span&gt; &lt;span class="mi"&gt;160&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// from Figma&lt;/span&gt;
      &lt;span class="nl"&gt;decoration:&lt;/span&gt; &lt;span class="n"&gt;BoxDecoration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nl"&gt;color:&lt;/span&gt; &lt;span class="n"&gt;Theme&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;colorScheme&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;surface&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nl"&gt;borderRadius:&lt;/span&gt; &lt;span class="n"&gt;BorderRadius&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;circular&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;// from Figma&lt;/span&gt;
        &lt;span class="nl"&gt;boxShadow:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AppShadows&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;card&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="nl"&gt;child:&lt;/span&gt; &lt;span class="n"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nl"&gt;crossAxisAlignment:&lt;/span&gt; &lt;span class="n"&gt;CrossAxisAlignment&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nl"&gt;children:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
          &lt;span class="n"&gt;ClipRRect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nl"&gt;borderRadius:&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;BorderRadius&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;vertical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="nl"&gt;top:&lt;/span&gt; &lt;span class="n"&gt;Radius&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;circular&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nl"&gt;child:&lt;/span&gt; &lt;span class="n"&gt;CachedNetworkImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="nl"&gt;imageUrl:&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;imageUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="nl"&gt;height:&lt;/span&gt; &lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// from Figma&lt;/span&gt;
              &lt;span class="nl"&gt;fit:&lt;/span&gt; &lt;span class="n"&gt;BoxFit&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;cover&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
          &lt;span class="p"&gt;),&lt;/span&gt;
          &lt;span class="n"&gt;Padding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nl"&gt;padding:&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="n"&gt;EdgeInsets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;// from Figma&lt;/span&gt;
            &lt;span class="nl"&gt;child:&lt;/span&gt; &lt;span class="n"&gt;Column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="cm"&gt;/* title, price, rating */&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
          &lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Scenario 3: Full Test Suite from Existing Screens
&lt;/h3&gt;

&lt;p&gt;Prompt: &lt;em&gt;"Write widget tests for every screen in the &lt;code&gt;lib/features/&lt;/code&gt; folder, following the existing patterns in &lt;code&gt;test/&lt;/code&gt;."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The AI finds your patterns, replicates them, and generates a scaffold for 20–40 tests in a single pass. Review them, run them, patch the failures. What would take a full day takes two hours.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Anti-Patterns That Will Burn You
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"If you let too much bad code into your project, the models will perform worse over time, and your ability to keep the project clean will suffer."&lt;/strong&gt; — Viktor Lidholt, Serverpod&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;❌ No &lt;code&gt;AGENTS.md&lt;/code&gt;&lt;/strong&gt; — Without explicit instructions, the AI defaults to its training data. It might use BLoC when your project uses Riverpod, old &lt;code&gt;go_router&lt;/code&gt; patterns, or target the wrong platform entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Accepting security-sensitive code without audit&lt;/strong&gt; — Documented cases exist of apps deployed with hardcoded secrets, missing authentication checks, and insecure data storage — all AI-generated, none reviewed. Rule: &lt;strong&gt;any code that touches auth, storage, or network gets a human eyeball every time, no exceptions.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Vibe coding your state management&lt;/strong&gt; — Riverpod, BLoC, and Provider have subtleties around lifecycle, disposal, and async state that LLMs frequently get wrong in non-trivial cases. The AI will generate code that looks correct, compiles cleanly, but leaks memory or produces incorrect state transitions. Tests often don't catch this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ No stopping condition on agentic loops&lt;/strong&gt; — Autonomous agents can spiral. Give them bounded tasks with clear acceptance criteria. &lt;code&gt;"Implement the login screen per the Figma design and the PRD"&lt;/code&gt; is a good prompt. &lt;code&gt;"Build the entire app"&lt;/code&gt; is a recipe for a mess you'll spend three days untangling.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Flutter Vibe Coding Stack in 2026
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; — Karpathy's own 2025 year-in-review called it &lt;em&gt;"the first convincing demonstration of what an LLM Agent looks like."&lt;/em&gt; It runs in your environment with your private context, making it the most Flutter-codebase-aware option when paired with &lt;code&gt;AGENTS.md&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor&lt;/strong&gt; — The dominant AI IDE with excellent multi-file operations and planning mode. Pairs well with the Dart MCP server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gemini CLI + Flutter Extension&lt;/strong&gt; — Google's own answer. The Flutter Extension for Gemini CLI combines the Dart and Flutter MCP Server with additional context and commands — natural choice for Firebase-heavy Flutter projects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Firebase Studio / DreamFlow&lt;/strong&gt; — Higher-level "vibey" tools where you interact only with the generated app, not the code. Best for non-engineers or pure prototyping, not for production Flutter development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dart MCP Server&lt;/strong&gt; — Not a standalone tool but the connective tissue. Every AI agent for Flutter becomes measurably better with it. Add it to your setup before anything else.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Key Insight from Karpathy's 1-Year Retrospective
&lt;/h2&gt;

&lt;p&gt;Karpathy's 2026 anniversary post is the most important document in this space right now. He named what skilled developers are actually doing — and it's not "vibe coding" in the throwaway sense:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Agentic engineering: agentic because the new default is that you are not writing the code directly 99% of the time, you are orchestrating agents who do, and acting as oversight. Engineering to emphasize that there is an art &amp;amp; science and expertise to it."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— &lt;strong&gt;@karpathy&lt;/strong&gt;, Feb 2026 · &lt;a href="https://x.com/karpathy/status/2019137879310836075" rel="noopener noreferrer"&gt;x.com/karpathy/status/2019137879310836075&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This reframe is everything for Flutter developers. You're not abdicating engineering judgment — you're &lt;strong&gt;operating at a higher level of abstraction&lt;/strong&gt;. Your expertise shifts from &lt;em&gt;"how do I write this AnimationController"&lt;/em&gt; to &lt;em&gt;"what is the right interaction model and how do I verify the AI implemented it correctly."&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;p&gt;Vibe coding is real, it's here, and dismissing it is as silly as dismissing hot reload in 2018. But the breathless "just describe your app and it builds itself" narrative sets up a failure mode that is very real for production Flutter development.&lt;/p&gt;

&lt;p&gt;Flutter's architecture gives you a genuine edge. The widget tree is predictable, Dart's type system is loud about mistakes, and hot reload gives you the tightest feedback loop in mobile development. These properties mean you can move fast &lt;em&gt;and&lt;/em&gt; maintain visibility into what the AI is generating.&lt;/p&gt;

&lt;p&gt;Karpathy evolved from "forget the code exists" to "agentic engineering with oversight." &lt;strong&gt;That's the right frame.&lt;/strong&gt; You're the architect, the QA engineer, the tech lead. The AI is your most productive junior developer — high throughput, close review, clear instructions, and never unsupervised in your security code.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Use vibe coding wisely: accelerate where it helps, but don't let it erode your developer skills."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;— Viktor Lidholt, Serverpod · &lt;a href="https://serverpod.dev/blog/vibe-coding-flutter" rel="noopener noreferrer"&gt;serverpod.dev/blog/vibe-coding-flutter&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Flutter developers who learn to &lt;strong&gt;orchestrate agents well&lt;/strong&gt; — not just prompt and accept — are the ones who will ship extraordinarily fast without accruing the technical debt that kills velocity later.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have you been vibe coding in Flutter? What's your workflow? Drop it in the comments — I'm genuinely curious what the community has converged on.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If this was useful, follow me here on dev.to and on X [@YourHandle] for more Flutter content.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;#flutter&lt;/code&gt; &lt;code&gt;#dart&lt;/code&gt; &lt;code&gt;#ai&lt;/code&gt; &lt;code&gt;#vibecoding&lt;/code&gt; &lt;code&gt;#productivity&lt;/code&gt;&lt;/p&gt;

</description>
      <category>flutter</category>
      <category>dart</category>
      <category>ai</category>
      <category>vibecoding</category>
    </item>
    <item>
      <title>I Built an AI Boardroom App in 8 Hours with Flutter &amp; AI 🚀 (Open Source)</title>
      <dc:creator>Sayed Ali Alkamel</dc:creator>
      <pubDate>Mon, 05 Jan 2026 18:43:02 +0000</pubDate>
      <link>https://dev.to/sayed_ali_alkamel/i-built-an-ai-boardroom-app-in-8-hours-with-flutter-ai-open-source-1fjm</link>
      <guid>https://dev.to/sayed_ali_alkamel/i-built-an-ai-boardroom-app-in-8-hours-with-flutter-ai-open-source-1fjm</guid>
      <description>&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg0tg12pwhotgtzro60fn.png" alt="App Screenshot 1" width="800" height="1734"&gt;&lt;/th&gt;
&lt;th&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd9xixw662t55riyg93le.png" alt="App Screenshot 2" width="800" height="1734"&gt;&lt;/th&gt;
&lt;th&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2an2qt3bgjscdjgqcew4.png" alt="App Screenshot 3" width="800" height="1734"&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;"What if you could have a board of directors made up of the world's smartest AI models, debating your problems in real-time?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That was the question. The result? &lt;strong&gt;LLM Council&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And the craziest part? I built the entire mobile app between &lt;strong&gt;lunch and late dinner&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;As a &lt;strong&gt;Google Developer Expert in Flutter&lt;/strong&gt;, I've built hundreds of apps, but the speed at which we can now ship software using AI tools like &lt;strong&gt;Antigravity&lt;/strong&gt; is frankly mind-blowing.&lt;/p&gt;

&lt;p&gt;Here’s the story of how I took inspiration from an AI legend, fired up my IDE, and shipped a premium cross-platform app in a single day. 👇&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 The Inspiration
&lt;/h2&gt;

&lt;p&gt;It started when I saw &lt;strong&gt;Andrej Karpathy&lt;/strong&gt; (founding member of OpenAI, former Director of AI at Tesla) tweet about his project: &lt;a href="https://github.com/karpathy/llm-council" rel="noopener noreferrer"&gt;llm-council&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;He built a web interface where you can ask a question, and multiple LLMs (GPT-4, Claude, etc.) answer it. Then, they "read" each other's answers and a "Chairman" model synthesizes the best advice.&lt;/p&gt;

&lt;p&gt;I loved the concept. &lt;strong&gt;But I wanted it in my pocket.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I wanted a premium, executive-tier mobile experience. Something that felt like walking into a boardroom. Dark mode, gold accents, smooth animations.&lt;/p&gt;

&lt;p&gt;So I challenged myself: &lt;strong&gt;Can I build this before dinner?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🛠️ The Tech Stack
&lt;/h2&gt;

&lt;p&gt;To move fast without breaking things, I stuck to a battle-tested stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flutter 3.6+&lt;/strong&gt;: For that silky smooth 60fps UI on iOS and Android.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bloc &amp;amp; Clean Architecture&lt;/strong&gt;: Because "fast" shouldn't mean "messy code".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenRouter API&lt;/strong&gt;: To access all models (Claude 3.5 Sonnet, GPT-4o, Gemini 1.5 Pro) with one key.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Antigravity&lt;/strong&gt;: The AI coding assistant that acted as my pair programmer on steroids.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  ⚡ The "Lunch to Late Dinner" Sprint (1:00 PM - 9:00 PM)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1:00 PM - The Setup 🏗️
&lt;/h3&gt;

&lt;p&gt;I didn't waste time on boilerplate. I initialized the Flutter project and set up the domain layer.&lt;br&gt;
&lt;em&gt;User -&amp;gt; Question -&amp;gt; Council -&amp;gt; Deliberation -&amp;gt; Synthesis.&lt;/em&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  2:30 PM - The "Antigravity" Boost 🚀
&lt;/h3&gt;

&lt;p&gt;This is where things got wild. Instead of manually typing out every model class and repository, I used &lt;strong&gt;Antigravity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Me: &lt;em&gt;"Generate a repository that hits OpenRouter. It needs to handle streaming responses from 4 different models simultaneously."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Antigravity: &lt;em&gt;Done.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It didn't just write code; it wrote &lt;em&gt;good&lt;/em&gt; code. It handled the &lt;code&gt;Dio&lt;/code&gt; interceptors, the error parsing, and the concurrent &lt;code&gt;Future.wait&lt;/code&gt; calls for the council members.&lt;/p&gt;
&lt;h3&gt;
  
  
  4:30 PM - The UI Polish ✨
&lt;/h3&gt;

&lt;p&gt;A "Council" implies prestige. A standard Material Design look wouldn't cut it.&lt;br&gt;
I went for a "Succession-style" aesthetic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deep Navy Backgrounds&lt;/strong&gt; (&lt;code&gt;#0F172A&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gold Accents&lt;/strong&gt; for the active speaker.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anonymized Peer Reviews&lt;/strong&gt;: Models rank each other blindly (Model A doesn't know Model B wrote the answer).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I implemented &lt;code&gt;flutter_animate&lt;/code&gt; to make the messages slide in. It felt alive.&lt;/p&gt;
&lt;h3&gt;
  
  
  6:30 PM - The Synthesis Logic 🧠
&lt;/h3&gt;

&lt;p&gt;The magic of this app is the &lt;strong&gt;Chairman&lt;/strong&gt;.&lt;br&gt;
The app aggregates all the answers, strips the names, and feeds them back to the Chairman model with the prompt:&lt;br&gt;
&lt;em&gt;"Review these perspectives and provide a synthesized, executive summary."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The result? Answers that are significantly more balanced and nuanced than any single model could provide.&lt;/p&gt;
&lt;h3&gt;
  
  
  8:00 PM - Final Optimizations &amp;amp; Testing 🏁
&lt;/h3&gt;

&lt;p&gt;The last hour was spent on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding local persistence with &lt;code&gt;sqflite&lt;/code&gt; so conversations are saved.&lt;/li&gt;
&lt;li&gt;Securing the API key storage.&lt;/li&gt;
&lt;li&gt;Ensuring the "Chairman" animation was buttery smooth.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;9:00 PM&lt;/strong&gt;: Commit, Push, Done. Dinner time.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  🧑‍💻 The Code (Open Source)
&lt;/h2&gt;

&lt;p&gt;I’m making the whole thing open source. You can clone it, put in your own keys, and have your personal AI board of directors.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check out the repo here:&lt;/strong&gt;&lt;br&gt;
👉 &lt;strong&gt;&lt;a href="https://github.com/sayed3li97/llm_council_app" rel="noopener noreferrer"&gt;github.com/sayed3li97/llm_council_app&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;(Note: If the link is 404, I'm just polishing the README! Check back in 5 mins)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here is a snippet of how we handle the parallel consultation using Dart's concurrency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="n"&gt;Future&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;CouncilSession&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;consult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;async&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// 1. Fire off requests to all members in parallel&lt;/span&gt;
  &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;responses&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;Future&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;members&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;_api&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Anonymize and Request Peer Reviews&lt;/span&gt;
  &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;reviews&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_conductPeerReviews&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;responses&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Synthesis by Chairman&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;_chairman&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;synthesize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;responses&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reviews&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🚀 Why This Matters
&lt;/h2&gt;

&lt;p&gt;We are entering a new era of development. It's not about typing speed anymore; it's about &lt;strong&gt;architectural vision&lt;/strong&gt; and &lt;strong&gt;tool leverage&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;As a GDE, my advice to developers in 2026 is simple: &lt;strong&gt;Embrace the tools.&lt;/strong&gt;&lt;br&gt;
by using an AI agent like Antigravity, I focused on the &lt;em&gt;product experience&lt;/em&gt; the animations, the flow, the value while the AI handled the plumbing.&lt;/p&gt;

&lt;p&gt;I built a production-ready app in 8 hours.&lt;br&gt;
&lt;strong&gt;What will you build?&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you enjoyed this, drop a star on the repo and follow me for more Flutter &amp;amp; AI experiments!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>flutter</category>
      <category>dart</category>
      <category>ai</category>
      <category>mobile</category>
    </item>
  </channel>
</rss>
