<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Amit</title>
    <description>The latest articles on DEV Community by Amit (@amitrix).</description>
    <link>https://dev.to/amitrix</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3962358%2F978a8f18-68b0-409b-9b3a-2156d0be550c.png</url>
      <title>DEV Community: Amit</title>
      <link>https://dev.to/amitrix</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/amitrix"/>
    <language>en</language>
    <item>
      <title>Your AI Agent Needs Communication Modes, Not a Voice Clone</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:21:21 +0000</pubDate>
      <link>https://dev.to/amitrix/your-ai-agent-needs-communication-modes-not-a-voice-clone-1526</link>
      <guid>https://dev.to/amitrix/your-ai-agent-needs-communication-modes-not-a-voice-clone-1526</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Every AI platform collapses communication into a single flat voice profile — but knowledge workers switch between at least six distinct registers daily (casual, professional, leadership, field, publishing, builder), and averaging them produces output that's wrong for every context.&lt;/li&gt;
&lt;li&gt;The fix is engrams: mode-specific profiles with tone calibration, vocabulary boundaries, structural patterns, values integration, and — most importantly — an anti-pattern library. Anti-patterns are more distinctive than positive examples.&lt;/li&gt;
&lt;li&gt;Agent output should amplify intent, not clone raw voice. A casual voice message delivers intent; the engram-calibrated agent delivers a draft that exceeds real-time output quality for that register.&lt;/li&gt;
&lt;li&gt;Automatic mode detection (from a config-backed priority hierarchy: override → recipient → role → channel → intent keywords) eliminates manual mode selection entirely.&lt;/li&gt;
&lt;li&gt;Spend one hour building six mode engrams. Every AI interaction improves for the rest of the year.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Every AI assistant on the market today offers some version of "write like me." Upload your writing samples, set a style preference, and the model will dutifully mimic your patterns. The output reads like a slightly off photocopy — recognizably shaped like you, but missing the judgment calls that make communication actually work.&lt;/p&gt;

&lt;p&gt;The problem is not that voice cloning fails technically. It is that voice cloning answers the wrong question. The question is not "how do I sound?" The question is: "how should this message land, given who it is for and what it needs to do?"&lt;/p&gt;

&lt;p&gt;Knowledge workers do not have one voice. They have six.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Flat Persona Trap
&lt;/h2&gt;

&lt;p&gt;Every major AI platform now offers persistent voice customization. ChatGPT has Custom GPTs and Projects with system instructions. Claude has Projects, custom instructions, and a recently added &lt;a href="https://www.ikangai.com/anthropics-enhanced-writing-styles/" rel="noopener noreferrer"&gt;Styles feature&lt;/a&gt; where you pre-select formal, concise, or explanatory modes, or upload custom examples. Gemini has Gems. On top of these sit prompt patterns such as the "Taste Interviewer," which &lt;a href="https://www.linkedin.com/posts/ruben-hassid_how-to-make-ai-sound-exactly-like-you-forever-activity-7419982189787951104-f6Ru" rel="noopener noreferrer"&gt;extracts voice DNA from conversation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;All of these tools treat voice as a single axis. You feed the model writing samples, it pattern-matches your sentence length, vocabulary, and quirks, and then every output comes through that same filter. A casual DM to a friend sounds the same as an exec briefing to a VP. A customer email sounds the same as an internal team message. A published thought piece sounds the same as a handoff task to another agent.&lt;/p&gt;

&lt;p&gt;As one practitioner put it: &lt;a href="https://writingbeginner.substack.com/p/the-3-paragraph-trick-that-makes" rel="noopener noreferrer"&gt;"The standard advice for getting AI to match your voice is to feed it samples and say 'write like this.' This barely works."&lt;/a&gt; The reason it barely works is not sample quality. It is that a single-mode voice profile collapses context that professionals spend years learning to calibrate.&lt;/p&gt;

&lt;p&gt;Linguistics has a word for what gets lost: &lt;a href="https://teacherste.wordpress.com/2016/06/03/3757/" rel="noopener noreferrer"&gt;register&lt;/a&gt;. Register is the form that language takes in different circumstances, and "code switching" is the ability to move between registers guided by context. UCLA research confirms what anyone in a professional setting already knows: &lt;a href="https://languagedlife.ucla.edu/tag/norms/" rel="noopener noreferrer"&gt;people employ casual, slang-infused language among peers while adopting structured, formal language with leadership&lt;/a&gt;. This is not inconsistency. It is competence.&lt;/p&gt;

&lt;p&gt;A flat persona strips that competence away.&lt;/p&gt;

&lt;h2&gt;
  
  
  Six Modes, Not One Voice
&lt;/h2&gt;

&lt;p&gt;If you work in any knowledge-intensive role — product management, solutions architecture, engineering leadership, GTM strategy — you switch between at least six distinct communication modes every day:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;When&lt;/th&gt;
&lt;th&gt;What it needs to sound like&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Casual / Inner Circle&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;DMs with close colleagues, peers you trust&lt;/td&gt;
&lt;td&gt;Direct, warm, zero ceremony. Short. Familiar but not sloppy.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Professional / Peer-to-Peer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cross-functional threads, team channels, project syncs&lt;/td&gt;
&lt;td&gt;Strategic, data-specific, action-oriented. Advisory posture — flag, connect, advise.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Leadership / Upward&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Exec emails, endorsement requests, VP briefings&lt;/td&gt;
&lt;td&gt;Personal but purposeful. Confident, not deferential. Ask first, context later.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Field / External&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Customer emails, partner comms, external stakeholders&lt;/td&gt;
&lt;td&gt;Customer-obsessed, growth-oriented, warm but measured.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Publishing / Thought Leadership&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Blog posts, strategy docs, public writing&lt;/td&gt;
&lt;td&gt;Evidence-based, opinionated, universally framed. Not personal — "If you work in..."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Builder / Technical&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Handoff tasks, system docs, architecture, code&lt;/td&gt;
&lt;td&gt;Precise, structured, executable. Written for machines and humans simultaneously.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each mode has different vocabulary, sentence structure, opening patterns, closing patterns, and — critically — a different set of things you would never say. A casual message that opens with "I hope this note finds you well" is wrong. A leadership message that opens with "Hey dude" is wrong. A published post that opens with "In my role as..." is wrong.&lt;/p&gt;

&lt;p&gt;The voice is not the variable. The mode is.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture: Engrams
&lt;/h2&gt;

&lt;p&gt;An engram is a mode-specific voice profile. Not a flat style guide — a structured analysis of how communication should work in a specific register, for a specific audience, with a specific intent.&lt;/p&gt;

&lt;p&gt;Each engram contains five components:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Tone Calibration&lt;/strong&gt;&lt;br&gt;
Not "friendly and professional" — that &lt;a href="https://unpromptable.substack.com/p/how-to-train-ai-on-your-brand-voice" rel="noopener noreferrer"&gt;describes 90% of the internet&lt;/a&gt;. Instead: "Direct. No ceremony. Two sentences max for the opener. Get to the ask within the first three lines." Specificity is the entire point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Vocabulary Boundaries&lt;/strong&gt;&lt;br&gt;
What to use and — more importantly — what to never use. The never-use list is more distinctive than the use list. Everyone uses "thanks." Not everyone avoids "appreciate it, brother." The anti-pattern is the fingerprint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Structural Patterns&lt;/strong&gt;&lt;br&gt;
How messages open, flow, and close. Casual mode: no opener, straight to content. Leadership mode: the ask comes first, the context follows only if they say yes. Publishing mode: the surprise or finding leads, not the setup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Organizational Values Integration&lt;/strong&gt;&lt;br&gt;
For organizations with articulated operating principles, each mode emphasizes different values. Casual mode leans on speed and directness. Leadership mode leans on trust-building. Publishing mode leans on big-picture thinking and customer focus. The values are not decorative — they calibrate judgment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Anti-Pattern Library&lt;/strong&gt;&lt;br&gt;
The explicit list of phrases, structures, and behaviors that are wrong for this mode. This is the highest-signal component. "No worries if not" at the end of a leadership ask signals lack of confidence. "I can pull together a one-pager" in an advisory message signals doer posture when advisor posture is required. "In my role as..." in a published post signals credential framing when universal framing is needed.&lt;/p&gt;

&lt;p&gt;The anti-patterns catch failures that positive instructions miss. "Be confident" is vague. "Never end a leadership message with an opt-out phrase like 'either way' or 'no pressure'" is actionable.&lt;/p&gt;
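
&lt;p&gt;The five components above can be sketched as a small data structure. This is a minimal illustration, not a fixed schema: the &lt;code&gt;Engram&lt;/code&gt; class, its field names, and the &lt;code&gt;violations&lt;/code&gt; helper are all assumptions made for the example, and the substring check is deliberately naive.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Engram:
    """One mode-specific profile. Field names are illustrative, not a fixed schema."""
    mode: str
    tone: str                                               # 1. tone calibration
    never_use: list[str] = field(default_factory=list)      # 2. vocabulary boundaries
    structure: str = ""                                     # 3. opener / flow / closer rules
    values: list[str] = field(default_factory=list)         # 4. values this mode emphasizes
    anti_patterns: list[str] = field(default_factory=list)  # 5. phrases wrong for this mode

    def violations(self, draft: str) -> list[str]:
        """Return every banned phrase found in the draft (case-insensitive substring match)."""
        lowered = draft.lower()
        banned = self.never_use + self.anti_patterns
        return [phrase for phrase in banned if phrase.lower() in lowered]


leadership = Engram(
    mode="leadership",
    tone="Personal but purposeful. Confident, not deferential.",
    structure="Ask first, context later.",
    values=["trust-building"],
    anti_patterns=["no worries if not", "either way", "no pressure"],
)

draft = "Could you endorse the proposal this week? No worries if not."
print(leadership.violations(draft))  # ['no worries if not']
```

&lt;p&gt;A draft that comes back with a non-empty list can be regenerated or flagged before the human sees it, which is where the anti-pattern library earns its keep.&lt;/p&gt;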

&lt;h2&gt;
  
  
  Why Amplification, Not Cloning
&lt;/h2&gt;

&lt;p&gt;The critical distinction: agent output should not sound like a raw transcript of the human. It should sound like an &lt;strong&gt;amplified version of the human's intent&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When someone dictates a quick voice message — "hey can you reach out to Kevin and tell him I talked to Sarah about the promo thing and it'd be great if he could put in a good word" — they are not delivering final copy. They are delivering intent. The raw transcript captures the meaning but not the polish. A flat voice clone would reproduce the filler words, the incomplete thoughts, the verbal tics.&lt;/p&gt;

&lt;p&gt;An engram-calibrated agent does something different. It takes the intent, identifies the correct mode (casual / inner circle), applies the mode's structural patterns (direct opener, no ceremony, short), checks against the anti-pattern library (no sycophantic closings, no emoji overuse, no hedging), and produces output that is &lt;em&gt;more cohesive&lt;/em&gt; than what the human would have typed themselves — while remaining unmistakably shaped by the human's values and directness.&lt;/p&gt;

&lt;p&gt;This is not ghostwriting. It is amplification. The human reviews, edits, and sends — but they start from a draft that already exceeds their typical real-time output quality for that register.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.axios.com/2026/04/13/claude-project-chatgpt-custom-gpt-tips" rel="noopener noreferrer"&gt;Axios guide to building AI writing clones&lt;/a&gt; gets halfway there: "Don't ask the AI to go find your voice. Give it your voice. The CEO who uploads 50 documents gets a 10x better clone than the one who types a few simple prompts." True — but the 50 documents still produce one flat clone. The upgrade is giving it 50 documents &lt;em&gt;tagged by mode&lt;/em&gt;, so it knows which version of the voice to invoke.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Competition Offers Today
&lt;/h2&gt;

&lt;p&gt;A quick landscape of how current tools handle voice personalization:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;What It Gets Right&lt;/th&gt;
&lt;th&gt;What It Misses&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;ChatGPT&lt;/strong&gt; (Custom GPTs / Projects)&lt;/td&gt;
&lt;td&gt;System instructions + uploaded samples&lt;/td&gt;
&lt;td&gt;Persistent context across conversations&lt;/td&gt;
&lt;td&gt;Single mode per GPT/Project — no register switching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Claude&lt;/strong&gt; (Projects / Styles)&lt;/td&gt;
&lt;td&gt;Preset styles (formal, concise, explanatory) + custom examples&lt;/td&gt;
&lt;td&gt;Recently added &lt;a href="https://www.ikangai.com/anthropics-enhanced-writing-styles/" rel="noopener noreferrer"&gt;Styles feature with mode selection&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;Styles are generic presets, not user-specific mode profiles&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Claude Skills&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Markdown files encoding voice + workflow&lt;/td&gt;
&lt;td&gt;Eliminates the &lt;a href="https://substack.com/home/post/p-190025747" rel="noopener noreferrer"&gt;"Blank Slate Tax"&lt;/a&gt; — voice persists across sessions&lt;/td&gt;
&lt;td&gt;One skill = one voice. No multi-mode architecture.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemini Gems&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Custom instruction sets per Gem&lt;/td&gt;
&lt;td&gt;Quick setup, integrated with Google ecosystem&lt;/td&gt;
&lt;td&gt;Same single-mode limitation as Custom GPTs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Voice cloning prompts&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Feed samples → extract patterns → reproduce&lt;/td&gt;
&lt;td&gt;
&lt;a href="https://www.linkedin.com/posts/ruben-hassid_how-to-make-ai-sound-exactly-like-you-forever-activity-7419982189787951104-f6Ru" rel="noopener noreferrer"&gt;"Taste Interviewer" pattern&lt;/a&gt; produces detailed voice DNA&lt;/td&gt;
&lt;td&gt;Clones the raw voice including flaws — no amplification, no mode switching&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;None of these platforms offer mode-specific profiles. You can create separate GPTs or Projects per mode — but there is no architecture that automatically selects the right profile based on context (audience, channel, intent). The mode selection is entirely manual.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building It: The Practical Loop
&lt;/h2&gt;

&lt;p&gt;Here is how an engram system works in practice:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Collect samples across modes.&lt;/strong&gt; Pull your chat DMs (casual mode), sent emails (professional + leadership modes), published posts (publishing mode), and agent conversations (raw intent signal). Tag each sample by mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 — Analyze per mode.&lt;/strong&gt; For each mode, produce a structured analysis: tone, vocabulary, sentence patterns, openers/closers, anti-patterns, values emphasis. The analysis should be 500–800 words per mode — specific enough to calibrate, short enough to fit in a system prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3 — Build the anti-pattern library.&lt;/strong&gt; This is the highest-value step. Review agent outputs that you've corrected. Every correction is an anti-pattern: "Don't say 'appreciate it brother.'" "Don't hedge with 'either way.'" "Don't volunteer to build deliverables — flag, connect, advise." Corrections are more distinctive than examples.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4 — Save as persistent profiles.&lt;/strong&gt; Each mode becomes a named engram that the agent loads based on context. Writing a DM to a close colleague → load casual engram. Drafting an email to a VP → load leadership engram. Writing a blog post → load publishing engram.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5 — Iterate from corrections.&lt;/strong&gt; Every time you correct an agent draft, the correction feeds back into the relevant engram's anti-pattern library. The profiles sharpen over time through use, not through re-training.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Enterprise Implication
&lt;/h2&gt;

&lt;p&gt;For individual builders, engrams solve the "my AI sounds generic" problem. For organizations, the implication is larger.&lt;/p&gt;

&lt;p&gt;Institutional voice is not one voice. It is a set of registers that encode how the organization communicates in different contexts — with customers, with leadership, with the field, with the public. Today, that institutional knowledge lives in the heads of senior practitioners who have spent years calibrating their register-switching. When they leave, the calibration leaves with them.&lt;/p&gt;

&lt;p&gt;Engrams make that calibration portable. A senior practitioner builds mode-specific profiles. A new team member's agent loads those profiles and immediately communicates at a higher calibration than they could achieve alone — not replacing their judgment, but starting them at a higher baseline.&lt;/p&gt;

&lt;p&gt;This is not homogenization. Each person's anti-patterns are different, their vocabulary boundaries are different, their structural preferences are different. But the architecture — modes, not flat personas — can be shared. The organization provides the mode taxonomy and the values integration. The individual provides the voice within each mode.&lt;/p&gt;

&lt;h2&gt;
  
  
  Automatic Mode Detection
&lt;/h2&gt;

&lt;p&gt;Manual mode switching breaks the flow — nobody wants to tell the agent "use casual mode" before every message. The fix is a classification function backed by a config file that encodes a signal priority hierarchy: explicit override → recipient-specific override → role-based mapping → channel/medium detection → intent keyword matching. The agent resolves the correct engram before generating a single word.&lt;/p&gt;
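
&lt;p&gt;A minimal sketch of that priority hierarchy follows. The config keys, the example addresses and keywords, the &lt;code&gt;Message&lt;/code&gt; fields, and the fallback to a default mode are assumptions for illustration; a real setup would likely load the config from a file and use something smarter than substring matching for intent.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical config; in practice this would live in a YAML or JSON file.
CONFIG = {
    "recipients": {"kevin@example.com": "casual"},
    "roles": {"vp": "leadership", "customer": "field"},
    "channels": {"dm": "casual", "blog": "publishing", "handoff": "builder"},
    "intent_keywords": {"endorse": "leadership", "architecture": "builder"},
    "default": "professional",  # assumed fallback when no signal matches
}


@dataclass
class Message:
    text: str
    recipient: str = ""
    role: str = ""
    channel: str = ""
    override: Optional[str] = None


def resolve_mode(msg: Message, config: dict = CONFIG) -> str:
    """Walk the signals in priority order and return the first mode that matches."""
    if msg.override:                                # 1. explicit override
        return msg.override
    if msg.recipient in config["recipients"]:       # 2. recipient-specific override
        return config["recipients"][msg.recipient]
    if msg.role in config["roles"]:                 # 3. role-based mapping
        return config["roles"][msg.role]
    if msg.channel in config["channels"]:           # 4. channel / medium detection
        return config["channels"][msg.channel]
    lowered = msg.text.lower()
    for keyword, mode in config["intent_keywords"].items():  # 5. intent keywords
        if keyword in lowered:
            return mode
    return config["default"]


print(resolve_mode(Message("quick question", recipient="kevin@example.com", role="vp")))  # casual
print(resolve_mode(Message("status update", role="vp", channel="dm")))                    # leadership
```

&lt;p&gt;The ordering is the design choice that matters: a known recipient beats a generic role, and a role beats the channel, so a DM to a VP still resolves to the leadership engram.&lt;/p&gt;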

&lt;p&gt;Anti-pattern extraction from corrections is the second architectural piece. When you reject a draft — "don't say 'appreciate it brother'" — that correction should auto-classify to the relevant mode and append to that mode's engram. Corrections are the highest-signal input the system receives. Every rejection is a fingerprint.&lt;/p&gt;
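
&lt;p&gt;The correction loop can be sketched in a few lines. This example assumes the mode of the rejected draft is already known (for instance, from the mode resolved before drafting) and keeps the library in memory; the &lt;code&gt;record_correction&lt;/code&gt; name and the in-memory store are hypothetical stand-ins for whatever persistence the agent actually uses.&lt;/p&gt;

```python
from collections import defaultdict

# mode name -> list of banned phrases; a stand-in for persisted engram files
anti_patterns = defaultdict(list)


def record_correction(mode: str, phrase: str) -> bool:
    """Append a rejected phrase to the mode's library; return False if empty or already present."""
    normalized = phrase.strip().lower()
    if not normalized or normalized in anti_patterns[mode]:
        return False
    anti_patterns[mode].append(normalized)
    return True


record_correction("casual", "appreciate it brother")
record_correction("leadership", "either way")
record_correction("leadership", "Either way ")  # duplicate after normalization, ignored
print(dict(anti_patterns))
```

&lt;p&gt;Normalizing before storing keeps the library from filling up with near-duplicates as the same correction recurs across sessions.&lt;/p&gt;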

&lt;h2&gt;
  
  
  So What
&lt;/h2&gt;

&lt;p&gt;"Friendly and professional" is not a voice. It is the absence of one. Knowledge workers switch between six or more distinct communication registers every day, and every AI platform on the market collapses them into a single flat profile.&lt;/p&gt;

&lt;p&gt;The fix is not better voice cloning. It is mode-specific profiles — engrams — that capture how communication should work for a specific audience, intent, and register. Anti-patterns over patterns. Amplification over imitation. Organizational values as behavioral calibration, not decoration.&lt;/p&gt;

&lt;p&gt;The person who invests an hour building six mode engrams will get better output from every AI interaction for the rest of the year. The organization that standardizes mode taxonomies will ship institutional communication quality that does not walk out the door when senior practitioners leave.&lt;/p&gt;

&lt;p&gt;Your agent does not need your voice. It needs your judgment about which voice to use when.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post is part of a series on building mode-specific voice profiles for AI agents. The next post covers &lt;a href="https://artificialcuriositylabs.ai/posts/engram-builder-what-the-tool-gives-you/" rel="noopener noreferrer"&gt;what the engram builder gives you — and what you have to add on top&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 1 of the &lt;a href="https://dev.to/tags/voice/"&gt;Voice &amp;amp; Engrams&lt;/a&gt; series. &lt;a href="https://artificialcuriositylabs.ai/posts/engrams-voice-at-agent-speed/" rel="noopener noreferrer"&gt;Part 2: Your Agent Needs Six Voices, Not One →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>voice</category>
      <category>communication</category>
    </item>
    <item>
      <title>What Is an Agent — And What Isn't</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:20:45 +0000</pubDate>
      <link>https://dev.to/amitrix/what-is-an-agent-and-what-isnt-3cnd</link>
      <guid>https://dev.to/amitrix/what-is-an-agent-and-what-isnt-3cnd</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;An agent is defined by one thing: the loop — perceive, reason, act, observe, repeat. Chatbots, copilots, fixed workflows, and RPA scripts are not agents; they lack autonomous iteration.&lt;/li&gt;
&lt;li&gt;The three required properties: goal-directed behavior (outcome, not response), tool access (interacts with the world), autonomy (multi-step without confirmation at every step).&lt;/li&gt;
&lt;li&gt;Claude Code earned 84.6K GitHub stars and a 46% "most loved" rating among developers — compared to Cursor at 19% and Copilot at 9% — because it actually loops: writes, tests, observes failures, fixes.&lt;/li&gt;
&lt;li&gt;If you conflate chatbots with agents, you underinvest in the harness, set the wrong reliability bar, and build fixed pipelines that break when the world changes.&lt;/li&gt;
&lt;li&gt;Apply the loop test before building: does the system perceive, reason, act, observe, and iterate? If not, name it correctly and architect accordingly.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Every product launch in 2026 uses the word "agent." Customer support chatbots are agents. Autocomplete plugins are agents. Cron jobs with an LLM wrapper are agents. A Zapier flow with a model step is an agent now, apparently.&lt;/p&gt;

&lt;p&gt;None of those are agents.&lt;/p&gt;

&lt;p&gt;The word has become meaningless through overuse, and the confusion is not academic. If you think your chatbot is an agent, you will build the wrong thing, staff the wrong team, and set the wrong expectations with customers. The distinction matters because the architecture, the reliability requirements, and the trust model are fundamentally different.&lt;/p&gt;

&lt;p&gt;This post draws the line.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Agent Loop — The Only Definition That Matters
&lt;/h2&gt;

&lt;p&gt;An AI agent is &lt;a href="https://glenbradford.com/ai-agents-explained" rel="noopener noreferrer"&gt;a system that can autonomously take actions to accomplish a goal&lt;/a&gt; — not a system that responds to a single prompt and waits for the next one.&lt;/p&gt;

&lt;p&gt;The defining primitive is the &lt;strong&gt;loop&lt;/strong&gt;. An agent operates in a continuous cycle:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Perceive&lt;/strong&gt; — take in information from the environment (files, APIs, user input, tool outputs)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reason&lt;/strong&gt; — decide what to do next (which tool to call, what parameters to pass, whether the goal is met)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Act&lt;/strong&gt; — execute the chosen action (call a tool, write a file, send a message, run code)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observe&lt;/strong&gt; — evaluate the result (did the action succeed? did the state change? is the goal closer?)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Then it loops back. &lt;a href="https://www.ikangai.com/the-agentic-loop-explained-what-every-pm-should-know-about-how-ai-agents-actually-work/" rel="noopener noreferrer"&gt;Perceiving the new state, reasoning again, acting again&lt;/a&gt;. This loop can run once for a simple query or iterate dozens of times for a complex workflow. The key: the agent decides when to stop — not the human.&lt;/p&gt;

&lt;p&gt;Three properties separate a real agent from everything that calls itself one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Goal-directed behavior.&lt;/strong&gt; The agent pursues an outcome, not a response. "Organize this folder by project" is a goal. "What files are in this folder?" is a query.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool access.&lt;/strong&gt; The agent interacts with the external world — file systems, APIs, databases, browsers, shell commands. Without tools, the model is a brain in a jar.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autonomy.&lt;/strong&gt; The agent takes multi-step actions without requiring human confirmation at every step. The degree of autonomy varies (more on that below), but zero autonomy means zero agency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the formula from &lt;a href="https://tianpan.co/blog/2026-02-27-anatomy-of-an-agent-harness" rel="noopener noreferrer"&gt;Tian Pan's anatomy of an agent harness&lt;/a&gt;: &lt;strong&gt;Agent = Model + Harness&lt;/strong&gt;. The model handles reasoning. The harness handles everything else — tool execution, context management, memory, safety. The loop is what ties them together.&lt;/p&gt;
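
&lt;p&gt;The loop is small enough to sketch directly. The version below is a toy under stated assumptions: &lt;code&gt;run_agent&lt;/code&gt; and its parameters are invented for the example, &lt;code&gt;choose_action&lt;/code&gt; stands in for the model call, the tools are plain functions over a state dictionary, and the step budget is a harness-level safety cap rather than part of any particular product.&lt;/p&gt;

```python
from typing import Callable


def run_agent(goal_met: Callable[[dict], bool],
              choose_action: Callable[[dict], str],
              tools: dict[str, Callable[[dict], dict]],
              state: dict,
              max_steps: int = 20) -> dict:
    """Perceive, reason, act, observe; repeat until the goal is met or the budget runs out."""
    for _ in range(max_steps):
        if goal_met(state):              # perceive the current state and check the stop condition
            break
        action = choose_action(state)    # reason: a model call in a real agent
        state = tools[action](state)     # act: run the chosen tool
        state["history"] = state.get("history", []) + [action]  # observe: record what happened
    return state


# Toy goal: get the failing-test count to zero.
state = run_agent(
    goal_met=lambda s: s["failing"] == 0,
    choose_action=lambda s: "fix_test",
    tools={"fix_test": lambda s: {**s, "failing": s["failing"] - 1}},
    state={"failing": 3},
)
print(state)  # {'failing': 0, 'history': ['fix_test', 'fix_test', 'fix_test']}
```

&lt;p&gt;Everything outside &lt;code&gt;choose_action&lt;/code&gt; is harness: the tool registry, the state, the history, and the step cap. The caller supplies a goal and never approves an individual step.&lt;/p&gt;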




&lt;h2&gt;
  
  
  What an Agent Is NOT
&lt;/h2&gt;

&lt;p&gt;The confusion comes from four categories that look agent-like but are not.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chatbots Are Not Agents
&lt;/h3&gt;

&lt;p&gt;A chatbot receives a prompt and returns a response. One turn. The user drives every interaction. There is no loop — the system does not perceive, act, observe, or iterate. ChatGPT in default mode is a chatbot. Claude.ai in a single-turn conversation is a chatbot. They are useful. They are not agents.&lt;/p&gt;

&lt;p&gt;The test: &lt;strong&gt;does the system keep taking actions in the world without being prompted for each one?&lt;/strong&gt; A chatbot does not. It waits.&lt;/p&gt;

&lt;h3&gt;
  
  
  Copilots Are Not Agents
&lt;/h3&gt;

&lt;p&gt;A copilot suggests. It autocompletes code. It offers a draft. It highlights errors. The human accepts or rejects every suggestion. GitHub Copilot, in its original autocomplete mode, is a copilot — the model proposes, the human disposes.&lt;/p&gt;

&lt;p&gt;The distinction is the approval gate. A copilot has a human in the loop at every step. An agent has a human in the loop at the goal level ("organize my downloads folder") but not at the level of each action ("rename file X, move file Y, create folder Z").&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.dench.com/blog/future-of-work-ai-agents" rel="noopener noreferrer"&gt;AI tools require humans to operate them. AI agents operate on behalf of humans.&lt;/a&gt; That distinction changes everything about how work gets done.&lt;/p&gt;

&lt;h3&gt;
  
  
  Workflows Are Not Agents
&lt;/h3&gt;

&lt;p&gt;A workflow is a fixed DAG (a directed acyclic graph) of predetermined steps. Step 1 leads to Step 2, which leads to Step 3; even where the graph branches, every branch was drawn at design time, not chosen at runtime. LangChain chains, Airflow DAGs, Step Functions: these are workflows. Their control flow is fixed. They do not reason about which step to take next.&lt;/p&gt;

&lt;p&gt;An agent selects its next action based on the current state. If a tool call fails, the agent can try a different tool, adjust parameters, or abandon that approach entirely. A workflow cannot — it follows the graph or it fails.&lt;/p&gt;

&lt;h3&gt;
  
  
  RPA Is Not Agentic
&lt;/h3&gt;

&lt;p&gt;Robotic process automation scripts follow brittle, pixel-mapped sequences. Click here, type there, wait for this element. They break when the UI changes. They cannot recover from unexpected states. They have no reasoning layer.&lt;/p&gt;

&lt;p&gt;An agent navigating a browser can handle unexpected popups, changed layouts, and missing elements because the model reasons about the visual state and adapts. An RPA script cannot.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Spectrum of Autonomy
&lt;/h2&gt;

&lt;p&gt;Not all agents are fully autonomous. The reality is a spectrum:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Confirm every action&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent proposes, human approves each step&lt;/td&gt;
&lt;td&gt;Early Claude Code (pre-trust)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Confirm risky actions&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent acts freely on reads, confirms on writes/sends&lt;/td&gt;
&lt;td&gt;Amazon Quick Desktop default mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fire and forget&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent runs to completion, human reviews the output&lt;/td&gt;
&lt;td&gt;Claude Code with &lt;code&gt;--dangerously-skip-permissions&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Continuous autonomous&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent runs on a schedule with no human trigger&lt;/td&gt;
&lt;td&gt;Amazon Quick Desktop scheduled agents&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The correct operating point depends on the stakes. Renaming files in a personal folder? Fire and forget. Sending an email to a VP on the user's behalf? Confirm that action.&lt;/p&gt;

&lt;p&gt;This is the trust ramp: start at the top of the table and move down as the agent proves reliability. The ramp is not a product decision. It is a per-user, per-workflow decision that evolves over time.&lt;/p&gt;
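
&lt;p&gt;One way to encode the spectrum is as a confirmation policy the harness consults before each action. The sketch below is an assumption-laden illustration: the &lt;code&gt;Autonomy&lt;/code&gt; enum, the list of risky actions, and the &lt;code&gt;needs_confirmation&lt;/code&gt; function are invented for the example. The continuous level behaves like fire and forget at the action level; what differs is the trigger, which this sketch does not model.&lt;/p&gt;

```python
from enum import Enum


class Autonomy(Enum):
    CONFIRM_EVERY = 1     # agent proposes, human approves each step
    CONFIRM_RISKY = 2     # reads run freely, writes and sends need approval
    FIRE_AND_FORGET = 3   # runs to completion, human reviews the output
    CONTINUOUS = 4        # scheduled, no human trigger


# Illustrative list; a real harness would derive risk from the tool definitions.
RISKY_ACTIONS = {"write_file", "send_email", "delete_file"}


def needs_confirmation(level: Autonomy, action: str) -> bool:
    """Decide whether the harness should pause for human approval before this action."""
    if level is Autonomy.CONFIRM_EVERY:
        return True
    if level is Autonomy.CONFIRM_RISKY:
        return action in RISKY_ACTIONS
    return False


print(needs_confirmation(Autonomy.CONFIRM_RISKY, "read_file"))    # False
print(needs_confirmation(Autonomy.CONFIRM_RISKY, "send_email"))   # True
```

&lt;p&gt;Storing the level per user and per workflow, rather than as a global setting, is what lets the ramp evolve as trust builds.&lt;/p&gt;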




&lt;h2&gt;
  
  
  Four Agents, Four Architectures
&lt;/h2&gt;

&lt;p&gt;The agent loop is universal. How it is implemented — what harness wraps the model, what tools are available, what the interaction surface looks like — varies by product. Here are four agents that demonstrate the range.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Code — The Terminal Agent
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.buildberg.co/blog/claude-code-complete-guide" rel="noopener noreferrer"&gt;Claude Code is Anthropic's terminal-native coding agent&lt;/a&gt;. It lives in the terminal. No IDE. No browser. No GUI. You describe a task in natural language, and the agent reads your codebase, plans multi-file edits, writes code, runs tests, observes failures, fixes them, and commits the result.&lt;/p&gt;

&lt;p&gt;The agent loop is visible: the model reads a file (perceive), decides what to change (reason), edits the code and runs the test suite (act), sees if the tests pass (observe), then iterates until they do.&lt;/p&gt;

&lt;p&gt;Claude Code &lt;a href="https://www.augmentcode.com/learn/claude-code-86k-github-stars-terminal-ai-agent" rel="noopener noreferrer"&gt;hit 84.6K GitHub stars&lt;/a&gt; by March 2026 and earned a &lt;a href="https://www.speechly.io/blog/what-is-claude-code-ai-coding-assistant-2026" rel="noopener noreferrer"&gt;46% "most loved" rating among developers&lt;/a&gt;, compared to Cursor at 19% and GitHub Copilot at 9%. It integrates with &lt;a href="https://www.speechly.io/blog/what-is-claude-code-ai-coding-assistant-2026" rel="noopener noreferrer"&gt;150+ tools via MCP&lt;/a&gt; (Model Context Protocol), spawns sub-agents for parallel work, and supports automatic memory across sessions via CLAUDE.md files.&lt;/p&gt;

&lt;p&gt;This is not autocomplete. This is an agent that writes, tests, and ships software.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Cowork — The Desktop Agent
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.alphamatch.ai/blog/claude-cowork-desktop-ai-agent-2026" rel="noopener noreferrer"&gt;Claude Cowork launched on January 12, 2026&lt;/a&gt; as a research preview inside the Claude Desktop app. It brought the same agentic architecture that powers Claude Code to non-developers.&lt;/p&gt;

&lt;p&gt;You grant Cowork access to a folder. You describe the outcome. It &lt;a href="https://www.linos.ai/technology/claude-cowork-review-2026-features-pricing-vs-claude-code/" rel="noopener noreferrer"&gt;reads, edits, creates, and organizes files&lt;/a&gt; within that scope — sorting chaotic downloads folders, pulling expense data from receipt screenshots, synthesizing research documents. Powered by &lt;a href="https://www.jploft.com/blog/anthropic-launched-cowork" rel="noopener noreferrer"&gt;Claude Opus 4.6 with a one-million-token context window&lt;/a&gt;, it plans an approach, executes across local files and connected applications, and returns a finished deliverable.&lt;/p&gt;

&lt;p&gt;The market noticed. Investors &lt;a href="https://vicky.dev/claude-cowork-guide/" rel="noopener noreferrer"&gt;wiped $285 billion from software stocks&lt;/a&gt; within days as the implications sank in: an AI capable of autonomous knowledge work, running on a desktop, for $20/month.&lt;/p&gt;

&lt;p&gt;Cowork is not a chatbot with file access. It is an agent that takes a goal and works until the goal is met.&lt;/p&gt;

&lt;h3&gt;
  
  
  Kiro — The Spec-Driven IDE Agent
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://www.digitalapplied.com/blog/amazon-kiro-aws-agentic-ide-complete-guide" rel="noopener noreferrer"&gt;Kiro is AWS's agentic IDE&lt;/a&gt;, built on a fundamentally different philosophy. Where Claude Code and most AI coding tools start with code, Kiro starts with specifications.&lt;/p&gt;

&lt;p&gt;The approach is called spec-driven development. &lt;a href="https://www.bitslovers.com/kiro-ai-ide-guide/" rel="noopener noreferrer"&gt;Before the agent writes anything, it generates structured specifications&lt;/a&gt; — requirements with acceptance criteria, a technical design document, and a numbered task list. You review and edit the specs. Then the agent implements from the spec.&lt;/p&gt;

&lt;p&gt;This inverts the model that Cursor, Copilot, and most AI assistants use. In Kiro, &lt;a href="https://www.digitalapplied.com/blog/amazon-kiro-aws-agentic-ide-complete-guide" rel="noopener noreferrer"&gt;the spec is source-of-truth and code is a build artifact&lt;/a&gt;. The agent loop runs at a higher level of abstraction: perceive (read the spec), reason (plan the implementation), act (write code, generate docs, create tests), observe (validate against acceptance criteria).&lt;/p&gt;

&lt;p&gt;Built on &lt;a href="https://aws.amazon.com/cn/documentation-overview/kiro/" rel="noopener noreferrer"&gt;Amazon Bedrock with multiple foundation models&lt;/a&gt;, Kiro treats the decisions made during development as first-class artifacts — not ephemeral chat messages that vanish when the tab closes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Amazon Quick Desktop — The Knowledge Work Agent
&lt;/h3&gt;

&lt;p&gt;Amazon Quick is &lt;a href="https://aws.amazon.com/quick/faqs/" rel="noopener noreferrer"&gt;an agentic AI-powered digital workspace&lt;/a&gt; that operates across the full surface of knowledge work — not code, but everything else.&lt;/p&gt;

&lt;p&gt;In daily use of Quick Desktop, with 250+ tools connected in a single conversation, the agent loop operates across Slack, Outlook email and calendar, Salesforce, SharePoint, file systems, knowledge graphs, web search, browser automation, Python/JavaScript execution, and image generation. The agent triages your inbox, drafts replies, searches across indexed folders, posts structured Slack threads, builds dashboards, and manages account context files. All of it is driven by skills (encoded methodology) and a persistent memory system.&lt;/p&gt;

&lt;p&gt;A concrete example: the Slack MCP server exposes tools like &lt;code&gt;post_message&lt;/code&gt; and &lt;code&gt;search_messages&lt;/code&gt;. The Outlook MCP server exposes tools like &lt;code&gt;email_reply&lt;/code&gt; and &lt;code&gt;calendar_view&lt;/code&gt;. The Salesforce MCP server exposes tools like &lt;code&gt;query_opportunities&lt;/code&gt; and &lt;code&gt;update_contact&lt;/code&gt;. These MCP servers are the connectors; the tools are the individual functions the agent calls inside its loop. The distinction matters: you don't connect "Slack" to an agent — you connect a Slack MCP server that exposes a set of tools the agent can invoke.&lt;/p&gt;
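
&lt;p&gt;The server-versus-tool distinction can be sketched as a data structure. &lt;code&gt;McpServerStub&lt;/code&gt; and &lt;code&gt;collect_tools&lt;/code&gt; are hypothetical names for illustration, and the handlers are stubs; nothing here talks to a real MCP server or uses a real MCP SDK.&lt;/p&gt;

```python
class McpServerStub:
    """Hypothetical stand-in for a connected MCP server (not a real MCP client)."""

    def __init__(self, name, tools):
        self.name = name
        self.tools = tools  # mapping of tool name to a callable


def collect_tools(servers):
    """Flatten connected servers into the single tool table the agent chooses from."""
    table = {}
    for server in servers:
        for tool_name, handler in server.tools.items():
            table[server.name + "." + tool_name] = handler
    return table


# Tool names follow the examples in the text; the handlers are stubs.
slack = McpServerStub("slack", {
    "post_message": lambda channel, text: "posted to " + channel,
    "search_messages": lambda query: [],
})
outlook = McpServerStub("outlook", {
    "email_reply": lambda message_id, body: "replied to " + message_id,
    "calendar_view": lambda day: [],
})

tool_table = collect_tools([slack, outlook])
result = tool_table["slack.post_message"]("#triage", "3 urgent emails")
print(sorted(tool_table))
```

&lt;p&gt;Prefixing each tool with its server name is one way to avoid collisions when two servers expose tools with the same name; it is an assumption of this sketch, not a requirement of the protocol.&lt;/p&gt;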

&lt;p&gt;Quick Desktop's distinguishing feature: &lt;a href="https://aws.amazon.com/quick/pricing/" rel="noopener noreferrer"&gt;scheduled agents that run 24/7 in the cloud&lt;/a&gt;. These are continuous autonomous agents — they run on cron schedules or event triggers (new Slack message, new email, upcoming calendar event) without any human initiation. A morning briefing agent fires at 7:56 AM, scans email and Slack, classifies by priority, and posts a triage report before you open the app.&lt;/p&gt;

&lt;p&gt;This is the farthest point on the autonomy spectrum: agents that operate on behalf of humans with no human in the loop at execution time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why the Distinction Matters
&lt;/h2&gt;

&lt;p&gt;If you conflate chatbots with agents, three things go wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You underinvest in the harness.&lt;/strong&gt; A chatbot needs a prompt and a model. An agent needs tool execution infrastructure, context management, memory, safety enforcement, error recovery, and human-in-the-loop workflows. The harness — the infrastructure that wraps the model — is where reliability lives. Skip it, and your agent fails in production even if the model is brilliant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You set the wrong trust expectations.&lt;/strong&gt; Users expect chatbots to be wrong sometimes and shrug it off. Users expect agents, which act on their behalf, to be reliable. A chatbot that hallucinates wastes 30 seconds. An agent that sends a hallucinated email to a VP can damage a career. The reliability bar is categorically different.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You build fixed workflows instead of adaptive systems.&lt;/strong&gt; If you think "agent" means "workflow with an LLM step," you will build rigid pipelines that break when the world changes. Real agents adapt. They recover from tool failures, try alternative approaches, and ask for help when stuck. That adaptability requires the loop — and the loop requires a harness.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Agent Loop Is the Primitive
&lt;/h2&gt;

&lt;p&gt;Everything in the agent ecosystem builds on top of the loop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tools&lt;/strong&gt; give the agent hands — the ability to interact with the external world. Without tools, the agent is a brain in a jar. But tools alone are atomic: read a file, send a message, query a database. Each tool call is a single action in a single iteration of the loop. MCP servers are the connectors that expose those tools — the Slack MCP server, the Outlook MCP server, the Salesforce MCP server. The tools are the individual functions those servers make available to the agent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skills&lt;/strong&gt; give the agent methodology — the knowledge of how and when to act. A skill encodes a workflow: when to trigger, what inputs to gather, what tools to use in what order, what quality checks to run. Skills are what prevent the agent from reinventing its approach every session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The harness&lt;/strong&gt; gives the agent a body — context management, memory, safety enforcement, error recovery, state persistence. The harness is the infrastructure that keeps the loop running reliably across sessions, across tools, and across failures.&lt;/p&gt;

&lt;p&gt;Agent → Loop → Tools → Skills → Harness. Each concept builds on the one before it. Get the agent definition wrong, and the rest of the stack is built on sand.&lt;/p&gt;

&lt;p&gt;The industry will keep calling everything an agent. That does not mean everything is one. The loop is the line. If the system does not perceive, reason, act, observe, and iterate — it is something else. Call it what it is.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 1 of the &lt;a href="https://dev.to/tags/agents/"&gt;Agent Primitives&lt;/a&gt; series. &lt;a href="https://artificialcuriositylabs.ai/posts/what-is-a-tool/" rel="noopener noreferrer"&gt;Part 2: What Is a Tool →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>patterns</category>
    </item>
    <item>
      <title>What Is an Agent Harness — The Infrastructure That Makes Agents Actually Work</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:20:10 +0000</pubDate>
      <link>https://dev.to/amitrix/what-is-an-agent-harness-the-infrastructure-that-makes-agents-actually-work-3kch</link>
      <guid>https://dev.to/amitrix/what-is-an-agent-harness-the-infrastructure-that-makes-agents-actually-work-3kch</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Agent = Model + Harness. The model handles reasoning; the harness handles everything else: tool execution, context management, memory, state persistence, safety enforcement, error recovery, and human-in-the-loop workflows.&lt;/li&gt;
&lt;li&gt;Claude Code and Claude Cowork run on the same underlying model family, yet their experiences are entirely different because the harness is different. The model is a component; the harness is the product.&lt;/li&gt;
&lt;li&gt;Two agents using the same model but different harnesses produce wildly different results. This is not a metaphor — it is the literal architecture of every working agent in 2026.&lt;/li&gt;
&lt;li&gt;Models are commoditizing; harnesses are differentiating. Skills, memory, learned preferences, and institutional knowledge all live at the harness layer.&lt;/li&gt;
&lt;li&gt;Stop evaluating agents by model benchmarks. Evaluate by harness: does it persist memory, enforce safety, recover from failures, and compound institutional knowledge across sessions?&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;The industry talks about models. Which one is smartest. Which context window is largest. Which benchmark score is highest. That conversation misses the point entirely.&lt;/p&gt;

&lt;p&gt;The model is the brain. Without a body, a brain sits in a jar.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Formula
&lt;/h2&gt;

&lt;p&gt;Agent = Model + Harness.&lt;/p&gt;

&lt;p&gt;The model handles reasoning — what to do next, how to interpret results, when to change approach. The harness handles everything else: tool execution, context management, memory, state persistence, safety enforcement, error recovery, and human-in-the-loop workflows.&lt;/p&gt;

&lt;p&gt;This is not a metaphor. It is the &lt;a href="https://tianpan.co/blog/2026-02-27-anatomy-of-an-agent-harness" rel="noopener noreferrer"&gt;literal architecture of every working agent system in 2026&lt;/a&gt;. Strip the harness away and you have a stateless text-completion API. Add the harness back and you have a system that reads your codebase, triages your Slack, books your meetings, and runs overnight pipelines while you sleep.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.philschmid.de/agent-harness-2026" rel="noopener noreferrer"&gt;Phil Schmid puts it directly&lt;/a&gt;: "An Agent Harness is the infrastructure that wraps around an AI model to manage long-running tasks. It is not the agent itself. It is the software system that governs how the agent operates."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://firecrawl.dev/blog/what-is-an-agent-harness" rel="noopener noreferrer"&gt;Firecrawl's definition&lt;/a&gt; sharpens it further: "An agent harness is everything that wraps around an LLM — tool execution, memory, context management, state persistence — excluding the model itself."&lt;/p&gt;

&lt;p&gt;The model decides &lt;em&gt;what&lt;/em&gt; to do. The harness decides &lt;em&gt;how it gets done&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a Harness Provides
&lt;/h2&gt;

&lt;p&gt;LLMs are stateless by default. No memory across sessions. No tool access. No file system. No persistence. No safety boundaries. The harness adds all of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool execution.&lt;/strong&gt; The model emits a structured tool call — &lt;code&gt;read_file("report.md")&lt;/code&gt;. The harness routes that call to the actual API, handles authentication, manages rate limits, and returns the result. Without the harness, the model's tool call is a JSON blob that goes nowhere.&lt;/p&gt;
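
&lt;p&gt;A minimal sketch of that routing step, assuming the model emits its tool call as a JSON object with &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;arguments&lt;/code&gt; fields. The field names, &lt;code&gt;execute_tool_call&lt;/code&gt;, and the stub handler are assumptions for illustration; authentication and rate limiting are left out.&lt;/p&gt;

```python
import json


def read_file_stub(path):
    # Stand-in for a real file read so the example has no side effects.
    return "contents of " + path


# Registry of tool names the harness knows how to execute.
TOOL_HANDLERS = {"read_file": read_file_stub}


def execute_tool_call(raw_call):
    """Parse a model-emitted tool call and route it to a registered handler."""
    call = json.loads(raw_call)
    handler = TOOL_HANDLERS.get(call["name"])
    if handler is None:
        return {"ok": False, "error": "unknown tool: " + call["name"]}
    try:
        return {"ok": True, "result": handler(**call["arguments"])}
    except TypeError as exc:  # wrong or missing arguments
        return {"ok": False, "error": str(exc)}


ok = execute_tool_call('{"name": "read_file", "arguments": {"path": "report.md"}}')
unknown = execute_tool_call('{"name": "delete_repo", "arguments": {}}')
print(ok, unknown)
```

&lt;p&gt;Returning errors as data rather than raising lets the model see the failure on its next turn and choose a different action.&lt;/p&gt;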

&lt;p&gt;&lt;strong&gt;Context management.&lt;/strong&gt; A million-token context window sounds infinite until you try to fit a codebase, a conversation history, a knowledge graph, and forty tool schemas into it simultaneously. The harness decides what enters the context window and what stays out — retrieval, summarization, priority ranking.&lt;/p&gt;
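
&lt;p&gt;Priority ranking under a budget can be sketched as a greedy selection. This is one simple strategy under stated assumptions: &lt;code&gt;pack_context&lt;/code&gt; is a hypothetical name, priorities are hand-assigned, and whitespace word counts stand in for a real tokenizer.&lt;/p&gt;

```python
def pack_context(items, budget):
    """Greedy selection: take items in priority order while they fit the budget."""
    chosen, used = [], 0
    for item in sorted(items, key=lambda entry: entry["priority"], reverse=True):
        cost = len(item["text"].split())  # crude token estimate: whitespace words
        if used + cost > budget:
            continue  # skip what does not fit; a smaller item may still fit later
        chosen.append(item["name"])
        used += cost
    return chosen, used


items = [
    {"name": "system_prompt", "priority": 10, "text": "you are a triage agent"},
    {"name": "old_history", "priority": 1, "text": "word " * 50},
    {"name": "tool_schemas", "priority": 8, "text": "schema " * 20},
    {"name": "latest_message", "priority": 9, "text": "please triage my inbox"},
]
chosen, used = pack_context(items, budget=30)
print(chosen, used)
```

&lt;p&gt;In this run the 50-word history is the item left out. A fuller harness would summarize it instead of dropping it, which is where retrieval and summarization come in.&lt;/p&gt;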

&lt;p&gt;&lt;strong&gt;Memory.&lt;/strong&gt; Short-term: the current conversation. Long-term: what the agent learned three weeks ago about your Slack triage preferences. Cross-session: the knowledge graph that compounds entity relationships across every email, Slack message, and meeting note the agent processes. The model has none of this natively. The harness provides all of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;State persistence.&lt;/strong&gt; Sessions survive restarts. Conversations resume. Work products are saved. A 90-minute research task that gets interrupted at minute 47 picks up where it left off. Without state persistence, every interruption restarts from zero.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Safety enforcement.&lt;/strong&gt; Permission boundaries — which folders can the agent read? Content filtering — does this output contain PII? Action approval gates — should a Slack message to #general require human confirmation? The harness enforces all of these. The model has no inherent concept of "don't post to the wrong channel."&lt;/p&gt;
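
&lt;p&gt;A permission boundary on folders can be as small as a path check that runs before any file tool executes. The sketch below is illustrative: &lt;code&gt;is_path_allowed&lt;/code&gt; is a hypothetical helper, the paths are made up, and it does not resolve symlinks, which a production check would need to handle.&lt;/p&gt;

```python
import os.path


def is_path_allowed(allowed_root, requested):
    """Return True only when the requested path stays inside the granted folder."""
    root = os.path.abspath(allowed_root)
    # Joining first means relative requests are interpreted inside the root;
    # abspath then collapses any ".." segments before the comparison.
    target = os.path.abspath(os.path.join(root, requested))
    return os.path.commonpath([root, target]) == root


inside = is_path_allowed("/work/project", "notes/report.md")
escape = is_path_allowed("/work/project", "../secrets.txt")
absolute = is_path_allowed("/work/project", "/etc/passwd")
print(inside, escape, absolute)
```

&lt;p&gt;The check lives in the harness, not the prompt, so it holds even when the model asks for a path it should not touch.&lt;/p&gt;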

&lt;p&gt;&lt;strong&gt;Error recovery.&lt;/strong&gt; Retry logic when an API call fails. Fallback strategies when a tool is rate-limited. Graceful degradation when context overflows. The model generates one response; the harness manages the recovery loop around it.&lt;/p&gt;
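
&lt;p&gt;Retry and fallback can be sketched together. &lt;code&gt;call_with_retry&lt;/code&gt; is a hypothetical helper, the failing API is simulated, and the &lt;code&gt;sleep&lt;/code&gt; parameter is injectable only so the example runs without real delays.&lt;/p&gt;

```python
import time


def call_with_retry(primary, fallback, attempts=3, base_delay=0.01, sleep=time.sleep):
    """Retry the primary call with exponential backoff, then use the fallback."""
    errors = []
    for attempt in range(attempts):
        try:
            return primary(), errors
        except RuntimeError as exc:
            errors.append(str(exc))
            sleep(base_delay * (2 ** attempt))  # delays of 1x, 2x, 4x the base
    return fallback(), errors


calls = {"n": 0}


def flaky_api():
    # Simulated tool: fails on the first two calls, succeeds on the third.
    calls["n"] += 1
    if calls["n"] in (1, 2):
        raise RuntimeError("rate limited")
    return "fresh result"


def always_down():
    raise RuntimeError("service unavailable")


def no_sleep(seconds):
    return None  # skip real waiting in this demo


recovered = call_with_retry(flaky_api, lambda: "cached result", sleep=no_sleep)
degraded = call_with_retry(always_down, lambda: "cached result", sleep=no_sleep)
print(recovered, degraded)
```

&lt;p&gt;The collected error list is returned alongside the value so the harness can log it or hand it back to the model as context.&lt;/p&gt;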

&lt;p&gt;&lt;strong&gt;Human-in-the-loop.&lt;/strong&gt; Trust ramps — confirm every action on Day 1, approve only high-risk actions by Day 30, fully autonomous by Day 60. The harness implements this progression. The model doesn't know what day it is.&lt;/p&gt;
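
&lt;p&gt;The trust ramp above reduces to a small policy function. The day thresholds come from the text; &lt;code&gt;approval_required&lt;/code&gt; and the two-level risk label are assumptions made for this sketch.&lt;/p&gt;

```python
def approval_required(day, risk):
    """Return True when a human must confirm the action before it runs."""
    if day >= 60:
        return False           # fully autonomous
    if day >= 30:
        return risk == "high"  # only high-risk actions need confirmation
    return True                # early days: confirm everything


print(approval_required(1, "low"), approval_required(30, "high"), approval_required(60, "high"))
```

&lt;p&gt;Because the policy is plain code in the harness, the thresholds can be tuned per user or per tool without touching the model.&lt;/p&gt;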

&lt;p&gt;&lt;strong&gt;Sub-agent orchestration.&lt;/strong&gt; Spawning four parallel research agents, aggregating their results, managing dependencies between sequential steps. The model can reason about parallelism; the harness actually executes it.&lt;/p&gt;
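
&lt;p&gt;Fan-out and aggregation can be sketched with a thread pool from the Python standard library. The &lt;code&gt;research&lt;/code&gt; function is a stub standing in for a sub-agent, and &lt;code&gt;fan_out&lt;/code&gt; is a hypothetical name; real sub-agents would each run their own loop.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def research(topic):
    # Stub sub-agent: a real one would run its own agent loop on the topic.
    return topic + ": 1 finding"


def fan_out(topics, worker, max_workers=4):
    """Run one worker per topic in parallel and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, topics))  # map preserves input order


topics = ["pricing", "competitors", "security", "roadmap"]
report = fan_out(topics, research)
print(report)
```

&lt;p&gt;Keeping results in input order makes the aggregation step deterministic even though the workers finish in any order.&lt;/p&gt;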

&lt;h2&gt;
  
  
  Same Model, Different Harness, Completely Different Experience
&lt;/h2&gt;

&lt;p&gt;This is the key insight. Two products can use the exact same underlying model and produce radically different user experiences — because the harness is different.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Code — The Terminal Harness
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://docs.anthropic.com/en/docs/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; is Anthropic's terminal-native coding agent — &lt;a href="https://www.augmentcode.com/learn/claude-code-86k-github-stars-terminal-ai-agent" rel="noopener noreferrer"&gt;84.6K GitHub stars&lt;/a&gt;, &lt;a href="https://www.speechly.io/blog/what-is-claude-code-ai-coding-assistant-2026" rel="noopener noreferrer"&gt;46% "most loved" rating among developers&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The harness is optimized for software engineering. Filesystem access scoped to the project directory. Git-aware — understands branches, diffs, commit history. Shell command execution. &lt;a href="https://www.linos.ai/technology/how-to-use-claude-code-2026/" rel="noopener noreferrer"&gt;Sub-agent spawning for parallel work&lt;/a&gt; — the model reasons about which files to edit, the harness executes the edits, runs tests, observes failures, and routes results back. CLAUDE.md files provide persistent project context that survives session restarts. &lt;a href="https://developersdigest.tech/blog/what-is-claude-code" rel="noopener noreferrer"&gt;Hooks&lt;/a&gt; enforce custom policies before and after every tool call.&lt;/p&gt;

&lt;p&gt;The model is Claude. The harness is a terminal runtime built for code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Cowork — The Desktop Harness
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aiwiki.ai/wiki/claude_cowork" rel="noopener noreferrer"&gt;Claude Cowork&lt;/a&gt; launched January 12, 2026 as a research preview inside the Claude Desktop app. &lt;a href="https://www.jploft.com/blog/anthropic-launched-cowork" rel="noopener noreferrer"&gt;Powered by Claude Opus 4.6 with a one-million-token context window&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The harness is optimized for knowledge workers who never open a terminal. Folder-scoped filesystem access — the user grants access to a specific folder. The agent &lt;a href="https://tldv.io/blog/claude-cowork" rel="noopener noreferrer"&gt;reads, edits, creates, renames, sorts, and deletes files&lt;/a&gt; within that scope. App automation connects to web and desktop applications. No shell. No git. No code execution.&lt;/p&gt;

&lt;p&gt;Same underlying model family. Completely different harness. Completely different user.&lt;/p&gt;

&lt;h3&gt;
  
  
  Kiro — The IDE Harness
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/cn/documentation-overview/kiro/" rel="noopener noreferrer"&gt;Kiro&lt;/a&gt; is AWS's agentic IDE, built on Amazon Bedrock with &lt;a href="https://aws.amazon.com/cn/documentation-overview/kiro/" rel="noopener noreferrer"&gt;multiple foundation models&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The harness inverts the model familiar from Cursor and Copilot. &lt;a href="https://www.digitalapplied.com/blog/amazon-kiro-aws-agentic-ide-complete-guide" rel="noopener noreferrer"&gt;The spec is the source of truth; code is a build artifact&lt;/a&gt;. Before writing a single line, the harness generates structured specifications — requirements with acceptance criteria, technical design, numbered task list. The user reviews and edits. Then the agent implements from the spec.&lt;/p&gt;

&lt;p&gt;The harness is optimized for structured development — spec-driven, document-first, implementation-second. The model generates; the harness enforces the spec → design → task → code sequence.&lt;/p&gt;

&lt;h3&gt;
  
  
  Amazon Quick Desktop — The Knowledge Work Harness
&lt;/h3&gt;

&lt;p&gt;Amazon Quick Desktop is a knowledge work agent that surfaces hundreds of tool functions in a single conversation. These tools come from connected MCP servers — Slack, Outlook email, calendar, Salesforce, SharePoint, OneDrive, web search, browser automation, image generation — alongside sandboxed Python and JavaScript execution and a local knowledge graph. All of it is accessible without switching apps. The harness decides which MCP servers to connect and which tools to surface to the model; that is the tool exposure point.&lt;/p&gt;

&lt;p&gt;The harness is optimized for cross-tool knowledge work. Scheduled agents run in the cloud 24/7 — they execute even when the user is offline. A skills system encodes reusable methodology (not prompts — methodology). Long-term memory compounds across sessions. A knowledge graph connects entities extracted from Slack, email, calendar, and local files. Feed notifications surface agent output as prioritized cards.&lt;/p&gt;

&lt;p&gt;The model reasons. The harness manages connections to dozens of MCP servers and their exposed tools, persists institutional knowledge across sessions, and orchestrates parallel sub-agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bedrock AgentCore — The Managed Harness
&lt;/h3&gt;

&lt;p&gt;On April 22, 2026, AWS &lt;a href="https://aws.amazon.com/about-aws/whats-new/2026/04/agentcore-new-features-to-build-agents-faster/" rel="noopener noreferrer"&gt;announced a managed agent harness within Amazon Bedrock AgentCore&lt;/a&gt;. &lt;a href="https://www.forbes.com/sites/janakirammsv/2026/04/26/aws-cuts-ai-agent-setup-to-3-api-calls-in-agentcore-update/" rel="noopener noreferrer"&gt;Developers declare an agent's model, system prompt, and tools, then run it in three API calls&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The harness manages the full agent loop — reasoning, tool selection, action execution, response streaming — inside a &lt;a href="https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/harness-environment.html" rel="noopener noreferrer"&gt;dedicated microVM spun up for each session&lt;/a&gt;. No orchestration code required. AgentCore Gateway provides governed connectivity to APIs and MCP servers with built-in auth, access control, and policy enforcement.&lt;/p&gt;

&lt;p&gt;The harness is optimized for developers who want to build custom agents without reinventing infrastructure. The model plugs in (Claude, Llama, Mistral — any Bedrock model). The harness provides everything else. This is covered in depth in a separate post in this series.&lt;/p&gt;

&lt;h2&gt;
  
  
  Harness Engineering Is Becoming a Discipline
&lt;/h2&gt;

&lt;p&gt;The term comes from Mitchell Hashimoto, creator of Terraform and Ghostty. His definition: &lt;a href="https://www.decodingai.com/p/agentic-harness-engineering" rel="noopener noreferrer"&gt;"Anytime you find an agent makes a mistake, you take the time to engineer a solution such that the agent never makes that mistake again."&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is harness engineering. Not prompt engineering — the model's instructions are one input. Not fine-tuning — the model's weights are unchanged. Harness engineering is the practice of improving the infrastructure around the model so that reliability increases with every failure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://blakecrosley.com/guides/agent-architecture" rel="noopener noreferrer"&gt;Blake Crosley frames the mental model precisely&lt;/a&gt;: "An AI coding agent is a programmable runtime with an LLM kernel. Every action the model takes passes through hooks you control. You define policies, not prompts."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://cobusgreyling.substack.com/p/the-rise-of-ai-harness-engineering" rel="noopener noreferrer"&gt;The discipline has formalized rapidly&lt;/a&gt;. Both OpenAI and Anthropic now use the term formally. Martin Fowler has written about it. An arXiv paper formalizes the pattern. This is not a buzzword — it is the &lt;a href="https://hackernoon.com/agent-harnessing-the-non-model-infrastructure-that-makes-ai-agents-actually-work" rel="noopener noreferrer"&gt;missing architectural layer that determines whether AI agents work in production&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The harness is where reliability lives. Models hallucinate; harnesses catch hallucinations. Models forget; harnesses persist memory. Models don't know your tools; harnesses expose the right tools at the right time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the Harness Matters More Than the Model
&lt;/h2&gt;

&lt;p&gt;Three reasons.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Models are commoditizing. Harnesses are differentiating.&lt;/strong&gt; You can swap Sonnet for Opus for Haiku and the harness stays the same. The model is a component. The harness is the product. Claude Code, Claude Cowork, Kiro, and Amazon Quick Desktop all have access to the same models — their differentiation is entirely in the harness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Two agents using the same model but different harnesses produce wildly different results.&lt;/strong&gt; Give Claude Sonnet a terminal harness and it writes code. Give the same model a knowledge work harness and it triages your inbox. The model is identical. The experience is not.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The harness is where institutional knowledge lives.&lt;/strong&gt; Skills, memory, learned preferences, safety policies, workflow patterns — all harness-layer concerns. The model has no concept of "last time this customer asked about pricing, here's how we responded." The harness does.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Convergence
&lt;/h2&gt;

&lt;p&gt;Every harness category is expanding into the others.&lt;/p&gt;

&lt;p&gt;Terminal harnesses (Claude Code) are adding knowledge work features — memory, web search, MCP integrations with 150+ tools. Desktop harnesses (Cowork) are adding coding features — file manipulation, structured outputs. IDE harnesses (Kiro, Cursor) are adding agentic loops — autonomous multi-step execution beyond autocomplete. Knowledge work harnesses (Quick Desktop) are adding builder features — agent delegation to Claude Code and Kiro, with git and cloud account access on the roadmap.&lt;/p&gt;

&lt;p&gt;The winning harness will unify all four surfaces: terminal, desktop, IDE, and knowledge work — in a single runtime where the model switches modes but the harness provides continuity.&lt;/p&gt;

&lt;p&gt;The model conversation is nearly over. The harness conversation is where the actual competition lives.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 3 of the &lt;a href="https://dev.to/tags/agents/"&gt;Agent Primitives&lt;/a&gt; series.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;&lt;a href="https://artificialcuriositylabs.ai/posts/what-is-a-tool/" rel="noopener noreferrer"&gt;← Part 2: What Is a Tool&lt;/a&gt; · &lt;a href="https://artificialcuriositylabs.ai/posts/what-is-a-skill/" rel="noopener noreferrer"&gt;Part 4: What Is a Skill →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>infrastructure</category>
      <category>patterns</category>
    </item>
    <item>
      <title>What Is a Tool — The API Call Your Agent Makes on Your Behalf</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:19:34 +0000</pubDate>
      <link>https://dev.to/amitrix/what-is-a-tool-the-api-call-your-agent-makes-on-your-behalf-151e</link>
      <guid>https://dev.to/amitrix/what-is-a-tool-the-api-call-your-agent-makes-on-your-behalf-151e</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A tool is a single atomic capability with a JSON schema: name, description, parameters, output. The Slack API has existed since 2013; what's new is an AI model autonomously deciding which one to call and when.&lt;/li&gt;
&lt;li&gt;MCP standardized tool discovery and invocation across platforms — 97 million SDK downloads and 13,000+ public servers in 16 months. The protocol won; the long-tail integration problem is now plug-in, not build.&lt;/li&gt;
&lt;li&gt;Work happens when the agent chains tool calls into a workflow — seven calls, one coherent sequence, no human orchestrating the order.&lt;/li&gt;
&lt;li&gt;Having 250 tools does not make an agent capable. The bottleneck shifts to methodology: which channel, what tone, what prior context to reference. That's the skill layer.&lt;/li&gt;
&lt;li&gt;Stop adding more tools. The tool layer is solved. Build the methodology layer above it.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;APIs have existed for thirty years. Your agent calling one on your behalf — that's new.&lt;/p&gt;

&lt;p&gt;If you've used an AI agent that reads your email, posts to Slack, queries your CRM, and books a meeting — all in one conversation — each of those actions was a tool call. A tool is a single, well-defined capability the agent can invoke. Read a file. Search the web. Create a calendar event. Send a message. Every tool has a name, a description, input parameters, and an output format. The agent reads the description, decides when to call it, fills in the parameters, and interprets the result.&lt;/p&gt;

&lt;p&gt;That sounds simple. It is simple. The interesting part is everything around it.&lt;/p&gt;

&lt;p&gt;This is the second post in the &lt;strong&gt;Agent Primitives&lt;/strong&gt; series: four posts that cut through the confusion around agents, tools, agent harnesses, and skills. The first post defined what an agent is (and isn't). This one defines the atomic layer that harnesses and skills build on: tools.&lt;/p&gt;




&lt;h2&gt;
  
  
  A Tool Is a Function With a JSON Schema
&lt;/h2&gt;

&lt;p&gt;Strip away the marketing and a tool is a function signature. It has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A name&lt;/strong&gt;: &lt;code&gt;email_send&lt;/code&gt;, &lt;code&gt;file_read&lt;/code&gt;, &lt;code&gt;calendar_view&lt;/code&gt;, &lt;code&gt;web_search&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A description&lt;/strong&gt;: natural language explaining what the function does — this is what the model reads to decide whether to use it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input parameters&lt;/strong&gt;: typed fields with descriptions (e.g., &lt;code&gt;to: string&lt;/code&gt;, &lt;code&gt;subject: string&lt;/code&gt;, &lt;code&gt;body: string&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An output&lt;/strong&gt;: whatever the function returns — a list of emails, a file's content, search results, a confirmation&lt;/li&gt;
&lt;/ul&gt;
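
&lt;p&gt;Those four parts can be written down as a plain dictionary in the style of a JSON schema. The exact field layout varies by platform, so treat this as an illustrative shape: the description text and the &lt;code&gt;missing_parameters&lt;/code&gt; helper are assumptions, and the output format is left undeclared here.&lt;/p&gt;

```python
import json

# Hypothetical definition for the email_send tool named in the list above.
email_send_tool = {
    "name": "email_send",
    "description": "Send an email to one recipient.",
    "parameters": {
        "type": "object",
        "properties": {
            "to": {"type": "string", "description": "Recipient address"},
            "subject": {"type": "string", "description": "Subject line"},
            "body": {"type": "string", "description": "Plain-text body"},
        },
        "required": ["to", "subject", "body"],
    },
}


def missing_parameters(tool, arguments):
    """List required parameters that a proposed tool call has not filled in."""
    return [name for name in tool["parameters"]["required"] if name not in arguments]


gaps = missing_parameters(email_send_tool, {"to": "vp@example.com"})
schema_text = json.dumps(email_send_tool, indent=2)  # the form handed to the model
print(gaps)
```

&lt;p&gt;The description string does the most work: it is the only part the model reads when deciding whether this tool fits the current goal.&lt;/p&gt;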

&lt;p&gt;The agent's reasoning model reads the description and decides: given my current goal and the tools available, which one should I call next, and with what parameters?&lt;/p&gt;

&lt;p&gt;That decision — the model selecting and parameterizing a tool call at runtime — is the fundamental shift. APIs existed long before AI agents. SDKs, webhooks, REST endpoints, GraphQL queries. All the plumbing was already there. What changed is that the human is no longer the one deciding which API to call. The model is.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tools Are Not New. Who Calls Them Is.
&lt;/h2&gt;

&lt;p&gt;A Slack API endpoint for posting a message has existed since 2013. An Outlook API for reading email has existed since 2015. A Salesforce API for querying opportunities has existed since the early 2000s. None of this is novel infrastructure.&lt;/p&gt;

&lt;p&gt;What's novel: an AI model sitting in a reasoning loop, examining your goal ("triage my inbox and flag anything from Tier-1 accounts"), scanning its available tools, and autonomously deciding:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Call &lt;code&gt;email_inbox&lt;/code&gt; to pull the last 50 messages&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;file_read&lt;/code&gt; on &lt;code&gt;accounts.csv&lt;/code&gt; to load the Tier-1 list&lt;/li&gt;
&lt;li&gt;For each email, reason about sender, subject, and body against the Tier-1 list&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;file_write&lt;/code&gt; to produce a triage report&lt;/li&gt;
&lt;li&gt;Call &lt;code&gt;conversations_add_message&lt;/code&gt; to post the summary to Slack&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No human selected those tools. No human wrote that sequence. The agent reasoned through it based on the goal and the tools available.&lt;/p&gt;
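
&lt;p&gt;To make the data flow concrete, here is the same sequence with stub tools. In a real agent the model picks each call at runtime; the order is hard-coded here only to show how one tool's output feeds the next. All data is invented, the stubs share names with the tools above but call no real API, and a simple domain match stands in for the model's reasoning in step 3.&lt;/p&gt;

```python
def email_inbox(limit):
    # Stub inbox; a real tool would call the mail API.
    messages = [
        {"sender": "cto@acme.example", "subject": "Renewal terms"},
        {"sender": "news@vendor.example", "subject": "Weekly digest"},
    ]
    return messages[:limit]


def file_read(path):
    # Stub for accounts.csv: one Tier-1 domain per line.
    return "acme.example\nglobex.example\n"


written = {}


def file_write(path, content):
    written[path] = content  # keep the report in memory instead of on disk
    return "wrote " + path


def conversations_add_message(channel, text):
    return {"channel": channel, "text": text}


def triage_inbox():
    emails = email_inbox(limit=50)
    tier1 = set(file_read("accounts.csv").split())
    # Stand-in for the reasoning step: match the sender's domain to the list.
    flagged = [mail for mail in emails if mail["sender"].split("@")[-1] in tier1]
    report = "\n".join(mail["sender"] + ": " + mail["subject"] for mail in flagged)
    file_write("triage_report.md", report)
    summary = str(len(flagged)) + " of " + str(len(emails)) + " emails are from Tier-1 accounts"
    return conversations_add_message("#triage", summary)


posted = triage_inbox()
print(posted["text"])
```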

&lt;p&gt;This is why the tool layer matters: it's the interface between the agent's reasoning and the external world. Without tools, the agent is a &lt;a href="https://firecrawl.dev/blog/what-is-an-agent-harness" rel="noopener noreferrer"&gt;brain in a jar&lt;/a&gt; — it can think about your email, but it can't read it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Seven Categories of Tools
&lt;/h2&gt;

&lt;p&gt;In practice, tools cluster into functional categories. Here's what a production knowledge-work agent actually uses:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;th&gt;What It Enables&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Communication&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Slack MCP server → tools: &lt;code&gt;post_message&lt;/code&gt;, &lt;code&gt;search_messages&lt;/code&gt;, &lt;code&gt;add_reaction&lt;/code&gt;; Outlook MCP server → tools: &lt;code&gt;email_read&lt;/code&gt;, &lt;code&gt;email_reply&lt;/code&gt;, &lt;code&gt;email_forward&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Agent reads and writes to your communication buses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Knowledge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;File read/write, semantic search (RAG), knowledge graph queries&lt;/td&gt;
&lt;td&gt;Agent accesses and updates your information layer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Salesforce MCP server → tools: &lt;code&gt;search_opportunities&lt;/code&gt;, &lt;code&gt;fetch_account_details&lt;/code&gt;, &lt;code&gt;update_opportunity&lt;/code&gt;; dashboard and spreadsheet tools&lt;/td&gt;
&lt;td&gt;Agent queries structured business data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Calendar&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;View events, check availability, book meetings, find rooms&lt;/td&gt;
&lt;td&gt;Agent manages your time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Web&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Search, fetch URLs, browser automation (click, type, screenshot)&lt;/td&gt;
&lt;td&gt;Agent reaches beyond your local corpus&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Run Python, run JavaScript, execute in sandboxed environments&lt;/td&gt;
&lt;td&gt;Agent computes, transforms, analyzes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Generation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Create images, transcribe audio, generate documents (PPTX, PDF, DOCX)&lt;/td&gt;
&lt;td&gt;Agent produces artifacts&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;No single tool is interesting on its own. &lt;code&gt;file_read&lt;/code&gt; by itself is &lt;code&gt;cat&lt;/code&gt;. &lt;code&gt;email_inbox&lt;/code&gt; by itself is Outlook. The power comes from what the agent does with the result of one tool to decide the next tool call. That's the agent loop, and it's the primitive that makes tools useful.&lt;/p&gt;
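&lt;p&gt;To make the loop concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the two tools are stubs with canned data, and &lt;code&gt;choose_next_step&lt;/code&gt; is a hard-coded stand-in for the model's reasoning. The point is only the shape: each tool result is appended to the history, and the next call is chosen from that history.&lt;/p&gt;

```python
from typing import Callable, Optional

# Hypothetical tools: plain functions standing in for real MCP tools.
def email_inbox(limit: int) -> list[dict]:
    return [
        {"sender": "cto@acme.example", "subject": "Renewal question"},
        {"sender": "news@vendor.example", "subject": "Weekly digest"},
    ][:limit]

def file_read(path: str) -> list[str]:
    # Stand-in for reading accounts.csv; returns the Tier-1 domains.
    return ["acme.example"]

TOOLS: dict[str, Callable] = {"email_inbox": email_inbox, "file_read": file_read}

def choose_next_step(history: list[tuple[str, object]]) -> Optional[tuple[str, dict]]:
    """Stand-in for the model: pick the next tool from what has been observed so far."""
    called = [name for name, _ in history]
    if "email_inbox" not in called:
        return ("email_inbox", {"limit": 50})
    if "file_read" not in called:
        return ("file_read", {"path": "accounts.csv"})
    return None  # enough information gathered; stop calling tools

def run_agent_loop() -> list[dict]:
    history: list[tuple[str, object]] = []
    while (step := choose_next_step(history)) is not None:
        name, args = step
        history.append((name, TOOLS[name](**args)))  # the result feeds the next decision
    results = dict(history)
    tier1 = set(results["file_read"])
    return [m for m in results["email_inbox"] if m["sender"].split("@")[1] in tier1]

flagged = run_agent_loop()
```

&lt;p&gt;In a real agent the model replaces &lt;code&gt;choose_next_step&lt;/code&gt;, which is exactly why the sequence does not have to be written by a human.&lt;/p&gt;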




&lt;h2&gt;
  
  
  MCP: The Universal Tool Protocol
&lt;/h2&gt;

&lt;p&gt;Before 2024, every agent platform defined its own tool format. OpenAI had function calling. LangChain had tool classes. Anthropic had tool-use blocks. If you built a Slack integration for one platform, you rebuilt it for every other one.&lt;/p&gt;

&lt;p&gt;Then Anthropic released the &lt;a href="https://ibuidl.org/blog/mcp-model-context-protocol-complete-guide-20260316" rel="noopener noreferrer"&gt;Model Context Protocol&lt;/a&gt; — MCP — in late 2024. It standardized how agents discover, authenticate with, and invoke tools. JSON-RPC over stdio or Streamable HTTP. Language-agnostic. Open standard.&lt;/p&gt;
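&lt;p&gt;As a rough illustration of what "JSON-RPC" means here, the sketch below builds two requests using the MCP method names &lt;code&gt;tools/list&lt;/code&gt; (discovery) and &lt;code&gt;tools/call&lt;/code&gt; (invocation). The helper &lt;code&gt;jsonrpc_request&lt;/code&gt;, the tool name &lt;code&gt;post_message&lt;/code&gt;, and its arguments are assumptions for the example, not taken from any particular server.&lt;/p&gt;

```python
import json
from typing import Optional

def jsonrpc_request(request_id: int, method: str, params: Optional[dict] = None) -> str:
    """Serialize a JSON-RPC 2.0 request as a single line of JSON."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)

# Discovery: ask the server which tools it exposes.
list_tools = jsonrpc_request(1, "tools/list")

# Invocation: call one tool by name with structured arguments (hypothetical tool).
call_tool = jsonrpc_request(
    2,
    "tools/call",
    {"name": "post_message", "arguments": {"channel": "#triage", "text": "3 Tier-1 emails"}},
)
```

&lt;p&gt;The same two message shapes work regardless of which language the server is written in, which is what makes the protocol language-agnostic.&lt;/p&gt;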

&lt;p&gt;The adoption curve has been vertical:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;97 million&lt;/strong&gt; monthly SDK downloads by March 2026 — up from ~2 million at launch. &lt;a href="https://www.digitalapplied.com/blog/mcp-97-million-downloads-model-context-protocol-mainstream" rel="noopener noreferrer"&gt;4,750% growth in 16 months&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;13,000+&lt;/strong&gt; public MCP servers on GitHub as of April 2026, spanning &lt;a href="https://particula.tech/blog/mcp-developer-guide" rel="noopener noreferrer"&gt;databases, dev tools, communication platforms, and cloud infrastructure&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Governed by the Linux Foundation's &lt;a href="https://www.marsdevs.com/blog/model-context-protocol-mcp" rel="noopener noreferrer"&gt;Agentic AI Foundation (AAIF)&lt;/a&gt; with backing from Anthropic, OpenAI, Google, Microsoft, and AWS.&lt;/li&gt;
&lt;li&gt;Natively supported in Claude, Cursor, Windsurf, VS Code, Kiro, and &lt;a href="https://www.buildfastwithai.com/blogs/what-is-model-context-protocol-mcp" rel="noopener noreferrer"&gt;200+ other tools&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MCP is to agent tools what HTTP was to web services — the universal transport. Before HTTP, every networked application had its own protocol. After HTTP, you built one server and any client could talk to it. Before MCP, every agent had its own tool format. After MCP, you &lt;a href="https://bitloops.com/resources/agent-tooling/model-context-protocol-mcp-explained" rel="noopener noreferrer"&gt;build one server and any MCP-compatible agent can discover and call it&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The architecture has three roles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Host&lt;/strong&gt;: the application holding the LLM (Claude Desktop, Amazon Quick Desktop, Cursor)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client&lt;/strong&gt;: maintains a stateful connection to a specific MCP server&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server&lt;/strong&gt;: an independent process exposing &lt;a href="https://tianpan.co/blog/2025-10-09-model-context-protocol-practical-guide" rel="noopener noreferrer"&gt;tools, resources, and prompts&lt;/a&gt; for the agent to use&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A single host connects to multiple servers simultaneously. Each server exposes its own tools. The agent sees all of them in one unified namespace.&lt;/p&gt;
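&lt;p&gt;One way a host might present that unified view is sketched below. This is an assumption about host behaviour, not something MCP prescribes: the host prefixes each tool with its server name so that identically named tools from different servers cannot collide, and keeps a routing table back to the owning server. The server and tool names are the examples from the table earlier in this post.&lt;/p&gt;

```python
# Hypothetical per-server tool listings, as a host might collect them during discovery.
SERVER_TOOLS = {
    "slack": ["post_message", "search_messages", "add_reaction"],
    "outlook": ["email_read", "email_reply", "email_forward"],
    "salesforce": ["search_opportunities", "fetch_account_details"],
}

def build_namespace(server_tools: dict[str, list[str]]) -> dict[str, tuple[str, str]]:
    """Map a qualified tool name to the (server, tool) pair that should receive the call."""
    namespace: dict[str, tuple[str, str]] = {}
    for server, tools in server_tools.items():
        for tool in tools:
            namespace[f"{server}.{tool}"] = (server, tool)
    return namespace

NAMESPACE = build_namespace(SERVER_TOOLS)

def route(qualified_name: str) -> tuple[str, str]:
    """Resolve which server owns a tool the agent selected."""
    return NAMESPACE[qualified_name]
```

&lt;p&gt;The agent only ever sees the keys of &lt;code&gt;NAMESPACE&lt;/code&gt;; the host uses &lt;code&gt;route&lt;/code&gt; to forward each call over the right client connection.&lt;/p&gt;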

&lt;p&gt;This is why MCP matters: it turns the long tail of integrations from a build problem into a plug-in problem. A team at &lt;a href="https://particula.tech/blog/mcp-developer-guide" rel="noopener noreferrer"&gt;Particula Tech shipped eleven enterprise integrations in nine days&lt;/a&gt; because every one spoke MCP. Two years earlier, that would have been three months of custom plumbing per integration.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tool Composition: Where the Real Work Happens
&lt;/h2&gt;

&lt;p&gt;A single tool call is not interesting. &lt;code&gt;file_read&lt;/code&gt; returns text. &lt;code&gt;email_inbox&lt;/code&gt; returns a list. &lt;code&gt;web_search&lt;/code&gt; returns results. None of that constitutes work.&lt;/p&gt;

&lt;p&gt;Work happens when the agent chains tool calls into a workflow:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example — Morning triage:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;email_inbox&lt;/code&gt; → pull last 50 messages from priority inbox folder&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;file_read&lt;/code&gt; → load &lt;code&gt;triage-rules.csv&lt;/code&gt; for sender-tier classification&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;file_read&lt;/code&gt; → load &lt;code&gt;accounts.csv&lt;/code&gt; for account-level context&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;calendar_view&lt;/code&gt; → pull today's meetings for cross-reference&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Agent reasons&lt;/em&gt;: classify each email as T1/T2/T3 based on sender, keywords, account context, calendar overlap&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;file_write&lt;/code&gt; → produce a structured triage report with decision-lines&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;conversations_add_message&lt;/code&gt; → post the T1 items to Slack&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Seven steps, six of them tool calls. One coherent workflow. The agent selected each tool, parameterized it, interpreted the result, and decided the next step. No human orchestrated the sequence.&lt;/p&gt;

&lt;p&gt;That chain — tool selection, invocation, interpretation, next-step reasoning — is not a tool. It's the agent loop using tools. The distinction matters because people confuse tool access with capability. Having 250 tools does not make your agent capable. Having methodology for when and how to combine those tools does.&lt;/p&gt;

&lt;p&gt;That methodology is a skill. And that's the boundary between the two concepts.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 250-Tools-One-Conversation Reality
&lt;/h2&gt;

&lt;p&gt;The scale of tool access in production agents has blown past what anyone expected two years ago.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Amazon Quick Desktop&lt;/strong&gt; runs 250+ tools in a single conversation. Those 250 are individual tool functions spread across all connected MCP servers — Slack MCP server tools (read, write, search, react), Outlook MCP server tools (email and calendar), SharePoint tools, Salesforce MCP server tools, file system tools, Python and JavaScript execution, web search, browser automation, image generation, audio transcription, and more — plus native harness capabilities like AgentCore's code interpreter sandboxes and cloud browser sessions. (AgentCore is an AWS service with its own MCP server that exposes runtime primitives to agents; it gets its own dedicated post.) All simultaneously available. The agent selects from the full set based on the user's intent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code&lt;/strong&gt; integrates with &lt;a href="https://www.speechly.io/blog/what-is-claude-code-ai-coding-assistant-2026" rel="noopener noreferrer"&gt;150+ tools via MCP&lt;/a&gt; — file system, git, shell commands, web search, browser automation, and any MCP server the developer adds. Sub-agents can spawn with their own tool access to work in parallel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kiro&lt;/strong&gt; connects to AWS services natively through Amazon Bedrock, with MCP servers for additional tool access — SAM templates, CloudWatch, DynamoDB, Lambda.&lt;/p&gt;

&lt;p&gt;The number of tools is not the flex. The number is the precondition. Once you have 250 tools available, the bottleneck shifts. The question stops being "can the agent send a Slack message?" and becomes "does the agent know which channel to post in, what tone to use, when to thread vs. top-level, and what prior context to reference?"&lt;/p&gt;

&lt;p&gt;That's not a tool problem. That's a skill problem. Tools provide the hands. Skills provide the judgment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tools vs. Skills: The Critical Distinction
&lt;/h2&gt;

&lt;p&gt;This is where the industry gets confused. Arcade.dev put it precisely: &lt;a href="https://www.arcade.dev/blog/what-are-agent-skills-and-tools" rel="noopener noreferrer"&gt;"Tools and skills get used interchangeably in marketing decks and conference talks, but they represent fundamentally different approaches to extending agent capabilities."&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's the difference:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Skill&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scope&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single atomic capability&lt;/td&gt;
&lt;td&gt;Multi-step methodology&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;State&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stateless — call and return&lt;/td&gt;
&lt;td&gt;Stateful — encodes workflow, rules, quality checks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A hand&lt;/td&gt;
&lt;td&gt;A brain directing the hand&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Example&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;email_send(to, subject, body)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"Draft a reply to the highest-priority email, using the right voice mode, following the triage classification rules, and checking against the quality bar before sending"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Persistence&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None — defined once in a schema&lt;/td&gt;
&lt;td&gt;Versioned, shareable, evolves from experience&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Knowledge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;What to do (send email)&lt;/td&gt;
&lt;td&gt;How and when to do it (which email, what voice, what rules)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A tool is an API call. A skill is institutional knowledge about when, why, and how to make that API call — and the six calls that should follow it.&lt;/p&gt;
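&lt;p&gt;The contrast can be sketched in a few lines of Python. This is illustrative only: &lt;code&gt;email_send&lt;/code&gt; here just appends to an in-memory list, and the &lt;code&gt;ReplySkill&lt;/code&gt; class, its recipient list, and its word limit are hypothetical. Real skills are written as instructions for the model rather than as classes, but the division of labour is the same: the tool acts, the skill decides whether and how.&lt;/p&gt;

```python
from dataclasses import dataclass, field

OUTBOX: list[dict] = []

def email_send(to: str, subject: str, body: str) -> None:
    """Tool: one atomic action with no opinion about when or how to use it."""
    OUTBOX.append({"to": to, "subject": subject, "body": body})

@dataclass
class ReplySkill:
    """Skill: rules and a quality bar wrapped around the tool call (all values hypothetical)."""
    vip_recipients: set = field(default_factory=lambda: {"vp@corp.example"})
    max_words: int = 80

    def passes_quality_bar(self, to: str, body: str) -> bool:
        if len(body.split()) > self.max_words:
            return False
        # Rule encoded in the skill: never hedge when writing to leadership.
        if to in self.vip_recipients and "maybe" in body.lower():
            return False
        return True

    def reply(self, to: str, subject: str, body: str) -> bool:
        """Send only if the draft clears the quality bar; report whether it was sent."""
        if not self.passes_quality_bar(to, body):
            return False
        email_send(to, subject, body)
        return True

skill = ReplySkill()
sent_direct = skill.reply("vp@corp.example", "Renewal", "Please approve the renewal by Friday.")
sent_hedged = skill.reply("vp@corp.example", "Renewal", "Maybe we could look at the renewal?")
```

&lt;p&gt;Both drafts reach the same tool; only the skill knows that the second one should not go out.&lt;/p&gt;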

&lt;p&gt;The &lt;a href="https://www.gtmaipodcast.com/p/what-is-an-agent-what-is-a-skill" rel="noopener noreferrer"&gt;GTM AI Podcast&lt;/a&gt; framed it well: "Tools let agents act. Skills provide the knowledge of how and when to act — including the company-specific, team-specific, and user-specific context that separates a capable AI from a competent one."&lt;/p&gt;

&lt;p&gt;If you're building an agent and you think the answer is "add more tools," you're solving the wrong problem. The agent already has the tools. What it's missing is the methodology to use them well. That's the skill layer, and it's the topic of Part 4 of this series.&lt;/p&gt;




&lt;h2&gt;
  
  
  So What
&lt;/h2&gt;

&lt;p&gt;Tools are the atomic capabilities that let agents interact with the world. They are necessary but not sufficient. MCP standardized how tools are discovered and invoked, a protocol win comparable to HTTP. The ecosystem response: &lt;a href="https://particula.tech/blog/mcp-developer-guide" rel="noopener noreferrer"&gt;97 million monthly SDK downloads and 13,000+ public servers within roughly 16 months of launch&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But tools alone don't produce work. A tool is a single function call. Work is a chain of function calls governed by methodology — which tools to call, in what order, with what parameters, interpreted against what context, and checked against what quality bar.&lt;/p&gt;

&lt;p&gt;The tool layer is solved. MCP won. The open question is the layer above it: who encodes the methodology that makes those tools useful? That's the skill layer, and it's the new frontier.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 2 of the &lt;a href="https://dev.to/tags/agents/"&gt;Agent Primitives&lt;/a&gt; series.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;&lt;a href="https://artificialcuriositylabs.ai/posts/what-is-an-agent/" rel="noopener noreferrer"&gt;← Part 1: What Is an Agent&lt;/a&gt; · &lt;a href="https://artificialcuriositylabs.ai/posts/what-is-an-agent-harness/" rel="noopener noreferrer"&gt;Part 3: What Is an Agent Harness →&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>patterns</category>
      <category>mcp</category>
    </item>
    <item>
      <title>What Is a Skill — Why Methodology Resets Every Session Without One</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:18:59 +0000</pubDate>
      <link>https://dev.to/amitrix/what-is-a-skill-why-methodology-resets-every-session-without-one-jn0</link>
      <guid>https://dev.to/amitrix/what-is-a-skill-why-methodology-resets-every-session-without-one-jn0</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Every new agent session resets methodology — the model knows how to reason, the tools know the API, neither knows your 11-step triage heuristic or your sender-tier rules. That gap is where skills live.&lt;/li&gt;
&lt;li&gt;A skill is a structured &lt;code&gt;SKILL.md&lt;/code&gt; file encoding reusable methodology: when to trigger, what inputs to gather, what tools in what order, quality gates, and explicit anti-patterns. Portable across 30+ platforms.&lt;/li&gt;
&lt;li&gt;Skills are not prompts (ephemeral) or tools (stateless atoms) — they are durable, versioned, testable institutional knowledge that survives model upgrades and context resets.&lt;/li&gt;
&lt;li&gt;One caught failure encoded into a skill means every future session — yours and every teammate who installs it — inherits the fix automatically.&lt;/li&gt;
&lt;li&gt;Build the skill layer: tools give agents hands, skills give agents methodology. Without skills, 250 connected tools still produce mediocre output.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Every agent session starts from zero. The model is brilliant. The tools are connected. And still — you spend the first ten minutes re-explaining how you work.&lt;/p&gt;

&lt;p&gt;That's the methodology reset problem. And skills are the fix.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Reset Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Open a new session with any AI agent. Ask it to triage your Slack. It will read your channels, classify messages, and produce a summary. The summary will be wrong — not factually, but methodologically. It doesn't know that your manager's DMs are always Tier 1. It doesn't know that "Bedrock" appears in 44% of your emails and shouldn't auto-escalate to urgent. It doesn't know your account team handles Cursor through three specific Slack channels, not email.&lt;/p&gt;

&lt;p&gt;You explain this. The agent adjusts. The output improves. Forty minutes later, you have a workflow that works.&lt;/p&gt;

&lt;p&gt;Tomorrow, you open a new session. The agent has no memory of any of this. You start over.&lt;/p&gt;

&lt;p&gt;This happens because agents have two layers and are missing a third. They have &lt;strong&gt;reasoning&lt;/strong&gt; (the model) and &lt;strong&gt;capabilities&lt;/strong&gt; (the tools). What they don't have is &lt;strong&gt;methodology&lt;/strong&gt; — the encoded knowledge of how YOU approach work. The model knows how to reason about Slack messages. The Slack MCP server exposes tools that know how to call the Slack API. Neither knows that your triage rules classify senders into three tiers based on a contacts registry, apply an 11-step heuristic in strict priority order, and output decision-lines instead of summaries.&lt;/p&gt;

&lt;p&gt;The gap between "agent can do a thing" and "agent does the thing the way I need it done" — that's where skills live.&lt;/p&gt;




&lt;h2&gt;
  
  
  What a Skill Actually Is
&lt;/h2&gt;

&lt;p&gt;A skill is a structured file — typically &lt;code&gt;SKILL.md&lt;/code&gt; — that encodes reusable methodology for an AI agent. It tells the agent: when to activate, what inputs to gather, what tools to use in what order, what quality checks to run, and what mistakes to avoid.&lt;/p&gt;

&lt;p&gt;The format has converged across 30+ agent platforms. &lt;a href="https://blog.serghei.pl/posts/agent-skills-101/" rel="noopener noreferrer"&gt;SKILL.md works in Claude Code, GitHub Copilot, Cursor, Gemini CLI, OpenAI Codex, Windsurf, Roo Code, and others&lt;/a&gt; — a single skill file, portable across environments. YAML frontmatter declares metadata. Markdown body contains the instructions. Optional &lt;code&gt;scripts/&lt;/code&gt;, &lt;code&gt;references/&lt;/code&gt;, and &lt;code&gt;assets/&lt;/code&gt; directories carry supporting materials.&lt;/p&gt;
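&lt;p&gt;For readers who have not seen one, the sketch below shows a stripped-down skill file and a small parser that splits it into frontmatter and body. The skill content is invented for illustration, and the field names &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt; are an assumption about the typical minimum metadata. The parser handles only flat key/value frontmatter; a real loader would use a proper YAML library.&lt;/p&gt;

```python
SKILL_MD = """---
name: slack-triage
description: Classify unread Slack messages into tiers and emit decision-lines.
---
# Slack Triage

1. Load the contacts registry.
2. Apply the classification rules in priority order.
"""

def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md document into (frontmatter fields, Markdown body)."""
    # The file opens with a '---' line, so the first piece of the split is empty.
    _, frontmatter, body = text.split("---\n", 2)
    fields: dict[str, str] = {}
    for line in frontmatter.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields, body.strip()

metadata, instructions = parse_skill(SKILL_MD)
```

&lt;p&gt;Because the whole thing is plain text, the same file can be read by any platform that understands the convention.&lt;/p&gt;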

&lt;p&gt;The architecture uses &lt;a href="https://open.substack.com/pub/swirlai/p/agent-skills-progressive-disclosure" rel="noopener noreferrer"&gt;progressive disclosure&lt;/a&gt;: metadata (~100 tokens) is always loaded, full instructions (&amp;lt;5,000 tokens) load only when the skill is triggered, and resources load only during execution. This keeps token costs low while making the full methodology available on demand.&lt;/p&gt;
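&lt;p&gt;A toy version of the first two disclosure levels might look like the following. It is a sketch under simplifying assumptions: triggering is done by substring match on hypothetical trigger phrases, whereas real platforms generally let the model decide from the skill description, and the third level (resources) is left out.&lt;/p&gt;

```python
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    description: str            # level 1: always placed in context
    instructions: str           # level 2: added only when the skill triggers
    triggers: tuple[str, ...]   # hypothetical trigger phrases

SKILLS = [
    Skill("slack-triage", "Triage Slack into decision-lines.",
          "Step 1: load the contacts registry. Step 2: apply rules in order.",
          ("triage slack", "check my slack", "what did i miss")),
    Skill("email-thread-reply", "Reply to the right message in a thread.",
          "Resolve the target sender's itemId before calling the reply tool.",
          ("reply to", "respond to thread")),
]

def build_context(user_request: str) -> list[str]:
    """Always include one metadata line per skill; add full instructions only on a trigger match."""
    request = user_request.lower()
    context = [f"{s.name}: {s.description}" for s in SKILLS]
    for s in SKILLS:
        if any(t in request for t in s.triggers):
            context.append(s.instructions)
    return context
```

&lt;p&gt;A request that matches no trigger pays only for the short metadata lines, which is where the token savings come from.&lt;/p&gt;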

&lt;p&gt;A concrete example. My "Slack Triage" skill encodes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trigger&lt;/strong&gt;: "triage slack", "check my slack", "what did I miss"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inputs&lt;/strong&gt;: time window, channel scope (tier-1 only, deal-rooms, all)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data sources&lt;/strong&gt;: a contacts registry (54 contacts with tiers), a classification rules file (88 rules), a monitored channels list&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Methodology&lt;/strong&gt;: 11-step classification heuristic — check sender tier first, then noise patterns, then auto-sender detection, then keywords, then thread context, then account-name matching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quality gates&lt;/strong&gt;: "Bedrock" over-trigger guard (prevents 44% of emails from flooding Tier 1), thread collapsing (same conversation → one item, highest tier wins)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anti-patterns&lt;/strong&gt;: Don't classify based on subject line alone. Don't treat all @company.com senders as Tier 1. Don't skip the calendar cross-reference.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output format&lt;/strong&gt;: Decision-lines, not summaries. Each line is an action ("Reply yes/no", "Read before 11am"), not a description of what happened.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of that information lives in the model. None of it lives in the Slack API. It lives in the skill. Remove the skill, and the agent produces a generic Slack summary that ignores your sender tiers, your channel priorities, and your action-oriented output format.&lt;/p&gt;
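&lt;p&gt;A heavily reduced sketch of that methodology is shown below: two of the classification steps (sender tier, then keywords with the over-trigger guard) plus thread collapsing. The registry contents, keyword sets, and message fields are all invented for the example, and the actual skill is written as instructions for the agent rather than as code.&lt;/p&gt;

```python
CONTACT_TIERS = {"manager": 1, "account-lead": 2}       # hypothetical contacts registry
URGENT_KEYWORDS = {"outage", "escalation", "bedrock"}   # hypothetical keyword rules
NOISY_KEYWORDS = {"bedrock"}                            # too common to escalate on their own

def classify(message: dict) -> int:
    """Return a tier (1 is most urgent). Rules run in strict priority order."""
    # Sender tier is checked first and wins over everything else.
    if message["sender"] in CONTACT_TIERS:
        return CONTACT_TIERS[message["sender"]]
    # Keyword step, with the over-trigger guard removing noisy terms.
    words = set(message["text"].lower().split())
    hits = words.intersection(URGENT_KEYWORDS) - NOISY_KEYWORDS
    return 1 if hits else 3

def collapse_threads(messages: list[dict]) -> dict[str, int]:
    """One item per thread; the highest tier (lowest number) wins."""
    tiers: dict[str, int] = {}
    for m in messages:
        tier = classify(m)
        tiers[m["thread"]] = min(tier, tiers.get(m["thread"], tier))
    return tiers

MESSAGES = [
    {"sender": "bot", "text": "Bedrock newsletter", "thread": "t1"},
    {"sender": "manager", "text": "quick question", "thread": "t2"},
    {"sender": "peer", "text": "customer outage now", "thread": "t1"},
]
result = collapse_threads(MESSAGES)
```

&lt;p&gt;The first message mentions "Bedrock" but stays at Tier 3 on its own; thread &lt;code&gt;t1&lt;/code&gt; is still promoted because a later message in the same thread reports an outage.&lt;/p&gt;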




&lt;h2&gt;
  
  
  Skills vs. Tools vs. Functions vs. Prompts
&lt;/h2&gt;

&lt;p&gt;The industry uses these terms interchangeably. They are four different things.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Primitive&lt;/th&gt;
&lt;th&gt;What it is&lt;/th&gt;
&lt;th&gt;Analogy&lt;/th&gt;
&lt;th&gt;Persistence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Function&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A single API endpoint the model can call. JSON schema: name, parameters, return type.&lt;/td&gt;
&lt;td&gt;A single verb — "read", "send", "search"&lt;/td&gt;
&lt;td&gt;None — stateless&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tool&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;An atomic capability exposed to the agent — read a file, post to Slack, query a database. A superset of functions (some tools compose multiple functions).&lt;/td&gt;
&lt;td&gt;The hands&lt;/td&gt;
&lt;td&gt;None — stateless&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Prompt&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A one-shot instruction to the model. No structure, no persistence, no versioning.&lt;/td&gt;
&lt;td&gt;A sticky note&lt;/td&gt;
&lt;td&gt;Session only — gone tomorrow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Skill&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Encoded methodology combining multiple tools with domain knowledge, quality checks, and anti-patterns.&lt;/td&gt;
&lt;td&gt;The brain telling the hands what to do, in what order, and why&lt;/td&gt;
&lt;td&gt;Durable — versioned, portable, shareable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://www.arcade.dev/blog/what-are-agent-skills-and-tools" rel="noopener noreferrer"&gt;Arcade.dev&lt;/a&gt; states it directly: "'Tools' and 'skills' get used interchangeably in marketing decks and conference talks, but they represent fundamentally different approaches to extending agent capabilities. Understanding this distinction is the difference between building agents that work in demos versus agents that work in production."&lt;/p&gt;

&lt;p&gt;Another way to frame it, from &lt;a href="https://www.gtmaipodcast.com/p/what-is-an-agent-what-is-a-skill" rel="noopener noreferrer"&gt;GTM AI Podcast&lt;/a&gt;: "Tools let agents act. Skills provide the knowledge of how and when to act — including the company-specific, team-specific, and user-specific context that separates a capable AI from a competent one."&lt;/p&gt;

&lt;p&gt;The implication: you can give an agent 250 tools and it will still produce mediocre output if it lacks the methodology to use them correctly. Tools are necessary but not sufficient. Skills are what close the gap between capability and competence.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Separation Principle
&lt;/h2&gt;

&lt;p&gt;This is not a cosmetic distinction. It is an &lt;a href="https://aiforbeginners.substack.com/p/the-separation-principle" rel="noopener noreferrer"&gt;architectural one&lt;/a&gt;: "The model provides reasoning; the skill provides context; the composition produces behaviour that neither could generate alone."&lt;/p&gt;

&lt;p&gt;Skills separate &lt;strong&gt;what the model can do&lt;/strong&gt; from &lt;strong&gt;what the model should do in this specific context&lt;/strong&gt;. The model can draft an email. The skill knows that emails to VP+ recipients use the leadership voice, never hedge the close, and always lead with the direct ask. The model can search Slack. The skill knows that &lt;code&gt;from:&lt;/code&gt; queries require aliases, not display names, and that DM channel IDs starting with &lt;code&gt;D&lt;/code&gt; don't work with the &lt;code&gt;in:&lt;/code&gt; filter.&lt;/p&gt;

&lt;p&gt;This separation matters because it means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Skills survive model upgrades&lt;/strong&gt;. Swap Sonnet for Opus. The skill still works. The methodology is independent of the reasoning engine.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills survive context window resets&lt;/strong&gt;. New session, same skill file. No re-explanation needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills are diffable and versionable&lt;/strong&gt;. They're Markdown files. Git tracks every change. You can review what changed, when, and why.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skills are testable&lt;/strong&gt;. You can define eval cases — specific inputs that should produce specific outputs — and verify the skill produces correct behavior after changes.&lt;/li&gt;
&lt;/ol&gt;
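&lt;p&gt;Testability is the least obvious of the four, so here is one possible shape for it. The eval cases, the &lt;code&gt;run_evals&lt;/code&gt; helper, and the two &lt;code&gt;triage_*&lt;/code&gt; functions are hypothetical; the functions stand in for "run the agent with this version of the skill loaded" so the example stays self-contained.&lt;/p&gt;

```python
from typing import Callable

# Hypothetical eval cases: an input message and the tier the skill should assign.
EVAL_CASES = [
    ({"sender": "manager", "text": "lunch?"}, 1),
    ({"sender": "newsletter-bot", "text": "Bedrock weekly digest"}, 3),
]

def run_evals(skill_fn: Callable[[dict], int]) -> list[str]:
    """Return a description of every failing case; an empty list means the skill passed."""
    failures = []
    for message, expected in EVAL_CASES:
        actual = skill_fn(message)
        if actual != expected:
            failures.append(f"{message['sender']}: expected T{expected}, got T{actual}")
    return failures

def triage_v1(message: dict) -> int:
    # Stand-in for the skill before the over-trigger guard existed.
    return 1 if message["sender"] == "manager" or "bedrock" in message["text"].lower() else 3

def triage_v2(message: dict) -> int:
    # Stand-in for the same skill after the guard was added.
    return 1 if message["sender"] == "manager" else 3

failures_v1 = run_evals(triage_v1)
failures_v2 = run_evals(triage_v2)
```

&lt;p&gt;Running the same cases after every edit to the skill file turns "I think this still works" into a check that either passes or names the case that regressed.&lt;/p&gt;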




&lt;h2&gt;
  
  
  The 8-Phase Skill Lifecycle
&lt;/h2&gt;

&lt;p&gt;Skills are not static files. They evolve through a lifecycle — and the lifecycle is what separates a personal hack from institutional infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Catch&lt;/strong&gt; — An agent makes a mistake. You correct it. This is the raw material. Example: the agent replied to the wrong message in an email thread because it used the first &lt;code&gt;itemId&lt;/code&gt; instead of the target sender's &lt;code&gt;itemId&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Author&lt;/strong&gt; — You encode the correction as a skill. The &lt;code&gt;email-thread-reply&lt;/code&gt; skill now resolves the correct &lt;code&gt;itemId&lt;/code&gt; for the target sender's message before calling the reply tool. The failure mode is baked into the methodology so it never recurs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Discover&lt;/strong&gt; — Others find the skill. A shared store, shared knowledge spaces, a git repo — discovery is the prerequisite for distribution. &lt;a href="https://arxiv.org/html/2603.14805" rel="noopener noreferrer"&gt;An MIT/UCSB study&lt;/a&gt; validated that flat skill libraries fail without structured discovery and adaptation mechanisms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Chain&lt;/strong&gt; — Skills compose. The "morning briefing" isn't one skill — it's &lt;code&gt;slack-triage&lt;/code&gt; → &lt;code&gt;email-triage&lt;/code&gt; → &lt;code&gt;calendar-triage&lt;/code&gt; → &lt;code&gt;draft-responder&lt;/code&gt;, sequenced by a scheduler. Each skill is independent; the chain produces emergent capability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Scrub&lt;/strong&gt; — Before sharing, strip PII. Personal file paths, Slack channel IDs, customer names, CRM IDs — all of it gets replaced with parameters. The skill becomes portable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Distribute&lt;/strong&gt; — Push to a shared store. Today this happens through git repos, shared folders, or shared knowledge spaces acting as skill stores. Tomorrow it should be a native platform capability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Adapt&lt;/strong&gt; — A teammate installs the skill and adjusts it for their context. Different Slack channels. Different sender tiers. Different voice settings. The methodology stays; the parameters change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8. Evolve&lt;/strong&gt; — The skill improves from experience. A new failure mode is caught in phase 1 and baked back into the skill in phase 2. The cycle repeats. Every iteration makes the skill more durable.&lt;/p&gt;

&lt;p&gt;This lifecycle is not theoretical. I run 40+ skills through it. When I catch a triage failure at 8am, the fix is in the skill by 8:15am. Every future session — mine and anyone who installs the skill — inherits the fix automatically.&lt;/p&gt;
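&lt;p&gt;The Scrub phase is the easiest one to automate, at least partially. The sketch below replaces a few kinds of personal identifiers with named placeholders. The three patterns and the placeholder names are illustrative assumptions, not a complete PII rule set; customer names and CRM IDs in particular would need rules specific to your own data.&lt;/p&gt;

```python
import re

# Illustrative patterns only; a real scrubber would need a broader, reviewed rule set.
SCRUB_RULES = [
    (re.compile(r"/(?:Users|home)/[^/\s]+"), "{{HOME_DIR}}"),
    (re.compile(r"\b[CDG][A-Z0-9]{8,}\b"), "{{SLACK_CHANNEL_ID}}"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "{{EMAIL}}"),
]

def scrub(skill_text: str) -> str:
    """Replace personal identifiers with named parameters so the skill can be shared."""
    for pattern, placeholder in SCRUB_RULES:
        skill_text = pattern.sub(placeholder, skill_text)
    return skill_text

raw = "Read /Users/alex/notes/rules.csv, then post to C024BE91L and cc jo@corp.example."
clean = scrub(raw)
```

&lt;p&gt;The placeholders double as the parameters a teammate fills in during the Adapt phase.&lt;/p&gt;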




&lt;h2&gt;
  
  
  Skills Are Institutional Knowledge
&lt;/h2&gt;

&lt;p&gt;Here's the argument that changes how you think about skills.&lt;/p&gt;

&lt;p&gt;When one person catches a failure mode and encodes it in a skill, everyone who installs that skill inherits the fix. The knowledge compounds across people without meetings, training sessions, or documentation review cycles.&lt;/p&gt;

&lt;p&gt;Traditional institutional knowledge flows look like this: someone discovers a better way to do something → writes a wiki page → nobody reads it → the knowledge dies with the person when they change teams.&lt;/p&gt;

&lt;p&gt;Skill-based institutional knowledge flows look like this: someone discovers a better way to do something → encodes it in a skill → pushes to a shared store → anyone who installs it gets the improvement automatically → when they encounter a new failure, they push a fix back → the skill compounds.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.christopherspenn.com/2026/02/agents-vs-skills-in-ai-understanding-the-key-difference-for-smarter-automation/" rel="noopener noreferrer"&gt;Christopher Spencer Penn&lt;/a&gt; captures it: "In modern agentic AI systems, agents can use skills, and skills can invoke agents. For example, I might have a skill called 'find the bloody bug' that kicks off three different kinds of debugging agents."&lt;/p&gt;

&lt;p&gt;Skills are executable. Wiki pages are not. That's the difference between knowledge that sits and knowledge that works.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Skill Distribution Frontier
&lt;/h2&gt;

&lt;p&gt;Building a great skill means nothing if others can't find, install, and adapt it.&lt;/p&gt;

&lt;p&gt;This is the frontier. Today, skill distribution is duct tape — shared folders, git repos, manual copy-paste. The 8-phase lifecycle works for a solo builder maintaining 40 skills. It breaks at team scale without native platform support.&lt;/p&gt;

&lt;p&gt;What's needed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt;: Semantic search over a skill store — "I need a skill for triaging customer emails" should return relevant skills ranked by quality and adoption.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Install&lt;/strong&gt;: One-click install that parameterizes the skill for the user's context — their channels, their registries, their voice settings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adaptation&lt;/strong&gt;: Fork a skill, adjust it, contribute improvements back. Git for skills.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Quality signals&lt;/strong&gt;: Usage metrics, failure rates, user ratings. Not every skill is worth installing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PII scrubbing as a first-class gate&lt;/strong&gt;: Before any skill leaves a personal workspace, it passes through automated PII detection. File paths, channel IDs, customer names, CRM IDs — all parameterized.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://calmops.com/ai/ai-agent-skills-complete-guide-2026/" rel="noopener noreferrer"&gt;CalmOps&lt;/a&gt; describes the maturity curve: "Unlike generic tools that provide single functions, skills encapsulate the complete knowledge and logic required to handle a specialized domain."&lt;/p&gt;

&lt;p&gt;The platform that solves skill distribution — discovery, installation, adaptation, and quality feedback — will own the institutional knowledge layer for agent-native work. Every competing platform (Anthropic, Salesforce, ServiceNow, Glean) is building toward this. The winner will be the one that treats the full 8-phase lifecycle as a first-class system, not a marketplace bolted on top.&lt;/p&gt;




&lt;h2&gt;
  
  
  So What
&lt;/h2&gt;

&lt;p&gt;Skills are the missing primitive between tools and agents. Tools give agents hands. Skills give agents methodology. Without skills, every session starts from zero — the agent re-discovers your workflow, your quality bar, your anti-patterns through trial and error. With skills, the first session establishes the methodology and every subsequent session inherits it.&lt;/p&gt;

&lt;p&gt;Memories fade. Context windows reset. Prompts are ephemeral.&lt;/p&gt;

&lt;p&gt;Skills persist.&lt;/p&gt;

&lt;p&gt;The rest of this series covers the other three primitives: what an agent is (the loop), what a tool is (the hands), and what a harness is (the infrastructure). Skills are the brain that coordinates all three — the layer where institutional knowledge becomes executable, shareable, and compounding.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 4 of the &lt;a href="https://dev.to/tags/agents/"&gt;Agent Primitives&lt;/a&gt; series.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;&lt;a href="https://artificialcuriositylabs.ai/posts/what-is-an-agent-harness/" rel="noopener noreferrer"&gt;← Part 3: What Is an Agent Harness&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>patterns</category>
      <category>skills</category>
    </item>
    <item>
      <title>We're All Builders Now</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:18:23 +0000</pubDate>
      <link>https://dev.to/amitrix/were-all-builders-now-56mg</link>
      <guid>https://dev.to/amitrix/were-all-builders-now-56mg</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;AI didn't make building easier — a 2025 METR study found experienced developers using AI tools took 19% &lt;em&gt;longer&lt;/em&gt; on complex tasks. The tool is an access enabler, not a simplifier.&lt;/li&gt;
&lt;li&gt;The gate that moved was technical credential, not judgment. Domain experts who always had the clearest view of the problem now have direct access to execution primitives.&lt;/li&gt;
&lt;li&gt;The internet democratized information access; AI democratizes creation access. Both left the hard part hard.&lt;/li&gt;
&lt;li&gt;What didn't move: knowing what to build, conviction, distribution, customer obsession, and the stubbornness to finish well.&lt;/li&gt;
&lt;li&gt;If you can see the problem clearly, the argument that someone with an engineering credential has to build it for you is gone.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Building was never easy.&lt;/p&gt;

&lt;p&gt;Writing software took years to learn. Most people spent careers getting good at it — understanding systems, debugging at 2am, learning the hard way why certain architectural decisions rot over time. That's real craft. Nobody is taking that away.&lt;/p&gt;

&lt;p&gt;But access to creation? That was always a different problem. And that's what changed.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Internet Did This First
&lt;/h2&gt;

&lt;p&gt;When the internet arrived, it didn't make information easier to create or understand. Journalism still required skill. Research still required rigor. Writing a good book was still hard.&lt;/p&gt;

&lt;p&gt;What it did was remove the gatekeeping between knowledge and the people who needed it. You no longer had to live near a great library, or know someone who subscribed to the right journals, or have an institution behind you to access what had been written. Geography, wealth, and institutional access stopped being filters. The knowledge was always there. The internet removed the walls around it.&lt;/p&gt;

&lt;p&gt;AI is doing the same thing — one layer up. Not to information. To creation.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Actually Changed
&lt;/h2&gt;

&lt;p&gt;For most of computing history, building digital things required a specific kind of access: the ability to write code. Or know someone who could. Or wait for a roadmap item to survive a planning cycle and land in an engineer's queue.&lt;/p&gt;

&lt;p&gt;That wasn't a commentary on who had the best ideas. Domain experts — the consultant who had seen the same broken workflow at forty companies, the operator who knew exactly where the process fell apart every quarter, the analyst who could have told you what the dashboard should show three years ago if anyone had asked — often had the clearest view of the problem. They couldn't build the solution themselves. The execution layer required a credential they didn't have.&lt;/p&gt;

&lt;p&gt;AI removed that specific gate. Not by making building easy. By making access to building primitives broadly available — to anyone with a clear enough problem and genuine enough curiosity to try.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.infoworld.com/article/2337677/low-code-development-technologies-market-forecast-to-hit-445-billion-by-2026.html" rel="noopener noreferrer"&gt;Gartner projects&lt;/a&gt; the low-code market will hit $44.5 billion by 2026, with 75% of new applications incorporating no-code or low-code solutions. Operations managers are shipping dashboards. Product leads are building internal tools. Finance analysts are deploying apps. The through-line in every case: the tool caught up with the intent. The thing that used to require engineering cycles now requires an afternoon and a clear enough problem statement.&lt;/p&gt;

&lt;p&gt;That's democratization. Not simplification.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Hard Part Is Still Hard
&lt;/h2&gt;

&lt;p&gt;Here's what AI didn't touch: the judgment gap.&lt;/p&gt;

&lt;p&gt;Knowing what to build — that's still human. Most people solve the wrong problem. They build clever solutions to the problem they wished existed, not the one that's actually costing someone time and money every week. Domain knowledge plus genuine customer obsession is still how you find the right problem. No model gives you that.&lt;/p&gt;

&lt;p&gt;Conviction — the willingness to commit to a specific bet when everything is uncertain — is still human. AI can generate options indefinitely. It cannot choose. The person who can look at an ambiguous situation and say "this specific thing, now, for these people" — that's still the rare thing.&lt;/p&gt;

&lt;p&gt;Distribution. Drive. Scaling judgment. The ability to know which customers to listen to and which to ignore. The stubbornness to keep going when it's not working. The clarity to kill something that's not working, and to do it fast enough. None of that moved.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/" rel="noopener noreferrer"&gt;METR study (2025)&lt;/a&gt; makes this concrete: in a randomized controlled trial of experienced open-source developers working on complex, real-world tasks from their own repositories, those using AI tools took 19% &lt;em&gt;longer&lt;/em&gt; than those without. The tool is not a simplifier. It is an access enabler. What you do with that access still depends entirely on what you bring to it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Gate Moved, Not the Climb
&lt;/h2&gt;

&lt;p&gt;The analogy that keeps coming up: "The Internet Democratized Information, AI Democratizes Intelligence." That's close. But for builders specifically, the sharper version is: &lt;strong&gt;the internet democratized access to information, AI democratizes access to creation.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Both leave the hard part hard. Reading everything ever written about surgery doesn't make you a surgeon. Having access to every building primitive in existence doesn't make you a good builder. What it does is remove the argument that someone else has to build the thing you can see clearly and they can't.&lt;/p&gt;

&lt;p&gt;The gate was always in the wrong place. It filtered by technical credential when it should have filtered by judgment, curiosity, and understanding of the problem. That filter is now gone.&lt;/p&gt;

&lt;p&gt;Which means the people who always had the deepest understanding of the problem — the domain experts, the operators, the people closest to real pain — now have the tools to do something about it.&lt;/p&gt;

&lt;p&gt;That's the shift. Not that building got easier. That access to the act of building got broader.&lt;/p&gt;




&lt;h2&gt;
  
  
  So What
&lt;/h2&gt;

&lt;p&gt;Builder mode isn't a job title. It isn't a technical credential. It isn't something you earn after enough years in an IDE.&lt;/p&gt;

&lt;p&gt;It's an orientation — the decision to see a problem clearly, take it seriously, and do something about it rather than wait for someone else to. &lt;a href="https://artificialcuriositylabs.ai/posts/builder-is-an-operating-mode/" rel="noopener noreferrer"&gt;That operating mode is available to anyone&lt;/a&gt; with the curiosity to look hard at a problem and the drive to show up with something real.&lt;/p&gt;

&lt;p&gt;The access barrier is gone. The hard part — conviction, judgment, customer obsession, drive — was always human. Still is.&lt;/p&gt;

&lt;p&gt;What I haven't worked out: whether broader access actually produces better outcomes, or whether the judgment gap is wide enough that it just produces more noise. The METR data suggests experienced builders get &lt;em&gt;slower&lt;/em&gt; when AI is in the loop on complex work — which implies the thing that separates good builders from the rest isn't access to tools. If that's true, opening the gate changes who can start. It doesn't change who finishes well.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part 2 of the Builders series. &lt;a href="https://artificialcuriositylabs.ai/posts/builder-is-an-operating-mode/" rel="noopener noreferrer"&gt;← Part 1: Builder Is an Operating Mode, Not a Job Title&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>patterns</category>
    </item>
    <item>
      <title>Voice Is a Layer, Not a Setting</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:17:47 +0000</pubDate>
      <link>https://dev.to/amitrix/voice-is-a-layer-not-a-setting-491o</link>
      <guid>https://dev.to/amitrix/voice-is-a-layer-not-a-setting-491o</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Five writing skills with embedded voice instructions = five drifting definitions; the same person sounds like different writers within months.&lt;/li&gt;
&lt;li&gt;The fix is four independent layers: mode detection → voice + quality → format → publish. Voice lives in one place, called by everything else.&lt;/li&gt;
&lt;li&gt;A single correction to the centralized voice layer propagates instantly across blog posts, Slack threads, emails, and strategy docs — no hunting across five skills.&lt;/li&gt;
&lt;li&gt;Mode detection runs before a single word is written, resolving context from a five-signal hierarchy (explicit override, recipient, role, channel, intent keywords) with no manual selection required.&lt;/li&gt;
&lt;li&gt;Every major tool treats voice as a setting. Separate the layers, centralize the voice, and the drift problem disappears.&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The person is the constant. The mode is the variable. The medium is irrelevant.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you have five writing skills and each one defines voice separately, you have five competing voice definitions. Over time they drift. A blog post and a Slack thread about the same topic come out sounding like different people wrote them. Not because the agent changed — because the voice instructions were never in the same place.&lt;/p&gt;

&lt;p&gt;The fix is architecture, not better prompts. Voice is a layer. It belongs in one place, called by everything else.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem With Embedded Voice
&lt;/h2&gt;

&lt;p&gt;Every AI writing tool faces the same temptation: put the voice instructions where the writing happens. The blog skill says "write in a practitioner voice, evidence-based, no hedging." The email skill says "write professionally, direct, data-specific." The Slack skill says "keep it short, action-oriented."&lt;/p&gt;

&lt;p&gt;Three skills, three definitions of "professional." None of them wrong. All of them slightly different. The drift is imperceptible at first — a slightly different sentence rhythm here, a slightly different threshold for hedging there. After a few months of iterating each skill independently, the same person sounds like three different writers depending on which skill ran.&lt;/p&gt;

&lt;p&gt;This is not a voice problem. It is an architecture problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  Four Layers
&lt;/h2&gt;

&lt;p&gt;The fix is separating concerns that were bundled together:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Layer 1: MODE DETECTION
  What voice variant to use — casual, professional, leadership,
  field, publishing, or builder. Resolved from context before
  writing begins. Never manual.

Layer 2: VOICE + QUALITY
  The universal standards that apply regardless of mode.
  Cliché guard. Citation rules. Quality checklist. Anti-patterns.
  One definition. Called by everything.

Layer 3: FORMAT
  Structure, length, frontmatter, conventions.
  Blog format. Slack format. Email format. Strategy doc format.
  Each content type has its own format layer.
  Format knows nothing about voice.

Layer 4: PUBLISH
  Upload, verify, RAG optimization.
  Always a separate explicit step.
  Never bundled into format.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The calling skill provides the format layer. It calls the voice layer. The voice layer calls mode detection. The result: the voice is consistent across every content type because it lives in one place, not five.&lt;/p&gt;




&lt;h2&gt;
  
  
  Layer 1: Mode Detection
&lt;/h2&gt;

&lt;p&gt;Mode detection runs before a single word is written. A five-signal priority hierarchy resolves the correct voice variant from context:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Explicit override&lt;/strong&gt; — "keep it casual" or "exec tone" wins immediately&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recipient override&lt;/strong&gt; — per-person config for people who always get a specific mode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Role mapping&lt;/strong&gt; — looks up the recipient in a contacts registry, maps relationship (peer, manager, customer, close colleague) to mode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Channel detection&lt;/strong&gt; — Slack public channel → professional; email to external domain → field; blog post → publishing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intent keywords&lt;/strong&gt; — "ping him," "heads up" → casual; "endorsement request" → leadership; "write a post" → publishing&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Default: professional.&lt;/p&gt;

&lt;p&gt;The agent never asks which mode to use. The signal is already there — recipient, channel, intent. The hierarchy reads it.&lt;/p&gt;

&lt;p&gt;What makes this maintainable: the detection logic lives in a YAML config file, not code. Adding a new recipient override is a one-line edit. Adjusting a keyword mapping takes ten seconds. No code change needed when the context changes.&lt;/p&gt;
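&lt;p&gt;A minimal sketch of that hierarchy, assuming the YAML file has already been loaded into a plain dictionary. The key names, the contacts registry, the example addresses, and the specific role-to-mode and "exec tone" mappings are hypothetical; only the signal order, the example phrases, and the default come from the description above.&lt;/p&gt;

```python
# Hypothetical stand-in for the loaded YAML config; mapping values are illustrative.
CONFIG = {
    "explicit_overrides": {"keep it casual": "casual", "exec tone": "leadership"},
    "recipient_overrides": {"dana@example.com": "casual"},
    "role_to_mode": {
        "peer": "professional",
        "manager": "leadership",
        "customer": "field",
        "close colleague": "casual",
    },
    "channel_to_mode": {
        "slack_public": "professional",
        "email_external": "field",
        "blog": "publishing",
    },
    "intent_keywords": {
        "ping him": "casual",
        "heads up": "casual",
        "endorsement request": "leadership",
        "write a post": "publishing",
    },
    "default": "professional",
}

# Hypothetical contacts registry: recipient mapped to relationship.
CONTACTS = {"sam@example.com": "customer"}


def first_phrase_match(text, phrase_to_mode):
    """Return the mode of the first configured phrase found in the text, else None."""
    lowered = text.lower()
    for phrase, mode in phrase_to_mode.items():
        if phrase in lowered:
            return mode
    return None


def detect_mode(text, recipient=None, channel=None, config=CONFIG, contacts=CONTACTS):
    """Walk the five signals in priority order; the first hit wins."""
    # 1. Explicit override in the request text
    mode = first_phrase_match(text, config["explicit_overrides"])
    if mode:
        return mode
    # 2. Per-recipient override
    if recipient in config["recipient_overrides"]:
        return config["recipient_overrides"][recipient]
    # 3. Role mapping via the contacts registry
    role = contacts.get(recipient)
    if role in config["role_to_mode"]:
        return config["role_to_mode"][role]
    # 4. Channel detection
    if channel in config["channel_to_mode"]:
        return config["channel_to_mode"][channel]
    # 5. Intent keywords
    mode = first_phrase_match(text, config["intent_keywords"])
    if mode:
        return mode
    return config["default"]


print(detect_mode("write a post about voice drift", channel="slack_public"))
```

&lt;p&gt;The call above resolves to professional rather than publishing, because the channel signal outranks intent keywords. Changing that behavior means editing the config or the order of the checks, not any writing skill.&lt;/p&gt;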




&lt;h2&gt;
  
  
  Layer 2: Voice + Quality
&lt;/h2&gt;

&lt;p&gt;This is the layer most tools skip. Every writing skill embeds its own voice definition. The four-layer architecture pulls that definition out and centralizes it.&lt;/p&gt;

&lt;p&gt;The voice layer owns:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The cliché guard&lt;/strong&gt; — a universal banned-phrase list that runs on every piece of output regardless of mode or format. "Robust," "seamless," "comprehensive," "game-changing" — banned everywhere, always, because they are placeholders for the specific thing the writer actually means. The guard does not restrict expression. It forces specificity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The never_say lists&lt;/strong&gt; — mode-specific bans that load with the engram. Casual mode bans "I hope this note finds you well." Leadership mode bans "either way, no worries if not." Publishing mode bans credential framing. The bans are decisions, not style preferences — they encode what the writer has explicitly rejected in real output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The quality checklist&lt;/strong&gt; — conditions that must be met before output returns: opens with outcome not setup; every falsifiable claim has a source or "in my experience" label; no credential framing; has a "so what"; ends on action not opt-out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Citation rules&lt;/strong&gt; — inline links for every factual claim, "in my experience" for unlinkable observations. Not footnotes. Not optional.&lt;/p&gt;

&lt;p&gt;Because this layer is centralized, a correction made in one place propagates everywhere. When "robust" gets added to the cliché guard, it is banned in blog posts, Slack threads, emails, and strategy docs simultaneously. No hunting across five skills to update five separate voice definitions.&lt;/p&gt;
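&lt;p&gt;A rough sketch of how the cliché guard and the never_say lists could share one check. The phrase lists reuse the examples above; the function name &lt;code&gt;find_banned&lt;/code&gt; and the dictionary layout are hypothetical, and matching is plain case-insensitive whole-phrase search.&lt;/p&gt;

```python
import re

# Universal guard: applies in every mode and every format.
UNIVERSAL_BANNED = ["robust", "seamless", "comprehensive", "game-changing"]

# Mode-specific bans; in the post these load with the engram.
NEVER_SAY = {
    "casual": ["I hope this note finds you well"],
    "leadership": ["either way, no worries if not"],
}


def find_banned(text, mode):
    """Return every banned phrase found in the text for the given mode."""
    phrases = UNIVERSAL_BANNED + NEVER_SAY.get(mode, [])
    hits = []
    for phrase in phrases:
        # Word boundaries keep "robust" from matching inside "robustness".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(phrase)
    return hits


print(find_banned("A robust, seamless rollout.", "professional"))
```

&lt;p&gt;Because every format calls the same function, adding a phrase to &lt;code&gt;UNIVERSAL_BANNED&lt;/code&gt; changes the result for blog posts, Slack threads, emails, and strategy docs in one edit.&lt;/p&gt;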




&lt;h2&gt;
  
  
  Layer 3: Format
&lt;/h2&gt;

&lt;p&gt;Format is what changes by content type. A blog post needs frontmatter, a filename convention, a category, a length target. A Slack thread needs a hook, a body, a close. An email needs subject, greeting, body, action. A strategy doc needs thesis, evidence, what's missing, so what.&lt;/p&gt;

&lt;p&gt;Format skills are pluggable. Any format skill can call the voice layer. Blog format + publishing voice. Slack format + casual voice. Strategy doc format + leadership voice. The combination is arbitrary because the layers are independent.&lt;/p&gt;

&lt;p&gt;This is the same principle behind separation of concerns in software architecture. The format skill does not know about voice. The voice layer does not know about format. Both apply — simultaneously, independently.&lt;/p&gt;
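&lt;p&gt;One way to picture that independence in code. This is a sketch under stated assumptions: &lt;code&gt;voice_layer&lt;/code&gt; is a placeholder that only tags the mode, the two format functions and their boilerplate text are invented for illustration, and a real voice layer would rewrite or validate the draft rather than label it.&lt;/p&gt;

```python
def voice_layer(body, mode):
    """Placeholder for the single shared voice definition; it only records the mode."""
    return f"[voice:{mode}] {body}"


def blog_format(body):
    """Blog structure only: frontmatter plus body. Knows nothing about voice."""
    return f"---\ntitle: Untitled draft\n---\n\n{body}\n"


def slack_format(body):
    """Slack structure only: body plus a close. Knows nothing about voice."""
    return f"{body}\n\nNext step: reply in thread."


# Hypothetical registry of pluggable format skills.
FORMATS = {"blog": blog_format, "slack": slack_format}


def write(body, fmt, mode):
    """Apply the voice layer first, then hand the result to the chosen format."""
    return FORMATS[fmt](voice_layer(body, mode))


print(write("Voice drift is an architecture problem.", "slack", "casual"))
```

&lt;p&gt;Any format key can be paired with any mode, and adding a new format means adding one function to the registry without touching &lt;code&gt;voice_layer&lt;/code&gt;.&lt;/p&gt;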




&lt;h2&gt;
  
  
  Layer 4: Publish
&lt;/h2&gt;

&lt;p&gt;Publishing is always a separate explicit step. Never bundled into format.&lt;/p&gt;

&lt;p&gt;The format skill produces a draft. When the draft is ready, a publish step handles the mechanics: RAG optimization for AI-readable structure, filename validation, upload, verification. One publish skill works for any content to any destination — because publish is format-agnostic.&lt;/p&gt;

&lt;p&gt;Why separate? Because "format" and "ready to publish" are different states. A draft can be formatted correctly and still need review. Separating the layers makes that review natural — the format skill delivers a draft, the author reviews, the publish step runs when ready.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Market Offers
&lt;/h2&gt;

&lt;p&gt;Every major tool treats voice as a setting, not a layer:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;What's missing&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Custom GPT / Claude Styles&lt;/td&gt;
&lt;td&gt;Single voice profile from samples&lt;/td&gt;
&lt;td&gt;No mode switching. DM = exec email = blog post.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-skill voice encoding&lt;/td&gt;
&lt;td&gt;Voice defined inside each writing skill&lt;/td&gt;
&lt;td&gt;5 skills = 5 definitions = drift&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Engram builder (native)&lt;/td&gt;
&lt;td&gt;Extracts one profile from message corpus&lt;/td&gt;
&lt;td&gt;Single mode. No auto-detection. No never_say.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Brand voice guides&lt;/td&gt;
&lt;td&gt;Organizational standards&lt;/td&gt;
&lt;td&gt;Not machine-readable. Not enforced at write time.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Nobody has separated mode detection, voice quality, format, and publish into independent layers with clean interfaces between them. The closest analog is what Google did for visual identity with DESIGN.md — a single machine-readable source of truth for brand standards, called by any agent building UI. The writing equivalent is a centralized voice layer, called by any skill producing written output.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Consistency Principle in Practice
&lt;/h2&gt;

&lt;p&gt;What changes by mode: casual is shorter and warmer. Professional is strategic and data-specific. Leadership is personal and confident. Field is customer-obsessed. Publishing is universal and evidence-based. Builder is precise and structured.&lt;/p&gt;

&lt;p&gt;What never changes: evidence-backed claims. No clichés. No hedging. Specific over vague. Peer voice, not trainer voice.&lt;/p&gt;

&lt;p&gt;A blog post and a Slack thread about the same topic should feel like the same person wrote them. One is longer and more structured. The other is shorter and more direct. But the thinking, the specificity, the conviction, and the anti-patterns are identical — because those properties live in the voice layer, not in the blog skill or the Slack skill.&lt;/p&gt;

&lt;p&gt;The person is the constant. The mode is the variable. The medium is irrelevant.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is the final post in a four-part series on building mode-specific voice profiles for AI agents. The series starts &lt;a href="https://artificialcuriositylabs.ai/posts/your-ai-agent-needs-communication-modes/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>voice</category>
      <category>patterns</category>
    </item>
    <item>
      <title>Two AI interfaces. Same desktop. Completely different jobs.</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:17:10 +0000</pubDate>
      <link>https://dev.to/amitrix/two-ai-interfaces-same-desktop-completely-different-jobs-42i7</link>
      <guid>https://dev.to/amitrix/two-ai-interfaces-same-desktop-completely-different-jobs-42i7</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Two interfaces, same model, same MCP servers, same tools. The only difference is who drives: interactive co-pilot vs. autonomous delegate.&lt;/li&gt;
&lt;li&gt;Precision work (strategy docs, KB edits, blog drafts) belongs in co-pilot mode; you need judgment at every step. Routine work (Slack triage, insight summaries, handoff drafts) belongs in delegate mode; approving every step kills the time savings.&lt;/li&gt;
&lt;li&gt;The dividing line: if you'd feel comfortable not watching it work, delegate. If you'd feel nervous, co-pilot.&lt;/li&gt;
&lt;li&gt;Using one mode for everything produces mediocre results from both tools; the friction is a mismatch signal, not a tool failure.&lt;/li&gt;
&lt;li&gt;Ask "what role do I want to play in this task?" before opening any AI interface.&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;I've been running Claude Code and CoWork side by side for weeks. They use the same model. The same MCP servers. The same tools. So why do they feel like completely different things?&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I'm not a developer. My work is strategy and knowledge work — positioning, research, content, handoffs. I don't write code for a living. But I spend a lot of time in AI interfaces, and for the past several weeks I've had two of them open simultaneously: Claude Code in the left panel, CoWork on the right.&lt;/p&gt;

&lt;p&gt;At first I kept asking myself: why do I need both? They run on the same model. They connect to the same tools — my email, calendar, Slack, Salesforce. They can both draft a document, search my files, or pull insights from a Slack channel. They're the same thing.&lt;/p&gt;

&lt;p&gt;Except they're not. And once I understood why, how I work with AI changed completely.&lt;/p&gt;




&lt;h2&gt;
  
  
  The thing nobody says clearly
&lt;/h2&gt;

&lt;p&gt;Most comparisons of AI tools focus on the model, the features, the connectors, the pricing. That's the wrong frame.&lt;/p&gt;

&lt;p&gt;The actual difference between Claude Code and CoWork isn't the model. It isn't the surface — I use Claude Code in a desktop interface, not a terminal. It isn't even the capabilities.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The difference is who drives.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Claude Code is an interactive tool. Every step gets surfaced. I approve tool calls. I redirect when something's off. The AI proposes; I dispose. This makes it slow and deliberate by design.&lt;/p&gt;

&lt;p&gt;CoWork is an autonomous agent. I hand it a task and walk away. It executes end-to-end, using connectors and skills to complete multi-step workflows without me narrating every move.&lt;/p&gt;

&lt;p&gt;Same intelligence. Different workflow model. That's the whole thing.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this actually means for a knowledge worker
&lt;/h2&gt;

&lt;p&gt;If you're a developer, the framing makes obvious sense: Claude Code is for precision work where you want to review each diff. Fine.&lt;/p&gt;

&lt;p&gt;But I'm not a developer. So why do I need both?&lt;/p&gt;

&lt;p&gt;It took me a while to see it, but the answer is exactly the same — just applied to different kinds of work.&lt;/p&gt;

&lt;p&gt;Some of my work requires precision and judgment at each step. When I'm editing KB content, writing positioning docs, or modifying a repo structure, I want to see each move. An agent that acts autonomously in my codebase without my oversight is a liability. Claude Code wins here: interactive, deliberate, controlled.&lt;/p&gt;

&lt;p&gt;Some of my work is routine and high-frequency. Triaging Slack channels, pulling last week's customer insights, drafting a handoff doc, summarizing email threads. I don't need to narrate these. I don't want to narrate these. If I have to approve every step of "find the key themes from a customer insights channel this week," I've already lost the time savings. CoWork wins here: autonomous, fast, hands-off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The question isn't which interface is better. The question is whether you want to co-pilot or delegate.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The prior art gap
&lt;/h2&gt;

&lt;p&gt;I went looking for this framing written down somewhere. The closest I found is Ethan Mollick at One Useful Thing, who writes thoughtfully about AI as a collaborative tool for knowledge workers. The interactive/autonomous distinction shows up in agent architecture discussions. Microsoft Copilot and Google Workspace AI also orbit this space.&lt;/p&gt;

&lt;p&gt;But the specific insight — &lt;strong&gt;two AI interfaces running on the same desktop, same model, same tools, distinguished only by workflow mode, for a non-developer knowledge worker&lt;/strong&gt; — I couldn't find it named.&lt;/p&gt;

&lt;p&gt;People write about AI assistants versus AI agents as if they're different products. In my experience, they're different modes of the same product. The interface that makes sense for a given task depends entirely on whether you want to stay in the loop or get out of it.&lt;/p&gt;

&lt;p&gt;I'd call this the &lt;strong&gt;co-pilot/delegate distinction&lt;/strong&gt;: the fundamental question you should ask before opening an AI interface isn't "what can this tool do?" It's "do I want to co-pilot this task or delegate it?"&lt;/p&gt;




&lt;h2&gt;
  
  
  The practical split
&lt;/h2&gt;

&lt;p&gt;Here's how it actually shakes out for me:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Editing KB content, repos&lt;/td&gt;
&lt;td&gt;Claude Code (interactive)&lt;/td&gt;
&lt;td&gt;Each change matters — I want to review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Positioning docs, strategy&lt;/td&gt;
&lt;td&gt;Claude Code (interactive)&lt;/td&gt;
&lt;td&gt;High judgment required throughout&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Weekly Slack triage&lt;/td&gt;
&lt;td&gt;CoWork (autonomous)&lt;/td&gt;
&lt;td&gt;Routine, high-frequency, well-defined&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Customer insight summaries&lt;/td&gt;
&lt;td&gt;CoWork (autonomous)&lt;/td&gt;
&lt;td&gt;Pattern work, same structure every time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Drafting handoffs from notes&lt;/td&gt;
&lt;td&gt;CoWork (autonomous)&lt;/td&gt;
&lt;td&gt;I define the output; it executes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Blog drafts&lt;/td&gt;
&lt;td&gt;Claude Code (interactive)&lt;/td&gt;
&lt;td&gt;Voice and tone need my judgment at every step&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The dividing line isn't hard to find: &lt;strong&gt;if you'd feel comfortable not watching it work, delegate. If you'd feel nervous not watching it work, co-pilot.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What changes when you see it this way
&lt;/h2&gt;

&lt;p&gt;Before I had this frame, I was treating the two tools as interchangeable and getting mediocre results from both. I'd use Claude Code for routine tasks and resent that I had to approve every step. I'd use CoWork for precision work and get anxious that I wasn't reviewing the output.&lt;/p&gt;

&lt;p&gt;The tools weren't failing me. I was using them in the wrong mode.&lt;/p&gt;

&lt;p&gt;Once I separated the work by mode — interactive vs autonomous — the friction dropped. CoWork handles the steady-state workflow. Claude Code handles the work that needs my judgment. Between the two, almost nothing falls through.&lt;/p&gt;

&lt;p&gt;That's not a productivity hack. It's a different mental model for how to work with AI: not "what tool should I use?" but "what role do I want to play in this task?"&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're a non-developer figuring out how to structure AI-assisted knowledge work, I'd be interested to hear how you're thinking about it.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>patterns</category>
      <category>claudecode</category>
    </item>
    <item>
      <title>The Website Is Not the Product</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:16:34 +0000</pubDate>
      <link>https://dev.to/amitrix/the-website-is-not-the-product-2pdf</link>
      <guid>https://dev.to/amitrix/the-website-is-not-the-product-2pdf</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Anthropic's crawler now retrieves 60,000 pages per visitor sent back — up from 6,000:1 six months ago. The website as distribution channel is being extracted, not visited.&lt;/li&gt;
&lt;li&gt;Three consumer types have incompatible requirements: humans want narrative, LLMs want dense signal, agents want machine-readable endpoints. Almost nobody is building for the third.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;llms.txt&lt;/code&gt; (1,000+ sites) is the right model but static — it describes content; it doesn't serve it.&lt;/li&gt;
&lt;li&gt;The emerging layer is MCP endpoints: dynamic, agent-queryable knowledge interfaces where monetization is context access, not page views.&lt;/li&gt;
&lt;li&gt;If you publish knowledge, the question is whether you're building the layer underneath the website — because that's where the next distribution advantage is.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Anthropic's crawler retrieves 60,000 pages for every one visitor it sends back to a publisher. Six months ago that ratio was 6,000:1. OpenAI's crawler is at 1,500:1. Google, which used to crawl 2 pages per visitor sent, is now at 18:1 and climbing.&lt;/p&gt;

&lt;p&gt;The website as distribution channel is not declining. It is being extracted.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Actually Happening
&lt;/h2&gt;

&lt;p&gt;AI systems are consuming the web at scale without returning the traffic that made web publishing economically viable. Cloudflare's CEO confirmed this publicly in January: search referrals have "plummeted" because users trust AI summaries and don't follow the footnotes.&lt;/p&gt;

&lt;p&gt;This is not a content quality problem. It's a distribution layer problem.&lt;/p&gt;

&lt;p&gt;The web was built for a world where humans navigated to URLs. That world is not coming back. What's replacing it is a world where AI agents retrieve context on behalf of humans — and the agents don't care about your navigation, your layout, your calls to action, or your SEO metadata. They care about whether your content is parseable, structured, and context-dense.&lt;/p&gt;

&lt;p&gt;The website is becoming the fallback UI. The primary interface is everything else.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Wrong Lesson
&lt;/h2&gt;

&lt;p&gt;The wrong lesson from this is "optimize for AI crawlers." That's SEO-era thinking applied to a post-SEO world.&lt;/p&gt;

&lt;p&gt;The right lesson is: distribution is now a systems design problem, not a marketing problem.&lt;/p&gt;

&lt;p&gt;If you publish knowledge, you now have three consumer types with incompatible requirements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Humans&lt;/strong&gt; — want narrative, context, explanation, a reason to keep reading&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLMs used as tools&lt;/strong&gt; — want structured summaries, dense signal, minimal noise&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI agents acting autonomously&lt;/strong&gt; — want machine-readable endpoints, not pages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most content strategies optimize for the first. Some are starting to acknowledge the second. Almost nobody is building for the third.&lt;/p&gt;

&lt;h2&gt;
  
  
  Context Engineering Is the New Distribution
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;llms.txt&lt;/code&gt; is the first concrete mechanism with traction. Over 1,000 sites now publish one — a curated, structured entry point that tells AI systems what the site contains and where to find it. The model is correct: LLM context windows can't process full websites, so a structured index is practical infrastructure.&lt;/p&gt;
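&lt;p&gt;To make the idea concrete, here is a minimal sketch of generating such an index. It assumes the commonly described layout of an H1 site name followed by H2 sections of annotated links, and it skips the optional summary block. The &lt;code&gt;Entry&lt;/code&gt; type, the &lt;code&gt;render_llms_txt&lt;/code&gt; function, the site name, and the URL are all hypothetical names chosen for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    url: str
    note: str


def render_llms_txt(site_name, sections):
    """Render a minimal llms.txt-style index: an H1 title, then H2 sections of links."""
    lines = [f"# {site_name}", ""]
    for heading, entries in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for e in entries:
            # One annotated link per line: "- [title](url): note"
            lines.append(f"- [{e.title}]({e.url}): {e.note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site and URL, for illustration only.
index = render_llms_txt(
    "Example Notes",
    {
        "Essays": [
            Entry(
                "Context is the product",
                "https://example.com/context.md",
                "why agents need endpoints",
            ),
        ],
    },
)
print(index)
```

&lt;p&gt;The output is a plain text file written once at publish time, which is exactly the property the next paragraph pushes on.&lt;/p&gt;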

&lt;p&gt;But &lt;code&gt;llms.txt&lt;/code&gt; is still a static file. It describes content. It doesn't serve it.&lt;/p&gt;

&lt;p&gt;The next layer is dynamic — an MCP endpoint that serves knowledge on demand, at the right granularity, with the right context pre-loaded. Not a website. Not a search index. A knowledge interface built for agents.&lt;/p&gt;

&lt;p&gt;The thesis: structured, machine-readable content delivered through MCP endpoints is the emerging distribution layer for knowledge. Humans subscribe through whatever surface they prefer. AI agents retrieve through the endpoint. The monetization layer is access to context — not page views.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Missing
&lt;/h2&gt;

&lt;p&gt;Almost nobody is building this. Substack has no &lt;code&gt;llms.txt&lt;/code&gt;. No major publication has an MCP endpoint. The entire publishing industry is optimizing for a consumption model that is already being displaced.&lt;/p&gt;

&lt;p&gt;The gap is not technical — the tools exist. &lt;code&gt;llms.txt&lt;/code&gt; is simple to implement. MCP servers take days to build. The gap is conceptual: most publishers still think the website is the product.&lt;/p&gt;

&lt;p&gt;The website is the UI. Context is the product.&lt;/p&gt;

&lt;p&gt;The takeaway: if you publish knowledge, the question is not whether to maintain a website. It's whether you're building the layer underneath it, the one that serves agents, not browsers. That's where the next distribution advantage is. And right now, almost nobody is there.&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>contextengineering</category>
      <category>distribution</category>
      <category>mcp</category>
    </item>
    <item>
      <title>The Skill Is the Teacher</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:15:59 +0000</pubDate>
      <link>https://dev.to/amitrix/the-skill-is-the-teacher-3pjj</link>
      <guid>https://dev.to/amitrix/the-skill-is-the-teacher-3pjj</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Practitioners who run a well-designed AI skill repeatedly internalize its logic and start applying it unprompted — the skill teaches the workflow, not just automates it.&lt;/li&gt;
&lt;li&gt;Five naming frames exist: Tool, Behavior, Practice, Ritual, Role — most libraries are full of Tools, which produces users, not practitioners.&lt;/li&gt;
&lt;li&gt;Gerunds (&lt;code&gt;thesis-thinking&lt;/code&gt;) signal ongoing practice; nouns (&lt;code&gt;thesis-generator&lt;/code&gt;) signal a button to push. The form changes what users expect to compound.&lt;/li&gt;
&lt;li&gt;The design-system three-tier model (primitive → semantic → component) applies directly: primitive and semantic tiers are where behavior change lives.&lt;/li&gt;
&lt;li&gt;Audit your skills library with one question: does this name build a practitioner or a dependency?&lt;/li&gt;
&lt;/ul&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;A user extracts value once and moves on. A practitioner builds capability over time.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We've been naming AI skills like tools. That's the wrong frame — and it's quietly limiting how much value teams get from them.&lt;/p&gt;

&lt;p&gt;When someone names a skill &lt;code&gt;thesis-generator&lt;/code&gt; or &lt;code&gt;review-helper&lt;/code&gt;, they're making a framing decision that affects every person who ever uses it. They're saying: this is a vending machine. Insert task, receive output, move on. Nobody expects to grow from using a vending machine.&lt;/p&gt;

&lt;p&gt;But watch what actually happens when a skilled practitioner runs a well-designed AI workflow repeatedly. They watch the agent work through a customer thesis — asking about the customer's business model first, then the technical constraint, then the competitive risk, then the opportunity window. They see the logic. They internalize the sequence. Over weeks, they start asking those questions themselves in meetings, before they even open the tool. The skill didn't just automate a workflow. It taught them the workflow.&lt;/p&gt;

&lt;p&gt;The skill is the teacher. And if that's true, the name is the first lesson.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Learning Science Already Knows
&lt;/h2&gt;

&lt;p&gt;There's a concept in learning theory called &lt;a href="https://www.activityanalysis.net/the-concept-of-mediation/" rel="noopener noreferrer"&gt;artifact-mediated learning&lt;/a&gt;: humans acquire mental models by interacting with artifacts, not by reading instructions. The artifact — its design, its behavior, its name — teaches the mental model implicitly, through use.&lt;/p&gt;

&lt;p&gt;This has a direct implication for AI skills. The skill's name and description are often the only "interface" a user ever reads before running it. The name shapes their expectation. It tells them what to bring to the interaction and what to take away. Name it &lt;code&gt;thesis-generator&lt;/code&gt; and they bring a task and take away a document. Name it &lt;code&gt;thesis-thinking&lt;/code&gt; and they bring a question and take away a way of thinking.&lt;/p&gt;

&lt;p&gt;Same tool. Different frame. Different trajectory of internalization.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Five Naming Frames
&lt;/h2&gt;

&lt;p&gt;Not all names are created equal. There are five distinct ways to name a skill, and each one produces a different user behavior.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tool&lt;/strong&gt; (&lt;code&gt;thesis-generator&lt;/code&gt;) — "This is something I use." The framing is transactional. The user runs it when they need an output, gets what they came for, and moves on. Nothing accumulates over time. The skill remains external.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Behavior&lt;/strong&gt; (&lt;code&gt;think-in-theses&lt;/code&gt;) — "This is something I do." The framing is active. The verb signals a repeatable action, not a one-time task. Users begin to associate the skill with a way of working rather than a particular output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practice&lt;/strong&gt; (&lt;code&gt;thesis-thinking&lt;/code&gt;) — "This is something I'm learning." The gerund form is doing real work here. It signals ongoing development, not a completed action. This is the most underused frame in skills design, and arguably the most powerful for team knowledge transfer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ritual&lt;/strong&gt; (&lt;code&gt;thesis-before-calls&lt;/code&gt;) — "This is something we do, together, at a specific moment." Ritual framing attaches the skill to a trigger — a meeting, a proposal, a weekly review. It creates shared cadence. Teams that adopt a ritual aren't using a tool; they're building a practice into their operating rhythm.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Role&lt;/strong&gt; (&lt;code&gt;thesis-thinker&lt;/code&gt;) — "This is who I am." Role framing is the most ambitious. It embeds the skill into professional identity. When someone thinks of themselves as a person who thinks in theses, they apply the frame even in conversations where no AI is present.&lt;/p&gt;

&lt;p&gt;Most skills libraries today are full of tools. The opportunity is to build practices, rituals, and eventually roles.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Gerunds Work
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.behaviormodel.org" rel="noopener noreferrer"&gt;BJ Fogg's research on habit formation&lt;/a&gt; suggests that behaviors are more likely to become automatic when they're paired with consistent triggers and treated as ongoing practices rather than discrete tasks. The linguistic marker for an ongoing practice in English is the gerund, the &lt;code&gt;-ing&lt;/code&gt; form.&lt;/p&gt;

&lt;p&gt;A name like &lt;code&gt;thesis-thinking&lt;/code&gt; feels like something you do repeatedly and get better at. A name like &lt;code&gt;thesis-generator&lt;/code&gt; feels like a button you push.&lt;/p&gt;

&lt;p&gt;This isn't a minor stylistic choice. It changes what users expect to get from repeated use. A tool delivers a consistent output. A practice delivers compound returns — you get better, you internalize more, the cognitive scaffold gradually becomes part of how you think.&lt;/p&gt;

&lt;p&gt;When naming skills that you want teams to internalize, default to gerunds: &lt;code&gt;thinking&lt;/code&gt;, &lt;code&gt;reviewing&lt;/code&gt;, &lt;code&gt;analyzing&lt;/code&gt;, &lt;code&gt;reasoning&lt;/code&gt;. The form signals the intent.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Design System Analogy
&lt;/h2&gt;

&lt;p&gt;The best design systems use a three-tier naming hierarchy worth borrowing for skills.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Primitive tier&lt;/strong&gt; — the foundational concept, named for what it &lt;em&gt;is&lt;/em&gt;: &lt;code&gt;thesis-thinking&lt;/code&gt;, &lt;code&gt;working-backwards&lt;/code&gt;, &lt;code&gt;commitment-verification&lt;/code&gt;. These are the core mental models. The name alone signals the practice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semantic tier&lt;/strong&gt; — the concept applied in context: &lt;code&gt;thesis-before-calls&lt;/code&gt;, &lt;code&gt;rigor-in-proposals&lt;/code&gt;, &lt;code&gt;architecture-in-security-reviews&lt;/code&gt;. These add a trigger or domain to the core practice, making the skill habit-ready by attaching it to an existing workflow moment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Component tier&lt;/strong&gt; — the concept applied to a specific artifact: &lt;code&gt;review-proposal-with-rigor&lt;/code&gt;, &lt;code&gt;audit-page-for-freshness&lt;/code&gt;. These are the most task-specific. They belong in a skills library but shouldn't crowd out the practices above them.&lt;/p&gt;

&lt;p&gt;Most teams build only component-tier skills. They're useful, but they don't scale into culture. The primitive and semantic tiers are where behavior change lives.&lt;/p&gt;




&lt;h2&gt;
  
  
  Working Backwards as the Gold Standard
&lt;/h2&gt;

&lt;p&gt;Amazon's Working Backwards process is the best example of a practice-named mental model that actually changed organizational behavior at scale. The name isn't &lt;code&gt;customer-requirements-generator&lt;/code&gt; or &lt;code&gt;prfaq-template&lt;/code&gt;. It's &lt;code&gt;Working Backwards&lt;/code&gt; — a description of the cognitive operation, not the output.&lt;/p&gt;

&lt;p&gt;Because the name describes the thinking pattern (start from the customer outcome, work backward to the solution), it can be applied anywhere. You can work backwards from a product decision, a hiring decision, a pricing decision. The practice generalizes because the name names the practice, not the artifact.&lt;/p&gt;

&lt;p&gt;This is exactly the model for a mature skills library. Skills named after mental models and practices generalize. Skills named after outputs don't.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Homebrew Tap Analogy
&lt;/h2&gt;

&lt;p&gt;A skills library is a tap — a curated collection of installable practices. The Homebrew tap model is instructive here not architecturally but culturally: taps are collections of maintained, versioned, trustworthy tools. When you install from a tap, you're implicitly trusting the maintainer's curation decisions.&lt;/p&gt;

&lt;p&gt;The difference between a skills tap and a prompt dump is naming and intent. A prompt dump is a vending machine: functional, transactional, unmemorable. A skills tap is a practice library: curated, versioned, designed to transfer knowledge over time.&lt;/p&gt;

&lt;p&gt;The install-and-forget pattern happens when skills are named as tools. The return-and-internalize pattern happens when skills are named as practices.&lt;/p&gt;




&lt;h2&gt;
  
  
  An Audit You Can Do Today
&lt;/h2&gt;

&lt;p&gt;If you have a shared skills or prompt library — for Claude Code, Cursor, Kiro, Copilot, or any other AI tool — take five minutes and audit the names.&lt;/p&gt;

&lt;p&gt;For each skill, ask one question: does this name make someone a practitioner, or a user?&lt;/p&gt;

&lt;p&gt;A user extracts value once and moves on. A practitioner builds capability over time, internalizes the mental model, and eventually applies the frame even without the tool.&lt;/p&gt;
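&lt;p&gt;A first pass of that audit can be roughed out in a few lines. The sketch below is only a heuristic under stated assumptions: the suffix list, the short list of non-gerund words, and the function name &lt;code&gt;classify_name&lt;/code&gt; are hypothetical, and anything it cannot place is flagged for a human to review rather than guessed at.&lt;/p&gt;

```python
# Hypothetical heuristics: final words that suggest an output-named tool.
TOOL_SUFFIXES = ("generator", "helper", "template", "maker")

# Common words that end in "ing" without being gerunds (illustrative, not exhaustive).
NON_GERUNDS = {"string", "thing", "ring", "king"}


def classify_name(name):
    """Return 'tool', 'practice', or 'review' for a hyphenated skill name."""
    parts = name.lower().split("-")
    if parts[-1] in TOOL_SUFFIXES:
        return "tool"
    if any(p.endswith("ing") and p not in NON_GERUNDS for p in parts):
        return "practice"
    # Rituals, roles, and component-tier names need human judgment.
    return "review"


# Skill names taken from the article.
for skill in ("thesis-generator", "review-helper", "thesis-thinking", "thesis-before-calls"):
    print(skill, classify_name(skill))
```

&lt;p&gt;The script only sorts names into piles. Deciding whether a flagged name should become a practice, a ritual, or a role is still the judgment call the audit is meant to prompt.&lt;/p&gt;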

&lt;p&gt;Your skills library is either building practitioners or building dependency. The name is where that choice gets made.&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>skills</category>
      <category>patterns</category>
    </item>
    <item>
      <title>The Person Is the Constant</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:15:23 +0000</pubDate>
      <link>https://dev.to/amitrix/the-person-is-the-constant-5ceb</link>
      <guid>https://dev.to/amitrix/the-person-is-the-constant-5ceb</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Voice personalization tools extract one profile from your message history — but that history contains at least six distinct registers, so the average is wrong for every context.&lt;/li&gt;
&lt;li&gt;The person is the constant; the mode is the variable. Conflating them produces output that sounds like you and fails every time.&lt;/li&gt;
&lt;li&gt;Universal identity layer: quality standards, conviction, cliché guard. Mode layer: register, sentence length, what you'd never say in this context specifically.&lt;/li&gt;
&lt;li&gt;The right architecture resolves mode automatically from context — recipient, channel, intent — with no manual selection.&lt;/li&gt;
&lt;li&gt;Build one voice layer and six mode profiles. Stop averaging.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;A blog post and a Slack DM to a close colleague should feel like the same person wrote them.&lt;/p&gt;

&lt;p&gt;One is 900 words with evidence and structure. The other is two sentences. But the thinking, the conviction, the things you'd never say — those should be identical. Because the person didn't change. The context did.&lt;/p&gt;

&lt;p&gt;This is where most voice personalization tools get it wrong. They extract a single profile from your message history and call it "your voice." But your message history contains at least six distinct registers: DMs to people you've worked with for years, emails to executives, external partner updates, published writing, technical handoffs, and Slack threads in public channels. Averaging them produces output that's recognizably shaped like you — and consistently wrong for the context.&lt;/p&gt;

&lt;p&gt;The person is the constant. The mode is the variable. The medium is irrelevant.&lt;/p&gt;

&lt;p&gt;What belongs in a single universal layer: the quality standards, the cliché guard, the conviction, the things you'd never write regardless of who's reading. These don't change by context. They're who you are.&lt;/p&gt;

&lt;p&gt;What belongs in mode-specific profiles: register, sentence length, opening patterns, vocabulary choices, what you'd never say in &lt;em&gt;this&lt;/em&gt; context specifically. A "no worries if not" close is wrong in a leadership email but fine in a casual DM. A credential opener is wrong everywhere, but for different reasons in each mode.&lt;/p&gt;

&lt;p&gt;The architecture that works keeps the person constant and makes the mode a variable the infrastructure resolves automatically from context — recipient, channel, intent. No manual selection.&lt;/p&gt;
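&lt;p&gt;One way to picture that resolution step is an ordered list of first-match rules over recipient, channel, and intent. The sketch below assumes the six mode names used elsewhere in this series (casual, professional, leadership, field, publishing, builder). The &lt;code&gt;Context&lt;/code&gt; fields, the example values, and the rules themselves are hypothetical, and a real system would likely learn or configure them per person.&lt;/p&gt;

```python
from dataclasses import dataclass

# Mode names come from the series; every rule below is a hypothetical example.
MODES = ("casual", "professional", "leadership", "field", "publishing", "builder")


@dataclass(frozen=True)
class Context:
    recipient: str  # e.g. "close-colleague", "executive", "partner", "public"
    channel: str    # e.g. "dm", "email", "blog", "pull-request"
    intent: str     # e.g. "update", "decision", "publish", "handoff"


def resolve_mode(ctx):
    """Pick a mode from context using ordered, first-match rules (illustrative only)."""
    if ctx.channel == "blog" or ctx.intent == "publish":
        return "publishing"
    if ctx.channel == "pull-request" or ctx.intent == "handoff":
        return "builder"
    if ctx.recipient == "executive":
        return "leadership"
    if ctx.recipient == "partner":
        return "field"
    if ctx.recipient == "close-colleague" and ctx.channel == "dm":
        return "casual"
    return "professional"  # default when nothing more specific matches


print(resolve_mode(Context("close-colleague", "dm", "update")))
print(resolve_mode(Context("executive", "email", "decision")))
```

&lt;p&gt;The universal layer would then be applied on top of whichever mode comes back, so the quality standards and cliché guard stay fixed while only the register changes.&lt;/p&gt;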

&lt;p&gt;One voice. Six modes. The person is always the same.&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>voice</category>
      <category>patterns</category>
    </item>
    <item>
      <title>The Name Is the First Lesson</title>
      <dc:creator>Amit</dc:creator>
      <pubDate>Sat, 06 Jun 2026 07:14:48 +0000</pubDate>
      <link>https://dev.to/amitrix/the-name-is-the-first-lesson-4h31</link>
      <guid>https://dev.to/amitrix/the-name-is-the-first-lesson-4h31</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The name is the entire interface before a user runs a skill; it determines whether they arrive as a consumer expecting an output or a learner developing a capability.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;thesis-generator&lt;/code&gt; produces a document and resets; &lt;code&gt;thesis-thinking&lt;/code&gt; signals ongoing development — users who run it repeatedly start doing the sequence themselves before opening the tool.&lt;/li&gt;
&lt;li&gt;Skills named after mental models generalize; skills named after outputs don't.&lt;/li&gt;
&lt;li&gt;Most skill libraries stop at the component tier (task-specific, artifact-bound); behavior change lives at the primitive and semantic tiers.&lt;/li&gt;
&lt;li&gt;Audit your library: for each skill, does the name make someone a practitioner or a user? That choice is made at naming time, not at run time.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Before a user runs a skill, they read its name. That's the entire interface. One word, maybe two, to shape what they bring to the interaction and what they expect to take away.&lt;/p&gt;

&lt;p&gt;Name a skill &lt;code&gt;thesis-generator&lt;/code&gt; and they bring a task. They leave with a document. Nothing accumulates between uses. The skill stays external — a vending machine they return to when they need the output again.&lt;/p&gt;

&lt;p&gt;Name it &lt;code&gt;thesis-thinking&lt;/code&gt; and something different happens. The gerund signals ongoing development. They bring a question. They watch how the agent works through it — what it asks first, what it checks, how it builds the argument. Over weeks, they start doing that sequence themselves before they open the tool. The skill taught them the workflow.&lt;/p&gt;

&lt;p&gt;The name is the first lesson. It decides whether the user is a consumer or a learner before they ever see the instructions.&lt;/p&gt;

&lt;p&gt;Design systems offer a useful model here. They name things in three tiers, and the same tiers map onto skills: primitive (&lt;code&gt;thesis-thinking&lt;/code&gt;), semantic (&lt;code&gt;thesis-before-calls&lt;/code&gt;), and component (&lt;code&gt;review-proposal-with-rigor&lt;/code&gt;). Most skills libraries stop at the component tier: task-specific, artifact-bound, useful but not memorable. The primitive and semantic tiers are where behavior change lives. Skills named after mental models generalize. Skills named after outputs don't.&lt;/p&gt;

&lt;p&gt;One audit you can run today: for each skill in your library, ask whether the name makes someone a practitioner or a user. The practitioner builds capability over time and eventually applies the frame without the tool. The user returns only when they need the output again.&lt;/p&gt;

&lt;p&gt;Your skills library is building one or the other. The name is where that choice gets made.&lt;/p&gt;

</description>
      <category>ainative</category>
      <category>agents</category>
      <category>skills</category>
      <category>patterns</category>
    </item>
  </channel>
</rss>
