<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rishabh Poddar</title>
    <description>The latest articles on DEV Community by Rishabh Poddar (@rish_poddar).</description>
    <link>https://dev.to/rish_poddar</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3272541%2F9a1ec10d-eee8-4926-a586-132f2ac7fd81.png</url>
      <title>DEV Community: Rishabh Poddar</title>
      <link>https://dev.to/rish_poddar</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rish_poddar"/>
    <language>en</language>
    <item>
      <title>Human-in-the-Loop AI Agents: Approvals, Permissions, and Audit Trails</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Fri, 26 Jun 2026 05:02:16 +0000</pubDate>
      <link>https://dev.to/rish_poddar/human-in-the-loop-ai-agents-approvals-permissions-and-audit-trails-i0f</link>
      <guid>https://dev.to/rish_poddar/human-in-the-loop-ai-agents-approvals-permissions-and-audit-trails-i0f</guid>
      <description>&lt;p&gt;Human-in-the-loop AI is a practical operating model for production systems. In this model, the AI prepares work or suggests actions, while a person checks the important steps before anything risky happens. That review is what turns AI from a confident assistant into a system you can trust in production.&lt;/p&gt;

&lt;p&gt;That matters more as agents become more capable. A chatbot that answers a question is one thing. An agent that can send messages, touch files, change records, or trigger workflows is something else entirely. Once the system can act, three questions matter: who approved it, what could it do, and what happened after it ran? Answering these questions requires three distinct controls: approvals, permissions, and audit trails.&lt;/p&gt;

&lt;h2&gt;
  
  
  What human-in-the-loop AI actually means
&lt;/h2&gt;

&lt;p&gt;Human-in-the-loop AI means the model does not get the final say on its own when the action matters. It can draft, rank, recommend, or prepare an action, but a person still reviews the result before execution.&lt;/p&gt;

&lt;p&gt;In practice, that could look like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;an agent drafts a customer reply, then a human approves it before it is sent&lt;/li&gt;
&lt;li&gt;an IT agent prepares a config change, then waits for sign-off before applying it&lt;/li&gt;
&lt;li&gt;a finance workflow gathers the data, then a reviewer confirms the payment or transfer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead of making every action manual, the goal is to keep judgment where it belongs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why approvals matter
&lt;/h2&gt;

&lt;p&gt;Approvals are the obvious part of the system, but they are also the part teams get wrong first.&lt;/p&gt;

&lt;p&gt;Without a real approval step, agentic workflows drift into “do it now, explain later.” That is fine for low-risk drafts. It is a bad idea for anything that touches customers, credentials, production systems, or money.&lt;/p&gt;

&lt;p&gt;Approvals create a pause at the moment when the system is about to cross from intent into execution. That pause does a few useful things at once:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it keeps a human accountable for the decision&lt;/li&gt;
&lt;li&gt;it reduces the chance of silent mistakes&lt;/li&gt;
&lt;li&gt;it gives the reviewer one clear place to intervene&lt;/li&gt;
&lt;li&gt;it makes the workflow easier to explain to security, legal, and operations teams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A good approval prompt should be specific. A vague “approve this?” is not enough. The reviewer should see what the agent wants to do, why it wants to do it, and what the impact will be if it goes wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why permissions matter even more
&lt;/h2&gt;

&lt;p&gt;Approvals without permissions are only half a control system.&lt;/p&gt;

&lt;p&gt;An agent still needs to know what it is allowed to touch before it gets to the approval step. If everything is broadly available, then the approval process becomes a thin layer on top of an overpowered system.&lt;/p&gt;

&lt;p&gt;Good permissions keep the agent small by default. A research agent should not have the same reach as an ops agent. A workflow that drafts a message should not be able to delete records. A tool that reads data should not automatically inherit write access.&lt;/p&gt;

&lt;p&gt;That is the same direction we discussed in &lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt; and &lt;a href="https://dev.to/blog/ai-agent-governance-identity-security-budget-line"&gt;AI Agent Governance: Why Identity Security Is the New Budget Line&lt;/a&gt;. Once agents become real actors in your stack, identity and access stop being background details.&lt;/p&gt;

&lt;p&gt;The simplest rule is still the best one: give the agent only the access it needs for the job it is actually doing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What an audit trail should record
&lt;/h2&gt;

&lt;p&gt;While approvals pause actions and permissions set boundaries, the audit trail provides the permanent record. You need to be able to answer what the agent tried to do, who approved it, what context the reviewer saw, and whether the action really happened. If you cannot reconstruct that later, you lack true governance and rely only on trust.&lt;/p&gt;

&lt;p&gt;A useful audit trail usually captures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the agent identity&lt;/li&gt;
&lt;li&gt;the human reviewer or approver&lt;/li&gt;
&lt;li&gt;the requested action&lt;/li&gt;
&lt;li&gt;the policy or workflow that allowed it&lt;/li&gt;
&lt;li&gt;the time of review and execution&lt;/li&gt;
&lt;li&gt;the result of the action&lt;/li&gt;
&lt;li&gt;any error, override, or escalation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is especially important when the workflow touches secrets or sensitive data. If the agent is allowed to see too much, the audit trail becomes the only way to understand how a problem happened.&lt;/p&gt;

&lt;p&gt;That is one reason &lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt; matters so much. If a model can see raw credentials, the blast radius gets much bigger than most teams expect.&lt;/p&gt;

&lt;h2&gt;
  
  
  The common failure mode
&lt;/h2&gt;

&lt;p&gt;Most bad HITL systems fail in the same way: they keep the human in the loop in name only.&lt;/p&gt;

&lt;p&gt;The reviewer gets too much noise. The approval prompt is vague. The agent has too much access. The logs are hard to search. Nobody knows which decisions need escalation and which do not. Over time, the team starts clicking approve because it is easier than reading the context.&lt;/p&gt;

&lt;p&gt;That is automation bias in a nutshell.&lt;/p&gt;

&lt;p&gt;Instead of removing the human, make their job smaller and clearer.&lt;/p&gt;

&lt;h2&gt;
  
  
  A better pattern for teams
&lt;/h2&gt;

&lt;p&gt;If you are designing a workflow from scratch, start with a simple separation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the agent proposes&lt;/li&gt;
&lt;li&gt;the system checks policy&lt;/li&gt;
&lt;li&gt;a human approves the risky step&lt;/li&gt;
&lt;li&gt;the system executes with limited scope&lt;/li&gt;
&lt;li&gt;the action gets logged&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That separation makes the logic easier to reason about. It also keeps you from copying brittle approval logic into every new automation.&lt;/p&gt;

&lt;p&gt;This is the kind of pattern that teamcopilot.ai is built for. Teams can reuse a workflow once it has the right guardrails instead of rebuilding the same approval step over and over.&lt;/p&gt;

&lt;p&gt;If you want a concrete example of why this matters, &lt;a href="https://dev.to/blog/ai-coding-agent-deleted-production-database"&gt;An AI Coding Agent Deleted a Production Database. Here's What Happened and How to Prevent It&lt;/a&gt; is a good reminder that fast automation without control can become expensive very quickly.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this means for product teams
&lt;/h2&gt;

&lt;p&gt;For product teams, HITL must be integrated directly into the core user experience.&lt;/p&gt;

&lt;p&gt;If the approval step is too noisy, people ignore it. If the permissions are too broad, security pushes back. If the audit trail is weak, nobody trusts the workflow after the first incident. The best systems balance those three things so the workflow still feels fast.&lt;/p&gt;

&lt;p&gt;That usually means building for three types of actions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;low-risk actions that can run automatically&lt;/li&gt;
&lt;li&gt;medium-risk tasks that require a quick manual check&lt;/li&gt;
&lt;li&gt;high-risk operations that always demand explicit, multi-step approval&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once you draw that line, the design gets much clearer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this leaves the market
&lt;/h2&gt;

&lt;p&gt;The industry is moving toward autonomous systems, but success belongs to systems with well-defined limits.&lt;/p&gt;

&lt;p&gt;Governance-heavy discussions keep showing up across the market. Teams want the speed of AI, but they also want the ability to explain what happened when something goes wrong. Human-in-the-loop design is the bridge between those two needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is human-in-the-loop AI?
&lt;/h3&gt;

&lt;p&gt;It is an AI setup where a person reviews or approves important actions before the system executes them.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is human-in-the-loop AI the same as human oversight?
&lt;/h3&gt;

&lt;p&gt;Not exactly. Human oversight is the broader idea. Human-in-the-loop is the workflow pattern that puts the human directly into the decision path.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do all AI actions need human approval?
&lt;/h3&gt;

&lt;p&gt;No. Low-risk actions can often run automatically. The point is to reserve human review for actions that are risky, irreversible, or sensitive.&lt;/p&gt;

&lt;h3&gt;
  
  
  What kinds of actions should usually require approval?
&lt;/h3&gt;

&lt;p&gt;Anything that changes production systems, moves money, sends external messages, grants access, or exposes sensitive data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why are permissions important if I already have approvals?
&lt;/h3&gt;

&lt;p&gt;Because approvals do not help much if the agent already has too much access. Permissions should shrink the blast radius before the approval step even starts.&lt;/p&gt;

&lt;h3&gt;
  
  
  What should an audit trail include for AI agents?
&lt;/h3&gt;

&lt;p&gt;It should record the agent, the reviewer, the requested action, the policy used, the time, the result, and any override or escalation.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can human-in-the-loop slow teams down?
&lt;/h3&gt;

&lt;p&gt;It can, if the workflow is poorly designed. A good HITL system reduces friction by making the review step small, specific, and easy to act on.&lt;/p&gt;

&lt;h3&gt;
  
  
  How is teamcopilot.ai relevant here?
&lt;/h3&gt;

&lt;p&gt;It gives teams a way to run reusable AI workflows with permissions, approvals, and control instead of treating every agent like an unbounded assistant.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the biggest mistake teams make with agent approvals?
&lt;/h3&gt;

&lt;p&gt;They make the approval step too vague and let the agent keep too much access. That creates noise for reviewers and risk for the system.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the simplest way to start?
&lt;/h3&gt;

&lt;p&gt;Start with one workflow, one risky action, and one clear approval step. Get the logging right, then expand from there.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>automation</category>
      <category>security</category>
    </item>
    <item>
      <title>Claude in Slack Explained: What Claude Tag Can Do, Benefits, and Downsides</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Wed, 24 Jun 2026 04:32:07 +0000</pubDate>
      <link>https://dev.to/rish_poddar/claude-in-slack-explained-what-claude-tag-can-do-benefits-and-downsides-1e91</link>
      <guid>https://dev.to/rish_poddar/claude-in-slack-explained-what-claude-tag-can-do-benefits-and-downsides-1e91</guid>
      <description>&lt;p&gt;Anthropic's Claude Tag looks simple at first glance. Put Claude inside Slack, let people tag it into threads, give it access to selected tools and data, and let it work in the same place the team is already talking.&lt;/p&gt;

&lt;p&gt;The simplicity is a bit deceptive. Once an AI agent becomes a shared teammate instead of a private chat, the questions get much sharper. What can it see? Who can ask it to do work? How much memory does it keep? How do you stop it from becoming noisy, expensive, or hard to control?&lt;/p&gt;

&lt;p&gt;This post walks through what Claude Tag does, where it is genuinely useful, where it starts to fray, and why a more model-agnostic workflow layer like teamcopilot.ai can be a better fit for teams that want tighter control.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Claude Tag is
&lt;/h2&gt;

&lt;p&gt;Claude Tag is Anthropic's Slack-native agent. According to Anthropic's &lt;a href="https://www.anthropic.com/news/introducing-claude-tag" rel="noopener noreferrer"&gt;announcement&lt;/a&gt;, you can tag &lt;code&gt;@Claude&lt;/code&gt; into a thread, give it access to the tools and data it needs, and let it work on behalf of the channel.&lt;/p&gt;

&lt;p&gt;Claude Tag acts as a shared presence directly inside your Slack channels. Everyone in the channel can monitor its progress, jump into the thread, and rely on the agent to maintain context over time.&lt;/p&gt;

&lt;p&gt;Anthropic's &lt;a href="https://www.claude.com/docs/claude-tag/overview" rel="noopener noreferrer"&gt;docs&lt;/a&gt; make the positioning even clearer. Claude Tag is meant to catch up on messy threads, pull numbers, draft PRs, prep for calls, watch channels, and keep work moving without forcing people to switch tabs.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it can do well
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Work where the conversation already happens
&lt;/h3&gt;

&lt;p&gt;This is the main win. Most teams already decide things in Slack. The problem is that the decision, the follow-up, the doc, and the action item end up scattered across different tools.&lt;/p&gt;

&lt;p&gt;Claude Tag tries to close that gap. If a thread turns into a task, you can hand the task to Claude in the same place you discussed it. That cuts out the usual copy-and-paste dance.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Keep shared context in public view
&lt;/h3&gt;

&lt;p&gt;The multiplayer part matters. A shared agent in a channel can be easier to use than a private agent hidden in one person's account because the whole team can see what was asked, what Claude did, and what is still open.&lt;/p&gt;

&lt;p&gt;It also makes handoffs less painful. If one person leaves for the day, another person can pick up the same thread without starting over.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Handle repetitive coordination work
&lt;/h3&gt;

&lt;p&gt;Claude Tag is strongest when the task is not deeply bespoke. Think summaries, status pulls, ticket drafting, call prep, channel monitoring, or chasing down a missing detail.&lt;/p&gt;

&lt;p&gt;That is the kind of work teams usually tolerate in the background and never quite automate properly.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Add proactive behavior
&lt;/h3&gt;

&lt;p&gt;Anthropic leans hard into ambient and asynchronous work here. Claude can watch, follow up, and surface things that went quiet.&lt;/p&gt;

&lt;p&gt;That is useful when the work is more like coordination than code. It is not just answering questions. It is nudging the team forward.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it gets awkward
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Slack is a constraint, not just a feature
&lt;/h3&gt;

&lt;p&gt;Slack is where many teams work, but not all teams. And even for teams that do use Slack heavily, it is still only one surface.&lt;/p&gt;

&lt;p&gt;If your work spans Slack, GitHub, docs, internal tools, and approvals, a Slack-only agent can feel like the front door to a much larger system that it does not really control.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Shared memory is useful and risky
&lt;/h3&gt;

&lt;p&gt;Memory is a benefit until it becomes stale, noisy, or wrong.&lt;/p&gt;

&lt;p&gt;The HN thread around the launch went straight to the obvious concerns: token usage, memory bloat, permissions, and whether a shared Slack agent can really know what should or should not be remembered. That is the right criticism. Team memory is only helpful if teams can control what gets retained and what gets ignored.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Permissions get complicated fast
&lt;/h3&gt;

&lt;p&gt;Anthropic has a thoughtful access model for Claude Tag, including channel-scoped identities and admin-controlled access. That is better than a naive shared bot.&lt;/p&gt;

&lt;p&gt;But the moment an agent sits in a shared channel, permissions stop being abstract. The agent has to know whose tools it can use, what data it can read, what gets logged, and what requires a human to approve.&lt;/p&gt;

&lt;p&gt;For a lot of companies, that becomes the product.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Token cost is a real concern
&lt;/h3&gt;

&lt;p&gt;Running a proactive, memory-heavy agent in a busy channel gets expensive quickly because every summary and follow-up consumes tokens. If the channel is busy, the costs can add up quickly. This is just a reminder that agent design is also cost design.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. It can feel too tied to one vendor and one model
&lt;/h3&gt;

&lt;p&gt;Using Claude Tag also ties you directly to Anthropic's ecosystem, which limits your ability to swap models or use different tools as your needs change.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where teamcopilot.ai fits
&lt;/h2&gt;

&lt;p&gt;teamcopilot.ai centers the workflow rather than the chat surface, giving you direct control over what runs, which tools the agent can touch, and when a human must step in. This approach makes it easier to stay transparent about what the agent is actually doing, not just what it said it would do.&lt;/p&gt;

&lt;p&gt;It is also model-agnostic, which matters more over time than people like to admit. The best model today is not guaranteed to be the best model for every task next quarter. If your workflow layer is separate from the model layer, you keep more flexibility and less lock-in.&lt;/p&gt;

&lt;p&gt;For teams fully committed to Anthropic's ecosystem who want a quick, Slack-native assistant, Claude Tag is a strong fit. If you need to control the underlying workflow, maintain deep transparency, and avoid vendor lock-in, teamcopilot.ai is a better choice.&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical read on the launch
&lt;/h2&gt;

&lt;p&gt;Claude Tag is not a gimmick. It is a serious attempt to make AI feel like a teammate instead of a tab.&lt;/p&gt;

&lt;p&gt;That makes it interesting, but it also highlights why the limitations matter.&lt;/p&gt;

&lt;p&gt;Once an agent becomes multiplayer, the hard problems show up faster. Memory, permissions, and auditing all become much more difficult. And if the agent is buried inside one chat app, the lock-in question becomes impossible to ignore. This doesn't make Claude Tag bad, just honest.&lt;/p&gt;

&lt;p&gt;If your team lives in Slack and wants a fast way to delegate work, it is worth trying. If your team needs more control than that, a workflow-first system like teamcopilot.ai is probably the better long-term bet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/claude-code-security-permissions-prompt-injection-and-secrets"&gt;Claude Code Security: Permissions, Prompt Injection, and Secrets&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/mcp-vs-skills-why-skills-save-context-tokens"&gt;MCP vs Skills: Why Skills Save Context Tokens&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/what-is-an-agent-loop-how-ai-agents-reason-act-and-iterate"&gt;What Is an Agent Loop? How AI Agents Reason, Act, and Iterate&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/claude-code-for-teams"&gt;How to Use Claude Code with a Team: Shared Context, Permissions, and MCP&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Claude Tag the same as Claude in Slack?
&lt;/h3&gt;

&lt;p&gt;Basically yes. Claude Tag is Anthropic's newer Slack-native way to bring Claude into a team channel as a shared agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the main benefit of Claude Tag?
&lt;/h3&gt;

&lt;p&gt;It keeps work inside the thread where the conversation already happened. That makes it easier to assign tasks, get summaries, and keep context visible to the whole team.&lt;/p&gt;

&lt;h3&gt;
  
  
  What can Claude Tag actually do?
&lt;/h3&gt;

&lt;p&gt;It can summarize threads, pull data, watch channels, draft responses, prepare call notes, open PRs, and generally handle the coordination work that usually gets lost between messages.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Claude Tag only for engineers?
&lt;/h3&gt;

&lt;p&gt;No. Anthropic is clearly aiming at broader team use. Support, ops, product, sales, and admin workflows all fit the pattern if the work lives in Slack.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are the biggest downsides?
&lt;/h3&gt;

&lt;p&gt;The biggest ones are Slack lock-in, token cost, permission complexity, and the risk of letting a shared agent remember too much from too many threads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Claude Tag safe for sensitive company data?
&lt;/h3&gt;

&lt;p&gt;It is safer than a loose chatbot because Anthropic built admin-scoped identities and access controls around it. But safety still depends on how carefully the workspace is configured and what data you expose to the channel.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why do people worry about token usage?
&lt;/h3&gt;

&lt;p&gt;Because a proactive, memory-heavy agent can generate a lot of traffic in a busy workspace. Every extra summary, follow-up, and context refresh costs tokens, so the real bill depends on how the team uses it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Could Claude Tag replace a workflow platform?
&lt;/h3&gt;

&lt;p&gt;Not really. It is best thought of as a powerful interaction layer. A workflow platform handles more of the orchestration, approvals, branching logic, and auditability behind the scenes.&lt;/p&gt;

&lt;h3&gt;
  
  
  When should I choose teamcopilot.ai instead?
&lt;/h3&gt;

&lt;p&gt;Choose teamcopilot.ai if you want the agent to run controlled workflows across tools, stay model-agnostic, and make approvals and execution paths more explicit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Who should choose Claude Tag versus teamcopilot.ai?
&lt;/h3&gt;

&lt;p&gt;Teams deeply embedded in Slack who want a fast, collaborative assistant for daily coordination will get the most out of Claude Tag. On the other hand, teams that need reusable automations, strict governance, and independence from a single chat interface will find teamcopilot.ai a better fit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Should I use both?
&lt;/h3&gt;

&lt;p&gt;Sometimes, yes. Claude Tag can be the front door for quick team interaction, while teamcopilot.ai handles the more controlled automation behind it.&lt;/p&gt;

&lt;h3&gt;
  
  
  What should I watch out for before rolling out a tool like this?
&lt;/h3&gt;

&lt;p&gt;Start with access, logging, and approval paths. If you cannot explain what the agent can touch, who can invoke it, and how to review its actions, you are not ready to scale it.&lt;/p&gt;

&lt;h3&gt;
  
  
  How should a team make the final decision?
&lt;/h3&gt;

&lt;p&gt;Claude Tag is a real step toward shared, multiplayer AI work. It is useful. It is also opinionated. If that fits your team, great. If not, teamcopilot.ai gives you a cleaner way to keep the model separate from the workflow and the workflow separate from the chat surface.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>claude</category>
      <category>productivity</category>
    </item>
    <item>
      <title>MCP vs Skills: Why Skills Save Context Tokens</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Mon, 22 Jun 2026 09:54:11 +0000</pubDate>
      <link>https://dev.to/rish_poddar/mcp-vs-skills-why-skills-save-context-tokens-40na</link>
      <guid>https://dev.to/rish_poddar/mcp-vs-skills-why-skills-save-context-tokens-40na</guid>
      <description>&lt;p&gt;MCP is useful, but most of the time you do not actually need it. It gives an agent a clean way to discover tools, call APIs, and work with external systems. In practice, a skill file can describe the same usage path without dragging the whole MCP surface into context.&lt;/p&gt;

&lt;p&gt;But MCP is not free; rather than MCP itself, the real issue is the habit of loading a big MCP surface into every session, no matter what the session is actually about. Once a Claude Code or Codex run pulls in a bunch of servers, the model sees those tool definitions right away, even if the job is just writing docs or fixing a small bug. That is where the waste starts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The hidden cost of always-on MCP
&lt;/h2&gt;

&lt;p&gt;Every MCP server brings metadata with it: tool names, descriptions, argument schemas, nested parameters, enums, examples, and sometimes prompts or resources. While useful, this is still context.&lt;/p&gt;

&lt;p&gt;If you connect a handful of lightweight tools, the overhead is annoying but manageable. If you connect a real stack of services, the cost compounds fast.&lt;/p&gt;

&lt;p&gt;In practice, you end up paying for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tool discovery before the task starts&lt;/li&gt;
&lt;li&gt;schema text the model may never use&lt;/li&gt;
&lt;li&gt;repeated loading across unrelated sessions&lt;/li&gt;
&lt;li&gt;extra context pressure that pushes out the actual work&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point matters more than people think. Context acts as the active working set the model uses to reason. The more of it you burn on static tool catalogs, the less room you have for the user request, the repo state, prior reasoning, and the actual answer.&lt;/p&gt;

&lt;p&gt;Anthropic has already written about this problem directly in the context of MCP. Their engineering post on code execution with MCP calls out tool-definition bloat and shows how direct tool calls can consume a lot of context before the model even starts doing the real job. The tool list is not just setup noise; it is part of the session cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why skills are cheaper
&lt;/h2&gt;

&lt;p&gt;Skills take a different path. A skill file keeps the always-loaded portion tiny. Usually that means just the skill name and a short description in the frontmatter. The detailed instructions stay in &lt;code&gt;SKILL.md&lt;/code&gt; and only load when the model actually needs them. This progressive disclosure is the whole trick:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The model sees a lightweight skill name and description up front.&lt;/li&gt;
&lt;li&gt;If the task matches, it loads the skill file.&lt;/li&gt;
&lt;li&gt;If the skill needs supporting files, those are read only when needed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For repeated operational knowledge, that is a much better tradeoff than dumping a full MCP tool surface into every session. You get the guidance when it matters, and you do not spend tokens on it when it does not.&lt;/p&gt;

&lt;p&gt;This is why skills are a better default for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;team-specific procedures&lt;/li&gt;
&lt;li&gt;prompt templates&lt;/li&gt;
&lt;li&gt;review checklists&lt;/li&gt;
&lt;li&gt;internal conventions&lt;/li&gt;
&lt;li&gt;reusable task instructions&lt;/li&gt;
&lt;li&gt;“how we do this here” knowledge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They are not trying to be live integrations. They are trying to be cheap, reusable context.&lt;/p&gt;

&lt;h2&gt;
  
  
  Skills can replace the MCP layer
&lt;/h2&gt;

&lt;p&gt;Skills are for instructions, decision-making, and the actual usage pattern, while MCP is usually just extra protocol surface. In practice, that means skills can replace MCP for the part humans actually interact with. The model does not need a full tool catalog in context just to know how to use a service.&lt;/p&gt;

&lt;p&gt;If the agent needs to use a database, hit a SaaS API, or make authenticated requests in real time, the skill can still describe the flow clearly and keep the model on the narrow path it needs.&lt;/p&gt;

&lt;p&gt;If the agent just needs to know how your team wants it to behave, a skill is the better shape. Most of the time, that is the whole job.&lt;/p&gt;

&lt;p&gt;The mistake is to keep a heavy protocol layer around when a skill file can do the same job with far less context.&lt;/p&gt;

&lt;h2&gt;
  
  
  A simple rule
&lt;/h2&gt;

&lt;p&gt;Use skills by default.&lt;/p&gt;

&lt;p&gt;Treat MCP as optional, not foundational.&lt;/p&gt;

&lt;p&gt;That sounds obvious, but a lot of agent setups blur the line. They stuff every possible tool into every session, then wonder why the model gets slower, more expensive, and harder to steer.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this looks like in practice
&lt;/h2&gt;

&lt;p&gt;If you have a service that exposes 40 or 50 MCP tools, it might be fine for a developer who uses it every day. But most sessions do not need all 50 tools. A lot of the time, the agent just needs one narrow procedure, such as looking up a user, updating a record, creating a ticket, or formatting a request safely.&lt;/p&gt;

&lt;p&gt;The skill can tell the model exactly how to handle the task, what fields matter, what not to do, and which edge cases to watch for. The model does not need a giant always-on MCP tool catalog to do that well.&lt;/p&gt;

&lt;p&gt;That is the real token saving. You stop paying for the full runtime surface when all you needed was the operating playbook.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to convert MCP into a skill
&lt;/h2&gt;

&lt;p&gt;If you have an MCP server that mostly behaves like a reusable API wrapper, you should turn the useful parts into a skill.&lt;/p&gt;

&lt;p&gt;The easiest way to inspect what you actually need is to use &lt;a href="//mcpview.teamcopilot.ai"&gt;MCPViewer tool&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here is the workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open the &lt;a href="//mcpview.teamcopilot.ai"&gt;MCPViewer tool&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Paste the MCP server URL.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Analyze&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Scroll down and click &lt;strong&gt;Download spec&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Copy the downloaded JSON.&lt;/li&gt;
&lt;li&gt;Paste it into a &lt;code&gt;SKILL.md&lt;/code&gt; file as the skill’s content reference.&lt;/li&gt;
&lt;li&gt;Set the skill description to something like &lt;code&gt;How to use APIs for &amp;lt;service name&amp;gt; service&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This flow extracts the useful service knowledge into a lighter, reusable skill that the model can load only when needed, rather than trying to preserve every tool forever.&lt;/p&gt;

&lt;p&gt;If the service changes often, keep the skill narrow and update it when the API changes. If the service is stable, the skill becomes a better long-term home for the instructions than the full MCP surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  A good pattern for teams
&lt;/h2&gt;

&lt;p&gt;For most teams, the best setup is skills everywhere, using skill files for the things that must be remembered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how to format requests&lt;/li&gt;
&lt;li&gt;how to review output&lt;/li&gt;
&lt;li&gt;team conventions&lt;/li&gt;
&lt;li&gt;approval rules&lt;/li&gt;
&lt;li&gt;safe operating procedures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If a service still needs live execution, the skill can describe that path without dragging its whole protocol surface into every session. This keeps the agent lean and makes the system easier to maintain, because procedural knowledge is no longer spread across a large tool registry.&lt;/p&gt;

&lt;p&gt;It is also easier to reason about failure. If the skill is wrong, you update instructions. If you need to change how a service is used, you update the skill. Those are different jobs, and it helps to keep them separate.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real goal: less context waste
&lt;/h2&gt;

&lt;p&gt;The problem is not just token cost in the billing sense. It is context waste. Every extra tool definition you stuff into a session is one more thing the model has to carry around while solving the actual task.&lt;/p&gt;

&lt;p&gt;Skills let you defer that cost until the model really needs the information. They are a good fit for repeated workflows, company knowledge, and reusable operating rules.&lt;/p&gt;

&lt;p&gt;If MCP is the transport, skills are the memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  Related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/claude-code-for-teams"&gt;How to Use Claude Code with a Team: Shared Context, Permissions, and MCP&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/claude-code-guide"&gt;The Complete Guide to Claude Code: Setup, Skills, Hooks, and the Agent Loop&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is MCP bad?
&lt;/h3&gt;

&lt;p&gt;MCP is not the main problem. The problem is loading it into sessions that do not need it when a skill file would do the job with far less context.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do skills replace MCP?
&lt;/h3&gt;

&lt;p&gt;Yes, for most practical cases. If the goal is to teach the agent how to use a service, a skill can replace MCP and keep the context much smaller.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why do skills save tokens?
&lt;/h3&gt;

&lt;p&gt;Because the always-loaded part is small, the model sees the skill name and description first, then loads the full &lt;code&gt;SKILL.md&lt;/code&gt; only when the skill is relevant.&lt;/p&gt;

&lt;h3&gt;
  
  
  What kind of content belongs in a skill?
&lt;/h3&gt;

&lt;p&gt;Reusable instructions, procedures, checklists, formatting rules, and team-specific guidance. If the content is mostly about how to behave, it belongs in a skill.&lt;/p&gt;

&lt;h3&gt;
  
  
  What kind of content belongs in MCP?
&lt;/h3&gt;

&lt;p&gt;Very little, unless you have a special case, as the same operational knowledge usually fits better in a skill.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I keep both MCP and skills for the same service?
&lt;/h3&gt;

&lt;p&gt;Yes. That is often the best setup. MCP handles the runtime connection. The skill handles the playbook for using it well.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why use mcpview.teamcopilot.ai?
&lt;/h3&gt;

&lt;p&gt;Because it lets you inspect the actual MCP surface before you decide what should stay as MCP and what should become a lighter skill. That makes the conversion less guessy.&lt;/p&gt;

&lt;h3&gt;
  
  
  What if the MCP spec changes?
&lt;/h3&gt;

&lt;p&gt;Update the skill the same way you would update any other documentation or wrapper. If the API changes often, keep the skill narrow so maintenance stays easy.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the best short description for a converted skill?
&lt;/h3&gt;

&lt;p&gt;Something specific and boring, such as &lt;code&gt;How to use APIs for &amp;lt;service name&amp;gt; service&lt;/code&gt;. This pattern tells the model exactly what the skill is for without wasting words.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>mcp</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Sakana AI's Fugu Explained: How the Multi-Agent Model Orchestrates Frontier LLMs</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Mon, 22 Jun 2026 05:23:12 +0000</pubDate>
      <link>https://dev.to/rish_poddar/sakana-ais-fugu-explained-how-the-multi-agent-model-orchestrates-frontier-llms-28eh</link>
      <guid>https://dev.to/rish_poddar/sakana-ais-fugu-explained-how-the-multi-agent-model-orchestrates-frontier-llms-28eh</guid>
      <description>&lt;p&gt;Sakana AI's Fugu is a good example of where the industry is heading.&lt;/p&gt;

&lt;p&gt;Instead of trying to win with one massive model, it coordinates a pool of strong models well. On the surface, Fugu is presented as a single API, but under the hood, it behaves like a learned manager that routes tasks, chooses roles, and stitches together the output of multiple frontier models. This makes Fugu a multi-agent orchestration system delivered as a single model, rather than just a chatbot with a nicer prompt.&lt;/p&gt;

&lt;p&gt;A lot of the messy work in production AI comes from orchestration: choosing the right model, deciding when to verify, splitting a task into subtasks, and avoiding expensive calls when a cheaper one will do. Fugu turns that problem into the product.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Fugu actually is
&lt;/h2&gt;

&lt;p&gt;Sakana AI describes Fugu as a multi-agent system as a model. You send one request to a single endpoint, and Fugu decides how to distribute the work across a pool of specialist models.&lt;/p&gt;

&lt;p&gt;That pool is not locked to a single vendor. The system can dynamically assemble agents, coordinate them, and even let users opt out of specific models or providers to fit privacy, data, or compliance requirements. The goal is to keep the API simple while making the backend coordination much smarter than a hand-built router.&lt;/p&gt;

&lt;p&gt;There are two public variants:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fugu, which balances latency and quality&lt;/li&gt;
&lt;li&gt;Fugu Ultra, which uses a deeper pool of agents for harder tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This split is useful because not every task deserves the most expensive path. A lot of day-to-day coding, review, and internal support work needs a fast default. More difficult tasks, like deep reasoning, paper reproduction, or security analysis, can justify a heavier orchestration setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;p&gt;The basic workflow is different from a normal single-model call. First, the incoming task is routed into a learned coordination process. Fugu decides which agents should participate, what role each one should play, and how the exchange should proceed. The system learns collaboration patterns that are not obvious to a human operator, but work well in practice.&lt;/p&gt;

&lt;p&gt;Fugu is grounded in two ICLR 2026 papers: TRINITY and Conductor. TRINITY uses a lightweight evolved coordinator that assigns roles like Thinker, Worker, and Verifier across a multi-turn task. Conductor learns natural-language coordination strategies with reinforcement learning. Together, they show that instead of hand-designing every workflow, you can train a system to discover how to orchestrate other models. This points to a broader shift: while the last wave of AI progress focused on making single models stronger, this wave is about making model systems smarter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the orchestration layer matters
&lt;/h2&gt;

&lt;p&gt;Most teams already know that different models are good at different things. While one model might excel at code, others are better suited for long reasoning or factual retrieval. In a hand-built stack, someone has to decide when to call which model, how to verify the output, and when to stop paying for more inference. Fugu tries to learn those decisions instead of hard-coding them.&lt;/p&gt;

&lt;p&gt;This approach improves cost-performance. If the system can route easy subtasks to lighter agents and reserve heavier agents for the hard parts, the overall result can be better than sending every request to the most expensive model in the pool.&lt;/p&gt;

&lt;p&gt;It also improves reliability. A lot of failures in agentic systems happen because orchestration is brittle. When one model does everything, a single mistake ripples through the whole chain. Fugu's design reduces that risk by using specialists and verification roles more deliberately.&lt;/p&gt;

&lt;h2&gt;
  
  
  Fugu versus Fugu Ultra
&lt;/h2&gt;

&lt;p&gt;The difference between the two variants is mostly about how much orchestration you want to pay for.&lt;/p&gt;

&lt;p&gt;Fugu is the balanced option, designed as the practical default for coding, interactive work, and general workloads where latency still matters.&lt;/p&gt;

&lt;p&gt;Fugu Ultra goes further, with Sakana positioning it for more complex, high-stakes, multi-step work where answer quality matters more than speed. The examples they highlight include paper reproduction, Kaggle competitions, security analysis, literature review, and patent research.&lt;/p&gt;

&lt;p&gt;This framing shows what the product is really for. Fugu is not just a better chat model; it is a system for tasks where the model has to reason, delegate, verify, and even disagree with itself before it answers.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the benchmarks suggest
&lt;/h2&gt;

&lt;p&gt;Sakana reports strong performance across coding, reasoning, science, and agentic benchmarks. Fugu and Fugu Ultra compare well with publicly available frontier models, sometimes sitting right alongside or ahead of them.&lt;/p&gt;

&lt;p&gt;The benchmarks they call out include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SWE-Pro for coding&lt;/li&gt;
&lt;li&gt;TerminalBench for terminal and tool use&lt;/li&gt;
&lt;li&gt;LiveCodeBench and LiveCodeBench Pro&lt;/li&gt;
&lt;li&gt;Humanity's Last Exam for hard reasoning&lt;/li&gt;
&lt;li&gt;GPQA-D for scientific reasoning&lt;/li&gt;
&lt;li&gt;SciCode&lt;/li&gt;
&lt;li&gt;Long-context reasoning&lt;/li&gt;
&lt;li&gt;MRCRv2&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The exact numbers matter less than the pattern. Rather than claiming to be a single monolithic model, Fugu demonstrates that orchestration itself can produce frontier-level results on difficult tasks.&lt;/p&gt;

&lt;p&gt;Their qualitative examples make that point even more clearly. Sakana shows Fugu on tasks like autonomous research, classical Japanese reading-order recovery, Rubik's Cube solving, CAD generation for a mechanical iris, blindfold chess, and trading simulations. These environments are very different, but they all reward a system that can choose the right internal strategy instead of guessing once and hoping for the best.&lt;/p&gt;

&lt;h2&gt;
  
  
  The product details that matter
&lt;/h2&gt;

&lt;p&gt;Fugu is delivered through an OpenAI-compatible API, which means teams do not need to rebuild their integration layer to try it. If you already have a client, a harness, or an internal agent stack that talks to an OpenAI-style endpoint, Fugu slots in without much friction.&lt;/p&gt;

&lt;p&gt;Sakana offers both subscription and pay-as-you-go plans. The pay-as-you-go model avoids stacking fees across every model in the pool; you pay a single rate based on the top-tier model involved in the configured pool. This makes orchestration financially viable instead of prohibitively expensive.&lt;/p&gt;

&lt;p&gt;One limitation: Fugu is not yet available in the EU/EEA while Sakana works toward compliance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is a bigger product than it looks like
&lt;/h2&gt;

&lt;p&gt;At first glance, Fugu sounds like a very good router, but that description undersells it. The deeper idea is that model orchestration itself is becoming a first-class capability. If that holds, the value is not only in better benchmark scores, but in turning a pile of expensive, specialized models into a single system that a team can use without hand-tuning workflows from scratch.&lt;/p&gt;

&lt;p&gt;The system is useful for real teams because it hides just enough complexity to make multi-model workflows practical.&lt;/p&gt;

&lt;p&gt;There is also a strategic angle. Relying on one provider for every critical task is a risk. A learned orchestration layer that can route around constraints, swap agents, or exclude a provider reduces that dependency. Sakana is clearly leaning into that idea.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where teamcopilot.ai fits
&lt;/h2&gt;

&lt;p&gt;teamcopilot.ai is a shared control layer for AI workflows, permissions, and approvals. That makes it a natural fit for a system like Fugu. If Fugu is the orchestration engine for a task, teamcopilot.ai is the governance layer around it. You can route work through reusable workflows, keep approvals visible, and decide who can do what before the model ever touches the task. Production AI requires making models safe, repeatable, and shareable across a team.&lt;/p&gt;

&lt;p&gt;Related reads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ai-agent-platform-comparison"&gt;Best AI Agent Platforms for Teams in 2026: Comparing 13 Tools&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/claude-code-for-teams"&gt;How to Use Claude Code with a Team: Shared Context, Permissions, and MCP&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The tradeoffs
&lt;/h2&gt;

&lt;p&gt;Fugu is impressive, but it has tradeoffs. Latency will always be part of the conversation when a system calls into multiple models or multiple agent steps. If you need instant responses for a live UI, a simpler single-model path may still win.&lt;/p&gt;

&lt;p&gt;The routing logic is also proprietary. Sakana does not expose the exact internal selection process, so you get the benefits of orchestration without full visibility into every decision. Additionally, while the standard Fugu allows opt-outs, Fugu Ultra uses the full agent pool. If you need strict control over every provider in the loop, that is worth keeping in mind.&lt;/p&gt;

&lt;p&gt;Still, these are normal tradeoffs for a new product category. The real test is whether the system earns that complexity back with better results.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bigger takeaway
&lt;/h2&gt;

&lt;p&gt;Fugu is a sign that the market is moving from single-model thinking to system thinking. That change is easy to miss if you only look at raw benchmark numbers, but the product story is clear. Sakana AI is betting that the most useful AI systems will be coordinated pools of models, with a learned layer deciding how to use them. Many teams are already heading in this direction manually, and Fugu simply makes the orchestration layer explicit.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is Sakana Fugu?
&lt;/h3&gt;

&lt;p&gt;Sakana Fugu is a multi-agent orchestration system presented as a single model API. It coordinates a pool of frontier models instead of relying on one model to do everything.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Fugu a model or a product?
&lt;/h3&gt;

&lt;p&gt;It is both. Sakana exposes it as a model API, but the real value is in the orchestration system behind it.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the difference between Fugu and Fugu Ultra?
&lt;/h3&gt;

&lt;p&gt;Fugu is the balanced, lower-latency option. Fugu Ultra uses a deeper agent pool for harder, higher-stakes tasks where quality matters more than speed.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does Fugu work?
&lt;/h3&gt;

&lt;p&gt;It routes tasks across multiple specialist models, assigns roles, and coordinates the response. The research behind it comes from TRINITY and Conductor.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why not just call one frontier model directly?
&lt;/h3&gt;

&lt;p&gt;Because different models excel at different tasks. Fugu decides when to delegate, verify, or switch strategies instead of making one model carry the whole load.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I control which models Fugu uses?
&lt;/h3&gt;

&lt;p&gt;Yes, for Fugu. Sakana lets you opt out of specific models or providers to fit privacy, data, or compliance needs. Fugu Ultra uses the full pool.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Fugu OpenAI-compatible?
&lt;/h3&gt;

&lt;p&gt;Yes. It fits into existing clients and agent stacks without requiring a major integration rewrite.&lt;/p&gt;

&lt;h3&gt;
  
  
  What tasks is Fugu best for?
&lt;/h3&gt;

&lt;p&gt;Coding, reasoning, research, security analysis, paper reproduction, and other multi-step workflows where orchestration matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Fugu good for real-time apps?
&lt;/h3&gt;

&lt;p&gt;Not necessarily. The more agents you coordinate, the more latency becomes a factor, so it may not be ideal for instant responses.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Fugu show which underlying models it used?
&lt;/h3&gt;

&lt;p&gt;No. Sakana treats the exact routing logic as proprietary.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can teams use Fugu safely?
&lt;/h3&gt;

&lt;p&gt;Yes, if the surrounding workflow is controlled. Approval layers, audit trails, and secret handling are essential for making any model safe and useful in a team setting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why should teams care about orchestration at all?
&lt;/h3&gt;

&lt;p&gt;Because orchestration is where real productivity wins happen. Choosing the right model for the right subtask can matter as much as choosing the model itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where does teamcopilot.ai fit in?
&lt;/h3&gt;

&lt;p&gt;teamcopilot.ai provides a shared control layer for AI workflows, permissions, and approvals, making it easy to run systems like Fugu inside a governed, reusable process.&lt;/p&gt;

&lt;h3&gt;
  
  
  Will Fugu replace single-model workflows?
&lt;/h3&gt;

&lt;p&gt;Not entirely. Simple tasks are still better served by a single call, but harder workflows that benefit from delegation and verification will increasingly rely on systems like Fugu.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>llm</category>
    </item>
    <item>
      <title>What Is an Agent Loop? How AI Agents Reason, Act, and Iterate</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Sun, 21 Jun 2026 05:24:07 +0000</pubDate>
      <link>https://dev.to/rish_poddar/what-is-an-agent-loop-how-ai-agents-reason-act-and-iterate-1675</link>
      <guid>https://dev.to/rish_poddar/what-is-an-agent-loop-how-ai-agents-reason-act-and-iterate-1675</guid>
      <description>&lt;p&gt;People keep talking about agent loops because they make an AI agent actually do useful work instead of just sounding smart.&lt;/p&gt;

&lt;p&gt;Without a loop, a model answers a question and stops. With a loop, it can keep going: analyze the task, take action, inspect the result, and decide what to do next. That is the basic shape of agentic AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  The short version
&lt;/h2&gt;

&lt;p&gt;An agent loop is an iterative cycle that usually looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Understand the goal&lt;/li&gt;
&lt;li&gt;Gather context&lt;/li&gt;
&lt;li&gt;Decide on the next action&lt;/li&gt;
&lt;li&gt;Use a tool or API&lt;/li&gt;
&lt;li&gt;Observe the result&lt;/li&gt;
&lt;li&gt;Repeat until the task is done or the agent should stop&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;An agent is built to act, check the outcome of its actions, and adjust its course until the job is done.&lt;/p&gt;

&lt;p&gt;If you want to see how this idea shows up in a broader production setting, our post on &lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt; is a good companion read. The loop acts as the engine, while the control plane keeps it from driving through a wall.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the loop matters
&lt;/h2&gt;

&lt;p&gt;A single model call works fine for simple tasks, but falls short when the work involves multiple steps, dependencies, and feedback.&lt;/p&gt;

&lt;p&gt;Say you ask an agent to research a vendor, compare pricing, draft a summary, and update a ticket. That is not one answer, but a sequence of actions with a check after each step. The loop is what lets the system recover when something changes halfway through.&lt;/p&gt;

&lt;p&gt;That is why the loop matters more than the prompt itself. While the prompt starts the work, the loop keeps it honest.&lt;/p&gt;

&lt;h2&gt;
  
  
  ReAct is the pattern behind it
&lt;/h2&gt;

&lt;p&gt;Most explanations of agent loops eventually land on ReAct, short for Reason + Act. This pattern encourages a model to alternate between thinking and doing instead of trying to solve everything in one shot.&lt;/p&gt;

&lt;p&gt;The model reasons about what to do next, takes an action, sees what happened, and then reasons again. This simple loop is why agent frameworks keep converging on the same basic shape even when the tooling changes.&lt;/p&gt;

&lt;p&gt;You can see that logic in posts like &lt;a href="https://dev.to/blog/claude-code-guide"&gt;The Complete Guide to Claude Code: Setup, Skills, Hooks, and the Agent Loop&lt;/a&gt; and &lt;a href="https://dev.to/blog/coding-agent-best-practices-how-to-set-up-ai-agents-securely-and-productively"&gt;Coding Agent Best Practices: How to Set Up AI Agents Securely and Productively&lt;/a&gt;. Once you have a loop, the real work becomes deciding what the agent is allowed to touch while it runs.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a good loop needs
&lt;/h2&gt;

&lt;p&gt;A good loop is more than just a while loop in code. It needs practical limits to stay useful:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A clear stopping condition&lt;/li&gt;
&lt;li&gt;Predictable tool calls&lt;/li&gt;
&lt;li&gt;State that survives each iteration&lt;/li&gt;
&lt;li&gt;A way to verify progress&lt;/li&gt;
&lt;li&gt;A cost or step limit so it does not run forever&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If those pieces are missing, the loop can get noisy fast. The agent may keep trying the same thing, burn tokens, or wander into actions that were never part of the job.&lt;/p&gt;

&lt;p&gt;That is where the risks start to show up. The loop can amplify mistakes just as easily as it can amplify productivity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where agent loops go wrong
&lt;/h2&gt;

&lt;p&gt;The most common failure is simple: the loop never really knows when to stop.&lt;/p&gt;

&lt;p&gt;If the task is vague, the agent keeps guessing. If the tools are too broad, it can take the wrong action with confidence. If the verification step is weak, the loop can keep repeating a bad plan and make it worse each time.&lt;/p&gt;

&lt;p&gt;That is also where production incidents happen. An agent with write access, weak guardrails, and no approvals can cause real damage quickly. We covered one version of that in &lt;a href="https://dev.to/blog/ai-coding-agent-deleted-production-database"&gt;An AI Coding Agent Deleted a Production Database. Here's What Happened and How to Prevent It&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The lesson is not to avoid loops, but to wrap them in boundaries.&lt;/p&gt;

&lt;h2&gt;
  
  
  Human in the loop is not optional for everything
&lt;/h2&gt;

&lt;p&gt;Low-risk work can run on its own, but high-risk actions require a human checkpoint. You might let an agent draft a summary, fetch files, or propose a change. But if it needs to delete data, send money, change permissions, or touch production, a person should approve the step.&lt;/p&gt;

&lt;p&gt;This is where teamcopilot.ai fits in. It gives teams a way to run agents with permissions, approvals, secret handling, and audit trails around the loop, keeping the process transparent.&lt;/p&gt;

&lt;p&gt;For a deeper look at security, &lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt; explains how to handle secrets safely.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this looks like in practice
&lt;/h2&gt;

&lt;p&gt;In practice, an agent loop usually needs three layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A reasoning model to plan the task&lt;/li&gt;
&lt;li&gt;Tools that let the agent take action&lt;/li&gt;
&lt;li&gt;Guardrails to define what is allowed and what needs human review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That makes the loop useful for things like research, code review, routing, summarization, and repetitive workflow work. It also makes the loop fragile if you skip the guardrails and assume the model will stay on task by default.&lt;/p&gt;

&lt;p&gt;The best teams treat the loop as a structured workflow engine rather than a black box.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why teams care now
&lt;/h2&gt;

&lt;p&gt;Teams want work to move, not another chat window. This is especially true when dealing with repeated decisions, messy handoffs, and routine approvals. A loop can cut out a lot of manual repetition, provided the system is designed to stop, check, and continue in the right places.&lt;/p&gt;

&lt;p&gt;The real value lies in a repeatable system that can work, fail, recover, and keep going. It goes beyond a fancy demo or a one-time prompt.&lt;/p&gt;

&lt;p&gt;For a broader comparison of the platforms trying to do this for teams, see &lt;a href="https://dev.to/blog/ai-agent-platform-comparison"&gt;Best AI Agent Platforms for Teams in 2026: Comparing 13 Tools&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is an agent loop in AI?
&lt;/h3&gt;

&lt;p&gt;An agent loop is the repeatable cycle an AI agent uses to reason about a task, take an action, observe the result, and decide what to do next.&lt;/p&gt;

&lt;h3&gt;
  
  
  How is an agent loop different from a chatbot?
&lt;/h3&gt;

&lt;p&gt;A chatbot usually gives one response and stops. An agent loop keeps going until the task is complete or a stopping condition is reached.&lt;/p&gt;

&lt;h3&gt;
  
  
  What does ReAct mean?
&lt;/h3&gt;

&lt;p&gt;ReAct means Reason + Act. It is the common pattern behind agent loops, where the model alternates between thinking and tool use.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why do agent loops need guardrails?
&lt;/h3&gt;

&lt;p&gt;Because a loop can repeat mistakes as easily as it repeats good decisions. Guardrails help control tool access, approvals, retries, and stopping conditions.&lt;/p&gt;

&lt;h3&gt;
  
  
  When should a human stay in the loop?
&lt;/h3&gt;

&lt;p&gt;For anything high-stakes, irreversible, or sensitive. That includes production changes, permissions, financial actions, and anything that touches secrets.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can an agent loop run forever?
&lt;/h3&gt;

&lt;p&gt;Yes, if you do not set clear stop conditions. Good loops include step limits, confidence checks, or approval checkpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the biggest risk with agent loops?
&lt;/h3&gt;

&lt;p&gt;Uncontrolled tool access. If the agent can act freely without review, a small mistake can turn into a real incident.&lt;/p&gt;

&lt;h3&gt;
  
  
  Are agent loops only for coding agents?
&lt;/h3&gt;

&lt;p&gt;No. They show up in research, support, operations, workflow automation, and anything else that needs repeated decisions.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does TeamCopilot use the idea of a loop?
&lt;/h3&gt;

&lt;p&gt;teamcopilot.ai adds permissions, approvals, secret handling, and workflow control around the loop so teams can use agents without giving them blanket access.&lt;/p&gt;

&lt;h3&gt;
  
  
  What should I read next?
&lt;/h3&gt;

&lt;p&gt;Start with &lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt; and &lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt;. Those two posts cover the control and security side of the same problem.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>automation</category>
      <category>llm</category>
    </item>
    <item>
      <title>Claude Code Security: Permissions, Prompt Injection, and Secrets</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Fri, 19 Jun 2026 04:12:05 +0000</pubDate>
      <link>https://dev.to/rish_poddar/claude-code-security-permissions-prompt-injection-and-secrets-fel</link>
      <guid>https://dev.to/rish_poddar/claude-code-security-permissions-prompt-injection-and-secrets-fel</guid>
      <description>&lt;p&gt;Claude Code is useful because it can actually do things. It can inspect a repo, follow instructions, run commands, and move work forward without turning every change into a copy-paste exercise. That is also where the security question starts. Once an agent can read files and execute actions, the real issue is not how clever it is, but what it can access and how much damage a bad input can do before anyone notices.&lt;/p&gt;

&lt;p&gt;Most Claude Code security problems start quietly. An agent might read a file it shouldn't, or run a command that exposes a secret. Sometimes a repository contains instructions meant for a human that the agent accidentally executes. Because nothing looks dramatic at first, the eventual damage is often much larger than it should be.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real security problem is exposure, not intelligence
&lt;/h2&gt;

&lt;p&gt;People often talk about coding agents as if the danger is that they might "think wrong." However, the real problem is access. If Claude Code can read your repo, shell history, environment variables, local config, and connected tools, then any bad instruction it encounters has a lot more room to cause trouble. The model does not need to be malicious for something to go wrong. It only needs to be nudged in the wrong direction while holding too much power. Claude Code security is really about boundaries. Clean boundaries make bad mistakes smaller.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompt injection is the messiest part
&lt;/h2&gt;

&lt;p&gt;Prompt injection happens when untrusted text steers the agent. This text can come from issue comments, READMEs, pasted chat logs, build artifacts, webpages, or other tool outputs. If the agent treats this text as instructions rather than data, it can be tricked. This is a practical problem because agent workflows constantly ask models to summarize, inspect, or act on external content. The simplest defense is to keep untrusted content separate from trusted instructions so the agent never blurs them together. If you want a deeper look at this problem, &lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt; is a direct companion piece.&lt;/p&gt;

&lt;h2&gt;
  
  
  Secrets are the easy target
&lt;/h2&gt;

&lt;p&gt;If prompt injection is the steering wheel, secrets are the gas tank. When an agent can read raw API keys, tokens, or long-lived credentials, a small mistake gets expensive fast. The risks include theft, accidental exposure, over-broad access, and treating credentials like ordinary data. The rule is boring but effective: the model should only access secret names. That means keeping raw &lt;code&gt;.env&lt;/code&gt; files out of the agent's line of sight, avoiding copying production credentials into tasks for convenience, and never assuming that redacting logs later is enough. Once a secret enters the context, the damage is done. Teams often get sloppy here, but a harmless-looking task like fetching a file or explaining a build can easily carry credentials into places they do not belong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Permissions should be narrow by default
&lt;/h2&gt;

&lt;p&gt;Claude Code gets more useful when it can act, but every permission you add should answer a real need. Keep tasks read-only if they only need to inspect files, and limit the writable surface when modifying code. You should also define exactly why network access is required and restrict secrets to specific, short-lived needs. Treating permissions as a one-time setup is a mistake; they are an ongoing part of the job. The best setups stay specific, giving the agent only the minimum access needed for the immediate task. There is a useful parallel here with &lt;a href="https://dev.to/blog/claude-code-for-teams"&gt;How to Use Claude Code with a Team: Shared Context, Permissions, and MCP&lt;/a&gt;, even if you are working alone. Once the access model gets vague, risk starts to climb.&lt;/p&gt;

&lt;h2&gt;
  
  
  Safer defaults are not optional
&lt;/h2&gt;

&lt;p&gt;Security gets easier when the default mode is conservative. Claude Code should ask before making meaningful changes, warn before touching sensitive files, and fail closed rather than open. A dangerous setup allows the agent to drift from useful to risky without a clear checkpoint, turning small tasks into large surprises. The recent Claude Code security changes from Anthropic point in the right direction with tighter edit behavior, explicit security guidance, and clearer boundaries. A mature tool should help you work while making dangerous paths harder to take.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a safer Claude Code setup looks like
&lt;/h2&gt;

&lt;p&gt;You do not need a huge policy document to improve security. A few habits do most of the work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep raw secrets out of the agent context.&lt;/li&gt;
&lt;li&gt;Use separate environments for exploratory work and sensitive work.&lt;/li&gt;
&lt;li&gt;Do not let the agent run with broad production access.&lt;/li&gt;
&lt;li&gt;Treat unknown text strictly as data.&lt;/li&gt;
&lt;li&gt;Require review before anything destructive or irreversible.&lt;/li&gt;
&lt;li&gt;Keep commands, diffs, and approvals visible.&lt;/li&gt;
&lt;li&gt;Rotate credentials if there is any chance they were exposed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These basic habits keep problems small. For a broader guide, read &lt;a href="https://dev.to/blog/coding-agent-best-practices-how-to-set-up-ai-agents-securely-and-productively"&gt;Coding Agent Best Practices: How to Set Up AI Agents Securely and Productively&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where teamcopilot.ai fits
&lt;/h2&gt;

&lt;p&gt;If you are using Claude Code for shared work, teamcopilot.ai provides guardrails without slowing down your workflow. It keeps raw secrets out of the model, simplifies permission boundaries, and requires approval for silent actions. That is not a replacement for judgment, but it keeps the judgment point where it belongs. This setup is especially useful when the same agent is used by more than one person, which is usually when vague access turns into a real problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The security mindset that actually holds up
&lt;/h2&gt;

&lt;p&gt;Claude Code security is not about making the agent perfect. The right goal is to make mistakes smaller. When an agent is tricked by a bad prompt, the damage must be limited. Seeing an unsafe file shouldn't grant access to everything else, and secrets should only be provided at the exact moment they are needed. This approach builds a tool you can actually trust, rather than one that just feels powerful.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is the biggest Claude Code security risk?
&lt;/h3&gt;

&lt;p&gt;The biggest risk comes from what the model is allowed to access. Broad file access, raw secrets, and unchecked tool use create most of the real exposure.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is prompt injection in Claude Code?
&lt;/h3&gt;

&lt;p&gt;Prompt injection is when untrusted text tries to influence the agent's behavior. It can appear in files, web pages, issue comments, command output, or other content the agent reads.&lt;/p&gt;

&lt;h3&gt;
  
  
  Should Claude Code be allowed to read &lt;code&gt;.env&lt;/code&gt; files?
&lt;/h3&gt;

&lt;p&gt;Not by default. If the agent can read raw secrets, then those secrets can be exposed to the context window and mishandled later.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is redacting logs enough to secure Claude Code?
&lt;/h3&gt;

&lt;p&gt;No. Redaction helps with visibility after the fact, but it does not stop the model from seeing the secret in the first place.&lt;/p&gt;

&lt;h3&gt;
  
  
  How should permissions be set up?
&lt;/h3&gt;

&lt;p&gt;Use the smallest useful set. Read-only for read-only tasks, narrow write access for edits, and explicit approval for anything risky or irreversible.&lt;/p&gt;

&lt;h3&gt;
  
  
  What should I do if Claude Code reads something untrusted?
&lt;/h3&gt;

&lt;p&gt;Treat that content strictly as data. If there is any doubt, stop the task, review what was read, and rerun the work with tighter boundaries.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can prompt instructions alone secure Claude Code?
&lt;/h3&gt;

&lt;p&gt;No. Instructions help, but they are not a security boundary. Real safety comes from permissions, secret handling, and approval gates.&lt;/p&gt;

&lt;h3&gt;
  
  
  When does Claude Code need human approval?
&lt;/h3&gt;

&lt;p&gt;Anything that can change access, secrets, production config, billing, or deployment boundaries should have a human checkpoint.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I know if my setup is too permissive?
&lt;/h3&gt;

&lt;p&gt;If one prompt can reach too many files, too many tools, or too many credentials, the setup is probably too loose.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Claude Code safe for solo use?
&lt;/h3&gt;

&lt;p&gt;It can be, if you keep the same basics in place: scoped secrets, narrow permissions, careful input handling, and review before risky actions.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does teamcopilot.ai help here?
&lt;/h3&gt;

&lt;p&gt;It gives you a way to keep secrets, permissions, and approvals under control so the agent stays useful without seeing everything or touching everything.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the simplest good security rule?
&lt;/h3&gt;

&lt;p&gt;Do not give the agent more access than the task needs.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>claude</category>
      <category>security</category>
    </item>
    <item>
      <title>SpaceX Acquires Cursor Maker Anysphere to Build an AI Coding Agent Model</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Wed, 17 Jun 2026 04:18:40 +0000</pubDate>
      <link>https://dev.to/rish_poddar/spacex-acquires-cursor-maker-anysphere-to-build-an-ai-coding-agent-model-1f85</link>
      <guid>https://dev.to/rish_poddar/spacex-acquires-cursor-maker-anysphere-to-build-an-ai-coding-agent-model-1f85</guid>
      <description>&lt;p&gt;SpaceX's acquisition of Anysphere, the maker of Cursor, signals a major shift in how we build software.&lt;/p&gt;

&lt;p&gt;Coding agents started as simple helper panels inside IDEs. Now, they are becoming critical infrastructure.&lt;/p&gt;

&lt;p&gt;The companies that win this space won't just have the slickest demos. They will own the developer's workflow, the underlying models, the distribution, and the compute.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this deal matters
&lt;/h2&gt;

&lt;p&gt;Cursor grew popular by letting developers write, edit, and refine code in plain language without ever leaving their editor. This eliminates context switching and speeds up development.&lt;/p&gt;

&lt;p&gt;Once a tool becomes this deeply embedded in a team's daily routine, it stops being a feature and becomes a core layer of the software stack.&lt;/p&gt;

&lt;p&gt;With this acquisition, SpaceX secures direct control over a critical developer workflow. Reuters reports that the deal includes plans to deepen model training, suggesting a long-term goal of reducing reliance on third-party model providers.&lt;/p&gt;

&lt;p&gt;The competition is shifting from building the best chat interface to owning the entire coding system-the editor, the agent, the model, the compute, and the developer relationship.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coding agents are becoming the new platform layer
&lt;/h2&gt;

&lt;p&gt;A few years ago, developer tooling centered on the IDE, the package manager, and the CI pipeline. Today, coding agents span all three.&lt;/p&gt;

&lt;p&gt;They can open files, modify code, run tests, call APIs, and manage multi-step workflows. By sitting inside the daily workflow of thousands of developers, the company behind the agent controls far more than simple autocomplete.&lt;/p&gt;

&lt;p&gt;This is why the industry is paying close attention. Coding agents are no longer valued merely as productivity helpers; they represent ownership of developer attention, workflows, and proprietary data.&lt;/p&gt;

&lt;p&gt;The market is crowding quickly. Anthropic has Claude Code, OpenAI has Codex, and Google is building its own alternatives. The pressure has moved from the agent itself to the surrounding stack, with everyone fighting for daily developer habits.&lt;/p&gt;

&lt;p&gt;For a broader look at this shift, our guides on &lt;a href="https://dev.to/blog/claude-code-for-teams"&gt;How to Use Claude Code with a Team: Shared Context, Permissions, and MCP&lt;/a&gt; and &lt;a href="https://dev.to/blog/coding-agent-best-practices-how-to-set-up-ai-agents-securely-and-productively"&gt;Coding Agent Best Practices&lt;/a&gt; cover the operational side of this trend.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it means for developers
&lt;/h2&gt;

&lt;p&gt;For developers, the outlook is mixed.&lt;/p&gt;

&lt;p&gt;On one hand, heavy investment brings faster product updates, more compute, better model quality, and tighter integration. If Cursor trains its own models to reduce reliance on third-party APIs, developers could see better speed, consistency, and pricing.&lt;/p&gt;

&lt;p&gt;But developers also care deeply about trust. Can the agent safely touch a production codebase? Can you audit its actions when something breaks? Can your team control its access and visibility?&lt;/p&gt;

&lt;p&gt;As agents become more central, these questions grow critical. A fast agent without governance isn't a helpful teammate-it is just a faster way to introduce risk.&lt;/p&gt;

&lt;p&gt;Security is a core product feature, not an afterthought. We explored this in &lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt; and &lt;a href="https://dev.to/blog/ai-coding-agent-deleted-production-database"&gt;An AI Coding Agent Deleted a Production Database&lt;/a&gt;. Powerful agents need tighter boundaries than traditional software tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the model question is the real story
&lt;/h2&gt;

&lt;p&gt;The model strategy is the real story here: if Cursor and SpaceX train more of their models in-house, they gain control over the core intelligence of the product, not just the user interface. While expensive, this is the only way to truly differentiate.&lt;/p&gt;

&lt;p&gt;This acquisition is a shortcut to vertical integration in the AI coding market. If every coding tool relies on the same foundation models, the product layer commoditizes quickly. The real competitive advantage shifts to distribution, workflow lock-in, and proprietary data. A custom, code-tuned model allows a company to capture far more of the value chain.&lt;/p&gt;

&lt;p&gt;For developers, this means better tools. For startups, it raises the bar. The next generation of coding assistants must offer deep workflow integration, robust security, or a distinct distribution advantage rather than just wrapping an existing API in a new UI.&lt;/p&gt;

&lt;h2&gt;
  
  
  The industry gets more serious, faster
&lt;/h2&gt;

&lt;p&gt;This deal signals that the coding agent category is maturing. Early-stage markets chase raw growth. Mature markets focus on infrastructure, control, and long-term economics. Cursor's acquisition shows we have entered this second phase.&lt;/p&gt;

&lt;p&gt;That has a few consequences:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Capital will shift toward coding infrastructure rather than flashy demos.&lt;/li&gt;
&lt;li&gt;Owning the model will become more important than simply accessing it.&lt;/li&gt;
&lt;li&gt;Enterprise buyers will demand stricter controls over permissions, logging, and secrets.&lt;/li&gt;
&lt;li&gt;Developers will expect agents to manage entire workflows, not just generate code snippets.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The market is shifting from cool features to an operating system for software development.&lt;/p&gt;

&lt;h2&gt;
  
  
  What teams should do now
&lt;/h2&gt;

&lt;p&gt;If you are building with coding agents, do not wait for the market to settle.&lt;/p&gt;

&lt;p&gt;Treat agents like critical infrastructure now. Give them scoped access, keep secrets out of their context, and separate routine coding from destructive actions. Require human approval for any production changes.&lt;/p&gt;

&lt;p&gt;Establishing these guardrails early allows you to adopt advanced agents safely. We outline this approach in &lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt; and &lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;AI Agent Secret Proxy&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For teams, teamcopilot.ai provides these shared workflows, permissions, and secret management out of the box, replacing unmanaged prompts with structured collaboration.&lt;/p&gt;

&lt;p&gt;The SpaceX and Cursor deal shows where the category is headed, with coding agents becoming strategic assets. The winning companies won't just offer better chat interfaces; they will provide superior workflows, tighter security controls, better data, and proprietary models. For developers, this means more powerful tools, but it also means the systems you use daily are becoming part of the software control plane. The more capable these agents become, the more carefully they must be managed. The future of AI coding is not just about what agents can write, but what they can safely own.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Why is the Cursor acquisition such a big deal?
&lt;/h3&gt;

&lt;p&gt;It shows that coding agents are transitioning from simple productivity add-ons to strategic platform assets embedded in the developer workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does this mean coding agents are becoming the new IDE?
&lt;/h3&gt;

&lt;p&gt;Not quite, but they are becoming the core intelligence layer within the IDE, handling an increasing share of the actual development work.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why would SpaceX want an AI coding company?
&lt;/h3&gt;

&lt;p&gt;To secure direct control over its software development pipeline, proprietary workflow data, and model training strategy.&lt;/p&gt;

&lt;h3&gt;
  
  
  What does a custom model change for coding agents?
&lt;/h3&gt;

&lt;p&gt;A custom, code-specific model improves execution speed, output quality, and cost efficiency while eliminating reliance on external API providers.&lt;/p&gt;

&lt;h3&gt;
  
  
  What does this mean for developers?
&lt;/h3&gt;

&lt;p&gt;Expect more capable tools alongside intense competition and pressure to integrate AI agents into daily workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Will coding agents replace software engineers?
&lt;/h3&gt;

&lt;p&gt;No. They accelerate drafting, refactoring, and testing, but they do not replace human engineering judgment.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the biggest risk with coding agents?
&lt;/h3&gt;

&lt;p&gt;Over-privileged access. An agent with too much visibility or authority can quickly introduce security vulnerabilities or disrupt production environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  How should teams use coding agents safely?
&lt;/h3&gt;

&lt;p&gt;Implement scoped permissions, mandatory approval gates, detailed audit logs, and secure secret management.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is this acquisition good or bad for the market?
&lt;/h3&gt;

&lt;p&gt;It drives rapid product innovation but also accelerates consolidation, which could lead to more closed ecosystems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where does teamcopilot.ai fit into this trend?
&lt;/h3&gt;

&lt;p&gt;teamcopilot.ai helps teams adopt coding agents safely by providing shared permissions, secret management, and structured workflows.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>news</category>
      <category>tooling</category>
    </item>
    <item>
      <title>How to Fine-Tune LLMs on Your Own Data: Open-Source Models, RL Environments, and Evals</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Mon, 15 Jun 2026 04:04:56 +0000</pubDate>
      <link>https://dev.to/rish_poddar/how-to-fine-tune-llms-on-your-own-data-open-source-models-rl-environments-and-evals-h35</link>
      <guid>https://dev.to/rish_poddar/how-to-fine-tune-llms-on-your-own-data-open-source-models-rl-environments-and-evals-h35</guid>
      <description>&lt;p&gt;If you use LLMs long enough, you hit the same wall.&lt;/p&gt;

&lt;p&gt;The frontier model is impressive, but it is not always the best model for your job. It may be too expensive. It may be too slow. It may be too general. And once you start asking it to follow your company’s rules, tone, domain language, and task structure, the gap between “smart” and “useful” gets obvious fast.&lt;/p&gt;

&lt;p&gt;That is where post-training comes in.&lt;/p&gt;

&lt;p&gt;The short version is this: if you have enough good data, you can often take an open-source model and make it better for your specific task than a much larger frontier model, while spending less to run it. Success requires the full loop of data, evals, and environments, rather than simple fine-tuning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why post-training matters
&lt;/h2&gt;

&lt;p&gt;Pre-trained models know a lot, but they lack context about your business, such as which form fields matter, which edge cases are acceptable, how your style guide looks, or how your internal tools behave when a field is missing. Prompting can help, but it has limits. Retrieval helps, but it does not change the model’s behavior. Post-training does.&lt;/p&gt;

&lt;p&gt;That is why a smaller open-source model can beat a giant general model on a narrow task. Once you train on the right examples, the model starts behaving like a specialist instead of a smart generalist.&lt;/p&gt;

&lt;p&gt;This pattern is showing up everywhere now, with vendors pushing fine-tuning on open-source models, research teams using evaluation harnesses as reward signals, and open-source RL libraries making the entire process much less mysterious.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start with supervised fine-tuning
&lt;/h2&gt;

&lt;p&gt;For most teams, supervised fine-tuning is the right first step.&lt;/p&gt;

&lt;p&gt;You collect prompt-response pairs from your own data, clean them up, and train the model to imitate the answers you actually want. If your task is classification, structured extraction, support replies, code review comments, or domain-specific writing, SFT often gives the quickest improvement.&lt;/p&gt;

&lt;p&gt;The important part is data quality. A few hundred excellent examples usually matter more than a mountain of noisy ones. Your target outputs should look like the real thing. If your best internal answer is short and direct, avoid training on long, polished prose, and make sure to preserve any strict formatting required by your workflow.&lt;/p&gt;

&lt;p&gt;A fine-tuned open-source model that knows your task can be much cheaper to serve than calling a frontier model every time, although frontier models still make sense where they are worth the extra spend.&lt;/p&gt;

&lt;h2&gt;
  
  
  Add RL when the task has a clear signal
&lt;/h2&gt;

&lt;p&gt;Fine-tuning gets you the basic behavior. Reinforcement learning can push things further when the task has a clean reward signal.&lt;/p&gt;

&lt;p&gt;That reward signal does not need to be abstract. It can be concrete and mechanical, such as checking whether the generated SQL ran, the code passed tests, the agent completed the workflow, or the answer matched a known correct output. The best RL setups are often the ones where success can be checked automatically.&lt;/p&gt;

&lt;p&gt;This is why RL works well for tool use, coding, and agent workflows. You can build a small environment, let the model act in it, and score the outcome. When the model takes the wrong path, the environment flags it, whereas a reliable solution earns a positive reward.&lt;/p&gt;

&lt;p&gt;The catch is that RL is only as good as the signal you give it. If the reward is sloppy, the model learns to game the reward instead of solving the task. Instead of starting with RL because it sounds impressive, only use it when the task actually deserves a structured reward system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Treat RL environments as part of the product
&lt;/h2&gt;

&lt;p&gt;This is the part people skip.&lt;/p&gt;

&lt;p&gt;An RL environment is not just a training toy. It is the place where the model proves it can do the job. If you want an agent to use tools, follow procedures, or complete multi-step work, the environment has to resemble the real task closely enough that success means something. This usually requires:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;realistic inputs&lt;/li&gt;
&lt;li&gt;deterministic graders where possible&lt;/li&gt;
&lt;li&gt;frozen fixtures for external data&lt;/li&gt;
&lt;li&gt;held-out tasks the model has not seen before&lt;/li&gt;
&lt;li&gt;clear pass/fail rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you train on a live system and evaluate on the same live system, you can fool yourself. A frozen environment with stable checks is much better for learning whether the model is actually improving or just exploiting quirks.&lt;/p&gt;

&lt;p&gt;This matters for team products too. If your internal agent is going to make decisions, fill forms, or act on shared workflows, the training setup should look like the workflow people will actually use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use evals before, during, and after training
&lt;/h2&gt;

&lt;p&gt;Evals are not a final checkpoint; they keep you honest. Initial evaluations highlight the model's weaknesses, while checks during training show if you are moving in the right direction, and final tests reveal whether the new model is actually better or just broken in new ways.&lt;/p&gt;

&lt;p&gt;A good eval suite usually mixes a few types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;golden-answer tasks for exact correctness&lt;/li&gt;
&lt;li&gt;rubric-based scoring for subjective output&lt;/li&gt;
&lt;li&gt;task completion checks for agents and workflows&lt;/li&gt;
&lt;li&gt;regression tests for the weird edge cases that already hurt you once&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The best evals are specific to your use case. When fine-tuning a support model, you should measure policy compliance and escalation paths rather than just fluency, just as training a coding model requires running the tests instead of merely checking code style.&lt;/p&gt;

&lt;p&gt;One useful pattern is to turn your eval harness into a reward source. When the evaluator is good enough, it can guide both selection and RL. That gives you a much tighter loop than guessing from model output alone.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why open-source models often win on ROI
&lt;/h2&gt;

&lt;p&gt;This is where the economics start to matter.&lt;/p&gt;

&lt;p&gt;Frontier models are strong, but they come with recurring usage costs and less control over deployment. Open-source models give you more room to shape behavior, run locally or privately, and keep serving costs under control. If the task is narrow enough, that tradeoff can be excellent.&lt;/p&gt;

&lt;p&gt;You also get more leverage from your own data. Once you have a decent training set, every improvement compounds. Better data makes better fine-tuning. Better fine-tuning makes better evals. Better evals make better RL. And the cycle keeps tightening.&lt;/p&gt;

&lt;p&gt;That is why “use the biggest model” is not the right default. The better question is whether the task is worth specializing. If it is, an open-source model on your data often gives you better performance per dollar.&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical workflow
&lt;/h2&gt;

&lt;p&gt;If you want to do this well, keep the sequence boring:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define the task clearly.&lt;/li&gt;
&lt;li&gt;Collect a clean dataset from real examples.&lt;/li&gt;
&lt;li&gt;Build evals before training anything.&lt;/li&gt;
&lt;li&gt;Start with supervised fine-tuning.&lt;/li&gt;
&lt;li&gt;Add RL only when the environment and reward are solid.&lt;/li&gt;
&lt;li&gt;Re-run the evals and compare against the baseline.&lt;/li&gt;
&lt;li&gt;Deploy only after you can explain why the new model is better.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;While this approach isn't flashy, it works, and it fits team workflows better than one-off prompting. TeamCopilot.ai provides this structure for broader agent workflows, making the system repeatable, auditable, and safe enough for a team to rely on.&lt;/p&gt;

&lt;p&gt;If you want a related angle, &lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt; and &lt;a href="https://dev.to/blog/coding-agent-best-practices-how-to-set-up-ai-agents-securely-and-productively"&gt;Coding Agent Best Practices: How to Set Up AI Agents Securely and Productively&lt;/a&gt; are useful companions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this breaks down
&lt;/h2&gt;

&lt;p&gt;Post-training is not magic. It works best when the task is stable and the data is good. It works less well when the problem changes every week or the label quality is weak.&lt;/p&gt;

&lt;p&gt;It also does not remove the need for a strong fallback model. Sometimes the best setup is a specialized open-source model for the common path and a frontier model for the weird edge cases. That hybrid setup is often the most practical one.&lt;/p&gt;

&lt;p&gt;The real mistake is treating model choice like a religion. Instead, use the smallest model that does the job, fine-tune it on your data, measure the results honestly, and keep the option that performs best rather than the one that is newest.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is post-training in LLMs?
&lt;/h3&gt;

&lt;p&gt;Post-training is everything you do after pre-training to make a model more useful for a specific task. That includes supervised fine-tuning, preference optimization, reinforcement learning, and similar adaptation methods.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is fine-tuning always better than prompting?
&lt;/h3&gt;

&lt;p&gt;No. Prompting is faster to try and often good enough for small tasks. Fine-tuning becomes worth it when you need consistent behavior, lower latency, lower cost, or better results on your own data.&lt;/p&gt;

&lt;h3&gt;
  
  
  When should I use RL instead of supervised fine-tuning?
&lt;/h3&gt;

&lt;p&gt;Use RL when you can define a reliable reward signal or a clear success condition. If the task has a measurable outcome, RL can help push the model beyond imitation.&lt;/p&gt;

&lt;h3&gt;
  
  
  What makes a good RL environment?
&lt;/h3&gt;

&lt;p&gt;A good RL environment mirrors the real task closely, has clear grading, uses deterministic fixtures when possible, and avoids hidden shortcuts that let the model game the reward.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why are evals so important?
&lt;/h3&gt;

&lt;p&gt;Because they tell you whether the model actually got better. Without evals, training turns into guesswork. With good evals, you can compare models, catch regressions, and decide whether the change was worth it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can an open-source model really beat a frontier model?
&lt;/h3&gt;

&lt;p&gt;Yes, on a narrow task with good data, it often can. The smaller model may be worse in general, but better on your specific workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is this cheaper than using frontier models?
&lt;/h3&gt;

&lt;p&gt;Usually, yes, once the model is trained and deployed at scale. You pay upfront for data and training, but ongoing inference can be much cheaper.&lt;/p&gt;

&lt;h3&gt;
  
  
  What kind of data do I need?
&lt;/h3&gt;

&lt;p&gt;You need real examples of the task you want the model to do. Clean prompt-response pairs for SFT, plus outcome data or verifier logic if you want RL.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do I need a huge dataset?
&lt;/h3&gt;

&lt;p&gt;Not always. Good data matters more than a huge dataset. A smaller, well-curated set often beats a large noisy one.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where does TeamCopilot.ai fit in?
&lt;/h3&gt;

&lt;p&gt;TeamCopilot.ai is useful when you want the surrounding process to stay controlled. If your team is building or operating AI workflows, it helps keep permissions, approvals, and automation structured instead of ad hoc.&lt;/p&gt;

&lt;h3&gt;
  
  
  Should I ever keep using frontier models?
&lt;/h3&gt;

&lt;p&gt;Absolutely. Frontier models still make sense for hard reasoning, broad coverage, or tasks that change too fast to justify training. The point is to use them where they earn their cost.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>machinelearning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Anthropic's Fable 5 Block Is a Reminder to Pick the Smallest Model That Passes</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Sun, 14 Jun 2026 04:58:36 +0000</pubDate>
      <link>https://dev.to/rish_poddar/anthropics-fable-5-block-is-a-reminder-to-pick-the-smallest-model-that-passes-15di</link>
      <guid>https://dev.to/rish_poddar/anthropics-fable-5-block-is-a-reminder-to-pick-the-smallest-model-that-passes-15di</guid>
      <description>&lt;p&gt;The sudden block of Anthropic's Fable 5 shows how vulnerable modern software is when it quietly depends on a single external model.&lt;/p&gt;

&lt;p&gt;A frontier model launched, gained rapid adoption, and was suddenly restricted by a government order. While the political details and technical claims remain highly contested, the operational lesson is clear: access is never guaranteed, and raw capability does not make a model the right choice.&lt;/p&gt;

&lt;p&gt;Instead of asking for the most powerful model available, teams should ask for the smallest model that passes their evaluations for a specific task.&lt;/p&gt;

&lt;h2&gt;
  
  
  What happened
&lt;/h2&gt;

&lt;p&gt;On June 12, 2026, the U.S. government reportedly ordered Anthropic to restrict access to Fable 5 and Mythos 5 for foreign nationals, citing national security concerns. Anthropic responded by disabling access more broadly to ensure compliance. The lack of public technical details makes this incident particularly notable. Even for a prominent company like Anthropic, model access can vanish overnight when policy, national security, and export controls collide.&lt;/p&gt;

&lt;p&gt;If you want the background on the model itself, see &lt;a href="https://dev.to/blog/what-is-claude-fable-5-capabilities-benchmarks-pricing-and-how-to-access-it"&gt;What Is Claude Fable 5? Capabilities, Benchmarks, Pricing, and How to Access It&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;Most teams treat model selection as a capability problem, comparing benchmarks and context windows before picking the strongest option. While this approach works for demos, production systems require a different standard. In a real workflow, unnecessary capability brings extra cost, latency, variability, and risk. If a smaller model can do the job, a larger one only increases your potential blast radius. This is especially true for agents handling files, tools, and credentials; narrow tasks require a model that reliably meets the requirements rather than one that merely excels on generic benchmarks.&lt;/p&gt;

&lt;p&gt;That same mindset shows up in &lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt; and &lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt;. The model is only one part of the system. Governance matters just as much.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the smallest passing model is usually the right one
&lt;/h2&gt;

&lt;p&gt;Choosing the smallest model that clears your evaluations offers several practical advantages. First, smaller models lower operational costs and run faster, which directly improves the user experience. They also reduce risk; with less unnecessary general capability, there are fewer opportunities for unexpected behavior. While this doesn't make them safe by default, it limits the potential damage. Finally, smaller models are much easier to replace. If a provider changes its pricing, policies, or access terms, a team using a smaller, highly targeted model can migrate far more easily. The Fable 5 block proves that even an excellent model can be an unreliable dependency.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to choose the smallest model that works
&lt;/h2&gt;

&lt;p&gt;Finding the right model requires a structured evaluation set rather than intuition. Start by gathering real examples from your target task, such as actual support tickets for classification, real notes for summarization, or production-grade workflows for action-taking.&lt;/p&gt;

&lt;p&gt;Use these to build a compact evaluation set containing normal examples, edge cases, ambiguities, historical production failures, and a few deliberately difficult scenarios. Once you run this set against several models, avoid chasing the highest score at all costs; instead, aim for the smallest model that meets your acceptance threshold.&lt;/p&gt;

&lt;p&gt;In practice, you should measure latency, cost, pass rates, and the types of mistakes the model makes. If two models both pass, choose the smaller one. If the smaller model barely squeaks by, keep it under review and add more difficult examples to your evaluation set. Anthropic itself advocates for starting with small, realistic test sets. Defining success first and iterating is far more effective than starting with the largest model and hoping brute force compensates for a poorly defined task.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a good eval set looks like
&lt;/h2&gt;

&lt;p&gt;A good evaluation set is boring in the best way: close to your real task, stable enough to rerun, and small enough for manual inspection. Avoid building a massive benchmark before you even have a working workflow. A set of 20 to 50 carefully chosen examples is often plenty to make a clear decision early on. The most useful test cases usually come from real mistakes, like a misread document, a wrong routing decision, or a failed tool call. Turning these failures into tests is far more valuable than using generic prompts from a benchmark blog post. A task is simply not ready for production until you can explain exactly why the model passed.&lt;/p&gt;

&lt;h2&gt;
  
  
  This is also a governance problem
&lt;/h2&gt;

&lt;p&gt;Model selection is often treated like an engineering detail, but it is not. The model you choose dictates what your system can do, what it is allowed to access, and how much damage it can cause if it fails. This is why permissions, approvals, and audit trails are critical once models handle real work. A system like teamcopilot.ai helps keep these choices inside an environment where access, approvals, and secrets are managed properly. The goal is to make AI usable without turning every model choice into a risk multiplier.&lt;/p&gt;

&lt;h2&gt;
  
  
  The practical takeaway
&lt;/h2&gt;

&lt;p&gt;The Fable 5 block reminds us that frontier models can be impressive yet unstable dependencies, and that most tasks simply do not require the largest model available. To build a durable setup, follow these steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define the task clearly.&lt;/li&gt;
&lt;li&gt;Build a small eval set from real examples.&lt;/li&gt;
&lt;li&gt;Test multiple models, starting from the smallest plausible option.&lt;/li&gt;
&lt;li&gt;Pick the smallest model that passes.&lt;/li&gt;
&lt;li&gt;Re-run the evals whenever you change the task or the model.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This process takes slightly longer upfront, but it is much faster than debugging a bad production choice later.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The Fable 5 Incident and Frontier Model Risks
&lt;/h3&gt;

&lt;p&gt;Anthropic disabled access to Fable 5 following a government order tied to national security concerns and export controls. While the public explanation remains thin, this event highlights the inherent risks of relying solely on frontier models for production. This does not mean frontier models are too risky to use entirely, but they should be selected carefully and earn their spot through rigorous testing rather than default assumptions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Designing and Scaling Your Evaluation Sets
&lt;/h3&gt;

&lt;p&gt;An evaluation set is a curated group of test examples representing the task you want the model to handle, allowing you to compare models consistently. When starting out, keep the set small - 20 to 50 real examples are often better than a massive synthetic benchmark. Your set should include ordinary cases, edge cases, historical production failures, and a few intentionally difficult scenarios. If you don't have an evaluation set yet, start by turning your recent production failures, bad outputs, or support escalations into your first test cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Choosing the Right Model and Measuring Success
&lt;/h3&gt;

&lt;p&gt;You should measure latency, cost, and the types of mistakes the model makes. Not necessarily - use the smallest model that passes your task requirements. If a smaller model fails important edge cases, move up the capability ladder until it passes. If the task changes or grows more complex over time, simply re-run your evaluation set to ensure your chosen model still fits.&lt;/p&gt;

&lt;h3&gt;
  
  
  Governance, Safety, and Control
&lt;/h3&gt;

&lt;p&gt;Cost is important, but control is often the primary driver. Smaller models are easier to justify, replace, and operate safely. However, safety still depends heavily on permissions, approvals, and what the model is allowed to touch. For sensitive workflows, your evaluations must be stricter and your guardrails stronger, ensuring that model selection and access control are designed together from the start.&lt;/p&gt;

&lt;h3&gt;
  
  
  Further Reading
&lt;/h3&gt;

&lt;p&gt;To learn more about managing models and agents, read &lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt;, &lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt;, and &lt;a href="https://dev.to/blog/anthropic-and-self-improving-ai"&gt;Anthropic and Self-Improving AI&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>news</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Is Siri AI? How Apple's Voice Assistant Really Works</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Fri, 12 Jun 2026 04:09:15 +0000</pubDate>
      <link>https://dev.to/rish_poddar/is-siri-ai-how-apples-voice-assistant-really-works-4j5</link>
      <guid>https://dev.to/rish_poddar/is-siri-ai-how-apples-voice-assistant-really-works-4j5</guid>
      <description>&lt;p&gt;Apple finally gave Siri the kind of upgrade people have been asking for, on and off, for years.&lt;/p&gt;

&lt;p&gt;The new Siri AI is not just better speech recognition or a slightly smarter search box. Apple says it can understand what is on your screen, use personal context across messages and email, answer questions from the web, and take actions across apps. That moves Siri from a voice shortcut into something that looks a lot more like a real assistant.&lt;/p&gt;

&lt;p&gt;That matters because Siri has always been one of the most visible consumer AI products on earth. When Apple changes Siri, it changes what a lot of people think an assistant should be able to do.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed in practice
&lt;/h2&gt;

&lt;p&gt;At WWDC 2026, Apple introduced Siri AI as a rebuilt assistant powered by Apple Intelligence. The changes are pretty straightforward:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It can use personal context to find things in your messages, emails, photos, and other apps.&lt;/li&gt;
&lt;li&gt;It has onscreen awareness, so it can answer questions about what you are currently looking at.&lt;/li&gt;
&lt;li&gt;It can reach out to the web for up-to-date answers.&lt;/li&gt;
&lt;li&gt;It can perform more systemwide actions across apps.&lt;/li&gt;
&lt;li&gt;It now has a dedicated app for conversation history.&lt;/li&gt;
&lt;li&gt;It includes expanded writing and editing tools.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Apple also leaned hard on privacy. Siri AI runs through Apple’s on-device and Private Cloud Compute architecture, which is the company’s way of saying it wants the assistant to stay useful without becoming a data leak.&lt;/p&gt;

&lt;p&gt;That privacy angle is the interesting part. Most AI products get better by seeing more. Apple is trying to get more capable while seeing less.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this feels different
&lt;/h2&gt;

&lt;p&gt;Old Siri was mostly a command layer. You asked for a timer, a reminder, a weather check, or a quick lookup. It was useful, but it stayed in a small lane.&lt;/p&gt;

&lt;p&gt;Siri AI is trying to do something broader. If a friend texts you a restaurant recommendation, Siri should be able to find it. If you are looking at a message about a trip, it should help you act on it. If you are writing, it should help draft and edit in context.&lt;/p&gt;

&lt;p&gt;That is a very different product shape.&lt;/p&gt;

&lt;p&gt;Instead of just answering questions, the assistant can now handle multi-step tasks for you—like finding a flight confirmation in your email and adding it directly to your calendar without you having to copy and paste.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why teams should care
&lt;/h2&gt;

&lt;p&gt;While the Siri update is a consumer story, the lesson is bigger than Apple: as assistants get more powerful, the hard part stops being raw intelligence and becomes control. Who can the assistant see? What can it touch? When should it ask for approval? How do you keep secrets safe? How do you know what it did later?&lt;/p&gt;

&lt;p&gt;That is the same reason TeamCopilot exists. Teams do not just need a smart assistant. They need a shared assistant they can trust, with permissions, approvals, workflows, and secret handling built in.&lt;/p&gt;

&lt;p&gt;Once AI starts acting on your behalf, governance stops being a nice-to-have. If you want to see how to manage these permissions and security risks in your own team, these resources are a good place to start:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ai-agent-governance-control-plane"&gt;AI Agent Governance Is the New Enterprise Control Plane&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ai-agent-secret-proxy"&gt;Why Your AI Agent Should Never See Your API Keys&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/blog/ai-agent-platform-comparison"&gt;Best AI Agent Platforms for Teams in 2026: Comparing 13 Tools&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What to watch next
&lt;/h2&gt;

&lt;p&gt;Siri AI's real test lies in daily use rather than keynote demos. Will people trust it to search across personal data? Will it stay fast enough to feel natural? Will it actually replace some of the tiny tasks people do every day? And will Apple keep the privacy promise intact as the assistant gets more capable?&lt;/p&gt;

&lt;p&gt;Those questions matter because they will shape the rest of the market too. If Apple makes a privacy-first assistant feel genuinely useful, it will push every other assistant maker to explain how they handle memory, context, and action.&lt;/p&gt;

&lt;p&gt;That pressure helps users, but it also helps set a better baseline for teams building internal agents. The future belongs to systems designed to act safely.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this means for TeamCopilot
&lt;/h2&gt;

&lt;p&gt;Siri AI is a good reminder that people do not want tools that only answer questions; they want tools that understand context, take action, and stay out of the way. But once an assistant can do more, it also needs more guardrails.&lt;/p&gt;

&lt;p&gt;That is the gap TeamCopilot is built for. Shared skills, workflows, secret management, and approval controls let teams use AI agents without handing them unchecked access.&lt;/p&gt;

&lt;p&gt;The real challenge isn't just making Siri smarter—it's building an assistant that users and teams can actually trust with their sensitive data.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is Siri AI now?
&lt;/h3&gt;

&lt;p&gt;Yes. Apple has rebuilt Siri around Apple Intelligence, with personal context, onscreen awareness, web answers, and app actions.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is new in Siri AI?
&lt;/h3&gt;

&lt;p&gt;The big changes are contextual understanding, better conversation, a dedicated app, visual intelligence, and stronger writing tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Siri AI use personal data?
&lt;/h3&gt;

&lt;p&gt;It can use personal context from apps like Messages, Mail, and Photos, but Apple says it does so through privacy-preserving architecture and on-device processing where possible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Siri AI the same as ChatGPT or Gemini?
&lt;/h3&gt;

&lt;p&gt;Not really. Siri AI is Apple’s own assistant layer, built into the operating system and designed around Apple hardware and privacy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why does Siri AI matter for businesses?
&lt;/h3&gt;

&lt;p&gt;It shows that assistants are moving from simple chat into real action. That is the same shift businesses face when they adopt AI agents internally.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the biggest risk with more powerful AI assistants?
&lt;/h3&gt;

&lt;p&gt;The biggest risk is overreach. If an assistant can see too much or act too freely, it can create privacy, security, and reliability problems.&lt;/p&gt;

&lt;h3&gt;
  
  
  How is TeamCopilot different from Siri AI?
&lt;/h3&gt;

&lt;p&gt;Siri AI is a personal consumer assistant built into Apple devices. TeamCopilot is a shared team agent with skills, workflows, approvals, and secret controls for business use.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the main takeaway from Apple's Siri AI launch?
&lt;/h3&gt;

&lt;p&gt;The main takeaway is that as assistants gain the power to act on our behalf, security and trust become just as important as capability.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>ios</category>
      <category>news</category>
      <category>nlp</category>
    </item>
    <item>
      <title>How to Humanize AI Text Without Sounding Robotic</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Fri, 12 Jun 2026 04:00:55 +0000</pubDate>
      <link>https://dev.to/rish_poddar/how-to-humanize-ai-text-without-sounding-robotic-2aio</link>
      <guid>https://dev.to/rish_poddar/how-to-humanize-ai-text-without-sounding-robotic-2aio</guid>
      <description>&lt;p&gt;AI text usually sounds robotic for the same boring reasons. The sentences are too even. The transitions are too polished. The wording is technically fine, but it never quite sounds like a person.&lt;/p&gt;

&lt;p&gt;So the fix is not just "make this sound human." You need a method that checks the draft, spots the AI tells, and rewrites it in passes until it reads naturally.&lt;/p&gt;

&lt;p&gt;That is what the code below does.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "humanized" actually means
&lt;/h2&gt;

&lt;p&gt;Humanized text is not slangy or messy. It keeps the meaning intact, but it sounds like a real person wrote it once instead of a model polishing it three times.&lt;/p&gt;

&lt;p&gt;In practice, that means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;sentence lengths vary&lt;/li&gt;
&lt;li&gt;ideas do not all arrive in the same shape&lt;/li&gt;
&lt;li&gt;the tone matches the audience&lt;/li&gt;
&lt;li&gt;the wording is specific instead of generic&lt;/li&gt;
&lt;li&gt;the draft does not lean on obvious AI crutches like repetitive transitions, fake nuance, or overclean symmetry&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The code at a glance
&lt;/h2&gt;

&lt;p&gt;The code takes a long input string and runs it through an iterative loop.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;It splits the text into markdown chunks using top-level &lt;code&gt;##&lt;/code&gt; headings.&lt;/li&gt;
&lt;li&gt;It runs a detector prompt on each chunk in parallel.&lt;/li&gt;
&lt;li&gt;It collects the strongest AI-writing signals.&lt;/li&gt;
&lt;li&gt;It rewrites the full original text using those findings.&lt;/li&gt;
&lt;li&gt;It repeats until the text looks clean or it hits the max iteration count.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The code does not guess from vibes. It asks the model to point at concrete snippets, explain why they sound AI-like, and say what should change.&lt;/p&gt;

&lt;h2&gt;
  
  
  The full workflow code
&lt;/h2&gt;

&lt;p&gt;This is the actual implementation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;uuid&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;as_completed&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pathlib&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;


&lt;span class="n"&gt;MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-3.5-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;INPUT_COST_PER_1M&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.50&lt;/span&gt;
&lt;span class="n"&gt;OUTPUT_COST_PER_1M&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;9.00&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_local_env_var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;env_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.env&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;KeyError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;
            &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;key_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'"'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;

    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;KeyError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;load_local_env_var&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;genai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;OUTPUT_DIR&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__file__&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Snippet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Short excerpt from the text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Why the excerpt reads AI-like&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;fix&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What to change&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DetectionResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;has_issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Whether the text has AI-writing issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;snippets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Snippet&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;default_factory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_length&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RewriteResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;rewritten_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The full rewritten text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_usage_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;estimate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt_tokens&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;1_000_000.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;INPUT_COST_PER_1M&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_tokens&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;1_000_000.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;OUTPUT_COST_PER_1M&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;trim_to_first_words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;word_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;word_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;word_count&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;split_into_chunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;splitlines&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^##\s+&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
            &lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call_gemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;models&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GenerateContentConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;top_p&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;response_mime_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;response_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;raw_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Gemini returned empty text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_validate_json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;usage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;usage_metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;prompt_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_usage_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_token_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_token_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;output_tokens&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_usage_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;candidates_token_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output_token_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response_token_count&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;estimate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_tokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;raw_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;output_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_detect_single_chunk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;detection_raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_output_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;call_gemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;build_detection_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_text&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;DetectionResult&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;chunk_snippets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snippet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;trim_to_first_words&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;detection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;snippets&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_snippets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_cost&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_output_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_snippets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_issues_in_chunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;detect_issues_in_chunks_parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_issues_in_chunks_silent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;detect_issues_in_chunks_parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_issues_in_chunks_parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;all_snippets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;seen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;total_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
    &lt;span class="n"&gt;total_chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

    &lt;span class="n"&gt;max_workers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;futures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_detect_single_chunk&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_text&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
        &lt;span class="n"&gt;chunk_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_snippets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_cost&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_output_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;snippet_count&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;=== Chunk &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; Detection Result ===&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;detection_raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Chunk &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; detection cost: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;detection_cost&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;detection_prompt_tokens&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; input tokens, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;detection_output_tokens&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; output tokens)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;chunk_snippets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_snippets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ensure_ascii&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunk_snippets&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snippet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;
            &lt;span class="n"&gt;seen&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;all_snippets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;total_cost&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;detection_cost&lt;/span&gt;
        &lt;span class="n"&gt;total_chunks&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;all_snippets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_cost&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_chunks&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_iterations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;current_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;input_text&lt;/span&gt;
    &lt;span class="n"&gt;total_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
    &lt;span class="n"&gt;iteration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="n"&gt;first_iteration_issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;forced_rewrite_done&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;iteration&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_iterations&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;split_into_chunks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Iteration &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; chunk(s)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;all_snippets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detection_cost&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detect_issues_in_chunks_parallel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;iteration&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;first_iteration_issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;all_snippets&lt;/span&gt;

        &lt;span class="n"&gt;has_issues&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;all_snippets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Iteration &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;all_snippets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; issue(s) found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;iteration_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;detection_cost&lt;/span&gt;

        &lt;span class="n"&gt;should_force_rewrite&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;forced_rewrite_done&lt;/span&gt;
        &lt;span class="n"&gt;should_rewrite&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;has_issues&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;should_force_rewrite&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;should_rewrite&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;total_cost&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;iteration_cost&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Iteration &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; total cost: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;iteration_cost&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="n"&gt;rewrite_raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewrite_output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewrite_prompt_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewrite_output_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewrite_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;call_gemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nf"&gt;build_rewrite_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;all_snippets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;response_model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;RewriteResult&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;current_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rewrite_output&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rewritten_text&lt;/span&gt;
        &lt;span class="n"&gt;forced_rewrite_done&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
        &lt;span class="n"&gt;iteration_cost&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;rewrite_cost&lt;/span&gt;
        &lt;span class="n"&gt;total_cost&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;iteration_cost&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Iteration &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: rewrite applied&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;current_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_cost&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iterations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;iteration&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;MODEL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;first_iteration_issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;first_iteration_issues&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;




&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_detection_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Audit the text for AI-writing tells.

Focus on concrete signals, not vague vibes:

- Em dash overuse: repeated `—`, lists built around em dashes, em dashes used as a default stylistic crutch, or multiple paragraphs that lean on em dashes for rhythm.
- Contrast formulas: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s X, not Y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;not only X, but also Y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;this isn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t just X, it&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s Y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;the real answer isn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t X&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.
- Meta-openers: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;what usually gets skipped here is...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;what people often miss is...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;the part people forget is...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;what matters most is...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;what gets overlooked is...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;this is exactly the sort of...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;this is basically why...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;this is the kind of...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.
- One-line paragraph patterns: lots of short single-sentence paragraphs in a row.
- Paragraph stubs: a paragraph that starts with a short setup sentence and then immediately follows with a tiny paragraph of 2-5 words, or a paragraph that is itself only 2-5 words long.
- Outline scaffolding: numbered headings or label-only sections like `### 1. Security`, `### 2. Reliability`, or repeated short section labels that read like an outline instead of prose.
- Choppy fragmentation: paragraphs with 2-4 very short sentences that all carry pieces of one idea and could naturally become a single sentence. Prioritize this signal even if the paragraph also contains a meta-opener.
- Template transitions: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;that works, until it doesn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;in today&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s world&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;at the end of the day&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ultimately&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;overall&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.
- Symmetry: repeated sentence openings, repeated clause shapes, evenly balanced comparisons, or neat bullet/list structures that feel engineered.
- Generic polish: buzzwords, hedging, fake nuance, or conclusion sentences that sound like a marketing page.
- Sycophancy: excessive praise toward the user, flattering language, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;you nailed it&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;great question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;this is exactly the right approach&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, or over-agreeable validation that adds no substance.
- Over-polished cadence: text that is technically clean but too symmetrical, too tidy, too templated, or too optimized for scannability.
- Repetitive feature-list formatting: bold labels, em-dash lists, or repeated &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;X — Y&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; constructions that create a slide-deck feel.

Return JSON only in this exact shape:
{{
  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;has_issues&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: true|false,
  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snippets&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: [
    {{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snippet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;short identifying excerpt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;short reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;short fix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;}}
  ]
}}

Rules:
- Return at most 5 snippets.
- Snippets must be short. Use only the first few words needed to identify the line.
- Keep `reason` and `fix` concise.
- If the text is natural, return `has_issues: false` and `snippets: []`.
- Prefer the strongest signals only.
- If a paragraph splits one idea across multiple short sentences, flag that fragmentation even if each sentence is individually understandable.
- If a paragraph is just 2-5 words, always flag it.

Text:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;build_rewrite_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;issues_json&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ensure_ascii&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;rewrite_style_block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Additional rewrite style guidance:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Rewrite the text so it sounds human and natural while preserving meaning, facts, proper nouns, and formatting that still helps the content.

Use these rules:
- Direct over ornate.
- Specific over vague.
- Mix short and long sentences.
- Prefer simple verbs and concrete nouns.
- Delete fluff instead of disguising it.
- Avoid robotic symmetry, generic closings, and over-polished transitions.
- Keep the level of formality appropriate to the original content.

Address these flagged issues:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;issues_json&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;rewrite_style_block&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;

Return valid JSON only with this schema:
{{
  &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rewritten_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;the full rewritten text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
}}

Original text:
&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;original_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Humanise text with iterative Gemini rewrites.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--input_text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The text to humanise&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--max_iterations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Maximum rewrite/detection loops to run&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--rewrite_style&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Additional style guidance for the rewrite prompt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;process_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_iterations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_iterations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rewrite_style&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verbose&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;OUTPUT_DIR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mkdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;output_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;OUTPUT_DIR&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_rewrite_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nb"&gt;hex&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;console_summary&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output_path&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iterations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iterations&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;console_summary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ensure_ascii&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SystemExit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SystemExit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How the code works
&lt;/h2&gt;

&lt;p&gt;The code looks long, but the logic is simple.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. It loads Gemini securely
&lt;/h3&gt;

&lt;p&gt;The workflow expects &lt;code&gt;GEMINI_API_KEY&lt;/code&gt; to be present in the runtime environment. If it is not there, it tries to read a local &lt;code&gt;.env&lt;/code&gt; file inside the workflow folder.&lt;/p&gt;

&lt;p&gt;That matters because TeamCopilot workflows are meant to run with secrets injected safely, not pasted into prompts or hardcoded into source.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. It uses structured outputs
&lt;/h3&gt;

&lt;p&gt;Both the detector and the rewrite step use Pydantic models.&lt;/p&gt;

&lt;p&gt;That means the model cannot just return a messy blob of text. It has to return valid JSON in the shape the workflow expects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;DetectionResult&lt;/code&gt; gives &lt;code&gt;has_issues&lt;/code&gt; plus up to five flagged snippets&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RewriteResult&lt;/code&gt; gives the full rewritten text&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It makes the workflow predictable, which is what you want if other people are going to reuse the output.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. It scans in chunks
&lt;/h3&gt;

&lt;p&gt;The workflow splits the input on top-level &lt;code&gt;##&lt;/code&gt; headings so it can inspect each section separately.&lt;/p&gt;

&lt;p&gt;That helps because long blog drafts usually mix clean sections with sections that still sound synthetic. Chunking keeps the detector focused instead of asking it to judge a giant wall of text all at once.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. It runs detection in parallel
&lt;/h3&gt;

&lt;p&gt;Each chunk is checked in parallel with a thread pool, which keeps the process moving on longer drafts.&lt;/p&gt;

&lt;p&gt;The detector is not just looking for one thing. It flags patterns like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;em dash overuse&lt;/li&gt;
&lt;li&gt;contrast formulas like "it's X, not Y"&lt;/li&gt;
&lt;li&gt;meta-openers like "what people often miss is"&lt;/li&gt;
&lt;li&gt;overclean section scaffolding&lt;/li&gt;
&lt;li&gt;repetitive sentence shapes&lt;/li&gt;
&lt;li&gt;generic closing lines&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5. It rewrites only when needed
&lt;/h3&gt;

&lt;p&gt;If the detector finds issues, the workflow rewrites the full original text using those findings.&lt;/p&gt;

&lt;p&gt;If it finds nothing, it stops early.&lt;/p&gt;

&lt;p&gt;There is one exception. If you pass a non-empty &lt;code&gt;rewrite_style&lt;/code&gt;, the workflow forces at least one rewrite pass even if the detector says the text is clean. That is useful when you want a specific voice or house style layered in.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. It writes the final result to disk
&lt;/h3&gt;

&lt;p&gt;When the loop is done, the workflow writes the final rewrite to a file in &lt;code&gt;data/&lt;/code&gt; and prints a small JSON summary to stdout.&lt;/p&gt;

&lt;p&gt;That summary includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;output_path&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;iterations&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;total_cost&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;model&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to use it
&lt;/h2&gt;

&lt;p&gt;In TeamCopilot, the workflow takes these inputs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;input_text&lt;/code&gt;: the text you want humanized&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;max_iterations&lt;/code&gt;: how many detect and rewrite loops to run, default &lt;code&gt;10&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;rewrite_style&lt;/code&gt;: optional style guidance that forces at least one rewrite pass&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A simple run looks like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"input_text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AI tools are changing how teams work. We need to use these tools to make our workflows faster and more efficient, which ultimately helps us deliver better results."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_iterations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rewrite_style"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Keep it warm, practical, and a little conversational."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The workflow might return a summary like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"output_path"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"data/final_rewrite_7a1d8f2c3d9e4b31b4d2d1ef9c1c5e5f.json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iterations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_cost"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.008412&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gemini-3.5-flash"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The file at &lt;code&gt;output_path&lt;/code&gt; contains the final rewritten text.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example inputs and outputs
&lt;/h2&gt;

&lt;p&gt;Here are simple before and after examples.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 1
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Input
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are exactly right! The code indeed has a bug.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The code has a bug.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is another example with a style guide added.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example 2
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Input
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;That's exactly the right approach. The way to solve this problem is to use an iterative loop with a detector and a rewrite step.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;rewrite_style&lt;/code&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Write it as a friendly reddit comment in first person style.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I've had the best luck tackling this with an iterative loop. You basically just need a detector and a rewrite step to get it done.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why this is better than a single rewrite prompt
&lt;/h2&gt;

&lt;p&gt;A single prompt can help, but it usually stops at surface cleanup.&lt;/p&gt;

&lt;p&gt;This workflow is stronger because it separates the problem into two jobs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;find the AI tells&lt;/li&gt;
&lt;li&gt;rewrite around those tells&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That is how editors work too. First they spot the bad habits. Then they fix them. Then they read it again.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What does the code actually do?
&lt;/h3&gt;

&lt;p&gt;It audits text for AI-writing patterns, flags the strongest issues, rewrites the full draft, and repeats until the text reads naturally or the maximum iteration count is reached.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does it change the meaning of the text?
&lt;/h3&gt;

&lt;p&gt;It is designed not to. The rewrite prompt tells the model to preserve meaning, facts, proper nouns, and useful formatting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why does it split text into chunks?
&lt;/h3&gt;

&lt;p&gt;Long drafts are easier to inspect section by section. Chunking also lets the detector focus on one section at a time instead of making one big judgment on the entire post.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why use structured output instead of plain text?
&lt;/h3&gt;

&lt;p&gt;Structured output makes the workflow reliable. The detector must return a clear JSON object with snippets, reasons, and fixes. The rewrite step must return the final text in a known shape.&lt;/p&gt;

&lt;h3&gt;
  
  
  What if the detector finds nothing?
&lt;/h3&gt;

&lt;p&gt;The workflow stops early and returns the current text. That saves time and cost.&lt;/p&gt;

&lt;h3&gt;
  
  
  What does &lt;code&gt;rewrite_style&lt;/code&gt; do?
&lt;/h3&gt;

&lt;p&gt;It adds extra guidance for tone or voice. If you pass a non-empty value, the workflow runs at least one rewrite pass even when the detector does not find obvious issues.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I use this on blog posts only?
&lt;/h3&gt;

&lt;p&gt;No. It can be used on blog posts, docs, support text, internal notes, landing pages, and other long-form writing.&lt;/p&gt;

&lt;h3&gt;
  
  
  What kind of AI tells does the code look for?
&lt;/h3&gt;

&lt;p&gt;It looks for things like repetitive transitions, overly symmetrical sentence shapes, one-line paragraph patterns, vague buzzwords, fake nuance, em dash overuse, and contrast formulas that read like template writing.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the best input format?
&lt;/h3&gt;

&lt;p&gt;Use a full draft with headings and normal paragraph flow. The code is built to work on substantial text, not just a sentence or two.&lt;/p&gt;

&lt;h3&gt;
  
  
  How many iterations should I use?
&lt;/h3&gt;

&lt;p&gt;The default is 10. In practice, most drafts should not need that many. If the text is already decent, the code stops early.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is the output deterministic?
&lt;/h3&gt;

&lt;p&gt;No. It uses a model, so two runs can produce slightly different rewrites. The structure of the code stays the same, though, which is what matters for repeatability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does this replace human editing?
&lt;/h3&gt;

&lt;p&gt;No. It gets the draft much closer, but the best results still come from a human pass at the end.&lt;/p&gt;

&lt;h3&gt;
  
  
  How is this different from a generic AI humanizer?
&lt;/h3&gt;

&lt;p&gt;Most generic tools just rewrite surface phrasing. This code does the more useful thing: it detects the patterns first, rewrites in passes, and gives you a predictable process you can reuse.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>tutorial</category>
      <category>writing</category>
    </item>
    <item>
      <title>How to Visually Debug Multi AI-Agent Flows</title>
      <dc:creator>Rishabh Poddar</dc:creator>
      <pubDate>Wed, 18 Jun 2025 14:30:00 +0000</pubDate>
      <link>https://dev.to/rish_poddar/how-to-visually-debug-multi-ai-agent-flows-310p</link>
      <guid>https://dev.to/rish_poddar/how-to-visually-debug-multi-ai-agent-flows-310p</guid>
      <description>&lt;p&gt;Multi AI-Agent systems represent a significant advancement in software development, characterized by two fundamental aspects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Collaborative Intelligence&lt;/strong&gt;: Multiple AI agents working together to achieve shared objectives, each contributing their specialized capabilities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dynamic Decision Making&lt;/strong&gt;: Systems that employ AI-driven tool calling to determine execution paths based on real-time context, rather than following predetermined logic flows.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The dynamic nature of these systems presents a unique challenge: understanding the flow of logic becomes increasingly complex as the system scales. A single query might trigger no tool calls, while another could initiate a cascade of nested interactions where AI agents themselves trigger additional tool calls. This complexity makes system behavior difficult to track and debug.&lt;/p&gt;

&lt;p&gt;To address this challenge, I'm introducing &lt;a href="https://github.com/rishabhpoddar/agentgraph" rel="noopener noreferrer"&gt;AgentGraph&lt;/a&gt;, an open-source library that seamlessly integrates with Python or Node.js backends. This tool captures and visualizes LLM interactions and tool calls in real-time, presenting them as an interactive graph that provides clear visibility into system behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  Visualizing Agent Interactions
&lt;/h2&gt;

&lt;p&gt;Let's examine a practical example: a database query agent with access to two tools:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;SQLAgent: Converts natural language queries into SQL&lt;/li&gt;
&lt;li&gt;DatabaseAgent: Executes SQL queries and returns results&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's how the agent graph visualizes different types of interactions:&lt;/p&gt;

&lt;h3&gt;
  
  
  Query 1: "Hi there!"
&lt;/h3&gt;

&lt;p&gt;A simple greeting that doesn't require database access results in no tool calls:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F19nifdx1icn9r43brpw9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F19nifdx1icn9r43brpw9.png" alt="Simple greeting with no tool calls" width="679" height="674"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Query 2: "How many users do I have?"
&lt;/h3&gt;

&lt;p&gt;This query triggers a sequence of tool calls:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;SQLAgent converts the query to SQL&lt;/li&gt;
&lt;li&gt;DatabaseAgent executes the query&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The main agent's graph shows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4wwn76065qf2gldn3c5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4wwn76065qf2gldn3c5.png" alt="Main agent graph" width="800" height="660"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The second bubble contains the user input (also indicating that the LLM used tools), followed by the final response bubble. Clicking the second bubble reveals the tool calls:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4zf2qqsk4r54kpu1xmq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk4zf2qqsk4r54kpu1xmq.png" alt="Tool calls" width="800" height="123"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The LLM first calls SQLAgent, then DatabaseAgent. Clicking on the SQLAgent row shows its subgraph:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0w2541wcrqp4qqdmkz3x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0w2541wcrqp4qqdmkz3x.png" alt="SQLAgent subgraph" width="800" height="563"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The subgraph displays the tool input (orange box), output (green box), and complete LLM chat history. While this example shows a simple tool, the visualization scales to handle complex nested tool calls. The DatabaseAgent's subgraph shows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8dijap3zv1d7tykcno8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm8dijap3zv1d7tykcno8.png" alt="DatabaseAgent subgraph" width="667" height="496"&gt;&lt;/a&gt;&lt;br&gt;
Since this tool doesn't use an LLM, it simply displays the input and output.&lt;/p&gt;
&lt;h2&gt;
  
  
  Integration Guide
&lt;/h2&gt;

&lt;p&gt;Let's walk through implementing AgentGraph in a Node.js backend. You can find the complete example code &lt;a href="https://github.com/rishabhpoddar/agentgraph/tree/main/examples" rel="noopener noreferrer"&gt;here&lt;/a&gt;. While we'll focus on Node.js, the implementation is similar for Python backends.&lt;/p&gt;

&lt;p&gt;To get started, install the AgentGraph package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @trythisapp/agentgraph
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Setting Up the Main Agent
&lt;/h3&gt;

&lt;p&gt;The core of the integration involves wrapping your LLM calls with AgentGraph's &lt;code&gt;callLLMWithToolHandling&lt;/code&gt; function. Here's how to set up the main agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;callLLMWithToolHandling&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;clearSessionId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@trythisapp/agentgraph&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;v4&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;uuidv4&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;uuid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;mainAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Responses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ResponseInput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`You are a helpful assistant for querying a database...`&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userInput&lt;/span&gt;
    &lt;span class="p"&gt;}];&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;callLLMWithToolHandling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mainAgent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;// Agent name for visualization&lt;/span&gt;
            &lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;             &lt;span class="c1"&gt;// Unique session identifier&lt;/span&gt;
            &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;             &lt;span class="c1"&gt;// Tool ID (undefined for main agent)&lt;/span&gt;
            &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;       &lt;span class="c1"&gt;// LLM call function&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openaiClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;responses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4.1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;inp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="cm"&gt;/* tool definitions */&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="p"&gt;});&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                 &lt;span class="c1"&gt;// Input messages&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="cm"&gt;/* tool implementations */&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output_text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;clearSessionId&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// Clean up session data&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Session Management
&lt;/h3&gt;

&lt;p&gt;Each interaction requires a unique session ID to track the conversation flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;sessionId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;uuidv4&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Session ID: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. File is saved in ./agentgraph_output/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.json`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The session ID serves two purposes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Visualization&lt;/strong&gt;: Groups related interactions in the same graph&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;File Output&lt;/strong&gt;: Creates a JSON file with the complete interaction trace&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Always call &lt;code&gt;clearSessionId(sessionId)&lt;/code&gt; when the session ends to clean up memory.&lt;/p&gt;

&lt;h3&gt;
  
  
  Defining Tools
&lt;/h3&gt;

&lt;p&gt;Tools are defined using OpenAI's function calling format within the LLM call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;strict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SQLAgent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Use this tool when you need to convert the natural language query into a SQL query&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The natural language query to convert into a SQL query&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;query&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="na"&gt;additionalProperties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;strict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;databaseAgent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Use this tool when you need to execute the SQL query and return the results&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The SQL query to execute&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;query&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="na"&gt;additionalProperties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Implementing Tool Handlers
&lt;/h3&gt;

&lt;p&gt;The final parameter of &lt;code&gt;callLLMWithToolHandling&lt;/code&gt; is an array of tool implementations:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;[{&lt;/span&gt;
    &lt;span class="na"&gt;toolName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SQLAgent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;impl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="na"&gt;toolId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;runSQLAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;toolId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;toolName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;databaseAgent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="na"&gt;impl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;runDatabaseAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each tool implementation receives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Parameters&lt;/strong&gt;: The arguments passed by the LLM (destructured from the first parameter)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool ID&lt;/strong&gt;: A unique identifier for this specific tool call (used for nested visualization)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Creating Nested Agents
&lt;/h3&gt;

&lt;p&gt;For tools that use LLMs internally (like SQLAgent), you create nested agents by calling &lt;code&gt;callLLMWithToolHandling&lt;/code&gt; again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runSQLAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;toolId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Responses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ResponseInput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;system&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`You are an expert SQLite SQL query generator...`&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;
    &lt;span class="p"&gt;}];&lt;/span&gt;

    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;callLLMWithToolHandling&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SQLAgent&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// Nested agent name&lt;/span&gt;
        &lt;span class="nx"&gt;sessionId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="c1"&gt;// Same session ID&lt;/span&gt;
        &lt;span class="nx"&gt;toolId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;            &lt;span class="c1"&gt;// Tool ID from parent call&lt;/span&gt;
        &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openaiClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;responses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gpt-4.1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;inp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;json_schema&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sqlQuery&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* JSON schema */&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;});&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;[]&lt;/span&gt;                 &lt;span class="c1"&gt;// No tools for this agent&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output_text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice how the nested agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Uses the same &lt;code&gt;sessionId&lt;/code&gt; to maintain session continuity&lt;/li&gt;
&lt;li&gt;Receives the &lt;code&gt;toolId&lt;/code&gt; from the parent to establish the parent-child relationship&lt;/li&gt;
&lt;li&gt;Can have its own tools or none at all&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Simple Tool Implementation
&lt;/h3&gt;

&lt;p&gt;For tools that don't use LLMs (like databaseAgent), the implementation is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;runDatabaseAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;queryDatabase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;  &lt;span class="c1"&gt;// Direct function call&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Viewing the Results
&lt;/h3&gt;

&lt;p&gt;After running your agent, AgentGraph seamlessly handles the complete interaction flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Automatic Tool Orchestration&lt;/strong&gt;: When the LLM decides to invoke a tool, &lt;code&gt;callLLMWithToolHandling&lt;/code&gt; automatically executes the corresponding tool implementation and feeds the result back to the LLM for further processing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Complete Trace Capture&lt;/strong&gt;: Every interaction, tool call, and response is captured and saved to &lt;code&gt;./agentgraph_output/${sessionId}.json&lt;/code&gt;, creating a comprehensive execution trace.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Interactive Visualization&lt;/strong&gt;: The generated JSON file can be visualized using the &lt;a href="https://github.com/rishabhpoddar/agentgraph/tree/main?tab=readme-ov-file#visualizing-the-graph" rel="noopener noreferrer"&gt;AgentGraph visualizer&lt;/a&gt;, which renders your agent interactions as an interactive graph.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The visualization provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chat Flow&lt;/strong&gt;: LLM interactions as conversation bubbles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool Calls&lt;/strong&gt;: Expandable sections showing tool inputs and outputs
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nested Agents&lt;/strong&gt;: Subgraphs for tools that use LLMs internally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Execution Flow&lt;/strong&gt;: Clear visual representation of the decision-making process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This integration approach provides complete visibility into your multi-agent system without requiring significant changes to your existing code structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Take away
&lt;/h2&gt;

&lt;p&gt;As AI agent systems become increasingly sophisticated, the ability to understand and debug their behavior becomes crucial for reliable deployment. AgentGraph addresses this challenge by providing real-time visualization of multi-agent interactions, making complex decision flows transparent and debuggable.&lt;/p&gt;

&lt;p&gt;Key benefits of using AgentGraph include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero-friction Integration&lt;/strong&gt;: Minimal code changes required to instrument your existing agent systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complete Visibility&lt;/strong&gt;: Track every LLM interaction, tool call, and nested agent execution&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive Debugging&lt;/strong&gt;: Visual graph interface that lets you drill down into specific interactions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-language Support&lt;/strong&gt;: Works seamlessly with both Python and Node.js backends&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session Management&lt;/strong&gt;: Automatic trace capture and file generation for analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Whether you're building simple tool-calling agents or complex multi-agent orchestration systems, AgentGraph provides the observability you need to understand, debug, and optimize your AI workflows.&lt;/p&gt;

&lt;p&gt;Check out the &lt;a href="https://github.com/rishabhpoddar/agentgraph" rel="noopener noreferrer"&gt;AgentGraph repository&lt;/a&gt; and give it a ⭐ if you find it useful! Your support helps me continue developing tools that make AI development more transparent and accessible.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
