<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: CodeKing</title>
    <description>The latest articles on DEV Community by CodeKing (@codekingai).</description>
    <link>https://dev.to/codekingai</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3843914%2Fedc4fbb1-edd3-4c7d-9c94-e2b13dbc1af0.jpg</url>
      <title>DEV Community: CodeKing</title>
      <link>https://dev.to/codekingai</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/codekingai"/>
    <language>en</language>
    <item>
      <title>I Added a /yolo Button to My Local AI Assistant</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Mon, 15 Jun 2026 09:32:38 +0000</pubDate>
      <link>https://dev.to/codekingai/i-added-a-yolo-button-to-my-local-ai-assistant-4klh</link>
      <guid>https://dev.to/codekingai/i-added-a-yolo-button-to-my-local-ai-assistant-4klh</guid>
      <description>&lt;p&gt;I like local AI assistants that ask before they do risky things.&lt;/p&gt;

&lt;p&gt;I do &lt;strong&gt;not&lt;/strong&gt; like approving the same task six times in a row.&lt;/p&gt;

&lt;p&gt;That was the failure mode I kept hitting while working on CliGate. A normal task would start with one reasonable command, then another, then another, and every tiny step wanted a fresh confirmation.&lt;/p&gt;

&lt;p&gt;So I added a blunt but useful switch:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/yolo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It sounds reckless, but the real goal was the opposite: make the assistant feel faster &lt;strong&gt;without&lt;/strong&gt; training me to click through meaningless approval spam.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bad version of safety was too chatty
&lt;/h2&gt;

&lt;p&gt;The old loop looked safe on paper:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;assistant picks a mutating tool
-&amp;gt; asks for confirmation
-&amp;gt; user approves
-&amp;gt; one tool runs
-&amp;gt; assistant continues
-&amp;gt; next tiny tool asks again
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is technically correct.&lt;/p&gt;

&lt;p&gt;It is also miserable in practice.&lt;/p&gt;

&lt;p&gt;A real task is rarely one tool call. Reading a document, checking a project, or sending a result through a channel usually means several small steps in a row. If every step interrupts the user, the approval system stops communicating risk and starts communicating friction.&lt;/p&gt;

&lt;p&gt;That is when safety UI turns into background noise.&lt;/p&gt;

&lt;h2&gt;
  
  
  /yolo made the approval scope match the user's intent
&lt;/h2&gt;

&lt;p&gt;The fix was not "remove approvals." It was "remember what the user actually meant."&lt;/p&gt;

&lt;p&gt;In CliGate, &lt;code&gt;/yolo&lt;/code&gt; flips a conversation-level flag:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;later mutating tool calls in the same conversation auto-approve&lt;/li&gt;
&lt;li&gt;the assistant stops asking for every tiny follow-up step&lt;/li&gt;
&lt;li&gt;the user can turn strict mode back on with &lt;code&gt;/safe&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Under the hood, the flag lives on the conversation metadata as &lt;code&gt;assistantCore.autoApproveTools&lt;/code&gt;, and both web chat and channel conversations read the same source of truth.&lt;/p&gt;

&lt;p&gt;That detail mattered because I did not want one behavior in the web UI and a different one in DingTalk or Feishu.&lt;/p&gt;
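&lt;p&gt;A minimal sketch of how such a conversation-level flag could be toggled and read. Only the flag name &lt;code&gt;assistantCore.autoApproveTools&lt;/code&gt; and the two commands come from this post; the function names and the conversation object shape are assumptions made for illustration.&lt;/p&gt;

```javascript
// Hypothetical helpers: applyApprovalCommand and isAutoApproveEnabled are not
// CliGate APIs, and the { metadata: {...} } conversation shape is assumed.
const APPROVAL_COMMANDS = { '/yolo': true, '/safe': false };

function applyApprovalCommand(conversation, text) {
  const command = text.trim().toLowerCase();
  if (!Object.hasOwn(APPROVAL_COMMANDS, command)) return false; // not a mode switch
  const metadata = conversation.metadata ?? (conversation.metadata = {});
  const core = metadata.assistantCore ?? (metadata.assistantCore = {});
  core.autoApproveTools = APPROVAL_COMMANDS[command];
  return true;
}

// Web chat and channel handlers would both call this one reader, so the
// behavior cannot drift between surfaces.
function isAutoApproveEnabled(conversation) {
  return conversation.metadata?.assistantCore?.autoApproveTools === true;
}
```

&lt;p&gt;Because strict mode is simply the absence of the flag, a conversation that never saw &lt;code&gt;/yolo&lt;/code&gt; stays in explicit confirmation mode by default.&lt;/p&gt;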

&lt;h2&gt;
  
  
  Natural language mattered as much as the slash command
&lt;/h2&gt;

&lt;p&gt;The more interesting part is that users do not always type &lt;code&gt;/yolo&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;They say things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"后续都同意" ("approve everything from here on")&lt;/li&gt;
&lt;li&gt;"不要再问我了" ("stop asking me")&lt;/li&gt;
&lt;li&gt;"直接执行" ("just run it")&lt;/li&gt;
&lt;li&gt;"from now on, just do it"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I added sticky-approval phrase detection too.&lt;/p&gt;

&lt;p&gt;That means the assistant can recognize conversation-wide consent from normal language, not only from a command. It still checks denial phrases separately, so "我不同意" ("I do not agree") does not accidentally enable auto-approve just because it contains the word "同意" ("agree").&lt;/p&gt;
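&lt;p&gt;One way to get that ordering right is to test denial phrases before any approval phrase. The sketch below is illustrative only: the phrase lists, the function name, and the three return labels are assumptions, not CliGate's actual detection code.&lt;/p&gt;

```javascript
// Illustrative phrase lists; the real lists are not shown in this post.
const DENIAL_PHRASES = ['我不同意', '不同意', 'do not approve'];
const STICKY_APPROVAL_PHRASES = ['后续都同意', '不要再问我了', '直接执行', 'from now on, just do it'];
const ONE_TIME_APPROVAL_PHRASES = ['同意'];

function detectApprovalIntent(text) {
  const normalized = text.trim().toLowerCase();
  const matches = (phrases) => phrases.some((phrase) => normalized.includes(phrase));
  // Denials first: "我不同意" contains "同意" and must never count as consent.
  if (matches(DENIAL_PHRASES)) return 'deny';
  // Sticky phrases next: "后续都同意" also contains the one-time phrase "同意".
  if (matches(STICKY_APPROVAL_PHRASES)) return 'sticky-approve';
  if (matches(ONE_TIME_APPROVAL_PHRASES)) return 'approve-once';
  return 'none';
}
```

&lt;p&gt;Plain substring matching is crude, but it keeps the precedence rule visible: the most restrictive reading wins whenever two patterns overlap.&lt;/p&gt;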

&lt;p&gt;This turned out to be one of those small pieces of product logic that matters more than the model prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  I still kept a real high-risk boundary
&lt;/h2&gt;

&lt;p&gt;The trap with a feature named &lt;code&gt;/yolo&lt;/code&gt; is obvious: if everything gets auto-approved, then safety is fake.&lt;/p&gt;

&lt;p&gt;So I kept one hard rule.&lt;/p&gt;

&lt;p&gt;Routine local work can flow through auto-approve mode, but genuinely destructive or external actions still need a fresh explicit confirmation.&lt;/p&gt;

&lt;p&gt;That means all of the following still stop and ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deleting files or directories&lt;/li&gt;
&lt;li&gt;overwriting data in a destructive way&lt;/li&gt;
&lt;li&gt;publishing outward&lt;/li&gt;
&lt;li&gt;sending messages to other people or other conversations&lt;/li&gt;
&lt;li&gt;submitting forms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That boundary is what makes the mode usable. The assistant can stop being noisy about low-level execution steps while still pausing when the consequence is actually irreversible or external.&lt;/p&gt;
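&lt;p&gt;The hard rule can be pictured as a check that runs before the auto-approve flag is even consulted. This is a sketch under assumptions: the risk labels, the &lt;code&gt;mutating&lt;/code&gt; and &lt;code&gt;risk&lt;/code&gt; fields, and the idea that read-only tools never prompt are all hypothetical stand-ins for whatever the real policy layer uses.&lt;/p&gt;

```javascript
// Hypothetical risk labels mirroring the categories listed above.
const ALWAYS_CONFIRM = new Set([
  'delete_path',
  'destructive_overwrite',
  'publish_external',
  'send_message_external',
  'submit_form',
]);

function decideApproval(toolCall, conversation) {
  if (!toolCall.mutating) return 'run'; // assumption: read-only tools never prompt
  if (ALWAYS_CONFIRM.has(toolCall.risk)) return 'confirm'; // hard rule, ignores /yolo
  const autoApprove = conversation.metadata?.assistantCore?.autoApproveTools === true;
  return autoApprove ? 'run' : 'confirm';
}
```

&lt;p&gt;Putting the high-risk check ahead of the flag lookup means no later change to auto-approve logic can accidentally widen what &lt;code&gt;/yolo&lt;/code&gt; covers.&lt;/p&gt;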

&lt;h2&gt;
  
  
  One confirmation can now cover a whole batch
&lt;/h2&gt;

&lt;p&gt;I also found a second bug while fixing this.&lt;/p&gt;

&lt;p&gt;Sometimes one pending confirmation represented &lt;strong&gt;multiple&lt;/strong&gt; queued tool calls. Previously, approving it executed only the first one and silently dropped the rest.&lt;/p&gt;

&lt;p&gt;The confirmation service now expands a pending action into all captured tool invocations and runs each of them in order. One approve means the whole batch gets executed, not just item one.&lt;/p&gt;

&lt;p&gt;That sounds like an implementation detail, but it changes user trust a lot. If the UI says "confirmed," users expect the intended action to finish, not partially disappear.&lt;/p&gt;
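&lt;p&gt;The batch behavior can be sketched as a loop over every captured invocation. The &lt;code&gt;toolCalls&lt;/code&gt; field and the caller-supplied &lt;code&gt;runTool&lt;/code&gt; function are assumed names for this example, not the confirmation service's real interface.&lt;/p&gt;

```javascript
// Assumed shape: pendingAction.toolCalls holds every captured invocation, and
// runTool is an async function supplied by the caller.
async function approvePendingAction(pendingAction, runTool) {
  const results = [];
  for (const call of pendingAction.toolCalls) {
    // Sequential on purpose: later calls may depend on earlier side effects.
    results.push(await runTool(call));
  }
  return results;
}
```

&lt;p&gt;Returning one result per call also gives the continuation step something concrete to report for each item in the batch.&lt;/p&gt;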

&lt;h2&gt;
  
  
  The result feels less magical and more honest
&lt;/h2&gt;

&lt;p&gt;My favorite part of this change is that it did not make the assistant feel more autonomous.&lt;/p&gt;

&lt;p&gt;It made it feel more aligned.&lt;/p&gt;

&lt;p&gt;The assistant now behaves closer to how a human collaborator would interpret the conversation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if I said "just continue," it continues&lt;/li&gt;
&lt;li&gt;if I said &lt;code&gt;/yolo&lt;/code&gt;, it stops nagging me for every tiny step&lt;/li&gt;
&lt;li&gt;if the next move is truly risky, it still pauses&lt;/li&gt;
&lt;li&gt;if I want strict mode back, &lt;code&gt;/safe&lt;/code&gt; restores it immediately&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That balance is what I want from a local control plane: not maximum freedom, not maximum ceremony, just the right amount of friction in the right place.&lt;/p&gt;

&lt;p&gt;CliGate is the open-source local control plane I use to route Claude Code, Codex CLI, channels, desktop control, and assistant workflows through one place: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;CliGate&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are building local agents, where do you draw the line between approval memory and real safety boundaries?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I Added WeChat to My Claude Code and Codex Workflow</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Thu, 11 Jun 2026 07:47:14 +0000</pubDate>
      <link>https://dev.to/codekingai/i-added-wechat-to-my-claude-code-and-codex-workflow-395b</link>
      <guid>https://dev.to/codekingai/i-added-wechat-to-my-claude-code-and-codex-workflow-395b</guid>
      <description>&lt;p&gt;The awkward part of local AI tooling is that "local" usually means "only useful while I am sitting at the keyboard."&lt;/p&gt;

&lt;p&gt;That is fine when I am deep in an editor. It is less fine when Claude Code is waiting for approval, Codex finished a task, or I just want to ask "what happened?" from my phone.&lt;/p&gt;

&lt;p&gt;I already had browser chat and a few external channels wired into CliGate. But for my daily workflow, one missing channel was obvious: WeChat.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem was not another bot
&lt;/h2&gt;

&lt;p&gt;Adding a chat provider is easy if all you want is text in and text out.&lt;/p&gt;

&lt;p&gt;That was not the workflow I wanted.&lt;/p&gt;

&lt;p&gt;The useful version needs to preserve the shape of an AI coding task:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;start a Codex or Claude Code run&lt;/li&gt;
&lt;li&gt;keep follow-up messages attached to the same task&lt;/li&gt;
&lt;li&gt;surface approvals when the runtime asks for permission&lt;/li&gt;
&lt;li&gt;send progress and final results back to the user&lt;/li&gt;
&lt;li&gt;remember whether the conversation is a new task, a status check, or a continuation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without that, a WeChat integration would just become another notification pipe. Nice to demo, not very useful under pressure.&lt;/p&gt;

&lt;p&gt;The painful case is familiar: a local agent starts doing real work, hits one approval prompt, and then silently waits in a terminal you are not looking at. The problem is not that the model cannot reason. The problem is that the control loop is trapped in the wrong place.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix was to treat channels as task surfaces
&lt;/h2&gt;

&lt;p&gt;In CliGate, I ended up treating WeChat the same way I treat Telegram, Feishu, and DingTalk: not as a separate product, but as one more surface over the same local task system.&lt;/p&gt;

&lt;p&gt;The runtime is still local. Claude Code and Codex still run on my machine. The proxy, account routing, logs, approvals, and task records still live on &lt;code&gt;localhost&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The channel only changes where I talk to the workflow.&lt;/p&gt;

&lt;p&gt;That distinction matters. It means the WeChat provider does not need to know how to run Codex. It needs to know how to receive a message, map a user to a conversation, and hand that message to the same supervisor layer that the web chat uses.&lt;/p&gt;

&lt;p&gt;The mental model looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WeChat message
-&amp;gt; channel gateway
-&amp;gt; supervisor conversation
-&amp;gt; Codex or Claude Code runtime
-&amp;gt; progress / approval / result
-&amp;gt; WeChat reply
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The interesting part is the middle. A message like "status?" should not be forwarded to Codex as a random new prompt. It should be answered from the remembered task state. A message like "also update the docs" should usually continue the current task, not start from scratch with the default provider.&lt;/p&gt;

&lt;p&gt;That task memory is what makes a channel feel like a real interface instead of a webhook.&lt;/p&gt;
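&lt;p&gt;A rough sketch of that middle step, assuming a conversation record that remembers its active task. The regular expression, field names, and return labels are hypothetical; a real supervisor would likely use richer intent detection than a keyword match.&lt;/p&gt;

```javascript
// Hypothetical router; the pattern and labels are illustrative only.
const STATUS_PATTERN = /^(status|progress|what happened)\b/i;

function routeChannelMessage(text, conversation) {
  const activeTask = conversation.activeTask ?? null;
  if (STATUS_PATTERN.test(text.trim())) {
    // Answered from remembered state, never forwarded to the runtime as a prompt.
    return { kind: 'status', taskId: activeTask ? activeTask.id : null };
  }
  if (activeTask) {
    // Follow-ups stay on the same task and the same runtime.
    return { kind: 'continue', taskId: activeTask.id, runtime: activeTask.runtime };
  }
  return { kind: 'new-task', runtime: conversation.defaultRuntime };
}
```

&lt;p&gt;Note that a follow-up inherits the runtime of the task it continues, so "also update the docs" does not silently fall back to the default provider.&lt;/p&gt;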

&lt;h2&gt;
  
  
  What changed in the workflow
&lt;/h2&gt;

&lt;p&gt;Before this, I had a split-brain setup.&lt;/p&gt;

&lt;p&gt;The desktop dashboard was where the real task state lived. My phone was mostly for checking messages. If an AI coding task needed a decision, I had to return to the browser or terminal.&lt;/p&gt;

&lt;p&gt;Now the chat app can stay in the loop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;start a coding task from WeChat&lt;/li&gt;
&lt;li&gt;receive progress as the local runtime works&lt;/li&gt;
&lt;li&gt;answer approval prompts without finding the terminal&lt;/li&gt;
&lt;li&gt;ask for a status update without interrupting the agent&lt;/li&gt;
&lt;li&gt;continue the same task with a short follow-up&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That does not make WeChat special. That is actually the point.&lt;/p&gt;

&lt;p&gt;The more useful design is that the channel is replaceable. Telegram, Feishu, DingTalk, WeChat, and the web dashboard should all speak to the same task model. Users can pick the surface that matches their day.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I kept the tools unchanged
&lt;/h2&gt;

&lt;p&gt;I did not want to build a fake Claude Code or a fake Codex inside a chat app.&lt;/p&gt;

&lt;p&gt;Those tools already have their own strengths. The better layer is a local control plane around them: one place for routing, credentials, approvals, task records, logs, and channel delivery.&lt;/p&gt;

&lt;p&gt;That is what CliGate is trying to be.&lt;/p&gt;

&lt;p&gt;You still run it locally:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx cligate@latest start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the same machine that owns your code can also own the coordination loop. No hosted relay is required for the core workflow, and the AI coding tools do not need custom provider configs for every channel you add.&lt;/p&gt;

&lt;h2&gt;
  
  
  The lesson
&lt;/h2&gt;

&lt;p&gt;The main thing I learned is that "mobile support" for local agents is not just a smaller UI.&lt;/p&gt;

&lt;p&gt;It is a continuity problem.&lt;/p&gt;

&lt;p&gt;If the phone cannot see the task state, approvals, runtime choice, and final result, then it is only a remote text box. If it can see those things, it becomes a practical control surface for work that still runs safely on your own machine.&lt;/p&gt;

&lt;p&gt;That is the direction I want local AI tooling to go: the work stays local, but the conversation can meet you where you already are.&lt;/p&gt;

&lt;p&gt;The project is open source here: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;https://github.com/codeking-ai/cligate&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you use AI coding agents, where do you want their progress and approval prompts to show up: terminal, browser, Slack, Telegram, WeChat, or somewhere else?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>My Coding Agent Asked Permission for Every Tiny Step</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Tue, 09 Jun 2026 07:09:30 +0000</pubDate>
      <link>https://dev.to/codekingai/my-coding-agent-asked-permission-for-every-tiny-step-26lg</link>
      <guid>https://dev.to/codekingai/my-coding-agent-asked-permission-for-every-tiny-step-26lg</guid>
      <description>&lt;p&gt;The most annoying bug in my local AI assistant was not that it refused to ask for permission.&lt;/p&gt;

&lt;p&gt;It was that it asked too often.&lt;/p&gt;

&lt;p&gt;I would give it a normal task like "read this PDF and tell me what is inside." The assistant would make a reasonable first move, ask for approval, run the command, and then immediately ask again for the next tiny probe.&lt;/p&gt;

&lt;p&gt;One task turned into a permission treadmill.&lt;/p&gt;

&lt;h2&gt;
  
  
  The approval loop felt safe but unusable
&lt;/h2&gt;

&lt;p&gt;Approval prompts are good. I want a local assistant to stop before it runs commands, writes files, opens apps, or touches the desktop.&lt;/p&gt;

&lt;p&gt;The problem is that real work is rarely one tool call.&lt;/p&gt;

&lt;p&gt;Reading a PDF might require checking whether Python exists, finding a converter, running the converter, reading the text output, and then summarizing it. A desktop task might require focusing a window, entering text, pressing a button, and verifying the result.&lt;/p&gt;

&lt;p&gt;If every step asks the same kind of question, the user stops evaluating risk and starts clicking through interruptions.&lt;/p&gt;

&lt;p&gt;That is worse than unsafe. It trains the user to ignore the safety system.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bug was treating approval as a single event
&lt;/h2&gt;

&lt;p&gt;The old flow in CliGate looked roughly like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;assistant chooses tool
-&amp;gt; policy says confirmation required
-&amp;gt; user approves
-&amp;gt; tool runs
-&amp;gt; assistant continues
-&amp;gt; next tool asks again
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That was technically correct. Each tool call had its own confirmation.&lt;/p&gt;

&lt;p&gt;But it missed the shape of the user's intent. The user was not approving "run one tiny probe." They were approving the assistant to continue one bounded task.&lt;/p&gt;

&lt;p&gt;So I changed what happens after the first approved action in a chat conversation.&lt;/p&gt;

&lt;p&gt;Once the user approves the first execution tool for the task, CliGate flips a conversation flag that lets later steps continue without another approval roundtrip. The assistant still feeds the real tool result back into the continuation run, so it can keep working from what actually happened instead of pretending the approval itself completed the task.&lt;/p&gt;

&lt;p&gt;The escape hatch matters too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/safe
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That turns the conversation back into explicit confirmation mode.&lt;/p&gt;

&lt;h2&gt;
  
  
  The continuation had to carry the real result
&lt;/h2&gt;

&lt;p&gt;There was another subtle part.&lt;/p&gt;

&lt;p&gt;When a user approves a pending action, the system cannot just say "confirmed" and stop. It has to continue the original job.&lt;/p&gt;

&lt;p&gt;The continuation prompt now tells the assistant that the previous tool call was approved, already executed, and produced a real result. That stops the assistant from asking for the same approval again or forgetting why the tool ran in the first place.&lt;/p&gt;

&lt;p&gt;This made the assistant feel much less like a form with a chatbot attached to it. The flow became:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ask once for the risky boundary&lt;/li&gt;
&lt;li&gt;execute the first step&lt;/li&gt;
&lt;li&gt;continue the task with real observations&lt;/li&gt;
&lt;li&gt;only interrupt again when the user explicitly returns to safe mode or a genuinely different boundary appears&lt;/li&gt;
&lt;/ul&gt;
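&lt;p&gt;The continuation turn described above could be assembled roughly like this. The message shape, the &lt;code&gt;system-continuation&lt;/code&gt; author label, and the exact wording are assumptions for the sake of the example.&lt;/p&gt;

```javascript
// Hypothetical builder; not CliGate's actual continuation prompt.
function buildContinuationTurn(originalTask, toolCall, toolResult) {
  return {
    author: 'system-continuation', // marks the turn as system-authored, not user text
    text: [
      `The user approved the tool call "${toolCall.name}" and it has already run.`,
      `Result: ${JSON.stringify(toolResult)}`,
      `Do not ask for this approval again. Continue the original task: ${originalTask}`,
    ].join('\n'),
  };
}
```

&lt;p&gt;Carrying the serialized result in the turn is what lets the next model step reason from what actually happened rather than from the approval alone.&lt;/p&gt;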

&lt;p&gt;The interesting part is that the fix is not "disable approvals." It is "remember what approval was for."&lt;/p&gt;

&lt;h2&gt;
  
  
  I also hit a language bug
&lt;/h2&gt;

&lt;p&gt;This same continuation path exposed a smaller but very visible bug.&lt;/p&gt;

&lt;p&gt;The continuation message was system-generated and written in English. When the original user task was Chinese, the assistant sometimes detected the system continuation as the latest "user" text and switched the next confirmation prompt to English.&lt;/p&gt;

&lt;p&gt;That produced an ugly mixed-language workflow: first approval in Chinese, later approval prompts in English.&lt;/p&gt;

&lt;p&gt;The fix was simple in principle: system-authored continuation turns should not decide the reply language. The assistant now looks back to the latest genuine user message in the conversation before choosing the response language.&lt;/p&gt;

&lt;p&gt;That is the kind of detail that seems minor until you use the tool for a multi-step task. Then it is the difference between "this assistant is tracking me" and "this assistant is reacting to its own plumbing."&lt;/p&gt;
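&lt;p&gt;In code, the language fix amounts to skipping system-authored turns when looking for the text that decides the reply language. The sketch assumes each message carries an &lt;code&gt;author&lt;/code&gt; field, and it uses a deliberately crude CJK character check as a stand-in for real language detection.&lt;/p&gt;

```javascript
// Assumed message shape: { author: 'user' | 'assistant' | 'system-continuation', text }.
function pickReplyLanguage(messages, fallback = 'en') {
  // Walk backwards to the latest genuine user message, ignoring system turns.
  const lastUserMessage = [...messages].reverse().find((m) => m.author === 'user');
  if (!lastUserMessage) return fallback;
  // Crude heuristic: any CJK ideograph means reply in Chinese.
  return /[\u4e00-\u9fff]/.test(lastUserMessage.text) ? 'zh' : 'en';
}
```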

&lt;h2&gt;
  
  
  The rule I am keeping
&lt;/h2&gt;

&lt;p&gt;I do not think agent approval should be all-or-nothing.&lt;/p&gt;

&lt;p&gt;For local tools, the useful middle ground is task-scoped trust:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;ask before crossing a meaningful boundary&lt;/li&gt;
&lt;li&gt;remember that approval for the current task&lt;/li&gt;
&lt;li&gt;keep an obvious way to return to strict mode&lt;/li&gt;
&lt;li&gt;do not let system continuation messages masquerade as user intent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is now how I am shaping approvals in CliGate, the local control plane I use for Claude Code, Codex CLI, Gemini CLI, desktop automation, channels, and model routing.&lt;/p&gt;

&lt;p&gt;The project is open source here: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;CliGate&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are building local agents, how are you handling approval fatigue: per tool call, per task, per session, or something more granular?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I Wired Qwen and OpenRouter Into Claude Code and Codex Without New Configs</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Fri, 05 Jun 2026 08:47:05 +0000</pubDate>
      <link>https://dev.to/codekingai/i-wired-qwen-and-openrouter-into-claude-code-and-codex-without-new-configs-4p4m</link>
      <guid>https://dev.to/codekingai/i-wired-qwen-and-openrouter-into-claude-code-and-codex-without-new-configs-4p4m</guid>
      <description>&lt;p&gt;Every new model provider looks simple until it reaches your actual coding tools.&lt;/p&gt;

&lt;p&gt;Qwen has DashScope's OpenAI-compatible mode. OpenRouter gives you one API for a huge list of models. Both sound like they should be easy to plug into an AI coding workflow.&lt;/p&gt;

&lt;p&gt;Then the tools remind you that "OpenAI-compatible" does not mean "compatible with everything I use."&lt;/p&gt;

&lt;p&gt;Claude Code expects Anthropic Messages. Codex expects the Responses shape. Other clients speak Chat Completions. A provider can have a perfectly good API and still force you into another round of base URLs, API keys, model aliases, and small wrapper scripts.&lt;/p&gt;

&lt;p&gt;That was the part I wanted to avoid.&lt;/p&gt;

&lt;h2&gt;
  
  
  The annoying part was not the API key
&lt;/h2&gt;

&lt;p&gt;Adding one more key is easy.&lt;/p&gt;

&lt;p&gt;The mess starts when every tool wants to own the provider decision.&lt;/p&gt;

&lt;p&gt;If I want to try Qwen for cheap coding tasks, I do not want to edit Claude Code's environment, then Codex's config, then a chat client's model list. If I want to test an OpenRouter model slug like &lt;code&gt;anthropic/claude-3.7-sonnet&lt;/code&gt;, I do not want a routing layer to accidentally remap it because it looks unfamiliar.&lt;/p&gt;

&lt;p&gt;The shape I wanted was simpler:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tools keep pointing at localhost&lt;/li&gt;
&lt;li&gt;providers are added once&lt;/li&gt;
&lt;li&gt;model names are mapped or passed through in one place&lt;/li&gt;
&lt;li&gt;protocol conversion happens behind the gateway&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the latest CliGate change adds Qwen and OpenRouter as provider presets instead of making them another pile of one-off config.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix was a preset, not another bespoke provider
&lt;/h2&gt;

&lt;p&gt;CliGate already runs as a local control plane for AI coding tools. Claude Code, Codex CLI, Gemini CLI, OpenClaw, and OpenAI-compatible clients can point at one local server, while CliGate owns credentials, routing, logs, and model mapping.&lt;/p&gt;

&lt;p&gt;For Qwen and OpenRouter, the interesting part is that they are both OpenAI Chat-style providers. That means the new provider definition can be mostly data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;apiFormat&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;openai_chat&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://dashscope.aliyuncs.com/compatible-mode/v1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen-max&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen-plus&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen-turbo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwq-32b&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;tiers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;standard&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen-plus&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;fast&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen-turbo&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;OpenRouter gets the same treatment, with one important rule: model ids containing &lt;code&gt;/&lt;/code&gt; pass through as native model slugs. If I ask for &lt;code&gt;openai/gpt-4o-mini&lt;/code&gt; or &lt;code&gt;qwen/qwen-2.5-72b-instruct&lt;/code&gt;, CliGate should not pretend it knows better and remap that into a generic tier.&lt;/p&gt;

&lt;p&gt;That small rule matters because OpenRouter is not one model family. It is a catalog.&lt;/p&gt;
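&lt;p&gt;The pass-through rule can be sketched as a small resolver. The &lt;code&gt;qwen&lt;/code&gt; tiers below mirror the preset shown earlier; the &lt;code&gt;openrouter&lt;/code&gt; tier values, the &lt;code&gt;PRESETS&lt;/code&gt; table, and the function name are placeholders invented for this example.&lt;/p&gt;

```javascript
// Hypothetical resolver; only the qwen tier mapping comes from the preset above.
const PRESETS = {
  qwen: { tiers: { standard: 'qwen-plus', fast: 'qwen-turbo' } },
  openrouter: { tiers: { standard: 'openai/gpt-4o-mini', fast: 'openai/gpt-4o-mini' } },
};

function resolveModel(providerId, requested) {
  if (providerId === 'openrouter') {
    // A slash marks a native OpenRouter slug: pass it through untouched.
    if (requested.includes('/')) return requested;
  }
  // Otherwise treat the name as a tier alias, falling back to the literal name.
  return PRESETS[providerId].tiers[requested] ?? requested;
}
```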

&lt;h2&gt;
  
  
  The protocol bridge is the real feature
&lt;/h2&gt;

&lt;p&gt;The provider preset is only half the story.&lt;/p&gt;

&lt;p&gt;Qwen and OpenRouter expose Chat Completions. Claude Code does not speak Chat Completions. Codex may enter through a Responses-compatible path. So the gateway now treats these providers as chat-native upstreams and bridges the tool-facing protocols around them.&lt;/p&gt;

&lt;p&gt;For Claude Code, CliGate translates Anthropic Messages into OpenAI Chat, sends the request to Qwen or OpenRouter, then translates the answer back into an Anthropic-style message. Tool calls and tool results are preserved in the translation.&lt;/p&gt;
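&lt;p&gt;To make the direction of the translation concrete, here is a text-only sketch of both halves. It covers plain text content and stop reasons and leaves out tool calls, images, and streaming, which the real bridge also has to handle. The function names are hypothetical; the field names follow the public Anthropic Messages and OpenAI Chat Completions formats.&lt;/p&gt;

```javascript
// Text-only sketch; CliGate's real translator is not shown in this post.
function blocksToText(content) {
  if (typeof content === 'string') return content;
  return content.filter((b) => b.type === 'text').map((b) => b.text).join('');
}

function anthropicToOpenAIChat(request, upstreamModel) {
  const messages = [];
  // Anthropic keeps the system prompt at the top level; Chat uses a system message.
  if (request.system) messages.push({ role: 'system', content: blocksToText(request.system) });
  for (const m of request.messages) {
    messages.push({ role: m.role, content: blocksToText(m.content) });
  }
  return { model: upstreamModel, messages, max_tokens: request.max_tokens };
}

const STOP_REASONS = { stop: 'end_turn', length: 'max_tokens', tool_calls: 'tool_use' };

function openAIChatToAnthropic(response, requestedModel) {
  const choice = response.choices[0];
  return {
    id: response.id,
    type: 'message',
    role: 'assistant',
    model: requestedModel, // echo the model name the client originally asked for
    content: [{ type: 'text', text: choice.message.content ?? '' }],
    stop_reason: STOP_REASONS[choice.finish_reason] ?? 'end_turn',
    usage: {
      input_tokens: response.usage?.prompt_tokens ?? 0,
      output_tokens: response.usage?.completion_tokens ?? 0,
    },
  };
}
```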

&lt;p&gt;For Codex, CliGate deliberately does not pretend these providers have a native &lt;code&gt;/responses&lt;/code&gt; endpoint. It leaves native Responses support disabled, so the existing Responses-to-Chat fallback handles the request through Chat Completions instead.&lt;/p&gt;

&lt;p&gt;That detail is boring in the best way. The tools keep speaking their own protocols. The provider only has to support the API it actually supports.&lt;/p&gt;

&lt;h2&gt;
  
  
  What setup looks like now
&lt;/h2&gt;

&lt;p&gt;The user-facing setup is intentionally small:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx cligate@latest start
&lt;span class="c"&gt;# Add a Qwen or OpenRouter key in API Keys&lt;/span&gt;
&lt;span class="c"&gt;# Keep Claude Code and Codex pointed at localhost:8081&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After that, routing can happen inside CliGate. Qwen can fill a cheap &lt;code&gt;fast&lt;/code&gt; or &lt;code&gt;standard&lt;/code&gt; tier. OpenRouter can be used when you want to try a specific upstream model slug. Existing OpenAI, Anthropic, Gemini, Azure OpenAI, DeepSeek, Vertex, and local routes can stay in the same dashboard.&lt;/p&gt;

&lt;p&gt;No CLI needs to know which provider won the route.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this feels better than another config file
&lt;/h2&gt;

&lt;p&gt;The main win is not that CliGate supports two more providers.&lt;/p&gt;

&lt;p&gt;The win is that adding a provider no longer means teaching every tool about that provider.&lt;/p&gt;

&lt;p&gt;Qwen and OpenRouter are now just credentials and routing choices inside the local control plane. Claude Code can still think it is talking to an Anthropic-shaped API. Codex can still enter through its expected path. OpenAI-compatible clients can still use Chat Completions directly.&lt;/p&gt;

&lt;p&gt;The provider decision moves out of the tool and into the layer that can actually inspect, route, price, and log the request.&lt;/p&gt;

&lt;p&gt;Source: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;github.com/codeking-ai/cligate&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How are you handling OpenRouter-style model catalogs in your AI coding setup? Do you pass model slugs through directly, map everything into tiers, or keep separate configs per tool?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>My AI Assistant Needed a Control Plane, Not a Bigger Loop</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Wed, 03 Jun 2026 08:34:33 +0000</pubDate>
      <link>https://dev.to/codekingai/my-ai-assistant-needed-a-control-plane-not-a-bigger-loop-15aa</link>
      <guid>https://dev.to/codekingai/my-ai-assistant-needed-a-control-plane-not-a-bigger-loop-15aa</guid>
      <description>&lt;p&gt;I kept trying to make my AI assistant smarter by adding more tools to the same loop.&lt;/p&gt;

&lt;p&gt;That worked for a while. Then the assistant had to do normal user things: continue a Codex task from chat, answer a status question from DingTalk, remember how a desktop workflow succeeded, wait behind another run that was using the mouse, and still route Claude Code traffic through the same localhost server.&lt;/p&gt;

&lt;p&gt;At that point the problem was no longer "how many tools can one agent call?"&lt;/p&gt;

&lt;p&gt;The problem was architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  The loop got too many jobs
&lt;/h2&gt;

&lt;p&gt;The first shape was simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;user message -&amp;gt; assistant loop -&amp;gt; tools -&amp;gt; answer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is fine for a demo. It is not fine for a resident assistant.&lt;/p&gt;

&lt;p&gt;A resident assistant has to know whether a message is a new task, a follow-up, a status check, a correction, or a cancellation. It has to avoid stealing the desktop from another running task. It has to remember procedures without shoving every old transcript into context. It has to delegate coding work to Codex or Claude Code without pretending it is the executor.&lt;/p&gt;

&lt;p&gt;Those are different jobs. When I kept them inside one loop, every fix made the loop more capable and less understandable.&lt;/p&gt;

&lt;p&gt;So I stopped thinking about the assistant as one agent and started treating it as a local control plane.&lt;/p&gt;

&lt;h2&gt;
  
  
  The split that made the system easier to reason about
&lt;/h2&gt;

&lt;p&gt;In CliGate, the architecture now looks more like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Experience Plane
  -&amp;gt; Assistant Control Plane
  -&amp;gt; Runtime Execution Plane
  -&amp;gt; Proxy / Model Access Plane

Observation Plane + Memory / Policy Plane sit alongside all of them.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The names sound formal, but the boundaries are practical.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;experience plane&lt;/strong&gt; owns where the user is talking from: dashboard chat, assistant tasks, Telegram, Feishu, DingTalk, scheduled jobs.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;assistant control plane&lt;/strong&gt; decides what kind of work this is. Should it answer from state? Should it start a task? Should it continue an existing one? Should it wait because the desktop is already held by another run?&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;runtime execution plane&lt;/strong&gt; is where Codex and Claude Code live. They do the actual coding work. The assistant can dispatch, continue, summarize, and coordinate them, but it does not need to become a worse version of them.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;proxy/model access plane&lt;/strong&gt; handles the boring but necessary provider work: protocol translation, account pools, API keys, routing, model mapping, request logs, and usage.&lt;/p&gt;

&lt;p&gt;The side planes are what keep the assistant sane:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;observation turns scattered runtime events into small structured facts&lt;/li&gt;
&lt;li&gt;memory stores reusable facts, directives, workflows, and references&lt;/li&gt;
&lt;li&gt;policy decides whether an action is allowed, queued, or needs approval&lt;/li&gt;
&lt;/ul&gt;
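
&lt;p&gt;To make the policy plane concrete, here is a minimal Python sketch of the three outcomes listed above. The &lt;code&gt;Action&lt;/code&gt; type, its flags, and &lt;code&gt;decide&lt;/code&gt; are hypothetical names for illustration, not CliGate's actual API.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """A hypothetical action that a run wants to perform."""
    run_id: str
    needs_desktop: bool = False  # wants mouse or keyboard control
    risky: bool = False          # for example, a destructive shell command


def decide(action, desktop_holder: Optional[str]):
    """Return 'queued', 'needs_approval', or 'allowed' for one action."""
    # Another run already holds the desktop: wait instead of stealing it.
    if action.needs_desktop and desktop_holder not in (None, action.run_id):
        return "queued"
    # Risky actions go to the user before anything executes.
    if action.risky:
        return "needs_approval"
    return "allowed"


print(decide(Action("run-2", needs_desktop=True), desktop_holder="run-1"))
print(decide(Action("run-2", risky=True), desktop_holder=None))
print(decide(Action("run-2"), desktop_holder="run-1"))
```

&lt;p&gt;The point of the sketch is that the decision reads explicit state (who holds the desktop) rather than guessing from chat history.&lt;/p&gt;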

&lt;h2&gt;
  
  
  Observation-first changed the behavior
&lt;/h2&gt;

&lt;p&gt;The biggest improvement came from making the assistant consume observations instead of raw logs.&lt;/p&gt;

&lt;p&gt;If a Codex run is waiting for approval, the assistant should not read a giant transcript to rediscover that. It should see a compact fact:&lt;/p&gt;

&lt;p&gt;"Task X is waiting for approval to run command Y."&lt;/p&gt;

&lt;p&gt;If another assistant run is currently driving the desktop, a new run should not guess from chat history. It should see a resource holder:&lt;/p&gt;

&lt;p&gt;"desktop is held by run R."&lt;/p&gt;

&lt;p&gt;That one change made status questions, cancellation, follow-ups, and concurrent runs much less fragile. The assistant no longer has to infer the system state from the last few messages. The system gives it a state model.&lt;/p&gt;
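
&lt;p&gt;A rough sketch of that idea in Python: fold raw runtime events into a small set of current facts. The event dictionaries and the &lt;code&gt;Observation&lt;/code&gt; fields are assumptions made for this example, not CliGate's real event schema.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """A compact fact the assistant can read instead of a transcript."""
    kind: str     # "awaiting_approval" or "resource_held"
    subject: str  # task id or resource name
    detail: str   # command to approve, or the run holding the resource


def observe(events):
    """Fold raw runtime events (hypothetical dict shape) into current facts."""
    facts = {}
    for event in events:
        subject = event["subject"]
        if event["type"] == "approval_requested":
            facts[("approval", subject)] = Observation(
                "awaiting_approval", subject, event["command"])
        elif event["type"] == "approval_resolved":
            facts.pop(("approval", subject), None)
        elif event["type"] == "resource_acquired":
            facts[("resource", subject)] = Observation(
                "resource_held", subject, event["run"])
        elif event["type"] == "resource_released":
            facts.pop(("resource", subject), None)
    return list(facts.values())


events = [
    {"type": "approval_requested", "subject": "task-x", "command": "npm test"},
    {"type": "resource_acquired", "subject": "desktop", "run": "run-r"},
    {"type": "approval_requested", "subject": "task-y", "command": "git push"},
    {"type": "approval_resolved", "subject": "task-y"},
]
for fact in observe(events):
    print(fact)
```

&lt;p&gt;Four events collapse into two facts: task-x is waiting for approval, and the desktop is held by run-r. The resolved approval for task-y never reaches the prompt.&lt;/p&gt;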

&lt;h2&gt;
  
  
  Memory is not just longer context
&lt;/h2&gt;

&lt;p&gt;I also learned that "remembering" is not the same as stuffing more chat history into a prompt.&lt;/p&gt;

&lt;p&gt;For this assistant, memory is file-based and scoped. It can store a workflow, a fact, a standing directive, or a reference. On the next similar request, the prompt only gets a small memory index. If the assistant thinks one entry matters, it explicitly recalls the body.&lt;/p&gt;

&lt;p&gt;That keeps the default context small while still letting the assistant learn things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how a desktop publishing flow finally worked&lt;/li&gt;
&lt;li&gt;which project rule the user wants preserved&lt;/li&gt;
&lt;li&gt;where an internal document lives&lt;/li&gt;
&lt;li&gt;what not to try again&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For procedure memories, the rule is verify-then-trust. Try the remembered steps, but confirm the UI still matches. If it changed, explore again and update the memory after success.&lt;/p&gt;

&lt;p&gt;That is closer to how I want a practical assistant to evolve: not by growing a huge transcript, but by distilling successful work into reusable units.&lt;/p&gt;
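
&lt;p&gt;The index-then-recall pattern can be sketched in a few lines of Python. This is an illustration under my own assumptions: one text file per entry with a one-line summary on top. The class and method names are hypothetical and do not describe CliGate's actual storage layout.&lt;/p&gt;

```python
import tempfile
from pathlib import Path


class MemoryStore:
    """Hypothetical file-backed memory: one text file per entry."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, name, summary, body):
        # First line is the one-line summary; the rest is the full body.
        path = self.root / f"{name}.md"
        path.write_text(f"{summary}\n{body}", encoding="utf-8")

    def index(self):
        """Return names and summaries only: all the prompt sees by default."""
        entries = {}
        for path in sorted(self.root.glob("*.md")):
            with path.open(encoding="utf-8") as handle:
                entries[path.stem] = handle.readline().strip()
        return entries

    def recall(self, name):
        """Load the full body only when the assistant asks for this entry."""
        text = (self.root / f"{name}.md").read_text(encoding="utf-8")
        return text.split("\n", 1)[1]


store = MemoryStore(tempfile.mkdtemp())
store.save("devto-publish", "workflow: publish a Dev.to draft",
           "1. Open editor\n2. Paste body\n3. Click Publish")
store.save("repo-rule", "directive: keep changelog entries short",
           "One line per change.")
print(store.index())
print(store.recall("repo-rule"))
```

&lt;p&gt;The default prompt cost is one short line per memory. The body is read only after an explicit recall.&lt;/p&gt;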

&lt;h2&gt;
  
  
  Why this matters for local AI tools
&lt;/h2&gt;

&lt;p&gt;Local AI tooling is messy in a specific way.&lt;/p&gt;

&lt;p&gt;The user may have Claude Code, Codex CLI, Gemini CLI, OpenClaw, a browser session, a desktop app, a Telegram channel, and several provider accounts. The hard part is not only making one model call. The hard part is keeping all of those pieces coordinated without turning the assistant into an opaque supervisor that hijacks every message.&lt;/p&gt;

&lt;p&gt;That is why CliGate still keeps a direct runtime path. If the user is already talking to a Codex session, the message can go straight there. The assistant control plane is for explicit coordination, background tasks, memory, policy, desktop work, and cross-channel workflows.&lt;/p&gt;

&lt;p&gt;The split is not glamorous, but it is the difference between an impressive demo and a tool I can leave running.&lt;/p&gt;

&lt;h2&gt;
  
  
  The takeaway
&lt;/h2&gt;

&lt;p&gt;I used to ask: how do I make the assistant loop smarter?&lt;/p&gt;

&lt;p&gt;Now I ask: which plane should own this responsibility?&lt;/p&gt;

&lt;p&gt;That question has prevented a lot of accidental complexity. It keeps provider routing out of the assistant loop, execution inside dedicated runtimes, raw logs behind structured observations, and memory out of unbounded chat history.&lt;/p&gt;

&lt;p&gt;The project is open source here: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;CliGate&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are building agents around existing tools, are you putting everything inside one loop, or are you starting to split control, execution, observation, and memory too?&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>webdev</category>
      <category>ai</category>
      <category>javascript</category>
    </item>
    <item>
      <title>I Stopped Letting Every AI CLI Own Its Provider Config</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Tue, 02 Jun 2026 07:27:37 +0000</pubDate>
      <link>https://dev.to/codekingai/i-stopped-letting-every-ai-cli-own-its-provider-config-4ih</link>
      <guid>https://dev.to/codekingai/i-stopped-letting-every-ai-cli-own-its-provider-config-4ih</guid>
      <description>&lt;p&gt;AI coding tools keep getting better.&lt;/p&gt;

&lt;p&gt;The setup around them keeps getting worse.&lt;/p&gt;

&lt;p&gt;Claude Code wants one style of config. Codex CLI wants another. Gemini CLI has its own path. OpenClaw has yet another provider file. Each tool makes sense in isolation. Together, they turn provider management into a small pile of duplicated state.&lt;/p&gt;

&lt;p&gt;That was the part I wanted to stop.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real problem is ownership
&lt;/h2&gt;

&lt;p&gt;The annoying part is not that every tool needs a base URL or an API key.&lt;/p&gt;

&lt;p&gt;The annoying part is that every tool starts acting like it owns your provider strategy.&lt;/p&gt;

&lt;p&gt;If I change from OpenAI to Azure OpenAI, I should not have to remember which CLI has which config file. If I want Claude Code on one account and Codex on another provider, that should not become a weekend of shell variables, TOML edits, and half-forgotten JSON.&lt;/p&gt;

&lt;p&gt;The tool should own the developer experience.&lt;/p&gt;

&lt;p&gt;The local control plane should own routing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I changed
&lt;/h2&gt;

&lt;p&gt;I built that rule into &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;CliGate&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;CliGate runs locally on &lt;code&gt;localhost:8081&lt;/code&gt;. Claude Code, Codex CLI, Gemini CLI, and OpenClaw point at the same local gateway. After that, provider decisions move out of the individual tools and into one dashboard.&lt;/p&gt;

&lt;p&gt;The setup becomes boring:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx cligate@latest start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then each tool only needs to know one thing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;use localhost:8081
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The gateway handles the rest: account pools, API keys, model mapping, app-level bindings, fallback order, usage logs, and cost visibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this feels different
&lt;/h2&gt;

&lt;p&gt;Before, switching providers meant touching the tools.&lt;/p&gt;

&lt;p&gt;After, switching providers means changing a route.&lt;/p&gt;

&lt;p&gt;For example, I can bind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code to a Claude account pool&lt;/li&gt;
&lt;li&gt;Codex CLI to Azure OpenAI&lt;/li&gt;
&lt;li&gt;Gemini CLI to Gemini API keys&lt;/li&gt;
&lt;li&gt;quick fallback requests to cheaper or free models&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The CLIs do not need to know that decision changed. They keep speaking the protocol they already speak. CliGate translates and routes behind them.&lt;/p&gt;

&lt;p&gt;That is the point: I do not want four AI tools to become four configuration databases.&lt;/p&gt;
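
&lt;p&gt;As a sketch of what "changing a route" means, here is a small Python example of per-app bindings with a fallback order. The table contents, route names, and &lt;code&gt;pick_route&lt;/code&gt; are hypothetical. CliGate configures this through its dashboard, and this is not its real data model.&lt;/p&gt;

```python
# Hypothetical routing table: app name to an ordered list of provider routes.
BINDINGS = {
    "claude-code": ["claude-account-pool"],
    "codex-cli": ["azure-openai", "free-fallback"],
    "gemini-cli": ["gemini-api-keys"],
}


def pick_route(app, healthy):
    """Return the first bound route that is currently healthy, else None."""
    for route in BINDINGS.get(app, []):
        if route in healthy:
            return route
    return None


# In this example azure-openai is unavailable, so codex-cli falls back.
healthy_routes = {"claude-account-pool", "free-fallback", "gemini-api-keys"}
print(pick_route("codex-cli", healthy_routes))
print(pick_route("claude-code", healthy_routes))
print(pick_route("unknown-cli", healthy_routes))
```

&lt;p&gt;Switching a provider is then an edit to one table entry. Nothing in the CLI's own config changes.&lt;/p&gt;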

&lt;h2&gt;
  
  
  The useful boundary
&lt;/h2&gt;

&lt;p&gt;This is the boundary that made the project simpler:&lt;/p&gt;

&lt;p&gt;Tools should be replaceable.&lt;/p&gt;

&lt;p&gt;Providers should be replaceable.&lt;/p&gt;

&lt;p&gt;Routing should be inspectable.&lt;/p&gt;

&lt;p&gt;Once those three things are separated, a lot of annoying workflow problems get smaller. A key expires in one pool? Replace it in one place. A provider gets expensive? Move that app to another route. A model name does not match the upstream deployment name? Map it once.&lt;/p&gt;

&lt;p&gt;No drama. No config archaeology.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who needs this
&lt;/h2&gt;

&lt;p&gt;If you use one AI CLI with one API key, you probably do not need a gateway.&lt;/p&gt;

&lt;p&gt;If you use Claude Code, Codex CLI, Gemini CLI, local models, work credentials, personal accounts, and a few free provider keys, you probably already feel the problem.&lt;/p&gt;

&lt;p&gt;The sharp version is this:&lt;/p&gt;

&lt;p&gt;Your AI CLI should not own your provider strategy.&lt;/p&gt;

&lt;p&gt;It should send the request. A local layer should decide where that request belongs.&lt;/p&gt;

&lt;p&gt;That is the part CliGate is trying to make normal.&lt;/p&gt;

&lt;p&gt;How are you handling provider config across AI coding tools right now: one config per tool, one wrapper script, or a local gateway?&lt;/p&gt;

</description>
      <category>discuss</category>
      <category>webdev</category>
      <category>ai</category>
      <category>javascript</category>
    </item>
    <item>
      <title>My AI Agent Kept Missing Buttons, So I Used Windows UI Automation</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Wed, 27 May 2026 05:53:52 +0000</pubDate>
      <link>https://dev.to/codekingai/my-ai-agent-kept-missing-buttons-so-i-used-windows-ui-automation-35mk</link>
      <guid>https://dev.to/codekingai/my-ai-agent-kept-missing-buttons-so-i-used-windows-ui-automation-35mk</guid>
      <description>&lt;p&gt;The first time you let an AI agent control a desktop, it feels impressive.&lt;/p&gt;

&lt;p&gt;Then it misses a button by 40 pixels.&lt;/p&gt;

&lt;p&gt;Or it clicks the window behind the window. Or it types into the wrong field because a notification stole focus. Or it spends ten seconds looking at a screenshot just to decide where a textbox probably is.&lt;/p&gt;

&lt;p&gt;That was the part of desktop automation that bothered me. The model was not really failing at reasoning. It was being forced to reverse-engineer an application from pixels.&lt;/p&gt;

&lt;h2&gt;
  
  
  Screenshot-first is the wrong default
&lt;/h2&gt;

&lt;p&gt;The common loop looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;screenshot -&amp;gt; model guesses UI -&amp;gt; model guesses coordinates -&amp;gt; click -&amp;gt; screenshot again
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is useful as a fallback, but it is a bad default for normal desktop apps.&lt;/p&gt;

&lt;p&gt;Most buttons, inputs, tabs, menus, and labels already exist in a semantic tree before they become pixels on the screen. The operating system exposes that tree for accessibility tools: screen readers, magnifiers, automation software, and anything else that needs to understand the UI without guessing from a bitmap.&lt;/p&gt;

&lt;p&gt;On Windows, that layer is UI Automation.&lt;/p&gt;

&lt;p&gt;So I stopped treating screenshots as the first source of truth.&lt;/p&gt;

&lt;h2&gt;
  
  
  The loop I wanted
&lt;/h2&gt;

&lt;p&gt;For CliGate's desktop agent, I wanted a local loop that worked more like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;list windows -&amp;gt; focus app -&amp;gt; find control -&amp;gt; set value -&amp;gt; send key -&amp;gt; read text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is a different kind of automation.&lt;/p&gt;

&lt;p&gt;Instead of "click around until something happens," the agent can ask Windows for visible windows, focus the right one, find an &lt;code&gt;Edit&lt;/code&gt; control, set its value through &lt;code&gt;ValuePattern&lt;/code&gt;, invoke a button through UIA, or read text from a matching control.&lt;/p&gt;

&lt;p&gt;Coordinates still exist, but they become a fallback. If the app exposes accessibility metadata, the assistant should use the semantic route first. If the app is custom-rendered, Canvas-heavy, or not exposing useful controls, then it can capture a screenshot and fall back to visual inspection.&lt;/p&gt;

&lt;p&gt;The important rule is:&lt;/p&gt;

&lt;p&gt;Observe semantically first. Use pixels only when the semantic layer is missing.&lt;/p&gt;
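
&lt;p&gt;A minimal sketch of that rule in Python. Both backends are fakes with hypothetical interfaces, written only to show the control flow. Nothing here calls the real Windows UI Automation API or CliGate's companion service.&lt;/p&gt;

```python
class ControlNotFound(Exception):
    """Raised by the semantic backend when no matching control is exposed."""


class FakeSemanticBackend:
    """Stand-in for an accessibility-tree client (hypothetical interface)."""

    def __init__(self, controls):
        self.controls = controls  # maps control name to current value

    def set_value(self, name, value):
        if name not in self.controls:
            raise ControlNotFound(name)
        self.controls[name] = value


class FakeVisualBackend:
    """Stand-in for the screenshot-and-click path (hypothetical interface)."""

    def __init__(self):
        self.typed = []

    def type_into(self, name, value):
        # A real version would capture a screenshot and locate the field first.
        self.typed.append((name, value))


def fill_field(name, value, semantic, visual):
    """Try the semantic route first; use pixels only if it is missing."""
    try:
        semantic.set_value(name, value)
        return "semantic"
    except ControlNotFound:
        visual.type_into(name, value)
        return "visual"


semantic = FakeSemanticBackend({"Search": ""})
visual = FakeVisualBackend()
print(fill_field("Search", "cligate", semantic, visual))
print(fill_field("CanvasBox", "hello", semantic, visual))
```

&lt;p&gt;The first call succeeds through the semantic backend. The second finds no exposed control and drops to the visual path, which is the only case where coordinates get involved.&lt;/p&gt;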

&lt;h2&gt;
  
  
  How I wired it into CliGate
&lt;/h2&gt;

&lt;p&gt;CliGate is already a local gateway for Claude Code, Codex CLI, Gemini CLI, OpenClaw, account pools, API keys, runtime sessions, and channel workflows.&lt;/p&gt;

&lt;p&gt;The desktop part adds a small local companion service that runs in the user's interactive desktop session. The assistant talks to it through local tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;list visible windows&lt;/li&gt;
&lt;li&gt;launch or focus an app&lt;/li&gt;
&lt;li&gt;find one control or all matching controls&lt;/li&gt;
&lt;li&gt;click a control through UIA&lt;/li&gt;
&lt;li&gt;set an input value without relying on clipboard focus&lt;/li&gt;
&lt;li&gt;send control keys like Enter&lt;/li&gt;
&lt;li&gt;read visible text or values&lt;/li&gt;
&lt;li&gt;capture a screenshot when UIA is not enough&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means the agent can stay inside one local control plane. It can work in a repo, continue a Codex or Claude Code runtime session, and still operate the desktop app that owns the last step of the workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why local matters
&lt;/h2&gt;

&lt;p&gt;I do not want desktop control to depend on a hosted relay.&lt;/p&gt;

&lt;p&gt;The desktop is full of sensitive state: open apps, browser sessions, account consoles, chat windows, clipboard contents, and local files. Keeping the control service on &lt;code&gt;localhost&lt;/code&gt; keeps the automation close to the machine that owns that state.&lt;/p&gt;

&lt;p&gt;It also makes the loop simpler. The assistant can inspect, act, and verify against the actual window in front of the user without shipping a stream of screenshots to a separate service.&lt;/p&gt;

&lt;p&gt;This fits the rest of CliGate's design: local server, local dashboard, local runtime orchestration, local desktop bridge.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;p&gt;The big change was not speed, although UIA calls are much faster than screenshot-model-click loops.&lt;/p&gt;

&lt;p&gt;The bigger change was reliability.&lt;/p&gt;

&lt;p&gt;A textbox is no longer "some rectangle near the bottom of the screenshot." It is an &lt;code&gt;Edit&lt;/code&gt; control. A button is no longer "probably the blue thing." It is a control with a name, bounding box, state, and supported patterns.&lt;/p&gt;

&lt;p&gt;Screenshots are still useful. Some apps do not expose good accessibility metadata. Some content is graphical by nature. But for normal desktop and browser workflows, UIA-first makes the assistant feel less like a demo and more like an operator.&lt;/p&gt;

&lt;p&gt;The project is open source here: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;CliGate on GitHub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are building desktop-capable agents, are you starting with screenshots, accessibility trees, browser automation, or something else?&lt;/p&gt;

</description>
      <category>tutorial</category>
      <category>webdev</category>
      <category>node</category>
      <category>ai</category>
    </item>
    <item>
      <title>My AI Assistant Could Code, But It Couldn't Operate My Desktop</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Tue, 26 May 2026 09:51:16 +0000</pubDate>
      <link>https://dev.to/codekingai/my-ai-assistant-could-code-but-it-couldnt-operate-my-desktop-4d97</link>
      <guid>https://dev.to/codekingai/my-ai-assistant-could-code-but-it-couldnt-operate-my-desktop-4d97</guid>
      <description>&lt;p&gt;My assistant could already read files, run shell commands, and delegate coding work to Claude Code or Codex.&lt;/p&gt;

&lt;p&gt;But the moment a workflow hit a real desktop app, the illusion broke.&lt;/p&gt;

&lt;p&gt;A browser needed a click. A page needed a scroll. A field needed real text input. A task could finish the hard part and still get stuck on the last two seconds of UI.&lt;/p&gt;

&lt;p&gt;That felt like a fake kind of automation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem wasn't coding
&lt;/h2&gt;

&lt;p&gt;The hard part here wasn't generating code. It was crossing the gap between "I know what should happen next" and "I can actually operate the window in front of me."&lt;/p&gt;

&lt;p&gt;In practice, that gap showed up in small but annoying ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a browser tab needed &lt;code&gt;Ctrl+L&lt;/code&gt; and a URL paste&lt;/li&gt;
&lt;li&gt;a page exposed no reliable accessibility selector, so a screenshot was needed first&lt;/li&gt;
&lt;li&gt;a long form needed scrolling inside the right pane, not the whole desktop&lt;/li&gt;
&lt;li&gt;a final publish step still depended on one visible button&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the assistant didn't need another coding loop. It needed a safe desktop-control layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  The local control loop I added
&lt;/h2&gt;

&lt;p&gt;I added a small set of desktop tools around a companion agent running on the same machine.&lt;/p&gt;

&lt;p&gt;The assistant can now do things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;list windows&lt;/li&gt;
&lt;li&gt;focus a specific app&lt;/li&gt;
&lt;li&gt;find accessible controls when UI Automation is available&lt;/li&gt;
&lt;li&gt;set input values directly&lt;/li&gt;
&lt;li&gt;send hotkeys like &lt;code&gt;Ctrl+L&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;capture screenshots before pixel-based actions&lt;/li&gt;
&lt;li&gt;click, move, and scroll with explicit coordinates only after visual confirmation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key constraint is simple: &lt;strong&gt;observe first, then act&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If selectors are available, use them. If they are not, capture the window, inspect what is actually visible, and only then click. That rule matters more than any single tool because it keeps desktop automation from turning into random coordinate guessing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changed in the workflow
&lt;/h2&gt;

&lt;p&gt;Before this, the assistant could help me prepare a task but not finish anything that crossed into a real app.&lt;/p&gt;

&lt;p&gt;Now the same local loop can cover more of the actual workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;inspect window → focus app → locate control or capture screenshot → act → verify
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That sounds small, but it changes what "assistant" means in practice.&lt;/p&gt;

&lt;p&gt;It is no longer limited to code and terminal state. It can handle the messy last mile where real work often stalls.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I kept it local
&lt;/h2&gt;

&lt;p&gt;I did not want this running through a hosted browser service or a remote desktop relay.&lt;/p&gt;

&lt;p&gt;Desktop control touches exactly the kind of things that should stay on the machine that owns them: open apps, visible windows, clipboard state, local sessions, and personal accounts.&lt;/p&gt;

&lt;p&gt;Keeping it local also makes the loop faster. The assistant can inspect, act, and verify against the current desktop state without shipping screenshots or UI events to another service first.&lt;/p&gt;

&lt;p&gt;That local-first constraint fits the rest of CliGate anyway. The gateway, the assistant, the runtimes, and now the desktop-control layer all live on the same box.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;p&gt;The interesting lesson was that "assistant capability" is not just about better reasoning or better code generation.&lt;/p&gt;

&lt;p&gt;A lot of workflows fail because the assistant cannot cross boundaries between tools.&lt;/p&gt;

&lt;p&gt;Terminal-only automation is useful. But if the real workflow ends in a browser, a settings window, a login dialog, or a web app form, then desktop control becomes part of the product surface whether you planned for it or not.&lt;/p&gt;

&lt;p&gt;So this update was less about making the assistant smarter and more about making it less incomplete.&lt;/p&gt;

&lt;p&gt;If you're building local AI tooling, where does your automation still stop: at the terminal, at the API, or at the desktop?&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;https://github.com/codeking-ai/cligate&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>My AI Assistant Could Code, But It Couldn't Operate My Desktop</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Tue, 26 May 2026 09:10:44 +0000</pubDate>
      <link>https://dev.to/codekingai/my-ai-assistant-could-code-but-it-couldnt-operate-my-desktop-1log</link>
      <guid>https://dev.to/codekingai/my-ai-assistant-could-code-but-it-couldnt-operate-my-desktop-1log</guid>
      <description>&lt;p&gt;Most AI coding agents are good until the task leaves the terminal.&lt;/p&gt;

&lt;p&gt;They can edit files. They can run tests. They can explain a diff. Then the work hits a desktop app, an OAuth approval screen, a native settings window, or a web UI that was not designed for API access. Suddenly the agent is not stuck on intelligence. It is stuck on reach.&lt;/p&gt;

&lt;p&gt;That was the gap I kept running into while building my local AI setup. I had Claude Code, Codex CLI, Gemini CLI, local models, provider keys, and account pools. The missing piece was not another model.&lt;/p&gt;

&lt;p&gt;It was an operator.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem Was The Boundary
&lt;/h2&gt;

&lt;p&gt;My old workflow had two separate worlds.&lt;/p&gt;

&lt;p&gt;In one world, coding agents lived inside terminals and repos. They could reason about code, run commands, and keep a session alive.&lt;/p&gt;

&lt;p&gt;In the other world, real work still happened through desktop apps, dashboards, browser windows, chat clients, and provider consoles. A human could jump between those worlds without thinking. An agent could not.&lt;/p&gt;

&lt;p&gt;That made the assistant feel smaller than it should:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It could fix a bug, but not always finish the setup.&lt;/li&gt;
&lt;li&gt;It could tell me where to click, but not click safely.&lt;/li&gt;
&lt;li&gt;It could generate a workflow, but not reliably drive the app that owned the workflow.&lt;/li&gt;
&lt;li&gt;It could reuse project knowledge, but only if I remembered to paste it in.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I changed how I think about CliGate.&lt;/p&gt;

&lt;p&gt;CliGate is no longer just a local API gateway for AI tools. It is becoming a local control plane for agent work.&lt;/p&gt;

&lt;h2&gt;
  
  
  What CliGate Does Now
&lt;/h2&gt;

&lt;p&gt;CliGate still starts as one localhost service for AI coding tools.&lt;/p&gt;

&lt;p&gt;You can point Claude Code, Codex CLI, Gemini CLI, and OpenClaw at the same local server, then manage provider keys, account pools, routing, usage, logs, and local runtimes from one dashboard.&lt;/p&gt;

&lt;p&gt;But the newer assistant layer sits above that.&lt;/p&gt;

&lt;p&gt;It has two modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Direct runtime: keep talking to the current Codex or Claude Code session.&lt;/li&gt;
&lt;li&gt;Assistant collaboration: ask CliGate Assistant to inspect state, choose a runtime, continue a task, handle a blocked run, or summarize what happened.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That split matters. I do not want every normal message to be intercepted by a clever supervisor. Sometimes I just want to continue the current runtime session. Other times I want an assistant that can see the bigger picture.&lt;/p&gt;

&lt;p&gt;The assistant is not trying to replace Codex or Claude Code. It coordinates them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Skills Made It Less Generic
&lt;/h2&gt;

&lt;p&gt;The second piece is skills.&lt;/p&gt;

&lt;p&gt;A skill is a local package of instructions, scripts, templates, and references. The assistant does not need every detail in context all the time. It can see a short description first, then read the full &lt;code&gt;SKILL.md&lt;/code&gt; only when the task matches.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;skills/
  devto-publisher/
    SKILL.md
    publish.js
    templates/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That turns the assistant from "a general chat box with tools" into something closer to a teammate with reusable procedures.&lt;/p&gt;

&lt;p&gt;One skill can know how to publish a Dev.to article. Another can know how to build a spreadsheet. Another can know the conventions of a local repo. The key is that these are local, inspectable, and executable through the same permission system as the rest of the agent.&lt;/p&gt;

&lt;p&gt;It is not magic. It is just a better way to keep operational knowledge out of one giant prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Desktop Part Is The Big Unlock
&lt;/h2&gt;

&lt;p&gt;The part I am most excited about is desktop control.&lt;/p&gt;

&lt;p&gt;The first naive version of desktop automation is usually visual: take a screenshot, ask the model where to click, move the mouse, repeat. That works for demos, but it is fragile. Small buttons, focus changes, DPI scaling, popups, and animations can break it.&lt;/p&gt;

&lt;p&gt;CliGate's desktop agent takes a different default path on Windows: UI Automation first, screenshots second.&lt;/p&gt;

&lt;p&gt;Instead of guessing pixels, the assistant can ask the operating system for the UI tree:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;list windows -&amp;gt; focus app -&amp;gt; find input -&amp;gt; set value -&amp;gt; send Enter -&amp;gt; read text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That means it can find a textbox by control type, set its value through the accessibility API, invoke a button, read visible text, and only fall back to screenshots when the app does not expose useful accessibility metadata.&lt;/p&gt;

&lt;p&gt;This is the bridge I wanted: a coding assistant that can work in repos, but also operate the desktop applications that surround the repo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where This Is Going
&lt;/h2&gt;

&lt;p&gt;The current shape is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CliGate routes AI coding tools through one local server.&lt;/li&gt;
&lt;li&gt;Runtime sessions keep Codex and Claude Code work alive.&lt;/li&gt;
&lt;li&gt;The assistant watches, coordinates, and summarizes.&lt;/li&gt;
&lt;li&gt;Skills give it reusable procedures.&lt;/li&gt;
&lt;li&gt;Desktop control gives it a path into native apps and GUI workflows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That combination changes the product from "proxy for AI tools" into "local operator for developer workflows."&lt;/p&gt;

&lt;p&gt;I think the desktop-control layer deserves its own post, because "AI can operate any app through the OS accessibility tree" is a deeper topic than I can fit here.&lt;/p&gt;

&lt;p&gt;The project is open source here: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;CliGate on GitHub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How are you handling the boundary between coding agents and the desktop apps they still need to interact with?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>DeepSeek's API Price Cut Changed My Claude Code and ChatGPT Math</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Mon, 25 May 2026 08:37:24 +0000</pubDate>
      <link>https://dev.to/codekingai/deepseeks-api-price-cut-changed-my-claude-code-and-chatgpt-math-2fln</link>
      <guid>https://dev.to/codekingai/deepseeks-api-price-cut-changed-my-claude-code-and-chatgpt-math-2fln</guid>
      <description>&lt;p&gt;The DeepSeek API price cut made me rethink a habit I had quietly accepted: choosing an AI coding tool and then living with whatever model economics came with it.&lt;/p&gt;

&lt;p&gt;Claude Code is great when I want a strong terminal-native coding agent. ChatGPT and Codex are great when I want OpenAI's workflow and model stack. But when a provider like DeepSeek suddenly drops API pricing, the obvious question is not just "is this cheap?"&lt;/p&gt;

&lt;p&gt;It is: &lt;strong&gt;can I actually use the cheaper model from the tools I already use?&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Price Cut Is The Interesting Part
&lt;/h2&gt;

&lt;p&gt;As of May 25, 2026, DeepSeek's pricing page lists V4 Flash at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$0.14 per 1M input tokens&lt;/li&gt;
&lt;li&gt;$0.0028 per 1M cached input tokens&lt;/li&gt;
&lt;li&gt;$0.28 per 1M output tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It also lists V4 Pro at a 75% discount, with a note that the official API price will stay at one-quarter of the original even after the promotion ends on May 31, 2026:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$0.435 per 1M input tokens&lt;/li&gt;
&lt;li&gt;$0.003625 per 1M cached input tokens&lt;/li&gt;
&lt;li&gt;$0.87 per 1M output tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The part that matters for coding agents is cached input. Coding tools resend a lot of repeated context: system prompts, repo summaries, conversation history, tool schemas, and task state. If cache hits are cheap enough, repeated agent loops start looking very different economically.&lt;/p&gt;

&lt;p&gt;I checked the current public pricing pages before writing this: &lt;a href="https://api-docs.deepseek.com/quick_start/pricing/" rel="noopener noreferrer"&gt;DeepSeek API pricing&lt;/a&gt;, &lt;a href="https://claude.com/pricing" rel="noopener noreferrer"&gt;Claude plans&lt;/a&gt;, &lt;a href="https://platform.claude.com/docs/en/about-claude/models/overview" rel="noopener noreferrer"&gt;Claude API models&lt;/a&gt;, &lt;a href="https://chatgpt.com/pricing/" rel="noopener noreferrer"&gt;ChatGPT plans&lt;/a&gt;, and &lt;a href="https://openai.com/api/pricing/" rel="noopener noreferrer"&gt;OpenAI API pricing&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Cheap cached input is why this cut is more than a nice model announcement. It changes where I want routine coding traffic to go.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Comparison I Actually Care About
&lt;/h2&gt;

&lt;p&gt;Claude Code pricing is predictable if you use a subscription: Claude Pro is $20/month when billed monthly, and Max starts at $100/month. On the API side, Anthropic lists Claude Opus 4.7 at $5 input and $25 output per 1M tokens, and Sonnet 4.6 at $3 input and $15 output.&lt;/p&gt;

&lt;p&gt;ChatGPT has the same split. Plus is the familiar $20/month plan, Pro tiers go much higher, and OpenAI's API rates for flagship GPT models still look like premium infrastructure pricing. GPT-5.5 is listed at $5 input, $0.50 cached input, and $30 output per 1M tokens.&lt;/p&gt;

&lt;p&gt;Those plans can be worth it. I am not pretending DeepSeek replaces every hard reasoning workload.&lt;/p&gt;

&lt;p&gt;But for coding-agent traffic, the uncomfortable truth is that a lot of tokens are not "hard reasoning" tokens. They are spent on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;reading files&lt;/li&gt;
&lt;li&gt;rewriting boilerplate&lt;/li&gt;
&lt;li&gt;producing test scaffolds&lt;/li&gt;
&lt;li&gt;formatting docs&lt;/li&gt;
&lt;li&gt;classifying intent&lt;/li&gt;
&lt;li&gt;continuing a known task&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is exactly the kind of traffic I want to route to a cheaper model first.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Annoying Part: Tools Do Not Make This Easy
&lt;/h2&gt;

&lt;p&gt;The problem is that Claude Code, Codex, and ChatGPT-style workflows do not all speak the same protocol.&lt;/p&gt;

&lt;p&gt;Claude Code expects Anthropic-shaped requests.&lt;/p&gt;

&lt;p&gt;Codex expects OpenAI-shaped requests.&lt;/p&gt;

&lt;p&gt;Other tools may expect Gemini-style routes or their own local configuration. So even when DeepSeek exposes low-cost models, the practical setup can still turn into a mess of environment variables, API keys, base URLs, and wrappers.&lt;/p&gt;

&lt;p&gt;That is the gap I built CliGate to fill.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changed With CliGate
&lt;/h2&gt;

&lt;p&gt;CliGate is a local AI gateway that runs on &lt;code&gt;localhost&lt;/code&gt;. Instead of pointing every tool directly at a provider, I point the tools at CliGate once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude Code&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:8081
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;any-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Codex can also point at the same local gateway through its OpenAI-compatible configuration.&lt;/p&gt;

&lt;p&gt;From there, CliGate handles the important layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;route Claude Code, Codex CLI, Gemini CLI, and web chat through one local control plane&lt;/li&gt;
&lt;li&gt;keep account pools and API keys in the same routing layer&lt;/li&gt;
&lt;li&gt;map model names and app-level routes&lt;/li&gt;
&lt;li&gt;send routine traffic to DeepSeek when cost matters&lt;/li&gt;
&lt;li&gt;keep premium models available for the tasks that actually need them&lt;/li&gt;
&lt;li&gt;show usage, request logs, and cost views in the dashboard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means I do not have to decide "Claude Code or DeepSeek" as a tool choice. I can keep Claude Code as the interface and route some of its traffic through DeepSeek. I can keep Codex as the workflow and still move compatible requests to a cheaper upstream.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Advantage Is Not Just Cheap Tokens
&lt;/h2&gt;

&lt;p&gt;Cheap tokens help. But the bigger advantage is optionality.&lt;/p&gt;

&lt;p&gt;I want to be able to say:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use DeepSeek V4 Flash for cheap routine work&lt;/li&gt;
&lt;li&gt;use DeepSeek V4 Pro when I want stronger low-cost reasoning&lt;/li&gt;
&lt;li&gt;keep Claude for difficult multi-file edits&lt;/li&gt;
&lt;li&gt;keep GPT for workflows where OpenAI's stack is the right fit&lt;/li&gt;
&lt;li&gt;keep local models for private or offline tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without a routing layer, that sounds like a spreadsheet and a pile of config files. With a local gateway, it becomes an operations problem: add keys, set routing, inspect usage, adjust when the bill or quality tells you to.&lt;/p&gt;
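
&lt;p&gt;As a rough illustration of what "set routing" could look like, here is a small rule table in plain JavaScript. The task kinds, the model labels, and the &lt;code&gt;pickUpstream&lt;/code&gt; function are all hypothetical; this is not CliGate's actual configuration format.&lt;/p&gt;

```javascript
// Hypothetical routing rules: task kind to upstream model label.
// None of these names come from CliGate itself.
const ROUTES = new Map([
  ["boilerplate", "deepseek-v4-flash"],
  ["test-scaffold", "deepseek-v4-flash"],
  ["low-cost-reasoning", "deepseek-v4-pro"],
  ["multi-file-edit", "claude"],
  ["private", "local-model"],
]);

// Cheapest option first: anything without an explicit rule lands here.
const DEFAULT_UPSTREAM = "deepseek-v4-flash";

function pickUpstream(taskKind) {
  return ROUTES.has(taskKind) ? ROUTES.get(taskKind) : DEFAULT_UPSTREAM;
}

console.log(pickUpstream("multi-file-edit")); // claude
console.log(pickUpstream("formatting-docs")); // deepseek-v4-flash
```

&lt;p&gt;The design choice worth noticing is the default: unknown work goes to the cheap model, and only explicitly listed kinds get promoted to a premium upstream.&lt;/p&gt;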

&lt;p&gt;That is the product advantage I care about. CliGate does not ask me to abandon Claude Code or ChatGPT-style tools. It lets those tools reach low-cost DeepSeek models without changing how I work.&lt;/p&gt;

&lt;h2&gt;
  
  
  My New Default
&lt;/h2&gt;

&lt;p&gt;After this price cut, my default is no longer "pick one premium coding assistant and pay whatever it costs."&lt;/p&gt;

&lt;p&gt;It is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;keep the coding tools I like&lt;/li&gt;
&lt;li&gt;route routine traffic to the cheapest good-enough model&lt;/li&gt;
&lt;li&gt;reserve expensive models for the tasks that justify them&lt;/li&gt;
&lt;li&gt;watch usage and pricing in one place&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That feels like the right shape for AI coding in 2026.&lt;/p&gt;

&lt;p&gt;The models will keep changing. The prices will definitely keep changing. The part I do not want to keep changing is every CLI config on my machine.&lt;/p&gt;

&lt;p&gt;CliGate is here if you want to inspect the implementation: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;https://github.com/codeking-ai/cligate&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How are you handling model cost now: one subscription, direct API usage, or routing per task?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>"My Web Chat Wasn't a Real Channel. That Broke My Agent Pipeline"</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Fri, 22 May 2026 11:35:09 +0000</pubDate>
      <link>https://dev.to/codekingai/my-web-chat-wasnt-a-real-channel-that-broke-my-agent-pipeline-11ed</link>
      <guid>https://dev.to/codekingai/my-web-chat-wasnt-a-real-channel-that-broke-my-agent-pipeline-11ed</guid>
      <description>&lt;p&gt;I thought my web chat was the simplest surface in the whole product.&lt;/p&gt;

&lt;p&gt;Telegram, Feishu, and DingTalk were the complicated ones. Web chat was just the dashboard. Same browser, same server, same app. What could possibly go wrong?&lt;/p&gt;

&lt;p&gt;A lot, apparently.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bug looked random from the UI
&lt;/h2&gt;

&lt;p&gt;A task would start from the web chat UI just fine.&lt;/p&gt;

&lt;p&gt;The runtime session existed. The conversation existed. The task existed. The logs looked healthy enough.&lt;/p&gt;

&lt;p&gt;And then the delivery pipeline tried to send a follow-up update back into the conversation and got:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;conversation_not_found
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which made no sense, because the conversation definitely existed. I had just used it.&lt;/p&gt;

&lt;p&gt;This is the kind of bug that wastes time because every individual subsystem looks half-correct.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real problem: I treated web chat like a page, not a channel
&lt;/h2&gt;

&lt;p&gt;The architecture in &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;CliGate&lt;/a&gt; already had a channel model.&lt;/p&gt;

&lt;p&gt;Telegram is a channel. Feishu is a channel. DingTalk is a channel.&lt;/p&gt;

&lt;p&gt;Inbound messages from those channels all go through the same supervision and delivery machinery:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;conversation store&lt;/li&gt;
&lt;li&gt;scheduler&lt;/li&gt;
&lt;li&gt;delivery sender&lt;/li&gt;
&lt;li&gt;assistant orchestration&lt;/li&gt;
&lt;li&gt;runtime session binding&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But web chat had slowly drifted into a special-case path.&lt;/p&gt;

&lt;p&gt;That felt harmless at first. Web chat lived inside the same app, so it was easy to give it a little custom state and a few convenience wrappers.&lt;/p&gt;

&lt;p&gt;That was the mistake.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually broke
&lt;/h2&gt;

&lt;p&gt;The old version of &lt;code&gt;chat-ui/conversation-store.js&lt;/code&gt; exported its own store instance.&lt;/p&gt;

&lt;p&gt;Meanwhile, the delivery and orchestration path used the shared channel conversation store.&lt;/p&gt;

&lt;p&gt;So both sides were reading and writing "conversations," but not the same in-memory array.&lt;/p&gt;

&lt;p&gt;That meant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the chat UI could create a conversation&lt;/li&gt;
&lt;li&gt;the route handler could see it&lt;/li&gt;
&lt;li&gt;the runtime could bind to it&lt;/li&gt;
&lt;li&gt;but the scheduler could still fail to find it later&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The comments in the fix say it more plainly than I can:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// chat-ui and agent-channels each held a SEPARATE in-memory&lt;/span&gt;
&lt;span class="c1"&gt;// `conversations` array, even though both wrote to the same JSON file on disk.&lt;/span&gt;
&lt;span class="c1"&gt;// After server start, a chat-ui conversation created at runtime was visible to&lt;/span&gt;
&lt;span class="c1"&gt;// chat-ui-route but NOT to message-service, so scheduler deliveries hit&lt;/span&gt;
&lt;span class="c1"&gt;// `conversation_not_found` and silently dropped notifications.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is exactly the sort of bug you get when a "small UI shortcut" quietly forks your domain model.&lt;/p&gt;
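
&lt;p&gt;The pattern is easy to reproduce in a few lines. This is a stripped-down sketch, not the real store: &lt;code&gt;createStore&lt;/code&gt; and the in-memory &lt;code&gt;disk&lt;/code&gt; object are illustrative stand-ins for the actual module and the JSON file.&lt;/p&gt;

```javascript
// Minimal reproduction of the pattern: two store instances, one "file".
// createStore and the disk object are illustrative stand-ins.
const disk = { conversations: [] };

function createStore() {
  // Each instance loads its own in-memory copy at startup.
  const conversations = [...disk.conversations];
  return {
    create(id) {
      const conversation = { id };
      conversations.push(conversation);
      disk.conversations.push(conversation); // every instance persists to the same place
      return conversation;
    },
    find(id) {
      return conversations.find(function (c) { return c.id === id; }) || null;
    },
  };
}

const chatUiStore = createStore();  // used by the web chat route
const channelStore = createStore(); // used by delivery and the scheduler

chatUiStore.create("web-1");

console.log(chatUiStore.find("web-1") !== null);  // true
console.log(channelStore.find("web-1") !== null); // false, i.e. conversation_not_found
console.log(disk.conversations.length);           // 1, so the data "exists" on disk
```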

&lt;h2&gt;
  
  
  The fix was not complicated
&lt;/h2&gt;

&lt;p&gt;I did not need a new abstraction.&lt;/p&gt;

&lt;p&gt;I needed one source of truth.&lt;/p&gt;

&lt;p&gt;Instead of exporting a dedicated chat-ui conversation store in production, I attached chat-specific helpers to the shared singleton used by the channel system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;installChatUiHelpers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;agentChannelConversationStore&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chatUiConversationStore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;agentChannelConversationStore&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That one change matters more than it looks.&lt;/p&gt;

&lt;p&gt;Now web chat is not pretending to be adjacent to the channel system. It is part of the channel system.&lt;/p&gt;
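
&lt;p&gt;For readers who have not seen this pattern, here is a minimal sketch of what "attach helpers to the shared singleton" can mean. The body of &lt;code&gt;installChatUiHelpers&lt;/code&gt; and the &lt;code&gt;createChatUiConversation&lt;/code&gt; method are my assumptions for illustration; the real implementation in the repo may look different.&lt;/p&gt;

```javascript
// Shared singleton, standing in for agentChannelConversationStore.
const sharedStore = {
  conversations: [],
  find(id) {
    return this.conversations.find(function (c) { return c.id === id; }) || null;
  },
};

// Hypothetical helper installer: adds chat-specific methods to the
// existing store instead of creating a second store instance.
function installChatUiHelpers(store) {
  store.createChatUiConversation = function (id) {
    const conversation = { id, channel: "chat-ui" };
    store.conversations.push(conversation);
    return conversation;
  };
  return store;
}

installChatUiHelpers(sharedStore);
const chatUiConversationStore = sharedStore; // same object, second name

chatUiConversationStore.createChatUiConversation("web-1");

// The delivery side looks up through the shared store and now finds it.
console.log(sharedStore.find("web-1").channel);       // chat-ui
console.log(chatUiConversationStore === sharedStore); // true
```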

&lt;h2&gt;
  
  
  Why this changed more than delivery
&lt;/h2&gt;

&lt;p&gt;Once I stopped treating web chat as a special page, a lot of other decisions became cleaner.&lt;/p&gt;

&lt;p&gt;A chat-ui conversation now behaves like a real peer of the other channels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it has the same conversation identity model&lt;/li&gt;
&lt;li&gt;it uses the same assistant delivery state&lt;/li&gt;
&lt;li&gt;it flows through the same runtime binding logic&lt;/li&gt;
&lt;li&gt;it can receive scheduler-driven updates without weird bridging code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That matters because a multi-surface assistant only stays sane if all entry points agree on what a conversation is.&lt;/p&gt;

&lt;p&gt;If one surface has its own special rules, you do not have one product anymore. You have one product plus one exception that keeps leaking.&lt;/p&gt;

&lt;h2&gt;
  
  
  The other important fix: seed assistant mode from the start
&lt;/h2&gt;

&lt;p&gt;There was a second detail hidden in the same file.&lt;/p&gt;

&lt;p&gt;New chat-ui conversations now start with assistant control mode already set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;assistantCore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildAssistantCoreDeliveryState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;existingAssistantCore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;controlMode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;CONVERSATION_ASSISTANT_CONTROL_MODE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ASSISTANT&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That matters because web chat should enter the same top-level assistant orchestration path as the messaging channels.&lt;/p&gt;

&lt;p&gt;If the first message from the web UI skips that and goes straight to the bound runtime, you get behavioral drift:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;web chat behaves one way&lt;/li&gt;
&lt;li&gt;Telegram behaves another way&lt;/li&gt;
&lt;li&gt;Feishu behaves another way&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then every bug becomes impossible to reason about because the surfaces are no longer comparable.&lt;/p&gt;

&lt;h2&gt;
  
  
  The test I actually wanted
&lt;/h2&gt;

&lt;p&gt;I have learned to distrust fixes like this unless there is a test that proves the behavioral contract.&lt;/p&gt;

&lt;p&gt;The right question was not "does the chat-ui store still work?"&lt;/p&gt;

&lt;p&gt;The right question was:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;does a chat-ui conversation participate in assistant behavior like a real channel?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is why the surrounding tests focus on assistant-mode behavior and persisted conversation semantics instead of only checking helper methods in isolation.&lt;/p&gt;

&lt;p&gt;The implementation detail was a store instance bug.&lt;/p&gt;

&lt;p&gt;The product bug was channel inconsistency.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned from this
&lt;/h2&gt;

&lt;p&gt;When you build multi-channel agent systems, the browser UI is seductive.&lt;/p&gt;

&lt;p&gt;It feels local. It feels simple. It feels close enough to the app that you can justify giving it custom flow control, custom state, or custom routing.&lt;/p&gt;

&lt;p&gt;That instinct is expensive.&lt;/p&gt;

&lt;p&gt;If the browser chat can start tasks, receive async updates, carry conversation identity, and interact with the same supervisor as your mobile or messaging surfaces, then it is not "just a page."&lt;/p&gt;

&lt;p&gt;It is a channel.&lt;/p&gt;

&lt;p&gt;And if you do not model it that way, the architecture will eventually make you pay for the lie.&lt;/p&gt;

&lt;p&gt;Are you treating your web chat as a first-class channel, or as a special case that has not failed loudly yet?&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;github.com/codeking-ai/cligate&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>"My Coding Agent Remembered Sessions, Not Work. That Was the Bug"</title>
      <dc:creator>CodeKing</dc:creator>
      <pubDate>Thu, 21 May 2026 03:46:07 +0000</pubDate>
      <link>https://dev.to/codekingai/my-coding-agent-remembered-sessions-not-work-that-was-the-bug-1n3e</link>
      <guid>https://dev.to/codekingai/my-coding-agent-remembered-sessions-not-work-that-was-the-bug-1n3e</guid>
      <description>&lt;p&gt;The first version of my coding agent had a very common bug: it remembered the conversation, but not the work.&lt;/p&gt;

&lt;p&gt;That sounds fine until the agent has to do something real.&lt;/p&gt;

&lt;p&gt;I would start a task from the web UI, continue it from a mobile channel, approve one command, ask for progress later, and then discover that the system was mostly guessing from the last few messages. It knew there was a session. It did not really know what job that session belonged to.&lt;/p&gt;

&lt;p&gt;That is the difference between a chatbot and a working assistant.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem Was The Unit Of Memory
&lt;/h2&gt;

&lt;p&gt;Most agent systems begin with a simple shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;conversation -&amp;gt; runtime session -&amp;gt; messages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That works for demos because the user does one thing at a time.&lt;/p&gt;

&lt;p&gt;It breaks when the user behaves normally:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"continue the routing task"&lt;/li&gt;
&lt;li&gt;"use Claude Code to review what Codex just changed"&lt;/li&gt;
&lt;li&gt;"what happened with the thing from yesterday?"&lt;/li&gt;
&lt;li&gt;"retry that, but keep the same working directory"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of those are really about a chat session. They are about work.&lt;/p&gt;

&lt;p&gt;A runtime session can crash. A user can switch from web to Telegram or Feishu. Two agents can work on the same issue from different roles. If the system treats the runtime session as the main identity, every one of those cases becomes fragile.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fix: Split Work From Execution
&lt;/h2&gt;

&lt;p&gt;In CliGate, I started moving the design toward a different model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Person
  -&amp;gt; Project
    -&amp;gt; Task
      -&amp;gt; Execution
        -&amp;gt; RuntimeSession
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important part is not the diagram. It is the boundary.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;Task&lt;/strong&gt; is the thing the user thinks they are doing: "fix routing", "review the auth change", "write release notes", "check why the build failed".&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;Execution&lt;/strong&gt; is one concrete attempt to move that task forward. It may be Codex acting as the editor, Claude Code acting as a reviewer, or another provider doing a focused job.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;RuntimeSession&lt;/strong&gt; is just the current process or provider session underneath that execution.&lt;/p&gt;

&lt;p&gt;That means the assistant can say: this is still the same task, even if the runtime process has changed.&lt;/p&gt;
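
&lt;p&gt;Written as plain objects, the hierarchy might look like the sketch below. Every field name here is an assumption made for illustration, not the schema CliGate uses.&lt;/p&gt;

```javascript
// Illustrative shapes only; every field name here is an assumption.
const person = { id: "person-1", projects: [] };
const project = { id: "project-1", personId: person.id, tasks: [] };
const task = { id: "task-1", projectId: project.id, title: "fix routing", executions: [] };
const execution = {
  id: "exec-1",
  taskId: task.id,
  provider: "codex",
  role: "editor",
  runtimeSessionId: "rt-1",
};

person.projects.push(project);
project.tasks.push(task);
task.executions.push(execution);

// The runtime session can be swapped without touching the task identity.
execution.runtimeSessionId = "rt-2";

console.log(task.id, execution.taskId, execution.runtimeSessionId);
// prints: task-1 task-1 rt-2
```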

&lt;h2&gt;
  
  
  Why This Matters In Real Use
&lt;/h2&gt;

&lt;p&gt;The most annoying bugs came from follow-ups.&lt;/p&gt;

&lt;p&gt;When I typed:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;make the button green&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I did not mean "start an unrelated new job." I meant "continue the last task with the same context."&lt;/p&gt;

&lt;p&gt;When I typed:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;use cc to review it too&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I did not mean "replace the current agent." I meant "spawn a second execution under the same task, with a reviewer role."&lt;/p&gt;

&lt;p&gt;Those two messages look similar if all you have is chat history. They are very different if the system has a task model.&lt;/p&gt;
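
&lt;p&gt;Here is a small sketch of how those two follow-ups could land once a task model exists. It assumes an earlier classification step (not shown) has already labeled each message with an intent; the intent names and &lt;code&gt;handleFollowUp&lt;/code&gt; are hypothetical.&lt;/p&gt;

```javascript
// Hypothetical follow-up handling. The intent labels are assumed to come
// from an earlier classification step that is not shown here.
const task = {
  id: "task-1",
  title: "update the settings page",
  executions: [{ id: "exec-1", provider: "codex", role: "editor", messages: [] }],
};

function handleFollowUp(currentTask, followUp) {
  if (followUp.intent === "continue") {
    // Same task, same execution: append to the existing context.
    const active = currentTask.executions[0];
    active.messages.push(followUp.text);
    return active;
  }
  if (followUp.intent === "add_reviewer") {
    // Same task, new execution with a different role.
    const reviewer = {
      id: "exec-" + (currentTask.executions.length + 1),
      provider: followUp.provider,
      role: "reviewer",
      messages: [followUp.text],
    };
    currentTask.executions.push(reviewer);
    return reviewer;
  }
  throw new Error("unknown intent: " + followUp.intent);
}

handleFollowUp(task, { intent: "continue", text: "make the button green" });
handleFollowUp(task, { intent: "add_reviewer", provider: "claude-code", text: "use cc to review it too" });

console.log(task.executions.map(function (e) { return e.role; }).join(", "));
// prints: editor, reviewer
```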

&lt;p&gt;Once the assistant can distinguish task identity from execution identity, a few things become much easier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;status questions can be answered from task state&lt;/li&gt;
&lt;li&gt;provider preference can follow the work instead of the channel&lt;/li&gt;
&lt;li&gt;a dead runtime can be replaced without pretending the task is new&lt;/li&gt;
&lt;li&gt;multiple agents can collaborate without sharing one messy transcript&lt;/li&gt;
&lt;li&gt;web UI and mobile channels can show different levels of detail&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point surprised me. On mobile, I want a short answer: "Codex is waiting for approval." In the web UI, I may want the full timeline: user message, assistant decision, runtime event, command output, file changes, approval, result.&lt;/p&gt;

&lt;p&gt;Same task. Different presentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Rule I Wish I Had Started With
&lt;/h2&gt;

&lt;p&gt;If the user can reasonably ask "what happened with that thing?", that thing deserves an identity outside the chat transcript.&lt;/p&gt;

&lt;p&gt;For my project, that identity became &lt;code&gt;Task&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The runtime session is still useful. It preserves provider context and lets the agent resume efficiently. But it should not be the thing the product uses to understand the user's work.&lt;/p&gt;

&lt;p&gt;Sessions are implementation details. Work is the product surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;I am still iterating on the architecture, but the direction has already cleaned up several design decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;follow-ups route to tasks, not just the latest session&lt;/li&gt;
&lt;li&gt;retries can keep the same task identity&lt;/li&gt;
&lt;li&gt;reviewer agents can attach to the same task as editor agents&lt;/li&gt;
&lt;li&gt;approvals can be remembered at task or project scope&lt;/li&gt;
&lt;li&gt;channel messages can stay short without losing full traceability in the dashboard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This also made failure handling less awkward. If a runtime dies, the assistant does not need to tell the user "your session is gone, please start over." It can start a new runtime under the same execution or create a fresh execution under the same task, depending on what actually failed.&lt;/p&gt;

&lt;p&gt;That is a small implementation detail with a large UX effect.&lt;/p&gt;
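
&lt;p&gt;A sketch of that recovery decision, under stated assumptions: the failure kinds, the &lt;code&gt;runtimeGeneration&lt;/code&gt; counter, and the &lt;code&gt;recover&lt;/code&gt; function are invented for this example and are not taken from the CliGate codebase.&lt;/p&gt;

```javascript
// Hypothetical recovery policy; the failure kinds are assumptions.
function recover(task, executionId, failureKind) {
  const execution = task.executions.find(function (e) { return e.id === executionId; });
  if (!execution) {
    throw new Error("unknown execution: " + executionId);
  }
  if (failureKind === "runtime_crashed") {
    // The attempt is still valid: only the process underneath is replaced.
    execution.runtimeGeneration += 1;
    return execution;
  }
  // Anything else: mark the attempt as failed and start a fresh one
  // under the same task.
  execution.status = "failed";
  const fresh = {
    id: "exec-" + (task.executions.length + 1),
    provider: execution.provider,
    status: "running",
    runtimeGeneration: 1,
  };
  task.executions.push(fresh);
  return fresh;
}

const task = {
  id: "task-1",
  executions: [{ id: "exec-1", provider: "codex", status: "running", runtimeGeneration: 1 }],
};

const sameExecution = recover(task, "exec-1", "runtime_crashed");
const newExecution = recover(task, "exec-1", "context_corrupted");

console.log(sameExecution.id, sameExecution.runtimeGeneration); // exec-1 2
console.log(newExecution.id, task.id);                          // exec-2 task-1
```

&lt;p&gt;In both branches the task id never changes, which is the whole point: the user keeps talking about the same piece of work.&lt;/p&gt;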

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;I used to think agent memory meant better summaries of previous messages.&lt;/p&gt;

&lt;p&gt;Now I think the more important question is: what are you summarizing &lt;em&gt;into&lt;/em&gt;?&lt;/p&gt;

&lt;p&gt;If everything collapses back into a conversation, the assistant will eventually lose the shape of the work. If the product has explicit projects, tasks, executions, and runtime sessions, the agent has somewhere stable to put its memory.&lt;/p&gt;

&lt;p&gt;That has become one of the design principles behind &lt;a href="https://github.com/codeking-ai/cligate" rel="noopener noreferrer"&gt;CliGate&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you are building coding agents, how are you modeling the difference between a conversation, a task, and a runtime session?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
