<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vyacheslav Mayorskiy</title>
    <description>The latest articles on DEV Community by Vyacheslav Mayorskiy (@vmayorskiyac).</description>
    <link>https://dev.to/vmayorskiyac</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3619098%2F1cdca48f-cdfe-4f75-9c2c-a0ecfb244204.png</url>
      <title>DEV Community: Vyacheslav Mayorskiy</title>
      <link>https://dev.to/vmayorskiyac</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vmayorskiyac"/>
    <language>en</language>
    <item>
      <title>Databricks Integration Deep Dive: Teaching Aye Chat to Talk to Your Endpoint</title>
      <dc:creator>Vyacheslav Mayorskiy</dc:creator>
      <pubDate>Tue, 03 Mar 2026 16:49:00 +0000</pubDate>
      <link>https://dev.to/vmayorskiyac/databricks-integration-deep-dive-teaching-aye-chat-to-talk-to-your-endpoint-4ke8</link>
      <guid>https://dev.to/vmayorskiyac/databricks-integration-deep-dive-teaching-aye-chat-to-talk-to-your-endpoint-4ke8</guid>
      <description>&lt;p&gt;Databricks has a nice party trick: it can host an LLM endpoint that &lt;em&gt;looks&lt;/em&gt; like an OpenAI chat-completions API.&lt;/p&gt;

&lt;p&gt;Aye Chat has a different party trick: it can treat LLMs like a backend detail, as long as it gets a structured JSON response it can render and (optionally) apply to files.&lt;/p&gt;

&lt;p&gt;This post is the handshake between the two.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Scope note: this is a &lt;strong&gt;deep dive of the current implementation&lt;/strong&gt; of Aye Chat’s Databricks plugin (&lt;code&gt;DatabricksModelPlugin&lt;/code&gt;). It’s not a general “Databricks + LLMs” tutorial.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Where the integration lives
&lt;/h2&gt;

&lt;p&gt;The Databricks integration is implemented as a model plugin:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;plugins/databricks_model.py&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Class:&lt;/strong&gt; &lt;code&gt;DatabricksModelPlugin&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Main hook:&lt;/strong&gt; &lt;code&gt;on_command()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Command it intercepts:&lt;/strong&gt; &lt;code&gt;local_model_invoke&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Aye Chat’s plugin architecture, model plugins can intercept &lt;code&gt;local_model_invoke&lt;/code&gt; and return an LLM response object that the rest of the app can render and/or apply.&lt;/p&gt;




&lt;h2&gt;
  
  
  Activation: the plugin is silent unless you invite it
&lt;/h2&gt;

&lt;p&gt;This plugin has strong “don’t bother me unless configured” energy.&lt;/p&gt;

&lt;p&gt;It only activates when &lt;strong&gt;both&lt;/strong&gt; env vars are present:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;AYE_DBX_API_URL&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;AYE_DBX_API_KEY&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The check is centralized:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_is_databricks_configured&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AYE_DBX_API_URL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AYE_DBX_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If not configured, &lt;code&gt;on_command()&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt; and Aye Chat falls back to other model backends.&lt;/li&gt;
&lt;li&gt;Users who don’t care about Databricks get true “zero config.”&lt;/li&gt;
&lt;li&gt;Users who &lt;em&gt;do&lt;/em&gt; care can opt in with two env vars and zero ceremony.&lt;/li&gt;
&lt;/ul&gt;
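&lt;p&gt;As a sketch, the gate plus fallback looks like this. The &lt;code&gt;on_command&lt;/code&gt; skeleton below is a simplified stand-in for the real hook, not the plugin’s actual code:&lt;/p&gt;

```python
import os

def _is_databricks_configured() -> bool:
    # Both variables must be set to non-empty strings for the plugin to activate.
    return bool(os.environ.get("AYE_DBX_API_URL") and os.environ.get("AYE_DBX_API_KEY"))

def on_command(command: str, payload: dict):
    # Simplified skeleton: returning None tells Aye Chat to fall back
    # to the next model backend in line.
    if command != "local_model_invoke" or not _is_databricks_configured():
        return None
    return {"handled_by": "DatabricksModelPlugin"}  # the real hook returns an LLM response
```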




&lt;h2&gt;
  
  
  Configuration and model selection
&lt;/h2&gt;

&lt;p&gt;The plugin expects the Databricks endpoint to behave like an &lt;strong&gt;OpenAI-compatible chat completions API&lt;/strong&gt; (either natively or behind a proxy).&lt;/p&gt;

&lt;h3&gt;
  
  
  Required
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;AYE_DBX_API_URL&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full URL to &lt;code&gt;POST&lt;/code&gt; chat completion requests.&lt;/li&gt;
&lt;li&gt;Example (illustrative):&lt;/li&gt;
&lt;li&gt;&lt;code&gt;https://&amp;lt;workspace-host&amp;gt;/serving-endpoints/&amp;lt;endpoint&amp;gt;/invocations&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;code&gt;AYE_DBX_API_KEY&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bearer token used for &lt;code&gt;Authorization&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Optional
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AYE_DBX_MODEL&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Defaults to: &lt;code&gt;gpt-3.5-turbo&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Yes, the default says &lt;code&gt;gpt-3.5-turbo&lt;/code&gt;. No, this plugin isn’t emotionally attached to that string. It just needs a &lt;code&gt;model&lt;/code&gt; field to put in the payload.&lt;/p&gt;
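&lt;p&gt;Model resolution reduces to a one-liner. &lt;code&gt;resolve_model_name&lt;/code&gt; is a hypothetical helper name; the behavior matches what the plugin does with &lt;code&gt;AYE_DBX_MODEL&lt;/code&gt;:&lt;/p&gt;

```python
import os

DEFAULT_MODEL = "gpt-3.5-turbo"  # just a default payload value, not a capability claim

def resolve_model_name() -> str:
    # AYE_DBX_MODEL wins when set; otherwise the placeholder default is used.
    return os.environ.get("AYE_DBX_MODEL") or DEFAULT_MODEL
```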




&lt;h2&gt;
  
  
  Lifecycle: &lt;code&gt;new_chat&lt;/code&gt; vs &lt;code&gt;local_model_invoke&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;The plugin handles two command names.&lt;/p&gt;

&lt;h3&gt;
  
  
  1) &lt;code&gt;new_chat&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;If configured, &lt;code&gt;new_chat&lt;/code&gt; resets conversation state by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deleting &lt;code&gt;.aye/chat_history.json&lt;/code&gt; (Databricks plugin history file)&lt;/li&gt;
&lt;li&gt;resetting in-memory &lt;code&gt;self.chat_history&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Translation: when you start a fresh session in Aye Chat, the Databricks plugin doesn’t cling to the past.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) &lt;code&gt;local_model_invoke&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is the main inference path:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Load existing history from disk&lt;/li&gt;
&lt;li&gt;Build the message list (system + history + new user message)&lt;/li&gt;
&lt;li&gt;POST to the Databricks endpoint&lt;/li&gt;
&lt;li&gt;Extract JSON from the model output (even if it tries to write a novel first)&lt;/li&gt;
&lt;li&gt;Store lightweight history (no repeated file contents)&lt;/li&gt;
&lt;li&gt;Return a parsed LLM response (with optional token usage)&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Prompt construction: one message for the model, another for history
&lt;/h2&gt;

&lt;p&gt;Aye Chat can include repo context (files / RAG snippets) in the prompt. The Databricks plugin uses &lt;strong&gt;two representations&lt;/strong&gt; of the same idea.&lt;/p&gt;

&lt;h3&gt;
  
  
  A) The full user message (sent to the API)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;user_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_user_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source_files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the user’s prompt&lt;/li&gt;
&lt;li&gt;the full contents of &lt;code&gt;source_files&lt;/code&gt; (as gathered by Aye Chat)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s what the model needs to actually do the work.&lt;/p&gt;

&lt;h3&gt;
  
  
  B) The lightweight history message (saved to disk)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;history_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;build_history_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;source_files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This stores a compact representation (typically prompt + filenames), not full file contents.&lt;/p&gt;

&lt;p&gt;Why: if you store full file contents in history on every request, &lt;code&gt;.aye/chat_history.json&lt;/code&gt; turns into a data hoarder. Performance degrades, diffs get silly, and you start paying a storage bill for your own laziness.&lt;/p&gt;

&lt;p&gt;This design is unglamorous, and therefore correct.&lt;/p&gt;
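&lt;p&gt;To make the contrast concrete, here is a minimal sketch of the two builders. The function names come from the plugin; the bodies are illustrative assumptions, not the actual implementations:&lt;/p&gt;

```python
def build_user_message(prompt: str, source_files: dict) -> str:
    # Full message sent to the API: prompt plus complete file contents.
    parts = [prompt]
    for name in sorted(source_files):
        parts.append(f"--- {name} ---\n{source_files[name]}")
    return "\n\n".join(parts)

def build_history_message(prompt: str, source_files: dict) -> str:
    # Compact message saved to disk: prompt plus filenames only.
    if not source_files:
        return prompt
    return f"{prompt}\n[files: {', '.join(sorted(source_files))}]"
```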




&lt;h2&gt;
  
  
  System prompt behavior
&lt;/h2&gt;

&lt;p&gt;The plugin uses the shared &lt;code&gt;SYSTEM_PROMPT&lt;/code&gt; by default (imported from &lt;code&gt;aye.model.config&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;But the invoker can override it per request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;effective_system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the plugin builds OpenAI-style messages with the system prompt first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;effective_system_prompt&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
  &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;conv_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_message&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The request payload (OpenAI-style, by design)
&lt;/h2&gt;

&lt;p&gt;Headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Content-Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Payload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;temperature&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;max_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;max_output_tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notable behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The plugin assumes an OpenAI-like schema: &lt;code&gt;model&lt;/code&gt;, &lt;code&gt;messages&lt;/code&gt;, &lt;code&gt;temperature&lt;/code&gt;, &lt;code&gt;max_tokens&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Timeout is generous:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;LLM_TIMEOUT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;600.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because sometimes the model needs a minute. And sometimes it needs a minute to think, plus another minute to dramatically clear its throat.&lt;/p&gt;
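&lt;p&gt;Putting the pieces together, the request construction can be sketched as a single builder. &lt;code&gt;build_request&lt;/code&gt; is a hypothetical helper; the header and payload shapes match what the plugin sends, and the actual HTTP call is shown only as a comment since it needs a live endpoint:&lt;/p&gt;

```python
LLM_TIMEOUT = 600.0  # seconds -- generous on purpose

def build_request(messages: list, model_name: str, max_output_tokens: int, api_key: str):
    # OpenAI-style chat-completions request, as described above.
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    payload = {
        "model": model_name,
        "messages": messages,
        "temperature": 0.7,
        "max_tokens": max_output_tokens,
    }
    return headers, payload

# The actual call (requires httpx and a reachable endpoint):
#   resp = httpx.post(url, headers=headers, json=payload, timeout=LLM_TIMEOUT)
#   result = resp.json()
```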




&lt;h2&gt;
  
  
  Response handling: extracting JSON from “helpful” output
&lt;/h2&gt;

&lt;p&gt;Aye Chat expects the assistant to return a JSON object (usually with &lt;code&gt;summary&lt;/code&gt; and optional &lt;code&gt;updated_files&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;In practice, models often return:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a paragraph of explanation&lt;/li&gt;
&lt;li&gt;a code fence&lt;/li&gt;
&lt;li&gt;three different “final” answers&lt;/li&gt;
&lt;li&gt;then a JSON object&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the plugin uses &lt;code&gt;_extract_json_object()&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What &lt;code&gt;_extract_json_object()&lt;/code&gt; does
&lt;/h3&gt;

&lt;p&gt;It tries, in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;json.loads(raw_response)&lt;/code&gt; directly&lt;/li&gt;
&lt;li&gt;If that fails, it scans the text for balanced &lt;code&gt;{ ... }&lt;/code&gt; candidates while being string/escape-aware&lt;/li&gt;
&lt;li&gt;It parses candidates and typically picks the last valid object&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It’s not elegant. It’s resilient. (Those are often the same thing.)&lt;/p&gt;

&lt;h3&gt;
  
  
  A small caveat worth knowing
&lt;/h3&gt;

&lt;p&gt;After extraction, the plugin does:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;generated_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generated_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If extraction fails, &lt;code&gt;generated_json&lt;/code&gt; may be &lt;code&gt;None&lt;/code&gt;, which turns into the literal JSON string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Depending on how &lt;code&gt;parse_llm_response()&lt;/code&gt; handles that, you may get confusing parse failures.&lt;/p&gt;

&lt;p&gt;If you ever find yourself staring into the abyss wondering “why is the model output &lt;code&gt;null&lt;/code&gt;?”, this is the first flashlight to grab.&lt;/p&gt;
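&lt;p&gt;One way to make the failure loud instead of silent is a guard before serialization. &lt;code&gt;serialize_generated&lt;/code&gt; is a hypothetical wrapper, not something the plugin currently has:&lt;/p&gt;

```python
import json

# json.dumps(None) quietly produces the four-character string "null":
assert json.dumps(None) == "null"

def serialize_generated(generated_json) -> str:
    # Hypothetical guard: fail fast on a missing extraction instead of
    # handing the string "null" to the downstream parser.
    if generated_json is None:
        raise ValueError("model response contained no parseable JSON object")
    return json.dumps(generated_json)
```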




&lt;h2&gt;
  
  
  Chat history: stored locally, lightweight on purpose
&lt;/h2&gt;

&lt;p&gt;The plugin persists history in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;.aye/chat_history.json&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s keyed by a conversation id derived from &lt;code&gt;chat_id&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;conv_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_conversation_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On each successful request it appends:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the user’s &lt;strong&gt;lightweight&lt;/strong&gt; history message&lt;/li&gt;
&lt;li&gt;the assistant’s response as a &lt;strong&gt;JSON string&lt;/strong&gt; (not raw prose)
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;conv_id&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;history_message&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat_history&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;conv_id&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;generated_text&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_save_history&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps future requests grounded without turning your history file into a landfill.&lt;/p&gt;
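&lt;p&gt;The persistence itself is ordinary JSON-on-disk. A sketch of the load/save pair, assuming the file layout described above (function names are illustrative):&lt;/p&gt;

```python
import json
from pathlib import Path

HISTORY_PATH = Path(".aye/chat_history.json")  # file the plugin uses

def save_history(history: dict, path: Path = HISTORY_PATH) -> None:
    # Persist the conversation map; the .aye directory may not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(history, indent=2))

def load_history(path: Path = HISTORY_PATH) -> dict:
    # A missing or corrupt file degrades to empty history, not a crash.
    try:
        return json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
```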




&lt;h2&gt;
  
  
  Parsing into Aye Chat’s internal response shape
&lt;/h2&gt;

&lt;p&gt;Once the plugin has a JSON string in &lt;code&gt;generated_text&lt;/code&gt;, it calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;parsed_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parse_llm_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generated_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;parse_llm_response()&lt;/code&gt; converts the JSON into Aye Chat’s internal response schema.&lt;/p&gt;

&lt;p&gt;Typically you’ll see fields like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;summary&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;updated_files: [{ file_name, file_content }, ...]&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those &lt;code&gt;updated_files&lt;/code&gt; are what Aye Chat can apply optimistically to disk (with automatic snapshots so you can &lt;code&gt;restore&lt;/code&gt; instantly).&lt;/p&gt;




&lt;h2&gt;
  
  
  Token usage passthrough
&lt;/h2&gt;

&lt;p&gt;If the Databricks endpoint includes an OpenAI-like &lt;code&gt;usage&lt;/code&gt; block, the plugin passes it through:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;usage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;usage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;parsed_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;token_usage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;prompt_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;completion_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;total_tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why you care:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;debugging prompt growth (especially with repo context)&lt;/li&gt;
&lt;li&gt;monitoring costs (when applicable)&lt;/li&gt;
&lt;li&gt;comparing RAG/context strategies across runs&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Error handling (aka “tell me what broke, not poetry about failure”)
&lt;/h2&gt;

&lt;p&gt;The plugin distinguishes between:&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP status errors
&lt;/h3&gt;

&lt;p&gt;It catches &lt;code&gt;httpx.HTTPStatusError&lt;/code&gt; and builds messages like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;DBX API error: &amp;lt;status_code&amp;gt; - &amp;lt;detail&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It tries to parse a JSON error body and extract &lt;code&gt;error.message&lt;/code&gt; when possible.&lt;/p&gt;
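&lt;p&gt;That error-shaping logic can be sketched as follows; &lt;code&gt;format_dbx_error&lt;/code&gt; is an illustrative helper matching the message format above:&lt;/p&gt;

```python
import json

def format_dbx_error(status_code: int, body_text: str) -> str:
    # Prefer a structured error.message from the body; fall back to raw text.
    detail = body_text
    try:
        parsed = json.loads(body_text)
        if isinstance(parsed, dict):
            err = parsed.get("error")
            if isinstance(err, dict) and err.get("message"):
                detail = err["message"]
    except json.JSONDecodeError:
        pass
    return f"DBX API error: {status_code} - {detail}"
```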

&lt;h3&gt;
  
  
  Generic exceptions
&lt;/h3&gt;

&lt;p&gt;Anything else becomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Error calling Databricks API: &amp;lt;exception&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Verbose / debug output
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;verbose&lt;/code&gt;: prints status code and raw response text (for non-200)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;debug&lt;/code&gt;: prints internal message history and response blocks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is especially useful when wiring new endpoints, where the biggest issues are usually:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;schema mismatch&lt;/li&gt;
&lt;li&gt;response shape differences&lt;/li&gt;
&lt;li&gt;“the model didn’t output JSON like you asked” (shocking)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Minimal setup example
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AYE_DBX_API_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://.../invocations"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AYE_DBX_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"dapi..."&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AYE_DBX_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-model-name"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run Aye Chat normally. If configured, this plugin will intercept &lt;code&gt;local_model_invoke&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Troubleshooting checklist
&lt;/h2&gt;

&lt;p&gt;If nothing happens (or worse, something happens but it’s wrong):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Env vars&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;AYE_DBX_API_URL&lt;/code&gt; set?&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;AYE_DBX_API_KEY&lt;/code&gt; set?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Endpoint compatibility&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Accepts &lt;code&gt;messages&lt;/code&gt; chat format?&lt;/li&gt;
&lt;li&gt;Accepts &lt;code&gt;max_tokens&lt;/code&gt;?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Response shape&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does &lt;code&gt;result["choices"][0]["message"]["content"]&lt;/code&gt; exist?&lt;/li&gt;
&lt;li&gt;Does &lt;code&gt;content&lt;/code&gt; contain a JSON object Aye Chat can parse?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Model output discipline&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extra prose is usually fine; &lt;code&gt;_extract_json_object()&lt;/code&gt; can recover.&lt;/li&gt;
&lt;li&gt;If extraction becomes &lt;code&gt;null&lt;/code&gt;, your endpoint likely isn’t returning JSON-like content in the expected place.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Turn on the lights&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;verbose on&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;debug on&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
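&lt;p&gt;For item 3, a quick check against a captured response body can be written as a few lines. &lt;code&gt;extract_content&lt;/code&gt; is a throwaway diagnostic helper, not part of the plugin:&lt;/p&gt;

```python
def extract_content(result: dict):
    # Walks the OpenAI-style path the plugin relies on; returns None
    # when any link in the chain is missing.
    try:
        return result["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError):
        return None
```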




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;The Databricks integration is a clean, opt-in model plugin that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;activates only when configured via environment variables&lt;/li&gt;
&lt;li&gt;sends OpenAI-style chat completion payloads to your Databricks endpoint&lt;/li&gt;
&lt;li&gt;builds rich prompts with file context, but stores lightweight history&lt;/li&gt;
&lt;li&gt;extracts JSON robustly from messy model output&lt;/li&gt;
&lt;li&gt;returns a structured response Aye Chat can apply to files&lt;/li&gt;
&lt;li&gt;surfaces token usage when the endpoint provides it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re extending or deploying this integration, the two most important things to validate are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;endpoint schema compatibility (messages in, choices out)&lt;/li&gt;
&lt;li&gt;response format consistency (JSON object inside &lt;code&gt;choices[0].message.content&lt;/code&gt;)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Everything else is just plumbing. Occasionally wet plumbing, but still.&lt;/p&gt;




&lt;h2&gt;
  
  
  About Aye Chat
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Aye Chat&lt;/strong&gt; is an open-source, AI-powered terminal workspace that brings AI directly into command-line workflows. Edit files, run commands, and chat with your codebase without leaving the terminal - with an optimistic workflow backed by instant local snapshots.&lt;/p&gt;

&lt;h3&gt;
  
  
  Support Us
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Star our &lt;a href="https://github.com/acrotron/aye-chat" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;&lt;/strong&gt; - it helps new users discover Aye Chat.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spread the word.&lt;/strong&gt; Share Aye Chat with your team and friends who live in the terminal.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>opensource</category>
      <category>databricks</category>
    </item>
    <item>
      <title>Stop Re-Explaining Yourself to the AI: The Case for Behavior Presets</title>
      <dc:creator>Vyacheslav Mayorskiy</dc:creator>
      <pubDate>Tue, 24 Feb 2026 16:42:00 +0000</pubDate>
      <link>https://dev.to/vmayorskiyac/stop-re-explaining-yourself-to-the-ai-the-case-for-behavior-presets-25fj</link>
      <guid>https://dev.to/vmayorskiyac/stop-re-explaining-yourself-to-the-ai-the-case-for-behavior-presets-25fj</guid>
      <description>&lt;p&gt;Every developer who uses AI coding assistants long enough develops the same tic.&lt;/p&gt;

&lt;p&gt;You start every prompt with a little speech.&lt;/p&gt;

&lt;p&gt;"Be concise. Include examples. Explain intent, not just code. Don't refactor things I didn't ask you to touch. Don't add dependencies without asking. Don't..."&lt;/p&gt;

&lt;p&gt;You're not prompting anymore. You're &lt;em&gt;parenting&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;And the worst part? It works - until the one time you forget a sentence. Then the assistant reads that omission as creative license and rearranges your living room when you asked it to change a lightbulb.&lt;/p&gt;

&lt;p&gt;This is the prompt tax. You pay it every single time. And it scales terribly.&lt;/p&gt;




&lt;h2&gt;
  
  
  The prompt tax (and why it compounds)
&lt;/h2&gt;

&lt;p&gt;Most AI interactions are stateless in practice. Sure, there's a context window, but instructions from three prompts ago are already being diluted by new information. The model doesn't &lt;em&gt;remember&lt;/em&gt; that you prefer explicit error handling, or that you hate magic strings, or that "refactor" means "move this function, not rewrite the module."&lt;/p&gt;

&lt;p&gt;So you repeat yourself. And the repetition creates three problems:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fatigue.&lt;/strong&gt; You get tired of typing the same preamble. You start abbreviating. The output quality drops. You blame the model.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Drift.&lt;/strong&gt; Tuesday-you writes a thorough instruction set. Friday-you writes "just fix it." The results diverge. The codebase diverges.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Tribal knowledge.&lt;/strong&gt; Your carefully crafted prompt lives in your head (or your clipboard history, which is basically the same thing). Your teammate doesn't have it. Neither does your future self after a long weekend.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The prompt tax isn't just annoying. It's a consistency problem wearing an inconvenience costume.&lt;/p&gt;




&lt;h2&gt;
  
  
  What if you could save your instructions?
&lt;/h2&gt;

&lt;p&gt;Not as a text file you copy-paste from. Not as a system prompt you configure once and forget. Something in between.&lt;/p&gt;

&lt;p&gt;Picture this: a small collection of Markdown files in your repo, each one defining a &lt;em&gt;behavior mode&lt;/em&gt; for your AI assistant.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;documentation.md&lt;/code&gt; - "When writing docs, explain intent first. Include examples. Don't just describe function signatures."&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;security.md&lt;/code&gt; - "Think like an attacker. Focus on practical mitigations, not theoretical CVE lists."&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;testing.md&lt;/code&gt; - "Write behavior-driven tests. Test boundaries and edge cases. Don't mock everything into oblivion."&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;refactoring.md&lt;/code&gt; - "Reduce coupling. Respect existing boundaries. Don't rename things for fun."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then, when you need one, you invoke it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Using security skill, review the authentication flow for injection risks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tool reads the Markdown file and injects it into the system prompt - the part the model treats as instructions from the operator, not casual conversation.&lt;/p&gt;

&lt;p&gt;Your prompt stays short. The behavior stays consistent. And the instructions live &lt;em&gt;in the repo&lt;/em&gt;, not in your head.&lt;/p&gt;

&lt;p&gt;I call these &lt;strong&gt;skills&lt;/strong&gt;. But you could call them presets, personas, modes, lenses - whatever. The name matters less than the pattern.&lt;/p&gt;
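&lt;p&gt;In practice, the mechanics are small. A hypothetical sketch of the injection step (the function name, delimiters, and default directory are illustrative, not Aye Chat's actual source):&lt;/p&gt;

```python
from pathlib import Path

def build_system_prompt(base_prompt: str, skill_names: list,
                        skills_dir: Path = Path("skills")) -> str:
    """Append each requested skill file to the system prompt, clearly delimited."""
    parts = [base_prompt]
    for name in skill_names:
        skill_file = skills_dir / f"{name}.md"
        if not skill_file.is_file():
            continue  # unknown skill: apply nothing rather than guess
        parts.append(
            f"--- BEGIN SKILL: {name} ---\n"
            f"{skill_file.read_text(encoding='utf-8').strip()}\n"
            f"--- END SKILL: {name} ---"
        )
    return "\n\n".join(parts)

# The result belongs in the system message, not a user message:
# messages = [
#     {"role": "system", "content": build_system_prompt(base, ["security"])},
#     {"role": "user", "content": "review the authentication flow"},
# ]
```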




&lt;h2&gt;
  
  
  Why Markdown files in a directory?
&lt;/h2&gt;

&lt;p&gt;I know what you're thinking. "This is just a prompt template library." And you're not wrong. But the &lt;em&gt;where&lt;/em&gt; and &lt;em&gt;how&lt;/em&gt; matter more than you'd expect.&lt;/p&gt;

&lt;h3&gt;
  
  
  They live in the repo
&lt;/h3&gt;

&lt;p&gt;This means they're version-controlled. They go through code review. When someone on your team writes a &lt;code&gt;documentation.md&lt;/code&gt; skill that says "always include usage examples," that decision is visible, reviewable, and shared.&lt;/p&gt;

&lt;p&gt;You're not just saving prompts. You're &lt;strong&gt;version-controlling your team's taste.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Code style guides exist because "write clean code" isn't specific enough. Skills exist because "be a good assistant" or "you are a super-expert senior software engineer" isn't specific enough.&lt;/p&gt;

&lt;h3&gt;
  
  
  They're scoped per project
&lt;/h3&gt;

&lt;p&gt;Your open-source library needs different AI behavior than your internal microservice. The library wants careful public API documentation, backwards compatibility awareness, and changelog entries. The microservice wants fast iteration, infra-aware suggestions, and monitoring hooks.&lt;/p&gt;

&lt;p&gt;Same developer, different context. Repo-local skills handle this automatically.&lt;/p&gt;

&lt;h3&gt;
  
  
  They're composable
&lt;/h3&gt;

&lt;p&gt;Need docs &lt;em&gt;and&lt;/em&gt; security awareness for a single task?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;skills:security,documentation review and document the auth module
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two Markdown files get injected. The model gets both sets of instructions. You typed one line.&lt;/p&gt;

&lt;h3&gt;
  
  
  They're transparent
&lt;/h3&gt;

&lt;p&gt;No hidden configuration. No settings buried in a YAML file three directories up. If you want to know what &lt;code&gt;skill:testing&lt;/code&gt; does, you open &lt;code&gt;skills/testing.md&lt;/code&gt; and read it. If you disagree, you edit it and commit.&lt;/p&gt;

&lt;p&gt;The system is legible. That matters more than most people think.&lt;/p&gt;




&lt;h2&gt;
  
  
  What makes a good skill?
&lt;/h2&gt;

&lt;p&gt;Not everything belongs in a skill file. The best ones share a few properties:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They describe a mindset, not a checklist.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Bad: "Always add type hints. Always add docstrings. Always use f-strings."&lt;/p&gt;

&lt;p&gt;Good: "You're writing code that a junior developer will maintain alone. Prioritize clarity over cleverness. Prefer explicit patterns over implicit conventions."&lt;/p&gt;

&lt;p&gt;The first is a linter. The second changes how the model &lt;em&gt;thinks&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They're focused.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One skill per concern. Don't make a &lt;code&gt;be_good_at_everything.md&lt;/code&gt; kitchen sink file. A security skill shouldn't also contain formatting preferences, and a documentation skill shouldn't also contain performance tips.&lt;/p&gt;

&lt;p&gt;Keep them small. Compose them when you need more than one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They express opinions.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The whole point is consistency. "Consider testing" is useless. "Write tests that describe behavior, not implementation. Prefer integration tests for I/O boundaries. Mock only what you must" - that's a skill with a point of view.&lt;/p&gt;

&lt;p&gt;Your skills should sound like the best version of your team's code review comments. Specific, opinionated, and useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They assume the model is capable but amnesiac.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You're not teaching the model how to code. You're reminding it what &lt;em&gt;you&lt;/em&gt; care about. The difference between a good AI response and a great one usually isn't capability - it's context about your preferences.&lt;/p&gt;

&lt;p&gt;Skills provide that context. Repeatedly. Reliably. Without you typing it again.&lt;/p&gt;




&lt;h2&gt;
  
  
  The consistency argument (or: why drift will ruin you slowly)
&lt;/h2&gt;

&lt;p&gt;Here's the thing about prompt drift that nobody warns you about: it's invisible.&lt;/p&gt;

&lt;p&gt;You don't notice it happening. One day you write a thorough prompt and get beautiful, well-structured code. The next day you're tired and write something shorter. The output is slightly different. Not worse, exactly - just different. Different naming. Different patterns. Different assumptions.&lt;/p&gt;

&lt;p&gt;Multiply that across a team of four over two months and your codebase starts to look like it was written by eight people with conflicting philosophies. Because it was.&lt;/p&gt;

&lt;p&gt;Skills don't eliminate drift. But they create a gravity well - a default behavior that holds unless you actively override it. The model still has its moods (they all do), but instead of freestyling from a blank slate every time, it's freestyling within guardrails that your team defined.&lt;/p&gt;

&lt;p&gt;That's a massive difference.&lt;/p&gt;




&lt;h2&gt;
  
  
  The tools-that-write-files problem
&lt;/h2&gt;

&lt;p&gt;This pattern becomes especially important with tools that let AI write directly to your filesystem - what some call an "optimistic" or "agentic" workflow.&lt;/p&gt;

&lt;p&gt;When the AI just &lt;em&gt;suggests&lt;/em&gt; code in a chat window, inconsistency is annoying but harmless. You copy-paste what you like, edit the rest, move on.&lt;/p&gt;

&lt;p&gt;But when the AI writes directly to disk? Inconsistency &lt;em&gt;lands&lt;/em&gt;. It becomes committed code, merged PRs, production behavior. The gap between "the model was in a weird mood" and "we shipped a bug" shrinks to nothing.&lt;/p&gt;

&lt;p&gt;In that world, skills aren't a convenience. They're a safety mechanism. They're the difference between "the AI followed our conventions" and "the AI decided this module should use a completely different error handling pattern than everything else."&lt;/p&gt;

&lt;p&gt;The more authority you give the model, the more important it is to tell it &lt;em&gt;how to behave&lt;/em&gt; - and to tell it the same way every time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Design principles that matter
&lt;/h2&gt;

&lt;p&gt;If you're building this pattern into a tool (or adopting it for your own workflow), a few design decisions are worth getting right:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explicit invocation beats auto-detection.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It's tempting to have the tool guess which skill to apply based on the prompt content. Resist this. Auto-applied behavior that the user didn't ask for is spooky. It makes debugging harder and trust lower.&lt;/p&gt;

&lt;p&gt;Let the user say &lt;code&gt;skill:security&lt;/code&gt;. Don't try to infer it from the word "vulnerability" appearing in their prompt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System-level injection beats user-level injection.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Skills should be injected as system prompt content, not as a user message the model might deprioritize. System-level instructions get treated as operator guidance. User-level instructions compete with everything else in the conversation.&lt;/p&gt;

&lt;p&gt;This distinction matters more than it sounds. It's the difference between a suggestion and a directive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ambiguity should resolve to "do nothing."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If the tool can't figure out which skill the user meant - apply none. Don't guess. The user can rephrase. Silently applying the wrong behavior preset is worse than applying no preset at all.&lt;/p&gt;
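&lt;p&gt;Concretely, "one clear winner or nothing" fits in a few lines. A hypothetical sketch (the scoring function and threshold are illustrative):&lt;/p&gt;

```python
import difflib

def resolve_skill(query: str, available: list, threshold: float = 0.8):
    """Return the single best fuzzy match, or None when it is ambiguous."""
    scored = sorted(
        ((difflib.SequenceMatcher(None, query, name).ratio(), name)
         for name in available),
        reverse=True,
    )
    if not scored or scored[0][0] < threshold:
        return None  # nothing close enough: apply no skill
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None  # tie between two skills: ambiguous, apply no skill
    return scored[0][1]
```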

&lt;p&gt;&lt;strong&gt;Keep them readable. Keep them auditable.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Skills should be injected with clear delimiters so that logs and debug output make it obvious what instructions were active. "Why did the model do that weird thing?" should always have an answerable investigation path.&lt;/p&gt;




&lt;h2&gt;
  
  
  How we built it in Aye Chat
&lt;/h2&gt;

&lt;p&gt;We shipped skills as a first-class feature in &lt;a href="https://github.com/acrotron/aye-chat" rel="noopener noreferrer"&gt;Aye Chat&lt;/a&gt;. Here's the short version:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Skills live in a &lt;code&gt;skills/&lt;/code&gt; directory at the repo root (auto-discovered by walking upward from the working directory, monorepo-friendly).&lt;/li&gt;
&lt;li&gt;Only &lt;code&gt;*.md&lt;/code&gt; files, flat directory, no recursion. Boring on purpose.&lt;/li&gt;
&lt;li&gt;You invoke them with &lt;code&gt;skill:name&lt;/code&gt; or &lt;code&gt;skills:name1,name2&lt;/code&gt; or &lt;code&gt;using name skill&lt;/code&gt; in your prompt.&lt;/li&gt;
&lt;li&gt;They're injected into the system prompt with clear delimiters.&lt;/li&gt;
&lt;li&gt;Fuzzy matching exists but is intentionally conservative: high threshold, at most one match, and if two skills tie, neither is applied.&lt;/li&gt;
&lt;li&gt;Results are cached in memory, invalidated by directory &lt;code&gt;mtime&lt;/code&gt;. One cheap &lt;code&gt;stat()&lt;/code&gt; call per request.&lt;/li&gt;
&lt;/ul&gt;
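&lt;p&gt;The caching piece is the sort of thing that fits in a dozen lines. A hypothetical sketch of the pattern described above (not the actual implementation):&lt;/p&gt;

```python
import os
from pathlib import Path

_cache = {}  # skills_dir -> (mtime, {name: content})

def load_skills(skills_dir: str) -> dict:
    """Return {name: markdown} for *.md files in a flat directory.

    Results are reused until the directory mtime changes, so the
    cache-hit path costs a single stat() call.
    """
    mtime = os.stat(skills_dir).st_mtime
    cached = _cache.get(skills_dir)
    if cached and cached[0] == mtime:
        return cached[1]
    skills = {
        p.stem: p.read_text(encoding="utf-8")
        for p in Path(skills_dir).glob("*.md")  # flat, no recursion
    }
    _cache[skills_dir] = (mtime, skills)
    return skills
```

&lt;p&gt;(One caveat worth knowing: a directory's &lt;code&gt;mtime&lt;/code&gt; changes when files are added, removed, or renamed; editing a file in place typically bumps the file's own &lt;code&gt;mtime&lt;/code&gt;, not the directory's.)&lt;/p&gt;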

&lt;p&gt;You can see our own &lt;a href="https://github.com/acrotron/aye-chat/tree/main/skills" rel="noopener noreferrer"&gt;skill files on GitHub&lt;/a&gt; - &lt;code&gt;documentation.md&lt;/code&gt;, &lt;code&gt;security.md&lt;/code&gt;, &lt;code&gt;testing.md&lt;/code&gt;, &lt;code&gt;modularization.md&lt;/code&gt;, &lt;code&gt;performance.md&lt;/code&gt;, and &lt;code&gt;repo-exploration.md&lt;/code&gt;. Feel free to steal them as starting points for your own.&lt;/p&gt;

&lt;p&gt;The implementation is small (~200 lines of Python), but the leverage is disproportionate. Once your team has a shared &lt;code&gt;skills/&lt;/code&gt; directory, every prompt becomes shorter and every output becomes more predictable.&lt;/p&gt;




&lt;h2&gt;
  
  
  The real takeaway: codify your taste
&lt;/h2&gt;

&lt;p&gt;Every team has implicit preferences. How errors should be handled. How tests should be structured. How documentation should read. What "clean code" means in &lt;em&gt;this&lt;/em&gt; repo, not in a textbook.&lt;/p&gt;

&lt;p&gt;Those preferences usually live in people's heads, sometimes on a wiki page nobody reads, occasionally in a linter config that covers 10% of what you actually care about.&lt;/p&gt;

&lt;p&gt;Skills are a way to make those preferences explicit, portable, and reusable - in a format that both humans and AI assistants can read.&lt;/p&gt;

&lt;p&gt;You're not writing prompt templates. You're codifying your team's engineering taste into something version-controlled and composable.&lt;/p&gt;

&lt;p&gt;And the next time you're about to type "please be concise, include examples, explain intent, don't refactor things I didn't ask-" you can just write &lt;code&gt;skill:documentation&lt;/code&gt; and get on with your day.&lt;/p&gt;

&lt;p&gt;Your future self will thank you. Probably with a short, well-documented message.&lt;/p&gt;




&lt;h2&gt;
  
  
  About Aye Chat
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Aye Chat&lt;/strong&gt; is an open-source, AI-powered terminal workspace that brings AI directly into command-line workflows. Edit files, run commands, and chat with your codebase without leaving the terminal - with an optimistic workflow backed by instant local snapshots.&lt;/p&gt;

&lt;h3&gt;
  
  
  Support Us
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Star our &lt;a href="https://github.com/acrotron/aye-chat" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;&lt;/strong&gt; - it helps new users discover Aye Chat.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spread the word.&lt;/strong&gt; Share Aye Chat with your team and friends who live in the terminal.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>agents</category>
    </item>
    <item>
      <title>Why I Use Different AI Models for Planning, Reviewing, and Coding</title>
      <dc:creator>Vyacheslav Mayorskiy</dc:creator>
      <pubDate>Tue, 17 Feb 2026 16:19:00 +0000</pubDate>
      <link>https://dev.to/vmayorskiyac/why-i-use-different-ai-models-for-planning-reviewing-and-coding-4p7l</link>
      <guid>https://dev.to/vmayorskiyac/why-i-use-different-ai-models-for-planning-reviewing-and-coding-4p7l</guid>
      <description>&lt;p&gt;I've been experimenting with something that feels slightly unhinged: using &lt;em&gt;different&lt;/em&gt; AI models at different stages of building a feature.&lt;/p&gt;

&lt;p&gt;Not because I'm indecisive. Because each model has a different superpower.&lt;/p&gt;

&lt;p&gt;GPT-5.2 is great at structured documentation and architectural thinking. Claude Opus 4.6 is terrifyingly good at catching edge cases and writing precise code. So why would I force one model to do everything when I could use them like specialized tools?&lt;/p&gt;

&lt;p&gt;This is the story of building a tiny feature called &lt;code&gt;printraw&lt;/code&gt; - and how a five-stage, multi-model workflow caught bugs that a single-model approach would have missed entirely.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Feature: Stop Making Me Fight the Terminal
&lt;/h2&gt;

&lt;p&gt;Here's the problem: Aye Chat renders AI responses in pretty Rich panels with Markdown formatting and box-drawing characters. Looks great. Feels polished.&lt;/p&gt;

&lt;p&gt;But try to &lt;em&gt;copy&lt;/em&gt; that text and paste it somewhere else.&lt;/p&gt;

&lt;p&gt;You get a mess of line breaks, box characters, and formatting artifacts that make you want to throw your laptop into the sea.&lt;/p&gt;

&lt;p&gt;The fix seemed simple: add a &lt;code&gt;printraw&lt;/code&gt; command that reprints the last response as plain, copy-friendly text. No panels. No Rich formatting. Just raw text wrapped in delimiters you can select and copy.&lt;/p&gt;

&lt;p&gt;The feature itself? Trivial. The &lt;em&gt;workflow&lt;/em&gt; I used to build it? That's what got interesting.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Five-Stage Pipeline
&lt;/h2&gt;

&lt;p&gt;Here's what I ended up doing:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Write the plan&lt;/td&gt;
&lt;td&gt;GPT-5.2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Validate the plan&lt;/td&gt;
&lt;td&gt;Claude Opus 4.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Implement&lt;/td&gt;
&lt;td&gt;Claude Opus 4.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Write tests&lt;/td&gt;
&lt;td&gt;GPT-5.2 + Claude Opus 4.6 (alternating)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Fix until green&lt;/td&gt;
&lt;td&gt;GPT-5.2 + Claude Opus 4.6 (alternating)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This isn't "use one model for everything." It's a staged pipeline where model selection is intentional - like choosing a screwdriver vs a hammer based on what you're actually fastening.&lt;/p&gt;


&lt;h2&gt;
  
  
  Stage 1: Planning with GPT-5.2
&lt;/h2&gt;

&lt;p&gt;I started by describing the UX problem to GPT-5.2 and asking for a complete implementation plan.&lt;/p&gt;

&lt;p&gt;GPT-5.2 produced a thorough document covering:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Command syntax and output format&lt;/li&gt;
&lt;li&gt;Where to capture the last response text&lt;/li&gt;
&lt;li&gt;Two architecture options (store in REPL vs. store in presenter)&lt;/li&gt;
&lt;li&gt;Which files to modify&lt;/li&gt;
&lt;li&gt;Testing approach&lt;/li&gt;
&lt;li&gt;Edge cases&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why GPT-5.2 for planning?&lt;/strong&gt; It's genuinely good at organized technical writing. It thinks through tradeoffs without needing to see every line of code. The output was clean, structured, and gave me something concrete to react to.&lt;/p&gt;


&lt;h2&gt;
  
  
  Stage 2: Validation with Claude Opus 4.6
&lt;/h2&gt;

&lt;p&gt;Here's where it gets interesting.&lt;/p&gt;

&lt;p&gt;I handed the plan to Claude 4.6 with a simple prompt: "Review and validate this plan. Let me know if you'd recommend any adjustments."&lt;/p&gt;

&lt;p&gt;Claude came back with seven specific recommendations, prioritized by impact:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;th&gt;Priority&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Add &lt;code&gt;raw&lt;/code&gt; as a short alias&lt;/td&gt;
&lt;td&gt;High - usability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Use plain &lt;code&gt;print()&lt;/code&gt;, not Rich &lt;code&gt;console.print()&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;High - correctness&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Shorten delimiter lines&lt;/td&gt;
&lt;td&gt;Low - taste&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Clarify: summary-only output, not file changes&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Treat whitespace-only summary as empty&lt;/td&gt;
&lt;td&gt;Medium - edge case&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Note that mid-stream &lt;code&gt;printraw&lt;/code&gt; is N/A&lt;/td&gt;
&lt;td&gt;Low - docs only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Add Rich-markup-leak test case&lt;/td&gt;
&lt;td&gt;Medium - correctness&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Recommendation #2 was the one that made me sit up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Rich markup leak problem:&lt;/strong&gt; If you use Rich's &lt;code&gt;console.print()&lt;/code&gt; to output "raw" text, and the AI's response happens to contain bracketed tokens like &lt;code&gt;[bold]&lt;/code&gt; or &lt;code&gt;[/]&lt;/code&gt;, Rich interprets them as markup instead of printing them literally. Your "raw" output comes out formatted. The whole point of the feature is defeated.&lt;/p&gt;

&lt;p&gt;The fix - using Python's built-in &lt;code&gt;print()&lt;/code&gt; - is trivial. But I would have missed it without a dedicated review pass.&lt;/p&gt;
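&lt;p&gt;The difference is easy to demonstrate. A small sketch (the function name is illustrative):&lt;/p&gt;

```python
import io
from contextlib import redirect_stdout

def print_raw(text: str) -> None:
    """Emit text with the built-in print(), bypassing Rich entirely.

    Rich's console.print() would treat bracketed tokens such as [bold]
    as markup and swallow them; plain print() emits every character
    literally, which is exactly what a copy-paste feature needs.
    """
    print(text)

# Plain print() leaves markup-looking tokens untouched:
buf = io.StringIO()
with redirect_stdout(buf):
    print_raw("a [bold]literal[/bold] string")
# buf.getvalue() == "a [bold]literal[/bold] string\n"
```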

&lt;p&gt;&lt;strong&gt;Why Claude 4.6 for validation?&lt;/strong&gt; It's like hiring a polite pedant to review your work. The structured table with priority ratings made it easy to cherry-pick which adjustments to accept. I took recommendations #1 through #5 and skipped the documentation-only items.&lt;/p&gt;


&lt;h2&gt;
  
  
  Stage 3: Implementation with Claude Opus 4.6
&lt;/h2&gt;

&lt;p&gt;With a validated plan in hand, I asked Claude to implement it.&lt;/p&gt;

&lt;p&gt;The first implementation changed the return types of &lt;code&gt;handle_with_command()&lt;/code&gt; and &lt;code&gt;handle_blog_command()&lt;/code&gt; from &lt;code&gt;Optional&lt;/code&gt; to &lt;code&gt;Tuple[Optional, Optional]&lt;/code&gt; - threading the response text back to the REPL.&lt;/p&gt;

&lt;p&gt;I flagged this immediately: "Won't that introduce regressions and break existing functionality?"&lt;/p&gt;

&lt;p&gt;Claude acknowledged the risk and proposed something cleaner: &lt;strong&gt;capture the text at the source of truth&lt;/strong&gt; - inside &lt;code&gt;print_assistant_response()&lt;/code&gt; itself, using a module-level variable.&lt;/p&gt;

&lt;p&gt;This approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Required zero signature changes&lt;/li&gt;
&lt;li&gt;Had zero regression risk&lt;/li&gt;
&lt;li&gt;Was guaranteed to capture the correct text (whatever was actually printed)&lt;/li&gt;
&lt;li&gt;Worked automatically for all code paths&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Much better.&lt;/p&gt;
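&lt;p&gt;The pattern looks roughly like this (a hypothetical sketch; the names mirror the article, not the actual Aye Chat source):&lt;/p&gt;

```python
# Module-level capture: the text is stored at the one place where the
# correct string is guaranteed to exist - the rendering function itself.
_last_response_text = None

def print_assistant_response(summary: str) -> None:
    """Render the response AND remember exactly what was shown."""
    global _last_response_text
    _last_response_text = summary  # capture at the source of truth
    print(summary)  # the real tool renders a Rich panel here

def get_last_response_text():
    return _last_response_text
```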
&lt;h3&gt;
  
  
  The Bug That Almost Shipped
&lt;/h3&gt;

&lt;p&gt;Even after the refactor, the first test showed the command printing "No assistant response available yet" after a valid response.&lt;/p&gt;

&lt;p&gt;Root cause: the initial code tried to capture text using &lt;code&gt;getattr(llm_response, 'answer_summary', None)&lt;/code&gt;, but the response object's attribute was actually &lt;code&gt;.summary&lt;/code&gt;, not &lt;code&gt;.answer_summary&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The fix was exactly the module-level capture approach - store the text inside &lt;code&gt;print_assistant_response()&lt;/code&gt; where the correct string is guaranteed to exist, regardless of what the response object's attributes are named.&lt;/p&gt;

&lt;p&gt;Final implementation touched four files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;presenter/repl_ui.py&lt;/code&gt; - Module variable + getter + capture logic&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;presenter/raw_output.py&lt;/code&gt; - New file: plain &lt;code&gt;print()&lt;/code&gt; with delimiters&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;controller/command_handlers.py&lt;/code&gt; - New &lt;code&gt;handle_printraw_command()&lt;/code&gt; handler&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;controller/repl.py&lt;/code&gt; - Added &lt;code&gt;printraw&lt;/code&gt; and &lt;code&gt;raw&lt;/code&gt; to built-in commands&lt;/li&gt;
&lt;/ul&gt;
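&lt;p&gt;The handler side is equally small. A hypothetical sketch of the &lt;code&gt;raw_output&lt;/code&gt; logic, folding in the validated-plan decisions - plain &lt;code&gt;print()&lt;/code&gt;, short delimiters, whitespace-only treated as empty (names and delimiter length are illustrative):&lt;/p&gt;

```python
DELIM = "-" * 8  # short delimiter lines (review recommendation #3)

def print_raw_response(last_text) -> None:
    """Reprint the last response as copy-friendly plain text."""
    # Whitespace-only text counts as "no response" (recommendation #5).
    if last_text is None or not last_text.strip():
        print("No assistant response available yet")
        return
    print(DELIM)
    print(last_text)  # built-in print(): no Rich markup interpretation
    print(DELIM)
```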


&lt;h2&gt;
  
  
  Stages 4 &amp;amp; 5: The Adversarial Testing Loop
&lt;/h2&gt;

&lt;p&gt;Here's where the multi-model approach got &lt;em&gt;really&lt;/em&gt; interesting.&lt;/p&gt;

&lt;p&gt;With implementation done, I didn't just ask one model to write tests and fix them. I ping-ponged between models: one writes, the other critiques and fixes, repeat.&lt;/p&gt;

&lt;p&gt;The test coverage needed to include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Normal output with delimiters&lt;/li&gt;
&lt;li&gt;Rich markup leak prevention (the &lt;code&gt;[something]&lt;/code&gt; case)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;None&lt;/code&gt; input → warning message&lt;/li&gt;
&lt;li&gt;Whitespace-only input → warning message&lt;/li&gt;
&lt;li&gt;Empty string → warning message&lt;/li&gt;
&lt;li&gt;The capture mechanism&lt;/li&gt;
&lt;li&gt;The handler integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then came the iteration loop - but with a twist.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» model gpt-5.2
&lt;span class="o"&gt;(&lt;/span&gt;ツ» write tests &lt;span class="k"&gt;for &lt;/span&gt;the printraw feature
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GPT writes tests.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» pytest tests/test_raw_output.py &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Oh God. OH GOD. Red everywhere.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» model claude-opus-4.6
&lt;span class="o"&gt;(&lt;/span&gt;ツ» fix the failing tests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude fixes things - and often rewrites chunks of GPT's approach entirely.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» pytest tests/test_raw_output.py &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Still red, but fewer failures.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» model gpt-5.2
&lt;span class="o"&gt;(&lt;/span&gt;ツ» these tests are still failing, fix them
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GPT takes a different angle. Catches something Claude missed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» pytest tests/test_raw_output.py &lt;span class="nt"&gt;-v&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Green. Finally green.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why alternate models?&lt;/strong&gt; Because each model has different blind spots. GPT might write a test that's technically correct but uses mocking patterns Claude handles better. Claude might fix the mock but miss an assertion edge case that GPT catches on the next pass.&lt;/p&gt;

&lt;p&gt;It's adversarial collaboration. Each model is essentially reviewing the other's work, and bugs that survive one model's scrutiny often get caught by the other.&lt;/p&gt;

&lt;p&gt;No context-switching. No copying error messages between terminals. Everything in one session - just swapping which brain is on the case.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Workflow Works
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Different models for different cognitive tasks
&lt;/h3&gt;

&lt;p&gt;Planning is a different skill from code review, which is a different skill from implementation. Using one model for everything is like using a hammer on screws - it technically works, but you're fighting the tool.&lt;/p&gt;

&lt;h3&gt;
  
  
  The staged approach catches errors early
&lt;/h3&gt;

&lt;p&gt;The validation stage caught the Rich markup leak &lt;em&gt;before any code was written&lt;/em&gt;. Without it, that bug would have surfaced (maybe) when users reported garbled output weeks later.&lt;/p&gt;

&lt;h3&gt;
  
  
  Regression risk is managed explicitly
&lt;/h3&gt;

&lt;p&gt;By questioning the return-type changes, I avoided an entire class of integration issues. The "capture at the source of truth" pattern emerged from that pushback.&lt;/p&gt;

&lt;h3&gt;
  
  
  Alternating models surfaces hidden bugs
&lt;/h3&gt;

&lt;p&gt;The ping-pong pattern during testing caught issues that a single model iterating with itself would have missed. Each model brings a different failure mode - and different solutions.&lt;/p&gt;

&lt;h3&gt;
  
  
  The conversation &lt;em&gt;is&lt;/em&gt; the development environment
&lt;/h3&gt;

&lt;p&gt;Every stage happened in the same Aye Chat session:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Model switching via &lt;code&gt;model&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;File generation via prompts&lt;/li&gt;
&lt;li&gt;Test execution via &lt;code&gt;pytest&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Undo via &lt;code&gt;restore&lt;/code&gt; when something went wrong&lt;/li&gt;
&lt;li&gt;Diff inspection via &lt;code&gt;diff&lt;/code&gt; to verify changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No IDE. No separate terminal. No copy-pasting between tools.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Plan first, validate second, implement third.&lt;/strong&gt; Writing a plan document forces clarity before you touch code.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Switch models for validation.&lt;/strong&gt; The model that wrote the plan won't catch its own blind spots. A fresh perspective - even from a different AI - brings a different analytical lens.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Capture at the source of truth.&lt;/strong&gt; When multiple code paths need the same data, find the single point where it's guaranteed to be correct. Don't thread it through function signatures.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Question regression risk explicitly.&lt;/strong&gt; When implementation requires changing existing contracts, ask: "Is there a way to do this without breaking things?" Usually there is.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Alternate models during test/fix loops.&lt;/strong&gt; One model writes, the other critiques. Bugs that slip past one often get caught by the other. It's like having two reviewers who never get tired.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Keep tests in the same session.&lt;/strong&gt; Running pytest, reading failures, and fixing them without leaving the terminal keeps iteration tight and fast.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This whole feature - planned, validated, implemented, tested, debugged, and shipped - happened in a single Aye Chat session across a few hours.&lt;/p&gt;

&lt;p&gt;Not because the feature was hard. Because the workflow made it frictionless.&lt;/p&gt;




&lt;h2&gt;
  
  
  About Aye Chat
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Aye Chat&lt;/strong&gt; is an open-source, AI-powered terminal workspace that brings AI directly into command-line workflows. Edit files, run commands, and chat with your codebase without leaving the terminal - with an optimistic workflow backed by instant local snapshots.&lt;/p&gt;

&lt;h3&gt;
  
  
  Support Us
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Star our &lt;a href="https://github.com/acrotron/aye-chat" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;&lt;/strong&gt; - it helps new users discover Aye Chat.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spread the word.&lt;/strong&gt; Share Aye Chat with your team and friends who live in the terminal.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>python</category>
    </item>
    <item>
      <title>Claude Code Asks Nicely. Aye Chat Defaults to Action.</title>
      <dc:creator>Vyacheslav Mayorskiy</dc:creator>
      <pubDate>Tue, 10 Feb 2026 16:17:00 +0000</pubDate>
      <link>https://dev.to/vmayorskiyac/claude-code-asks-nicely-aye-chat-defaults-to-action-18h2</link>
      <guid>https://dev.to/vmayorskiyac/claude-code-asks-nicely-aye-chat-defaults-to-action-18h2</guid>
      <description>&lt;p&gt;I saw a Claude Code ad and thought: &lt;em&gt;ah yes, the well-mannered butler of developer tools.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"An assistant that explains its thinking before acting."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13nqwyohy68m53v64kfn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F13nqwyohy68m53v64kfn.png" alt="The Claude Code ad that kicked this thought off." width="800" height="707"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It's elegant. It's cautious. It's the kind of assistant that puts a napkin on its arm before refactoring.&lt;/p&gt;

&lt;p&gt;Aye Chat is not that.&lt;/p&gt;

&lt;p&gt;Aye Chat is the chaotic-good intern with a power drill, except we installed a big red &lt;strong&gt;UNDO&lt;/strong&gt; button and wrote the insurance policy ourselves.&lt;/p&gt;

&lt;p&gt;Because here's our belief:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Permission is an expensive user interface.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The tiny tax that eats your lunch
&lt;/h2&gt;

&lt;p&gt;Approval-first tools tend to feel like ordering at a restaurant where the waiter reads you the entire supply chain.&lt;/p&gt;

&lt;p&gt;You:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Could you rename this parameter?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Tool:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Certainly. First, I will analyze the repository. Then I will propose a plan. Then I will show you the plan. Then I will explain the plan. Then I will ask if you approve&lt;br&gt;
the plan. Then, pending approval of the plan explaining the plan... would you like me to proceed?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Meanwhile your coffee goes cold, your focus wanders off, and your brain starts loading another task like a browser tab you didn't mean to open.&lt;/p&gt;

&lt;p&gt;This isn't a moral failing. It's just physics.&lt;/p&gt;

&lt;p&gt;Every "Are you sure?" prompt is a speed bump on the highway of flow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Aye Chat's bet: optimism with a parachute (and a spare)
&lt;/h2&gt;

&lt;p&gt;Aye Chat is built around the &lt;strong&gt;optimistic workflow&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The AI &lt;strong&gt;writes directly to files&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Every write is &lt;strong&gt;snapshotted locally&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;If you don't like it, you &lt;strong&gt;undo it instantly&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
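&lt;p&gt;A minimal sketch of that snapshot-before-write idea (hypothetical helper names and a hypothetical &lt;code&gt;.aye/snapshots&lt;/code&gt; layout - not Aye Chat's actual implementation):&lt;/p&gt;

```python
import shutil
import time
from pathlib import Path

# Hypothetical snapshot location; the real layout inside .aye/ may differ.
SNAPSHOT_DIR = Path(".aye/snapshots")

def write_with_snapshot(path: Path, new_content: str) -> None:
    """Copy the current file aside, then write the new content."""
    SNAPSHOT_DIR.mkdir(parents=True, exist_ok=True)
    if path.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        shutil.copy2(path, SNAPSHOT_DIR / f"{path.name}.{stamp}")
    path.write_text(new_content)

def restore_latest(path: Path) -> None:
    """Undo: put back the most recent snapshot of this file."""
    candidates = sorted(SNAPSHOT_DIR.glob(f"{path.name}.*"))
    if candidates:
        shutil.copy2(candidates[-1], path)
```

&lt;p&gt;The point is the ordering: the snapshot is taken &lt;em&gt;before&lt;/em&gt; the write, so undo never depends on the AI having been right.&lt;/p&gt;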

&lt;p&gt;No approval checkbox.&lt;br&gt;
No pre-flight committee meeting.&lt;br&gt;
No 12-slide deck titled &lt;em&gt;"Proposed Rename Initiative (Q1)"&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Just:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;restore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the trick.&lt;/p&gt;

&lt;p&gt;I am not claiming models are perfect. That would be like claiming your cat always lands on its feet &lt;em&gt;and&lt;/em&gt; files your taxes.&lt;/p&gt;

&lt;p&gt;I am claiming something more boring - and more useful:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The model is right often enough that &lt;strong&gt;defaulting to action&lt;/strong&gt; is faster... &lt;strong&gt;if reversal is instant and reliable&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So I built the parachute first.&lt;/p&gt;

&lt;p&gt;Then I started skydiving.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why undo beats approval (most of the time)
&lt;/h2&gt;

&lt;p&gt;Approval feels safe because it tries to prevent mistakes.&lt;/p&gt;

&lt;p&gt;Undo feels safe because it makes mistakes &lt;strong&gt;cheap&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;And cheap mistakes are how software gets written.&lt;/p&gt;

&lt;p&gt;Approval-first pushes you into reviewer brain &lt;em&gt;before you even know if the change matters&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Undo-first keeps you in builder brain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Ask.&lt;/li&gt;
&lt;li&gt;Get the change.&lt;/li&gt;
&lt;li&gt;Run tests / run the app / run the command.&lt;/li&gt;
&lt;li&gt;Keep it or revert it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's the same reason whiteboards work: you write first, erase later.&lt;/p&gt;

&lt;p&gt;Git is basically civilization's agreement that:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We will do slightly reckless things, but we will keep receipts."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Aye Chat is that - except the receipt is printed automatically, stapled to the change, and placed directly into your hand.&lt;/p&gt;

&lt;h2&gt;
  
  
  "But writing straight to files is scary."
&lt;/h2&gt;

&lt;p&gt;Yes.&lt;/p&gt;

&lt;p&gt;So is using a table saw.&lt;/p&gt;

&lt;p&gt;That's why table saws have guards, and why we treat snapshots like a seatbelt:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automatic&lt;/strong&gt; (no remembering to commit/stash)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local&lt;/strong&gt; (stored in &lt;code&gt;.aye/&lt;/code&gt;, not beamed into the void)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Immediate&lt;/strong&gt; (restore is one command)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the model goes off-road, you don't open a philosophical debate.&lt;/p&gt;

&lt;p&gt;You don't negotiate with the GPS.&lt;/p&gt;

&lt;p&gt;You just take the last exit and try again.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real difference in plain English
&lt;/h2&gt;

&lt;p&gt;Claude Code's UX says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I'll explain, you approve, then I'll act."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Aye Chat's UX says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I'll act, you react - and if it's cursed, we roll time back."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Different defaults.&lt;br&gt;
Different vibes.&lt;/p&gt;

&lt;p&gt;Claude is the assistant that asks permission to move a chair.&lt;/p&gt;

&lt;p&gt;Aye Chat is the assistant that rearranges the room so it finally makes sense, and if you hate it, it puts everything back exactly where it was.&lt;/p&gt;

&lt;p&gt;If you live in the terminal, that default isn't cosmetic.&lt;/p&gt;

&lt;p&gt;It's the difference between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;babysitting an assistant who wants a signature for every screw, and&lt;/li&gt;
&lt;li&gt;using a power tool with an emergency stop and a clean rollback.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stop approving.&lt;br&gt;
Start shipping.&lt;br&gt;
Keep the parachute.&lt;/p&gt;




&lt;h2&gt;
  
  
  About Aye Chat
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Aye Chat&lt;/strong&gt; is an open-source, AI-powered terminal workspace that brings AI directly into command-line workflows. Edit files, run commands, and chat with your codebase without leaving the terminal - with an optimistic workflow backed by instant local snapshots.&lt;/p&gt;

&lt;h3&gt;
  
  
  Support Us
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Star our GitHub repository: &lt;a href="https://github.com/acrotron/aye-chat" rel="noopener noreferrer"&gt;https://github.com/acrotron/aye-chat&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Spread the word. Share Aye Chat with your team and friends who live in the terminal.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>productivity</category>
      <category>python</category>
    </item>
    <item>
      <title>When a Model Gets Stuck: How GPT‑5.2 Finished a 'Simple' Spinner That Opus 4.5 Couldn't</title>
      <dc:creator>Vyacheslav Mayorskiy</dc:creator>
      <pubDate>Tue, 06 Jan 2026 16:54:00 +0000</pubDate>
      <link>https://dev.to/vmayorskiyac/when-a-model-gets-stuck-how-gpt-52-finished-a-simple-spinner-that-opus-45-couldnt-1o6p</link>
      <guid>https://dev.to/vmayorskiyac/when-a-model-gets-stuck-how-gpt-52-finished-a-simple-spinner-that-opus-45-couldnt-1o6p</guid>
      <description>&lt;p&gt;I have a weakness for feature requests that start with:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This should be simple."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Because they &lt;em&gt;are&lt;/em&gt; simple.&lt;/p&gt;

&lt;p&gt;Right up until you try to make them correct.&lt;/p&gt;

&lt;p&gt;This one came from a very real pain point in Aye Chat:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Sometimes the LLM response stalls mid‑sentence. Show a basic spinner when that happens, and remove it when more text arrives."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In a web UI, a stall is annoying.&lt;br&gt;
In a terminal, a stall looks exactly like a crash.&lt;/p&gt;

&lt;p&gt;So yes - I needed a subtle &lt;code&gt;⋯ waiting for more&lt;/code&gt; indicator.&lt;br&gt;
Not a blinking disco. Not a permanent footer. Just an honest signal: &lt;em&gt;we're alive, we're waiting&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;And this is where the story gets interesting, because it wasn't just a concurrency story.&lt;/p&gt;

&lt;p&gt;It was also a &lt;strong&gt;model story&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Claude Opus 4.5 got us most of the way there, then got stuck in a loop of "reasonable fixes" that didn't quite land.&lt;br&gt;
GPT‑5.2 came in and finished the job - not by being more verbose or more confident, but by being more precise about what the system actually was: a little real‑time UI state machine with multiple writers.&lt;/p&gt;

&lt;p&gt;This is the write-up I wish I had before I lost an afternoon to a spinner.&lt;/p&gt;


&lt;h2&gt;
  
  
  The scene: simulated streaming meets real-world stalls
&lt;/h2&gt;

&lt;p&gt;Aye Chat streams responses into a terminal UI using Rich (&lt;code&gt;Live&lt;/code&gt;).&lt;br&gt;
We also have simulated streaming: even if a provider returns larger chunks, we animate output word-by-word so it feels like "typing."&lt;/p&gt;

&lt;p&gt;Real providers behave like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you receive &lt;strong&gt;some&lt;/strong&gt; tokens,&lt;/li&gt;
&lt;li&gt;then there's a gap (the LLM gathering its thoughts, a server-side pause, backpressure),&lt;/li&gt;
&lt;li&gt;then streaming resumes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If that gap happens mid-sentence, users freeze with it.&lt;/p&gt;

&lt;p&gt;So the request had four deceptively clean requirements:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Detect "stall" while streaming.&lt;/li&gt;
&lt;li&gt;Show &lt;code&gt;⋯ waiting for more&lt;/code&gt; only during stalls.&lt;/li&gt;
&lt;li&gt;Remove it immediately when new text arrives.&lt;/li&gt;
&lt;li&gt;Don't break Markdown or final formatting.&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  The architecture (and why it can lie to you)
&lt;/h2&gt;

&lt;p&gt;At a high level the streaming UI has three moving parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;update(content: str)&lt;/code&gt; - called when &lt;em&gt;new streamed content arrives&lt;/em&gt; (full accumulated content, not a delta).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;_animate_words(new_text: str)&lt;/code&gt; - prints newly received text word-by-word with a small delay.&lt;/li&gt;
&lt;li&gt;a background monitor thread - periodically decides whether we are "stalled."&lt;/li&gt;
&lt;/ul&gt;
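&lt;p&gt;As a bare-bones sketch (illustrative bodies only - the real methods render through Rich), the relationship between the first two parts looks like this:&lt;/p&gt;

```python
class StreamingDisplay:
    """Skeleton of the streaming UI's first two moving parts (illustrative)."""

    def __init__(self) -> None:
        self._current_content = ""   # full accumulated text received so far
        self._animated_content = ""  # text actually rendered on screen

    def update(self, content: str) -> None:
        # Called with the FULL accumulated content each time, not a delta.
        new_text = content[len(self._current_content):]
        self._current_content = content
        self._animate_words(new_text)

    def _animate_words(self, new_text: str) -> None:
        # The real version prints word-by-word with a small delay;
        # here it just appends so the sketch stays deterministic.
        self._animated_content += new_text
```

&lt;p&gt;The gap between &lt;code&gt;_current_content&lt;/code&gt; and &lt;code&gt;_animated_content&lt;/code&gt; is exactly the ambiguity the stall detector has to respect.&lt;/p&gt;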

&lt;p&gt;Rendering is via a helper like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;_create_response_panel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_markdown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;show_stall_indicator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Panel&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When &lt;code&gt;show_stall_indicator=True&lt;/code&gt;, it appends:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;⋯ waiting for more
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So far, boring.&lt;/p&gt;

&lt;p&gt;But then you hit the question that decides everything:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What does "stalled" &lt;em&gt;mean&lt;/em&gt;?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And it turns out there are two kinds of "stalled":&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Network stall:&lt;/strong&gt; no new content is arriving.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User-visible stall:&lt;/strong&gt; nothing is changing on screen.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those are not the same in a system that intentionally delays rendering.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where Opus 4.5 got stuck: fixing symptoms instead of the machine
&lt;/h2&gt;

&lt;p&gt;Claude Opus 4.5 did the first part quickly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;add a timestamp,&lt;/li&gt;
&lt;li&gt;monitor elapsed time,&lt;/li&gt;
&lt;li&gt;show the indicator if we exceed a threshold.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It "worked"… until it didn't.&lt;/p&gt;

&lt;p&gt;The bug report looked very specific:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The indicator blinks briefly even when words are still printing.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That symptom is a big clue. It means the stall detector is looking at &lt;strong&gt;time since last network update&lt;/strong&gt;, while the UI is still busy animating buffered words.&lt;/p&gt;

&lt;p&gt;Opus tried the next obvious move: add an &lt;code&gt;_is_animating&lt;/code&gt; flag and suppress the indicator while animating.&lt;/p&gt;

&lt;p&gt;Still not fixed.&lt;/p&gt;

&lt;p&gt;At that point, you can feel a model fall into a common trap: it keeps proposing plausible tweaks (threshold changes, different checks, "maybe we should…"), but it doesn't fully re-frame the problem.&lt;/p&gt;

&lt;p&gt;Because the real problem wasn't just "the flag is wrong."&lt;/p&gt;

&lt;p&gt;The real problem was that we had &lt;strong&gt;two concurrent writers&lt;/strong&gt; to the same UI.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the animation path calls &lt;code&gt;Live.update()&lt;/code&gt; as it prints words&lt;/li&gt;
&lt;li&gt;the monitor thread calls &lt;code&gt;Live.update()&lt;/code&gt; as it toggles the indicator&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Without serialization, you &lt;em&gt;will&lt;/em&gt; eventually render an inconsistent intermediate frame - which, to the user, looks like blinking.&lt;/p&gt;

&lt;p&gt;So the "didn't fix it" moment wasn't Opus being incompetent.&lt;/p&gt;

&lt;p&gt;It was Opus being stuck in a local optimum:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;treat it as timing&lt;/li&gt;
&lt;li&gt;treat it as one boolean&lt;/li&gt;
&lt;li&gt;treat it as "add one more guard"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When what we needed was: &lt;strong&gt;state + synchronization&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That's the moment I switched models.&lt;/p&gt;


&lt;h2&gt;
  
  
  What GPT‑5.2 did differently: treat it like a state machine with a single renderer
&lt;/h2&gt;

&lt;p&gt;GPT‑5.2 didn't win by being clever.&lt;br&gt;
It won by being strict.&lt;/p&gt;

&lt;p&gt;It made three changes that turned the spinner from "mostly works" into "boring and correct."&lt;/p&gt;
&lt;h3&gt;
  
  
  1) Serialize shared state and all UI updates
&lt;/h3&gt;

&lt;p&gt;First: a lock.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;RLock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then a rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If it touches shared state or calls &lt;code&gt;Live.update()&lt;/code&gt;, it must hold the lock.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We centralized rendering into a single helper so we stopped sprinkling &lt;code&gt;Live.update()&lt;/code&gt; in random code paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_refresh_display&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;use_markdown&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;show_stall&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_live&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nf"&gt;_create_response_panel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_animated_content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;use_markdown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;use_markdown&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;show_stall_indicator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;show_stall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_showing_stall_indicator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;show_stall&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one change kills a whole class of "blinking because two threads fought for the frame buffer."&lt;/p&gt;

&lt;h3&gt;
  
  
  2) Redefine "stall" as "caught up and no new input"
&lt;/h3&gt;

&lt;p&gt;This was the conceptual pivot.&lt;/p&gt;

&lt;p&gt;A stall should only be possible when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;we are &lt;strong&gt;not currently animating&lt;/strong&gt;, and&lt;/li&gt;
&lt;li&gt;the animated output has &lt;strong&gt;caught up&lt;/strong&gt; to what we have received.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;caught_up&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_is_animating&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_animated_content&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_current_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That single definition fixes the original "indicator shows while words are still printing" bug.&lt;/p&gt;

&lt;p&gt;If the UI is still draining buffered words, you're not stalled.&lt;br&gt;
You're busy.&lt;/p&gt;
&lt;h3&gt;
  
  
  3) Use "last receive time," not "last render time"
&lt;/h3&gt;

&lt;p&gt;After the above, we hit a second, subtler bug:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When streaming is actually paused, the indicator blinks instead of staying lit.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a classic mistake in real-time UI code: if you update the "progress timestamp" when you redraw the indicator, the indicator becomes self-canceling.&lt;/p&gt;

&lt;p&gt;So GPT‑5.2 separated the concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;_last_receive_time&lt;/code&gt; changes &lt;strong&gt;only when new stream content arrives&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;redraws do not touch it
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_last_receive_time&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Updated only in &lt;code&gt;update()&lt;/code&gt; when content truly changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_current_content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_last_receive_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the monitor checks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;time_since_receive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_last_receive_time&lt;/span&gt;
&lt;span class="n"&gt;should_show_stall&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time_since_receive&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_stall_threshold&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That makes the indicator "sticky" in the correct way:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;it turns on after the threshold,&lt;/li&gt;
&lt;li&gt;it stays on continuously,&lt;/li&gt;
&lt;li&gt;and it turns off immediately when new text arrives.&lt;/li&gt;
&lt;/ul&gt;
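&lt;p&gt;That stickiness falls out of the decision being a pure function of two inputs, which makes it easy to check at the boundaries (a hypothetical distillation for illustration, not the actual monitor code):&lt;/p&gt;

```python
def stall_state(caught_up: bool, last_receive_time: float,
                stall_threshold: float, now: float) -> bool:
    """Should the indicator be shown at time `now`?"""
    if not caught_up:
        return False  # still draining buffered words: busy, not stalled
    return (now - last_receive_time) >= stall_threshold

# Threshold not reached yet: stays off.
assert stall_state(True, last_receive_time=100.0, stall_threshold=2.0, now=101.0) is False
# Past the threshold with nothing new received: turns on and stays on.
assert stall_state(True, last_receive_time=100.0, stall_threshold=2.0, now=103.0) is True
# New content resets last_receive_time: turns off immediately.
assert stall_state(True, last_receive_time=103.5, stall_threshold=2.0, now=104.0) is False
```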




&lt;h2&gt;
  
  
  The final monitor loop (the boring version that works)
&lt;/h2&gt;

&lt;p&gt;Here's what the working logic boils down to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_monitor_stall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_stop_monitoring&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_set&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_stop_monitoring&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;

        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_started&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_animated_content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

            &lt;span class="n"&gt;caught_up&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_is_animating&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_animated_content&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_current_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;caught_up&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

            &lt;span class="n"&gt;time_since_receive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_last_receive_time&lt;/span&gt;
            &lt;span class="n"&gt;should_show_stall&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time_since_receive&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_stall_threshold&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;should_show_stall&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_showing_stall_indicator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_live&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="nf"&gt;_create_response_panel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_animated_content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;use_markdown&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;show_stall_indicator&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;should_show_stall&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_showing_stall_indicator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;should_show_stall&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key properties:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;no indicator while buffered words are still animating&lt;/li&gt;
&lt;li&gt;indicator appears only after &lt;strong&gt;no new content&lt;/strong&gt; arrives for &lt;code&gt;stall_threshold&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;indicator stays on continuously once shown&lt;/li&gt;
&lt;li&gt;indicator disappears immediately when new text arrives&lt;/li&gt;
&lt;/ul&gt;
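&lt;p&gt;Those four properties collapse into a small state machine. Here is a minimal sketch (hypothetical names, not the actual Aye Chat classes), assuming "stall" means "animation caught up &lt;em&gt;and&lt;/em&gt; no new content for the threshold":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import time

class StallTracker:
    """Sketch: 'stall' is a state, not a timeout."""

    def __init__(self, threshold=2.0):
        self._threshold = threshold
        self._last_receive = time.monotonic()
        self._caught_up = True
        self.showing = False

    def on_receive(self):
        # New content: reset the clock, hide the indicator immediately.
        self._last_receive = time.monotonic()
        self._caught_up = False
        self.showing = False

    def on_caught_up(self):
        # The animation has drained the buffered words.
        self._caught_up = True

    def tick(self):
        # Rendering reads the clock but never writes it.
        stalled = (self._caught_up and
                   time.monotonic() - self._last_receive &amp;gt;= self._threshold)
        changed = stalled != self.showing
        self.showing = stalled
        return changed  # re-render only on a state transition
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The design point: only &lt;code&gt;on_receive()&lt;/code&gt; updates the receive clock, which is what keeps the indicator from blinking.&lt;/p&gt;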

&lt;p&gt;The spinner stops being a feature.&lt;br&gt;
It becomes infrastructure.&lt;/p&gt;

&lt;p&gt;Which is exactly what terminal UX should be.&lt;/p&gt;




&lt;h2&gt;
  
  
  The real theme: why swapping models is a debugging tool
&lt;/h2&gt;

&lt;p&gt;I'm not interested in "model wars."&lt;/p&gt;

&lt;p&gt;But I &lt;em&gt;am&lt;/em&gt; interested in the practical reality of building with them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Some models are great at first-pass implementation.&lt;/li&gt;
&lt;li&gt;Some models are great at refactoring.&lt;/li&gt;
&lt;li&gt;Some models are great at pushing through annoying edge cases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this case:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Opus 4.5&lt;/strong&gt; got to a plausible implementation quickly, and even cleaned up structure when asked.
But it kept circling around incremental fixes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPT‑5.2&lt;/strong&gt; zoomed out, saw "two writers + ambiguous definition of stall," and forced the solution into a small state machine with serialized rendering.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That doesn't mean GPT is "better" in the abstract.&lt;/p&gt;

&lt;p&gt;It means something more useful:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When you feel a model looping, change the shape of the conversation - or change the model.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In Aye Chat, switching models is cheap.&lt;br&gt;
And when you're stuck on a UI race condition that only reproduces one out of ten times, "cheap" matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  Takeaways (and the part that matches Aye Chat's philosophy)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;A spinner has a bigger correctness surface area than it deserves.&lt;/strong&gt;&lt;br&gt;
Animation + monitoring + concurrent rendering is a real system.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;"Stall" is a state, not a timeout.&lt;/strong&gt;&lt;br&gt;
It must mean "caught up and no new input," not "some time passed."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Don't let rendering update the clock that decides whether to render.&lt;/strong&gt;&lt;br&gt;
That's how you invent blinking.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Locking isn't optional when multiple threads can render.&lt;/strong&gt;&lt;br&gt;
Even if nothing crashes, the UX will.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model choice is part of the toolchain.&lt;/strong&gt;&lt;br&gt;
When one model gets stuck in local fixes, another might see the global shape.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In a weird way, this tiny &lt;code&gt;⋯ waiting for more&lt;/code&gt; indicator teaches the same lesson as the optimistic workflow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;let the system move fast,&lt;/li&gt;
&lt;li&gt;but build it so you can recover instantly,&lt;/li&gt;
&lt;li&gt;and be pragmatic about the tools (including the model) that get you unstuck.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  About Aye Chat
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Aye Chat&lt;/strong&gt; is an open-source, AI-powered terminal workspace that brings AI directly into command-line workflows. Edit files, run commands, and chat with your codebase without leaving the terminal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Support Us
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Star our GitHub repository: &lt;a href="https://github.com/acrotron/aye-chat" rel="noopener noreferrer"&gt;https://github.com/acrotron/aye-chat&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Spread the word. Share Aye Chat with your team and friends who live in the terminal.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>programming</category>
      <category>ai</category>
      <category>llm</category>
      <category>python</category>
    </item>
    <item>
      <title>Designing Terminal UX for AI is really about DX (and a few classic UX principles)</title>
      <dc:creator>Vyacheslav Mayorskiy</dc:creator>
      <pubDate>Tue, 16 Dec 2025 14:27:07 +0000</pubDate>
      <link>https://dev.to/vmayorskiyac/designing-terminal-ux-for-ai-is-really-about-dx-and-a-few-classic-ux-principles-3dd6</link>
      <guid>https://dev.to/vmayorskiyac/designing-terminal-ux-for-ai-is-really-about-dx-and-a-few-classic-ux-principles-3dd6</guid>
      <description>&lt;p&gt;A lot of AI devtools talk about models. But an equally hard problem is DX ("Developer Experience"): how do you make an AI assistant feel &lt;em&gt;native&lt;/em&gt; inside the terminal, without breaking the muscle memory developers already have?&lt;/p&gt;

&lt;p&gt;When we built Aye Chat, the UX playbook looked surprisingly “traditional”:&lt;/p&gt;

&lt;h2&gt;
  
  
  1) Hierarchy first: route user intent the way a shell does
&lt;/h2&gt;

&lt;p&gt;In a terminal, ambiguity is the enemy. So Aye Chat treats input with a clear priority:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Built-in commands (&lt;code&gt;help&lt;/code&gt; / &lt;code&gt;restore&lt;/code&gt; / &lt;code&gt;model&lt;/code&gt; / etc.)&lt;/li&gt;
&lt;li&gt;Shell commands (run the real command directly)&lt;/li&gt;
&lt;li&gt;Everything else becomes an AI prompt&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a classic hierarchical decision tree: predictable, fast, and easy to learn. The key DX benefit is you don’t “leave the terminal mindset” to use AI.&lt;/p&gt;
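&lt;p&gt;As a sketch, the routing can be as small as this (illustrative names; the real dispatcher handles more cases):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import shutil

BUILTINS = {"help", "restore", "model", "diff", "undo"}

def route(line):
    """Builtin command, real shell command, or AI prompt - in that order."""
    first = line.strip().split(maxsplit=1)[0] if line.strip() else ""
    if first in BUILTINS:
        return "builtin"
    if shutil.which(first):   # resolves against PATH, like a shell would
        return "shell"
    return "ai_prompt"        # everything else goes to the model
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

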

&lt;h2&gt;
  
  
  2) Progressive disclosure: power features only appear when you need them
&lt;/h2&gt;

&lt;p&gt;The terminal is already dense; AI shouldn’t add UI clutter.&lt;/p&gt;

&lt;p&gt;Aye Chat keeps the default experience minimal (type a prompt, get an answer), then progressively reveals capabilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;help&lt;/code&gt; exists, but it’s not forced into every interaction.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;verbose on|off&lt;/code&gt; toggles how much operational detail you see.&lt;/li&gt;
&lt;li&gt;RAG context selection is usually invisible; in verbose mode you can see which files were included.&lt;/li&gt;
&lt;li&gt;For large projects, indexing runs in the background; you only see progress if you opt into verbosity (and it shows as a small inline hint in the prompt).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This keeps the “happy path” clean while still supporting the power-user path.&lt;/p&gt;

&lt;h2&gt;
  
  
  3) “Optimistic UX” with an explicit escape hatch
&lt;/h2&gt;

&lt;p&gt;AI edits are only useful if they’re fast, but speed without safety kills trust.&lt;/p&gt;

&lt;p&gt;Aye Chat leans into an optimistic workflow: apply changes directly, but make rollback instantaneous.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every update creates snapshots automatically.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;diff&lt;/code&gt; is the review step.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;restore&lt;/code&gt; / &lt;code&gt;undo&lt;/code&gt; is the safety net.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;DX-wise, this shifts the mental model from “AI is risky” to “AI is reversible”: you can experiment without fear. Developers move faster because the cost of being wrong is low.&lt;/p&gt;

&lt;h2&gt;
  
  
  4) Reduce cognitive load with “focus tools” instead of more UI
&lt;/h2&gt;

&lt;p&gt;Two small primitives do a lot of work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;@file&lt;/code&gt; references: include a file inline, on demand.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;with &amp;lt;files&amp;gt;: &amp;lt;prompt&amp;gt;&lt;/code&gt;: constrain the problem to specific files (supports wildcards).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is progressive disclosure again: you don’t need to learn these on day one, but when prompts get vague, you have a precise way to tell the system what matters.&lt;/p&gt;
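&lt;p&gt;A sketch of what the &lt;code&gt;with &amp;lt;files&amp;gt;: &amp;lt;prompt&amp;gt;&lt;/code&gt; constraint could look like under the hood (the syntax is the one described above; the parsing names are assumptions):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import fnmatch

def parse_focus(line, project_files):
    """Split 'with &amp;lt;files&amp;gt;: &amp;lt;prompt&amp;gt;' into (matched files, prompt).
    Without the prefix, the whole project stays in scope."""
    if not line.startswith("with "):
        return list(project_files), line
    spec, _, prompt = line[5:].partition(":")
    patterns = spec.split()
    matched = [f for f in project_files
               if any(fnmatch.fnmatch(f, p) for p in patterns)]
    return matched, prompt.strip()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

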

&lt;h2&gt;
  
  
  5) Interaction design for the terminal: make latency feel manageable
&lt;/h2&gt;

&lt;p&gt;AI latency is real; good terminal UX acknowledges it instead of hiding it.&lt;/p&gt;

&lt;p&gt;Aye Chat uses “progressive waiting” messages (spinner text that changes over time):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;“Building prompt…”&lt;/li&gt;
&lt;li&gt;“Sending to LLM…”&lt;/li&gt;
&lt;li&gt;“Still waiting…”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s a small thing, but it makes the experience feel responsive and honest.&lt;/p&gt;
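&lt;p&gt;Progressive waiting is essentially a threshold table. A sketch (the timings here are made up, not Aye Chat's actual values):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def waiting_message(elapsed):
    """Return the spinner text for the time already spent waiting."""
    stages = [
        (0.0, "Building prompt..."),
        (1.0, "Sending to LLM..."),
        (8.0, "Still waiting..."),
    ]
    # Pick the last stage whose threshold has been reached.
    return [msg for t, msg in stages if elapsed &amp;gt;= t][-1]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

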

&lt;h2&gt;
  
  
  6) Autocomplete as a DX feature (not a gimmick)
&lt;/h2&gt;

&lt;p&gt;Terminal tools win when they help you stay in flow. Aye Chat’s completion behavior is designed for that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-column completions for readability.&lt;/li&gt;
&lt;li&gt;Special-cased auto-complete for &lt;code&gt;@file&lt;/code&gt; contexts.&lt;/li&gt;
&lt;li&gt;A user setting to choose “readline-like” vs “complete while typing”.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The principle here is consistency: the UI adapts to the terminal, not the other way around.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;If you’re building AI for developers, UX principles like hierarchy, progressive disclosure, and reversible actions aren’t “nice to have.” They’re the difference between a demo and a daily driver.&lt;/p&gt;




&lt;h2&gt;
  
  
  About Aye Chat
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Aye Chat&lt;/strong&gt; is an open-source, AI-powered terminal workspace that brings the power of AI directly into your command-line workflow. Edit files, run commands, and chat with your codebase without ever leaving the terminal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Support Us
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Star our GitHub repository: &lt;a href="https://github.com/acrotron/aye-chat#aye-chat-ai-powered-terminal-workspace" rel="noopener noreferrer"&gt;https://github.com/acrotron/aye-chat#aye-chat-ai-powered-terminal-workspace&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Spread the word. Share Aye Chat with your team and friends who live in the terminal.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>cli</category>
      <category>programming</category>
      <category>ux</category>
    </item>
    <item>
      <title>The Day I Stopped Babysitting an AI: Building the Optimistic Workflow</title>
      <dc:creator>Vyacheslav Mayorskiy</dc:creator>
      <pubDate>Thu, 04 Dec 2025 12:57:00 +0000</pubDate>
      <link>https://dev.to/vmayorskiyac/i-got-so-annoyed-with-ai-coding-assistants-that-i-built-my-own-pa7</link>
      <guid>https://dev.to/vmayorskiyac/i-got-so-annoyed-with-ai-coding-assistants-that-i-built-my-own-pa7</guid>
      <description>&lt;p&gt;You know that feeling when you're trying to get work done and someone keeps tapping your shoulder for approval? "Is this okay?" Tap. "How about this?" Tap. "Should I do this next?" Tap tap tap.&lt;/p&gt;

&lt;p&gt;That's what using most AI coding assistants felt like to me. I'd ask it to add a feature, and then - instead of just doing it - it would present me with a diff and wait. Like a puppy showing me its toy, waiting for validation. Every. Single. Time.&lt;/p&gt;

&lt;p&gt;I was babysitting an AI like it was a toddler with scissors.&lt;/p&gt;

&lt;p&gt;And I kept thinking: "These models are getting scary good. Claude 4.5 Sonnet gets things right maybe 70-80% of the time on the first try. Why am I still approving every comma it wants to add?"&lt;/p&gt;

&lt;p&gt;So I built something different. Something that would just &lt;em&gt;do the thing&lt;/em&gt; and let me fix it if it screwed up. I called it the &lt;strong&gt;optimistic workflow&lt;/strong&gt;, and it became the heart of Aye Chat.&lt;/p&gt;

&lt;p&gt;This is the story of how that came together - and why it works.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Undo Button on Steroids
&lt;/h2&gt;

&lt;p&gt;Letting an AI write directly to your files sounds insane at first. What if it misunderstands and deletes your entire database layer? What if it gets creative and rewrites your authentication in a way that locks everyone out?&lt;/p&gt;

&lt;p&gt;The only way the optimistic approach works - the &lt;em&gt;only&lt;/em&gt; way - is if you have a perfect undo button. Not Git (too slow, too manual). Not Ctrl+Z (only works in editors). Something instant. Something automatic. Something that happens &lt;em&gt;before&lt;/em&gt; the AI even thinks about touching your files.&lt;/p&gt;

&lt;p&gt;That's what I built first: a snapshot engine that acts like a paranoid librarian.&lt;/p&gt;

&lt;p&gt;You know those rare book libraries where they photograph every page before letting you read it? That's the idea. Before the AI writes &lt;em&gt;anything&lt;/em&gt;, we take a perfect snapshot of what's there. The whole file, exactly as it was, tucked away in &lt;code&gt;.aye/snapshots/&lt;/code&gt; with a timestamp and the prompt that triggered the change.&lt;/p&gt;

&lt;p&gt;Only &lt;em&gt;then&lt;/em&gt; does the AI get to write.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# aye/model/snapshot.py
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;apply_updates&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;updated_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    1. Take a snapshot of the *current* files.
    2. Write the new contents supplied by the LLM.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;file_paths&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;updated_files&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;batch_ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_snapshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_paths&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Safety net FIRST
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;updated_files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;fp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="n"&gt;fp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mkdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exist_ok&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;fp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write_text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;file_content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Then write
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;batch_ts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This backup-first approach is non-negotiable. It's the foundation that makes everything else possible. No snapshot, no write. Period.&lt;/p&gt;

&lt;p&gt;And here's the beautiful part: the snapshot isn't just a file copy. It's a time capsule. It remembers &lt;em&gt;why&lt;/em&gt; the change happened (your prompt is in the metadata), &lt;em&gt;when&lt;/em&gt; it happened (timestamp), and &lt;em&gt;exactly&lt;/em&gt; what was there before. It's version control purpose-built for the "try -&amp;gt; fail -&amp;gt; undo" loop of AI coding.&lt;/p&gt;
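&lt;p&gt;A sketch of what such a time capsule can look like on disk (the directory layout and field names are assumptions, not the actual &lt;code&gt;.aye/snapshots/&lt;/code&gt; format):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json
import shutil
import time
from pathlib import Path

def create_snapshot(file_paths, prompt, root=Path(".aye/snapshots")):
    """Copy each file into a timestamped batch dir, plus metadata
    recording when the snapshot happened and which prompt caused it."""
    batch_ts = time.strftime("%Y%m%d-%H%M%S")
    batch_dir = root / batch_ts
    batch_dir.mkdir(parents=True, exist_ok=True)
    for fp in file_paths:
        if fp.exists():
            shutil.copy2(fp, batch_dir / fp.name)
    meta = {"timestamp": batch_ts, "prompt": prompt}
    (batch_dir / "meta.json").write_text(json.dumps(meta), encoding="utf-8")
    return batch_ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

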

&lt;h2&gt;
  
  
  Two Commands That Changed Everything
&lt;/h2&gt;

&lt;p&gt;A safety net is useless if it's tangled up in a closet somewhere. The whole point of moving fast is... well, moving fast. So reviewing and reverting changes had to be just as instant as making them.&lt;/p&gt;

&lt;p&gt;Enter two stupidly simple commands: &lt;code&gt;diff&lt;/code&gt; and &lt;code&gt;restore&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Seeing What Changed: &lt;code&gt;diff&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;After the AI makes a change, you naturally wonder: "Okay, what exactly did you do?" Instead of opening another tool or switching windows, you just type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» diff calculator.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Boom. Instant, colorized diff right in your terminal. It's just comparing the live file against the snapshot we took two seconds ago. You see the change, you understand it, you move on. No context switch. No friction.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# aye/presenter/diff_presenter.py
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;show_diff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Show diff between two files using system diff command.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;diff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--color=always&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-u&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt;
            &lt;span class="c1"&gt;# ...
&lt;/span&gt;        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Fallback to Python's difflib if system diff is not available
&lt;/span&gt;        &lt;span class="nf"&gt;_python_diff_files&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's so simple it's almost boring. But that's the point. Boring infrastructure that just works is what lets you focus on the interesting stuff.&lt;/p&gt;
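&lt;p&gt;The &lt;code&gt;difflib&lt;/code&gt; fallback in the &lt;code&gt;except&lt;/code&gt; branch can be sketched in a few lines (assumed shape; the real &lt;code&gt;_python_diff_files&lt;/code&gt; may format output differently). The argument order mirrors the subprocess call: snapshot first, live file second, so additions show the AI's changes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import difflib

def python_diff_files(file1, file2):
    """Pure-Python unified diff for systems without a diff binary."""
    old = file2.read_text(encoding="utf-8").splitlines(keepends=True)
    new = file1.read_text(encoding="utf-8").splitlines(keepends=True)
    return "".join(difflib.unified_diff(old, new, str(file2), str(file1)))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

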

&lt;h3&gt;
  
  
  The Magic Undo: &lt;code&gt;restore&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;And if you don't like what you see? If the AI got creative in the wrong way?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» restore calculator.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The file is back to exactly how it was. The AI's change is gone, like it never happened. No Git commands to remember, no stash juggling, no "wait which commit was that again?"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# aye/model/snapshot.py
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;restore_snapshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ordinal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ordinal&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;file_name&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;snapshots&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list_snapshots&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;snapshots&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No snapshots found for file &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;file_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;snapshot_path_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;snapshots&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="c1"&gt;# ...
&lt;/span&gt;        &lt;span class="n"&gt;shutil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;copy2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;snapshot_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;original_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This one-command rollback is what makes the optimistic workflow &lt;em&gt;work&lt;/em&gt;. It removes the fear. The cost of a bad AI suggestion drops to near zero - just the three seconds it takes to type &lt;code&gt;restore&lt;/code&gt;. That's when you realize: you're not babysitting anymore. You're collaborating.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Feels in Practice
&lt;/h2&gt;

&lt;p&gt;Let me show you where this gets real. You're working on a simple calculator module with an &lt;code&gt;add()&lt;/code&gt; function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# calculator.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You've got tests. They're green. Life is good.&lt;/p&gt;

&lt;p&gt;Then you think: "You know what? This should handle any number of arguments, not just two." So you ask:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» modify the add &lt;span class="k"&gt;function &lt;/span&gt;to take a list of numbers instead of two arguments
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI doesn't ask for permission. It just does it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-{•!•}- » I have modified the `add` function to accept a list of numbers.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Behind the scenes, the snapshot happened first (safety net deployed), then the file got rewritten. You're curious what changed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» diff calculator.py
&lt;span class="nt"&gt;---&lt;/span&gt; .aye/snapshots/001_.../calculator.py
+++ calculator.py
@@ &lt;span class="nt"&gt;-1&lt;/span&gt;,2 +1,2 @@
&lt;span class="nt"&gt;-def&lt;/span&gt; add&lt;span class="o"&gt;(&lt;/span&gt;a, b&lt;span class="o"&gt;)&lt;/span&gt;:
-    &lt;span class="k"&gt;return &lt;/span&gt;a + b
+def add&lt;span class="o"&gt;(&lt;/span&gt;numbers&lt;span class="o"&gt;)&lt;/span&gt;:
+    &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;numbers&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean. Elegant. The AI understood the assignment. You feel that little dopamine hit - this is going to work.&lt;/p&gt;

&lt;p&gt;So you run the tests (without exiting the session, mind you):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» pytest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;================================ FAILURES =================================
_______________________________ test_add __________________________________

    def test_add():
&amp;gt;       assert add(2, 3) == 5
E       TypeError: add() takes 1 positional argument but 2 were given

test_calculator.py:4: TypeError
========================= 1 failed in 0.12s ==========================
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Oh God. OH GOD.&lt;/p&gt;

&lt;p&gt;The tests are broken. The function signature changed but the tests still call it the old way. Your brain immediately starts racing: "Okay I need to update the tests, or wait maybe I should revert this, or maybe I should just fix the function to handle both cases, or—"&lt;/p&gt;

&lt;p&gt;And then you remember: you have an undo button.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» restore calculator.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One second later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="o"&gt;(&lt;/span&gt;ツ» pytest
&lt;span class="o"&gt;=========================&lt;/span&gt; 1 passed &lt;span class="k"&gt;in &lt;/span&gt;0.08s &lt;span class="o"&gt;===========================&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Green again. Crisis averted. Your heart rate returns to normal.&lt;/p&gt;

&lt;p&gt;Total time from "oh no" to "all good": &lt;strong&gt;4 seconds&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That's the moment it clicks. You're not afraid anymore. You can try things. Wild things. Aggressive refactors. Experimental rewrites. Because the cost of failure isn't hours of Git archaeology or careful manual rollbacks - it's typing seven letters: &lt;code&gt;restore&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You can iterate fearlessly. And when you're not afraid to fail, you move &lt;em&gt;fast&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That's the workflow. Fast, confident, reversible. No approval dialogs. No copy/paste. No babysitting.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Git Upgrade: From Sedan to Sports Car
&lt;/h2&gt;

&lt;p&gt;Our file-copy snapshot system works great. It's simple, it's reliable, it works on any project with zero setup. It's like a dependable sedan - gets you where you need to go without fuss.&lt;/p&gt;

&lt;p&gt;But I knew developers had a sports car sitting in the garage: &lt;strong&gt;Git&lt;/strong&gt;. And I kept thinking: what if we could give them the option to use it?&lt;/p&gt;

&lt;p&gt;So that's what we're building next: a Git-powered snapshot engine that's faster, more powerful, and integrates seamlessly with tools developers already use.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Plan: Two Engines, One Interface
&lt;/h3&gt;

&lt;p&gt;We're using the &lt;strong&gt;Strategy pattern&lt;/strong&gt; - fancy name for a simple idea. Think of it as building a car with interchangeable engines. Same car, same controls, but you can swap in a different engine depending on what you need.&lt;/p&gt;

&lt;p&gt;Two "engines":&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;FileCopyStrategy&lt;/code&gt;&lt;/strong&gt;: The sedan. Our current file-copy logic. Default for non-Git projects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;GitStrategy&lt;/code&gt;&lt;/strong&gt;: The sports car. Uses native Git commands. Automatically kicks in for Git repos.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This way, simple projects stay simple, but Git users get superpowers.&lt;/p&gt;
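&lt;p&gt;As a rough sketch of what that Strategy pattern could look like (hypothetical names and interface - the real Aye Chat code may differ):&lt;/p&gt;

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path

class SnapshotStrategy(ABC):
    """The 'car controls': the rest of the tool only talks to this interface."""
    @abstractmethod
    def snapshot(self, path: Path, label: str) -> None: ...
    @abstractmethod
    def restore(self, path: Path) -> None: ...

class FileCopyStrategy(SnapshotStrategy):
    """The sedan: copy the file aside before every write. Zero setup."""
    def __init__(self, store: Path):
        self.store = store
        self.store.mkdir(parents=True, exist_ok=True)

    def snapshot(self, path: Path, label: str) -> None:
        shutil.copy2(path, self.store / path.name)

    def restore(self, path: Path) -> None:
        shutil.copy2(self.store / path.name, path)

class GitStrategy(SnapshotStrategy):
    """The sports car: would shell out to native git (stash push / pop)."""
    def __init__(self, root: Path):
        self.root = root
    def snapshot(self, path: Path, label: str) -> None: ...  # git stash push
    def restore(self, path: Path) -> None: ...               # git stash pop

def pick_strategy(project_root: Path) -> SnapshotStrategy:
    # Auto-detect: Git repos get the sports car, everyone else the sedan.
    if (project_root / ".git").exists():
        return GitStrategy(project_root)
    return FileCopyStrategy(project_root / ".aye" / "snapshots")
```

&lt;p&gt;Same controls either way: the caller never needs to know which engine is underneath.&lt;/p&gt;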

&lt;h3&gt;
  
  
  How the Git Version Will Work
&lt;/h3&gt;

&lt;p&gt;Instead of copying files, we just use Git operations developers already know:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Snapshot (&lt;code&gt;apply_updates&lt;/code&gt;)&lt;/strong&gt;: &lt;code&gt;git stash push -m "aye-chat: &amp;lt;your_prompt&amp;gt;"&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Instant, space-efficient, automatically documented&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;View changes (&lt;code&gt;diff&lt;/code&gt;)&lt;/strong&gt;: &lt;code&gt;git diff stash@{0} -- &amp;lt;file_name&amp;gt;&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compare against the stashed version&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Undo (&lt;code&gt;restore&lt;/code&gt;)&lt;/strong&gt;: &lt;code&gt;git stash pop&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One command, back to where you were&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
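&lt;p&gt;Sketched as code, the Git engine is mostly a thin wrapper over those three commands. A hypothetical sketch (class and method names are illustrative, not the shipped implementation), assuming &lt;code&gt;git&lt;/code&gt; is on the PATH:&lt;/p&gt;

```python
import subprocess
from pathlib import Path

class GitSnapshotEngine:
    """Hypothetical sketch: the three git operations above, wrapped."""
    def __init__(self, root: Path):
        self.root = root

    def _git(self, *args: str) -> str:
        result = subprocess.run(
            ["git", *args], cwd=self.root,
            check=True, capture_output=True, text=True,
        )
        return result.stdout

    def snapshot(self, prompt: str) -> None:
        # Stash the pre-edit state; instant, space-efficient, self-documenting.
        self._git("stash", "push", "-m", f"aye-chat: {prompt}")

    def diff(self, file_name: str) -> str:
        # Compare the working tree against the stashed version.
        return self._git("diff", "stash@{0}", "--", file_name)

    def restore(self) -> None:
        # One command, back to where you were.
        self._git("stash", "pop")
```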

&lt;p&gt;But here's where it gets interesting.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Real Power: Partial Acceptance
&lt;/h3&gt;

&lt;p&gt;Ever wanted to accept &lt;em&gt;part&lt;/em&gt; of an AI's suggestion but not all of it? With Git, we can build a &lt;code&gt;review&lt;/code&gt; command that uses &lt;code&gt;git checkout -p stash@{0}&lt;/code&gt;. It lets you approve changes hunk-by-hunk, like a code review.&lt;/p&gt;

&lt;p&gt;You get to say: "Yes to this function, no to that refactor, yes to the docstring, no to the renaming."&lt;/p&gt;

&lt;p&gt;That's not just faster. That's a completely different level of control.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using Your Existing Tools
&lt;/h3&gt;

&lt;p&gt;And because the AI's changes are just Git stashes, you can use &lt;em&gt;any&lt;/em&gt; Git tool to inspect them. VS Code's source control panel. &lt;code&gt;lazygit&lt;/code&gt;. Whatever you already use. Aye Chat becomes part of your existing workflow instead of replacing it.&lt;/p&gt;

&lt;p&gt;This isn't about using a fancier tool for the same job. It's about unlocking a workflow that wasn't possible before - one where you can collaborate with an AI at the speed of thought, with surgical precision when you need it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This All Means
&lt;/h2&gt;

&lt;p&gt;The optimistic workflow isn't just a feature. It's a different way of thinking about AI collaboration.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompt -&amp;gt; Review -&amp;gt; Approve -&amp;gt; Apply -&amp;gt; Test&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompt -&amp;gt; Apply -&amp;gt; (Test/Review if needed) -&amp;gt; (Undo if wrong)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The approval step vanishes. The AI becomes a collaborator you can trust to act, knowing you have a perfect undo button if it goes sideways.&lt;/p&gt;

&lt;p&gt;It took me about a week to build the first version - nights and a weekend, obsessively iterating. The snapshot engine took a day. The &lt;code&gt;diff&lt;/code&gt; and &lt;code&gt;restore&lt;/code&gt; commands took a few hours. The Git integration is still in progress.&lt;/p&gt;

&lt;p&gt;But the feeling of it? That happened immediately. The first time I prompted the AI, watched it write code directly to my file, checked the diff, and moved on - all in under 10 seconds - I knew this was different.&lt;/p&gt;

&lt;p&gt;No more babysitting. No more approval fatigue. Just flow.&lt;/p&gt;

&lt;p&gt;That's what we're building. And honestly? It's exhilarating.&lt;/p&gt;




&lt;h2&gt;
  
  
  About Aye Chat
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Aye Chat&lt;/strong&gt; is an open-source, AI-powered terminal workspace that brings the power of AI directly into your command-line workflow. Edit files, run commands, and chat with your codebase without ever leaving the terminal.&lt;/p&gt;

&lt;h3&gt;
  
  
  Support Us 🫶
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Star 🌟 our &lt;a href="https://github.com/acrotron/aye-chat#aye-chat-ai-powered-terminal-workspace" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/strong&gt; It helps new users discover Aye Chat.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spread the word&lt;/strong&gt; 🗣️. Share Aye Chat on social media and recommend it to your friends.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>cli</category>
    </item>
    <item>
      <title>I got so annoyed with AI coding assistants that I built my own.</title>
      <dc:creator>Vyacheslav Mayorskiy</dc:creator>
      <pubDate>Thu, 20 Nov 2025 14:11:00 +0000</pubDate>
      <link>https://dev.to/vmayorskiyac/i-got-so-annoyed-with-ai-coding-assistants-that-i-built-my-own-5dfa</link>
      <guid>https://dev.to/vmayorskiyac/i-got-so-annoyed-with-ai-coding-assistants-that-i-built-my-own-5dfa</guid>
      <description>&lt;p&gt;I was browsing and thinking back and reminiscing on how it all started (not that long ago: back in September) – and decided to put it in writing. This is a walk down memory lane to the very beginning.&lt;/p&gt;

&lt;p&gt;With AI assistants becoming more and more powerful, I, as a Python coder, found myself using them more and more. But I operate mostly in terminal SSH sessions, so in practice the loop looked like: "do something in a terminal -&amp;gt; ok, need to solve this problem -&amp;gt; copy/paste into ChatGPT -&amp;gt; receive response -&amp;gt; copy/paste into the terminal. Repeat".&lt;/p&gt;

&lt;p&gt;At some point this process of switching back and forth and back and forth became so frustrating that I decided to do something about it.&lt;/p&gt;

&lt;p&gt;The very first thing I did was install a CLI AI assistant. I thought, "Aha! I am not the first: someone already solved this pain." Alas, the experience was terrifying: after install, it sent me back to the web to log in, and then it started working - and Oh My God was it full of itself! Every little sneeze, it had to tell me about it, and every little thing it generated needed approval from me. It acted like a little puppy looking for validation: "Did I do it right? Do you like me? Wait, I have some more, do you like it as well?"&lt;/p&gt;

&lt;p&gt;I was even more annoyed, to say the least - to the degree that I decided to build something from scratch: something that would address all the pain points that had accumulated from web copy/pasting and from the CLI assistant asking for approval.&lt;/p&gt;

&lt;p&gt;So Aye Chat was born (it had a different name at the time, of course). The main problem I wanted to solve was the constant nagging by the famous CLI assistant. It seemed unnatural that in our days, when LLMs are reasonably solid and require correction maybe 20-30% of the time if that, I would need to approve every little thing. I decided that my tool would make updates automatically.&lt;/p&gt;

&lt;p&gt;The main problem with that, of course, is the catastrophic disaster when the LLM does screw up and you lose everything you worked so hard on.&lt;/p&gt;

&lt;p&gt;Digression: in the 1990s, the PalmPilot became one of the most successful handheld devices not because of the hardware itself (many companies were making handhelds) but because it introduced a safety net: syncing your content to a PC. With that, even if you dropped it in water and it became turtle food (or nest, or mirror - whatever), your data was safe.&lt;/p&gt;

&lt;p&gt;With the LLM making updates automatically, it was very obvious that there needed to be some kind of rock-solid safety net. Implementing it was a technicality: before every update, just save the file version, and if the update was the result of the LLM being drunk or whatever, just restore that version.&lt;/p&gt;

&lt;p&gt;That was the very first feature that went into this tool: get a response from the LLM - if files need to change, save them first, then apply the update. Well, "very first feature" after the trivial integration with an LLM endpoint, of course: nowadays everybody and their guppy does that, it seems.&lt;/p&gt;
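&lt;p&gt;The whole safety net fits in a few lines. A minimal sketch of the idea (function names and the snapshot location are hypothetical, not the actual Aye Chat code):&lt;/p&gt;

```python
import shutil
from pathlib import Path

def apply_llm_update(path: Path, new_content: str, snap_dir: Path) -> None:
    """Save the current version first, then let the LLM's update land."""
    snap_dir.mkdir(parents=True, exist_ok=True)
    if path.exists():
        shutil.copy2(path, snap_dir / path.name)  # the safety net
    path.write_text(new_content)

def restore_last(path: Path, snap_dir: Path) -> None:
    """The LLM was drunk? Put the saved version back."""
    shutil.copy2(snap_dir / path.name, path)
```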

&lt;p&gt;With automatic updates we resolved the first pain point: having to approve every suggestion from the LLM.&lt;/p&gt;

&lt;p&gt;I was not looking for complex implementations: again, this was just another custom tool for myself, so I did not care what others would think, because there were no others. There was no fancy extraction of file fragments before sending to the LLM, no cumbersome reassembly of fragments when receiving them back: simplicity was the name of the game. I sent full file content and received updates as full file content. Moreover, because my projects are fairly small in nature (a bunch of AWS Lambda functions, a bunch of Terraform, a bunch of shell scripts), I would send the entire codebase at once and it would still fit into the LLM context window. On the upside: there was no missing content and no ambiguity on the LLM side: it had everything it needed to make edits.&lt;/p&gt;

&lt;p&gt;With those two things - saving files automatically with rudimentary version control, and sending all files - all of a sudden I had a miniature powerhouse, which eliminated the need to go to the web for copy/pasting, and that alone reduced the wasted time by at least 30%. Not bad for a 2-day implementation. And since I was sending all files, it did not even occur to me to add a flag for naming files in the prompt one by one: of course a wildcard mask does the job, what else.&lt;/p&gt;

&lt;p&gt;After that I became greedy. And not “good greedy”, where you want something more but everybody wins: I wanted it all for myself. The next thing I noticed was that I kept switching back and forth between the tool session, where I was talking to the AI, and another terminal, where I was doing edits and running commands. (Yes, you guessed it: I am not a tmux user. Sue me.) “We eliminated going to the web already: what’s stopping us from eliminating switching between terminals?”&lt;/p&gt;

&lt;p&gt;And so the next major thing went in: shell integration. With the tool (let’s start calling it “Aye Chat”: it was that already by then) and the lightning-fast speed of iterating from version to version, shell integration took less than an hour. And not just “ls -la” type commands: I was able to open vim right from that very session. Later came “cd” as well, when I started missing it and confinement to a single directory started being painful.&lt;/p&gt;

&lt;p&gt;All of the above happened in the span of one week, mind you: nights and a weekend. I spent another week on I can’t even remember what - but it became an obsession: there was no downtime, there was only my main work, and then all free time went into Aye Chat, and it kept getting better and better.&lt;/p&gt;

&lt;p&gt;By the end of the third week it had so many features that it became rather obvious this was no longer just a side project: it was a product in the making, one that could tremendously improve productivity for others in the same position. And not just because of AI-assisted code generation, but because of the user experience built on top of it. I tried to eliminate most friction points I came across - and the experience is now exhilarating: you think of something, you ask the tool to do it for you, and it does. You run the tests, you see failures, you ask the tool to fix them, and it does.&lt;/p&gt;

&lt;p&gt;Of course it’s a fresh product and many things are still being improved - but the pleasure of having a tool work for you, instead of fighting a tool to make it useful - that pleasure is real already.&lt;/p&gt;

&lt;p&gt;Who knows what's to come: we now share the development load with our small team (we are a consulting company that my friend and I started). We keep building on it and keep using it ourselves in our projects, and we now have a roadmap, a sprint board, and scrums three times a week. Aye Chat can now handle larger projects: with built-in RAG (Retrieval-Augmented Generation) capabilities, it has privacy-oriented, enterprise-grade features such as an offline operation mode, and more. And it keeps growing. And what's more important: however few users we have, they seem to like it.&lt;/p&gt;

&lt;p&gt;Or using the words of one of our users: “It looks very promising!” Let’s leave it at that.&lt;/p&gt;




&lt;p&gt;If you liked what you read - star 🌟 our GitHub repository (&lt;a href="https://github.com/acrotron/aye-chat" rel="noopener noreferrer"&gt;https://github.com/acrotron/aye-chat&lt;/a&gt;). It helps new users discover Aye Chat.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>cli</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
