<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Arnau Moyano</title>
    <description>The latest articles on DEV Community by Arnau Moyano (@aminoy77).</description>
    <link>https://dev.to/aminoy77</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3914349%2F155612e8-5018-4d1b-a5b2-2d611715587d.png</url>
      <title>DEV Community: Arnau Moyano</title>
      <link>https://dev.to/aminoy77</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/aminoy77"/>
    <language>en</language>
    <item>
      <title>How I built a terminal AI agent that never hits rate limits (open source, Python)</title>
      <dc:creator>Arnau Moyano</dc:creator>
      <pubDate>Thu, 07 May 2026 18:15:47 +0000</pubDate>
      <link>https://dev.to/aminoy77/how-i-built-a-terminal-ai-agent-that-never-hits-rate-limits-open-source-python-45jl</link>
      <guid>https://dev.to/aminoy77/how-i-built-a-terminal-ai-agent-that-never-hits-rate-limits-open-source-python-45jl</guid>
<description>&lt;p&gt;A month ago I was building a side project and kept hitting the same wall: I'd start a task with OpenAI, hit the rate limit, manually switch to Anthropic, hit a different limit, then open yet another tab to configure Gemini. Three API dashboards open, three different billing pages, and my actual project sitting there waiting.&lt;/p&gt;

&lt;p&gt;I didn't want to pay for multiple APIs just to keep working. So I built something to fix it.&lt;/p&gt;
&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;HelloChusquis is an open-source terminal AI agent that automatically switches between 35+ AI providers when one hits a rate limit or goes down.&lt;/p&gt;

&lt;p&gt;One config file. Zero manual switching.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;hellochusquis
hellochusquis &lt;span class="nt"&gt;--quick&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



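&lt;p&gt;The agent tries your first provider and, if it fails or hits a limit, silently falls back to the next one. As long as at least one provider is still available, you don't see an error and the task just completes; only when every provider fails does the pool raise an error.&lt;/p&gt;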

&lt;h2&gt;
  
  
  The hardest part
&lt;/h2&gt;

&lt;p&gt;The trickiest bug was getting the agent to execute commands correctly during multi-step plans. The agent would generate a plan, start executing, and then lose access to its tools halfway through. Step 1 worked, steps 2-6 failed with "Unknown tool" errors.&lt;/p&gt;

&lt;p&gt;The problem: tools were available in the initial context but weren't being passed through each step of the execution loop. Once I fixed the context propagation, multi-step tasks like "search the web for AI news and summarize the top 3 stories" started working end to end.&lt;/p&gt;
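&lt;p&gt;To make the fix concrete, here is a minimal sketch of the idea, not HelloChusquis's actual code. The names &lt;code&gt;run_step&lt;/code&gt; and &lt;code&gt;run_plan&lt;/code&gt;, and the shape of a plan step, are hypothetical and chosen only for illustration. The point is that the tool registry is handed to every step explicitly instead of living only in the initial context.&lt;/p&gt;

```python
from typing import Callable, Dict, List

# Hypothetical types for this sketch: a tool is a named function,
# and a plan step names one tool plus its argument.
Tool = Callable[[str], str]
Step = Dict[str, str]  # {"tool": ..., "arg": ...}


def run_step(step: Step, tools: Dict[str, Tool]) -> str:
    """Execute one step, looking its tool up in the registry passed in."""
    name = step["tool"]
    if name not in tools:
        raise KeyError(f"Unknown tool: {name}")
    return tools[name](step["arg"])


def run_plan(plan: List[Step], tools: Dict[str, Tool]) -> List[str]:
    """Run every step with the same tool registry, not just the first one."""
    results = []
    for step in plan:
        # Passing `tools` on every iteration is the whole point of the sketch.
        results.append(run_step(step, tools))
    return results


if __name__ == "__main__":
    tools = {"upper": str.upper, "reverse": lambda s: s[::-1]}
    plan = [
        {"tool": "upper", "arg": "ai news"},
        {"tool": "reverse", "arg": "abc"},
    ]
    print(run_plan(plan, tools))  # ['AI NEWS', 'cba']
```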

&lt;h2&gt;
  
  
  How the fallback works
&lt;/h2&gt;

&lt;p&gt;The core is a ProviderPool class that tracks each provider's state:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
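&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dataclasses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dataclass&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;exhausted&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
    &lt;span class="n"&gt;exhausted_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# set when the provider is marked exhausted&lt;/span&gt;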

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProviderPool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;available&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_available&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;provider&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;available&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_handle_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;All providers failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a provider returns a 429 (Too Many Requests), 402 (Payment Required), or 503 (Service Unavailable), it gets marked as exhausted with a timestamp. After a configurable window (default 1 hour), it resets automatically. It's essentially the circuit breaker pattern applied to LLM providers.&lt;/p&gt;
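&lt;p&gt;The exhaust-and-reset logic can be sketched roughly as follows. This is an illustration under assumptions rather than the project's real implementation: the &lt;code&gt;Provider&lt;/code&gt; here is trimmed to the fields the sketch needs, and &lt;code&gt;EXHAUSTING_STATUSES&lt;/code&gt;, &lt;code&gt;RESET_WINDOW&lt;/code&gt;, &lt;code&gt;mark_exhausted&lt;/code&gt;, and &lt;code&gt;is_available&lt;/code&gt; are hypothetical names. The current time is passed in explicitly so the behavior is easy to test.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed constants: the status codes and the 1 hour default come from the
# article, but these names are hypothetical.
EXHAUSTING_STATUSES = {402, 429, 503}
RESET_WINDOW = timedelta(hours=1)


@dataclass
class Provider:
    # Trimmed-down version of the article's Provider, for illustration only.
    name: str
    exhausted: bool = False
    exhausted_at: Optional[datetime] = None


def mark_exhausted(provider: Provider, status: int, now: datetime) -> bool:
    """Mark the provider exhausted if the status is one of the tracked codes."""
    if status in EXHAUSTING_STATUSES:
        provider.exhausted = True
        provider.exhausted_at = now
        return True
    return False


def is_available(provider: Provider, now: datetime,
                 window: timedelta = RESET_WINDOW) -> bool:
    """Return True if usable, resetting the flag once the window has elapsed."""
    if not provider.exhausted:
        return True
    if provider.exhausted_at is not None and now - provider.exhausted_at >= window:
        provider.exhausted = False
        provider.exhausted_at = None
        return True
    return False


if __name__ == "__main__":
    p = Provider("primary")
    t0 = datetime(2026, 1, 1, 12, 0)
    mark_exhausted(p, 429, t0)
    print(is_available(p, t0 + timedelta(minutes=30)))  # False
    print(is_available(p, t0 + timedelta(hours=1)))     # True
```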

&lt;h2&gt;
  
  
  What it can do
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8by923ebw286jinthuz9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8by923ebw286jinthuz9.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5c048wt9vp67cnmrracb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5c048wt9vp67cnmrracb.png" alt=" " width="800" height="41"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Beyond the fallback, HelloChusquis has grown into a full terminal agent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;128 integrations (Stripe, Supabase, AWS, Discord...)&lt;/li&gt;
&lt;li&gt;Browser automation with human-like mouse movement&lt;/li&gt;
&lt;li&gt;Web UI with voice I/O&lt;/li&gt;
&lt;li&gt;Auto-Tool Builder: describe an integration, it generates the plugin&lt;/li&gt;
&lt;li&gt;REST API mode&lt;/li&gt;
&lt;li&gt;Persistent memory across sessions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;hellochusquis
hellochusquis &lt;span class="nt"&gt;--quick&lt;/span&gt;  &lt;span class="c"&gt;# 60 second setup&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: github.com/aminoy77/HelloChusquis&lt;/p&gt;

&lt;p&gt;Open source, MIT license, free forever.&lt;/p&gt;

&lt;p&gt;If you've hit the same rate limit frustration, I'd love to hear how you're handling it, or what you'd want HelloChusquis to do that it doesn't yet.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>opensource</category>
      <category>terminal</category>
    </item>
  </channel>
</rss>
