<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: David C Cavalcante</title>
    <description>The latest articles on DEV Community by David C Cavalcante (@davcavalcante).</description>
    <link>https://dev.to/davcavalcante</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1593816%2Fb6699d52-bd30-46bf-9225-8827450bc595.jpg</url>
      <title>DEV Community: David C Cavalcante</title>
      <link>https://dev.to/davcavalcante</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/davcavalcante"/>
    <language>en</language>
    <item>
      <title>These tools provide the engineering substrate required to meet the rigorous safety and economic constraints of production environments.</title>
      <dc:creator>David C Cavalcante</dc:creator>
      <pubDate>Mon, 29 Jun 2026 05:33:23 +0000</pubDate>
      <link>https://dev.to/davcavalcante/these-tools-provide-the-engineering-substrate-required-to-meet-the-rigorous-safety-and-economic-10j2</link>
      <guid>https://dev.to/davcavalcante/these-tools-provide-the-engineering-substrate-required-to-meet-the-rigorous-safety-and-economic-10j2</guid>
      <description>&lt;p&gt;Engineering production AI infrastructure requires moving beyond heuristic guesswork toward deterministic, verifiable logic. My open-source portfolio of 11 TypeScript packages, published with SLSA provenance and zero runtime dependencies, provides the foundational primitives for high-stakes agent deployments.&lt;/p&gt;

&lt;p&gt;I built the @takk ecosystem to solve specific, quantifiable bottlenecks in LLM systems engineering. I treat code as a mathematical artifact rather than a collection of features. Every module is strictly typed, ships as dual ESM/CJS, is Apache-2.0 licensed, and is validated by extensive test suites designed to catch instability before runtime.&lt;/p&gt;

&lt;p&gt;The approach rests on concrete, inspectable properties of each package:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;@takk/mcpcustoms provides a semantic firewall for agent tool calls with 158 tests across 19 suites. It implements a fail-closed, hash-chained audit trail to mitigate injection and capability overreach.&lt;/li&gt;
&lt;li&gt;@takk/gaptime implements bi-temporal knowledge-graph memory. By tracking transaction time and valid time as independent axes, it supplies the record-keeping primitive needed for EU AI Act Article 12 and ISO/IEC 42001 control A.6.2.8.&lt;/li&gt;
&lt;li&gt;@takk/krikos manages agent identity through Ed25519 signatures, enabling non-human identity governance within large-scale agent fleets.&lt;/li&gt;
&lt;li&gt;@takk/tokenforecast delivers predictive cost intelligence via Bayesian cold-start and Holt-Winters methods, with 95%+ test coverage, to support reliable FinOps for LLM spend.&lt;/li&gt;
&lt;/ol&gt;
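&lt;p&gt;To make the hash-chained audit trail concrete, here is a minimal sketch of the general technique, not the @takk/mcpcustoms implementation. The &lt;code&gt;AuditEntry&lt;/code&gt; shape, the verdict values, and the function names are assumptions made for illustration; hashing uses SHA-256 from Node's built-in &lt;code&gt;node:crypto&lt;/code&gt; module.&lt;/p&gt;

```typescript
import { createHash } from 'node:crypto';

// Hypothetical entry shape for illustration; not the @takk/mcpcustoms schema.
type Verdict = 'allow' | 'block' | 'ask';
interface AuditEntry {
  index: number;
  tool: string;
  verdict: Verdict;
  prevHash: string;
  hash: string;
}

const GENESIS = '0'.repeat(64);

// Hash the entry's fields together with the previous entry's hash.
function entryHash(index: number, tool: string, verdict: Verdict, prevHash: string): string {
  return createHash('sha256')
    .update(JSON.stringify([index, tool, verdict, prevHash]))
    .digest('hex');
}

// Append a new entry that links back to the current tail of the log.
function append(log: AuditEntry[], tool: string, verdict: Verdict): AuditEntry[] {
  const prevHash = log.length === 0 ? GENESIS : log[log.length - 1].hash;
  const index = log.length;
  const hash = entryHash(index, tool, verdict, prevHash);
  return [...log, { index, tool, verdict, prevHash, hash }];
}

// Recompute every link; an edited entry breaks the chain from that point on.
function verify(log: AuditEntry[]): boolean {
  let prevHash = GENESIS;
  for (const entry of log) {
    if (entry.prevHash !== prevHash) return false;
    if (entry.hash !== entryHash(entry.index, entry.tool, entry.verdict, prevHash)) return false;
    prevHash = entry.hash;
  }
  return true;
}

let log: AuditEntry[] = [];
log = append(log, 'fs.read', 'allow');
log = append(log, 'shell.exec', 'block');
console.log(verify(log)); // true

// Flipping a recorded verdict without recomputing hashes is detected.
const tampered = log.map((e) => (e.index === 1 ? { ...e, verdict: 'allow' as Verdict } : e));
console.log(verify(tampered)); // false
```

&lt;p&gt;Because each hash covers the previous one, rewriting a single entry means rewriting every later entry too, which is what makes the trail tamper-evident.&lt;/p&gt;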

&lt;p&gt;These tools do not offer magic; they provide the engineering substrate required to meet the rigorous safety and economic constraints of production environments. Governance compliance is an organizational responsibility; these libraries simply provide the auditability and control mechanisms to make that compliance technically feasible.&lt;/p&gt;

&lt;p&gt;Zero-dependency design remains non-negotiable to minimize the attack surface and ensure deterministic behavior across edge and server environments. By isolating business logic from external sidecars, I have optimized for performance and verifiable reliability.&lt;/p&gt;

&lt;p&gt;Inspect the technical architecture, test coverage, and source code here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/davccavalcante/racs" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/racs&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/davccavalcante/modelchain" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/modelchain&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/davccavalcante/mcpcustoms" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/mcpcustoms&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/davccavalcante/gaptime" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/gaptime&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/davccavalcante/krikos" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/krikos&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/davccavalcante/tokenforecast" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/tokenforecast&lt;/a&gt;&lt;br&gt;
&lt;a href="https://github.com/davccavalcante/alkaline" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/alkaline&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The transition from prototype to industrial-grade infrastructure requires this level of discipline. Inspect the repositories to evaluate the implementation details. Constructive critique based on the codebase is welcome.&lt;/p&gt;





</description>
      <category>github</category>
      <category>npm</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Engineering production AI infrastructure requires moving beyond heuristic guesswork toward deterministic, verifiable logic</title>
      <dc:creator>David C Cavalcante</dc:creator>
      <pubDate>Mon, 29 Jun 2026 05:29:27 +0000</pubDate>
      <link>https://dev.to/davcavalcante/engineering-production-ai-infrastructure-requires-moving-beyond-heuristic-guesswork-toward-3mpg</link>
      <guid>https://dev.to/davcavalcante/engineering-production-ai-infrastructure-requires-moving-beyond-heuristic-guesswork-toward-3mpg</guid>
      <description>&lt;p&gt;Engineering production AI infrastructure requires moving beyond heuristic guesswork toward deterministic, verifiable logic. My open-source portfolio of 11 TypeScript packages, published with SLSA provenance and zero runtime dependencies, provides the mathematical and architectural substrate for this transition. Every module in the @takk and @teleologyhi-sdk ecosystem adheres to strict TypeScript, dual ESM/CJS, and Apache-2.0 licensing, ensuring that the logic powering your agent fleet remains as defensible as your core application code.&lt;/p&gt;

&lt;p&gt;@takk/mcpcustoms functions as a semantic firewall for agent tool calls. Validated by 158 tests across 19 suites, it implements seven default detectors to intercept command injection and secret exfiltration. It maintains a hash-chained, tamper-evident audit trail, so every verdict (allow, block, or ask) is recorded with cryptographic integrity.&lt;/p&gt;

&lt;p&gt;@takk/gaptime addresses memory volatility through bi-temporal knowledge graph modeling. By tracking both valid time and transaction time across 13 Allen interval relations, it enables agents to resolve historical contradictions. This architecture provides the record-keeping primitive required for EU AI Act Article 12 and ISO/IEC 42001 control A.6.2.8.&lt;/p&gt;
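&lt;p&gt;A bi-temporal lookup can be sketched as follows. This is a simplified illustration, not the @takk/gaptime data model: the &lt;code&gt;Fact&lt;/code&gt; shape, the &lt;code&gt;asOf&lt;/code&gt; function, and the numeric day timestamps are hypothetical, and a correction is modeled simply as a later-recorded fact that takes precedence.&lt;/p&gt;

```typescript
// Hypothetical fact shape for illustration; not the @takk/gaptime data model.
interface Fact {
  subject: string;
  value: string;
  validFrom: number;  // valid time: when the fact became true in the world
  validTo: number;    // valid time: when it stopped being true (exclusive)
  recordedAt: number; // transaction time: when the system learned it
}

// What did the system believe at `knownAt` about the world at `validAt`?
function asOf(facts: Fact[], subject: string, validAt: number, knownAt: number): string | undefined {
  let best: Fact | undefined;
  for (const f of facts) {
    if (f.subject !== subject) continue;
    if (f.recordedAt > knownAt) continue; // not yet known to the system
    if (f.validFrom > validAt) continue;  // not yet true in the world
    if (validAt >= f.validTo) continue;   // no longer true in the world
    // Among matching facts, the most recently recorded one wins.
    if (best === undefined || f.recordedAt > best.recordedAt) best = f;
  }
  return best?.value;
}

const facts: Fact[] = [
  // Recorded on day 10: Alice has lived in Lisbon since day 0.
  { subject: 'alice.city', value: 'Lisbon', validFrom: 0, validTo: Infinity, recordedAt: 10 },
  // Correction recorded on day 30: she actually moved to Porto on day 20.
  { subject: 'alice.city', value: 'Lisbon', validFrom: 0, validTo: 20, recordedAt: 30 },
  { subject: 'alice.city', value: 'Porto', validFrom: 20, validTo: Infinity, recordedAt: 30 },
];

console.log(asOf(facts, 'alice.city', 25, 15)); // Lisbon (what was believed on day 15)
console.log(asOf(facts, 'alice.city', 25, 40)); // Porto (after the correction)
```

&lt;p&gt;Keeping both axes means the earlier belief is never overwritten, so an auditor can still reconstruct what the agent knew at the moment it acted.&lt;/p&gt;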

&lt;p&gt;@takk/krikos and @takk/alkaline manage the operational lifecycle. Krikos establishes Ed25519-based non-human identity governance, while Alkaline offers durable execution without external sidecars, persisting state via swappable cells for SQLite and Postgres.&lt;/p&gt;

&lt;p&gt;@takk/tokenforecast provides predictive cost intelligence, utilizing Holt-Winters and Bayesian cold-start methods to forecast LLM spend and detect drift via Page-Hinkley analysis. This grounds cost economics in statistical reality rather than heuristic vibes.&lt;/p&gt;
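&lt;p&gt;For readers unfamiliar with Page-Hinkley, here is a minimal sketch of the textbook test for an upward shift in a stream's mean. It is not the @takk/tokenforecast code; the function name, the &lt;code&gt;delta&lt;/code&gt; and &lt;code&gt;lambda&lt;/code&gt; values, and the sample spend series are assumptions chosen for illustration.&lt;/p&gt;

```typescript
// Page-Hinkley test for an upward shift in the mean of a stream.
// `delta` is the tolerated drift per sample; `lambda` is the alarm threshold.
// Returns the index at which drift is flagged, or -1 if none is detected.
function pageHinkley(values: number[], delta: number, lambda: number): number {
  let mean = 0;
  let cumulative = 0;
  let minCumulative = 0;
  let n = 0;
  for (const x of values) {
    n += 1;
    mean += (x - mean) / n;         // running mean of everything seen so far
    cumulative += x - mean - delta; // accumulated deviation above the mean
    minCumulative = Math.min(minCumulative, cumulative);
    if (cumulative - minCumulative > lambda) return n - 1;
  }
  return -1;
}

// Illustrative daily spend in USD: stable at 10, then a jump to 20.
const spend = [10, 10, 10, 10, 10, 20, 20, 20, 20, 20];
console.log(pageHinkley(spend, 0.5, 15)); // 7

// A flat series never raises the alarm.
console.log(pageHinkley([10, 10, 10, 10], 0.5, 15)); // -1
```

&lt;p&gt;A larger &lt;code&gt;lambda&lt;/code&gt; trades detection speed for fewer false alarms; here the jump at index 5 is flagged two samples later.&lt;/p&gt;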

&lt;p&gt;Inspect the 11 repositories and full test suites here: &lt;a href="https://github.com/davccavalcante" rel="noopener noreferrer"&gt;https://github.com/davccavalcante&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Which of these architectural constraints is currently the primary bottleneck in your production agent pipeline?&lt;/p&gt;





</description>
      <category>github</category>
      <category>npm</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Static routing is a relic. #LangChain keeps you chained to manual configurations while costs mount. #ModelChain changes the paradigm. It routes prompts dynamically based on empirical cost, latency, and quality data. Drop it in. https://lnk.ua/XFl5MJBPl</title>
      <dc:creator>David C Cavalcante</dc:creator>
      <pubDate>Mon, 01 Jun 2026 05:25:52 +0000</pubDate>
      <link>https://dev.to/davcavalcante/static-routing-is-a-relic-langchain-keeps-you-chained-to-manual-configurations-while-costs-mount-27eo</link>
      <guid>https://dev.to/davcavalcante/static-routing-is-a-relic-langchain-keeps-you-chained-to-manual-configurations-while-costs-mount-27eo</guid>
      <description>&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
        &lt;div class="c-embed__cover"&gt;
          &lt;a href="https://davccavalcante.github.io/modelchain/" class="c-link align-middle" rel="noopener noreferrer"&gt;
            &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmodelchain.takk.ag%2Fassets%2Fog-image.png" height="400" class="m-0" width="800"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="c-embed__body"&gt;
        &lt;h2 class="fs-xl lh-tight"&gt;
          &lt;a href="https://davccavalcante.github.io/modelchain/" rel="noopener noreferrer" class="c-link"&gt;
            modelchain - measurable LLM router for Node, Edge &amp;amp; browser
          &lt;/a&gt;
        &lt;/h2&gt;
          &lt;p class="truncate-at-3"&gt;
            Zero-dependency, drop-in router. Route one prompt across OpenAI, Anthropic, Gemini, or any OpenAI-compatible endpoint by cost, latency, and observed quality. Native streaming, tool calling, Vercel AI SDK adapter.
          &lt;/p&gt;
        &lt;div class="color-secondary fs-s flex items-center"&gt;
            &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdavccavalcante.github.io%2Fmodelchain%2Fassets%2Ffavicon.svg" width="64" height="64"&gt;
          davccavalcante.github.io
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


</description>
    </item>
    <item>
      <title>ModelChain: Measurable LLM Router with Adaptive Model Selection, Real-Time Scoring, Budget Guards and Failover for Node.js, Edge and Browser</title>
      <dc:creator>David C Cavalcante</dc:creator>
      <pubDate>Sat, 30 May 2026 22:01:10 +0000</pubDate>
      <link>https://dev.to/davcavalcante/modelchain-measurable-llm-router-with-adaptive-model-selection-real-time-scoring-budget-guards-4ag7</link>
      <guid>https://dev.to/davcavalcante/modelchain-measurable-llm-router-with-adaptive-model-selection-real-time-scoring-budget-guards-4ag7</guid>
      <description>&lt;h1&gt;
  
  
  ModelChain: Measurable LLM Router with Adaptive Model Selection, Real-Time Scoring, Budget Guards and Failover for Node.js, Edge and Browser
&lt;/h1&gt;

&lt;p&gt;As a solo LLMOps engineer with over 25 years building production AI systems, I kept hitting the same limitation: when you have access to multiple LLM providers and models, any fixed rule for choosing the right one per request quickly becomes fragile and outdated.&lt;/p&gt;

&lt;p&gt;Static if/else rules or fixed fallbacks do not survive real-world changes in pricing, latency, or model quality. Manual benchmarking is time-consuming and error-prone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ModelChain&lt;/strong&gt; (@takk/modelchain) was built to solve this.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Developers and companies with keys for OpenAI, Anthropic, Gemini, Groq, and others waste time and money because they cannot dynamically route each prompt to the best available model based on current cost, observed latency, and actual response quality. Hard-coded choices quickly become suboptimal.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;ModelChain&lt;/strong&gt; is a measurable, adaptive LLM router for Node.js, Edge runtimes, and the browser. It selects the best model per request using seven routing strategies, scores every response in real time, feeds those scores back into future decisions, enforces hard budget guards, and includes per-model circuit breakers with automatic failover.&lt;/p&gt;

&lt;p&gt;It normalises responses, tool calling, and streaming across providers while remaining zero-runtime-dependency and fully tree-shakeable.&lt;/p&gt;
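&lt;p&gt;The per-model circuit breaker with automatic failover described above follows a well-known pattern, sketched below. This is a generic illustration rather than ModelChain's internals: the class, the &lt;code&gt;pickModel&lt;/code&gt; helper, the failure threshold, and the cooldown are all assumptions.&lt;/p&gt;

```typescript
// Hypothetical breaker for illustration; names and thresholds are assumptions,
// not ModelChain's internals.
type BreakerState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // Closed and half-open breakers let traffic through. An open breaker blocks
  // it until the cooldown has elapsed, then allows a probe request.
  canRequest(now: number): boolean {
    if (this.state !== 'open') return true;
    if (now - this.openedAt >= this.cooldownMs) {
      this.state = 'half-open';
      return true;
    }
    return false;
  }

  // A success closes the breaker; repeated failures (or a failed probe) open it.
  record(success: boolean, now: number): void {
    if (success) {
      this.failures = 0;
      this.state = 'closed';
      return;
    }
    this.failures += 1;
    if (this.state === 'half-open' || this.failures >= this.threshold) {
      this.state = 'open';
      this.openedAt = now;
    }
  }
}

// Failover: take the first model, in preference order, whose breaker allows a request.
function pickModel(
  order: string[],
  breakers: { [model: string]: CircuitBreaker },
  now: number,
): string | undefined {
  return order.find((model) => breakers[model].canRequest(now));
}

const breakers = { primary: new CircuitBreaker(2, 1000), backup: new CircuitBreaker(2, 1000) };
breakers.primary.record(false, 0);
breakers.primary.record(false, 10); // second failure opens the breaker

console.log(pickModel(['primary', 'backup'], breakers, 20));   // backup
console.log(pickModel(['primary', 'backup'], breakers, 1500)); // primary (cooldown elapsed)
```

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; explicitly instead of reading the clock inside the class keeps the breaker deterministic and easy to test.&lt;/p&gt;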

&lt;h2&gt;
  
  
  Core Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Seven declarative routing strategies (cost-then-quality, cost-first, quality-first, etc.)&lt;/li&gt;
&lt;li&gt;Six pluggable scorers (latency, token-budget, length-bound, regex-match, exact-match, schema-valid)&lt;/li&gt;
&lt;li&gt;Native streaming over Web Streams with a unified &lt;code&gt;CompletionChunk&lt;/code&gt; type&lt;/li&gt;
&lt;li&gt;Normalised tool calling across OpenAI, Anthropic, and Gemini&lt;/li&gt;
&lt;li&gt;Hard budget guard (per-request, per-task, daily ceilings) that throws before any network call&lt;/li&gt;
&lt;li&gt;Per-model circuit breaker + full-jitter exponential backoff + automatic failover&lt;/li&gt;
&lt;li&gt;EWMA health scoring that decays on failure and recovers on success&lt;/li&gt;
&lt;li&gt;Thirteen in-process telemetry events (no external OpenTelemetry required)&lt;/li&gt;
&lt;li&gt;Vercel AI SDK adapter (&lt;code&gt;toVercelAILanguageModel&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;CLI proxy, inspect, and bench modes&lt;/li&gt;
&lt;li&gt;Six tree-shakeable entry points (core, providers, web, edge, ai-sdk, cli)&lt;/li&gt;
&lt;li&gt;SLSA provenance on every release&lt;/li&gt;
&lt;/ul&gt;
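&lt;p&gt;The idea behind a hard budget guard that throws before any network call can be sketched in a few lines. This is an assumption-laden illustration, not ModelChain's implementation: the field names echo the configuration used in this post's quickstart, while the functions, the error class, and the worst-case estimate are hypothetical.&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration; not ModelChain's actual types.
interface ModelCost {
  costPer1kInput: number;
  costPer1kOutput: number;
}
interface Budget {
  perRequestUsd: number;
  dailyUsd: number;
}

class BudgetExceededError extends Error {}

// Worst-case estimate: every input token plus the full maxTokens of output.
function estimateCostUsd(cost: ModelCost, inputTokens: number, maxTokens: number): number {
  return (inputTokens / 1000) * cost.costPer1kInput + (maxTokens / 1000) * cost.costPer1kOutput;
}

// Throws before any request is dispatched if the estimate would breach a ceiling.
function assertWithinBudget(estimateUsd: number, spentTodayUsd: number, budget: Budget): void {
  if (estimateUsd > budget.perRequestUsd) {
    throw new BudgetExceededError('per-request ceiling exceeded');
  }
  if (spentTodayUsd + estimateUsd > budget.dailyUsd) {
    throw new BudgetExceededError('daily ceiling exceeded');
  }
}

const mini: ModelCost = { costPer1kInput: 0.00015, costPer1kOutput: 0.0006 };
const budget: Budget = { perRequestUsd: 0.02, dailyUsd: 5 };

// 2000 input tokens and up to 200 output tokens: roughly 0.00042 USD.
const estimate = estimateCostUsd(mini, 2000, 200);
assertWithinBudget(estimate, 1.25, budget); // passes silently

try {
  assertWithinBudget(estimate, 4.9999, budget); // would push the day past 5 USD
} catch (err) {
  console.log((err as Error).message); // daily ceiling exceeded
}
```

&lt;p&gt;Estimating from &lt;code&gt;maxTokens&lt;/code&gt; is deliberately pessimistic: the guard may reject a request that would have been cheaper in practice, but it never lets one through that could exceed the ceiling.&lt;/p&gt;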

&lt;h2&gt;
  
  
  Quickstart Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Basic Router Setup
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createModelchain&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@takk/modelchain&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;openaiModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;anthropicModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;geminiModel&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@takk/modelchain/providers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createModelchain&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;models&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nf"&gt;openaiModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4o-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;costPer1kInput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.00015&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;costPer1kOutput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.00060&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="nf"&gt;anthropicModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-3-5-haiku-latest&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;costPer1kInput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.00080&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;costPer1kOutput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.00400&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ANTHROPIC_API_KEY&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="nf"&gt;geminiModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini-2.0-flash&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;cost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;costPer1kInput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.00010&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;costPer1kOutput&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.00040&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;GEMINI_API_KEY&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cost-then-quality&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;scoring&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;built&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;latency&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;token-budget&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;perRequestUsd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.02&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;dailyUsd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;telemetry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Summarise X in 3 bullets.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;maxTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;finishReason&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Streaming
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="k"&gt;await &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Tell me a story.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text-delta&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stdout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;delta&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;finish&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;Done:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;finishReason&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Vercel AI SDK Integration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;generateText&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;toVercelAILanguageModel&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@takk/modelchain/ai-sdk&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateText&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;toVercelAILanguageModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hello.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Tool Calling (normalised)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;What is the weather in Tokyo?&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="cm"&gt;/* ToolDefinition shape */&lt;/span&gt; &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How It Works (Request Flow)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Select best model using chosen strategy and current health/scores
&lt;/li&gt;
&lt;li&gt;Pre-flight budget guard check
&lt;/li&gt;
&lt;li&gt;Dispatch request through normalised provider adapter
&lt;/li&gt;
&lt;li&gt;Classify response or error
&lt;/li&gt;
&lt;li&gt;Update EWMA health score and circuit breaker state
&lt;/li&gt;
&lt;li&gt;Score response quality and record for future routing
&lt;/li&gt;
&lt;li&gt;Emit telemetry events
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All operations happen in-process with zero external dependencies.&lt;/p&gt;
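&lt;p&gt;Step 5 of the flow relies on an exponentially weighted moving average (EWMA). A minimal sketch of that update rule follows; the function name, the smoothing factor &lt;code&gt;alpha&lt;/code&gt;, and the starting score are assumptions for illustration, not ModelChain's actual parameters.&lt;/p&gt;

```typescript
// EWMA health score in [0, 1]: each outcome pulls the score toward 1 (success)
// or 0 (failure). A larger `alpha` makes the score react faster.
function updateHealth(score: number, success: boolean, alpha: number): number {
  const observation = success ? 1 : 0;
  return alpha * observation + (1 - alpha) * score;
}

let health = 1;

// Three consecutive failures decay the score: 1 -> 0.7 -> 0.49 -> 0.343.
for (const ok of [false, false, false]) health = updateHealth(health, ok, 0.3);
console.log(health.toFixed(3)); // 0.343

// Two successes start the recovery: 0.343 -> 0.5401 -> 0.67807.
for (const ok of [true, true]) health = updateHealth(health, ok, 0.3);
console.log(health.toFixed(3)); // 0.678
```

&lt;p&gt;Because old outcomes fade geometrically, a model that failed an hour ago is not penalised forever, while a model that is failing right now drops quickly in the routing order.&lt;/p&gt;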

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add @takk/modelchain
&lt;span class="c"&gt;# or&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; @takk/modelchain
&lt;span class="c"&gt;# or&lt;/span&gt;
yarn add @takk/modelchain
&lt;span class="c"&gt;# or&lt;/span&gt;
bun add @takk/modelchain
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Peer dependencies are optional and are only needed if you use the richer typed adapters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why ModelChain Exists
&lt;/h2&gt;

&lt;p&gt;ModelChain is the second building block (after KeyMesh) of a long-term family of high-reliability, open-source-first npm libraries for AI-native infrastructure that I plan to maintain through 2026–2030.&lt;/p&gt;

&lt;p&gt;I built it because dynamic, measurable routing is the missing layer between raw LLM providers and production applications that care about cost, latency, quality, and reliability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Documentation: &lt;a href="https://davccavalcante.github.io/modelchain/" rel="noopener noreferrer"&gt;https://davccavalcante.github.io/modelchain/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm: &lt;a href="https://www.npmjs.com/package/@takk/modelchain" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@takk/modelchain&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/davccavalcante/modelchain" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/modelchain&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;License: Apache-2.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you run multi-provider LLM applications in Node.js, Edge, or with the Vercel AI SDK, I would love your feedback, real-world usage reports, and contributions.&lt;/p&gt;

&lt;p&gt;Try ModelChain today and let me know which routing strategy and scorers work best for your workload.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>javascript</category>
      <category>llm</category>
      <category>showdev</category>
    </item>
    <item>
      <title>KeyMesh: Zero-Runtime-Dependency API Key Rotation, Circuit Breaker and Failover for Production LLM Applications in Node.js</title>
      <dc:creator>David C Cavalcante</dc:creator>
      <pubDate>Sat, 30 May 2026 21:58:53 +0000</pubDate>
      <link>https://dev.to/davcavalcante/keymesh-zero-runtime-dependency-api-key-rotation-circuit-breaker-and-failover-for-production-llm-ij6</link>
      <guid>https://dev.to/davcavalcante/keymesh-zero-runtime-dependency-api-key-rotation-circuit-breaker-and-failover-for-production-llm-ij6</guid>
      <description>&lt;h1&gt;
  
  
  KeyMesh: Zero-Runtime-Dependency API Key Rotation, Circuit Breaker and Failover for Production LLM Applications in Node.js
&lt;/h1&gt;

&lt;p&gt;As a solo LLMOps engineer with over 25 years of experience building production AI systems, I constantly faced the same critical failure point: API key rate limits and transient errors breaking LLM-powered applications.&lt;/p&gt;

&lt;p&gt;KeyMesh was created to solve exactly this problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;When a single OpenAI, Anthropic or Gemini API key hits a 429 Too Many Requests (or any transient 5xx/408 error), most applications fail immediately for the user. Manual key rotation or on-call intervention becomes necessary. Existing gateway solutions add network hops, latency, and extra operational complexity.&lt;/p&gt;

&lt;p&gt;I needed a solution that lives inside the application code itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;KeyMesh&lt;/strong&gt; (@takk/keymesh) is a universal, zero-runtime-dependency Node.js library and CLI that provides intelligent API key rotation, per-key circuit breakers, smart retries, health scoring, and automatic failover.&lt;/p&gt;

&lt;p&gt;It works as a drop-in replacement for official SDKs and supports any HTTP-based API.&lt;/p&gt;

&lt;p&gt;KeyMesh is fully TypeScript-first, has 93% test coverage (145 tests), zero runtime dependencies, and ships with SLSA provenance for supply-chain security.&lt;/p&gt;

&lt;h2&gt;
  
  
  Core Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Automatic key rotation using multiple selection strategies (round-robin, least-used, weighted, sequential-then-rotate, and custom)&lt;/li&gt;
&lt;li&gt;Per-key circuit breaker with three states (closed, open, half-open)&lt;/li&gt;
&lt;li&gt;Smart retry with AWS full-jitter exponential backoff and Retry-After support&lt;/li&gt;
&lt;li&gt;Health scoring system (0-100) that decays on failure and recovers on success&lt;/li&gt;
&lt;li&gt;In-process telemetry with 8 typed events (no external OpenTelemetry dependency)&lt;/li&gt;
&lt;li&gt;Pluggable state backends (memory by default, file backend included; Redis/Postgres planned)&lt;/li&gt;
&lt;li&gt;Auth-failure cooldown (a 401 error disables the key for 24 hours)&lt;/li&gt;
&lt;li&gt;Official adapters for OpenAI, Anthropic, Gemini, and a generic HTTP adapter&lt;/li&gt;
&lt;li&gt;CLI proxy mode for easy testing and non-Node.js environments&lt;/li&gt;
&lt;/ul&gt;
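
&lt;p&gt;To make the retry feature concrete, here is a minimal sketch of full-jitter exponential backoff with Retry-After support. It follows the commonly cited formula (a random delay between zero and the capped exponential ceiling); the function names, the 30-second cap, and the rule for combining the delay with Retry-After are assumptions for illustration, not KeyMesh's internals.&lt;/p&gt;

```typescript
// Hypothetical helpers; KeyMesh's internal implementation may differ.
// Full jitter: delay = random_between(0, min(cap, base * 2 ** attempt))
function fullJitterDelayMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number = Math.random, // injectable so the result is testable
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// If the server sent Retry-After (in seconds), wait at least that long.
// The 200 ms base mirrors the quickstart; the 30 s cap is an assumption.
function nextDelayMs(attempt: number, retryAfterSeconds?: number): number {
  const jittered = fullJitterDelayMs(attempt, 200, 30_000);
  return retryAfterSeconds === undefined
    ? jittered
    : Math.max(jittered, retryAfterSeconds * 1000);
}

// With a fixed "random" value of 0.5, attempt 3 gives half of 1600 ms.
console.log(fullJitterDelayMs(3, 200, 30_000, () => 0.5)); // prints 800
```

&lt;p&gt;Randomising over the whole interval, rather than adding a small jitter to a fixed delay, spreads out retries from many clients that failed at the same moment.&lt;/p&gt;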

&lt;h2&gt;
  
  
  Quickstart Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. OpenAI SDK Adapter
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createKeymesh&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@takk/keymesh&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;openaiAdapter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@takk/keymesh/openai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createKeymesh&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;openaiAdapter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEYS&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
  &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;least-used&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;circuitBreaker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;cooldownMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;retry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;max&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;baseMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;jitter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;telemetry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Use exactly like the official OpenAI client&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gpt-4.1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hello.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Generic HTTP Adapter (any API)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createKeymesh&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@takk/keymesh&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;httpAdapter&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@takk/keymesh/http&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tavily&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createKeymesh&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;httpAdapter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.tavily.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;authHeader&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Bearer &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TAVILY_API_KEYS&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;,&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt;
  &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;round-robin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tavily&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AI infrastructure 2026&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. CLI Proxy Mode
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;OPENAI_API_KEYS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;key1,key2,key3 npx @takk/keymesh start &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--port&lt;/span&gt; 8787 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--adapter&lt;/span&gt; openai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--strategy&lt;/span&gt; round-robin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then call it like a normal OpenAI endpoint at &lt;a href="http://localhost:8787" rel="noopener noreferrer"&gt;http://localhost:8787&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works (Request Flow)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Pick key using selected strategy&lt;/li&gt;
&lt;li&gt;Dispatch request through the provider adapter&lt;/li&gt;
&lt;li&gt;Classify response/error&lt;/li&gt;
&lt;li&gt;Update health score and circuit breaker state&lt;/li&gt;
&lt;li&gt;Retry with backoff or rotate to next healthy key&lt;/li&gt;
&lt;li&gt;Emit telemetry events&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All keys remain hashed in state. Raw credentials are never logged or persisted.&lt;/p&gt;
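
&lt;p&gt;The circuit breaker update in step 4 can be pictured as a small state machine per key. The sketch below is an assumption-laden illustration, not KeyMesh's actual classes: the class and method names are hypothetical, and the defaults simply reuse the threshold of 3 and the 30-second cooldown from the quickstart.&lt;/p&gt;

```typescript
// Hypothetical sketch; not KeyMesh's actual classes or option names.
type BreakerState = 'closed' | 'open' | 'half-open';

class KeyBreaker {
  state: BreakerState = 'closed';
  failures = 0;
  openedAt = 0;
  threshold: number;
  cooldownMs: number;

  constructor(threshold = 3, cooldownMs = 30_000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  // Can this key be tried at time `now` (milliseconds)?
  canAttempt(now: number): boolean {
    if (this.state !== 'open') return true;
    if (now - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: let probe requests through. A stricter version
      // would limit half-open to a single in-flight probe.
      this.state = 'half-open';
      return true;
    }
    return false;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure(now: number): void {
    this.failures += 1;
    // A failed probe, or too many consecutive failures, opens the breaker.
    if (this.state === 'half-open' || this.failures >= this.threshold) {
      this.state = 'open';
      this.openedAt = now;
    }
  }
}
```

&lt;p&gt;A key selector would call canAttempt before picking a key and skip any key that returns false, which is what lets the rotation in step 5 land on the next healthy key.&lt;/p&gt;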

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pnpm add @takk/keymesh
&lt;span class="c"&gt;# or&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; @takk/keymesh
&lt;span class="c"&gt;# or&lt;/span&gt;
yarn add @takk/keymesh
&lt;span class="c"&gt;# or&lt;/span&gt;
bun add @takk/keymesh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Provider SDKs are optional and are needed only if you use the typed adapters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why KeyMesh Exists
&lt;/h2&gt;

&lt;p&gt;I built KeyMesh because I got tired of production incidents caused by rate limits. It turns a common point of failure into silent, automatic self-healing.&lt;/p&gt;

&lt;p&gt;It is the first piece of a larger family of high-reliability open-source libraries for the AI infrastructure stack that I plan to maintain long-term.&lt;/p&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Documentation: &lt;a href="https://davccavalcante.github.io/keymesh/" rel="noopener noreferrer"&gt;https://davccavalcante.github.io/keymesh/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;npm: &lt;a href="https://www.npmjs.com/package/@takk/keymesh" rel="noopener noreferrer"&gt;https://www.npmjs.com/package/@takk/keymesh&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/davccavalcante/keymesh" rel="noopener noreferrer"&gt;https://github.com/davccavalcante/keymesh&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;License: Apache-2.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you work with LLM applications in Node.js, Bun, Deno, or Edge runtimes, I would love your feedback and contributions.&lt;/p&gt;

&lt;p&gt;Try KeyMesh today and let me know how it performs in your production environment.&lt;/p&gt;

</description>
      <category>api</category>
      <category>llm</category>
      <category>node</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
