<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nimi</title>
    <description>The latest articles on DEV Community by Nimi (@nimi_runtime).</description>
    <link>https://dev.to/nimi_runtime</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3830920%2F5c8966d9-dd48-44ed-9a74-c3d8d754d2fb.png</url>
      <title>DEV Community: Nimi</title>
      <link>https://dev.to/nimi_runtime</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nimi_runtime"/>
    <language>en</language>
    <item>
      <title>One SDK, 12 Modalities: AI Inference Shouldn't Be This Fragmented</title>
      <dc:creator>Nimi</dc:creator>
      <pubDate>Wed, 18 Mar 2026 09:44:50 +0000</pubDate>
      <link>https://dev.to/nimi_runtime/one-sdk-12-modalities-ai-inference-shouldnt-be-this-fragmented-4oi1</link>
      <guid>https://dev.to/nimi_runtime/one-sdk-12-modalities-ai-inference-shouldnt-be-this-fragmented-4oi1</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgv9en7fvs7zegayb1nm.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgv9en7fvs7zegayb1nm.jpg" alt="Nimi Banner" width="800" height="132"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;GitHub: &lt;a href="https://github.com/nimiplatform/nimi" rel="noopener noreferrer"&gt;github.com/nimiplatform/nimi&lt;/a&gt;&lt;/strong&gt; | Apache-2.0 / MIT&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;Local inference is becoming the default. But fragmentation is the real problem.&lt;/h2&gt;

&lt;p&gt;Models are getting stronger and smaller. Local inference is no longer a hobbyist pursuit — it's becoming a standard part of how AI apps are built. IDC predicts that by 2027, 80% of AI inference will run locally or at the edge.&lt;/p&gt;

&lt;p&gt;The 2025 Stack Overflow Developer Survey found that 59% of developers use three or more AI tools simultaneously. Open any AI project today and you'll see why. Take an AI character app: it needs speech recognition (STT), text reasoning (LLM), voice synthesis (TTS), scene image generation, and maybe background music. Five modalities, five separate capabilities.&lt;/p&gt;

&lt;p&gt;With today's toolchain, we need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Local text inference: Ollama or llama.cpp&lt;/li&gt;
&lt;li&gt;Local image generation: ComfyUI or AUTOMATIC1111&lt;/li&gt;
&lt;li&gt;Local voice synthesis: Piper or GPT-SoVITS&lt;/li&gt;
&lt;li&gt;Cloud video generation: Runway API&lt;/li&gt;
&lt;li&gt;Cloud music generation: Suno API&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Five tools. Five processes. Five configurations. Five different interface formats.&lt;/p&gt;

&lt;p&gt;Every AI capability is an island.&lt;/p&gt;

&lt;p&gt;It's like cooking a single meal, but the kitchen is split into five rooms. Chopping in room A, frying in room B, seasoning in room C. Every room has a different lock, a different stove, and different measuring cups.&lt;/p&gt;

&lt;p&gt;We spend 40% of our development time writing glue code — provider switching, fallback logic, health checks, streaming adapters, token metering, error retries. None of this has anything to do with our actual product. Yet every AI app is writing the same glue from scratch.&lt;/p&gt;
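&lt;p&gt;That glue looks roughly the same in every project. A minimal sketch of the fallback-and-retry layer (provider names and failure modes here are made up for illustration):&lt;/p&gt;

```typescript
// Hypothetical glue code, the kind every AI app ends up hand-rolling:
// walk an ordered provider list, retry each a few times, fall back to
// the next on failure. None of it is product code.
interface Provider {
  name: string;
  call(prompt: string): string; // throws on failure
}

function callWithFallback(providers: Provider[], prompt: string, retries = 2): string {
  for (const p of providers) {
    let attempt = 0;
    while (retries >= attempt) {
      try {
        return p.call(prompt);
      } catch {
        attempt += 1;
      }
    }
  }
  throw new Error("all providers failed");
}

// A dead local engine falls through to a working cloud provider.
const flakyLocal: Provider = {
  name: "local-llama",
  call() { throw new Error("engine not running"); },
};
const cloud: Provider = {
  name: "cloud-fallback",
  call(prompt) { return "echo: " + prompt; },
};

const out = callWithFallback([flakyLocal, cloud], "Hello");
// out === "echo: Hello", served by the fallback provider
```

&lt;p&gt;Multiply this by streaming adapters, token metering, and health checks, and the 40% figure stops looking surprising.&lt;/p&gt;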

&lt;p&gt;The time we spend on what we actually want to build? Maybe 20%. The remaining 40% goes to servers, deployment, and infrastructure.&lt;/p&gt;




&lt;h2&gt;Local tools and cloud SDKs each solve half the problem&lt;/h2&gt;

&lt;p&gt;The first half of the development pipeline already has partial solutions, and they fall into a few categories. OpenRouter's numbers tell the story — 5 million developers routing requests across 60+ providers and 300+ models. Multi-provider isn't a niche need; it's the norm.&lt;/p&gt;

&lt;p&gt;Category one: local model runners. Ollama, LM Studio, LocalAI. They solve "run a model on your machine," but they don't touch the cloud. When the local GPU isn't enough, or we need GPT-4-level reasoning, we're on our own switching to a cloud SDK.&lt;/p&gt;

&lt;p&gt;Category two: cloud API gateways. OpenRouter, LiteLLM. They unify multiple cloud providers behind one interface, but they don't touch local. Want to use local models to save money, or work offline? They can't help.&lt;/p&gt;

&lt;p&gt;There's also a third category: application-layer frameworks. LangChain, Vercel AI SDK. They abstract at the app level, but they don't manage where inference actually runs, what happens when a provider goes down, or how to manage local engine lifecycles.&lt;/p&gt;

&lt;p&gt;No single solution handles: local + cloud + multimodal + routing/fallback + lifecycle management.&lt;/p&gt;

&lt;p&gt;Each one solves one piece of the puzzle. Nobody has completed the whole picture.&lt;/p&gt;

&lt;p&gt;Nimi Runtime is that complete picture.&lt;/p&gt;




&lt;h2&gt;Runtime: not a model runner — an execution surface&lt;/h2&gt;

&lt;p&gt;We built Nimi Runtime. In one sentence:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Docker for AI inference.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Docker didn't solve "how to run a program" — that was already solved. Docker solved "run it anywhere, same behavior." Nimi Runtime works the same way: whether calling local Llama or cloud GPT-4, whether it's text or image or voice — same interface, same behavior.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3o9ugnxo520odkum1ceq.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3o9ugnxo520odkum1ceq.jpg" alt="Nimi Architecture" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A single Go daemon, running in the background. Start it up, and every AI capability comes through one port.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;nimi start                              &lt;span class="c"&gt;# start the daemon&lt;/span&gt;
nimi run &lt;span class="s2"&gt;"Hello"&lt;/span&gt;                        &lt;span class="c"&gt;# default inference (local or cloud)&lt;/span&gt;
nimi run &lt;span class="nt"&gt;--provider&lt;/span&gt; gemini &lt;span class="s2"&gt;"Hello"&lt;/span&gt;      &lt;span class="c"&gt;# specify cloud&lt;/span&gt;
nimi run &lt;span class="nt"&gt;--model&lt;/span&gt; llama3.2 &lt;span class="s2"&gt;"Hello"&lt;/span&gt;       &lt;span class="c"&gt;# specify local&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same command. Same interface. The execution plane is abstracted away.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8c45fsbybyvxwhkm5fl8.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8c45fsbybyvxwhkm5fl8.gif" alt="Nimi Quickstart" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;42 cloud providers, covering OpenAI, Anthropic, Gemini, DeepSeek, Qwen, MiniMax, Kimi, and GLM for global coverage. On the local side, Runtime supports LocalAI and the Nexa SDK as engines and automatically manages their lifecycle: startup, shutdown, health probes, fault recovery. No manual management needed.&lt;/p&gt;

&lt;p&gt;12 modalities, one protocol. A sampling:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Modality&lt;/th&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Text generation&lt;/td&gt;
&lt;td&gt;Chat, instructions, tool calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Text + vision&lt;/td&gt;
&lt;td&gt;Image understanding, OCR&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Image generation&lt;/td&gt;
&lt;td&gt;Text-to-image, image-to-image&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Video generation&lt;/td&gt;
&lt;td&gt;Text-to-video, image-to-video&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speech synthesis&lt;/td&gt;
&lt;td&gt;TTS + voice cloning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speech recognition&lt;/td&gt;
&lt;td&gt;STT + timestamp alignment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Music generation&lt;/td&gt;
&lt;td&gt;Text-to-music, style transfer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embeddings&lt;/td&gt;
&lt;td&gt;Semantic search, RAG&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Knowledge retrieval&lt;/td&gt;
&lt;td&gt;Document indexing + semantic search&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All through a single gRPC interface.&lt;/p&gt;
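&lt;p&gt;"One protocol" means the modality is a field in the request, not a separate client library. An illustrative sketch (the field names are our assumptions, not Nimi's actual gRPC schema):&lt;/p&gt;

```typescript
// Illustrative only: a unified request envelope where the modality is
// data, not a separate client. Field names are assumptions, not
// Nimi's real schema.
type Modality = "text" | "vision" | "image" | "video" | "tts" | "stt" | "music" | "embedding";

interface InferenceRequest {
  modality: Modality;
  input: string;
  provider?: string; // omit to let the runtime route
}

// One entry point for every capability; the runtime decides the rest.
function routeLabel(req: InferenceRequest): string {
  const target = req.provider ?? "auto-routed";
  return req.modality + " via " + target;
}

const text = routeLabel({ modality: "text", input: "Hello" });
const speech = routeLabel({ modality: "tts", input: "Hello", provider: "local-piper" });
// text === "text via auto-routed"; speech === "tts via local-piper"
```

&lt;p&gt;Adding a new modality extends the envelope; it doesn't add a sixth client with its own configuration format.&lt;/p&gt;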

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ag5bmcu75u67eoz4x3h.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2ag5bmcu75u67eoz4x3h.gif" alt="Nimi Multimodal" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;Key capabilities: routing, fallback, and code we no longer need to write&lt;/h2&gt;

&lt;p&gt;Smart routing. Set local-first priority: if LocalAI is healthy, it runs locally. LocalAI goes down? Automatic switch to cloud OpenAI. OpenAI rate-limited? Automatic switch to Gemini. Zero if-else statements needed.&lt;/p&gt;
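&lt;p&gt;The local-first policy reduces to picking the first usable entry in a priority-ordered list. A toy version using the health states Runtime reports (the selection logic is our simplification, not Nimi's source):&lt;/p&gt;

```typescript
// Toy health-aware router (our simplification, not Nimi's source):
// priority order is data, selection is a lookup. Healthy providers
// win; a degraded one is better than nothing.
type Health = "HEALTHY" | "DEGRADED" | "UNREACHABLE" | "UNAUTHORIZED";

interface Route {
  provider: string;
  health: Health;
}

function pickProvider(routes: Route[]): string {
  const healthy = routes.find((r) => r.health === "HEALTHY");
  if (healthy) return healthy.provider;
  const degraded = routes.find((r) => r.health === "DEGRADED");
  if (degraded) return degraded.provider;
  throw new Error("no provider available");
}

// Local engine is down and OpenAI is rate-limited, so the router
// lands on the next healthy provider in priority order.
const chosen = pickProvider([
  { provider: "localai", health: "UNREACHABLE" },
  { provider: "openai", health: "DEGRADED" },
  { provider: "gemini", health: "HEALTHY" },
]);
// chosen === "gemini"
```

&lt;p&gt;The point of "zero if-else statements" is that this decision lives in the runtime, not in application code.&lt;/p&gt;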

&lt;p&gt;Health monitoring. Runtime probes every provider every 8 seconds — HEALTHY, DEGRADED, UNREACHABLE, UNAUTHORIZED. What our app sees is always an available inference service. Which provider serves it behind the scenes doesn't matter.&lt;/p&gt;

&lt;p&gt;Idempotent deduplication. A 10,000-request sliding window prevents duplicate billing. Concurrency control: a global limit of 8, a per-app limit of 2, both configurable.&lt;/p&gt;
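&lt;p&gt;A sliding-window dedup is a small amount of code, but it's exactly the kind of thing nobody wants to maintain per project. A generic sketch (window size 3 for readability; the post's figure is 10,000):&lt;/p&gt;

```typescript
// Generic sliding-window idempotency check: remember the last N
// request keys; a repeat inside the window is rejected instead of
// billed twice.
class SlidingWindowDedup {
  private order: string[] = [];
  private seen = new Set();

  constructor(private capacity: number) {}

  // Returns true if the key is new; false if it's a duplicate still
  // inside the window.
  admit(key: string): boolean {
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    this.order.push(key);
    if (this.order.length > this.capacity) {
      const evicted = this.order.shift();
      this.seen.delete(evicted);
    }
    return true;
  }
}

const dedup = new SlidingWindowDedup(3);
const first = dedup.admit("req-1");  // true: new request
const repeat = dedup.admit("req-1"); // false: duplicate, not billed
dedup.admit("req-2");
dedup.admit("req-3");
dedup.admit("req-4");                // evicts req-1 from the window
const again = dedup.admit("req-1");  // true: the window has moved on
```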

&lt;p&gt;Audit trail. Every AI call is logged — which provider, how many tokens, what routing decision was made, whether auto-switching occurred. Ring buffer stores the last 20,000 events. For debugging, cost analysis, and compliance.&lt;/p&gt;
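&lt;p&gt;The audit log is a classic ring buffer: fixed memory, constant-time append, oldest events overwritten once it fills. A minimal sketch (the event shape is illustrative):&lt;/p&gt;

```typescript
// Minimal ring buffer for audit events: O(1) append, fixed memory,
// oldest entries overwritten once capacity is reached.
class RingBuffer {
  private buf: string[];
  private next = 0;
  private count = 0;

  constructor(private capacity: number) {
    this.buf = new Array(capacity);
  }

  push(event: string): void {
    this.buf[this.next] = event;
    this.next = (this.next + 1) % this.capacity;
    if (this.capacity > this.count) this.count += 1;
  }

  // Events oldest-first.
  toArray(): string[] {
    const start = this.count === this.capacity ? this.next : 0;
    const result: string[] = [];
    for (let i = 0; this.count > i; i += 1) {
      result.push(this.buf[(start + i) % this.capacity]);
    }
    return result;
  }
}

const audit = new RingBuffer(3); // the post cites 20,000 in production
audit.push("call openai: 120 tokens");
audit.push("fallback: openai to gemini");
audit.push("call gemini: 95 tokens");
audit.push("call localai: 40 tokens"); // overwrites the oldest entry
// toArray() now holds the last three events, oldest first
```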

&lt;p&gt;This is code we used to write in every project. Not anymore.&lt;/p&gt;




&lt;h2&gt;Comparison&lt;/h2&gt;

&lt;p&gt;All solutions in one table:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;Ollama&lt;/th&gt;
&lt;th&gt;LM Studio&lt;/th&gt;
&lt;th&gt;LocalAI&lt;/th&gt;
&lt;th&gt;ComfyUI&lt;/th&gt;
&lt;th&gt;OpenRouter&lt;/th&gt;
&lt;th&gt;LangChain&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Nimi Runtime&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Local text inference&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local image generation&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local TTS/STT&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloud providers&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ (42)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Local + cloud routing&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto-fallback&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Video / music&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Daemon architecture&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Workflow DAG&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Permissions / audit&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Ollama made running local models beautifully simple. ComfyUI's image workflows are unmatched. But each solves one dimension of the problem.&lt;/p&gt;

&lt;p&gt;Nimi Runtime unifies these dimensions into a single execution surface. Our apps only need to know one thing: call the Runtime.&lt;/p&gt;




&lt;h2&gt;Use it in your app&lt;/h2&gt;

&lt;p&gt;SDK integration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Runtime&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@nimiplatform/sdk/runtime&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;runtime&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Runtime&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Local inference&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;local&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Describe yourself in one sentence.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Cloud inference — same interface, one parameter added&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cloud&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Describe yourself in one sentence.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Already using Vercel AI SDK? Zero migration cost:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;generateText&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createNimiAiProvider&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@nimiplatform/sdk/ai-provider&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;nimi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createNimiAiProvider&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;runtime&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateText&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;nimi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;gemini/default&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Hello from Vercel AI SDK + Nimi&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Under the hood it's Nimi Runtime's routing and fallback logic. But the code looks exactly like using an OpenAI provider.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmcr0onlqnqg0ijs2y1pq.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmcr0onlqnqg0ijs2y1pq.gif" alt="Nimi SDK" width="760" height="427"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Nimi Runtime is open source. Apache-2.0 / MIT dual license.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/nimiplatform/nimi" rel="noopener noreferrer"&gt;https://github.com/nimiplatform/nimi&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://install.nimi.xyz | sh
nimi start
nimi run &lt;span class="s2"&gt;"Hello"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three commands. One unified AI inference surface. 42 cloud providers, local engines, 12 modalities, one interface.&lt;/p&gt;

&lt;p&gt;Stop writing separate integrations for every AI modality. Point Claude at this link — it'll tell you what to do next.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Nimi Team&lt;/em&gt;&lt;br&gt;
&lt;em&gt;GitHub: &lt;a href="https://github.com/nimiplatform/nimi" rel="noopener noreferrer"&gt;https://github.com/nimiplatform/nimi&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>typescript</category>
      <category>devtools</category>
    </item>
  </channel>
</rss>
