NeuroLink AI

Posted on • Edited on • Originally published at blog.neurolink.ink

Accessing 200+ AI Models with One TypeScript SDK: OpenRouter + NeuroLink Guide

The modern AI landscape is fragmented. You want GPT-4o for creative writing, Claude for code analysis, Gemini for multimodal tasks, and Llama for cost-effective batch processing. Each provider has its own SDK, authentication flow, rate limits, and response format. Managing all of this in a production TypeScript application quickly becomes a nightmare.

What if you could access 200+ AI models through a single, type-safe TypeScript SDK?

That is exactly what NeuroLink + OpenRouter delivers.


The Multi-Provider Problem

Here is what a typical multi-provider setup looks like without a unified layer:

// Provider A
import OpenAI from 'openai';
const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY });
const gptResult = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: prompt }],
});

// Provider B
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_KEY });
const claudeResult = await anthropic.messages.create({
  model: 'claude-sonnet-4-5', // Anthropic's documented alias for the latest Sonnet 4.5 snapshot
  max_tokens: 1024,
  messages: [{ role: 'user', content: prompt }],
});

// Provider C
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(process.env.GEMINI_KEY!); // non-null assertion for strict TS
const geminiModel = genAI.getGenerativeModel({ model: 'gemini-2.0-flash' });
const geminiResult = await geminiModel.generateContent(prompt);

Three SDKs. Three API keys. Three different response formats. Three sets of error handling. And if one provider goes down, your application goes with it.


What is OpenRouter?

OpenRouter is a unified API gateway that provides access to 200+ AI models from every major provider -- OpenAI, Anthropic, Google, Meta, Mistral, Cohere, and dozens more -- through a single endpoint and a single API key.

Key benefits:

  • One API key for all providers
  • Automatic failover between providers hosting the same model
  • Unified pricing with transparent per-token costs
  • No vendor lock-in -- switch models by changing a string

When paired with NeuroLink, you get full TypeScript type safety, streaming support, structured output parsing, and intelligent routing on top of OpenRouter's model catalog.
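To make "switch models by changing a string" concrete, here is a minimal sketch against OpenRouter's OpenAI-compatible REST endpoint, with no SDK involved. The request shape is identical for every model in the catalog; only the `model` string changes. (The `buildRequest` and `callOpenRouter` helpers are illustrative, not part of any library.)

```typescript
// The OpenAI-compatible request body OpenRouter accepts for every model.
interface OpenRouterRequest {
  model: string;
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[];
  max_tokens?: number;
}

// Same body shape whether the model is Claude, GPT-4o, Gemini, or Llama.
function buildRequest(model: string, prompt: string): OpenRouterRequest {
  return {
    model,
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 1024,
  };
}

// One endpoint, one API key, for the entire catalog.
async function callOpenRouter(req: OpenRouterRequest): Promise<string> {
  const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(req),
  });
  const data = await res.json();
  return data.choices[0].message.content; // OpenAI-compatible response shape
}
```

Swapping `'openai/gpt-4o'` for `'meta-llama/llama-3.1-70b-instruct'` in `buildRequest` is the entire migration.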


Setting Up NeuroLink with OpenRouter

Installation

npm install neurolink-ai

Configuration

import { NeuroLink } from 'neurolink-ai';

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
  // Optional: set default model
  defaultModel: 'anthropic/claude-sonnet-4-5-20250514',
});

That is it. One import, one configuration object, and you have access to the entire OpenRouter catalog.


Basic Text Generation with Type Safety

NeuroLink provides full TypeScript inference for model parameters and responses:

import { NeuroLink } from 'neurolink-ai';
import type { ChatMessage, GenerationConfig } from 'neurolink-ai/types';

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
});

const messages: ChatMessage[] = [
  { role: 'system', content: 'You are a senior TypeScript developer.' },
  { role: 'user', content: 'Explain the builder pattern with a practical example.' },
];

const config: GenerationConfig = {
  model: 'anthropic/claude-sonnet-4-5-20250514',
  maxTokens: 2048,
  temperature: 0.7,
};

const response = await ai.chat(messages, config);

console.log(response.content);    // string - the generated text
console.log(response.model);      // string - actual model used
console.log(response.usage);      // { promptTokens, completionTokens, totalTokens }
console.log(response.cost);       // number - cost in USD

Every response includes token usage and cost tracking out of the box -- no manual calculation needed.
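Because every response already carries usage and cost, spend tracking reduces to simple accumulation. A minimal sketch, assuming the `{ usage, cost }` response shape shown above (the `CostTracker` class is a hypothetical helper, not part of NeuroLink):

```typescript
// Accumulate token usage and cost across responses that expose
// { usage: { totalTokens }, cost } -- the shape shown above.
interface UsageLike {
  usage: { totalTokens: number };
  cost: number;
}

class CostTracker {
  private totalTokens = 0;
  private totalCost = 0;
  private calls = 0;

  record(res: UsageLike): void {
    this.totalTokens += res.usage.totalTokens;
    this.totalCost += res.cost;
    this.calls += 1;
  }

  summary() {
    return {
      calls: this.calls,
      totalTokens: this.totalTokens,
      totalCost: this.totalCost,
      avgCostPerCall: this.calls ? this.totalCost / this.calls : 0,
    };
  }
}
```

Call `tracker.record(response)` after each `ai.chat(...)` and read `tracker.summary()` wherever you report spend.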


Streaming: Real-Time Output with Async Iterators

For chat interfaces and real-time applications, streaming is essential. NeuroLink exposes a clean async iterator API:

const stream = ai.chatStream(messages, {
  model: 'openai/gpt-4o',
  maxTokens: 1024,
});

// Async iterator -- works with for-await-of
for await (const chunk of stream) {
  process.stdout.write(chunk.delta); // Print each token as it arrives
}

// Or collect the full response
const fullStream = ai.chatStream(messages, {
  model: 'openai/gpt-4o',
  maxTokens: 1024,
});

let fullText = '';
for await (const chunk of fullStream) {
  fullText += chunk.delta;
}

console.log('Final output:', fullText);

The streaming API works identically across all 200+ models. Whether you are streaming from Claude, GPT-4o, Gemini, or Llama, the interface is the same.


Model Comparison: Run Prompts Against Multiple Models in Parallel

One of the most powerful patterns with OpenRouter is running the same prompt against multiple models simultaneously. This is invaluable for evaluation, benchmarking, and choosing the right model for your use case.

const models = [
  'anthropic/claude-sonnet-4-5-20250514',
  'openai/gpt-4o',
  'google/gemini-2.0-flash',
  'meta-llama/llama-3.1-70b-instruct',
];

const prompt: ChatMessage[] = [
  { role: 'user', content: 'Write a regex that validates email addresses. Explain each part.' },
];

// Run all models in parallel
const results = await Promise.all(
  models.map(async (model) => {
    const start = Date.now();
    const response = await ai.chat(prompt, { model, maxTokens: 1024 });
    const latency = Date.now() - start;

    return {
      model,
      content: response.content,
      tokens: response.usage.totalTokens,
      cost: response.cost,
      latencyMs: latency,
    };
  })
);

// Compare results
console.table(results.map(r => ({
  Model: r.model.split('/')[1],
  Tokens: r.tokens,
  Cost: `$${r.cost.toFixed(4)}`,
  Latency: `${r.latencyMs}ms`,
})));

This pattern lets you make data-driven model selection decisions instead of guessing.
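The comparison results map directly onto a selection rule. For example, "cheapest model that stays under a latency budget" is a one-liner over the result objects built above (the `pickCheapestWithinBudget` helper is hypothetical, not a NeuroLink API):

```typescript
// Subset of the per-model result object built in the comparison above.
interface ComparisonResult {
  model: string;
  cost: number;
  latencyMs: number;
}

// Cheapest model whose measured latency fits the budget,
// or undefined if none qualifies.
function pickCheapestWithinBudget(
  results: ComparisonResult[],
  maxLatencyMs: number,
): ComparisonResult | undefined {
  return results
    .filter((r) => r.latencyMs <= maxLatencyMs)
    .sort((a, b) => a.cost - b.cost)[0];
}
```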


Cost Optimization: Route by Complexity

Not every request needs the most expensive model. NeuroLink supports intelligent routing based on task complexity:

import { NeuroLink, Router } from 'neurolink-ai';

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Define a routing strategy
const router = new Router({
  rules: [
    {
      // Simple classification tasks -> use a fast, cheap model
      match: (messages) => messages.length <= 2 && messages[0].content.length < 200,
      model: 'meta-llama/llama-3.1-8b-instruct', // ~$0.05/1M tokens
    },
    {
      // Code generation -> use a strong coding model
      match: (messages) => messages.some(m =>
        m.content.includes('code') || m.content.includes('function')
      ),
      model: 'anthropic/claude-sonnet-4-5-20250514', // Best for code
    },
    {
      // Default -> balanced model
      model: 'openai/gpt-4o-mini', // Good quality, reasonable price
    },
  ],
});

// Use the router
const response = await ai.chat(messages, {
  model: router.select(messages),
  maxTokens: 2048,
});

Price Comparison (per 1M tokens)

Here is a quick reference for common models available through OpenRouter:

Model             | Input Cost | Output Cost | Best For
------------------|------------|-------------|------------------------------
Llama 3.1 8B      | $0.05      | $0.08       | Simple tasks, classification
GPT-4o Mini       | $0.15      | $0.60       | Balanced quality/cost
Gemini 2.0 Flash  | $0.10      | $0.40       | Fast multimodal tasks
Claude Sonnet 4.5 | $3.00      | $15.00      | Complex reasoning, code
GPT-4o            | $2.50      | $10.00      | General high-quality tasks
Claude Opus 4     | $15.00     | $75.00      | Most demanding tasks

By routing simple queries to cheaper models, you can reduce costs by 60-80% without sacrificing quality where it matters.
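The table maps directly onto a cost estimator: prices are per 1M tokens, so a request costs (input tokens × input rate + output tokens × output rate) / 1,000,000. A sketch using a few of the rates above:

```typescript
// Per-1M-token prices (USD) from the table above.
const PRICES: Record<string, { input: number; output: number }> = {
  'meta-llama/llama-3.1-8b-instruct': { input: 0.05, output: 0.08 },
  'openai/gpt-4o-mini': { input: 0.15, output: 0.6 },
  'google/gemini-2.0-flash': { input: 0.1, output: 0.4 },
  'openai/gpt-4o': { input: 2.5, output: 10.0 },
};

// Estimated USD cost of one request.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICES[model];
  if (!p) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}
```

For example, a 10,000-token prompt with a 2,000-token completion costs about $0.045 on GPT-4o but well under a cent on Llama 3.1 8B, which is exactly the gap the routing strategy exploits.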


Built-In Failover: Auto-Switching on Provider Failure

Production applications cannot afford downtime when a single provider has an outage. NeuroLink's failover configuration handles this automatically:

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
  failover: {
    enabled: true,
    // Define fallback chains
    chains: [
      {
        primary: 'anthropic/claude-sonnet-4-5-20250514',
        fallbacks: [
          'openai/gpt-4o',
          'google/gemini-2.0-flash',
        ],
      },
      {
        primary: 'openai/gpt-4o',
        fallbacks: [
          'anthropic/claude-sonnet-4-5-20250514',
          'google/gemini-2.0-flash',
        ],
      },
    ],
    // Retry configuration
    maxRetries: 2,
    retryDelayMs: 1000,
    onFallback: (primary, fallback, error) => {
      console.warn(`Failover: ${primary} -> ${fallback} (reason: ${error.message})`);
    },
  },
});

// This call will automatically try fallback models if the primary fails
const response = await ai.chat(messages, {
  model: 'anthropic/claude-sonnet-4-5-20250514',
  maxTokens: 1024,
});

// response.model tells you which model actually served the request
console.log('Served by:', response.model);

The failover is transparent to your application logic. Your code does not change -- NeuroLink handles the retry and model switching behind the scenes.
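The pattern behind transparent failover is worth seeing on its own: try each model in order, remember the last failure, and surface it only if every option is exhausted. A minimal SDK-free sketch (the `withFallback` helper is illustrative, not NeuroLink's internal implementation):

```typescript
// Try each model in order until one attempt succeeds; rethrow the
// last error only if every model in the chain fails.
async function withFallback<T>(
  models: string[],
  attempt: (model: string) => Promise<T>,
): Promise<{ model: string; result: T }> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return { model, result: await attempt(model) };
    } catch (err) {
      lastError = err; // remember the failure, move to the next model
    }
  }
  throw lastError;
}
```

Calling code stays oblivious to which model served the request, mirroring how `response.model` reports the actual model after NeuroLink's failover.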


CLI for Rapid Prototyping

Need to quickly test a prompt against different models without writing code? NeuroLink includes a CLI:

# Quick single-model test
npx neurolink-ai chat "Explain monads in simple terms" \
  --provider openrouter \
  --model anthropic/claude-sonnet-4-5-20250514

# Compare multiple models
npx neurolink-ai compare "Write a haiku about TypeScript" \
  --provider openrouter \
  --models gpt-4o,claude-sonnet-4-5-20250514,gemini-2.0-flash

# Stream output in real-time
npx neurolink-ai chat "Build a REST API with Hono" \
  --provider openrouter \
  --model openai/gpt-4o \
  --stream

# List available models with pricing
npx neurolink-ai models --provider openrouter --sort price

The CLI is perfect for prompt engineering, model evaluation, and quick experiments before committing to a model in your codebase.


Putting It All Together

Here is a complete example that combines routing, streaming, and failover:

import { NeuroLink, Router } from 'neurolink-ai';

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
  failover: {
    enabled: true,
    chains: [
      {
        primary: 'anthropic/claude-sonnet-4-5-20250514',
        fallbacks: ['openai/gpt-4o', 'google/gemini-2.0-flash'],
      },
    ],
    maxRetries: 2,
  },
});

async function handleUserQuery(userMessage: string) {
  const messages = [
    { role: 'system' as const, content: 'You are a helpful coding assistant.' },
    { role: 'user' as const, content: userMessage },
  ];

  // Route based on complexity
  const isSimple = userMessage.length < 100 && !userMessage.includes('code');
  const model = isSimple
    ? 'meta-llama/llama-3.1-8b-instruct'
    : 'anthropic/claude-sonnet-4-5-20250514';

  // Stream the response
  const stream = ai.chatStream(messages, { model, maxTokens: 2048 });

  let response = '';
  for await (const chunk of stream) {
    process.stdout.write(chunk.delta);
    response += chunk.delta;
  }

  return response;
}

// Usage
await handleUserQuery('What is a closure?');
await handleUserQuery('Implement a type-safe event emitter in TypeScript with generics');

Conclusion

The combination of NeuroLink and OpenRouter eliminates the complexity of multi-provider AI integration. Instead of managing multiple SDKs, API keys, and response formats, you get:

  • One SDK with full TypeScript type safety
  • 200+ models accessible through a single API key
  • Streaming with a clean async iterator interface
  • Intelligent routing to optimize cost and performance
  • Automatic failover for production reliability
  • CLI tools for rapid prototyping

Whether you are building a chatbot, a code assistant, a content pipeline, or any AI-powered application, this stack gives you the flexibility to use the best model for every task without the integration overhead.

Get Started

npm install neurolink-ai

Start building with 200+ models today.
