The modern AI landscape is fragmented. You want GPT-4o for creative writing, Claude for code analysis, Gemini for multimodal tasks, and Llama for cost-effective batch processing. Each provider has its own SDK, authentication flow, rate limits, and response format. Managing all of this in a production TypeScript application quickly becomes a nightmare.
What if you could access 200+ AI models through a single, type-safe TypeScript SDK?
That is exactly what NeuroLink + OpenRouter delivers.
## The Multi-Provider Problem
Here is what a typical multi-provider setup looks like without a unified layer:
```typescript
// Provider A
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY });
const gptResult = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: prompt }],
});

// Provider B
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_KEY });
const claudeResult = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250514',
  max_tokens: 1024,
  messages: [{ role: 'user', content: prompt }],
});

// Provider C
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GEMINI_KEY);
const geminiModel = genAI.getGenerativeModel({ model: 'gemini-2.0-flash' });
const geminiResult = await geminiModel.generateContent(prompt);
```
Three SDKs. Three API keys. Three different response formats. Three sets of error handling. And if one provider goes down, your application goes with it.
## What is OpenRouter?
OpenRouter is a unified API gateway that provides access to 200+ AI models from every major provider -- OpenAI, Anthropic, Google, Meta, Mistral, Cohere, and dozens more -- through a single endpoint and a single API key.
Key benefits:
- One API key for all providers
- Automatic failover between providers hosting the same model
- Unified pricing with transparent per-token costs
- No vendor lock-in -- switch models by changing a string
When paired with NeuroLink, you get full TypeScript type safety, streaming support, structured output parsing, and intelligent routing on top of OpenRouter's model catalog.
## Setting Up NeuroLink with OpenRouter

### Installation

```bash
npm install neurolink-ai
```

### Configuration

```typescript
import { NeuroLink } from 'neurolink-ai';

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
  // Optional: set a default model
  defaultModel: 'anthropic/claude-sonnet-4-5-20250514',
});
```
That is it. One import, one configuration object, and you have access to the entire OpenRouter catalog.
## Basic Text Generation with Type Safety
NeuroLink provides full TypeScript inference for model parameters and responses:
```typescript
import { NeuroLink } from 'neurolink-ai';
import type { ChatMessage, GenerationConfig } from 'neurolink-ai/types';

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
});

const messages: ChatMessage[] = [
  { role: 'system', content: 'You are a senior TypeScript developer.' },
  { role: 'user', content: 'Explain the builder pattern with a practical example.' },
];

const config: GenerationConfig = {
  model: 'anthropic/claude-sonnet-4-5-20250514',
  maxTokens: 2048,
  temperature: 0.7,
};

const response = await ai.chat(messages, config);

console.log(response.content); // string - the generated text
console.log(response.model);   // string - actual model used
console.log(response.usage);   // { promptTokens, completionTokens, totalTokens }
console.log(response.cost);    // number - cost in USD
```
Every response includes token usage and cost tracking out of the box -- no manual calculation needed.
## Streaming: Real-Time Output with Async Iterators
For chat interfaces and real-time applications, streaming is essential. NeuroLink exposes a clean async iterator API:
```typescript
const stream = ai.chatStream(messages, {
  model: 'openai/gpt-4o',
  maxTokens: 1024,
});

// Async iterator -- works with for-await-of
for await (const chunk of stream) {
  process.stdout.write(chunk.delta); // Print each token as it arrives
}

// Or collect the full response
const fullStream = ai.chatStream(messages, {
  model: 'openai/gpt-4o',
  maxTokens: 1024,
});

let fullText = '';
for await (const chunk of fullStream) {
  fullText += chunk.delta;
}
console.log('Final output:', fullText);
```
The streaming API works identically across all 200+ models. Whether you are streaming from Claude, GPT-4o, Gemini, or Llama, the interface is the same.
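Because `chatStream` returns a standard async iterable of `{ delta }` chunks (the shape shown in the examples above), the accumulation pattern is plain JavaScript and can be exercised without any SDK at all. A minimal sketch, using a fake stream in place of a real model call:

```typescript
// A stand-in for ai.chatStream: any AsyncIterable<{ delta: string }> works.
async function* fakeStream(tokens: string[]): AsyncIterable<{ delta: string }> {
  for (const token of tokens) {
    yield { delta: token };
  }
}

// The same accumulation loop you would use with the real stream.
async function collect(stream: AsyncIterable<{ delta: string }>): Promise<string> {
  let text = '';
  for await (const chunk of stream) {
    text += chunk.delta;
  }
  return text;
}

async function main() {
  const result = await collect(fakeStream(['Hello', ', ', 'world']));
  console.log(result); // "Hello, world"
}

main();
```

Any consumer written against this contract (a terminal printer, a server-sent-events bridge, a React state setter) works unchanged regardless of which model produced the chunks.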
## Model Comparison: Run Prompts Against Multiple Models in Parallel
One of the most powerful patterns with OpenRouter is running the same prompt against multiple models simultaneously. This is invaluable for evaluation, benchmarking, and choosing the right model for your use case.
```typescript
const models = [
  'anthropic/claude-sonnet-4-5-20250514',
  'openai/gpt-4o',
  'google/gemini-2.0-flash',
  'meta-llama/llama-3.1-70b-instruct',
];

const prompt: ChatMessage[] = [
  { role: 'user', content: 'Write a regex that validates email addresses. Explain each part.' },
];

// Run all models in parallel
const results = await Promise.all(
  models.map(async (model) => {
    const start = Date.now();
    const response = await ai.chat(prompt, { model, maxTokens: 1024 });
    const latency = Date.now() - start;
    return {
      model,
      content: response.content,
      tokens: response.usage.totalTokens,
      cost: response.cost,
      latencyMs: latency,
    };
  })
);

// Compare results
console.table(results.map(r => ({
  Model: r.model.split('/')[1],
  Tokens: r.tokens,
  Cost: `$${r.cost.toFixed(4)}`,
  Latency: `${r.latencyMs}ms`,
})));
```
This pattern lets you make data-driven model selection decisions instead of guessing.
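The `results` array is plain data, so ranking candidates is ordinary TypeScript. One illustrative approach (the scoring weights here are arbitrary choices for the sketch, not anything NeuroLink provides): normalize cost and latency against the worst observed value and pick the lowest weighted score.

```typescript
interface ComparisonResult {
  model: string;
  cost: number;      // USD for the request
  latencyMs: number; // wall-clock time for the request
}

// Pick the result with the best cost/latency trade-off.
// costWeight and latencyWeight are illustrative knobs, not SDK options.
function pickBest(
  results: ComparisonResult[],
  costWeight = 0.7,
  latencyWeight = 0.3,
): ComparisonResult {
  const maxCost = Math.max(...results.map((r) => r.cost));
  const maxLatency = Math.max(...results.map((r) => r.latencyMs));
  const score = (r: ComparisonResult) =>
    (r.cost / maxCost) * costWeight + (r.latencyMs / maxLatency) * latencyWeight;
  return results.reduce((best, r) => (score(r) < score(best) ? r : best));
}

const best = pickBest([
  { model: 'openai/gpt-4o', cost: 0.012, latencyMs: 2100 },
  { model: 'meta-llama/llama-3.1-70b-instruct', cost: 0.001, latencyMs: 1400 },
]);
console.log(best.model); // "meta-llama/llama-3.1-70b-instruct"
```

For real evaluations you would also score output quality (for example with a rubric or an LLM judge), but cost and latency alone already filter out obviously poor fits.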
## Cost Optimization: Route by Complexity
Not every request needs the most expensive model. NeuroLink supports intelligent routing based on task complexity:
```typescript
import { NeuroLink, Router } from 'neurolink-ai';

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Define a routing strategy
const router = new Router({
  rules: [
    {
      // Simple classification tasks -> use a fast, cheap model
      match: (messages) => messages.length <= 2 && messages[0].content.length < 200,
      model: 'meta-llama/llama-3.1-8b-instruct', // ~$0.05/1M tokens
    },
    {
      // Code generation -> use a strong coding model
      match: (messages) => messages.some(m =>
        m.content.includes('code') || m.content.includes('function')
      ),
      model: 'anthropic/claude-sonnet-4-5-20250514', // Best for code
    },
    {
      // Default -> balanced model
      model: 'openai/gpt-4o-mini', // Good quality, reasonable price
    },
  ],
});

// Use the router
const response = await ai.chat(messages, {
  model: router.select(messages),
  maxTokens: 2048,
});
```
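The selection logic behind this pattern is first-match-wins: rules are checked in order, and a rule without a `match` predicate acts as the default. A simplified standalone reimplementation for illustration (not the SDK's actual source):

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface RoutingRule {
  match?: (messages: ChatMessage[]) => boolean; // omitted => default rule
  model: string;
}

// First rule whose match() returns true wins; a rule without match() always matches.
function selectModel(rules: RoutingRule[], messages: ChatMessage[]): string {
  for (const rule of rules) {
    if (!rule.match || rule.match(messages)) return rule.model;
  }
  throw new Error('No routing rule matched and no default rule was provided');
}

const model = selectModel(
  [
    { match: (m) => m[0].content.length < 200, model: 'meta-llama/llama-3.1-8b-instruct' },
    { model: 'openai/gpt-4o-mini' }, // default
  ],
  [{ role: 'user', content: 'Classify this sentence as positive or negative.' }],
);
console.log(model); // "meta-llama/llama-3.1-8b-instruct"
```

Because rule order determines priority, put the most specific (and usually cheapest) rules first and keep the catch-all default last.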
### Price Comparison (per 1M tokens)
Here is a quick reference for common models available through OpenRouter:
| Model | Input Cost | Output Cost | Best For |
|---|---|---|---|
| Llama 3.1 8B | $0.05 | $0.08 | Simple tasks, classification |
| GPT-4o Mini | $0.15 | $0.60 | Balanced quality/cost |
| Gemini 2.0 Flash | $0.10 | $0.40 | Fast multimodal tasks |
| Claude Sonnet 4.5 | $3.00 | $15.00 | Complex reasoning, code |
| GPT-4o | $2.50 | $10.00 | General high-quality tasks |
| Claude Opus 4 | $15.00 | $75.00 | Most demanding tasks |
By routing simple queries to cheaper models, you can reduce costs by 60-80% without sacrificing quality where it matters.
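The table translates directly into per-request cost estimates. A small helper makes the arithmetic concrete (the pricing values below are copied from the table above; check OpenRouter's live pricing before relying on them):

```typescript
// USD per 1M tokens, taken from the table above (verify against live pricing).
const PRICING: Record<string, { input: number; output: number }> = {
  'meta-llama/llama-3.1-8b-instruct': { input: 0.05, output: 0.08 },
  'openai/gpt-4o-mini': { input: 0.15, output: 0.6 },
  'openai/gpt-4o': { input: 2.5, output: 10.0 },
  'anthropic/claude-sonnet-4-5-20250514': { input: 3.0, output: 15.0 },
};

function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICING[model];
  if (!p) throw new Error(`No pricing entry for ${model}`);
  return (inputTokens / 1_000_000) * p.input + (outputTokens / 1_000_000) * p.output;
}

// 1,000 requests of ~500 input / 300 output tokens each:
const cheap = estimateCost('meta-llama/llama-3.1-8b-instruct', 500_000, 300_000);
const premium = estimateCost('anthropic/claude-sonnet-4-5-20250514', 500_000, 300_000);
console.log(cheap.toFixed(2), premium.toFixed(2)); // "0.05 6.00"
```

Running the numbers like this before picking a default model makes the savings from routing tangible rather than hypothetical.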
## Built-In Failover: Auto-Switching on Provider Failure
Production applications cannot afford downtime when a single provider has an outage. NeuroLink's failover configuration handles this automatically:
```typescript
const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
  failover: {
    enabled: true,
    // Define fallback chains
    chains: [
      {
        primary: 'anthropic/claude-sonnet-4-5-20250514',
        fallbacks: [
          'openai/gpt-4o',
          'google/gemini-2.0-flash',
        ],
      },
      {
        primary: 'openai/gpt-4o',
        fallbacks: [
          'anthropic/claude-sonnet-4-5-20250514',
          'google/gemini-2.0-flash',
        ],
      },
    ],
    // Retry configuration
    maxRetries: 2,
    retryDelayMs: 1000,
    onFallback: (primary, fallback, error) => {
      console.warn(`Failover: ${primary} -> ${fallback} (reason: ${error.message})`);
    },
  },
});

// This call will automatically try fallback models if the primary fails
const response = await ai.chat(messages, {
  model: 'anthropic/claude-sonnet-4-5-20250514',
  maxTokens: 1024,
});

// response.model tells you which model actually served the request
console.log('Served by:', response.model);
```
The failover is transparent to your application logic. Your code does not change -- NeuroLink handles the retry and model switching behind the scenes.
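The fallback chain is a generic pattern worth understanding on its own. A minimal, provider-agnostic sketch of the idea (`withFailover` and `callModel` are hypothetical names for this illustration, not NeuroLink internals):

```typescript
// Try the primary, then each fallback in order; surface the last error if all fail.
async function withFailover<T>(
  models: string[],
  callModel: (model: string) => Promise<T>,
  onFallback?: (from: string, to: string, error: Error) => void,
): Promise<{ model: string; result: T }> {
  let lastError: Error | undefined;
  for (let i = 0; i < models.length; i++) {
    try {
      return { model: models[i], result: await callModel(models[i]) };
    } catch (err) {
      lastError = err as Error;
      if (i + 1 < models.length) onFallback?.(models[i], models[i + 1], lastError);
    }
  }
  throw lastError ?? new Error('No models configured');
}

// Simulated outage on the primary:
const { model } = await withFailover(
  ['anthropic/claude-sonnet-4-5-20250514', 'openai/gpt-4o'],
  async (m) => {
    if (m.startsWith('anthropic/')) throw new Error('503 Service Unavailable');
    return 'ok';
  },
  (from, to, e) => console.warn(`Failover: ${from} -> ${to} (${e.message})`),
);
console.log('Served by:', model); // "Served by: openai/gpt-4o"
```

A production version would also distinguish retryable errors (timeouts, 429s, 5xx) from permanent ones (invalid request, content policy), which is the kind of detail a managed failover layer handles for you.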
## CLI for Rapid Prototyping
Need to quickly test a prompt against different models without writing code? NeuroLink includes a CLI:
```bash
# Quick single-model test
npx neurolink-ai chat "Explain monads in simple terms" \
  --provider openrouter \
  --model anthropic/claude-sonnet-4-5-20250514

# Compare multiple models
npx neurolink-ai compare "Write a haiku about TypeScript" \
  --provider openrouter \
  --models gpt-4o,claude-sonnet-4-5-20250514,gemini-2.0-flash

# Stream output in real-time
npx neurolink-ai chat "Build a REST API with Hono" \
  --provider openrouter \
  --model openai/gpt-4o \
  --stream

# List available models with pricing
npx neurolink-ai models --provider openrouter --sort price
```
The CLI is perfect for prompt engineering, model evaluation, and quick experiments before committing to a model in your codebase.
## Putting It All Together
Here is a complete example that combines routing, streaming, and failover:
```typescript
import { NeuroLink } from 'neurolink-ai';

const ai = new NeuroLink({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
  failover: {
    enabled: true,
    chains: [
      {
        primary: 'anthropic/claude-sonnet-4-5-20250514',
        fallbacks: ['openai/gpt-4o', 'google/gemini-2.0-flash'],
      },
    ],
    maxRetries: 2,
  },
});

async function handleUserQuery(userMessage: string) {
  const messages = [
    { role: 'system' as const, content: 'You are a helpful coding assistant.' },
    { role: 'user' as const, content: userMessage },
  ];

  // Route based on complexity
  const isSimple = userMessage.length < 100 && !userMessage.includes('code');
  const model = isSimple
    ? 'meta-llama/llama-3.1-8b-instruct'
    : 'anthropic/claude-sonnet-4-5-20250514';

  // Stream the response
  const stream = ai.chatStream(messages, { model, maxTokens: 2048 });

  let response = '';
  for await (const chunk of stream) {
    process.stdout.write(chunk.delta);
    response += chunk.delta;
  }
  return response;
}

// Usage
await handleUserQuery('What is a closure?');
await handleUserQuery('Implement a type-safe event emitter in TypeScript with generics');
```
## Conclusion
The combination of NeuroLink and OpenRouter eliminates the complexity of multi-provider AI integration. Instead of managing multiple SDKs, API keys, and response formats, you get:
- One SDK with full TypeScript type safety
- 200+ models accessible through a single API key
- Streaming with a clean async iterator interface
- Intelligent routing to optimize cost and performance
- Automatic failover for production reliability
- CLI tools for rapid prototyping
Whether you are building a chatbot, a code assistant, a content pipeline, or any AI-powered application, this stack gives you the flexibility to use the best model for every task without the integration overhead.
## Get Started
- Website: neurolink.ink
- Documentation: docs.neurolink.ink
- GitHub: github.com/juspay/neurolink
- Blog: blog.neurolink.ink
- OpenRouter: openrouter.ai
```bash
npm install neurolink-ai
```
Start building with 200+ models today.