Have you ever tried building a production-ready AI chatbot that streams responses token-by-token, handles failover across providers, enforces structured JSON outputs, and lets you inject custom logic (like metadata tracking or approval gates) — all without managing WebSocket servers, polling, timeouts, or connection state?
Most vanilla setups (OpenAI/Anthropic streaming) force you into complex infra. But what if a lightweight gateway handled all that?
Enter this full-stack example using ModelRiver (an AI gateway I'm building). It demonstrates a clean pattern for true end-to-end streaming with async requests, event-driven webhooks, automatic failover, and easy local dev — no ngrok needed.
In ~30-45 minutes, you can recreate this: React frontend → Node.js backend → ModelRiver → real-time WebSocket back to browser.
(Disclosure: I work on ModelRiver. This is a genuine technical demo for feedback on production LLM patterns.)
Why This Pattern Matters in 2026
Modern AI apps need:
- Instant, human-like streaming UX
- Reliability (failover if a provider flakes)
- Structured, type-safe outputs (e.g., sentiment + action items)
- Business logic gates (validation, enrichment, custom IDs for DB)
- Zero heavy infra (no persistent WebSockets on your side)
This example solves all that with async + webhook callbacks + lightweight client SDK.
Architecture at a Glance
User (React) → Node.js Backend → ModelRiver Async API
↓
AI Processing (background, failover)
↓
Webhook to Backend (enrich/inject)
↓
Callback to ModelRiver
↓
WebSocket Stream → Frontend (real-time)
Key magic: ModelRiver processes the request asynchronously and hits your webhook before final delivery → you enrich the payload → post it to the callback URL → ModelRiver streams the result to the frontend via WebSocket.
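To make the flow concrete, the two HTTP hops in the middle look roughly like this (field names are taken from the backend handler in Step 2; the values are placeholders):

// 1) ModelRiver → your backend webhook (POST /webhook/modelriver)
{
  "channel_id": "ch_abc123",
  "callback_url": "https://api.modelriver.com/...",
  "ai_response": {
    "data": { "reply": "...", "sentiment": "positive", "confidence": 0.92 }
  }
}

// 2) Your backend → callback_url (POST), after you enrich the payload
{
  "id": "<messageId>",
  "conversation_id": "<conversationId>",
  "reply": "...",
  "sentiment": "positive",
  "confidence": 0.92
}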
Prerequisites
- Node.js 16+
- ModelRiver account (free tier): console.modelriver.com
- API key from console
- Optional: Ollama/llama.cpp/vLLM for local inference testing
Step 1: Set Up ModelRiver (Console)
- Create a project.
- Add providers (OpenAI/Anthropic + local if wanted).
- Define a structured output schema (e.g., chatbot_response; a conforming example appears after this list):
  {
    "reply": "string",
    "summary": "string",
    "sentiment": "positive | negative | neutral | mixed",
    "confidence": "number",
    "topics": "array<string>",
    "action_items": "array<{task: string, priority: 'high' | 'medium' | 'low'}>"
  }
- Create workflow mr_chatbot_workflow with structured output + event new_chat.
- Set webhook type to "Localhost CLI" for dev.
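For reference, a response that conforms to this schema might look like the following (values are illustrative):

{
  "reply": "It sounds like the latest deploy broke sign-in. Here's what I'd check next.",
  "summary": "User reports login failures after the latest deploy.",
  "sentiment": "negative",
  "confidence": 0.87,
  "topics": ["authentication", "deployment"],
  "action_items": [
    { "task": "Roll back the auth service", "priority": "high" },
    { "task": "Notify affected users", "priority": "medium" }
  ]
}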
Step 2: Backend (Node.js + Express)
Install deps:
npm init -y
npm i express cors uuid dotenv node-fetch@2
(node-fetch@2 is the CommonJS build, which matches the require() syntax below; cors lets the Vite dev server on a different port call this API.)
.env:
MODELRIVER_API_KEY=your_key
PORT=4000
BACKEND_PUBLIC_URL=http://localhost:4000
WEBHOOK_SECRET=your_secret_from_console
EVENT_NAME=new_chat
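WEBHOOK_SECRET isn't used by the minimal index.js below, but in production you'd verify incoming webhook signatures with it. A generic sketch, assuming an HMAC-SHA256 signature header (the header name and scheme here are hypothetical; use whatever the console documents):

const crypto = require('crypto');

// Simplified illustration: real verification schemes usually sign the raw request
// body, and ModelRiver's actual header name/algorithm may differ.
function verifyWebhookSignature(req) {
  const signature = req.headers['x-modelriver-signature']; // hypothetical header
  if (!signature) return false;
  const expected = crypto
    .createHmac('sha256', process.env.WEBHOOK_SECRET)
    .update(JSON.stringify(req.body))
    .digest('hex');
  const a = Buffer.from(signature);
  const b = Buffer.from(expected);
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}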
Main index.js (key endpoints):
const express = require('express');
const cors = require('cors');
const { v4: uuidv4 } = require('uuid');
require('dotenv').config();
const fetch = require('node-fetch');

const app = express();
app.use(cors()); // allow the Vite dev server (different origin) to call this API
app.use(express.json());

// Track in-flight requests so the webhook can correlate responses with our own IDs
const pendingRequests = new Map(); // channel_id → { conversationId, messageId }
app.post('/chat', async (req, res) => {
  const { message, workflow = 'mr_chatbot_workflow' } = req.body;
  const conversationId = uuidv4();
  const messageId = uuidv4();

  // Start an async ModelRiver request; delivery happens over WebSocket later
  const resp = await fetch('https://api.modelriver.com/v1/ai/async', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.MODELRIVER_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      workflow,
      messages: [{ role: 'user', content: message }],
      delivery_method: 'websocket',
      webhook_url: `${process.env.BACKEND_PUBLIC_URL}/webhook/modelriver`,
      events: ['webhook_received'],
      metadata: { conversationId, messageId },
    }),
  });

  const data = await resp.json();
  pendingRequests.set(data.channel_id, { conversationId, messageId });

  // The frontend uses these credentials to open its own WebSocket to ModelRiver
  res.json({
    channel_id: data.channel_id,
    websocket_url: data.websocket_url,
    ws_token: data.ws_token,
  });
});
app.post('/webhook/modelriver', async (req, res) => {
  const { channel_id, ai_response, callback_url } = req.body;
  const pending = pendingRequests.get(channel_id);
  if (!pending) return res.status(404).json({ error: 'Not found' });

  // Enrich with custom IDs (add validation/sentiment gates here!)
  const enriched = {
    id: pending.messageId,
    conversation_id: pending.conversationId,
    ...ai_response.data,
  };

  // Post the enriched payload back; ModelRiver then streams it to the browser
  await fetch(callback_url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(enriched),
  });

  pendingRequests.delete(channel_id);
  res.status(200).json({ success: true });
});
app.listen(process.env.PORT, () => console.log(`Backend on ${process.env.PORT}`));
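The "add validation/sentiment gates here" comment in the webhook handler is where business logic belongs. A minimal sketch of an approval gate (the threshold, flag name, and fallback reply are illustrative, not a ModelRiver feature):

// Hypothetical gate: hold back low-confidence or negative replies for review.
function applyApprovalGate(enriched) {
  const needsReview =
    enriched.sentiment === 'negative' || enriched.confidence < 0.5;
  return {
    ...enriched,
    needs_review: needsReview, // your frontend or DB layer can act on this flag
    reply: needsReview
      ? 'Thanks! A teammate will review this and follow up shortly.'
      : enriched.reply,
  };
}

// In the webhook handler, wrap the payload before the callback fetch:
//   const gated = applyApprovalGate(enriched);
// then send JSON.stringify(gated) as the callback body instead of enriched.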
Step 3: Frontend (React + ModelRiver Client SDK)
npx create-vite@latest frontend --template react
cd frontend
npm i @modelriver/client
Use the useModelRiver hook for streaming:
import { useModelRiver } from '@modelriver/client';
import { useState } from 'react';

function App() {
  const [input, setInput] = useState('');
  const { connect, message, status } = useModelRiver();

  const send = async () => {
    if (!input) return;
    // Ask the backend to kick off an async ModelRiver request
    const res = await fetch('http://localhost:4000/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message: input }),
    });
    // Connect directly to ModelRiver's WebSocket with the returned credentials
    const { websocket_url, ws_token, channel_id } = await res.json();
    connect({ websocket_url, ws_token, channel_id });
    setInput('');
  };

  return (
    <div>
      <h1>Streaming Chatbot Demo</h1>
      <div>{status}</div>
      <div style={{ whiteSpace: 'pre-wrap' }}>{message?.reply || 'Waiting...'}</div>
      {message?.sentiment && (
        <p>Sentiment: {message.sentiment} ({(message.confidence * 100).toFixed(0)}%)</p>
      )}
      {/* Render topics and action_items similarly (see the sketch below) */}
      <input value={input} onChange={e => setInput(e.target.value)} />
      <button onClick={send}>Send</button>
    </div>
  );
}

export default App;
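The other structured fields render the same way. A small sketch, assuming the message shape from the chatbot_response schema in Step 1:

// Illustrative only: renders the topics and action_items arrays from the schema.
function StructuredExtras({ message }) {
  if (!message) return null;
  return (
    <div>
      {message.topics?.length > 0 && <p>Topics: {message.topics.join(', ')}</p>}
      <ul>
        {(message.action_items || []).map((item, i) => (
          <li key={i}>[{item.priority}] {item.task}</li>
        ))}
      </ul>
    </div>
  );
}

// Usage inside App's JSX: <StructuredExtras message={message} />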
Step 4: Local Dev (No ngrok!)
- Install CLI: npm i -g @modelriver/cli
- Run: modelriver forward (forwards webhooks to localhost)
- Start backend: node index.js
- Start frontend: npm run dev
Test at http://localhost:5173 (Vite default).
Production Benefits Recap
- No streaming servers — ModelRiver handles WS.
- Async non-blocking — UI stays responsive.
- Failover built-in — Auto-switches providers.
- Structured + enriched — JSON schema + your logic.
- Local-first dev — CLI makes webhooks trivial.
- Metadata tracking — Easy DB/logging integration.
Next Steps & Repo
Full repo: https://github.com/modelriver/modelriver-chatbot-demo.git (clone, follow README).
Docs: https://modelriver.com/docs/chatbot-example