🧵 I chased a phantom through two config files, three API keys, and 47 SSH sessions. The initial "fix" was one line of JSON. The real fix? Deleting the file entirely.
## 🤖 What's OpenClaw?
Before I dive in — if you haven't heard of OpenClaw, it's an open-source AI agent framework that lets you run persistent AI assistants on your own server. Think of it as your self-hosted ChatGPT, but with memory, personality, tools, scheduled tasks, and multi-channel support (Telegram, Discord, WhatsApp, TUI, etc.).
You configure which LLM models power your agents — GPT-4, Gemini, Claude, or any model via OpenRouter — and OpenClaw handles the orchestration: routing messages, managing sessions, executing tools, and maintaining long-term memory across conversations.
I run my personal AI assistant (Elara) on an AWS EC2 instance using OpenClaw. The model I'd been using for weeks: `stepfun/step-3.5-flash:free` via OpenRouter — a solid, free, 250K-context model that worked beautifully.
Until one Saturday morning, when it just… stopped.
## 🔇 The Silence
I opened my OpenClaw TUI (the terminal-based chat interface) and typed `Hello`:

```
🦞 OpenClaw 2026.2.2-3 — Think different. Actually think.
openclaw tui - ws://127.0.0.1:18789 - agent main - session main
connecting | idle
```

The spinner appeared — `⠴ kerfuffling…` — and just kept going. And going. And going.
No error. No timeout message. No response. Just an infinite spinner and silence.
## 🕵️ Act I: The Obvious Suspects

### Checking the gateway logs

First instinct: check the logs. OpenClaw writes daily log files to `/tmp/openclaw/`:
```bash
cat /tmp/openclaw/openclaw-2026-03-01.log | grep -i "error" | tail -5
```
And there it was:
```json
{
  "error": "Error: Unknown model: openrouter/stepfun/step-3.5-flash:free",
  "lane": "main",
  "durationMs": 55
}
```
"Unknown model." But… that model was in my config. I'd been using it for weeks. How could OpenClaw suddenly not recognize it?
### The mysterious `configured,missing` status
OpenClaw has a CLI command to list all configured models:
```bash
$ openclaw models list
```
```
Model                                    Input  Context  Auth  Tags
openrouter/stepfun/step-3.5-flash:free   text   250k     yes   configured,missing
google/gemini-2.0-flash                  text   1000k    yes   configured
google/gemini-3-flash-preview            text   1024k    yes   configured
```
There it is: `configured,missing`. 🤨
I'd never seen this status before. In OpenClaw:
- `configured` = the model is listed in your config and the runtime can resolve it ✅
- `configured,missing` = the model is listed in your config, but the runtime can't resolve it to a working provider endpoint ❌
The model exists on paper but is invisible at runtime. Like a ghost in the machine.
### Trying the obvious fixes
```bash
# Re-register the model via CLI
$ openclaw models set openrouter/stepfun/step-3.5-flash:free
Updated successfully ✅

# Restart the gateway
$ systemctl --user restart openclaw-gateway

# Check again...
$ openclaw models list | grep stepfun
openrouter/stepfun/step-3.5-flash:free   text   250k   yes   configured,missing
```
Still `configured,missing`. 😤 The `models set` command updated the global config, but the runtime still couldn't find the model. Something deeper was wrong.
### Trying a model scan
OpenClaw can scan your providers for available models:
```bash
$ openclaw models scan --yes
```
It found Google models, Llama models, and others — but not stepfun. The scan only picks up models that advertise tool-calling support, and `step-3.5-flash:free` doesn't. Dead end.
## 💀 Act II: The `>` That Ate My API Key
While investigating, I discovered something horrifying. Earlier that day, while configuring a new Google API key, a command had been run:
```bash
echo "GOOGLE_API_KEY=AIzaSy..." > ~/.openclaw/.env
```
See that `>`? That's not `>>`.

> ⚠️ That single character — `>` instead of `>>` — overwrote the entire `.env` file, silently destroying the `OPENROUTER_API_KEY` that had been there for a month.
No error. No warning. Just gone.
I found the original key buried deep in `.bash_history` and restored it:
```bash
# Found the original onboarding command in history
$ history | grep openrouter
openclaw onboard --auth-choice apiKey --token-provider openrouter --token "sk-or-v1-..."

# Restored it (with >> this time!)
$ echo 'OPENROUTER_API_KEY=sk-or-v1-...' >> ~/.openclaw/.env
```
### Direct API test
To verify the key was valid, I bypassed OpenClaw entirely:
```bash
$ curl -s https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer sk-or-v1-..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "stepfun/step-3.5-flash:free",
    "messages": [{"role": "user", "content": "Say hello"}]
  }'
```
```json
{
  "choices": [{
    "message": { "content": "Hello! How can I help you today?" }
  }]
}
```
The API worked perfectly. 🎉 Key valid. OpenRouter up. Model alive and responding.
But OpenClaw still said "Unknown model." 💀
The API worked. The config had the model. The key was valid. But OpenClaw couldn't see it. This is the moment I realized the problem was deeper than a missing key or a typo.
## 🔬 Act III: The Two-Layer Architecture
I went full forensics. I downloaded everything from the server:
- 📄 28 backup config files spanning a month
- 📊 12MB of gateway logs (4 days)
- 🧠 Memory files, soul files, identity files — the AI assistant's persistent state
- 📝 Configuration change reports — auto-generated docs from previous changes
And after two hours of diffing JSON files, I found the problem.
### OpenClaw resolves models through TWO config layers
Most documentation focuses on the global config file. But OpenClaw actually has two layers of model configuration:
| Layer | File | Purpose |
|---|---|---|
| Layer 1 | `~/.openclaw/openclaw.json` | Global config — model names, aliases, fallbacks, per-agent assignments |
| Layer 2 | `~/.openclaw/agents/<id>/agent/models.json` | Provider definitions — maps provider names → base URLs, API keys, explicit model schemas |
**The critical behavior:** When Layer 2 defines a provider (like `openrouter`), its model definitions shadow (override) the built-in registry for that provider. Only models explicitly listed in that provider's `models[]` array will be recognized.
My stepfun model was in Layer 1 ✅ but not in Layer 2 ❌.
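The shadowing rule can be sketched in a few lines of Python. This is a toy model of the behavior described above, not OpenClaw's actual code, and the abbreviated registry contents are made up for illustration:

```python
# Toy model of two-layer model resolution: an agent-level provider block
# with a non-empty models[] list replaces the built-in catalog wholesale.

# Stand-in for the built-in registry (hundreds of models in reality).
BUILTIN_REGISTRY = {
    "openrouter": {
        "models": ["stepfun/step-3.5-flash:free", "google/gemini-2.5-pro"],
    }
}

def resolve_models(provider: str, agent_layer: dict) -> list:
    """Return the model list the runtime would see for a provider."""
    agent_block = agent_layer.get("providers", {}).get(provider)
    if agent_block and agent_block.get("models"):
        # Agent-level definition shadows the built-in catalog entirely.
        return [m["id"] for m in agent_block["models"]]
    return BUILTIN_REGISTRY.get(provider, {}).get("models", [])

# No agent-level block: stepfun resolves via the built-in registry.
print(resolve_models("openrouter", {}))

# An explicit "openrouter" block listing only Google models:
# stepfun silently disappears, even though nothing "deleted" it.
shadow = {"providers": {"openrouter": {"models": [{"id": "google/gemini-2.5-pro"}]}}}
print(resolve_models("openrouter", shadow))
```

Note how the second call never consults `BUILTIN_REGISTRY` at all — that's the "ghost in the machine" from Act I.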
## 🕰️ Act IV: Where Did This File Come From?
Here's the part that makes this story truly interesting. I diffed the backup files to reconstruct exactly how `models.json` evolved:
### Stage 1: The innocent beginning (early February)

My AI assistant (Elara) needed to connect to a custom model (`dolphin-mistral` via OpenRouter) that wasn't in OpenClaw's built-in registry. So she created `models.json` with a custom provider called `openrouter-custom`:
```json
{
  "providers": {
    "openrouter-custom": {
      "baseUrl": "https://openrouter.ai/api/v1",
      "apiKey": "sk-or-v1-...",
      "models": [
        { "id": "cognitivecomputations/dolphin-mistral-24b-venice-edition:free" }
      ]
    },
    "google": {
      "models": [{ "id": "gemini-3-pro-preview" }]
    }
  }
}
```
File size: 1.3KB. Two providers, two models. Harmless.
At this point, `stepfun/step-3.5-flash:free` was still working perfectly — resolved through OpenClaw's built-in OpenRouter registry, no `models.json` entry needed. The provider name `openrouter-custom` was smart — it's a custom name that doesn't clash with the built-in `openrouter` provider.
### Stage 2: Adding Nvidia models (February 22)

I asked Elara to configure Kimi K2.5 via Nvidia's API. She added a new `nvidia-custom` provider to `models.json`:
```json
"nvidia-custom": {
  "baseUrl": "https://integrate.api.nvidia.com/v1",
  "apiKey": "nvapi-...",
  "models": [
    { "id": "moonshotai/kimi-k2.5" },
    { "id": "deepseek-ai/deepseek-v3.2" },
    { "id": "mistralai/mistral-large-3-675b-instruct-2512" }
    // ... 8 models total
  ]
}
```
File size grew to 4.7KB. Three providers, 11 models. Still harmless — `nvidia-custom` is a truly custom provider that doesn't shadow any built-in. Stepfun still worked fine.
### Stage 3: The fatal addition (late February)

At some point between Feb 22 and Mar 1, during a configuration session where I asked Elara to add Google models via OpenRouter, a new provider block was added to `models.json`:
```json
"openrouter": {
  "baseUrl": "https://openrouter.ai/api/v1",
  "apiKey": "sk-or-v1-...",
  "models": [
    { "id": "google/gemini-2.0-flash-001" },
    { "id": "google/gemini-2.5-flash" },
    { "id": "google/gemini-2.5-pro" }
    // ... 13 Google models total via OpenRouter
    //
    // But where's stepfun?
    // 🦗 *crickets* 🦗
  ]
}
```
File size ballooned to 11KB. And this single block was the killer.
Why did this break everything? Because unlike `openrouter-custom` in Stage 1, this provider was named just `openrouter` — which exactly matches OpenClaw's built-in OpenRouter provider name. Per OpenClaw's merge rules, when `models.json` defines a provider, non-empty values take precedence over the built-in registry. The explicit `openrouter` block with only 13 Google models completely replaced the built-in OpenRouter model catalog — which previously included hundreds of models, stepfun among them.
Stepfun was never added to this custom `openrouter` block because it was already working through the built-in registry. Nobody knew they needed to add it. The built-in registry was handling it silently. But the moment the custom `openrouter` block appeared, it overwrote that silent handling, and stepfun became invisible.
💡 Analogy: Imagine your phone contacts are stored in iCloud. One day, a friend sets up a "Google Contacts" sync for you with only work contacts. Your phone switches to Google as the primary source and suddenly all your personal contacts vanish — they're still in iCloud, but it's no longer being consulted.
## ✅ The Fix: Two Approaches, One Revelation

### 🔧 The initial fix: Patching the symptom

Having identified that the `openrouter` provider block in `models.json` was missing stepfun, my first instinct was to add the missing model definition. This felt like the right approach — the file exists, it lists models, my model isn't in the list, so add it.
Step 1: Understanding the required schema
Each model in the provider's `models[]` array needs a specific structure. You can't just add the model name — you need the full definition. I found the schema by looking at existing entries in the file:
```json
// Every model in models.json needs these fields:
{
  "id": "...",           // Model slug (from the provider)
  "name": "...",         // Human-readable display name
  "reasoning": false,    // Does it support chain-of-thought?
  "input": ["text"],     // Input types: "text", "image", etc.
  "cost": {              // Per-token pricing
    "input": 0, "output": 0,
    "cacheRead": 0, "cacheWrite": 0
  },
  "contextWindow": ...,  // Max input tokens
  "maxTokens": ...       // Max output tokens
}
```
Step 2: Finding the right values for stepfun
I checked the OpenRouter model page for `stepfun/step-3.5-flash:free` to get the specs:

- Context window: 250,000 tokens
- Max output: 8,192 tokens
- Input: text only (no image support)
- Cost: free (`0` for all price fields)
- Reasoning: no
Step 3: Writing a Node.js script to safely modify the JSON
I didn't want to hand-edit an 11KB JSON file through SSH — one misplaced comma and the whole config breaks. So I wrote a script:
```javascript
const fs = require('fs');
const path = process.env.HOME + '/.openclaw/agents/main/agent/models.json';
const config = JSON.parse(fs.readFileSync(path));
const newModel = {
id: 'stepfun/step-3.5-flash:free',
name: 'Step 3.5 Flash (Free)',
reasoning: false,
input: ['text'],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 250000,
maxTokens: 8192
};
// Check if it already exists
const exists = config.providers.openrouter.models.some(
m => m.id === newModel.id
);
if (!exists) {
config.providers.openrouter.models.push(newModel);
fs.writeFileSync(path, JSON.stringify(config, null, 2));
console.log('✅ Added stepfun to openrouter provider');
} else {
console.log('Model already exists');
}
```
Step 4: Apply and verify
```bash
# Run the script
$ node add_stepfun.js
✅ Added stepfun to openrouter provider

# Restart the gateway to load the new config
$ systemctl --user restart openclaw-gateway

# Wait for startup
$ sleep 5

# Check status
$ openclaw models list | grep stepfun
openrouter/stepfun/step-3.5-flash:free   text   250k   yes   configured ✅

# Test in TUI
$ openclaw tui --message "Hello? Are you there?"
🌸 Hello! I'm here and ready to help!
agent main | openrouter/stepfun/step-3.5-flash:free | tokens 54k/250k (22%)
```
It worked! 🎉 The model was back. Status changed from `configured,missing` to `configured`.
But something nagged at me.
### 🤔 The nagging question

I stared at `models.json` — now 11.3KB — and asked myself: why does this file need to exist at all?

OpenClaw has a built-in model registry. It already knows about every OpenRouter model, every Google model, every Anthropic model. That's how stepfun was working for weeks — through the built-in registry, with no `models.json` needed.

The only reason `models.json` existed was for truly custom providers like `nvidia-custom` (an Nvidia API endpoint that OpenClaw doesn't know about natively) and `openrouter-custom` (a non-standard name for testing). Those make sense.

But the `openrouter` block? That was just a duplicate of something OpenClaw already knows. Worse — it was an incomplete duplicate that was shadowing the complete built-in version.
What if I just… removed the file?
### 🎯 The real fix: Removing what shouldn't be there

Step 1: Back up the file (I'd learned my lesson about backups by this point):
```bash
$ cp ~/.openclaw/agents/main/agent/models.json \
    ~/.openclaw/agents/main/agent/models.json.backup.$(date +%Y%m%d-%H%M%S)
echo "Backup saved. Restore with:"
echo "  cp models.json.backup.TIMESTAMP models.json"
echo "  systemctl --user restart openclaw-gateway"
```
Step 2: Disable `models.json` by renaming it (safer than deleting — I can reverse this instantly):
```bash
$ mv ~/.openclaw/agents/main/agent/models.json \
    ~/.openclaw/agents/main/agent/models.json.disabled
```
Step 3: Restart the gateway:
```bash
$ systemctl --user restart openclaw-gateway
$ sleep 5
```
Step 4: Check if the gateway starts without errors:
```bash
$ journalctl --user -u openclaw-gateway -n 20 --no-pager | grep -i error
# (no output — no errors!) ✅
```
Step 5: Check ALL models:
```bash
$ openclaw models list
```
```
Model                                      Input       Ctx    Auth  Tags
google/gemini-3-flash-preview              text+image  1024k  yes   configured ✅
google/gemini-1.5-flash                    text+image  977k   yes   configured ✅
google/gemini-1.5-pro                      text+image  977k   yes   configured ✅
google/gemini-2.0-flash                    text+image  1024k  yes   configured ✅
google/gemini-2.5-flash                    text+image  1024k  yes   configured ✅
google/gemini-2.5-pro                      text+image  1024k  yes   configured ✅
google/gemini-3-pro-preview                text+image  977k   yes   configured ✅
openrouter/stepfun/step-3.5-flash:free     text        250k   yes   configured ✅
openrouter/meta-llama/llama-3.3-70b-ins... text        128k   yes   configured ✅
```
Every. Single. Model. `configured`. Not a single `missing`. ✅
Step 6: Test the models in TUI:
```bash
$ openclaw tui --message "Hello! Which model are you?"
```
```
Hello! 🌸 I'm Elara, running on openrouter/stepfun/step-3.5-flash:free.
agent main | openrouter/stepfun/step-3.5-flash:free | tokens 54k/250k (22%)
```
I verified the Google models too by checking the gateway logs:
```bash
$ tail -20 /tmp/openclaw/openclaw-*.log | grep "embedded run done"
```
```
lane=session:agent:main:test-google durationMs=16949 active=0 queued=0
```
Google model completed a run in 16.9 seconds. No errors. ✅
Step 7: Confirm `models.json` was NOT regenerated:
```bash
$ ls ~/.openclaw/agents/main/agent/models.json 2>&1
# "No such file or directory" — it was NOT regenerated ✅
```
This appeared to confirm that OpenClaw does not auto-regenerate `models.json`. When the file doesn't exist, the gateway falls back entirely to its built-in registry.
> ⚠️ **March 2026 Update:** Further testing revealed this is not always true. On newer OpenClaw versions (2026.2.2+), `models.json` is regenerated from `models.providers` in `openclaw.json` on gateway restart and `openclaw doctor` runs. The proper permanent fix is to manage model entries via `models.providers` in the main config — not by deleting the agent-level `models.json`. See the official docs for details.
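For reference, a sketch of what that main-config approach might look like. The key layout is inferred from the `models.providers` path mentioned above — treat it as an assumption and check the official docs for the exact schema:

```json
{
  "models": {
    "providers": {
      "openrouter-custom": {
        "baseUrl": "https://openrouter.ai/api/v1",
        "apiKey": "sk-or-v1-...",
        "models": [
          { "id": "cognitivecomputations/dolphin-mistral-24b-venice-edition:free" }
        ]
      }
    }
  }
}
```

The custom provider keeps its non-clashing name, so the built-in `openrouter` catalog stays untouched even when the file is regenerated.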
## 📊 Comparing the two fixes
| | Initial Fix | Real Fix |
|---|---|---|
| What | Added stepfun to `models.json` | Removed `models.json` entirely |
| Effort | Write a script, figure out the schema, find the right values | One `mv` command |
| Models fixed | Only stepfun | All current + all future models |
| Future risk | Every new OpenRouter model needs manual addition | No maintenance needed |
| Root cause | Patched → still shadowing | Eliminated the shadow |
The initial fix treated the symptom. The real fix treated the disease — but only temporarily (see update above).
> ⚠️ The best permanent fix is to manage custom providers through `models.providers` in `openclaw.json`. Use a custom provider name (like `openrouter-custom`) for models not in the built-in catalog, and let the built-in provider handle everything else.
## 🛠️ How to Check the Built-in Catalog

Before creating custom providers, check whether your model is already in OpenClaw's built-in catalog. If it is, you don't need `models.json` or `models.providers` at all; just add it to the allowlist.
```bash
# List ALL models in the built-in catalog for a provider
$ openclaw models list --all --provider openrouter

# Check if a specific model exists
$ openclaw models list --all --provider openrouter | grep dolphin
# No results = model is NOT built-in = needs openrouter-custom

$ openclaw models list --all --provider openrouter | grep stepfun
openrouter/stepfun/step-3.5-flash:free   text   250k   yes
# Found = model IS built-in = just add to allowlist, no custom provider needed

$ openclaw models list --all --provider google
# Shows all built-in Google models
```
> **Rule of thumb:** If `openclaw models list --all --provider <name>` shows your model, just add it to `agents.defaults.models` in `openclaw.json`. If it doesn't show up, you need a custom provider block in `models.providers` (use a name like `openrouter-custom` to avoid shadowing the built-in).
At time of writing, the built-in OpenRouter catalog has 230+ models, including every major provider (OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, Qwen, etc.) but not community/niche models like `cognitivecomputations/dolphin-mistral*`.
### 404: No Endpoints Found That Support Tool Use

If you set a Dolphin (or other community) model as your primary and see:

```
404 No endpoints found that support tool use
```
This means the model does not support function calling/tools, and OpenRouter has no endpoint to handle a request that includes tool definitions.
Why it happens: OpenClaw sends tool definitions (web search, exec, etc.) with every request. If the model does not support tools, OpenRouter rejects with 404.
**The fix:** Add `params.tools: false` in the model's allowlist entry in `openclaw.json`:
"openrouter-custom/cognitivecomputations/dolphin-mistral-24b-venice-edition:free": {
"alias": "dolphin",
"params": {
"tools": false
}
}
> **Note:** Even with `tools: false`, free-tier models may still get 429 rate-limited. Configure fallbacks to ensure graceful failover:

```json
"model": {
  "primary": "openrouter-custom/.../dolphin-mistral:free",
  "fallbacks": ["openrouter/stepfun/step-3.5-flash:free", "google/gemini-3-flash-preview"]
}
```
You can check if your model supports tools via the OpenRouter API:
```bash
curl -s https://openrouter.ai/api/v1/models | python3 -c "
import json, sys
for m in json.load(sys.stdin)['data']:
    if 'dolphin' in m['id']:
        print(m['id'], 'tools:', 'tools' in m.get('supported_parameters', []))
"
```
## How to Fix This Yourself

If you're hitting `Unknown model` or `configured,missing` in OpenClaw, here's the diagnostic playbook:

Step 1: Check if you have an agent-level `models.json`
```bash
ls -la ~/.openclaw/agents/main/agent/models.json 2>&1
```
If this file exists and you're only using standard providers (OpenRouter, Google, Anthropic, OpenAI), this file is probably unnecessary and might be shadowing the built-in registry.
Step 2: Check what's in it
```bash
cat ~/.openclaw/agents/main/agent/models.json | python3 -m json.tool | grep -E '"id"'
```
If you see a provider name that matches a built-in provider (`openrouter`, `google`, `anthropic`, etc.), that block is overriding the built-in model catalog. Only models explicitly listed will be recognized.
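To spot a collision at a glance, a few lines of Python can compare the file's provider names against the common built-ins. This is an illustrative helper, not an OpenClaw command, and the list of built-in provider names here is an assumption:

```python
import json

# Assumed set of built-in provider names that would be shadowed if
# redefined in an agent-level models.json.
BUILTIN_PROVIDERS = {"openrouter", "google", "anthropic", "openai"}

def find_shadowing_providers(models_json_text: str) -> list:
    """Return provider names in models.json that collide with built-ins."""
    providers = json.loads(models_json_text).get("providers", {})
    return sorted(p for p in providers if p in BUILTIN_PROVIDERS)

# Example: one colliding block, one safely-named custom block.
sample = '{"providers": {"openrouter": {}, "nvidia-custom": {}}}'
print(find_shadowing_providers(sample))  # → ['openrouter']
```

In practice you'd feed it the real file, e.g. `find_shadowing_providers(open(os.path.expanduser("~/.openclaw/agents/main/agent/models.json")).read())`; any name it prints is a candidate for the shadowing problem described in this post.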
Step 3: Try disabling it
```bash
# Backup first!
cp ~/.openclaw/agents/main/agent/models.json \
   ~/.openclaw/agents/main/agent/models.json.bak.$(date +%Y%m%d-%H%M%S)

# Rename to disable
mv ~/.openclaw/agents/main/agent/models.json \
   ~/.openclaw/agents/main/agent/models.json.disabled

# Restart
systemctl --user restart openclaw-gateway

# Check
openclaw models list
```
If all models now show `configured` — the file was the problem. Delete it permanently (or keep the `.disabled` backup just in case).
Step 4: If you DO need custom providers
If you have truly custom providers (not built-in), such as:

- Nvidia API (`integrate.api.nvidia.com`)
- Custom self-hosted endpoints
- Non-standard API providers
Then you need `models.json`, but be very careful:

- Don't use provider names that match built-in providers (e.g., use `openrouter-custom` instead of `openrouter`)
- Only define the custom providers; let the built-in registry handle the standard ones
### Quick diagnostic cheat sheet
| Symptom | Likely cause | Fix |
|---|---|---|
| `configured,missing` | Custom `models.json` is shadowing the built-in registry | Rename/remove `models.json` |
| `Unknown model` in logs | Same as above | Same as above |
| `401 Unauthorized` | API key missing from `.env` | Check `.env` (and never use `>`!) |
| Model works via `curl` but not OpenClaw | Provider block in `models.json` doesn't list the model | Remove the shadowing provider block |
| `models scan` doesn't find a model | Model doesn't support tool-calling | Add manually via `openclaw models set` |
## 📚 What I Learned

### 1️⃣ `>` vs `>>` can destroy your entire config
```bash
echo "KEY=value" >  .env   # ❌ REPLACES the file — destroys everything else
echo "KEY=value" >> .env   # ✅ APPENDS to the file — safe
```
Always use `>>` when adding to environment files. Or better: use the app's CLI to manage keys.
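If you script it, an update-in-place helper is safer than either redirection operator: it preserves every other line and avoids duplicate keys. This is a generic sketch, not an OpenClaw utility:

```python
import os

def set_env_key(path: str, key: str, value: str) -> None:
    """Update KEY=value in an env file in place, or append it if absent.

    Unlike `>` (clobbers the whole file) or `>>` (can append duplicates),
    this keeps all other keys and ensures the key appears exactly once.
    """
    lines = []
    if os.path.exists(path):
        with open(path) as f:
            lines = f.read().splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if line.startswith(key + "="):
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

Usage would look like `set_env_key(os.path.expanduser("~/.openclaw/.env"), "GOOGLE_API_KEY", "AIzaSy...")` — the existing `OPENROUTER_API_KEY` line survives untouched.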
2️⃣ "Unknown model" doesn't mean what you think
It doesn't mean you misspelled the model name. It means the runtime can't resolve the name to a provider endpoint — and that resolution path might go through a file you didn't know existed.
### 3️⃣ Custom config files can shadow built-in behavior

This is the core lesson. My AI assistant created `models.json` for a legitimate reason (custom Nvidia provider). But when it added an `openrouter` block to the same file, it accidentally replaced the entire built-in OpenRouter catalog with its 13-model subset. Everything not in that subset — including stepfun — became invisible.
💡 If your tool has a built-in registry, a custom config that matches its namespace will override it.
### 4️⃣ AI agents optimise for the task at hand

Elara added Google models when I asked for Google models. She didn't know that creating an `openrouter` provider block would shadow the built-in one and break stepfun. AI agents don't preserve context they weren't told about.
### 5️⃣ Backup everything, always 💾
I had 28 backup files spanning a month. They let me reconstruct the exact state of every config file at every point in time. I now run a daily cron job:
```bash
# 2 AM UTC daily, 30-day retention
0 2 * * * ~/openclaw_daily_backup.sh >> /tmp/openclaw/backup.log 2>&1
```
## 🎯 The Takeaway
Infrastructure debugging is archaeology.
You're not fixing bugs — you're reconstructing what a system looked like at a moment when it worked, and comparing it to the moment it stopped.
The difference is usually:
- ✏️ One character (`>` vs `>>`)
- 📄 One file that's shadowing a built-in registry
- 🤖 One good-faith change by an AI agent that had unintended side effects
And the real fix isn't always adding what's missing — sometimes it's removing what shouldn't be there.
If you've ever stared at `configured,missing` and felt your sanity slipping — now you know exactly where to look. 🦞
## Update: `openrouter-custom` Provider Removed (March 2026)

After further testing, we found that `openrouter-custom` models (community/niche models like Dolphin-Mistral) always fail with `404 No endpoints found that support tool use` when used with OpenClaw agents. This happens because:
- OpenClaw agents always include tool definitions in the API request body
- Dolphin-Mistral has zero tool-supporting endpoints on OpenRouter
- OpenClaw has no config option to suppress tool definitions at the API payload level for custom providers (`tools.deny: ["*"]` is agent-side only)
**Final decision:** Removed the `openrouter-custom` provider entirely. Created a dedicated `dolphin` agent bound to a separate Telegram bot, currently running on StepFun as primary — ready to switch to Dolphin when OpenRouter adds tool-supporting endpoints for it.
Clean model setup that works (9/9 TUI tests passed):

```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openrouter/stepfun/step-3.5-flash:free",
        "fallbacks": ["google/gemini-3-flash-preview"]
      }
    }
  }
}
```