If you use OpenCode, you already have GitHub Copilot, Ollama, Anthropic, Gemini, and other providers configured in one place.
The problem: every other tool in your workflow — Open WebUI, LangChain, Chatbox, Continue, Zed, your own scripts — needs the same models re-entered with their own API keys and base URLs.
I built opencode-llm-proxy to fix this. It's an OpenCode plugin that starts a local HTTP server on `http://127.0.0.1:4010` and translates requests from whatever API format your tool speaks into calls against the models OpenCode already has configured.
## Install

```bash
npm install opencode-llm-proxy
```

Add to `opencode.json`:

```json
{ "plugin": ["opencode-llm-proxy"] }
```

Start OpenCode — the proxy starts automatically.
## Supported API formats
All formats support streaming.
| Format | Endpoint |
|---|---|
| OpenAI Chat Completions | `POST /v1/chat/completions` |
| OpenAI Responses API | `POST /v1/responses` |
| Anthropic Messages API | `POST /v1beta/models/:model:generateContent`-style path is Gemini's, see below |
## Examples
### OpenAI SDK (Python) — route to Ollama

```python
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:4010/v1", api_key="unused")
response = client.chat.completions.create(
    model="ollama/qwen2.5-coder",
    messages=[{"role": "user", "content": "Write a binary search in Python."}],
)
print(response.choices[0].message.content)
```
### Anthropic SDK (Python) — route to GitHub Copilot

```python
import anthropic

client = anthropic.Anthropic(base_url="http://127.0.0.1:4010", api_key="unused")
message = client.messages.create(
    model="github-copilot/claude-sonnet-4.6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain the CAP theorem."}],
)
print(message.content[0].text)
```
### OpenAI SDK (JavaScript)

```javascript
import OpenAI from "openai"

const client = new OpenAI({ baseURL: "http://127.0.0.1:4010/v1", apiKey: "unused" })
const response = await client.chat.completions.create({
  model: "anthropic/claude-3-5-sonnet",
  messages: [{ role: "user", content: "Explain async/await." }],
})
console.log(response.choices[0].message.content)
```
### Open WebUI

Settings → Connections → set the API Base URL to `http://127.0.0.1:4010/v1`. All of your OpenCode models appear in the model picker instantly.
### LangChain (Python)

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="anthropic/claude-3-5-sonnet",
    openai_api_base="http://127.0.0.1:4010/v1",
    openai_api_key="unused",
)
response = llm.invoke("What are the SOLID principles?")
print(response.content)
```
## How it works

Each request:

- Authenticates (optional bearer token via `OPENCODE_LLM_PROXY_TOKEN`)
- Resolves the model ID — `provider/model` notation, a bare model ID, or a Gemini URL path
- Creates a temporary OpenCode session
- Sends the prompt via the OpenCode SDK
- Returns the response in the same API format as the request
Model IDs come from `GET /v1/models`, which returns every configured provider's models in OpenAI list format:

```bash
curl http://127.0.0.1:4010/v1/models | jq '.data[].id'
# "github-copilot/claude-sonnet-4.6"
# "anthropic/claude-3-5-sonnet"
# "ollama/qwen2.5-coder"
```
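If a tool needs to populate its own model picker programmatically, the OpenAI list format is easy to consume. The payload below is an illustrative sample I made up in that format; the real response from `GET /v1/models` reflects whatever providers you have configured:

```python
import json

# Illustrative sample in OpenAI list format, not a captured response.
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "github-copilot/claude-sonnet-4.6", "object": "model"},
    {"id": "ollama/qwen2.5-coder", "object": "model"}
  ]
}
""")

model_ids = [m["id"] for m in sample["data"]]
providers = sorted({mid.split("/", 1)[0] for mid in model_ids})
print(model_ids)   # IDs usable as the `model` argument in the examples above
print(providers)
```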
## Configuration

| Variable | Default | Description |
|---|---|---|
| `OPENCODE_LLM_PROXY_HOST` | `127.0.0.1` | Bind address. Set to `0.0.0.0` to expose on LAN or in Docker. |
| `OPENCODE_LLM_PROXY_PORT` | `4010` | TCP port. |
| `OPENCODE_LLM_PROXY_TOKEN` | (unset) | Bearer token required on every request. |
| `OPENCODE_LLM_PROXY_CORS_ORIGIN` | `*` | `Access-Control-Allow-Origin` for browser clients. |
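For example, to expose the proxy on your LAN behind a token. The values here are placeholders, and I'm assuming the token travels as a standard `Authorization: Bearer` header, which is the usual convention for bearer tokens:

```shell
# In the environment OpenCode starts from, before launching it:
export OPENCODE_LLM_PROXY_HOST=0.0.0.0
export OPENCODE_LLM_PROXY_PORT=4010
export OPENCODE_LLM_PROXY_TOKEN=change-me

# Clients on the LAN then authenticate with the bearer token:
curl -H "Authorization: Bearer change-me" http://<your-lan-ip>:4010/v1/models
```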