You can use the Sakana Fugu API by creating a key at console.sakana.ai, copying the base URL shown in your console, and pointing your existing OpenAI client at that endpoint. No SDK migration is required. Fugu exposes an OpenAI-compatible API, so the same openai Python and JavaScript libraries you already use will work after you swap the base URL and API key. Behind that endpoint, Fugu runs a multi-agent orchestration system: according to the Sakana release page, it decides whether to answer directly or coordinate multiple models, while your application receives a normal chat completion response. If you have configured gateways before, such as in our guide to the Claude Fable 5 API, the setup pattern will feel familiar.
This guide shows how to get a key, configure Python and JavaScript clients, choose the right model value, stream responses, and test the request before integrating it into an app.
What the Sakana Fugu API is
Fugu is not a single model in the usual sense. Sakana describes it as a trained language model specialized in delegation, agent communication, and work synthesis. The release headline is “One Model to Command Them All.”
When you send a request, a trained conductor reads your prompt and either answers directly or assembles a team. When it assembles a team, it can:
- Dynamically coordinate several LLMs
- Recursively include instances of itself
- Synthesize the work into one final answer
From your application’s point of view, this still looks like a normal OpenAI-compatible chat completion. You do not assemble agents, route between providers, or manage orchestration logic. That complexity stays server-side.
Fugu currently has two main variants:
| Variant | Best for |
|---|---|
| `fugu` | Everyday work, coding, code review, chatbots, and interactive services where latency matters |
| `fugu-ultra` | Deep research, paper reproduction, cybersecurity analysis, literature review, and patent investigation where quality matters more than speed |
During the beta and in some early coverage, the smaller variant was called “Fugu Mini.” Use the current names, Fugu and Fugu Ultra, and treat “Mini” as the older beta label.
The research lineage is public. Two ICLR 2026 papers underpin the approach:
- Trinity: a sub-20K-parameter coordinator optimized by derivative-free evolution with Thinker, Worker, and Verifier roles
- Conductor: a 7B model trained with reinforcement learning that learns communication structures between agents
Do not conflate these two systems. They use different methods and sizes, and the shipped product’s exact parameter count has not been published.
Step 1: Create a Fugu API key at console.sakana.ai
Go to console.sakana.ai. Access is behind a login wall, so sign in with Google or email before opening the dashboard.
The beta reportedly started with roughly 500 users in late April 2026. General availability and regional access may change, so verify the current signup status and region support when you open the console. Reports of an EU/EEA availability restriction have also circulated, so confirm your region before building against the API.
After logging in:
- Open the API keys section.
- Generate a new key.
- Store it in an environment variable or secrets manager.
- Never commit the key to source control.
- Rotate the key immediately if it leaks.
Example environment variable setup:
export FUGU_API_KEY="your_fugu_api_key"
While you are in the console, copy your account base URL. This is important: the Fugu endpoint base URL is not published on any public Sakana page as of this writing.
Do not guess the base URL. Do not copy one from a forum post. Use the exact value shown in your Sakana console.
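One way to enforce both rules (key in the environment, base URL copied from the console) is to load the two values in one place and fail loudly if either is missing. The sketch below is a minimal illustration, not an official Sakana helper. `FUGU_BASE_URL` is a hypothetical variable name chosen for this example, and `load_fugu_settings` and `mask_key` are illustrative names.

```python
import os
from typing import Mapping

# The placeholder used in this guide's examples; it must never reach production.
PLACEHOLDER = "<YOUR_FUGU_BASE_URL_FROM_CONSOLE>"


def load_fugu_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return keyword arguments for an OpenAI-compatible client, or raise."""
    api_key = env.get("FUGU_API_KEY", "").strip()
    # FUGU_BASE_URL is a hypothetical variable name chosen for this sketch.
    base_url = env.get("FUGU_BASE_URL", "").strip()

    if not api_key:
        raise RuntimeError("FUGU_API_KEY is not set")
    if not base_url or base_url == PLACEHOLDER:
        raise RuntimeError("FUGU_BASE_URL must be copied from the Sakana console")

    return {"api_key": api_key, "base_url": base_url.rstrip("/")}


def mask_key(api_key: str) -> str:
    """Shorten a key for log output so the full secret is never printed."""
    return api_key[:4] + "..." if len(api_key) > 8 else "***"
```

With the OpenAI SDK installed, the returned dictionary can be passed straight to the client constructor, for example `OpenAI(**load_fugu_settings())`.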
Step 2: Point your OpenAI client at Fugu
Because Fugu is OpenAI-compatible, you keep your existing SDK.
You only need to change:
- `api_key` (`apiKey` in the JavaScript SDK)
- `base_url` / `baseURL`
The request shape stays aligned with the standard OpenAI chat completion API.
In the examples below:
- `<YOUR_FUGU_BASE_URL_FROM_CONSOLE>` must be copied from your Sakana console
- `fugu` is an example model string; confirm the exact value in your console
Python
Install or update the OpenAI SDK:
pip install --upgrade openai
Create a chat completion:
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.environ["FUGU_API_KEY"],
    base_url="<YOUR_FUGU_BASE_URL_FROM_CONSOLE>",  # copy from console.sakana.ai
)
response = client.chat.completions.create(
    model="fugu",  # confirm the exact model string in your console
    messages=[
        {"role": "system", "content": "You are a helpful engineering assistant."},
        {"role": "user", "content": "Refactor this function to remove the nested loop."},
    ],
)
print(response.choices[0].message.content)
JavaScript
Install the OpenAI SDK:
npm install openai
Create a chat completion:
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.FUGU_API_KEY,
  baseURL: "<YOUR_FUGU_BASE_URL_FROM_CONSOLE>", // copy from console.sakana.ai
});
const response = await client.chat.completions.create({
  model: "fugu", // confirm the exact model string in your console
  messages: [
    { role: "system", content: "You are a helpful engineering assistant." },
    { role: "user", content: "Refactor this function to remove the nested loop." },
  ],
});
console.log(response.choices[0].message.content);
That is the migration path. If you have routed an OpenAI client through a third-party gateway before, the pattern is similar to setups like Claude Code with OpenRouter: same client, new base URL, new key.
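Because only the client configuration changes, application code can stay provider-agnostic if it receives the client and the model string as arguments. The helper below is a small sketch of that idea. `ask` is an illustrative name, and the function assumes nothing beyond the standard `client.chat.completions.create` call already used above.

```python
from typing import Any


def ask(client: Any, model: str, user_prompt: str, system_prompt: str = "") -> str:
    """Send one chat completion through any OpenAI-compatible client."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})

    response = client.chat.completions.create(model=model, messages=messages)
    # content may be None in the OpenAI response schema, so normalise to "".
    return response.choices[0].message.content or ""
```

Calling `ask(client, "fugu", "Summarize this diff.")` with a Fugu-configured client and with an OpenAI-configured client differs only in how `client` was built, which is the point of the migration path.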
Step 3: Choose the model field
The model field selects the Fugu variant.
Reported model strings include:
fugu
fugu-ultra
Some sources have also reported dated identifiers like:
fugu-ultra-20260615
Treat dated IDs as unstable unless your console explicitly lists them. Model identifiers can change between releases, so use the value shown in your Sakana dashboard.
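Since identifiers can change, it may help to keep model strings out of application code and resolve them from configuration. The following is a sketch under assumptions: the defaults are the reported strings from this guide, and `FUGU_MODEL_FAST` and `FUGU_MODEL_DEEP` are hypothetical override variables, not names defined by Sakana.

```python
import os
from typing import Mapping

# Reported model strings from this guide; treat them as defaults, not guarantees.
DEFAULT_MODELS = {"fast": "fugu", "deep": "fugu-ultra"}


def resolve_model(variant: str, env: Mapping[str, str] = os.environ) -> str:
    """Map an app-level variant name to the model string to send."""
    if variant not in DEFAULT_MODELS:
        raise ValueError(f"unknown variant: {variant!r}")
    # FUGU_MODEL_FAST / FUGU_MODEL_DEEP are hypothetical override names.
    override = env.get(f"FUGU_MODEL_{variant.upper()}", "").strip()
    return override or DEFAULT_MODELS[variant]
```

If the console later lists a different identifier, updating the environment variable is enough and no code change is needed.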
Use this rule of thumb:
- Start with `fugu` for interactive apps, chat, coding tasks, and latency-sensitive workloads.
- Use `fugu-ultra` when answer quality matters more than latency, such as deep research, paper reproduction, or security review.
Switching variants is a one-line change:
# Balanced, low-latency variant
fast = client.chat.completions.create(
    model="fugu",
    messages=[
        {"role": "user", "content": "Summarize this changelog in three bullets."}
    ],
)
# Maximum-quality variant
deep = client.chat.completions.create(
    model="fugu-ultra",  # confirm the exact string in your console
    messages=[
        {
            "role": "user",
            "content": "Reproduce the main result of this paper and flag any gaps.",
        }
    ],
)
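To see the latency and quality trade-off on your own prompts rather than relying on the rule of thumb alone, you could time the same request against both variants. This is a rough sketch: `time_variants` is an illustrative name, the client is whatever OpenAI-compatible client you already configured, and a single wall-clock measurement is only indicative.

```python
import time
from typing import Any, Callable


def time_variants(client: Any, prompt: str, models: list[str],
                  clock: Callable[[], float] = time.perf_counter) -> list[dict]:
    """Run the same prompt against each model and record wall-clock latency."""
    results = []
    for model in models:
        start = clock()
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        elapsed = clock() - start
        results.append({
            "model": model,
            "seconds": elapsed,
            "text": response.choices[0].message.content or "",
        })
    return results
```

For example, `time_variants(client, "Summarize this changelog.", ["fugu", "fugu-ultra"])` returns one record per model that you can compare side by side.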
Step 4: Stream responses
Streaming works like it does with OpenAI. Enable streaming and iterate over response chunks.
Use streaming when:
- Building a chat UI
- Displaying long responses progressively
- Reducing perceived latency
- Showing tokens as they arrive
Python streaming
stream = client.chat.completions.create(
    model="fugu",
    messages=[
        {"role": "user", "content": "Walk me through setting up a CI pipeline."}
    ],
    stream=True,
)
for chunk in stream:
    # Some chunks (for example a trailing usage-only chunk) may carry no choices.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
JavaScript streaming
const stream = await client.chat.completions.create({
  model: "fugu",
  messages: [
    { role: "user", content: "Walk me through setting up a CI pipeline." },
  ],
  stream: true,
});
for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) process.stdout.write(delta);
}
With Fugu, streaming returns the synthesized output. You do not see intermediate agent messages, routing decisions, or orchestration graphs. Even if Fugu coordinates multiple models behind the scenes, your stream contains the final assistant response.
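Since the stream carries only the final assistant text, an application usually wants both the incremental fragments (for the UI) and the complete message (for storage or logging). A minimal sketch of that pattern follows. `collect_stream` and `on_delta` are illustrative names, and the chunk shape assumed is the standard OpenAI streaming format.

```python
from typing import Any, Callable, Iterable, Optional


def collect_stream(chunks: Iterable[Any],
                   on_delta: Optional[Callable[[str], None]] = None) -> str:
    """Join streamed deltas into the final assistant message."""
    parts = []
    for chunk in chunks:
        # Skip chunks without choices (for example a trailing usage-only chunk).
        if not getattr(chunk, "choices", None):
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
            if on_delta is not None:
                on_delta(delta)  # e.g. push the fragment to a chat UI
    return "".join(parts)
```

Passing the `stream` object from a `stream=True` request as `chunks` returns the full reply once the stream ends, while `on_delta` receives each fragment as it arrives.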
What happens behind one request
When a request reaches Fugu, the conductor decides whether to answer directly or build a team of models. If it builds a team, it can call multiple LLMs, recursively include more instances of Fugu, and merge the results into one response.
This matters for governance and evaluation.
According to Sakana’s release page:
- Agents in the pool are swappable.
- Teams can opt specific agents out.
- Fugu can route around provider restrictions.
- The orchestration layer can act as both a quality mechanism and a compliance lever.
This also affects how you should read benchmarks.
Fugu is an orchestrator that can call other vendors’ frontier models, recursively including itself. When Sakana reports that Fugu Ultra “stands shoulder-to-shoulder with leading models like Fable 5 and Mythos Preview” across engineering, scientific, and reasoning benchmarks, read that as a parity claim attributed to Sakana for a model-of-models system.
Sakana also reports that Fugu “consistently outperforms” Gemini 3.1 Pro, Opus 4.8, and GPT 5.5 on specific applications such as AutoResearch, one-shot chess, and financial time-series prediction. A result like “beats Opus 4.8” may come from Fugu calling Opus and synthesizing its output. That is still a real capability, but it is different from a standalone model topping a leaderboard.
For a single-model comparison point, see our Claude Fable 5 API guide. For more background on the orchestration approach, see what is Sakana Fugu.
Access, pricing, and alternatives
Sakana’s release page confirms two pricing categories:
- Subscription tiers for everyday use
- Pay-as-you-go plans for heavier and enterprise workloads
Specific dollar amounts circulating for tiers, promos, and per-token rates come from secondary, JavaScript-rendered sources rather than the release page itself. Because those numbers can change and were not published on the official release page, this guide does not quote them.
Pricing can change after this guide's 2026-06-22 snapshot, so check live pricing in your Sakana console before you choose a plan.
If you are comparing Fugu with routing gateways, keep the categories separate:
| System type | Behavior |
|---|---|
| Router or gateway | Picks one model per request |
| Fugu | Uses a learned adaptive topology that can coordinate several models and itself |
Routers like OpenRouter and Martian typically select one model per request. Fugu instead acts as an orchestration layer. If you are evaluating gateways for your stack, our roundup of the best OpenRouter alternatives can help frame the trade-offs.
Test Fugu with Apidog before writing code
Because Fugu uses the OpenAI chat-completions format, you can test it like any HTTP API before integrating it into your application.
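For reference, the raw HTTP request that any such tool sends can be sketched with the Python standard library alone. This example only builds the request and does not send it. It assumes the console base URL already includes any version prefix, since the OpenAI SDKs append `/chat/completions` to `base_url` in the same way. `build_chat_request` is an illustrative name.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a chat-completions POST request."""
    # Assumption: the console base URL already contains any version prefix.
    url = base_url.rstrip("/") + "/chat/completions"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # key sent as a bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is then a matter of `urllib.request.urlopen(request)` and decoding the JSON response, which is the same exchange an API client performs for you.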
In Apidog:
- Create a new request.
- Paste the base URL from your Sakana console.
- Add your Fugu key as a bearer token.
- Set the request body with `model` and `messages`.
- Send the request.
- Inspect the raw response, token usage, and assistant message.
Example JSON body:
{
  "model": "fugu",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful engineering assistant."
    },
    {
      "role": "user",
      "content": "Explain how to add retry logic to this API client."
    }
  ]
}
This helps you verify:
- The base URL is correct
- The API key works
- The model string is valid
- The response shape matches your client expectations
- Streaming chunks behave as expected before you build a UI
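The response-shape check in particular is easy to automate once you have a raw JSON response saved from a test request. The sketch below assumes the standard OpenAI chat completion layout (`choices[0].message` plus a `usage` object). Whether Fugu always returns `usage` is not confirmed here, so treat that check as optional. `check_chat_completion` is an illustrative name.

```python
def check_chat_completion(payload: dict) -> list[str]:
    """Return a list of problems found in a chat completion JSON body."""
    problems = []

    choices = payload.get("choices")
    if not isinstance(choices, list) or not choices:
        problems.append("missing or empty 'choices'")
    else:
        first = choices[0] if isinstance(choices[0], dict) else {}
        message = first.get("message") or {}
        if message.get("role") != "assistant":
            problems.append("first choice is not an assistant message")
        if not isinstance(message.get("content"), str):
            problems.append("assistant message has no text content")

    # Assumption: token usage is reported as in the OpenAI format.
    usage = payload.get("usage")
    if not isinstance(usage, dict) or "total_tokens" not in usage:
        problems.append("missing 'usage.total_tokens'")

    return problems
```

An empty list means the saved response matches what the client code in this guide expects to read.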
Apidog also lets you save the request, parameterize the key across environments, and share a working example with your team. For a focused walkthrough, see our guide on how to test the Sakana Fugu API with Apidog.
When you are ready to build, Download Apidog and start from a verified request instead of guessing at the contract.
Frequently Asked Questions
Do I need a new SDK to call the Sakana Fugu API?
No. Fugu exposes an OpenAI-compatible endpoint, so you can keep using the existing openai Python or JavaScript client. Change the base_url or baseURL to the value from your console and set your Fugu API key.
Where do I find the Fugu base URL?
Copy it from your dashboard at console.sakana.ai after logging in with Google or email. The base URL is not published on a public Sakana page, so do not guess it or reuse a host from another provider.
What is the difference between Fugu and Fugu Ultra?
Fugu is the balanced, low-latency variant for everyday work, coding, code review, chat, and interactive services. Fugu Ultra targets maximum answer quality for research, paper reproduction, and security analysis. Both use the same endpoint, and you switch between them by changing the model field.
Does Fugu beat single models like Fable 5?
Treat that carefully. Sakana frames Fugu Ultra as standing shoulder-to-shoulder with Fable 5 and Mythos Preview, which is a parity claim rather than a direct “beats” claim. Fugu is an orchestrator that can call other vendors’ frontier models, so its results reflect a model-of-models system rather than a single standalone model. See our Claude Fable 5 API guide for the single-model comparison point.
How much does the Sakana Fugu API cost?
The release page confirms subscription tiers plus pay-as-you-go plans, but specific dollar amounts circulating online come from secondary sources and can change. This answer reflects what was public as of 2026-06-22, so check live pricing in your Sakana console before subscribing.
How do I test Fugu before writing code?
Use an API client like Apidog. Paste your console base URL, add your key, set the model and messages, and inspect the response. The test Sakana Fugu API with Apidog guide shows the full flow.
Final checklist
Before you integrate Fugu into your app, verify the following:
- You have a valid Sakana console account.
- You copied the exact base URL from the console.
- Your API key is stored in an environment variable or secrets manager.
- Your OpenAI client uses the Fugu `base_url` / `baseURL`.
- Your `model` value matches the console.
- Streaming works if your UI needs token-by-token output.
- You tested the request in Apidog before shipping.
Fugu collapses a multi-agent system into one OpenAI-compatible API call. The hardest setup step is copying the correct base URL from your console. After that, your existing OpenAI client can send requests normally, and Apidog gives you a fast way to verify the contract before writing production code.
Download Apidog and send your first Fugu request from a clean, repeatable request.


