Hassann

Posted on Jun 22 • Originally published at apidog.com

What Is Sakana Fugu?

Sakana Fugu is a multi-agent orchestration system from Sakana AI, exposed as a single foundation model behind an OpenAI-compatible API. Instead of answering every prompt directly, Fugu acts as a trained conductor: it delegates work, coordinates agent communication, and synthesizes outputs from a pool of LLMs, including recursive copies of itself. Sakana announced Fugu on June 22, 2026 under “One Model to Command Them All” on the official release page. If you have followed other frontier releases like Claude Fable 5, Fugu is different: it is a conductor, not a soloist.


The name fits the architecture. Fugu is the Japanese pufferfish: valuable, but only when prepared correctly. That is a useful way to think about an AI system whose value comes from how it coordinates other models rather than from a single model answering alone.

TL;DR

What it is: A trained “conductor” LLM that orchestrates a team of frontier models behind one endpoint.
Variants: fugu for balanced, lower-latency use; fugu-ultra for maximum answer quality.
API shape: OpenAI-compatible. Existing OpenAI clients can usually be pointed at Fugu’s base URL.
Important caveat: Fugu calls other vendors’ models and can recursively call copies of itself. Treat benchmark results as model-of-models results, not single-model wins.
Access: Product page plus console.sakana.ai behind Google/email login.

What Sakana Fugu actually is

Most LLM APIs work like this:

You send a prompt.
One model processes it.
You receive tokens from that model.

Fugu adds an orchestration layer. When a request arrives, Fugu decides whether to answer directly or to assemble a team of models. It can split the task, assign subtasks, route messages between agents, and merge the results into one response.

From your code, Fugu looks like one model behind one endpoint. Internally, it is a model-of-models. That distinction matters when you evaluate latency, cost, and benchmark claims.

The branding also matches Sakana’s larger theme. “Sakana” means “fish” in Japanese, and the company often frames its research around school-of-fish collective intelligence: many small agents producing behavior stronger than any individual agent.

The two variants: Fugu and Fugu Ultra

Sakana ships two variants through the same API.

Fugu

Use Fugu when latency matters.

Typical use cases:

Coding assistance
Code review
Chatbots
Interactive developer tools
General productivity workflows

This variant was called “Fugu Mini” during the beta. The current name is Fugu; “Mini” survives only in older beta material.

Fugu Ultra

Use Fugu Ultra when answer quality matters more than latency.

Typical use cases:

AI research
Paper reproduction
Cybersecurity analysis
Literature review
Patent or technical investigation

Both variants run behind one endpoint. You choose the model ID, but you do not manually control the internal orchestration.

For a deeper comparison, see the Fugu Ultra vs Fable 5 vs Mythos breakdown.

Spec table

Attribute	Detail
Vendor	Sakana AI
Released	June 22, 2026
Type	Multi-agent orchestration system, shipped as one foundation model
Variants	Fugu, Fugu Ultra
Old beta name	“Fugu Mini” for the smaller variant
API	One OpenAI-compatible endpoint
Model IDs reported	`fugu`, `fugu-ultra`; verify in console before shipping
Base URL	Not published publicly; copy from console.sakana.ai
Access	Product page plus console.sakana.ai with Google/email login
Pricing structure	Subscription tiers plus pay-as-you-go for heavier or enterprise use
Research lineage	Trinity, arXiv:2512.04695; Conductor, arXiv:2512.04388; both ICLR 2026

How the orchestration works

Fugu is not just a static router. The core idea is a learned conductor.

A traditional router picks one model for a request and forwards the prompt. Tools like OpenRouter or Martian follow that pattern.

An agent framework like Swarm, AutoGen, or LangGraph gives you the primitives to build multi-agent workflows, but you write the coordination logic.

Fugu sits between those approaches:

You call one API.
Fugu decides whether the task needs multiple agents.
Fugu delegates subtasks.
Fugu manages communication.
Fugu synthesizes the final answer.

Per Sakana, the conductor handles three main jobs:

Delegation: It chooses which agents, including possible recursive copies of itself, should handle which subtasks.
Communication: It manages messages between agents and can shape the team structure dynamically.
Synthesis: It merges partial outputs into one coherent final response.
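
To make the three jobs concrete, here is a toy sketch in which plain Python functions stand in for agents. Everything in it is hypothetical: the `Agent` type, `delegate`, `conduct`, and the keyword-matching assignment rule are illustration only. Sakana has not published Fugu’s internals, and the real conductor is a learned model, not hand-written rules.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Agent:
    """Stand-in for one model in the pool (hypothetical, not a Sakana type)."""
    name: str
    skill: str
    run: Callable[[str, List[str]], str]  # (subtask, earlier notes) -> output


def delegate(subtasks: List[str], pool: List[Agent]) -> List[Tuple[str, Agent]]:
    """Delegation: give each subtask to the first agent whose skill keyword
    appears in it, falling back to the first agent in the pool."""
    return [
        (task, next((a for a in pool if a.skill in task), pool[0]))
        for task in subtasks
    ]


def conduct(subtasks: List[str], pool: List[Agent]) -> str:
    """Run a delegate -> communicate -> synthesize loop over stub agents."""
    notes: List[str] = []
    for task, agent in delegate(subtasks, pool):
        # Communication: each agent receives the notes written before it.
        notes.append(f"{agent.name}: {agent.run(task, list(notes))}")
    # Synthesis: here just a join; a real conductor would write a merged answer.
    return "\n".join(notes)


def stub(task: str, notes: List[str]) -> str:
    """Fake model call so the sketch runs without any API access."""
    return f"handled '{task}' after {len(notes)} note(s)"


pool = [Agent("coder", "code", stub), Agent("checker", "verify", stub)]
result = conduct(["write code for the parser", "verify the parser output"], pool)
print(result)
```

The point of the sketch is the shape of the loop, not the rules inside it. With Fugu, all three steps happen behind the endpoint and you only see the final response.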

Two governance-related mechanics are also important:

Swappable agents: The model pool is not fixed. Teams can opt specific agents out for data, policy, or compliance reasons.
Routing around restrictions: Per Sakana, Fugu can dynamically route around provider restrictions by choosing a different agent when one is unavailable or disallowed.
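
As a rough mental model of those two mechanics, the sketch below filters a preference-ordered pool by an opt-out set and then skips any agent that is currently unavailable. The function names and the `agent-a`/`agent-b`/`agent-c` labels are made up for illustration. How opt-outs are actually configured in Fugu is not covered in the published material, so check the console rather than relying on this.

```python
from typing import Iterable, List, Optional, Set


def allowed_agents(pool: Iterable[str], opted_out: Set[str]) -> List[str]:
    """Drop every agent the team has opted out of, keeping preference order."""
    return [name for name in pool if name not in opted_out]


def pick_agent(
    preferred: List[str],
    opted_out: Set[str],
    unavailable: Set[str],
) -> Optional[str]:
    """Return the first permitted agent that is currently reachable,
    or None when nothing in the pool qualifies."""
    for name in allowed_agents(preferred, opted_out):
        if name not in unavailable:
            return name
    return None


# Hypothetical pool in preference order.
pool = ["agent-a", "agent-b", "agent-c"]

# agent-a is excluded by policy and agent-b is down, so agent-c is chosen.
choice = pick_agent(pool, opted_out={"agent-a"}, unavailable={"agent-b"})
print(choice)
```

Note the `None` case: if policy and availability together exclude every agent, the caller has to decide whether to fail or degrade.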

For a related single-model framing, see the Mythos-class model explainer.

The benchmark caveat: orchestrator, not single model

Read Fugu benchmark claims carefully.

Fugu is an orchestrator that calls other vendors’ frontier models and can recursively call copies of itself. A Fugu benchmark result may come from Fugu calling Opus 4.8, Fable 5, or multiple models, then synthesizing their outputs. That is a model-of-models result, not a like-for-like single-model win.

This matters for two Sakana claims.

First, Sakana says Fugu Ultra “stands shoulder-to-shoulder with leading models like Fable 5 and Mythos Preview” across engineering, scientific, and reasoning benchmarks. Read that as a parity claim, not a “Fugu’s own weights beat every single model” claim.

Second, Sakana says Fugu “consistently outperforms” Gemini 3.1 Pro high, Opus 4.8 max, and GPT 5.5 xhigh on these application tasks:

AutoResearch
Rubik’s Cube
Mechanical Design
Japanese Handwriting Analysis
One-Shot Chess
Financial Time Series Prediction

The honest interpretation: the orchestrated team can outperform an individual model while still depending on that individual model as part of the team.

That does not make the result invalid. Orchestration is a real capability. It just means you should label the result correctly.

For context, Fable 5 is Anthropic’s most powerful generally available model, while Mythos Preview was an unreleased frontier model. The Claude Fable 5 explainer covers that side in more detail.

Research lineage: Trinity and Conductor

Fugu builds on two ICLR 2026 research threads.

Trinity: An Evolved LLM Coordinator

Trinity: An Evolved LLM Coordinator describes a tiny coordinator with fewer than 20,000 parameters, optimized by derivative-free evolution.

It assigns roles such as:

Thinker
Worker
Verifier

The key point: a very small evolved controller can drive a useful multi-agent loop.
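
If “derivative-free evolution” is unfamiliar, the sketch below shows the general idea with a simple (1+λ) evolution strategy: mutate a small parameter vector with random noise, score each mutant, and keep the best one only if it beats the parent. No gradients are computed. This is a generic textbook-style illustration, not Trinity’s actual algorithm; the `toy_fitness` function, the target vector, and all hyperparameters are assumptions chosen so the example runs on its own.

```python
import random
from typing import Callable, List, Tuple


def evolve(
    fitness: Callable[[List[float]], float],
    start: List[float],
    generations: int = 200,
    children: int = 8,
    sigma: float = 0.1,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """(1+lambda) evolution strategy: the parent survives unless a child
    scores strictly higher. Only fitness values are used, never gradients."""
    rng = random.Random(seed)
    parent, parent_score = list(start), fitness(start)
    for _ in range(generations):
        # Mutation: every child is the parent plus Gaussian noise.
        offspring = [
            [p + rng.gauss(0.0, sigma) for p in parent] for _ in range(children)
        ]
        # Selection: compare the best child against the current parent.
        best_child = max(offspring, key=fitness)
        child_score = fitness(best_child)
        if child_score > parent_score:
            parent, parent_score = best_child, child_score
    return parent, parent_score


# Toy objective (an assumption for this sketch): get close to a fixed target.
TARGET = [0.5, -0.2, 0.8]


def toy_fitness(params: List[float]) -> float:
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


start = [0.0, 0.0, 0.0]
best, best_score = evolve(toy_fitness, start)
print(f"start fitness {toy_fitness(start):.3f}, evolved fitness {best_score:.3f}")
```

In a coordinator setting, the fitness function would be something like task success of the multi-agent loop, which is expensive and not differentiable. That is the situation where this family of methods is attractive.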

Conductor: Learning to Orchestrate Agents in Natural Language

Conductor: Learning to Orchestrate Agents in Natural Language describes a 7B model trained with reinforcement learning to learn communication structures between agents.

The paper reports outperforming Mixture-of-Agents at lower cost.

These are different methods:

Trinity uses evolution.
Conductor uses reinforcement learning.
Trinity is sub-20K parameters.
Conductor is 7B parameters.

Do not assume the shipped Fugu product has the same parameter count or implementation details as either paper. Sakana has not published a parameter count for Fugu, so applying details from these papers directly to the product is inference, not an official spec.

The practical novelty is narrower and more useful: Fugu packages learned, adaptive, cost-aware orchestration behind one endpoint.

What early users report

Sakana shared two vendor-reported testimonials.

One software engineer using Fugu Ultra for code review said it surfaced “more than twenty” issues where other tools flagged “about three,” and called it better than GPT-5.5.

A security engineer said one scoped instruction drove a full end-to-end assessment, including recon, XSS and SQLi probing, and auth review, while staying within scope.

Treat these as anecdotes, not benchmarks. They are still useful signals for workload fit. Fugu is most interesting for tasks that:

Decompose into subtasks
Benefit from parallel investigation
Need verification or synthesis at the end
Have enough complexity to justify orchestration overhead

For Sakana’s related model lineup, see the Mirofish explainer.

Using the Fugu API

Fugu exposes an OpenAI-compatible endpoint. That is the main implementation advantage: if your app already uses the OpenAI SDK, you can usually switch by changing the API key, base URL, and model ID.

The request shape follows the standard OpenAI Chat Completions API.
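
For reference, this is roughly what that request looks like on the wire, built with the Python standard library and not sent anywhere. It assumes the chat endpoint lives at `<base_url>/chat/completions`, which is the OpenAI convention; confirm the path in the Sakana console. The `build_chat_request` helper and the `fugu.example.invalid` host are placeholders, not real Sakana values.

```python
import json
import urllib.request
from typing import Dict, List


def build_chat_request(
    base_url: str,
    api_key: str,
    model: str,
    messages: List[Dict[str, str]],
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style Chat Completions request."""
    # Assumption: the path follows the OpenAI layout under the base URL.
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; the real base URL comes from the console.
request = build_chat_request(
    "https://fugu.example.invalid/v1",
    "YOUR_FUGU_API_KEY",
    "fugu",
    [{"role": "user", "content": "Review this function for off-by-one bugs."}],
)
print(request.get_method(), request.full_url)
```

Passing the object to `urllib.request.urlopen(request)` would send it. In practice the OpenAI SDK does the same assembly for you.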

Important caveat: as of 2026-06-22, the base URL is not published on a public page. Do not guess it. Copy it from console.sakana.ai.

Python example

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_FUGU_API_KEY",
    base_url="<YOUR_FUGU_BASE_URL_FROM_CONSOLE>",
)

response = client.chat.completions.create(
    model="fugu",
    messages=[
        {
            "role": "system",
            "content": "You are a precise coding assistant."
        },
        {
            "role": "user",
            "content": "Review this function for off-by-one bugs."
        },
    ],
)

print(response.choices[0].message.content)
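
The example above uses inline placeholders. In a real project you would more likely read both values from the environment so the key never lands in source control and a missing base URL fails loudly. A minimal sketch, assuming you store them as `FUGU_BASE_URL` and `FUGU_API_KEY` (names chosen for this article, not mandated by Sakana); `load_fugu_config` is a hypothetical helper:

```python
import os
from typing import Mapping, Optional, Tuple


def load_fugu_config(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (base_url, api_key), raising if either one is missing or empty."""
    env = os.environ if env is None else env
    required = ("FUGU_BASE_URL", "FUGU_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early instead of falling back to a guessed URL.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["FUGU_BASE_URL"], env["FUGU_API_KEY"]


# Usage with the OpenAI SDK (not executed here):
#   base_url, api_key = load_fugu_config()
#   client = OpenAI(api_key=api_key, base_url=base_url)
```

Raising on a missing value is deliberate: because the base URL is not public, silently defaulting to anything would be a guess.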

Switch to Fugu Ultra

Change the model ID:

response = client.chat.completions.create(
    model="fugu-ultra",
    messages=[
        {
            "role": "user",
            "content": "Reproduce the headline result from this paper."
        },
    ],
)

print(response.choices[0].message.content)

The reported model IDs are:

fugu
fugu-ultra

Verify the exact strings in the Sakana console before shipping. Model IDs and dated variants can change.
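
One way to automate that check is to compare the IDs your code expects against what the endpoint lists. The sketch below assumes Fugu’s endpoint implements the OpenAI-style model listing call (`client.models.list()` in the OpenAI Python SDK). That is an assumption, not something Sakana has confirmed; if the call is not supported, fall back to reading the IDs from the console. Both helper names are hypothetical.

```python
from typing import Any, Iterable, Set


def available_model_ids(client: Any) -> Set[str]:
    """Collect model IDs from an OpenAI-compatible client.

    Assumption: the endpoint supports model listing, so that
    client.models.list() yields objects with an `id` attribute.
    """
    return {model.id for model in client.models.list()}


def missing_model_ids(available: Iterable[str], wanted: Iterable[str]) -> Set[str]:
    """Return the IDs your code expects that the endpoint did not list."""
    return set(wanted) - set(available)


# Usage sketch (not executed here):
#   missing = missing_model_ids(available_model_ids(client), ["fugu", "fugu-ultra"])
#   if missing:
#       raise SystemExit(f"Model IDs not offered by this endpoint: {sorted(missing)}")
```

Running this once in CI or at startup catches a renamed or dated model ID before it turns into a runtime error.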

For a full walkthrough, see the Sakana Fugu API guide.

How to test Fugu in Apidog

Because Fugu uses the OpenAI Chat Completions format, you can test it like any other LLM endpoint in Apidog.

A practical setup:

Copy your Fugu base URL from console.sakana.ai.
Create an Apidog environment variable, for example:

FUGU_BASE_URL=<YOUR_FUGU_BASE_URL_FROM_CONSOLE>
FUGU_API_KEY=<YOUR_FUGU_API_KEY>

Create a new request using the chat completions path from your Fugu-compatible endpoint.
Add authorization:

Authorization: Bearer {{FUGU_API_KEY}}

Send a JSON body like this:

{
  "model": "fugu",
  "messages": [
    {
      "role": "system",
      "content": "You are a precise API testing assistant."
    },
    {
      "role": "user",
      "content": "Review this API response schema and identify breaking changes."
    }
  ]
}

Save the request as a reusable test case.
Duplicate it and change only the model ID to compare fugu and fugu-ultra.

This matters more for Fugu than for a single-model endpoint. Because Fugu may assemble different teams per request, latency and cost can vary. Capturing request timing, response shape, and token usage in Apidog gives you workload-specific data instead of relying only on vendor benchmarks.
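
If you also want the same numbers from code, a small wrapper around the SDK call is enough. The sketch below times one Chat Completions request and reads `usage.total_tokens` from the response, which is where the OpenAI response format reports token counts. Whether Fugu populates `usage`, and whether it reflects the whole internal team or only the final answer, is not documented in the material covered here, so the helper tolerates a missing value. `timed_chat` is a hypothetical name.

```python
import time
from typing import Any, Dict, List


def timed_chat(client: Any, model: str, messages: List[Dict[str, str]]) -> Dict[str, Any]:
    """Send one chat request and record wall-clock latency and token usage."""
    start = time.perf_counter()
    response = client.chat.completions.create(model=model, messages=messages)
    elapsed = time.perf_counter() - start

    # usage may be absent on some OpenAI-compatible endpoints.
    usage = getattr(response, "usage", None)
    return {
        "model": model,
        "latency_s": elapsed,
        "total_tokens": getattr(usage, "total_tokens", None),
        "text": response.choices[0].message.content,
    }


# Usage sketch (not executed here), with `client` built as in the Python example:
#   record = timed_chat(client, "fugu", [{"role": "user", "content": "ping"}])
#   print(record["latency_s"], record["total_tokens"])
```

Run each prompt several times before drawing conclusions, since a single sample says little when the team composition can change per request.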

You can download Apidog and point a new request at your console base URL to start testing.

Frequently asked questions

Is Sakana Fugu a single model or many models?

Both, depending on perspective.

To your code, Fugu is one model behind one API. Internally, it is a trained conductor that can call a pool of frontier models and synthesize their outputs. That is why its benchmark results should be treated as model-of-models results, not single-model wins.

See the Mythos-class model explainer for the single-model tier Fugu is often compared against.

What is the difference between Fugu and Fugu Ultra?

Fugu is the balanced, lower-latency variant for everyday tasks like coding, code review, and chatbots.

Fugu Ultra trades latency for higher answer quality and targets heavier work like research, security analysis, and deep technical investigation.

Both use the same endpoint. You select the variant with the model ID.

Does Fugu really beat Opus 4.8 and GPT 5.5?

Per Sakana, Fugu consistently outperforms Gemini 3.1 Pro, Opus 4.8, and GPT 5.5 on a specific list of application tasks.

The precise reading is that Fugu can orchestrate a team that may include those same kinds of models. A team result can beat a solo result while still depending on the solo model. Do not present it as Fugu’s own weights beating every individual model.

How do I call the Fugu API?

Use an OpenAI-compatible client, set your Fugu API key, set the Fugu base URL from the Sakana console, and send a standard Chat Completions request.

Example:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_FUGU_API_KEY",
    base_url="<YOUR_FUGU_BASE_URL_FROM_CONSOLE>",
)

response = client.chat.completions.create(
    model="fugu",
    messages=[
        {"role": "user", "content": "Explain this error log and suggest a fix."}
    ],
)

print(response.choices[0].message.content)

The base URL is not public, so copy it from console.sakana.ai.

For a complete example, see the Sakana Fugu API guide.

Is Fugu available to everyone right now?

Access runs through the product page and console.sakana.ai, behind Google or email login.

The beta reportedly ran with roughly 500 users from late April 2026. Whether fully self-serve GA signup is open, and whether any regional restrictions apply, should be checked live in the console.

How is Fugu different from a router or an agent framework?

A router chooses one model and forwards your request.

An agent framework gives you primitives to build multi-agent workflows, but you write the coordination logic.

Fugu trains the coordinator itself. A learned model decides delegation, communication, and synthesis, then exposes the result behind one endpoint.

Bottom line

Fugu is a bet that the next gains come from how models work together, not only from larger single-model weights.

For developers, the practical evaluation is straightforward:

Use the OpenAI-compatible API.
Test fugu and fugu-ultra on your real tasks.
Measure latency, cost, and output quality.
Compare against the single-model endpoints you already use.
Decide whether orchestration improves your workload enough to justify the overhead.
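
The checklist above boils down to collecting per-run records and comparing them per model. A minimal sketch of that aggregation follows. The record fields (`model`, `latency_s`, `score`) are assumptions for illustration, and `score` stands for whatever quality rubric you apply to your own tasks; nothing here comes from Sakana.

```python
from statistics import median
from typing import Any, Dict, List


def summarize(runs: List[Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Group run records by model and report run count, median latency,
    and mean quality score for each model."""
    by_model: Dict[str, List[Dict[str, Any]]] = {}
    for run in runs:
        by_model.setdefault(run["model"], []).append(run)

    return {
        model: {
            "runs": len(rows),
            # Median is less sensitive than the mean to one slow request.
            "median_latency_s": median(r["latency_s"] for r in rows),
            "mean_score": sum(r["score"] for r in rows) / len(rows),
        }
        for model, rows in by_model.items()
    }
```

Feed it records for `fugu`, `fugu-ultra`, and the single-model endpoints you already use, all run on the same prompts, and the trade-off between quality and overhead becomes a table you can read directly.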

Set up your first Fugu request in Apidog, save it as a reusable test case, and compare the results against your current LLM stack.
