TL;DR
MiMo-V2-Pro Pricing starts at $1/1M input tokens and $3/1M output tokens (≤256K context). MiMo-V2-Omni Pricing covers multimodal inputs—text, image, audio, and video—in a unified model. Both are accessible via an OpenAI-compatible API at platform.xiaomimimo.com. Use Apidog for visual API testing, Python for production integrations, and always back your integration with a unit test.
Introduction
Xiaomi released three AI models on March 18, 2026, and the developer community responded quickly. MiMo-V2-Pro and MiMo-V2-Omni are the flagship releases: one for deep agentic reasoning, the other for true multimodal understanding. This guide breaks down MiMo-V2-Pro Pricing, Omni Pricing, and actionable steps on how to use the API in your stack. You'll get a clear overview of pricing tiers, API capabilities, and two integration paths: GUI-based workflow with Apidog and a Python-based approach with a unit test for validation.
💡 Before writing code for the MiMo-V2-Pro or Omni API, download Apidog for free. You can visually test requests, validate responses, add unit test assertions, and debug token usage instantly—all without burning tokens or writing a single line of Python.
MiMo-V2-Pro Pricing & MiMo-V2-Omni Pricing Breakdown
Before you start calling the API, understand MiMo-V2-Pro Pricing and Omni Pricing. Both models use tiered, token-based pricing suitable for production workloads.
MiMo-V2-Pro Pricing: Tiered by Context Length
MiMo-V2-Pro Pricing is split by context size:
| Context Length | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| ≤ 256K tokens | $1.00 | $3.00 |
| 256K – 1M tokens | $2.00 | $6.00 |
For most workloads under 256K tokens, MiMo-V2-Pro is highly cost-effective (output at $3/1M is only 1/8th the price of Claude Opus). For long-horizon tasks, like processing full codebases, the 256K–1M tier applies.
MiMo-V2-Omni Pricing
Omni Pricing follows a similar structure but also covers multimodal inputs. MiMo-V2-Omni natively processes text, image, audio, and video; non-text inputs are tokenized and billed alongside text tokens. For pure text tasks, pricing is comparable to MiMo-V2-Pro, while multimodal tasks will have higher token counts per request.
MiMo-V2 Family Pricing Comparison
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context Window | Modalities |
|---|---|---|---|---|
| MiMo-V2-Pro | $1.00 / $2.00* | $3.00 / $6.00* | 1M tokens | Text |
| MiMo-V2-Omni | ~$1.00* | ~$3.00* | 256K tokens | Text, Image, Audio, Video |
| MiMo-V2-Flash | $0.10 | $0.30 | 256K tokens | Text |
*Tiered or approximate—verify current rates at platform.xiaomimimo.com
- MiMo-V2-Flash: Cheapest for pure text.
- MiMo-V2-Pro: Best for deep reasoning and long context.
- MiMo-V2-Omni: For multimodal pipelines (text, image, audio, video).
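To make the comparison concrete, a small helper can estimate per-request cost from token counts. This is a sketch with the table's ≤256K-tier rates hardcoded; verify current rates at platform.xiaomimimo.com before relying on it.

```python
# Sketch: estimate per-request cost from token counts, using the
# per-1M-token rates from the comparison table above (<=256K tier).
# Rates are hardcoded assumptions; check platform.xiaomimimo.com.

RATES = {
    # model: (input $ per 1M tokens, output $ per 1M tokens)
    "mimo-v2-pro": (1.00, 3.00),
    "mimo-v2-omni": (1.00, 3.00),   # approximate, text-tier rate
    "mimo-v2-flash": (0.10, 0.30),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    input_rate, output_rate = RATES[model]
    return (prompt_tokens / 1_000_000) * input_rate + \
           (completion_tokens / 1_000_000) * output_rate

# Example: 10K input / 2K output tokens on MiMo-V2-Pro
print(f"${estimate_cost('mimo-v2-pro', 10_000, 2_000):.4f}")  # $0.0160
```

The same call with `mimo-v2-flash` comes out ten times cheaper, which is why Flash is the default for high-volume text-only workloads.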
MiMo-V2-Pro & Omni API Capabilities
Know your tool before integration:
MiMo-V2-Pro:
- 1 trillion total parameters (42B active)
- 1M token context window
- Multi-Token Prediction (MTP) for faster inference
- Designed for autonomous reasoning, tool execution, and software engineering
- Top-ranked in its price tier on the Artificial Analysis Intelligence Index
- Strong on SWE-Bench and coding benchmarks
MiMo-V2-Omni:
- Natively processes text, image, audio, and video
- Dedicated image/audio encoders at the architecture level
- Use for document understanding, transcription, video analysis, cross-modal reasoning
Both are exposed through OpenAI-compatible APIs at platform.xiaomimimo.com, allowing drop-in integration with OpenAI SDKs.
How to Use the API with Apidog
Apidog is the fastest way to explore the API visually—send requests, inspect responses, and run unit test assertions without writing code.
Setting Up MiMo-V2-Pro & Omni API Requests in Apidog
Step-by-step:
1. Open Apidog and create a new project (e.g. `MiMo-V2 API Tests`).
2. Create a new HTTP request:
   - Method: `POST`
   - URL: `https://api.xiaomimimo.com/v1/chat/completions`
3. Add headers in the Headers tab:

   | Key | Value |
   |---|---|
   | Authorization | Bearer YOUR_MIMO_API_KEY |
   | Content-Type | application/json |

4. Set the request body (Body → JSON):
For MiMo-V2-Pro:
{
"model": "mimo-v2-pro",
"messages": [
{
"role": "user",
"content": "Write a Python function that checks if a number is prime, and explain how you would unit test it."
}
],
"temperature": 0.6,
"max_tokens": 512
}
For MiMo-V2-Omni (with image input):
{
"model": "mimo-v2-omni",
"messages": [
{
"role": "user",
"content": [
{ "type": "text", "text": "Describe what you see in this image." },
{ "type": "image_url", "image_url": { "url": "https://example.com/diagram.png" } }
]
}
],
"max_tokens": 300
}
5. Click Send. Apidog displays the full response with token usage so you can track MiMo-V2-Pro Pricing and Omni Pricing per request in real time.
Writing Unit Tests for the MiMo-V2-Pro & Omni API in Apidog
Use Apidog's built-in test scripting engine. After sending a request, open the Tests tab and add these assertions:
// Unit test 1: HTTP status is 200
pm.test("Status code is 200", function () {
pm.response.to.have.status(200);
});
// Unit test 2: Correct model returned
pm.test("Model ID is correct", function () {
const json = pm.response.json();
pm.expect(json.model).to.include("mimo-v2");
});
// Unit test 3: Response contains assistant message
pm.test("Assistant message is present", function () {
const json = pm.response.json();
pm.expect(json.choices[0].message.content).to.be.a("string").and.not.empty;
});
// Unit test 4: Token usage reported
pm.test("Token usage is present", function () {
const json = pm.response.json();
pm.expect(json.usage.total_tokens).to.be.above(0);
});
These four unit test checks cover status, model ID, response content, and token usage. Apidog runs them on every Send, catching regressions instantly. Save your collection and run it in CI with Apidog's CLI runner.
How to Use the API with Python
For production, use the OpenAI Python SDK (compatible with MiMo) and add a unit test suite.
Installation
pip install openai pytest
The MiMo API is OpenAI-compatible; the openai SDK works out of the box.
Basic API Call (MiMo-V2-Pro)
# mimo_client.py
from openai import OpenAI
client = OpenAI(
api_key="YOUR_MIMO_API_KEY",
base_url="https://api.xiaomimimo.com/v1"
)
def ask_mimo_pro(prompt: str) -> dict:
"""Call MiMo-V2-Pro API and return structured response."""
response = client.chat.completions.create(
model="mimo-v2-pro",
messages=[{"role": "user", "content": prompt}],
temperature=0.6,
max_tokens=512
)
return {
"content": response.choices[0].message.content,
"model": response.model,
"prompt_tokens": response.usage.prompt_tokens,
"completion_tokens": response.usage.completion_tokens,
"total_tokens": response.usage.total_tokens,
}
if __name__ == "__main__":
result = ask_mimo_pro("What is a unit test and why does it matter?")
print(result["content"])
# Estimate cost using MiMo-V2-Pro Pricing (≤256K tier)
input_cost = (result["prompt_tokens"] / 1_000_000) * 1.00
output_cost = (result["completion_tokens"] / 1_000_000) * 3.00
print(f"Estimated cost: ${input_cost + output_cost:.6f}")
Unit Test for the MiMo-V2-Pro API
# test_mimo_client.py
import pytest
from unittest.mock import patch, MagicMock
from mimo_client import ask_mimo_pro
@pytest.fixture
def mock_mimo_response():
"""Mock MiMo-V2-Pro API response for unit testing."""
mock = MagicMock()
mock.choices[0].message.content = (
"A unit test verifies a single function behaves correctly in isolation."
)
mock.model = "mimo-v2-pro"
mock.usage.prompt_tokens = 20
mock.usage.completion_tokens = 30
mock.usage.total_tokens = 50
return mock
@patch("mimo_client.client.chat.completions.create")
def test_returns_content(mock_create, mock_mimo_response):
"""Unit test: API returns non-empty string content."""
mock_create.return_value = mock_mimo_response
result = ask_mimo_pro("What is a unit test?")
assert isinstance(result["content"], str)
assert len(result["content"]) > 0
@patch("mimo_client.client.chat.completions.create")
def test_correct_model(mock_create, mock_mimo_response):
"""Unit test: confirms mimo-v2-pro model ID is used."""
mock_create.return_value = mock_mimo_response
result = ask_mimo_pro("Hello")
assert result["model"] == "mimo-v2-pro"
@patch("mimo_client.client.chat.completions.create")
def test_token_usage_for_pricing(mock_create, mock_mimo_response):
"""Unit test: token usage present for MiMo-V2-Pro Pricing tracking."""
mock_create.return_value = mock_mimo_response
result = ask_mimo_pro("Hello")
assert result["total_tokens"] > 0
assert result["prompt_tokens"] + result["completion_tokens"] == result["total_tokens"]
Run the unit tests:
pytest test_mimo_client.py -v
Expected output:
test_mimo_client.py::test_returns_content PASSED
test_mimo_client.py::test_correct_model PASSED
test_mimo_client.py::test_token_usage_for_pricing PASSED
3 passed in 0.28s
Mocking the API in your unit test suite means zero token spend during CI runs, which matters as MiMo-V2-Pro Pricing scales with every request in automated pipelines.
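Mocks cover correctness cheaply, but it is worth keeping one opt-in live test that runs only when a key is present. A sketch follows; the `MIMO_API_KEY` environment variable name is an assumption, not an official convention.

```python
# test_mimo_live.py
# Sketch: opt-in integration test. Skipped unless MIMO_API_KEY is set,
# so default CI runs spend zero tokens. The env var name is assumed.
import os
import pytest

@pytest.mark.skipif(
    not os.getenv("MIMO_API_KEY"),
    reason="set MIMO_API_KEY to run live integration tests",
)
def test_live_round_trip():
    # Imported lazily so collection works even without mimo_client.py present.
    from mimo_client import ask_mimo_pro
    result = ask_mimo_pro("Reply with the single word: pong")
    assert isinstance(result["content"], str) and result["content"]
    assert result["total_tokens"] > 0
```

Run it explicitly with `MIMO_API_KEY=... pytest test_mimo_live.py -v` when you want an end-to-end check against the real endpoint.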
MiMo-V2-Pro & Omni API Best Practices
Maximize reliability and minimize cost with these practices:
- Track token usage: Log `prompt_tokens` and `completion_tokens` per call. At $1/1M input and $3/1M output, keeping system prompts concise pays off quickly.
- Prototype with Apidog: Use Apidog for prompt iteration and response validation before writing code, and share collections with your team.
- Write unit tests from day one: Add unit tests for every function that calls the API. Mock responses (`unittest.mock` in Python) for fast, cost-free CI, and use Apidog's test scripts for GUI-based coverage.
- Choose the right model: Use MiMo-V2-Pro for text/code/reasoning tasks and MiMo-V2-Omni for multimodal inputs. Don't pay Omni rates for text-only jobs.
- Stay under 256K context when possible: Pricing doubles at the 256K–1M tier. For RAG pipelines, retrieve the most relevant chunks instead of feeding full datasets.
- Leverage OpenAI SDK compatibility: Swap the `base_url` and `model` to integrate MiMo models into any OpenAI-based codebase; no new SDK is needed.
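The "stay under 256K" advice can be enforced mechanically. Below is a rough guard using a crude ~4-characters-per-token heuristic rather than MiMo's real tokenizer, so treat the boundary check as approximate.

```python
# Sketch: warn before a prompt likely crosses the 256K tier, where
# MiMo-V2-Pro rates double. Uses a rough 4-chars-per-token heuristic,
# NOT the model's actual tokenizer, so the boundary is approximate.

TIER_BOUNDARY_TOKENS = 256_000

def rough_token_count(text: str) -> int:
    """Very rough token estimate (~4 characters per token for English)."""
    return max(1, len(text) // 4)

def check_tier(prompt: str) -> str:
    """Return which MiMo-V2-Pro pricing tier a prompt likely falls into."""
    if rough_token_count(prompt) <= TIER_BOUNDARY_TOKENS:
        return "standard"   # $1 in / $3 out per 1M tokens
    return "extended"       # $2 in / $6 out per 1M tokens

print(check_tier("Summarize this short note."))  # standard
```

Wire a guard like this into your RAG pipeline before dispatch, and trim retrieved context when it would push the request into the extended tier.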
Conclusion
MiMo-V2-Pro Pricing at $1/1M input and $3/1M output is among the most cost-effective for flagship reasoning models. Omni Pricing extends this value to unified multimodal workloads (text, image, audio, video).
Whether you're testing how to use the API via Apidog or building a Python integration with unit tests, MiMo-V2-Pro and MiMo-V2-Omni fit directly into modern developer workflows. Start with Apidog for visual validation, then move to code with confidence.
Try Apidog free—no credit card required.
FAQ
What is MiMo-V2-Pro Pricing?
MiMo-V2-Pro Pricing is $1/1M input tokens and $3/1M output tokens for context up to 256K. For context between 256K and 1M tokens, it's $2/1M input and $6/1M output.
What is MiMo-V2-Omni Pricing?
Omni Pricing is comparable to MiMo-V2-Pro for text inputs. Multimodal inputs (image, audio, video) are tokenized and billed alongside text tokens. Check platform.xiaomimimo.com for the latest Omni Pricing rates.
How do I use the MiMo-V2-Pro API?
Use the OpenAI Python SDK with base_url="https://api.xiaomimimo.com/v1" and model="mimo-v2-pro". The API is fully OpenAI-compatible. Use Apidog to test it visually before writing code.
How do I write a unit test for the MiMo API?
Mock the API client with unittest.mock in Python and assert on the response structure. In Apidog, use the Tests tab to add JavaScript-based unit test assertions after each request.
What is the difference between MiMo-V2-Pro and MiMo-V2-Omni?
MiMo-V2-Pro is a text-only reasoning model with 1T parameters and a 1M token context window. MiMo-V2-Omni is a multimodal model that natively handles text, image, audio, and video.
How does MiMo-V2-Pro Pricing compare to MiMo-V2-Flash?
MiMo-V2-Flash is much cheaper at $0.10/1M input and $0.30/1M output, but MiMo-V2-Pro offers significantly stronger reasoning and a 1M token context window. Choose based on task complexity.
Where can I access the MiMo API?
The MiMo API is available at platform.xiaomimimo.com. Both MiMo-V2-Pro and MiMo-V2-Omni are also accessible via third-party providers like OpenRouter and Vercel AI Gateway.