The Kimi K2.7 Code API exposes Moonshot’s coding-tuned trillion-parameter model through an OpenAI-compatible endpoint. If your app already calls the OpenAI API, integration is mostly a base URL and model ID change. Moonshot also provides an Anthropic-compatible endpoint, so you can use it in Claude Code with environment-variable config.
This guide shows how to call Kimi K2.7 Code with curl, Python, and Node.js, how to configure editor tools, what pricing to expect, and how to test the API in Apidog before shipping it.
TL;DR
- OpenAI-compatible base URL: https://api.moonshot.ai/v1
- Anthropic-compatible base URL: https://api.moonshot.ai/anthropic
- Pay-per-token model ID: kimi-k2.7-code
- Kimi Code subscription model ID: kimi-for-coding
- Pricing: $0.95 / 1M input tokens, $4.00 / 1M output tokens, $0.19 / 1M cache-hit tokens
- Reasoning: always enabled, so responses include reasoning tokens
- Main integration path: use your existing OpenAI SDK and override the base URL
Choose the right access path
Kimi K2.7 Code has two common access patterns.
Option 1: Pay-per-token developer API
Use this when you are building scripts, backend services, agents, CI tools, or production features.
Base URL: https://api.moonshot.ai/v1
Model ID: kimi-k2.7-code
Billing: usage-based, per token
This is the path used in the code examples below.
Option 2: Kimi Code subscription
Use this for heavy interactive coding with the Kimi Code CLI and console.
Model ID: kimi-for-coding
Billing: flat-rate subscription quota refreshed every 7 days
If you are integrating Kimi into your own application, use the pay-per-token API.
Step 1: Create an API key
- Open the Kimi platform console.
- Create a new API key.
- Copy it immediately.
- Store it in an environment variable or secret manager.
For local development:
export MOONSHOT_API_KEY="sk-your-key-here"
Do not commit this key to Git, expose it in frontend code, or paste it into logs.
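As a minimal sketch of the advice above, the helper below reads the key from the environment and fails with a clear message when it is missing. The function names `load_api_key` and `redact` are hypothetical, introduced only for illustration; `redact` masks the key so that a debug log line never contains the full secret.

```python
import os


def load_api_key(env=None, name="MOONSHOT_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it or load it from your secret manager"
        )
    return key


def redact(key, visible=4):
    """Mask a key for logging, keeping only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Call `load_api_key()` once at startup so a missing key surfaces immediately rather than on the first request.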
Step 2: Send your first request with curl
Kimi K2.7 Code uses an OpenAI-style chat completions API.
curl https://api.moonshot.ai/v1/chat/completions \
  -H "Authorization: Bearer $MOONSHOT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.7-code",
    "messages": [
      {
        "role": "system",
        "content": "You are a careful senior engineer."
      },
      {
        "role": "user",
        "content": "Write a Python function that validates an email and returns a clear error message."
      }
    ]
  }'
The response follows the OpenAI chat completions shape:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
Use the usage object to track input and output token counts. Reasoning tokens are billed as output tokens, so they count toward the output side of the bill.
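To turn a usage object into a dollar figure, a rough estimator might look like the sketch below. It uses the pay-per-token prices quoted in this guide ($0.95 input, $4.00 output, $0.19 cache hit, per million tokens). Two assumptions are worth flagging: `completion_tokens` is assumed to already include reasoning tokens, and the number of cached tokens is passed in separately because this guide does not show how the API reports cache hits. `estimate_cost` is a hypothetical helper name.

```python
# Prices in USD per one million tokens, as quoted in this guide.
PRICE_PER_MILLION = {"input": 0.95, "output": 4.00, "cache_hit": 0.19}


def estimate_cost(usage, cached_tokens=0):
    """Estimate the USD cost of one response from its usage object.

    Assumption: cached tokens are a subset of prompt_tokens and are
    billed at the cache-hit rate instead of the input rate.
    """
    prompt = usage.get("prompt_tokens", 0)
    completion = usage.get("completion_tokens", 0)
    cached = max(0, min(cached_tokens, prompt))
    uncached = prompt - cached
    total = (
        uncached * PRICE_PER_MILLION["input"]
        + cached * PRICE_PER_MILLION["cache_hit"]
        + completion * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000
```

For example, one million prompt tokens plus half a million completion tokens comes to about $2.95 with no cache hits.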
Step 3: Call Kimi K2.7 Code from Python
Install the OpenAI SDK if you do not already have it:
pip install openai
Then configure the client with Moonshot’s base URL:
import os

from openai import OpenAI

# Point the standard OpenAI client at Moonshot's OpenAI-compatible endpoint.
client = OpenAI(
    api_key=os.environ["MOONSHOT_API_KEY"],
    base_url="https://api.moonshot.ai/v1",
)

resp = client.chat.completions.create(
    model="kimi-k2.7-code",
    messages=[
        {
            "role": "user",
            "content": "Refactor this loop for readability and explain why.",
        }
    ],
)

print(resp.choices[0].message.content)  # the assistant's reply
print(resp.usage)  # token counts for cost tracking
The key implementation detail is this line:
base_url="https://api.moonshot.ai/v1"
If your app already uses the OpenAI SDK, you can usually reuse the same request structure and change only the provider config.
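One way to keep the provider switch in a single place is a small settings table, sketched below. The `PROVIDERS` table and `client_settings` helper are hypothetical names, and the OpenAI model ID is a placeholder you would replace with whatever your app uses today.

```python
# Per-provider settings; only these values differ between providers.
PROVIDERS = {
    "openai": {
        "base_url": "https://api.openai.com/v1",
        "key_env": "OPENAI_API_KEY",
        "model": "your-openai-model-id",  # placeholder, not a real model ID
    },
    "moonshot": {
        "base_url": "https://api.moonshot.ai/v1",
        "key_env": "MOONSHOT_API_KEY",
        "model": "kimi-k2.7-code",
    },
}


def client_settings(provider, env):
    """Return the api_key, base_url, and model for the chosen provider."""
    cfg = PROVIDERS[provider]
    return {
        "api_key": env[cfg["key_env"]],
        "base_url": cfg["base_url"],
        "model": cfg["model"],
    }
```

Pass `api_key` and `base_url` to the `OpenAI(...)` constructor and `model` to `chat.completions.create(...)`; the rest of the request code stays the same.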
Step 4: Call Kimi K2.7 Code from Node.js
Install the SDK:
npm install openai
Create a client with Moonshot’s base URL:
import OpenAI from "openai";

// Point the standard OpenAI client at Moonshot's OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.MOONSHOT_API_KEY,
  baseURL: "https://api.moonshot.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "kimi-k2.7-code",
  messages: [
    {
      role: "user",
      content: "Write a Jest test for an empty-input edge case.",
    },
  ],
});

console.log(resp.choices[0].message.content); // the assistant's reply
console.log(resp.usage); // token counts for cost tracking
For server-side apps, keep MOONSHOT_API_KEY on the backend. Do not expose it to browser clients.
Model behavior to account for
Reasoning is always enabled
Kimi K2.7 Code forces thinking on. That means every response includes reasoning tokens, and those tokens are billed as output tokens.
Compared with K2.6, K2.7 Code uses about 30% fewer reasoning tokens, so the output-token bill should be lighter for the same type of work.
Tool calling uses the OpenAI pattern
You can pass a tools array with function schemas. The model returns tool-call objects, your application executes them, and you feed the results back into the conversation.
That makes it suitable for agent workflows such as:
- reading files
- calling internal APIs
- running tests
- creating pull-request summaries
- chaining multi-step coding tasks
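The execute-and-feed-back step of that loop can be sketched without any network calls. The example below defines one tool schema in the OpenAI `tools` format and a dispatcher that turns a tool-call object into the `role: "tool"` message you append to the conversation. The `count_lines` tool and the `execute_tool_call` helper are hypothetical, and the tool call is handled here as a plain dict mirroring the JSON response; the OpenAI SDK exposes the same fields as object attributes.

```python
import json

# Tool schema passed as `tools=TOOLS` in the chat completions request.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "count_lines",
            "description": "Count the number of lines in a text snippet.",
            "parameters": {
                "type": "object",
                "properties": {"text": {"type": "string"}},
                "required": ["text"],
            },
        },
    }
]


def count_lines(text):
    """Hypothetical local tool: return how many lines the text has."""
    return len(text.splitlines())


REGISTRY = {"count_lines": count_lines}


def execute_tool_call(tool_call, registry=REGISTRY):
    """Run one tool call and build the tool message to send back."""
    name = tool_call["function"]["name"]
    try:
        # Arguments arrive as a JSON-encoded string, not a dict.
        args = json.loads(tool_call["function"]["arguments"] or "{}")
        content = json.dumps({"ok": True, "result": registry[name](**args)})
    except Exception as exc:
        # Report failures to the model instead of crashing the agent loop.
        content = json.dumps({"ok": False, "error": f"{type(exc).__name__}: {exc}"})
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": content}
```

In a full loop you would append the assistant message containing the tool calls, then one tool message per call, and send the conversation back until the model replies without requesting another tool.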
Multimodal input is supported
You can include image content in the messages array for tasks such as:
- debugging a UI screenshot
- reading an error dialog
- interpreting a diagram
- reviewing visual test output
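A user message carrying an image might be built as in the sketch below. It uses the OpenAI-style content-part layout (`text` plus `image_url` with a base64 data URL). That Moonshot accepts exactly this shape is an assumption based on the endpoint being OpenAI-compatible, so check the platform documentation before relying on it. `image_message` is a hypothetical helper name.

```python
import base64


def image_message(prompt, image_bytes, mime="image/png"):
    """Build a user message with a text part and an inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                # Assumed format: an OpenAI-style base64 data URL.
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            },
        ],
    }
```

Read the screenshot with `open(path, "rb").read()` and add the returned dict to the `messages` list like any other message.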
Pricing
Pay-per-token pricing:
| Token type | Price per million |
|---|---|
| Input | $0.95 |
| Output, including reasoning | $4.00 |
| Cache hit | $0.19 |
Two practical ways to reduce cost:
- Reuse stable context. Repeated system prompts and shared context can benefit from cheaper cache-hit pricing.
- Track completion tokens. Reasoning is always enabled, so set budgets around output tokens.
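One possible way to enforce an output budget across a multi-step task is a small running counter, sketched below. The `OutputBudget` class and `BudgetExceeded` exception are hypothetical names; the sketch assumes you call `record` with the usage object of every response.

```python
class BudgetExceeded(RuntimeError):
    """Raised when cumulative output tokens pass the configured limit."""


class OutputBudget:
    """Track completion tokens across calls and stop when over budget."""

    def __init__(self, max_completion_tokens):
        self.max_completion_tokens = max_completion_tokens
        self.used = 0

    def record(self, usage):
        """Add one response's completion tokens; return tokens remaining."""
        self.used += usage.get("completion_tokens", 0)
        if self.used > self.max_completion_tokens:
            raise BudgetExceeded(
                f"used {self.used} of {self.max_completion_tokens} output tokens"
            )
        return self.max_completion_tokens - self.used
```

Catching `BudgetExceeded` in an agent loop gives you one place to stop, summarize, or ask for confirmation before spending more.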
For more cost-control patterns, see the guide on reducing agent token costs. For comparison with other models, see the Kimi K2.6 API guide and the DeepSeek V4 API guide.
Use Kimi K2.7 Code in Claude Code, Cline, RooCode, or Cursor
You can also use Kimi K2.7 Code inside coding tools without writing your own client.
Claude Code
Use the Anthropic-compatible endpoint:
export ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic"
export ANTHROPIC_AUTH_TOKEN="$MOONSHOT_API_KEY"
export ANTHROPIC_MODEL="kimi-k2.7-code"
After setting these variables, Claude Code can route requests to Kimi K2.7 Code.
Cline and RooCode
Configure the provider manually:
- Select Moonshot as the provider.
- Use the api.moonshot.ai endpoint.
- Paste your Moonshot API key.
- Select kimi-k2.7-code as the model.
- Disable the browser tool for the smoothest run.
Cursor
Add Kimi as an OpenAI-compatible custom endpoint:
Base URL: https://api.moonshot.ai/v1
Model: kimi-k2.7-code
API key: your Moonshot API key
The setup is similar to the Kimi-in-Cursor guide; only the model ID changes.
Test and debug the API in Apidog
Before connecting Kimi K2.7 Code to an agent or production workflow, test the raw API response. Apidog gives you a visual workspace for sending requests, inspecting JSON, and creating reusable tests.
Create the request
Create a new request:
Method: POST
URL: https://api.moonshot.ai/v1/chat/completions
Add headers:
Authorization: Bearer {{MOONSHOT_API_KEY}}
Content-Type: application/json
Store MOONSHOT_API_KEY as an Apidog environment variable instead of hardcoding it.
Use this request body:
{
  "model": "kimi-k2.7-code",
  "messages": [
    {
      "role": "system",
      "content": "You are a careful senior engineer."
    },
    {
      "role": "user",
      "content": "Write a small Node.js function that validates an email address."
    }
  ]
}
Add basic assertions
Turn the request into a regression test with checks like:
- status code is 200
- choices[0].message.content is not empty
- usage.completion_tokens stays below your budget
- usage.total_tokens exists
This gives you a repeatable test you can run after changing prompts, tools, or model configuration.
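If you also want these checks in a script or CI job, the same four assertions can be sketched in plain Python. This is not Apidog's assertion syntax; `check_response` is a hypothetical helper that takes the HTTP status code and the parsed JSON body and returns a list of failures.

```python
def check_response(status_code, body, max_completion_tokens):
    """Return a list of failed checks; an empty list means all passed."""
    failures = []

    if status_code != 200:
        failures.append(f"expected status 200, got {status_code}")

    choices = body.get("choices") or []
    message = choices[0].get("message", {}) if choices else {}
    content = message.get("content") or ""
    if not content.strip():
        failures.append("choices[0].message.content is empty")

    usage = body.get("usage") or {}
    if "total_tokens" not in usage:
        failures.append("usage.total_tokens is missing")

    completion = usage.get("completion_tokens")
    if completion is None:
        failures.append("usage.completion_tokens is missing")
    elif completion >= max_completion_tokens:
        failures.append(
            f"usage.completion_tokens {completion} is not below budget {max_completion_tokens}"
        )

    return failures
```

Returning every failure at once, rather than stopping at the first, makes it easier to see how a prompt or configuration change affected the response.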
If you are testing MCP-based tool calls, the MCP server testing playbook shows how to catch broken tool contracts. You can also download Apidog to set up the workflow locally.
FAQ
What is the API base URL?
Use:
https://api.moonshot.ai/v1
for OpenAI-compatible calls.
Use:
https://api.moonshot.ai/anthropic
for Claude Code and other Anthropic-compatible clients.
Which model ID should I use?
For the pay-per-token API:
kimi-k2.7-code
For the Kimi Code subscription:
kimi-for-coding
Is Kimi K2.7 Code OpenAI-compatible?
Yes. The request and response format matches OpenAI chat completions, so existing OpenAI SDK-based code can work after changing the base URL and model ID.
How much does it cost?
The pay-per-token pricing is:
- $0.95 per million input tokens
- $4.00 per million output tokens
- $0.19 per million cache-hit tokens
Do I always pay for reasoning tokens?
Yes. Thinking is forced on, so every response includes reasoning tokens billed at the output-token rate. K2.7 Code still uses about 30% fewer reasoning tokens than K2.6.
Can I send images?
Yes. The model is multimodal and supports image content in the messages array.
Summary
Kimi K2.7 Code is straightforward to integrate if you already use OpenAI-compatible clients. Set the base URL to https://api.moonshot.ai/v1, use the model ID kimi-k2.7-code, and track usage because reasoning tokens are always enabled. For editor workflows, configure Claude Code, Cline, RooCode, or Cursor with the same API key and model. Before shipping, test the request and token usage in Apidog so you can validate responses, enforce budgets, and catch regressions.
