Hassann

Posted on • Originally published at apidog.com

How to Use the Kimi K2.7 Code API

The Kimi K2.7 Code API exposes Moonshot’s coding-tuned trillion-parameter model through an OpenAI-compatible endpoint. If your app already calls the OpenAI API, integration is mostly a base URL and model ID change. Moonshot also provides an Anthropic-compatible endpoint, so you can use it in Claude Code with environment-variable config.


This guide shows how to call Kimi K2.7 Code with curl, Python, and Node.js, how to configure editor tools, what pricing to expect, and how to test the API in Apidog before shipping it.

TL;DR

  • OpenAI-compatible base URL: https://api.moonshot.ai/v1
  • Anthropic-compatible base URL: https://api.moonshot.ai/anthropic
  • Pay-per-token model ID: kimi-k2.7-code
  • Kimi Code subscription model ID: kimi-for-coding
  • Pricing: $0.95 / 1M input tokens, $4.00 / 1M output tokens, $0.19 / 1M cache-hit tokens
  • Reasoning: always enabled, so responses include reasoning tokens
  • Main integration path: use your existing OpenAI SDK and override the base URL

Choose the right access path

Kimi K2.7 Code has two common access patterns.

Option 1: Pay-per-token developer API

Use this when you are building scripts, backend services, agents, CI tools, or production features.

Base URL: https://api.moonshot.ai/v1
Model ID: kimi-k2.7-code
Billing: usage-based, per token

This is the path used in the code examples below.

Option 2: Kimi Code subscription

Use this for heavy interactive coding with the Kimi Code CLI and console.

Model ID: kimi-for-coding
Billing: flat-rate subscription quota refreshed every 7 days

If you are integrating Kimi into your own application, use the pay-per-token API.

Step 1: Create an API key

  1. Open the Kimi platform console.
  2. Create a new API key.
  3. Copy it immediately.
  4. Store it in an environment variable or secret manager.

For local development:

export MOONSHOT_API_KEY="sk-your-key-here"

Do not commit this key to Git, expose it in frontend code, or paste it into logs.
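In application code, it helps to fail fast when the key is missing and to mask it before it reaches any log line. A minimal sketch, assuming the MOONSHOT_API_KEY variable from above; load_api_key and mask_key are hypothetical helper names, not part of any SDK:

```python
import os


def load_api_key(env_var: str = "MOONSHOT_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell or load it "
            "from your secret manager before starting the app."
        )
    return key


def mask_key(key: str, visible: int = 4) -> str:
    """Mask a key for logging so the full value never reaches log output."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Call load_api_key() once at startup and pass the result to your client. If you need to confirm which key is in use, log mask_key(key) instead of the key itself.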

Step 2: Send your first request with curl

Kimi K2.7 Code uses an OpenAI-style chat completions API.

curl https://api.moonshot.ai/v1/chat/completions \
  -H "Authorization: Bearer $MOONSHOT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.7-code",
    "messages": [
      {
        "role": "system",
        "content": "You are a careful senior engineer."
      },
      {
        "role": "user",
        "content": "Write a Python function that validates an email and returns a clear error message."
      }
    ]
  }'

The response follows the OpenAI chat completions shape:

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "..."
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}

Use the usage object to track cost: prompt_tokens is billed at the input rate, and completion_tokens, which includes reasoning tokens, is billed at the output rate.
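If you make many calls, you can keep running totals of the usage fields. This is a minimal sketch that works on the parsed JSON usage object shown above; UsageTracker is a hypothetical helper, not something the API or SDK provides:

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Running totals of the token counts reported in each response."""

    prompt_tokens: int = 0
    completion_tokens: int = 0
    calls: int = 0

    def add(self, usage: dict) -> None:
        """Add one response's usage object (parsed JSON) to the totals."""
        self.prompt_tokens += int(usage.get("prompt_tokens", 0))
        self.completion_tokens += int(usage.get("completion_tokens", 0))
        self.calls += 1

    @property
    def total_tokens(self) -> int:
        """Sum of input and output tokens across all recorded calls."""
        return self.prompt_tokens + self.completion_tokens
```

Call tracker.add(...) with the usage dictionary from each response, then read the totals at the end of a job or request batch.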

Step 3: Call Kimi K2.7 Code from Python

Install the OpenAI SDK if you do not already have it:

pip install openai

Then configure the client with Moonshot’s base URL:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["MOONSHOT_API_KEY"],
    base_url="https://api.moonshot.ai/v1",
)

resp = client.chat.completions.create(
    model="kimi-k2.7-code",
    messages=[
        {
            "role": "user",
            "content": "Refactor this loop for readability and explain why."
        }
    ],
)

print(resp.choices[0].message.content)
print(resp.usage)

The key implementation detail is this line:

base_url="https://api.moonshot.ai/v1"

If your app already uses the OpenAI SDK, you can usually reuse the same request structure and change only the provider config.
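One way to keep that change in a single place is a small provider config object. This is a sketch, not a required pattern; ProviderConfig and MOONSHOT are hypothetical names, and the values are the ones used in this guide:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderConfig:
    """Everything that differs between OpenAI-compatible providers."""

    base_url: str
    api_key_env: str
    model: str

    def client_kwargs(self) -> dict:
        """Keyword arguments for the OpenAI client constructor."""
        return {
            "api_key": os.environ[self.api_key_env],
            "base_url": self.base_url,
        }


# Values taken from this guide.
MOONSHOT = ProviderConfig(
    base_url="https://api.moonshot.ai/v1",
    api_key_env="MOONSHOT_API_KEY",
    model="kimi-k2.7-code",
)
```

You would then build the client with OpenAI(**MOONSHOT.client_kwargs()) and pass MOONSHOT.model as the model argument, leaving the rest of the request code untouched.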

Step 4: Call Kimi K2.7 Code from Node.js

Install the SDK:

npm install openai

Create a client with Moonshot’s base URL:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.MOONSHOT_API_KEY,
  baseURL: "https://api.moonshot.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "kimi-k2.7-code",
  messages: [
    {
      role: "user",
      content: "Write a Jest test for an empty-input edge case.",
    },
  ],
});

console.log(resp.choices[0].message.content);
console.log(resp.usage);

For server-side apps, keep MOONSHOT_API_KEY on the backend. Do not expose it to browser clients.

Model behavior to account for

Reasoning is always enabled

Kimi K2.7 Code forces thinking on. That means every response includes reasoning tokens, and those tokens are billed as output tokens.

Compared with K2.6, K2.7 Code uses about 30% fewer reasoning tokens, so the output-token bill should be lighter for the same type of work.

Tool calling uses the OpenAI pattern

You can pass a tools array with function schemas. The model returns tool-call objects, your application executes them, and you feed the results back into the conversation.

That makes it suitable for agent workflows such as:

  • reading files
  • calling internal APIs
  • running tests
  • creating pull-request summaries
  • chaining multi-step coding tasks
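The execute-and-feed-back step can be sketched in Python as follows. It assumes the tool calls arrive in the OpenAI chat completions shape (an id plus function.name and function.arguments as a JSON string). The run_tests tool, TOOL_REGISTRY, and execute_tool_calls are hypothetical names for illustration:

```python
import json


# Hypothetical tool for illustration; a real one would run your test suite.
def run_tests(path: str) -> dict:
    return {"path": path, "passed": True}


TOOL_REGISTRY = {"run_tests": run_tests}

# Function schema in the OpenAI "tools" format, passed as tools=TOOLS.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "run_tests",
            "description": "Run the test suite under a directory.",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    }
]


def execute_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run each requested tool and build the messages to feed back."""
    results = []
    for call in tool_calls:
        name = call["function"]["name"]
        try:
            args = json.loads(call["function"]["arguments"] or "{}")
            output = TOOL_REGISTRY[name](**args)
        except Exception as exc:
            # Report the failure to the model instead of crashing the loop.
            output = {"error": f"{type(exc).__name__}: {exc}"}
        results.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(output),
            }
        )
    return results
```

Append the assistant message that contained the tool calls to your messages list, then the messages returned by execute_tool_calls, and send the conversation again. Repeat until the model answers without requesting a tool.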

Multimodal input is supported

You can include image content in the messages array for tasks such as:

  • debugging a UI screenshot
  • reading an error dialog
  • interpreting a diagram
  • reviewing visual test output
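A message with an image might be built like this. The sketch assumes the endpoint accepts OpenAI-style image_url content parts carrying a base64 data URL; this guide does not spell out the exact image format, so check the platform documentation before relying on it. image_message is a hypothetical helper name:

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a user message with one image and one text part.

    Assumption: the API accepts OpenAI-style "image_url" parts with a
    base64 data URL.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            },
            {"type": "text", "text": prompt},
        ],
    }
```

Read the screenshot with open(path, "rb").read(), pass the bytes to image_message, and add the result to the messages array like any other user message.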

Pricing

Pay-per-token pricing:

Token type                     Price per 1M tokens
Input                          $0.95
Output (including reasoning)   $4.00
Cache hit                      $0.19

Two practical ways to reduce cost:

  1. Reuse stable context. Repeated system prompts and shared context can benefit from cheaper cache-hit pricing.
  2. Track completion tokens. Reasoning is always enabled, so set budgets around output tokens.
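To turn token counts into a dollar estimate, you can apply the rates above directly. This sketch assumes cache-hit tokens are a subset of the prompt tokens and replace the input rate for that portion. How the API reports the cached count is not covered in this guide, so cached_tokens is a plain parameter here, and estimate_cost is a hypothetical helper:

```python
from decimal import Decimal

# USD per one million tokens, as listed in this guide.
PRICE_INPUT = Decimal("0.95")
PRICE_OUTPUT = Decimal("4.00")
PRICE_CACHE_HIT = Decimal("0.19")
MILLION = Decimal(1_000_000)


def estimate_cost(
    prompt_tokens: int, completion_tokens: int, cached_tokens: int = 0
) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: cached_tokens is the part of prompt_tokens served from
    cache, billed at the cache-hit rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached = prompt_tokens - cached_tokens
    return (
        uncached * PRICE_INPUT
        + cached_tokens * PRICE_CACHE_HIT
        + completion_tokens * PRICE_OUTPUT
    ) / MILLION
```

For example, a request with 10,000 prompt tokens and 2,000 completion tokens comes to about $0.0175 with no cache hits, and about $0.0114 if 8,000 of the prompt tokens hit the cache. Output tokens dominate the bill in both cases, which is why the completion-token budget matters.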

For more cost-control patterns, see the guide on reducing agent token costs. For comparison with other models, see the Kimi K2.6 API guide and the DeepSeek V4 API guide.

Use Kimi K2.7 Code in Claude Code, Cline, RooCode, or Cursor

You can also use Kimi K2.7 Code inside coding tools without writing your own client.

Claude Code

Use the Anthropic-compatible endpoint:

export ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic"
export ANTHROPIC_AUTH_TOKEN="$MOONSHOT_API_KEY"
export ANTHROPIC_MODEL="kimi-k2.7-code"

After setting these variables, Claude Code can route requests to Kimi K2.7 Code.

Cline and RooCode

Configure the provider manually:

  1. Select Moonshot as the provider.
  2. Use the api.moonshot.ai endpoint.
  3. Paste your Moonshot API key.
  4. Select kimi-k2.7-code as the model.
  5. Disable the browser tool for the smoothest run.

Cursor

Add Kimi as an OpenAI-compatible custom endpoint:

Base URL: https://api.moonshot.ai/v1
Model: kimi-k2.7-code
API key: your Moonshot API key

The setup is similar to the Kimi-in-Cursor guide; only the model ID changes.

Test and debug the API in Apidog

Before connecting Kimi K2.7 Code to an agent or production workflow, test the raw API response. Apidog gives you a visual workspace for sending requests, inspecting JSON, and creating reusable tests.

Create the request

Create a new request:

Method: POST
URL: https://api.moonshot.ai/v1/chat/completions

Add headers:

Authorization: Bearer {{MOONSHOT_API_KEY}}
Content-Type: application/json

Store MOONSHOT_API_KEY as an Apidog environment variable instead of hardcoding it.

Use this request body:

{
  "model": "kimi-k2.7-code",
  "messages": [
    {
      "role": "system",
      "content": "You are a careful senior engineer."
    },
    {
      "role": "user",
      "content": "Write a small Node.js function that validates an email address."
    }
  ]
}

Add basic assertions

Turn the request into a regression test with checks like:

  • status code is 200
  • choices[0].message.content is not empty
  • usage.completion_tokens stays below your budget
  • usage.total_tokens exists

This gives you a repeatable test you can run after changing prompts, tools, or model configuration.
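If you also want the same checks outside Apidog, for example in a CI script, they can be expressed as a small Python function over the status code and parsed JSON body. This is a sketch; check_response is a hypothetical helper, and the budget value is yours to choose:

```python
def check_response(status_code: int, body: dict, max_completion_tokens: int) -> list[str]:
    """Return a list of failed checks; an empty list means the response passed."""
    failures = []

    if status_code != 200:
        failures.append(f"status code is {status_code}, expected 200")

    choices = body.get("choices") or []
    content = choices[0].get("message", {}).get("content") if choices else None
    if not content:
        failures.append("choices[0].message.content is empty")

    usage = body.get("usage") or {}
    if "total_tokens" not in usage:
        failures.append("usage.total_tokens is missing")

    completion = usage.get("completion_tokens")
    if completion is None or completion >= max_completion_tokens:
        failures.append(
            f"usage.completion_tokens ({completion}) is not below "
            f"the budget of {max_completion_tokens}"
        )

    return failures
```

Fail the job when the returned list is not empty, and print the messages so the broken check is visible in the log.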

If you are testing MCP-based tool calls, the MCP server testing playbook shows how to catch broken tool contracts. You can also download Apidog to set up the workflow locally.

FAQ

What is the API base URL?

Use:

https://api.moonshot.ai/v1

for OpenAI-compatible calls.

Use:

https://api.moonshot.ai/anthropic

for Claude Code and other Anthropic-compatible clients.

Which model ID should I use?

For the pay-per-token API:

kimi-k2.7-code

For the Kimi Code subscription:

kimi-for-coding

Is Kimi K2.7 Code OpenAI-compatible?

Yes. The request and response format matches OpenAI chat completions, so existing OpenAI SDK-based code can work with a base URL change.

How much does it cost?

The pay-per-token pricing is:

  • $0.95 per million input tokens
  • $4.00 per million output tokens
  • $0.19 per million cache-hit tokens

Do I always pay for reasoning tokens?

Yes. Thinking is forced on, so every response includes reasoning tokens billed at the output-token rate. K2.7 Code still uses about 30% fewer reasoning tokens than K2.6.

Can I send images?

Yes. The model is multimodal and supports image content in the messages array.

Summary

Kimi K2.7 Code is straightforward to integrate if you already use OpenAI-compatible clients. Set the base URL to https://api.moonshot.ai/v1, use the model ID kimi-k2.7-code, and track usage because reasoning tokens are always enabled. For editor workflows, configure Claude Code, Cline, RooCode, or Cursor with the same API key and model. Before shipping, test the request and token usage in Apidog so you can validate responses, enforce budgets, and catch regressions.
