Hassann

Posted on May 19 • Originally published at apidog.com

Cursor Composer 2.5: What It Is, How to Use It, and How to Access It

Cursor shipped Composer 2.5 on May 18, 2026. The headline for developers is simple: Cursor reports a coding model that lands close to Opus 4.7 and GPT-5.5 on most of its published coding benchmarks while averaging under a dollar per task. If you use AI agents for production code, that price-to-quality ratio changes which tasks are worth delegating.


This guide shows how to evaluate and use Composer 2.5 in practice: what the model is, how to enable it in Cursor, how the pricing works, and how to pair it with Apidog so generated API code is based on your real API contract instead of guesses.

What is Cursor Composer 2.5?

Composer 2.5 is Cursor’s agentic coding model. It is designed to work inside the Cursor editor, where it can:

Plan multi-step coding tasks
Read and edit files
Run terminal commands
Use available tools
Verify changes against tests or command output

It succeeds Composer 2 and moves the workflow from “AI autocomplete” toward “AI agent that can complete repository-level tasks.”

Key facts:

It is built on the open-source Moonshot Kimi K2.5 checkpoint, a roughly one-trillion-parameter base model.
Cursor put about 85% of the training compute budget into post-training and reinforcement learning.
It trained on 25x more synthetic tasks than Composer 2, including tasks where Cursor deletes a feature and the model must rebuild it until tests pass.

For day-to-day work, the important change is consistency. Composer 2 was fast but could drift during long tasks. Composer 2.5 is designed to maintain context across longer sessions, follow multi-step instructions more reliably, and better estimate how much work a request requires.

If you want more background on the previous generation, the Composer 2 guide explains the baseline that 2.5 improves on.

What changed under the hood?

Three training changes matter for developers using it:

Targeted RL with textual feedback

Instead of rewarding only the final result, Cursor adds short feedback hints into local context and distills the improved behavior back into the model. This helps the model avoid unavailable tools and recover from mistakes.

More synthetic repository tasks

Composer 2.5 trained on 25x more synthetic tasks than Composer 2. These tasks are closer to real repo work because they are verified by tests, not only by generated examples.

Faster training infrastructure

Cursor used a sharded Muon optimizer with dual-mesh HSDP, enabling a 0.2-second optimizer step while training a 1T-parameter model. You do not interact with this directly, but faster training loops let Cursor iterate more on model quality.

The practical result: Composer 2.5 is better suited for messy, multi-file tasks such as implementing features, updating tests, and fixing regressions.

Composer 2.5 benchmarks

Cursor reports Composer 2.5 against Opus 4.7 and GPT-5.5 on three benchmark suites:

Benchmark	Composer 2.5	Opus 4.7	GPT-5.5
SWE-bench Multilingual	79.8%	80.5%	77.8%
Terminal-Bench 2.0	69.3%	69.4%	82.7%
CursorBench v3.1	63.2%	64.8% (max) / 61.6% (default)	59.2% (default)

How to read those numbers:

On SWE-bench Multilingual, Composer 2.5 reaches 79.8%, close to Opus 4.7 and ahead of GPT-5.5.
On CursorBench v3.1, it beats Opus 4.7’s default setting.
On Terminal-Bench 2.0, GPT-5.5 leads clearly at 82.7%.

If your work is mostly editor-based agent tasks, Composer 2.5 is competitive with frontier models. If your workflow depends heavily on long terminal-only sequences, Terminal-Bench suggests GPT-5.5 may still have an edge.

The cost is the bigger story. Cursor reports roughly 63% on CursorBench at an average cost of under $1 per task. Opus 4.7 and GPT-5.5 can cost several dollars per task for scores in the same range, with some comparisons putting competitor costs as high as eleven dollars. Independent coverage from The Decoder reached the same conclusion: near-frontier coding quality at a much lower price.

Composer 2.5 pricing

Cursor offers two Composer 2.5 variants:

Variant	Input	Output	When to use it
Standard	$0.50 / M tokens	$2.50 / M tokens	Cost-efficient choice for background agent work
Fast	$3.00 / M tokens	$15.00 / M tokens	Use when latency matters during live iteration

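The fast variant has the same intelligence but lower latency. It is also the variant selected by default in the product, so switch to standard explicitly when cost matters more than response time.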
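To make the per-token rates concrete, here is a minimal sketch that estimates the cost of one task from the table above. The token counts in the example are hypothetical, and the function ignores any included usage pool or promotional allowance.

```typescript
type Variant = "standard" | "fast";

// USD per million tokens, copied from the pricing table above.
const PRICES: Record<Variant, { input: number; output: number }> = {
  standard: { input: 0.5, output: 2.5 },
  fast: { input: 3.0, output: 15.0 },
};

// Estimate the API-rate cost of a single task in USD.
function estimateCostUSD(
  variant: Variant,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICES[variant];
  return (
    (inputTokens / 1_000_000) * price.input +
    (outputTokens / 1_000_000) * price.output
  );
}

// Hypothetical task: 200k input tokens and 20k output tokens.
console.log(estimateCostUSD("standard", 200_000, 20_000).toFixed(2)); // "0.15"
console.log(estimateCostUSD("fast", 200_000, 20_000).toFixed(2)); // "0.90"
```

At these rates the fast variant costs six times as much as standard for the same token counts, which is the trade-off to weigh against the lower latency.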

Billing depends on your Cursor plan:

Individual plans include a standalone Composer usage pool, so many solo developers may not hit per-token billing during normal use.
Team and enterprise plans are charged at the API rate directly.
Launch promo: Cursor doubled Composer 2.5 usage for the first week after release.

For more pricing details, see the Cursor Composer pricing guide. If you want to test Composer without paying immediately, the Composer for free walkthrough explains the included-usage path.

How to access Cursor Composer 2.5

Set up Composer 2.5 in Cursor with this checklist:

Update Cursor

Open Cursor and check for updates:
- macOS: Cursor menu
- Windows/Linux: Help menu

Restart Cursor after installing the update.

Sign in to a supported plan

Pro and Business plans include Composer usage. Free accounts may be able to try it through included allowances, but sustained usage requires a paid plan.

Open the model picker

Start a chat or agent session, then open the model dropdown.

Select composer-2.5

The fast variant is usually selected by default.

Use Agent mode

Composer is built for agent work. Use Agent mode instead of plain chat so it can edit files, run commands, and use tools.

The official Composer 2.5 model docs list the current defaults if Cursor changes them.

If you are new to Cursor’s agent workflow, the Cursor 2.0 overview explains the editor surface.

How to use Composer 2.5 effectively

Composer 2.5 works best when you give it a clear task, clear constraints, and a way to verify the result.

1. Give it repository-level tasks

Avoid prompts like:

Fix this function.

Use a concrete task with a clear end state:

Add cursor-based pagination to the GET /orders endpoint.

Requirements:
- Accept limit and cursor query parameters.
- Return items and nextCursor.
- Preserve existing response fields.
- Add unit tests for valid pagination, missing cursor, and invalid cursor.
- Run the relevant test suite and fix failures.

Composer 2.5 is trained for sustained multi-step work, so let it plan, edit, run tests, and iterate.
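For reference when reviewing what the agent produces, here is a minimal in-memory sketch of the pagination logic that prompt describes. It assumes the cursor is simply the `id` of the last item on the previous page and that the list is already in a stable order. The `Order` shape, `paginate`, and `InvalidCursorError` are hypothetical names for illustration, not part of any real API.

```typescript
// Hypothetical order shape; a real one would come from your codebase.
interface Order {
  id: string;
  total: number;
}

interface Page<T> {
  items: T[];
  nextCursor: string | null; // null means there are no more pages
}

class InvalidCursorError extends Error {
  constructor(cursor: string) {
    super(`Unknown cursor: ${cursor}`);
    this.name = "InvalidCursorError";
  }
}

// Return up to `limit` items that follow the item identified by `cursor`.
// Assumes `sorted` is in a stable order and `limit` was validated as >= 1.
function paginate<T extends { id: string }>(
  sorted: T[],
  limit: number,
  cursor?: string,
): Page<T> {
  let start = 0;
  if (cursor !== undefined) {
    const index = sorted.findIndex((item) => item.id === cursor);
    if (index === -1) throw new InvalidCursorError(cursor);
    start = index + 1;
  }
  const items = sorted.slice(start, start + limit);
  const hasMore = start + limit < sorted.length;
  const last = items[items.length - 1];
  return { items, nextCursor: hasMore && last ? last.id : null };
}

const orders: Order[] = [
  { id: "o1", total: 10 },
  { id: "o2", total: 20 },
  { id: "o3", total: 30 },
];
const firstPage = paginate(orders, 2);
console.log(firstPage.nextCursor); // "o2"
console.log(paginate(orders, 2, "o2").nextCursor); // null
```

The three test cases in the prompt map directly onto this sketch: a valid cursor, a missing cursor (first page), and an unknown cursor that raises an error. A database-backed implementation would usually encode the cursor opaquely and filter in the query instead of scanning an array.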

2. Define the success condition

The model performs better when it knows how the task will be judged.

Example:

Done means:
- All existing tests still pass.
- New tests cover success and validation failures.
- Invalid limit returns 422.
- The OpenAPI schema is updated.
- No unrelated files are modified.

This gives the agent a target for self-correction.
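A success condition such as "Invalid limit returns 422" is easiest to verify when it maps to one small, testable function. The sketch below is one possible shape for that check. The default of 20, the maximum of 100, and the `parseLimit` name are assumptions for illustration only.

```typescript
type LimitResult =
  | { ok: true; limit: number }
  | { ok: false; status: 422; message: string };

// Parse the raw `limit` query parameter.
// The default and maximum values here are assumed, not taken from any spec.
function parseLimit(
  raw: string | undefined,
  defaultLimit = 20,
  maxLimit = 100,
): LimitResult {
  if (raw === undefined) return { ok: true, limit: defaultLimit };
  // Accept plain decimal digits only: no signs, decimals, or whitespace.
  if (!/^\d+$/.test(raw)) {
    return { ok: false, status: 422, message: "limit must be a positive integer" };
  }
  const value = Number(raw);
  if (value < 1 || value > maxLimit) {
    return {
      ok: false,
      status: 422,
      message: `limit must be between 1 and ${maxLimit}`,
    };
  }
  return { ok: true, limit: value };
}

console.log(parseLimit("10")); // { ok: true, limit: 10 }
console.log(parseLimit("abc").ok); // false
```

Returning a result object instead of throwing keeps the HTTP status next to the validation rule, so both the agent and a reviewer can check the 422 requirement in a single unit test.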

3. Choose the right variant

Use the variants intentionally:

Standard: batch work, background refactors, lower cost
Fast: live pair-programming, debugging loops, interactive edits

The quality is the same. You are choosing between latency and cost.

4. Keep the model grounded in real project context

AI agents are strongest when they can inspect the real codebase, tests, and API specs. They are weakest when they must infer hidden details.

For example, if you ask:

Write a TypeScript client for our payments API.

Composer may generate clean code, but it might assume the wrong endpoint path, auth scheme, field names, or error shape.

A better prompt is:

Using the API spec available through MCP, generate a TypeScript client for the payments API.

Requirements:
- Use the real endpoint paths and schemas.
- Include typed request and response objects.
- Add tests for success and error responses.
- Do not invent fields that are not in the spec.
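As a rough picture of the output to expect, here is a sketch of a typed client method with explicit request, response, and error types. Everything specific in it is an assumption for illustration: the `/payments` path, the bearer-token auth, and every field name are placeholders for whatever your real spec defines.

```typescript
// Hypothetical shapes: in practice these come from the spec, not from guesses.
interface CreatePaymentRequest {
  amount: number;
  currency: string;
}

interface Payment {
  id: string;
  amount: number;
  currency: string;
  status: string;
}

interface ApiError {
  code: string;
  message: string;
}

type Result<T> =
  | { ok: true; data: T }
  | { ok: false; status: number; error: ApiError };

// Minimal subset of a fetch-style API, so a stub can be injected in tests.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type HttpClient = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<HttpResponse>;

async function createPayment(
  http: HttpClient,
  baseUrl: string,
  token: string,
  request: CreatePaymentRequest,
): Promise<Result<Payment>> {
  const response = await http(`${baseUrl}/payments`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`, // auth scheme is an assumption
    },
    body: JSON.stringify(request),
  });
  const payload = await response.json();
  if (!response.ok) {
    // The casts stand in for real validation against the documented schema.
    return { ok: false, status: response.status, error: payload as ApiError };
  }
  return { ok: true, data: payload as Payment };
}
```

Injecting the HTTP function keeps the success and error tests free of network calls. When reviewing agent output, compare each path, header, and field against the spec; those are exactly the details a model tends to guess when the contract is not in context.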

Composer 2.5 plus your API workflow

Most production coding tasks touch APIs. The fastest way to reduce AI-generated bugs is to connect the agent to your actual API contract and then verify the generated calls.

Use this workflow:

Connect the API spec to Cursor

The Apidog MCP server connects your Apidog API specification directly to Cursor. Composer 2.5 can then generate request code, types, and tests from your real schema.

Ask Composer to implement against the spec

Example prompt:

   Use the API spec from Apidog MCP to implement the orders client.

   Generate:
   - TypeScript request/response types
   - A client method for creating an order
   - Validation/error handling based on the documented responses
   - Tests using the documented examples

Run the generated calls in Apidog

Open the endpoints in Apidog, send real requests, and confirm:
- Auth works
- Status codes match expectations
- Request bodies match the schema
- Response shapes match the generated types

Turn verified calls into tests and mocks

Once the requests work, use them as the basis for automated tests and mock servers.

The loop is:

Real API spec → Composer-generated code → Apidog verification → tests/mocks

That keeps agent speed from becoming debugging debt.
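The "response shapes match the generated types" check can also live in code, because TypeScript types disappear at runtime. Here is a minimal sketch of a type guard for that purpose. The `OrderResponse` fields are hypothetical and would need to mirror your actual spec.

```typescript
// Hypothetical response shape; replace the fields with those from your spec.
interface OrderResponse {
  id: string;
  status: string;
  total: number;
}

// Runtime check that an unknown payload has the fields the type promises.
function isOrderResponse(value: unknown): value is OrderResponse {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["id"] === "string" &&
    typeof record["status"] === "string" &&
    typeof record["total"] === "number"
  );
}

const payload: unknown = JSON.parse('{"id":"o1","status":"paid","total":12.5}');
console.log(isOrderResponse(payload)); // true
console.log(isOrderResponse({ id: "o1", status: "paid", total: "12.5" })); // false
```

Running a guard like this against the responses you verified manually turns a one-off check into a regression test. For larger schemas, a schema-validation library is usually a better fit than hand-written guards.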

If you use other agents with Cursor, the best MCP servers for Cursor roundup covers complementary options.

Composer 2.5 vs the competition

Use this comparison when choosing a daily coding model:

vs Opus 4.7

Composer 2.5 is close on SWE-bench Multilingual and CursorBench, while costing far less per task. Opus 4.7 still leads on CursorBench at its max setting (64.8% vs. 63.2%).

vs GPT-5.5

Composer 2.5 leads on SWE-bench Multilingual and on CursorBench against GPT-5.5's default setting. GPT-5.5 leads clearly on Terminal-Bench 2.0.

vs Claude Code

Composer 2.5 runs inside the Cursor editor. Claude Code is a terminal agent. The best choice depends on whether your workflow is editor-first or terminal-first. See the Claude Code vs Cursor comparison for details.

vs GitHub Copilot

Copilot is strongest for inline completion. Composer 2.5 is built for multi-file agent tasks. The Cursor vs GitHub Copilot guide covers the workflow differences.

Cursor also said it is training a much larger model with xAI using about ten times the compute, so Composer 2.5 is likely a checkpoint in a broader model roadmap rather than the ceiling.

FAQ

Is Composer 2.5 free?

There is no fully free tier, but individual plans include a Composer usage pool that can cover normal daily work. Cursor also doubled usage for the launch week. The Composer for free guide explains the included allowance.

Is Composer 2.5 better than Composer 2?

Yes. Cursor reports SWE-bench Multilingual rising from 73.7% to 79.8%. Composer 2.5 also holds context better during long tasks. The Composer 2 guide describes the baseline it improves on.

What model is Composer 2.5 based on?

Composer 2.5 is built on Moonshot’s open-source Kimi K2.5 checkpoint, then heavily post-trained by Cursor with reinforcement learning and synthetic tasks.

Should I use standard or fast?

Use standard for cost-efficient background work. Use fast when you are actively waiting on each response. The intelligence is the same; latency and price differ.

Does Composer 2.5 work with API specs and MCP?

Yes. Composer 2.5 supports Cursor’s agent tool set, including MCP. Connect your API spec through the Apidog MCP server so the model generates code against your actual schema.

Bottom line

Composer 2.5 makes frontier-quality coding agents more practical for everyday development. It delivers near-Opus-level results on real software tasks at a much lower reported cost per task, directly inside Cursor’s editor workflow.

To try it:

Update Cursor.
Open an Agent session.
Select composer-2.5.
Give it a concrete multi-file task.
Define the success condition.
Run tests and verify generated API calls.

For API-heavy work, connect your real specification first, then verify the generated requests in Apidog. Generate code against the contract, send live requests, confirm the responses, and convert working calls into tests or mocks. You can download Apidog to add that verification loop to your Composer 2.5 workflow.
