Cursor shipped Composer 2.5 on May 18, 2026. The headline for developers is simple: Cursor reports a coding model that matches Opus 4.7 and GPT-5.5 on real software benchmarks while costing under a dollar per task. If you use AI agents for production code, that price-to-quality ratio changes which tasks are worth delegating.
This guide shows how to evaluate and use Composer 2.5 in practice: what the model is, how to enable it in Cursor, how the pricing works, and how to pair it with Apidog so generated API code is based on your real API contract instead of guesses.
What is Cursor Composer 2.5?
Composer 2.5 is Cursor’s agentic coding model. It is designed to work inside the Cursor editor, where it can:
- Plan multi-step coding tasks
- Read and edit files
- Run terminal commands
- Use available tools
- Verify changes against tests or command output
It succeeds Composer 2 and moves the workflow from “AI autocomplete” toward “AI agent that can complete repository-level tasks.”
Key facts:
- It is built on the open-source Moonshot Kimi K2.5 checkpoint, a roughly one-trillion-parameter base model.
- Cursor put about 85% of the training compute budget into post-training and reinforcement learning.
- It trained on 25x more synthetic tasks than Composer 2, including tasks where Cursor deletes a feature and the model must rebuild it until tests pass.
For day-to-day work, the important change is consistency. Composer 2 was fast but could drift during long tasks. Composer 2.5 is designed to maintain context across longer sessions, follow multi-step instructions more reliably, and better estimate how much work a request requires.
If you want more background on the previous generation, the Composer 2 guide explains the baseline that 2.5 improves on.
What changed under the hood?
Three training changes matter for developers using it:
- Targeted RL with textual feedback. Instead of rewarding only the final result, Cursor adds short feedback hints into local context and distills the improved behavior back into the model. This helps the model avoid unavailable tools and recover from mistakes.
- More synthetic repository tasks. Composer 2.5 trained on 25x more synthetic tasks than Composer 2. These tasks are closer to real repo work because they are verified by tests, not only by generated examples.
- Faster training infrastructure. Cursor used a sharded Muon optimizer with dual-mesh HSDP, enabling a 0.2-second optimizer step while training a 1T-parameter model. You do not interact with this directly, but faster training loops let Cursor iterate more on model quality.
The practical result: Composer 2.5 is better suited for messy, multi-file tasks such as implementing features, updating tests, and fixing regressions.
Composer 2.5 benchmarks
Cursor reports Composer 2.5 against Opus 4.7 and GPT-5.5 on three benchmark suites:
| Benchmark | Composer 2.5 | Opus 4.7 | GPT-5.5 |
|---|---|---|---|
| SWE-bench Multilingual | 79.8% | 80.5% | 77.8% |
| Terminal-Bench 2.0 | 69.3% | 69.4% | 82.7% |
| CursorBench v3.1 | 63.2% | 64.8% (max) / 61.6% (default) | 59.2% (default) |
How to read those numbers:
- On SWE-bench Multilingual, Composer 2.5 reaches 79.8%, close to Opus 4.7 and ahead of GPT-5.5.
- On CursorBench v3.1, it beats Opus 4.7’s default setting.
- On Terminal-Bench 2.0, GPT-5.5 leads clearly at 82.7%.
If your work is mostly editor-based agent tasks, Composer 2.5 is competitive with frontier models. If your workflow depends heavily on long terminal-only sequences, Terminal-Bench suggests GPT-5.5 may still have an edge.
The cost is the bigger story. Cursor reports roughly 63% on CursorBench at under $1 average cost per task. Opus 4.7 and GPT-5.5 can cost several dollars per task for similar or worse results, with some comparisons putting competitor costs as high as eleven dollars. Independent coverage from The Decoder reached the same conclusion: near-frontier coding quality at a much lower price.
Composer 2.5 pricing
Cursor offers two Composer 2.5 variants:
| Variant | Input | Output | When to use it |
|---|---|---|---|
| Standard | $0.50 / M tokens | $2.50 / M tokens | Cost-efficient agent work where latency is not critical |
| Fast | $3.00 / M tokens | $15.00 / M tokens | Use when latency matters during live iteration |
The fast variant has the same intelligence but lower latency. It is the default in the product.
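To make the gap between the two variants concrete, the API-rate cost of a task can be estimated from its token counts. The sketch below uses the rates from the table above; the token counts in the example are invented for illustration, not measurements, and plans with an included usage pool may not be billed this way at all.

```typescript
type Variant = "standard" | "fast";

interface Rates {
  inputPerMillion: number; // USD per million input tokens
  outputPerMillion: number; // USD per million output tokens
}

// Rates copied from the pricing table above.
const RATES: Record<Variant, Rates> = {
  standard: { inputPerMillion: 0.5, outputPerMillion: 2.5 },
  fast: { inputPerMillion: 3.0, outputPerMillion: 15.0 },
};

// Estimate the API-rate cost of one task in USD.
function estimateTaskCost(variant: Variant, inputTokens: number, outputTokens: number): number {
  const rates = RATES[variant];
  return (
    (inputTokens / 1e6) * rates.inputPerMillion +
    (outputTokens / 1e6) * rates.outputPerMillion
  );
}

// Hypothetical task: 400k input tokens and 40k output tokens.
const standardCost = estimateTaskCost("standard", 400000, 40000); // about $0.30
const fastCost = estimateTaskCost("fast", 400000, 40000); // about $1.80
```

Because both rates scale by the same factor, the Fast variant costs six times as much as Standard for the same token counts.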
Billing depends on your Cursor plan:
- Individual plans include a standalone Composer usage pool, so many solo developers may not hit per-token billing during normal use.
- Team and enterprise plans are charged at the API rate directly.
- Launch promo: Cursor doubled Composer 2.5 usage for the first week after release.
For more pricing details, see the Cursor Composer pricing guide. If you want to test Composer without paying immediately, the Composer for free walkthrough explains the included-usage path.
How to access Cursor Composer 2.5
Set up Composer 2.5 in Cursor with this checklist:
1. Update Cursor. Open Cursor and check for updates, then restart Cursor after installing the update:
   - macOS: Cursor menu
   - Windows/Linux: Help menu
2. Sign in to a supported plan. Pro and Business plans include Composer usage. Free accounts may be able to try it through included allowances, but sustained usage requires a paid plan.
3. Open the model picker. Start a chat or agent session, then open the model dropdown.
4. Select composer-2.5. The fast variant is usually selected by default.
5. Use Agent mode. Composer is built for agent work. Use Agent mode instead of plain chat so it can edit files, run commands, and use tools.
The official Composer 2.5 model docs list the current defaults if Cursor changes them.
If you are new to Cursor’s agent workflow, the Cursor 2.0 overview explains the editor surface.
How to use Composer 2.5 effectively
Composer 2.5 works best when you give it a clear task, clear constraints, and a way to verify the result.
1. Give it repository-level tasks
Avoid prompts like:
Fix this function.
Use a concrete task with a clear end state:
Add cursor-based pagination to the GET /orders endpoint.
Requirements:
- Accept limit and cursor query parameters.
- Return items and nextCursor.
- Preserve existing response fields.
- Add unit tests for valid pagination, missing cursor, and invalid cursor.
- Run the relevant test suite and fix failures.
Composer 2.5 is trained for sustained multi-step work, so let it plan, edit, run tests, and iterate.
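For orientation, the end state that prompt describes might look roughly like the sketch below. It is a minimal illustration, not what Composer would necessarily produce: the `Order` shape, the `paginateOrders` name, and the choice to use the last returned order's id as the cursor are all assumptions, and `limit` is assumed to be validated before this function is called.

```typescript
// Hypothetical order shape for illustration.
interface Order {
  id: string;
  total: number;
}

interface Page {
  items: Order[];
  nextCursor: string | null; // null when there are no more results
}

// Return up to `limit` orders that come after the order whose id equals `cursor`.
// Assumes `orders` is already in a stable order, for example by creation time.
function paginateOrders(orders: Order[], limit: number, cursor?: string): Page {
  let start = 0;
  if (cursor !== undefined) {
    const index = orders.findIndex((order) => order.id === cursor);
    if (index === -1) {
      throw new Error("invalid cursor: " + cursor);
    }
    start = index + 1;
  }
  const items = orders.slice(start, start + limit);
  const hasMore = start + limit < orders.length;
  const last = items[items.length - 1];
  const nextCursor = hasMore && last !== undefined ? last.id : null;
  return { items, nextCursor };
}

const orders: Order[] = [
  { id: "ord_1", total: 10 },
  { id: "ord_2", total: 20 },
  { id: "ord_3", total: 30 },
];
const firstPage = paginateOrders(orders, 2); // ord_1 and ord_2, nextCursor "ord_2"
const secondPage = paginateOrders(orders, 2, firstPage.nextCursor ?? undefined); // ord_3, nextCursor null
```

The three test cases named in the prompt map directly onto this function: a valid cursor, a missing cursor (first page), and an unknown cursor that raises an error.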
2. Define the success condition
The model performs better when it knows how the task will be judged.
Example:
Done means:
- All existing tests still pass.
- New tests cover success and validation failures.
- Invalid limit returns 422.
- The OpenAPI schema is updated.
- No unrelated files are modified.
This gives the agent a target for self-correction.
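A condition such as "Invalid limit returns 422" is easiest to verify when it corresponds to one small, testable function. The following is a sketch under stated assumptions: `parseLimit`, the default of 20, and the maximum of 100 are hypothetical and would come from your own API spec.

```typescript
// Assumed bounds for illustration; a real project would take these from its spec.
const DEFAULT_LIMIT = 20;
const MAX_LIMIT = 100;

type LimitResult =
  | { ok: true; limit: number }
  | { ok: false; status: 422; message: string };

// Parse the raw `limit` query parameter, which arrives as a string or is absent.
function parseLimit(raw: string | undefined): LimitResult {
  if (raw === undefined) {
    return { ok: true, limit: DEFAULT_LIMIT };
  }
  // Accept only plain decimal integers, so "10abc", "-5", and "1.5" are rejected.
  if (!/^\d+$/.test(raw)) {
    return { ok: false, status: 422, message: "limit must be a positive integer" };
  }
  const limit = Number(raw);
  if (limit < 1 || limit > MAX_LIMIT) {
    return { ok: false, status: 422, message: "limit must be between 1 and " + MAX_LIMIT };
  }
  return { ok: true, limit };
}
```

With the rule isolated like this, the agent can write one test per bullet in the "Done means" list and rerun them after every edit.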
3. Choose the right variant
Use the variants intentionally:
- Standard: batch work, background refactors, lower cost
- Fast: live pair-programming, debugging loops, interactive edits
The quality is the same. You are choosing between latency and cost.
4. Keep the model grounded in real project context
AI agents are strongest when they can inspect the real codebase, tests, and API specs. They are weakest when they must infer hidden details.
For example, if you ask:
Write a TypeScript client for our payments API.
Composer may generate clean code, but it might assume the wrong endpoint path, auth scheme, field names, or error shape.
A better prompt is:
Using the API spec available through MCP, generate a TypeScript client for the payments API.
Requirements:
- Use the real endpoint paths and schemas.
- Include typed request and response objects.
- Add tests for success and error responses.
- Do not invent fields that are not in the spec.
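The "do not invent fields" requirement can also be checked mechanically after generation. The sketch below assumes a deliberately simplified schema representation (lists of property names) rather than a real OpenAPI parser, and every field name in it is hypothetical.

```typescript
// Simplified stand-in for one request schema from the API spec.
interface ObjectSchema {
  properties: string[]; // every field the spec defines
  required: string[]; // the subset that must be present
}

// Hypothetical schema for a "create payment" request body.
const createPaymentSchema: ObjectSchema = {
  properties: ["amount", "currency", "customerId", "description"],
  required: ["amount", "currency", "customerId"],
};

interface SchemaReport {
  unknownFields: string[]; // present in the payload but not in the spec
  missingFields: string[]; // required by the spec but absent from the payload
}

// Compare the top-level keys of a payload with the schema.
function checkAgainstSchema(payload: Record<string, unknown>, schema: ObjectSchema): SchemaReport {
  const keys = Object.keys(payload);
  return {
    unknownFields: keys.filter((key) => schema.properties.indexOf(key) === -1),
    missingFields: schema.required.filter((key) => keys.indexOf(key) === -1),
  };
}

// A payload with a guessed field name ("customer_id") that the spec does not define.
const report = checkAgainstSchema(
  { amount: 1200, currency: "USD", customer_id: "c_1" },
  createPaymentSchema,
);
// report.unknownFields is ["customer_id"]; report.missingFields is ["customerId"]
```

A check like this only covers top-level field names, but it catches the most common failure described above: a plausible-looking field that the real contract never defined.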
Composer 2.5 plus your API workflow
Most production coding tasks touch APIs. The fastest way to reduce AI-generated bugs is to connect the agent to your actual API contract and then verify the generated calls.
Use this workflow:
1. Connect the API spec to Cursor. The Apidog MCP server connects your Apidog API specification directly to Cursor. Composer 2.5 can then generate request code, types, and tests from your real schema.
2. Ask Composer to implement against the spec. Example prompt:
   Use the API spec from Apidog MCP to implement the orders client.
   Generate:
   - TypeScript request/response types
   - A client method for creating an order
   - Validation/error handling based on the documented responses
   - Tests using the documented examples
3. Run the generated calls in Apidog. Open the endpoints in Apidog, send real requests, and confirm:
   - Auth works
   - Status codes match expectations
   - Request bodies match the schema
   - Response shapes match the generated types
4. Turn verified calls into tests and mocks. Once the requests work, use them as the basis for automated tests and mock servers.
The loop is:
Real API spec → Composer-generated code → Apidog verification → tests/mocks
That keeps agent speed from becoming debugging debt.
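One way to carry the "response shapes match the generated types" check into automated tests is a runtime type guard run against response bodies saved from verified requests. This is a minimal sketch: the `OrderResponse` shape and both sample bodies are hypothetical stand-ins for whatever your real spec defines.

```typescript
// Hypothetical response type, standing in for one generated from the real spec.
interface OrderResponse {
  id: string;
  status: "pending" | "paid" | "cancelled";
  total: number;
}

const ORDER_STATUSES: string[] = ["pending", "paid", "cancelled"];

// Runtime check that an unknown JSON value has the shape the generated type claims.
function isOrderResponse(value: unknown): value is OrderResponse {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  const id = record["id"];
  const status = record["status"];
  const total = record["total"];
  return (
    typeof id === "string" &&
    typeof status === "string" &&
    ORDER_STATUSES.indexOf(status) !== -1 &&
    typeof total === "number"
  );
}

// A body saved from a verified request can serve as a test fixture.
const verifiedBody: unknown = JSON.parse('{"id":"ord_1","status":"paid","total":42.5}');
// A body where the field name and a type have drifted from the generated type.
const driftedBody: unknown = JSON.parse('{"id":"ord_1","state":"paid","total":"42.5"}');

const verifiedMatches = isOrderResponse(verifiedBody); // true
const driftedMatches = isOrderResponse(driftedBody); // false
```

If the guard rejects a body that the live API actually returned, either the generated type or the spec is out of date, which is exactly the mismatch this loop is meant to surface early.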
If you use other agents with Cursor, the best MCP servers for Cursor roundup covers complementary options.
Composer 2.5 vs the competition
Use this comparison when choosing a daily coding model:
- vs Opus 4.7: Composer 2.5 is close on SWE-bench Multilingual and CursorBench, while costing far less per task. Opus still leads at the highest CursorBench max setting.
- vs GPT-5.5: Composer 2.5 leads on SWE-bench Multilingual and CursorBench. GPT-5.5 leads clearly on Terminal-Bench 2.0.
- vs Claude Code: Composer 2.5 runs inside the Cursor editor. Claude Code is a terminal agent. The best choice depends on whether your workflow is editor-first or terminal-first. See the Claude Code vs Cursor comparison for details.
- vs GitHub Copilot: Copilot is strongest for inline completion. Composer 2.5 is built for multi-file agent tasks. The Cursor vs GitHub Copilot guide covers the workflow differences.
Cursor also said it is training a much larger model with xAI using about ten times the compute, so Composer 2.5 is likely a checkpoint in a broader model roadmap rather than the ceiling.
FAQ
Is Composer 2.5 free?
There is no fully free tier, but individual plans include a Composer usage pool that can cover normal daily work. Cursor also doubled usage for the launch week. The Composer for free guide explains the included allowance.
Is Composer 2.5 better than Composer 2?
Yes. Cursor reports SWE-bench Multilingual rising from 73.7% to 79.8%. Composer 2.5 also holds context better during long tasks. The Composer 2 guide is the baseline.
What model is Composer 2.5 based on?
Composer 2.5 is built on Moonshot’s open-source Kimi K2.5 checkpoint, then heavily post-trained by Cursor with reinforcement learning and synthetic tasks.
Should I use standard or fast?
Use standard for cost-efficient background work. Use fast when you are actively waiting on each response. The intelligence is the same; latency and price differ.
Does Composer 2.5 work with API specs and MCP?
Yes. Composer 2.5 supports Cursor’s agent tool set, including MCP. Connect your API spec through the Apidog MCP server so the model generates code against your actual schema.
Bottom line
Composer 2.5 makes frontier-quality coding agents more practical for everyday development. It delivers near-Opus-level results on real software tasks at a much lower reported cost per task, directly inside Cursor’s editor workflow.
To try it:
- Update Cursor.
- Open an Agent session.
- Select composer-2.5.
- Give it a concrete multi-file task.
- Define the success condition.
- Run tests and verify generated API calls.
For API-heavy work, connect your real specification first, then verify the generated requests in Apidog. Generate code against the contract, send live requests, confirm the responses, and convert working calls into tests or mocks. You can download Apidog to add that verification loop to your Composer 2.5 workflow.

