Introduction
Cursor has released Composer 2, the latest version of its in-house coding model.
The announcement is focused and fairly easy to summarize. Cursor is making three main claims:
- Composer 2 is frontier-level at coding
- it is materially better than previous Composer versions on Cursor’s published benchmarks
- it is priced aggressively enough to be practical for everyday use
That combination makes the release worth paying attention to. In this post, I’ll walk through what Composer 2 is, what Cursor says improved, how the benchmark results look, what the pricing means, and my initial take on the release.
What Is Composer 2?
Composer 2 is the newest generation of Cursor’s in-house coding model.
Cursor describes it as frontier-level at coding and positions it as a better cost-performance option for agentic software work. The model is now available in Cursor, and the announcement puts most of the emphasis on three areas:
- stronger coding performance
- improved long-horizon task handling
- lower cost than many competing fast models
Unlike some model launches that bundle a large number of product features together, this one is mostly about the model itself. Cursor is not presenting Composer 2 as a general platform shift. It is presenting it as a more capable and more economical coding model.
Composer 2 Key Features
The Composer 2 announcement is short, but there are still a few important takeaways.
Better coding performance
Cursor says Composer 2 delivers large improvements on all of the benchmarks it tracks, including Terminal-Bench 2.0 and SWE-bench Multilingual.
That matters because it suggests the gains are not limited to one internal evaluation. Cursor is showing improvement across several coding-oriented benchmarks rather than relying on a single headline number.
Continued pretraining
One of the most notable details in the post is that these improvements come from Cursor’s first continued pretraining run.
This is important because continued pretraining is often what gives a model a stronger base before more specialized post-training methods are applied. Cursor is explicitly saying that Composer 2 starts from a better foundation than earlier Composer versions.
Reinforcement learning for long-horizon tasks
Cursor also says it trains Composer 2 on long-horizon coding tasks using reinforcement learning.
This is probably the most interesting technical claim in the announcement. Cursor says Composer 2 can solve challenging tasks requiring hundreds of actions. That implies the model is being optimized for sustained multi-step software tasks, not just short code completions or simple edits.
A fast variant with the same intelligence
Cursor also introduces a faster Composer 2 variant and says it has the same intelligence.
That is a useful product choice. Instead of forcing users to pick between a “smart” model and a “fast” model family, Cursor is presenting speed as a deployment option on top of the same underlying capability level.
Composer 2 Benchmarks
Cursor publishes three benchmark comparisons in the announcement:
| Model | CursorBench | Terminal-Bench 2.0 | SWE-bench Multilingual |
|---|---|---|---|
| Composer 2 | 61.3 | 61.7 | 73.7 |
| Composer 1.5 | 44.2 | 47.9 | 65.9 |
| Composer 1 | 38.0 | 40.0 | 56.9 |
These gains are large enough to be meaningful.
The biggest point here is not just that Composer 2 is ahead of Composer 1 and 1.5, but that the improvements show up consistently across all three benchmarks. That gives the release more credibility than a single isolated result would.
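As a rough sanity check, the relative gains implied by the published table can be computed directly. The numbers below are taken verbatim from the table above; this is just a quick sketch of the percentage improvements, not an official analysis from Cursor:

```python
# Relative improvements implied by Cursor's published benchmark table.
scores = {
    "CursorBench":            {"Composer 2": 61.3, "Composer 1.5": 44.2, "Composer 1": 38.0},
    "Terminal-Bench 2.0":     {"Composer 2": 61.7, "Composer 1.5": 47.9, "Composer 1": 40.0},
    "SWE-bench Multilingual": {"Composer 2": 73.7, "Composer 1.5": 65.9, "Composer 1": 56.9},
}

for bench, s in scores.items():
    gain_15 = (s["Composer 2"] - s["Composer 1.5"]) / s["Composer 1.5"] * 100
    gain_1 = (s["Composer 2"] - s["Composer 1"]) / s["Composer 1"] * 100
    print(f"{bench}: +{gain_15:.0f}% vs 1.5, +{gain_1:.0f}% vs 1")
```

Even the smallest relative jump (SWE-bench Multilingual, roughly +12% over Composer 1.5) is well beyond noise for these benchmarks, and the Terminal-Bench 2.0 and CursorBench gains are considerably larger.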
Terminal-Bench 2.0 is especially relevant because Cursor frames it as an evaluation for agentic terminal use. If Composer 2 is genuinely stronger there, that supports Cursor’s claim that the model is getting better at longer, more interactive coding tasks.
SWE-bench Multilingual is also worth noting because it suggests broader coding competence beyond narrow English-only setups.
Still, these are vendor-published numbers, so the right takeaway is measured optimism rather than certainty.
How Composer 2 Is Priced
Cursor says Composer 2 is priced at:
- $0.50 per million input tokens
- $2.50 per million output tokens
The faster variant is priced at:
- $1.50 per million input tokens
- $7.50 per million output tokens
Cursor also says the fast variant costs less than competing fast models, and that the fast variant will be the default option.
This part of the announcement is more important than it looks. Model releases are usually judged on benchmark quality first, but pricing determines whether a model becomes part of normal daily use or gets reserved for occasional high-value tasks. Cursor is clearly trying to push Composer 2 into the first category.
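To make the economics concrete, here is a minimal cost sketch at the published base-variant rates. The per-token prices come from the announcement; the session token volumes are hypothetical and purely for illustration:

```python
# Rough per-session cost at Composer 2's published base-variant rates.
INPUT_PER_M = 0.50   # $ per million input tokens
OUTPUT_PER_M = 2.50  # $ per million output tokens

def session_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one session at the published rates."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# Hypothetical long agentic session: 2M input tokens, 200K output tokens.
print(f"${session_cost(2_000_000, 200_000):.2f}")  # → $1.50
```

At these rates, even a token-heavy agentic session stays in the low single-digit-dollar range, which is the kind of arithmetic that makes a model viable for everyday use rather than occasional high-value tasks.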
On individual plans, Composer usage draws from a standalone usage pool with a generous included allowance.
Composer 2 vs Earlier Composer Versions
Based on Cursor’s published table, Composer 2 is a clear step up from Composer 1.5 and Composer 1.
The improvement is visible across all the benchmarks included in the post, and Cursor attributes that jump to a combination of:
- a stronger base model from continued pretraining
- reinforcement learning on long-horizon coding tasks
That is a sensible recipe for a coding model. Better base training improves general capability, while long-horizon RL helps the model stay coherent over extended multi-step tasks.
From the announcement alone, Composer 2 looks like a real model upgrade rather than a minor iteration.
Initial Impressions
My first impression is that this is a disciplined release.
Cursor is not trying to claim that Composer 2 changes everything. The message is narrower and more believable: the model is better, it handles long-horizon coding tasks more effectively, and it is priced aggressively enough to be useful in regular workflows.
The long-horizon point is the one I would pay most attention to. A lot of coding models can produce a good patch in one pass. Fewer models stay reliable across a task that unfolds over many actions. If Composer 2 is genuinely stronger there, that is a meaningful improvement.
The pricing is the other major strength. A coding model can be strong on benchmarks and still be awkward in practice if the economics are wrong. Cursor seems to understand that and is making cost a central part of the launch rather than an afterthought.
At the same time, this is still an announcement built around Cursor’s own evaluation framing. The benchmark gains look strong, but the real test will be whether Composer 2 feels materially better in day-to-day software work.
Final Thoughts
Composer 2 looks like a meaningful upgrade to Cursor’s coding model stack.
The release is compelling for three reasons:
- the benchmark gains are substantial
- the training story is technically coherent
- the pricing is practical
If you already use Cursor, Composer 2 is worth trying.
If you evaluate coding models more broadly, this release is notable because it tries to improve both capability and economics at the same time. That is the right combination to optimize for.