Arindam Majumder

Composer-1 vs SWE-1.5: A detailed Comparison

With recent launches from Cursor and Windsurf, the coding model race just got serious.

Cursor dropped Composer-1 during its big 2.0 launch, and just hours later, Windsurf released its SWE-1.5 model. Both are being hyped as “agentic coding” models: LLMs that can not only write code but also plan, review, and iterate like a real engineer.

So I had to put both to the test in some real coding challenges to see how they stack up in practice.

In this article, we’ll see what Composer-1 and SWE-1.5 can do, and how well they compare in actual coding.


TL;DR

If you just want the short version, here’s what I found:

  • Windsurf SWE-1.5 consistently produced more reliable and complete code. It was much faster, often finishing in half the time of Cursor’s Composer-1.
  • Cursor Composer-1 has solid reasoning and architecture planning but often slowed down or froze mid-generation. Its results were functional but less polished.
  • For complex, multi-file projects, SWE-1.5 handled dependencies and interactions far better. Composer-1 still felt experimental, though promising in structure and approach.

If speed, stability, and real-world usability matter, SWE-1.5 is the clear winner. If you care more about innovation and potential in an agentic workflow, Composer-1 is worth watching as it matures.

If you prefer a Video version, check this out:


How was the test done?

For fairness, I ran both models under similar conditions.
Both were accessed via their respective coding environments — Cursor 2.0 for Composer-1 and Windsurf for SWE-1.5.

Each model was given identical prompts and tested on two coding challenges:

  1. Responsive Typing Game (Monkeytype Clone)
  2. 3D Solar System Simulator

All runs were recorded live, and execution speed was noted based on total wall-time until the final working output.


Brief About Cursor's Composer-1


Let’s start with Cursor’s Composer-1.

This model is built for “agentic coding”: it doesn’t just complete your code, it thinks through multi-step changes, explains fixes, and can plan end-to-end implementations.

Cursor claims it’s 4× faster than competitors, powered by a mixture-of-experts (MoE) setup where specialized sub-models tackle different tasks.


It’s trained through reinforcement learning in sandboxed environments, meaning it learned by coding thousands of apps instead of just reading static GitHub data.

Most responses were completed in under 30 seconds, though complex multi-file generations took longer.

The base model isn’t public, but speculation suggests it’s built on GLM-4.6 or Qwen3-Coder.


Brief About Windsurf's SWE-1.5


Now let’s talk about Windsurf’s SWE-1.5.

SWE-1.5 is marketed as “agency-grade”, optimized for coding agents that can reason across multiple files, terminals, and APIs simultaneously.


It runs on NVIDIA GB200 clusters, pushing around 950 tokens/sec, which is roughly 13× faster than Claude Sonnet and 6× faster than Haiku.

It’s designed for large-scale workflows:

  • Works across multiple files and repositories
  • Handles code, docs, APIs, and terminal context
  • Supports 128K tokens, enough for entire project folders

Like Composer-1, SWE-1.5 also uses a MoE + RLHF approach but scales it with distributed training and human-feedback loops fine-tuned for engineering use.


Quick Comparison Table:

| Feature | Cursor Composer-1 | Windsurf SWE-1.5 |
| --- | --- | --- |
| Core Technology | MoE + RL fine-tuning | MoE + RLHF, agency workflows |
| Speed | 4x faster than peers (<30s) | 950 tokens/sec (13x Claude) |
| Context Window | Extended, long-term | 128K tokens |
| Multi-modal Capability | Code, docs, UIs | Code, docs, APIs, UIs, terminal |
| Benchmark Results | Mixed performance; struggles in frameworks | Lower completion, poor build logic |
| Transparency | No base model disclosed | No base model disclosed |
| Cloud Hardware | Custom NVIDIA clusters | Cerebras GB200, NVIDIA |

Coding Comparison

For this comparison, I tested both models on two projects to see how they handle multi-step reasoning, UI logic, and runtime stability.

1. Responsive Typing Game (Monkeytype Clone)

First, I tried to create a responsive typing game like Monkeytype. Here's the prompt that I gave:

Prompt:

Build a modern, highly responsive typing game inspired by Monkeytype, with a clean and elegant UI based on the Catppuccin color themes (choose one or allow toggling between Latte, Frappe, Macchiato, Mocha). The typing area should feature randomized or curated famous texts (quotes, speeches, literature) displayed in the center of the screen.

**Key Requirements:**

- The user must type the exact characters, including uppercase and lowercase, punctuation, and spaces.
- As the user types correctly, a fire or flame animation should appear under each typed character (like a typing trail).
- Mis-typed characters should be clearly marked with a subtle but visible indicator (e.g., red underline or animated shake).
- A minimalist virtual keyboard should be shown at the bottom center, softly glowing when keys are pressed.
- Include features such as WPM, accuracy, time, and combo streak counter.
- Once the user finishes typing the passage, show a summary screen with statistics and an animated celebration (like fireworks or confetti).

**Design Aesthetic:**

- Soft but expressive, using the **Catppuccin** palette.
- Typography should be elegant and readable (e.g., Fira Code, JetBrains Mono).
- Use soft drop shadows, rounded corners, and smooth transitions.
Be creative in how you implement the fire animation. For example, the flame could rise up gently from the letter just typed, or the typing trail could resemble a burning path.
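
For context, the scoring the prompt asks for (WPM, accuracy, and combo streak) comes down to a small amount of bookkeeping. Here's a rough sketch of that logic in plain JavaScript; the function and variable names are my own, and this isn't code produced by either model:

```javascript
// My own sketch of the stats tracking the prompt asks for,
// not code generated by Composer-1 or SWE-1.5.
function createStats() {
  let startTime = null;
  let correct = 0;
  let mistakes = 0;
  let combo = 0;

  return {
    // Call this for every keystroke against the expected character
    record(expectedChar, typedChar) {
      if (startTime === null) startTime = performance.now();
      if (typedChar === expectedChar) {
        correct++;
        combo++;
      } else {
        mistakes++;
        combo = 0; // a mistyped character breaks the streak
      }
    },
    // Call this when the passage is finished to build the summary screen
    summary() {
      const minutes = (performance.now() - startTime) / 60000;
      const wpm = Math.round((correct / 5) / minutes); // 5 chars = 1 "word"
      const accuracy = Math.round((correct / (correct + mistakes)) * 100);
      return { wpm, accuracy, combo };
    },
  };
}
```

Counting five characters as one word is the standard convention typing tests use to compute WPM.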

Composer-1:

The implementation worked but lacked smoothness.

It generated a solid typing area and logic for word tracking but missed polish on UI animation and event handling. (For example, the text in the typing box was not visible.)

It was also slow: completion took around 3 minutes, and while the output ran, it wasn’t visually refined.

Here's the Output:

Image

SWE-1.5:

This one nailed it.

The model delivered a functional, clean, and responsive typing interface on the first try.

The animation, WPM logic, and accuracy counter worked seamlessly, and it completed in under a minute.

Here's the Output:

Image

Overall, SWE-1.5 was the clear winner in this round. It was faster, cleaner, and gave better output.


2. 3D Solar System Simulator

Then, I tried creating a 3D Solar System. The main goal was to test how well each model works with external libraries like Three.js.

Here's the prompt that I gave:

Prompt:

3D Solar System Simulator

Create an interactive 3D solar system simulation using Three.js and WebGL.
Planets orbit the Sun at different speeds, and the camera can freely rotate, zoom, or focus on specific planets.

Core Requirements:

- Use Three.js (CDN import, no bundlers).
- Include at least the Sun and 4 planets (Mercury, Venus, Earth, Mars).
- Each planet orbits the Sun at a unique speed and distance.
- Add a “time speed” slider to control orbit speed.
- Implement OrbitControls for camera movement and zoom.
- Clicking a planet focuses the camera on it.

Design & Aesthetics:

- Use emissive glow or bloom for the Sun.
- Give each planet a distinct color or texture.
- Add simple ambient + directional light for visibility.
- Use soft gradients or starfield background for space.
- Include smooth camera transitions when focusing on a planet.

Mechanics:

- Animate orbits using trigonometric rotation around the Sun.
- Use accurate scaling ratios for distances and sizes (roughly proportional, not real-world scale).
- Add optional orbit trails to visualize paths.
- Maintain smooth frame rate with GPU-efficient updates.

Use Three.js and vanilla JavaScript to build the demo in one HTML file.
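
For reference, the core orbit mechanics the prompt asks for reduce to a bit of trigonometry per frame. Here's a minimal sketch of that idea in vanilla JavaScript; the planet values, names, and colors are my own placeholders, it assumes `THREE` and a `scene` already exist as in any basic Three.js setup, and it is not either model's actual code:

```javascript
// Placeholder planet data (my own rough values, not real scaling ratios).
const planets = [
  { name: 'Mercury', distance: 8,  speed: 4.1,  color: 0xb1adad, angle: 0 },
  { name: 'Venus',   distance: 12, speed: 1.6,  color: 0xe8cda2, angle: 0 },
  { name: 'Earth',   distance: 16, speed: 1.0,  color: 0x3f76cf, angle: 0 },
  { name: 'Mars',    distance: 22, speed: 0.53, color: 0xc1440e, angle: 0 },
];

// Create one sphere mesh per planet and add it to the scene.
for (const p of planets) {
  p.mesh = new THREE.Mesh(
    new THREE.SphereGeometry(1, 32, 32),
    new THREE.MeshStandardMaterial({ color: p.color })
  );
  scene.add(p.mesh);
}

let timeScale = 1; // driven by the "time speed" slider

// Called every frame with the elapsed time since the last frame, in seconds.
function updateOrbits(deltaSeconds) {
  for (const p of planets) {
    p.angle += p.speed * timeScale * deltaSeconds;
    // Circular orbit around the Sun at the origin
    p.mesh.position.set(
      Math.cos(p.angle) * p.distance,
      0,
      Math.sin(p.angle) * p.distance
    );
  }
}
```

With this structure, the “time speed” slider from the prompt only has to change `timeScale`.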

Composer-1:

This time, it generated the code faster than the last time!

The Three.js setup and overall structure were correct, but it struggled with the orbit logic and camera transitions. (Also, the planets were hard to spot.)

The simulation loaded, but planet motion was inconsistent, and output latency was higher.

Here's the Output:

Image

SWE-1.5:

It was blazing fast, almost in the blink of an eye!

It implemented realistic planetary orbits, scaling, and camera motion perfectly.

The app ran smoothly with accurate 3D rendering in most cases, though it lagged a bit when shifting focus from one planet to another.

Here's the Output:

Image

Overall, SWE-1.5 again leads by a wide margin, with better 3D math, cleaner visuals, and far faster generation.
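
For reference, the smooth camera focus both models had to implement usually comes down to easing the OrbitControls target toward the selected planet every frame. A minimal sketch of that idea (my own hypothetical code, assuming `controls` is a standard Three.js OrbitControls instance updated in the render loop):

```javascript
// My own sketch of a smooth "focus on planet" transition,
// not code from either model.
let focusTarget = null;

function focusOnPlanet(planetMesh) {
  focusTarget = planetMesh; // set on click, e.g. via raycasting
}

// Call this once per frame from the render loop.
function updateCameraFocus() {
  if (!focusTarget) return;
  // Ease the controls target toward the planet a little each frame
  controls.target.lerp(focusTarget.position, 0.05);
  controls.update();
}
```

A small lerp factor gives the gradual, eased motion the prompt asks for; setting the target in a single frame produces an abrupt jump instead.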


Conclusion

That's a Wrap!

Both Composer-1 and SWE-1.5 push the boundary of agentic coding.

They can plan, code, and debug like early versions of autonomous dev partners, not just assistants.

But in these tests:

  • SWE-1.5 delivered faster and more stable results across both tasks.
  • Composer-1 showed solid reasoning but slower responses and less consistent code.

If you’re optimizing for speed, stability, and output quality, go with Windsurf SWE-1.5.
If you want to explore agentic behavior and reasoning depth, Cursor Composer-1 has interesting potential; it just needs refinement.

Either way, it’s clear we’re entering a new era where coding assistants are no longer passive; they’re becoming collaborators.

Let me know what you think of these two models in the comments below!
