DEV Community

Claude Opus vs Kombai in 3 Real-World Frontend AI Tests 🚀

Shrijal Acharya on June 02, 2026

Frontend automation has been getting pretty wild lately. 🫠 A few months ago, this comparison would have been much easier to frame. On one side, y...
 
Shrijal Acharya

One of the very few tools I’ve actually stuck with. I’ve been using it for more than a year now.
It’s a joy for frontend engineers, and even for someone like me who rarely touches frontend.

All in all, it’s my go-to for frontend. ✅

Andrii Krugliak

Never-trust-always-verify covers who's calling, but agents broke it on a different axis for me. The call is authorized and still wrong. An agent with valid creds that confidently does the wrong thing passes every auth check, so I ended up gating on the output being worth paying for, not on the identity making the request.

Shrijal Acharya

Totally. Valid creds don’t mean much if the agent still builds the wrong thing. The result matters more.

Echo

The 'frontend AI agents used to be much easier to frame' line is the whole story in one sentence. Kombai going from Figma-to-code to design-to-iterate-to-ship is a category shift, not a feature add. Tests like this are how I decide which one stays in my toolchain.

Shrijal Acharya

Exactly, that’s what stood out to me too. The bigger shift isn’t just “better Figma to code,” it’s moving closer to an actual frontend workflow. That’s why I wanted the tests to be closer to real product work.

Shekhar Rajput

Was this tested through the Opus API? And what were the criteria for picking the open-source projects?

Shrijal Acharya

No specific criteria. I just picked them at random from GitHub Explore.

Nabin Bhardwaj

What's up with the "design engineer" tag on Kombai? And why Opus 4.6?

Shrijal Acharya

With the recent release of Design Mode, Kombai 2.0 is now tagged as an AI design engineer.

The reason I used Opus 4.6 is that I had planned this blog a few months ago and had already run the test, but somehow forgot to share it publicly.

You can also try the same test with the newer 4.8 or the newer models from OpenAI. :)

Mudassir Khan

the "does it preserve functionality or just match the visual" test is the right frame. we have burned time with general purpose agents on component rewrites that looked correct in isolation but quietly broke state bindings 2 layers up.

the gap most comparisons miss: does the output hold under the actual interaction patterns the component was designed for, not just "does it render." that distinction changes the verdict.

how did you handle cases where the figma spec and the existing interaction patterns contradicted each other? did either tool pick the right winner?