In my last post, published yesterday, I wrote about which LLMs I currently like to use for Angular development and promised a follow-up about the apps and harnesses around them. This is that follow-up.
From Models to Agent Harnesses
Picking the model is only half the picture. The same Opus 4.7 behaves very differently in a JetBrains chat window, in Claude Code on the terminal, or wrapped in Cursor's agent mode. The harness decides how the model sees your code, how it edits files, how it runs tools, how much it verifies, and whether it feels like autocomplete, a helpful assistant, or a real agent. For serious Angular work, that part matters at least as much as the underlying model.
So this post is about the other half: the apps, IDE integrations, and agentic coding tools I've used over the last few months. I'll move from classic Copilot-style autocomplete to IDE agents and today's super apps, and then explain where Codex, Claude desktop app, Cursor, Antigravity, VS Code, and WebStorm currently fit (or don't) into my daily Angular workflow. I also want to look a little bit at the product philosophy behind these tools, because that often explains the day-to-day experience better than a feature checklist.
This Is Not a Benchmark
Before we go any further, one important disclaimer: this is not a scientific benchmark, not a universal ranking, and definitely not the final truth about AI coding tools. It is my subjective opinion based on my own daily work with Angular projects, workshops, code reviews, refactorings, and experiments over the last few months.
The sketch above is a very simplified overview of the different tools and how I see them in terms of their capabilities and my personal preferences. It is not meant to be an objective comparison, but rather a visual aid to understand my current opinion of these apps and harnesses.
Your workflow might be different. Your team might have different constraints. Your preferred IDE, operating system, budget, company policies, and tolerance for agentic autonomy might lead to a completely different result.
So please, read this post as a practical field report from my current setup, not as a neutral lab test.
TL;DR: My Current Setup
If you only have thirty seconds: my daily drivers for Angular work are currently Codex (with GPT 5.5) and the Claude desktop app (with Opus 4.7), used roughly interchangeably. Codex has the more polished app experience, while the Claude desktop app gives me access to the Opus models I trust most for architecture, design, and larger refactorings. Cursor with Composer 2.5 is a strong third option, especially when speed, cost, IDE integration, or cloud agents matter. Antigravity is interesting but, in my opinion, not yet on the same level. And whenever I do real handcrafting, I still happily switch back to WebStorm.
All of this assumes a strict human-in-the-loop workflow: I review every diff, and I expect every line of generated code to look as if I had written it myself.
My AI Coding Journey
To explain where I ended up, I first need to show the path that got me there. I'll start with the more traditional IDE integrations and then move on to the newer agentic coding apps.
WebStorm + GitHub Copilot
For over a decade, PhpStorm (starting in my WordPress era) and later WebStorm have been my main IDEs for web development. So when GitHub Copilot launched, it was a natural choice to try it out in WebStorm. It was one of the first AI coding tools I used, and it had a big impact on how I thought about AI-assisted coding.
I just checked my GitHub billing history and noticed that I've been using Copilot since 2023-10-27. That's exactly 31 months ago. What a journey. For most of that time, let's say the first 25 months, I used GitHub Copilot in WebStorm (and VS Code), which was a great way to get started with AI-assisted coding. It felt like a natural extension of the IDE's existing autocomplete features, and it worked well for small code snippets and boilerplate.
However, it also had its limitations. It often struggled with larger code contexts, complex Angular patterns, and multi-file refactorings. In particular, it was usually a step or two behind modern Angular APIs: it kept suggesting NgModule-based code long after standalone components had become the default, leaned on *ngIf and *ngFor instead of the new control flow syntax, and rarely reached for signals, input() or output() on its own. It was more of a helper for writing code than a partner for driving larger coding workflows. And sometimes it was just plain annoying, suggesting irrelevant or incorrect code that I had to reject manually.
In 2025, I also experimented quite a bit with Cursor. However, I still didn't like the VS Code clone too much and preferred to stay in my beloved JetBrains editors.
WebStorm + AI Assistant + Junie + Opus 4.5
Fast-forward to November 2025, when Opus 4.5 was released. People on social platforms got pretty excited, and some of my friends also pushed me to try the new model. So again, I took the easy path and started using the new model through AI Assistant and Junie in WebStorm.
For context: JetBrains AI Assistant is the AI layer built directly into JetBrains IDEs like WebStorm. It gives you the typical IDE-integrated AI features: chat, code explanations, code generation, documentation help, code completion, and smaller edits in the context of the project. The same AI Assistant interface is also becoming more of a hub for different agents and providers, so depending on your setup you can use tools like Junie, Claude, Codex or even Cursor and a lot more from inside the JetBrains workflow.
Junie is JetBrains' own more agentic coding tool. Instead of only answering questions or completing snippets, it can take a task, create a plan, edit multiple files, run commands or tests if you allow it, and iterate inside the IDE.
But then, since I was using Claude Opus 4.6 most of the time, I wanted to experiment with different harnesses for that model. That led me to a new workflow: VS Code with the Claude Code extension for my agentic coding.
VS Code + Claude Code + Opus 4.6
A few weeks after Opus 4.6 was released to the public, I think it was around the end of February, I started to look more seriously for the best harness for that model. Until then, I mostly thought about the model itself: is it good enough, fast enough, and useful enough for real Angular work? But the more I compared the different tools, the more obvious it became that the surrounding app matters a lot.
VS Code was a natural candidate for that experiment. I never loved it as much as my JetBrains IDEs, but I knew it well from the work in my Angular Architects workshops, where many participants use it as their main editor. And with the Claude Code extension in VS Code, I had the first setup where Opus really felt stronger because of the harness around it. Suddenly Opus could read across an entire Angular feature folder, follow a signal from computed() back to its source, and propose multi-file edits that respected my standalone-component setup, my routing, and even my providedIn: 'root' services without me having to spoon-feed it the context.
That was the moment where I started to think: OK, this is not only about the LLM anymore. And that was even before I had tried the agentic coding apps, or super apps as I like to call them, for the first time.
Beyond IDE: Agentic Coding Apps (Super Apps)
After using VS Code for my agentic coding for about a month or so, I was ready to start a new journey.
In the screenshot above, you can see the four main apps I currently use for agentic coding: Claude desktop app, Codex, Cursor, and Antigravity. Oh boy, they really look the same, don't they?
To be clear, I strictly follow a human-in-the-loop workflow here. That is because I care a lot about clean code and high-quality code. Some people think that the quality of the codebase does not, or will not, matter anymore in the future. I strongly disagree with that view, although I accept it as a fair point that challenges my own position.
At first glance, these tools look like different skins around the same idea. But the more I use them, the more I think they are actually making different bets. The Claude desktop app feels terminal-first and model-first: give a strong model a lot of tool access and let it work. Codex feels app-first and verification-first: keep the interface calm, use the local environment when possible, and make the agent prove more of its work. Cursor feels IDE- and cloud-first: keep the editor close, but also make it possible to run agents somewhere else and bring the result back into the team workflow.
Claude desktop app + Opus 4.7
My human-in-the-loop stance is why I review everything these tools do, and why I want every line of code to look exactly as if I had handcrafted it. Since the Claude desktop app and the VS Code extension share the same underlying Claude Code agent harness, the app already felt familiar to me. The Claude CLI is the canonical surface; the desktop app and the VS Code/JetBrains extensions are different front-ends that drive that same harness.
To be honest, I have never been much of a CLI guy (or a Linux guy: I prefer macOS, but I love Android), so I really prefer using a fancy desktop app. But I fully understand that this is purely a personal preference. And yeah, that's all I can say about that.
So basically, the first of these super apps that I really used on a day-to-day basis was the Claude desktop app. By the way, it also offers a Cowork tab, which is supposed to be the right choice for knowledge workers, while the Code tab is the choice for us developers.
The big strength of Claude Code is still the terminal story. It met developers where many of them already live, and that is probably one reason why adoption was so fast. The downside is that this also shapes the whole product: images, rich UI, visual verification, and desktop-app polish will always feel a bit different when the canonical workflow is the CLI. I also notice that Claude Code is very willing to spend tokens if that makes the agent feel more capable, for example with subagents or larger parallel explorations. That can be great for difficult tasks, but it is something I want to keep an eye on when cost and focus matter.
For Angular work specifically, Claude Code with Opus 4.7 is the harness I trust most when a task spans the whole stack: introducing a new feature module, migrating an older area to standalone components and signals, or refactoring a non-trivial RxJS pipeline into a cleaner mix of signals and toSignal(). It also handles the boring-but-important parts well, like keeping app.config.ts, route definitions, and lazy-loaded feature routes in sync after a rename.
Codex + GPT 5.5
The first big difference in the Codex app by OpenAI is that it does not split coding and co-working into separate areas the way the Claude desktop app does with its separate tabs. Instead, it combines both in a single app.
Again, starting to use Codex at the end of April felt familiar because it more or less had the same interface as the Claude desktop app. But from day one, I had the feeling that there were fewer bugs and that the Codex app, and therefore the GPT harness, just worked most of the time.
What I like about Codex is that it usually feels less theatrical. There is less visual noise, fewer little productivity animations, and more focus on the task, the diff, the terminal output, and the verification. That sounds boring, but for me boring is often good when I work on real code. I especially like the direction around using the machine and environment I already have configured instead of pretending that every project is easy to run in a clean cloud sandbox. For an Angular application with local setup, browser checks, environment variables, internal packages, and maybe a slightly weird test configuration, that matters a lot.
And there is another practical detail I really like: Codex can now be controlled from the smartphone while it still uses the desktop machine as the actual working environment. So I can start or continue a task from the couch, the train, or wherever, without moving the whole project into a cloud runner. Of course, I still review the diff properly later, but for keeping an agent moving this is very convenient.
So my personal verdict would be that the Codex app is superior to the Claude desktop app. However, the underlying models in the Claude desktop app, Opus 4.5, 4.6, and 4.7, are still superior to GPT 5.5 in many cases, especially in the fields I mentioned in the last post, such as architecture, design, and more.
In day-to-day Angular tasks, though, the gap is often smaller than the model benchmarks suggest. Codex with GPT 5.5 is very strong at the kind of work that fills most of my week: writing or updating standalone components, generating Vitest or Jasmine specs that actually use TestBed, scaffolding signal-based stores, and producing usable Signal Forms code. Where I still reach for the Claude desktop app is the harder, more design-heavy work, for example shaping a new feature's component tree or untangling a legacy service.
Cursor + Composer 2.5
Once I decided to do a workshop about agentic coding, I realized I had to try out all the available options and not just use my favorite ones. So of course, one candidate for the best super app, or in other words, agentic coding app, is Cursor.
Cursor, basically a VS Code fork, was one of the first IDEs to really focus on agentic coding and agentic workflows, and many people started to use these workflows much earlier than I did. But in the past, let's say at the beginning of 2025, I had the feeling that the code generation was not really good enough for my requirements.
Things have moved on since then, though. Cursor recently announced a major deal with SpaceX that gives SpaceX the option to acquire Cursor for 60 billion dollars, or to pay 10 billion dollars for their partnership instead. So this is no longer only a small IDE story.
So why are Cursor and SpaceX such a good match? Because SpaceX has huge compute with Colossus 2, while Cursor has a ton of collected user data and usage statistics. Together, they are about to enter, or have already entered, the frontier stage of agentic coding.
By the way, Cursor and Composer 2.5 are also super fast, and they can be a cheap alternative if you really have to pay for the tokens and cannot use the subsidized OpenAI or Anthropic subscription model. But more on the costs of using these tools in the next post.
The part I had underestimated is Cursor's cloud story. I used to think of Cursor mostly as the AI-first VS Code clone. That is still part of it, of course, but the more interesting direction is letting agents run away from my local machine and come back with a result. If your team wants to trigger work from Slack, run several agents in parallel, or give non-developers a safe way to start small engineering tasks, Cursor's cloud agents become much more interesting than the editor alone.
For Angular specifically, I found Composer 2.5 most useful for the mechanical kind of refactor: renaming a service across an entire feature, converting a class-based @Input() to the new input() signal API, or rewriting a template from *ngIf/*ngFor to the new @if/@for control flow. It is fast enough to feel interactive, and that speed changes how often you actually run a refactor instead of postponing it.
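To make "mechanical" a bit more concrete, here is a toy sketch of the @Input() to input() rewrite as a plain string transformation. It is not how Cursor or Angular's own migration schematics work internally; the function name migrateInputLine and the three supported declaration shapes are my own assumptions for illustration, and anything more complex is deliberately left untouched.

```typescript
// Toy codemod (hypothetical helper, not an official Angular tool):
// rewrites simple single-line @Input() fields to the input() signal API.
// Assumption: one declaration per line, in one of the three shapes below.

function migrateInputLine(line: string): string {
  // @Input({ required: true }) name!: Type;  ->  name = input.required<Type>();
  const required = line.match(
    /^(\s*)@Input\(\{\s*required:\s*true\s*\}\)\s+(\w+)!?:\s*([^=;]+);\s*$/,
  );
  if (required) {
    const [, indent = '', name = '', type = ''] = required;
    return `${indent}${name} = input.required<${type.trim()}>();`;
  }

  // @Input() name = initial;  ->  name = input(initial);
  const withDefault = line.match(/^(\s*)@Input\(\)\s+(\w+)\s*=\s*(.+);\s*$/);
  if (withDefault) {
    const [, indent = '', name = '', initial = ''] = withDefault;
    return `${indent}${name} = input(${initial.trim()});`;
  }

  // @Input() name?: Type;  ->  name = input<Type>();
  const optional = line.match(/^(\s*)@Input\(\)\s+(\w+)\?:\s*([^=;]+);\s*$/);
  if (optional) {
    const [, indent = '', name = '', type = ''] = optional;
    return `${indent}${name} = input<${type.trim()}>();`;
  }

  // Setters, aliases, transforms, multi-line declarations: leave for review.
  return line;
}

function migrateInputs(source: string): string {
  return source.split('\n').map(migrateInputLine).join('\n');
}

const before = [
  "  @Input() title = '';",
  '  @Input({ required: true }) user!: User;',
  '  @Input() label?: string;',
].join('\n');

console.log(migrateInputs(before));
```

The declaration is only half of the job: every read of this.title has to become this.title(), in the class and in the template. That cross-file part is exactly where a fast agent earns its keep, and why I still review the resulting diff line by line.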
Antigravity + Gemini 3.5 Flash
A little more than a week ago, Google also announced a new release at Google I/O, where they presented the new agentic coding app called Antigravity 2.0. The interesting thing is that they transformed Antigravity from an IDE with an included agentic coding workflow into a super app similar to the Claude desktop app and Codex. If you compare the user interface to Codex, it almost feels like a one-to-one copy, which is good because I think Codex currently has the best user interface of all the options.
The former IDE is now kind of an integrated sub-app that can be triggered from the Antigravity app whenever needed. This is actually a really good idea, and I like it very much. The same idea already exists in Codex, where you can open your preferred IDE. In my case, that would be WebStorm, of course. For whatever reason, the Claude desktop app is missing that feature.
So while this 2.0 release is definitely a big step in the right direction, I personally don't think Antigravity is competitive with the other options yet. There are two reasons for that. First, the app still feels clunky and buggy, and it has fewer settings than the other apps, so it just does not feel as mature. Second, yes, you can use other models, but I think the real point of using Antigravity is to use it with Gemini models, especially Gemini 3.5 Flash in the new Antigravity 2.0 setup. Usage for models like Opus 4.6 has recently been reduced, and I guess it will be reduced even more in the future.
On Angular tasks, Gemini 3.5 Flash is genuinely capable, and its speed is useful when you want fast iterations over a whole feature module or a long template. But in practice I still see it default to slightly older Angular idioms more often than Opus 4.7 or GPT 5.5, especially around signals and the new control flow. With clear guardrails it gets there, but it currently needs more steering than the other two.
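A guardrail does not have to be fancy. As a minimal sketch, assuming you simply want to flag older idioms in generated code before you review it, a few regular expressions already go a long way. The rule list, the names LEGACY_RULES and findLegacyIdioms, and the hints are hypothetical and not part of any official Angular lint set.

```typescript
// Hypothetical guardrail: flag older Angular idioms in generated source text.
// The rules below are illustrative assumptions, not an official lint set.

interface LegacyRule {
  id: string;
  pattern: RegExp;
  hint: string;
}

const LEGACY_RULES: LegacyRule[] = [
  { id: 'ng-if', pattern: /\*ngIf\b/, hint: 'use @if control flow' },
  { id: 'ng-for', pattern: /\*ngFor\b/, hint: 'use @for control flow' },
  { id: 'ng-module', pattern: /@NgModule\s*\(/, hint: 'prefer standalone components' },
  { id: 'input-decorator', pattern: /@Input\s*\(/, hint: 'prefer input()' },
  { id: 'output-decorator', pattern: /@Output\s*\(/, hint: 'prefer output()' },
];

interface Finding {
  line: number; // 1-based line number in the scanned text
  ruleId: string;
  hint: string;
}

function findLegacyIdioms(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const rule of LEGACY_RULES) {
      if (rule.pattern.test(text)) {
        findings.push({ line: index + 1, ruleId: rule.id, hint: rule.hint });
      }
    }
  });
  return findings;
}

// Example: a generated component that still uses *ngFor and @Input().
const generated = [
  '@Component({',
  "  selector: 'app-list',",
  '  template: `<li *ngFor="let item of items">{{ item }}</li>`,',
  '})',
  'export class ListComponent {',
  '  @Input() items: string[] = [];',
  '}',
].join('\n');

for (const finding of findLegacyIdioms(generated)) {
  console.log(`line ${finding.line}: ${finding.ruleId} (${finding.hint})`);
}
```

A check like this is deliberately dumb: it will also fire on comments and strings, and it says nothing about whether the modern replacement is used correctly. I see it as a cheap early signal that a model has drifted back to older habits, not as a substitute for the actual review.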
Open Source Alternatives to the Big Ones
Beyond the big names, I would also keep an eye on OpenCode and T3 Code. OpenCode is a solid open-source, terminal-first option if you want a model-agnostic agent and bring your own provider setup. T3 Code is interesting for the opposite reason: it gives you an open-source GUI on top of the agents you may already pay for, like Claude Code, Codex CLI, OpenCode, or Cursor.
But I'm super happy with working in the terminal
Fair enough. If you love the terminal, you can absolutely keep using it, but I no longer think it is obviously the best default for agentic coding. Apps make it easier to review diffs, manage several agents, use screenshots or browser checks, continue from the phone, and turn work into PRs. But that is personal preference, and I don't want to force anybody into my workflow.
If you are still happiest in the terminal, Pi is probably the option I would suggest trying first. It is a minimal terminal coding harness with AGENTS.md support, skills, extensions, tree-structured sessions, and many model providers. It feels less like a sealed product and more like something you can adapt to your own workflow.
Which Coding Agent Harness Would I Choose Today?
Okay, that was either a lot of information to digest, or you were already familiar with all the super apps and IDEs I mentioned. So let's try to summarize the current state of my workflow and which harnesses I prefer for different tasks.
My two favorite tools are definitely Codex and the Claude desktop app. I would put them roughly on par. I use them daily for almost everything, and I feel really lucky that I can use these tools through subsidized subscription models. More on costs in the next post.
While preparing the upcoming workshop, I also found out that Cursor has really become a super app and that it supports agentic workflows very well. All three are capable of code generation, code review, modernization, refactoring, and writing tests in your Angular codebases.
The practical difference is not only which one can solve a given prompt. It is also how it tries to solve it. Claude Code is great when I want to give a strong model a lot of freedom and let it explore. Codex is great when I want a calmer engineering tool that uses my local setup and keeps verification close to the work. Cursor is great when the work should move beyond my own machine into an IDE, browser, cloud, or team workflow.
For Antigravity, however, I still see a lot of work for the Google teams before it reaches the level of the other three apps. And of course, those other apps will continue to improve as well.
For my beloved IDE WebStorm, I see difficult times ahead. I hope the JetBrains team will be able to catch up with the other apps in terms of productivity and usability when using these models for agentic tasks. There is still a big benefit to using IDEs, and they also allow you to choose between several model providers. Even Cursor allows you to do that, whereas Codex and the Claude desktop app want to lock you into one LLM provider.
So today, I'll still stick with Codex and the Claude desktop app. But maybe in a month or so, I might already be switching to the next super app. We will see.
Agentic Engineering Workshop
This is also why I don't think about harnesses in isolation: the model, the app, my Angular Guardrails, my Angular Coding Style Guide and the Angular Skills all belong together.
If you want to deep-dive into professional AI-assisted Angular workflows, we offer the Agentic Engineering Workshop – both in English and German.
In this workshop, advanced Angular developers learn how to move from vibe coding to traceable Agentic Engineering workflows: AI-ready project setup, guardrails, spec-first and plan-first workflows, UX and component prototyping, code review, testing, and brownfield refactoring.
- 🤖 Agentic Engineering Workshop – 2 days or 3 extended half-days, remote or in-house
Conclusion
The main takeaway is simple: the harness matters. A great model in a weak app can feel surprisingly limited, while the same model in a strong agentic workflow can become much more useful for real Angular work. That is why Angular teams should not only ask which model is best, but also which harness, review process, and engineering workflow actually help them ship better code.
For my current workflow, Codex and the Claude desktop app are still my daily drivers. Codex currently feels like the more polished app to me, while the Claude desktop app gives me access to the Opus models that I trust most for architecture, design, and complex reasoning. Cursor deserves a serious look when speed, cost, an IDE-like workflow, or cloud agents matter, and Antigravity is worth watching, but for now it still feels like it needs more time. When I do careful handcrafting, I still switch back to WebStorm.
One thing I learned quickly: don't force the same workflow onto every tool. Codex, the Claude desktop app, Cursor, and even the smaller open-source options work best when you let them shape your workflow a little bit.
So don't get too attached to one app, one IDE, or one vendor workflow. These tools are changing too quickly, so it is worth trying new harnesses regularly and switching when they make your real project work better.
In the next post, I want to look at costs. Because the best model and the best harness are only useful if the pricing model also works for your team.
This blog post was written by Alexander Thalhammer. For feedback, remarks or questions, please reach out to me!







