For a while now I've put out a Wilmer release every Sunday; generally it's a few features I was able to knock out on Saturday and test on Sunday. I've almost always built them using either some combination of local models with Wilmer via Open WebUI, or Gemini 2.5 Pro.
This weekend I challenged myself to primarily use Claude Code; I had 3 simple features I wanted to implement. I've spent the past couple of weeks doing every tutorial I could find on best practices, I made sure to set up my project and global settings, and I've got fantastic documentation, which each iteration plan is set up to use...
I managed to get 1 out of 3. I spent almost 30 hours on it this weekend, and hit the limits on my $100 Max plan twice. I'm frustrated.
The past few weeks have been a good learning experience, though. I feel like I'm starting to understand why there's such a wide disconnect between how favorably I view AI development (using chat windows) and the general consensus I seem to see on AI development (using agents), especially on larger, more complex applications like Wilmer.
Around 9pm or so I gave up and decided to get Wilmer and Gemini 2.5 Pro to help me code review what was going on, and it turns out that a lot of my troubles are due to little decisions the agent made that I was missing.
For example, my SSE event stream was getting botched because the agent quietly enabled logging when instantiating a library (which I completely missed), causing it to fill the buffer and dump it all at once, breaking the generator. Another was the decision to pin a package to a mid-2024 version, when the fix I needed came in early 2025. In a code review of hundreds of lines of changes, these were small items that blended right in.
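A cheap guard against this kind of streaming regression is a check that events come out of the pipeline one at a time instead of in one big dump. Here's a minimal sketch of that idea, not Wilmer's actual code: every name in it (`format_sse`, `interleaving_log`, `streams_incrementally`, `passthrough`, `buffered`) is hypothetical, and it assumes the streaming layer can be treated as a function that takes an iterator of SSE messages and returns an iterable. It avoids timing entirely by recording the order in which events are produced and consumed.

```python
from typing import Callable, Iterable, Iterator, List


def format_sse(data: str) -> str:
    """Format a payload as one Server-Sent Events message.

    Each line of the payload gets its own "data:" field, and a blank
    line terminates the message.
    """
    return "".join(f"data: {line}\n" for line in data.split("\n")) + "\n"


def interleaving_log(
    wrap: Callable[[Iterator[str]], Iterable[str]], n: int = 3
) -> List[str]:
    """Record the order in which events are produced and consumed.

    `wrap` stands in for whatever layer sits between the event source
    and the client (hypothetical; substitute the real pipeline).
    """
    log: List[str] = []

    def source() -> Iterator[str]:
        for i in range(n):
            log.append(f"produced {i}")
            yield format_sse(f"token {i}")

    for i, _ in enumerate(wrap(source())):
        log.append(f"consumed {i}")
    return log


def streams_incrementally(
    wrap: Callable[[Iterator[str]], Iterable[str]], n: int = 3
) -> bool:
    """True if every event is consumed before the next one is produced."""
    expected: List[str] = []
    for i in range(n):
        expected += [f"produced {i}", f"consumed {i}"]
    return interleaving_log(wrap, n) == expected


def passthrough(events: Iterator[str]) -> Iterator[str]:
    """A well-behaved layer: forwards each event as soon as it arrives."""
    for event in events:
        yield event


def buffered(events: Iterator[str]) -> List[str]:
    """A misbehaving layer: drains the whole source before returning."""
    return list(events)


if __name__ == "__main__":
    print(streams_incrementally(passthrough))
    print(streams_incrementally(buffered))
```

The well-behaved layer passes the check and the buffering one fails it. A check like this only catches buffering inside the Python pipeline itself; buffering introduced by a server or proxy further downstream would need an end-to-end test.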
What's annoying is that Wilmer (running GLM 4.6 + qwen3-30b-a3b + magistral 24b) and Gemini 2.5 Pro noticed these immediately, despite Claude not noticing at all. Not that I can say anything; I didn't notice either lol.
I feel like a big pain point is the context window; it seems like the agent is losing track of important things.
Agentic coding is definitely a different beast. It could be that I'm still just used to using the chat AI for coding; I'm lightning fast with it, and so I'm likely feeling a tendency to want to reject something new and different that is currently slowing me down BECAUSE it's new and different. But it is frustrating...