aka how I finally stopped complaining about linter errors at 2am... mostly
Ok So Here's the Thing
It's 3am. I'm so tired. Probably on my fifth coffee at this point, maybe sixth, I lost count. Eyes burning from staring at the screen. And I've been fighting with the SAME rubocop error for like an hour now.
You know the cycle, right? Run rubocop. 3 errors. Fix them. Run again. Now it's 2 different errors. Fix those. Run again. 5 ERRORS. HOW. I literally just fixed things and now there's MORE???
(I once spent 4 hours debugging something and it turned out I had a typo in a variable name. Four. Hours. I don't wanna talk about it.)
Anyway, so I'm sitting there at 3am, questioning my life choices, and I'm thinking - wait. Didn't I become a developer to BUILD cool stuff? Not to fight with semicolons and copy-paste Stack Overflow answers about "unexpected end of input"?
That's basically why I made Buildmate. I got tired of doing everything myself lol.
What Is This Exactly
Ok so like... you know how in a real company you've got different people doing different things:
- There's a developer who writes the code (and googles stuff constantly)
- A tester who breaks everything (on purpose... hopefully)
- A code reviewer who points out you forgot null checks again
- Some PM who keeps everyone from losing their minds
- A security person who tells you everything is wrong
Buildmate gives you all of these. But as AI agents. That actually talk to each other and do stuff while you go get lunch or whatever.
Me: "/pm Build user authentication with OAuth"
Buildmate: "got it"
→ spawns a backend dev (writes the Rails code)
→ spawns frontend dev (does the React stuff)
→ spawns testers (writes tests, runs them)
→ runs linting in a loop until everything passes
→ actually RUNS the code to verify it works
→ fixes its own mistakes automatically
→ spawns reviewers (finds my mistakes)
Me: *drinking coffee* "nice"
I know it sounds wild. When I first got this working I didn't believe it either.
The Linter Loop From Hell (And How I Fixed It)
We've ALL been here:
$ rubocop
3 offenses detected
*fixes them carefully*
$ rubocop
2 offenses detected (different ones??)
*fixes those too*
$ rubocop
5 offenses detected
*stares at screen*
*questions existence*
I swear I've lost years of my life to this.
The fix: There's this thing called the Grind Agent. It runs your linters in a loop and fixes stuff automatically. Until everything's green. Or until it gives up after 10 tries.
Grind Agent:
Iteration 1: rubocop → 3 errors → fixing...
Iteration 2: rubocop → 0 errors, nice
Iteration 3: rspec → 2 failures → fixing...
Iteration 4: rspec → all green
Status: CONVERGED
Fixed:
- app/models/user.rb:15 - frozen_string_literal
- spec/models/user_spec.rb:28 - nil vs empty string
You didn't do anything. It handled it.
"LGTM" Isnt Really a Code Review
Be real. When your tired and the PR has 47 files, "Looks Good To Me" really means "I looked at maybe 3 files and gave up."
We all do it. Its human nature.
The fix: Theres an Eval Agent that scores your code with actual grades.
| Thing | Weight | Score |
|---|---|---|
| Correctness | 30% | 0.85 |
| Code Quality | 25% | 0.90 |
| Security | 20% | 0.95 |
| Performance | 15% | 0.80 |
| Test Coverage | 10% | 0.75 |
| Final | 100% | 0.87 (B) |
Actual numbers. No more arguing about tabs vs spaces. It tells you stuff like "line 47 might have an N+1 query" or "this method is doing 5 things, split it up."
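For the curious, rolling a rubric like that up into one grade is just a weighted average. Here's a sketch - the weights match the table above, but the letter-grade thresholds are my own assumption, not Buildmate's actual scoring code:

```python
# Illustrative weighted-rubric rollup. Weights mirror the table above;
# grade cutoffs (0.9/0.8/0.7) are assumed, not Buildmate's real rubric.
WEIGHTS = {
    "correctness": 0.30,
    "code_quality": 0.25,
    "security": 0.20,
    "performance": 0.15,
    "test_coverage": 0.10,
}

def final_grade(scores):
    total = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
    letter = ("A" if total >= 0.9 else
              "B" if total >= 0.8 else
              "C" if total >= 0.7 else "D")
    return total, letter

scores = {"correctness": 0.85, "code_quality": 0.90, "security": 0.95,
          "performance": 0.80, "test_coverage": 0.75}
# weighted total is 0.865, which displays as 0.87 -> grade B
```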
It Actually Tests Itself (This Is the Cool Part)
Most AI coding tools just write code and hope it works. You run it, it crashes, you spend 30 minutes debugging.
I got tired of that too.
So there's /verify. It actually RUNS your code:
Backend: Starts dev server, makes real HTTP requests, validates responses
Frontend: Opens a browser, navigates to your page, takes screenshots, checks if components render, looks for console errors
But here's the best part. If something fails? It fixes itself.
[Verification] Creating HeroSection...
[Testing]
- Starting dev server... ✓
- Looking for .hero-section...
- Component NOT FOUND ✗
[Analyzing]
- Component exists but not exported
- Adding export...
[Retry 1/3]
- Component found ✓
- No console errors ✓
Verification passed after 1 fix.
It built the component. Tested it. Found it wasn't exported. FIXED it. Tested again. All by itself.
Backend too:
[Verification] POST /api/users
- Making request...
- Status: 500 ✗
[Analyzing]
- Missing user_params method
- Adding to controller...
[Retry 1/3]
- Status: 201 Created ✓
Verification passed after 1 fix.
The stupid mistakes that take 20 minutes to debug? Gone.
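The pattern behind both examples is the same check-analyze-fix-retry loop. A minimal sketch, with `check` and `fix` standing in for the real HTTP/browser checks and the AI fix step (those names are mine, not Buildmate's API):

```python
# Toy sketch of a verify-with-retries loop. `check` returns None on
# success or an error description; `fix` gets one attempt per retry.
# In Buildmate the check is a real HTTP request or browser session and
# the fix is an AI agent; here both are plain callables.
def verify_with_retries(check, fix, max_retries=3):
    error = check()
    if error is None:
        return "Verification passed with no fixes."
    for attempt in range(1, max_retries + 1):
        fix(error)              # e.g. add the missing export / method
        error = check()
        if error is None:
            return f"Verification passed after {attempt} fix(es)."
    return f"Verification FAILED: {error}"
```

The cap on retries matters: without it, a fix that never converges would loop forever, so after 3 failed attempts it surfaces the error to you instead.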
Security (Actually Important)
There's a Security Auditor agent. It checks OWASP stuff - injection attacks, auth problems, XSS, CSRF, etc.
## Security Report
Found: 2 issues
1. [MEDIUM] Possible SQL injection
File: app/models/search.rb:45
Fix: use parameterized query
2. [LOW] No rate limiting on login
Suggestion: add rack-attack
Verdict: PASS WITH WARNINGS
I used to forget to check for this. Now it just happens.
How to Get Started
git clone https://github.com/vadim7j7/buildmate.git
cd buildmate
python3 -m venv .venv && .venv/bin/pip install -e .
# Bootstrap your project
buildmate rails /path/to/my-app
buildmate nextjs /path/to/app --ui=tailwind
buildmate rails+nextjs /path/to/app # fullstack
# Or use a preset
buildmate --profile saas /path/to/app
Then:
/pm "Build user authentication"
And watch it work.
Commands Cheat Sheet
| Command | What it does |
|---|---|
| `/pm "thing"` | Full workflow - plan, build, test, review |
| `/verify` | Actually runs your code to test it |
| `/verify --component Hero` | Test a specific component |
| `/verify --endpoint /api/users` | Test a specific endpoint |
| `/parallel "a" "b"` | Do multiple things at the same time |
| `/new-model Name` | Create model + migration + spec + factory |
| `/new-component Name` | Create component + test |
| `/security` | Security audit |
| `/eval` | Score the code |
The /verify one is new and honestly it's my favorite now.
FAQ
"What if the AI makes mistakes?"
It will. But:
- Verify agent RUNS your code and catches runtime errors
- If something breaks, it fixes automatically (up to 3 times)
- Grind agent catches lint/type errors
- Eval agent scores it so you know if it's good
Most AI tools just write code and pray. This one tests it.
"Will it work with my project?"
Yeah. It just creates a .claude/ folder. Doesn't touch your code unless you ask.
"What frameworks?"
Rails, Next.js, FastAPI, React Native. More coming.
Coming Soon: Website Cloning
Working on something kinda crazy:
/analyze-site https://some-cool-website.com
/clone-page https://some-cool-website.com/pricing
It will look at any website, extract the design, and generate YOUR code using YOUR UI library. See a landing page you like? Clone it.
Will write a whole post about it when it's ready. Follow me so you don't miss it.
Try It
git clone https://github.com/vadim7j7/buildmate.git
cd buildmate
python3 -m venv .venv && .venv/bin/pip install -e .
buildmate rails /path/to/your-app
/pm "Build something cool"
Takes like 2 minutes.
Support This Thing
If Buildmate saved you from a 3am debugging session - maybe consider buying me a coffee?
Links
- The Practical Guide: https://dev.to/vadim7j7/how-to-actually-use-buildmate-the-practical-guide-2if2
- GitHub: github.com/vadim7j7/buildmate
- Buy Me A Coffee: buymeacoffee.com/vadim7j7
Star the repo if you like it. And open issues if something breaks, I actually read those.
Built late at night with way too much coffee
- Vadim






Top comments (10)
The 3am rubocop cycle is painfully accurate lol. Run linter, fix 3 errors, run again, now 5 new ones. Been there too many times. I built VibeCheck, TellMeMo, and a couple other products mostly solo and yeah, juggling dev/tester/security/PM roles gets exhausting fast. Curious how Buildmate handles context switching between agents though, like does the PM agent actually understand what the security agent flagged? Biggest challenge I hit with multi-agent setups is they lose the thread when handing off work.
Thank you for reading!
Good question, I actually ran into the same context-loss issue early on.
So basically agents don't pass stuff through returns (that's where things get lost). They write to files instead - security dumps its findings to .agent-pipeline/security.md, eval writes scores, etc. Then the orchestrator just... reads all of it before deciding what to do next. Like if security finds SQL injection at line 45 somewhere, the PM sees that, tells the grind agent "hey fix this", then reruns the audit after. It doesn't move forward until it's clean.
There's also a feature file that keeps track of everything - kinda like the source of truth for the whole thing. File-based state, basically. Stuff doesn't disappear between handoffs.
But there's still a small problem with using PM (orchestrator): when you continue communicating with Claude Code, the orchestrator might be skipped. That's why I always add "Use PM: ...".
Oh that's smart, using the filesystem as the state layer. I've been doing something similar where the agent writes everything to specific directories and then reads from them, but I like how you've got that feature file as source of truth. The orchestrator skip thing is real though - I ran into that exact issue where continuing a chat would bypass the whole pipeline and agents would just start doing random stuff without the context from previous steps. Your "Use PM:" prefix is basically forcing the routing which makes sense. How do you handle the case where the PM agent itself makes a mistake and kicks off the wrong agent? Does it have a way to self-correct or do you just intervene manually?
Hmm, honestly I haven't run into this much since I fixed the orchestrator a while back. Now it creates a plan with all the delegations mapped out and you get to approve it before anything starts. So you see exactly which agent does what before it kicks off.
I did have delegation issues early on tho - agents getting tasks they shouldn't, that kinda thing. Spent some time fixing the orchestrator to route stuff correctly. Now it pretty much always picks the right agent for the job.
The plan approval step is a solid pattern. I ended up doing something similar - making agents dump their intended actions to a manifest file first so you can review before execution. The early days of agents just doing whatever they want were... chaotic lol. Curious though, do you version the plans at all? Like if an agent fails mid-execution do you roll back to the plan or just re-run?
Oh that's a good idea actually. I haven't done proper versioning to be honest - felt like it might be overkill?
What it does now is track features and status in .claude/context/features/ - what's done, what's in progress, what failed. But git already tracks everything anyway. So if something goes wrong you can just roll back with commits. Adding another versioning layer on top felt like duplicating what git already does. If something breaks mid-execution you just ask it to roll back the changes or check out the previous commit.
yeah honestly leaning on git for versioning makes total sense, why build another layer when commits already give you the full history. I think the only case where plan-level versioning matters is when the plan itself changes - like you start with "build auth" and midway realize you need to pivot to "build auth + rate limiting" and now half the features context is stale.
but tracking features and status in .claude/context/features/ is pretty clever. we do something similar where we keep a manifest of what the agent touched so if it goes sideways you know exactly which files to look at instead of diffing the entire repo.
the "just ask it to rollback" approach is funny because it works surprisingly well until it doesn't. ever had it confidently roll back to the wrong commit?
Hi Vadim, can you explain what you can test with your setup? Is it only for Python, or does this apply to JavaScript and other languages? Thanks, Hal
Hey Halille! Works with Rails (Ruby), Next.js, React Native (JavaScript), and FastAPI (Python). Adding more soon.
Python is only needed to run the CLI itself - it just generates the config files. Your actual project can be any supported stack like rails+nextjs or fastapi+react-native. Check available stacks with `buildmate --list`, profiles with `buildmate --profiles`, or options for a stack with `buildmate --options nextjs`. You will find instructions on how to use it in my next article: dev.to/vadim7j7/how-to-actually-us...
You can also start fresh with a new project:
buildmate nextjs ~/my-app --ui=tailwind
This creates a folder with agents, skills, patterns - everything configured for Next.js + Tailwind. Then just run `claude` and ask what you need:
Use PM: Build a landing page with hero section, pricing cards, and contact form
or
/pm Build a landing page with hero section, pricing cards, and contact form