Vadim Karnopelev
I Got Tired of Being a One-Man Dev Team (So I Built This Thing)

aka how I finally stopped rage-complaining about linter errors at 2am... mostly


Ok So Here's the Thing

Its 3am. I'm so tired. Probably on my fifth coffee at this point, maybe sixth idk I lost count. Eyes are burning from staring at the screen. And I've been fighting with the SAME rubocop error for like an hour now.

You know the cycle right? Run rubocop. 3 errors. Fix them. Run again. Now its 2 different errors. Fix those. Run again. 5 ERRORS. HOW. I literally just fixed things and now theres MORE???

(I once spent 4 hours debugging something and it turned out I had a typo in a variable name. Four. Hours. I dont wanna talk about it.)

Anyway so I'm sitting there at 3am, questioning my life choices, and I'm thinking - wait. Didnt I become a developer to BUILD cool stuff? Not to fight with semicolons and copy paste Stack Overflow answers about "unexpected end of input"?

Thats basically why I made Buildmate. I got tired of doing everything myself lol.


What Is This Exactly

Ok so like... you know how in a real company you've got different people doing different things:

  • There's a developer who writes the code (and googles stuff constantly)
  • A tester who breaks everything (on purpose... hopefully)
  • Code reviewer who points out you forgot null checks again
  • Some PM who keeps everyone from losing their minds
  • Security person who tells you everything is wrong

Buildmate gives you all of these. But as AI agents. That actually talk to each other and do stuff while you go get lunch or whatever.

```
Me: "/pm Build user authentication with OAuth"

Buildmate: "got it"

  → spawns a backend dev (writes the Rails code)
  → spawns a frontend dev (does the React stuff)
  → spawns testers (write tests, run them)
  → runs linting in a loop until everything passes
  → actually RUNS the code to verify it works
  → fixes its own mistakes automatically
  → spawns reviewers (find my mistakes)

Me: *drinking coffee* "nice"
```

I know it sounds wild. When I first got this working I didn't believe it either.


The Linter Loop From Hell (And How I Fixed It)

We've ALL been here:

```
$ rubocop
3 offenses detected

*fixes them carefully*

$ rubocop
2 offenses detected (different ones??)

*fixes those too*

$ rubocop
5 offenses detected

*stares at screen*
*questions existence*
```

I swear I've lost years of my life to this.

The fix: There's this thing called the Grind Agent. It runs your linters in a loop and fixes stuff automatically until everything's green, or until it gives up after 10 tries.

```
Grind Agent:
  Iteration 1: rubocop → 3 errors → fixing...
  Iteration 2: rubocop → 0 errors, nice
  Iteration 3: rspec → 2 failures → fixing...
  Iteration 4: rspec → all green

  Status: CONVERGED

  Fixed:
  - app/models/user.rb:15 - frozen_string_literal
  - spec/models/user_spec.rb:28 - nil vs empty string
```

You didn't do anything. It handled it.
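The convergence loop itself is simple in principle. Here's a minimal sketch of the idea in Python (hypothetical names, not Buildmate's actual internals), assuming each check is just a shell command and `fix` is whatever repair step you can automate:

```python
import subprocess

MAX_ITERATIONS = 10  # the Grind Agent's give-up budget

def run_check(cmd):
    """Return True if the check (linter / test runner) exits cleanly."""
    return subprocess.run(cmd, capture_output=True).returncode == 0

def grind(checks, fix):
    """Re-run every check, calling `fix` on each failure, until all pass."""
    for iteration in range(1, MAX_ITERATIONS + 1):
        failing = [cmd for cmd in checks if not run_check(cmd)]
        if not failing:
            print(f"Iteration {iteration}: CONVERGED")
            return True
        for cmd in failing:
            fix(cmd)  # e.g. run `rubocop -a`, or hand the output to an agent
    return False  # did not converge within the budget
```

The interesting part in practice is the `fix` step, but the bounded loop is what stops the "fix 3, get 5" cycle from eating your night: it either converges or tells you it couldn't.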


"LGTM" Isn't Really a Code Review

Be real. When you're tired and the PR has 47 files, "Looks Good To Me" really means "I looked at maybe 3 files and gave up."

We all do it. It's human nature.

The fix: There's an Eval Agent that scores your code with actual grades.

| Thing | Weight | Score |
| --- | --- | --- |
| Correctness | 30% | 0.85 |
| Code Quality | 25% | 0.90 |
| Security | 20% | 0.95 |
| Performance | 15% | 0.80 |
| Test Coverage | 10% | 0.75 |
| **Final** | | **0.87 (B)** |

Actual numbers. No more arguing about tabs vs spaces. It tells you stuff like "line 47 might have an N+1 query" or "this method is doing 5 things, split it up."
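The final grade is just a weighted average, so you can sanity-check the table yourself (weights and scores taken from the example above; the letter-grade cutoff is whatever scale the Eval Agent uses):

```python
# Weights and per-category scores from the table above
weights = {"Correctness": 0.30, "Code Quality": 0.25, "Security": 0.20,
           "Performance": 0.15, "Test Coverage": 0.10}
scores  = {"Correctness": 0.85, "Code Quality": 0.90, "Security": 0.95,
           "Performance": 0.80, "Test Coverage": 0.75}

final = sum(weights[k] * scores[k] for k in weights)
print(final)  # 0.865 up to float rounding, i.e. the 0.87 (B) in the table
```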


It Actually Tests Itself (This Is the Cool Part)

Most AI coding tools just write code and hope it works. You run it, it crashes, you spend 30 minutes debugging.

I got tired of that too.

So there's /verify. It actually RUNS your code:

Backend: Starts dev server, makes real HTTP requests, validates responses

Frontend: Opens a browser, navigates to your page, takes screenshots, checks if components render, looks for console errors

But here's the best part: if something fails, it fixes itself.

```
[Verification] Creating HeroSection...

[Testing]
- Starting dev server... ✓
- Looking for .hero-section...
- Component NOT FOUND ✗

[Analyzing]
- Component exists but not exported
- Adding export...

[Retry 1/3]
- Component found ✓
- No console errors ✓

Verification passed after 1 fix.
```

It built the component. Tested it. Found it wasn't exported. FIXED it. Tested again. All by itself.

Backend too:

```
[Verification] POST /api/users

- Making request...
- Status: 500 ✗

[Analyzing]
- Missing user_params method
- Adding to controller...

[Retry 1/3]
- Status: 201 Created ✓

Verification passed after 1 fix.
```

The stupid mistakes that take 20 minutes to debug? Gone.
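Stripped of the agent magic, the verify-analyze-fix cycle is a bounded retry loop. A rough sketch of the pattern (hypothetical function names, not Buildmate's real code), where `check` is any real probe — an HTTP request, a headless-browser render — and `diagnose_and_fix` is the repair attempt:

```python
MAX_RETRIES = 3  # matches the [Retry x/3] budget in the transcripts above

def verify(check, diagnose_and_fix):
    """Probe with `check`; on failure, attempt a fix and retry up to MAX_RETRIES."""
    ok, detail = check()           # e.g. POST /api/users and inspect the status
    fixes = 0
    while not ok and fixes < MAX_RETRIES:
        diagnose_and_fix(detail)   # e.g. add the missing user_params method
        fixes += 1
        ok, detail = check()       # re-probe against the running app
    if ok and fixes:
        print(f"Verification passed after {fixes} fix(es).")
    return ok                      # False means: give up and escalate to the human
```

The key design choice is that `check` hits the *running* application rather than re-reading the source, so "it compiles" can never be mistaken for "it works".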


Security (Actually Important)

There's a Security Auditor agent. It checks OWASP stuff: injection attacks, auth problems, XSS, CSRF, etc.

```
## Security Report

Found: 2 issues

1. [MEDIUM] Possible SQL injection
   File: app/models/search.rb:45
   Fix: use parameterized query

2. [LOW] No rate limiting on login
   Suggestion: add rack-attack

Verdict: PASS WITH WARNINGS
```

I used to forget to check for this. Now it just happens.
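For context on that first finding: "use parameterized query" means never interpolating user input into SQL text. The flagged file is Ruby, but the principle is language-agnostic; here it is sketched with Python's stdlib sqlite3 (in Rails you'd reach for ActiveRecord's bound parameters instead):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

term = "alice' OR '1'='1"  # hostile search input

# BAD: interpolating user input lets it rewrite the query:
#   conn.execute(f"SELECT * FROM users WHERE name = '{term}'")  # returns every row

# GOOD: a bound parameter is always treated as data, never as SQL
rows = conn.execute("SELECT * FROM users WHERE name = ?", (term,)).fetchall()
print(rows)  # [] - the injection attempt matches no real name
```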


How to Get Started

```shell
git clone https://github.com/vadim7j7/buildmate.git
cd buildmate
python3 -m venv .venv && .venv/bin/pip install -e .

# Bootstrap your project
buildmate rails /path/to/my-app
buildmate nextjs /path/to/app --ui=tailwind
buildmate rails+nextjs /path/to/app  # fullstack

# Or use a preset
buildmate --profile saas /path/to/app
```

Then:

```
/pm "Build user authentication"
```

And watch it work.


Commands Cheat Sheet

| Command | What it does |
| --- | --- |
| `/pm "thing"` | Full workflow: plan, build, test, review |
| `/verify` | Actually runs your code to test it |
| `/verify --component Hero` | Test a specific component |
| `/verify --endpoint /api/users` | Test a specific endpoint |
| `/parallel "a" "b"` | Do multiple things at the same time |
| `/new-model Name` | Create model + migration + spec + factory |
| `/new-component Name` | Create component + test |
| `/security` | Security audit |
| `/eval` | Score the code |

The /verify one is new and honestly it's my favorite now.


FAQ

"What if the AI makes mistakes?"

It will. But:

  • Verify agent RUNS your code and catches runtime errors
  • If something breaks, it fixes automatically (up to 3 times)
  • Grind agent catches lint/type errors
  • Eval agent scores it so you know if it's good

Most AI tools just write code and pray. This one tests it.

"Will it work with my project?"

Yeah. It just creates a .claude/ folder. It doesn't touch your code unless you ask.

"What frameworks?"

Rails, Next.js, FastAPI, React Native. More coming.


Coming Soon: Website Cloning

Working on something kinda crazy:

```
/analyze-site https://some-cool-website.com
/clone-page https://some-cool-website.com/pricing
```

It will look at any website, extract the design, and generate YOUR code using YOUR UI library. See a landing page you like? Clone it.

I'll write a whole post about it when it's ready. Follow me so you don't miss it.


Try It

```
git clone https://github.com/vadim7j7/buildmate.git
cd buildmate
python3 -m venv .venv && .venv/bin/pip install -e .

buildmate rails /path/to/your-app
/pm "Build something cool"
```

Takes like 2 minutes.


Support This Thing

If Buildmate saved you from a 3am debugging session, maybe consider buying me a coffee?

☕ Buy Me A Coffee


Links

Star the repo if you like it. And open issues if something breaks, I actually read those.


Built late at night with way too much coffee

- Vadim

Top comments (10)

Mykola Kondratiuk

The 3am rubocop cycle is painfully accurate lol. Run linter, fix 3 errors, run again, now 5 new ones. Been there too many times. I built VibeCheck, TellMeMo, and a couple other products mostly solo and yeah, juggling dev/tester/security/PM roles gets exhausting fast. Curious how Buildmate handles context switching between agents though, like does the PM agent actually understand what the security agent flagged? Biggest challenge I hit with multi-agent setups is they lose the thread when handing off work.

Vadim Karnopelev

Thank you for reading!

Good question, I actually ran into the same context-loss issue early on.

So basically agents don't pass stuff through returns (that's where things get lost). They write to files instead: security dumps its findings to .agent-pipeline/security.md, eval writes scores, etc. Then the orchestrator just... reads all of it before deciding what to do next.

Like if security finds SQL injection at line 45 somewhere, the PM sees that, tells the grind agent "hey, fix this", then reruns the audit after. It doesn't move forward until it's clean.

There's also a feature file that keeps track of everything - kinda like the source of truth for the whole thing. File-based state, basically. Stuff doesn't disappear between handoffs.
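The pattern itself is tiny - roughly this shape (a simplified sketch, not the actual Buildmate code; the paths and names are illustrative):

```python
from pathlib import Path

PIPELINE = Path(".agent-pipeline")  # shared scratch dir, one report file per agent

def write_findings(agent, findings):
    """An agent dumps its output to a well-known file instead of returning it."""
    PIPELINE.mkdir(exist_ok=True)
    (PIPELINE / f"{agent}.md").write_text(findings)

def gather_context():
    """The orchestrator re-reads every report before deciding the next step."""
    return "\n\n".join(p.read_text() for p in sorted(PIPELINE.glob("*.md")))

write_findings("security", "[MEDIUM] Possible SQL injection, app/models/search.rb:45")
print(gather_context())
```

Because the state lives on disk rather than in any one agent's conversation, a handoff can't silently drop context - the next agent just reads the files again.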

But there's still a small problem with using the PM (orchestrator): when you continue chatting with Claude Code, the orchestrator might be skipped. That's why I always add "Use PM: ...".

Mykola Kondratiuk

Oh that's smart, using the filesystem as the state layer. I've been doing something similar where the agent writes everything to specific directories and then reads from them, but I like how you've got that feature file as source of truth. The orchestrator skip thing is real though - I ran into that exact issue where continuing a chat would bypass the whole pipeline and agents would just start doing random stuff without the context from previous steps. Your "Use PM:" prefix is basically forcing the routing which makes sense. How do you handle the case where the PM agent itself makes a mistake and kicks off the wrong agent? Does it have a way to self-correct or do you just intervene manually?

Vadim Karnopelev

Hmm, honestly I haven't run into this much since I fixed the orchestrator a while back. Now it creates a plan with all the delegations mapped out, and you approve it before anything starts. So you see exactly which agent does what before it kicks off.

I did have delegation issues early on tho - agents getting tasks they shouldn't, that kinda thing. Spent some time fixing the orchestrator to route stuff correctly. Now it pretty much always picks the right agent for the job.

Mykola Kondratiuk

The plan approval step is a solid pattern. I ended up doing something similar - making agents dump their intended actions to a manifest file first so you can review before execution. The early days of agents just doing whatever they want were... chaotic lol. Curious though, do you version the plans at all? Like if an agent fails mid-execution do you roll back to the plan or just re-run?

Vadim Karnopelev

Oh that's a good idea actually. I haven't done proper versioning, to be honest - felt like it might be overkill?

What it does now is track features and status in .claude/context/features/ - what's done, what's in progress, what failed. But git already tracks everything anyway. So if something goes wrong you can just roll back with commits. Adding another versioning layer on top felt like duplicating what git already does.

If something breaks mid-execution you just ask it to rollback the changes or checkout the previous commit.

Mykola Kondratiuk

yeah honestly leaning on git for versioning makes total sense, why build another layer when commits already give you the full history. I think the only case where plan-level versioning matters is when the plan itself changes - like you start with "build auth" and midway realize you need to pivot to "build auth + rate limiting" and now half the features context is stale.

but tracking features and status in .claude/context/features/ is pretty clever. we do something similar where we keep a manifest of what the agent touched so if it goes sideways you know exactly which files to look at instead of diffing the entire repo.

the "just ask it to rollback" approach is funny because it works surprisingly well until it doesn't. ever had it confidently roll back to the wrong commit?

Halille Azami

Hi Vadim, can you explain what you can test with your setup? Is it only for Python, or does this apply to JavaScript and other languages? Thanks, Hal

Vadim Karnopelev • Edited

Hey Halille! Works with Rails (Ruby), Next.js, React Native (JavaScript), and FastAPI (Python). Adding more soon.

Python is only needed to run the CLI itself - it just generates the config files. Your actual project can be any supported stack like rails+nextjs or fastapi+react-native. Check available stacks with buildmate --list, profiles with buildmate --profiles, or options for a stack with buildmate --options nextjs.

You'll find instructions on how to use it in my next article: dev.to/vadim7j7/how-to-actually-us...

Vadim Karnopelev

You can also start fresh with a new project:
buildmate nextjs ~/my-app --ui=tailwind

This creates a folder with agents, skills, patterns - everything configured for Next.js + Tailwind. Then just run claude and ask what you need:
Use PM: Build a landing page with hero section, pricing cards, and contact form
or
/pm Build a landing page with hero section, pricing cards, and contact form