Firstly, excuse the AI-generated image. This is the Lone Ranger on a robot horse, apparently.
Anyhoo, a few days ago I wrote about the idea of the AI developer team of one — the notion that a single developer, working with modern AI coding tools, can operate more like a small engineering team than an individual contributor.
At the time that post was mostly theoretical. I had only just started building something to test the idea.
Since then I’ve been running the experiment properly.
The project I’ve been building is a small application designed to answer a surprisingly common business question:
“Will this invoice actually be accepted by the company I'm sending it to?”
It turns out that this question is more complicated than it sounds.
Different organisations have very specific requirements for invoices. Some require purchase orders in strict formats. Others require exact site names. Some ingest invoices via OCR, others via EDI or Peppol networks. When those rules aren't followed, invoices get rejected or delayed.
So the system I'm building tries to check invoices before they are sent and predict whether the buyer's systems will accept them.
The interesting part, though, isn't just the software itself.
It's how it was built.
The development experiment
Instead of building the application entirely by hand, I used AI coding tools as collaborators.
In practice my workflow looks something like this:
- Claude Code handles most implementation work
- ChatGPT helps with architecture, design thinking, documentation, and code review
- automated tests and type checking (run via GitHub) act as guardrails
Rather than typing every line of code myself, the process feels more like directing a small engineering team.
I describe the architecture or change I want.
The AI proposes an implementation plan.
I review it, adjust the direction if needed, and then let it execute.
The result feels surprisingly similar to working with a junior or mid-level developer: the AI does the bulk of the typing, while I focus on the structure of the system.
What the system does now
The application converts invoices into a canonical internal format, which allows invoices from different accounting systems to be processed consistently.
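As a rough illustration, the canonical format might look something like the sketch below. The field names and the CSV adapter are my own invention, not the project's actual model:

```typescript
// Hypothetical canonical invoice model; the real field names may differ.
interface CanonicalInvoiceLine {
  description: string;
  quantity: number;
  unitPrice: number; // in the invoice currency
}

interface CanonicalInvoice {
  supplierName: string;
  buyerName: string;
  invoiceNumber: string;
  purchaseOrder?: string; // some buyers require this
  siteName?: string;      // buyer site/location reference
  currency: string;       // ISO 4217 code, e.g. "NZD"
  lines: CanonicalInvoiceLine[];
}

// Each source system gets its own adapter into the canonical shape.
// Sketch for one imaginary CSV-based source:
function fromSimpleCsvRow(row: Record<string, string>): CanonicalInvoice {
  return {
    supplierName: row["supplier"],
    buyerName: row["buyer"],
    invoiceNumber: row["invoice_no"],
    purchaseOrder: row["po"] || undefined,
    currency: row["currency"] ?? "NZD",
    lines: [
      {
        description: row["description"],
        quantity: Number(row["qty"]),
        unitPrice: Number(row["unit_price"]),
      },
    ],
  };
}
```

The point of the canonical shape is that everything downstream (validation, scoring, simulation) only ever sees one invoice structure, no matter which accounting system it came from.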
Once the invoice is in that format, the system runs several layers of analysis.
First it validates the structure of the invoice to ensure the data is coherent. Then it applies buyer-specific rules — things like purchase order formatting or required fields.
After that it calculates a readiness score, which indicates how close the invoice is to meeting the buyer's requirements.
For example:
Invoice readiness: 82%
Errors: 1
Warnings: 2
Issues detected:
- Purchase order must be 10 digits
- Site name must match official buyer list
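Output like the example above could come from a rule engine along these lines. This is a sketch only; the rule definitions and function names are invented, not the project's actual engine:

```typescript
type Severity = "error" | "warning";

interface Issue { severity: Severity; message: string; }

interface BuyerRule {
  message: string;
  severity: Severity;
  // Returns true when the invoice passes this rule.
  check: (inv: { purchaseOrder?: string; siteName?: string }) => boolean;
}

// Example buyer profile (invented): PO must be exactly 10 digits,
// and the site name must appear on an official list.
const exampleBuyerRules: BuyerRule[] = [
  {
    message: "Purchase order must be 10 digits",
    severity: "error",
    check: (inv) => /^\d{10}$/.test(inv.purchaseOrder ?? ""),
  },
  {
    message: "Site name must match official buyer list",
    severity: "warning",
    check: (inv) => ["Main Warehouse", "Head Office"].includes(inv.siteName ?? ""),
  },
];

function assessReadiness(
  inv: { purchaseOrder?: string; siteName?: string },
  rules: BuyerRule[],
) {
  const issues: Issue[] = rules
    .filter((r) => !r.check(inv))
    .map((r) => ({ severity: r.severity, message: r.message }));
  const passed = rules.length - issues.length;
  const readiness =
    rules.length === 0 ? 100 : Math.round((passed / rules.length) * 100);
  return { readiness, issues };
}
```

Because each rule is a pure, deterministic predicate, the readiness score is reproducible: the same invoice against the same buyer profile always yields the same result.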
The goal is not just to say “valid” or “invalid,” but to give clear guidance on what needs fixing before the invoice is sent.
The system also runs a buyer acceptance simulation, which predicts whether the buyer's system is likely to accept the invoice automatically.
That step turns the tool from a simple validator into something closer to a buyer compliance engine.
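A first cut of that simulation might be as simple as treating any error-level issue as a likely automatic rejection. Purely illustrative, and certainly cruder than a real acceptance model:

```typescript
type Severity = "error" | "warning";

// Illustrative: any error-severity issue means the buyer's system
// would most likely reject the invoice automatically.
function simulateAcceptance(
  issues: { severity: Severity }[],
): "likely-accepted" | "likely-rejected" {
  return issues.some((i) => i.severity === "error")
    ? "likely-rejected"
    : "likely-accepted";
}
```

A real simulation would presumably weight issues by how the buyer ingests invoices (OCR tolerates more than strict EDI or Peppol validation), but the shape of the decision is the same.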
Guardrails matter more than prompts
One of the biggest lessons from this experiment is that good guardrails matter more than clever prompts.
It’s tempting to think that the secret to AI-assisted development is prompt engineering.
In reality, the most important factor has been putting constraints around the system.
Those guardrails include:
- strong TypeScript typing
- automated tests
- deterministic validation rules
- versioned invoice models
- audit logging
These constraints allow the AI to make large code changes safely.
If something breaks, the tests catch it immediately.
Without those guardrails, the system would drift quickly.
With them, the AI can move very quickly without losing control of the codebase.
How much time did this actually take?
One of the most interesting questions people have asked is how much time this project required.
So far, I’ve spent roughly 8–12 hours of my own time working on the project across a few sessions.
That includes:
- architecture thinking
- reviewing AI implementation plans
- adjusting prompts
- checking test results
- writing documentation
The code produced during that time is significantly more than I would normally write by hand in the same window.
How long would a traditional team take?
This is obviously a rough comparison, but it’s useful to think about.
The current stage of the project includes:
- canonical invoice modelling
- validation engine
- buyer rule system
- readiness scoring
- acceptance simulation
- artifact generation (Peppol XML, CSV, etc.)
- API endpoints
- automated tests
- documentation
Building the same functionality through a traditional development process might involve:
- a product owner
- a backend developer
- a frontend developer
- possibly a QA engineer
Even with a small team moving quickly, that work could easily represent 120–200 hours of engineering time.
Spread across a typical development cycle, that would likely mean three to six weeks of work before reaching the same level of functionality.
That difference doesn’t mean AI replaces developers.
But it does change the economics of experimentation.
Ideas that previously required weeks of engineering effort can now reach a working prototype in a matter of hours.
The most interesting direction the project is heading
The original goal was simply to validate invoices.
But something more interesting started to emerge while building the system.
As buyer rules are added — for example, Mitre 10 requirements, government invoicing rules, or Peppol constraints — the system starts to accumulate knowledge about how different buyers operate.
Over time this could become a shared registry of buyer requirements:
- purchase order formats
- accepted invoice structures
- known rejection patterns
- integration formats
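Such a registry could start as a simple keyed data structure. The buyer keys and profile contents below are purely illustrative, not real requirements:

```typescript
// Illustrative shape for a shared buyer-requirements registry.
interface BuyerProfile {
  poFormat?: RegExp;                // required purchase-order format, if any
  acceptedFormats: string[];        // e.g. ["peppol-bis-3", "csv", "pdf-ocr"]
  requiredFields: string[];         // fields the buyer's system insists on
  knownRejectionPatterns: string[]; // notes on failures seen in the wild
}

// Invented example entries; real profiles would be built up over time.
const registry = new Map<string, BuyerProfile>([
  ["example-retailer", {
    poFormat: /^\d{10}$/,
    acceptedFormats: ["pdf-ocr", "csv"],
    requiredFields: ["purchaseOrder", "siteName"],
    knownRejectionPatterns: ["PO missing leading zeros"],
  }],
  ["example-government-agency", {
    acceptedFormats: ["peppol-bis-3"],
    requiredFields: ["buyerReference"],
    knownRejectionPatterns: [],
  }],
]);

// Looking up a buyer's requirements before sending an invoice:
const profile = registry.get("example-retailer");
```

Each new buyer rule added for one user would then benefit every other user sending to the same buyer, which is where the network effect of a shared registry comes from.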
At that point the product stops being just an invoice validator.
It becomes a buyer compliance knowledge system.
What this experiment changed for me
The biggest change isn’t just productivity.
It’s how development feels.
Instead of spending most of the time typing code, a lot of the work becomes:
- designing the architecture
- defining constraints
- reviewing implementation plans
- steering the system toward the correct design
The role becomes closer to directing an engineering team than acting as a single developer.
And surprisingly, that workflow works.
What happens next
At this point the focus shifts from building features to something more important:
showing the system to real users.
Because ultimately software isn’t valuable because of how it’s built.
It’s valuable if it solves a real problem.
In this case, the question is simple:
Can a system like this help businesses avoid invoice rejections and get paid faster?
That’s my next experiment.