Every day is your first day
Allow me to borrow your mind for a second for a thought experiment.
If you're starting a new job contributing to a codebase, would you read every line in existence before pushing a change? Or every line in the module you are changing? Or every line of the function that you're supposed to alter?
The degree of precision required will vary depending on your task, of course, but I don't think many of us have the luxury of reading every line that exists in the project before getting started on a change.
Now, what if I told you that the code was written by the best software engineer in existence? Would that ease your mind? What if it was all written by an intern, or what if it was all AI-generated?
Hopefully, by now, you're starting to catch my drift... We've all seen horrible legacy codebases, intern code, and things we wish we could forget. But that's not the end of the world, because...
Legacy Code Runs the World
Legacy code is a good problem to have. That may not resonate with many of you, but having legacy code means that the company has survived long enough to deal with code that has fallen apart over time. And if you're tasked with migrating it or altering it, it probably means that there's still value to be extracted.
We've all heard the tales of COBOL running on a mainframe to this day in many banks and Fortune 500 companies... and while one could speculate about their eventual downfall, the size of their revenue is undeniable.
The Self-Driving Car Problem
Let's hop into another parallel for a second to illustrate a perception problem we might face. If you were given a fully autonomous car that you knew was just as safe as your driving, would you let it drive you around? What if it was ten times safer, or even a hundred times?
For contrast, what if I proposed to replace all other cars on the road with self-driving cars that are proven to be 25% safer than the average driver? Would you agree to drive in those conditions?
I don't know what your answers are, and they might be very rational... but we have a human bias to overestimate how much control we have over the risks we can affect through our behaviour. Therefore, it's quite possible that most people would refuse to be driven by an autonomous vehicle, even if they were rationally convinced that it would drive as safely as they personally would.
So... the question is then, would you let an LLM code for you?
LLM Code Is Just Code You Didn't Write
That brings us to my first real proposal. And if you're on the fence about LLMs, maybe you'll just have to try it out for yourself. Give one of the latest and best coding models a self-contained task in your codebase. Plan with it, and work with it to clarify the requirements. Then, once it has produced a first version of the code, explain your coding standards and test requirements, and ask it to review itself. Finally, let it open a pull request, and see for yourself. Would you be mad if an intern wrote this? A junior engineer? One of your peers?
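To make the shape of that workflow concrete, here is a minimal sketch of it in Python. Everything in it is hypothetical: `ask_model` is a stand-in for whatever client you use to talk to your coding model, and the canned responses exist only so the structure is runnable. The point is the four distinct passes, not the plumbing.

```python
def ask_model(prompt: str) -> str:
    """Stand-in for a real LLM call (hypothetical; swap in your own client)."""
    return f"[model response to: {prompt[:40]}...]"


def assisted_change(task: str, standards: str) -> list[str]:
    """Walk one self-contained task through the plan/draft/review/PR loop."""
    transcript = []
    # 1. Plan with the model and clarify requirements before any code exists.
    transcript.append(ask_model(f"Plan this change and ask clarifying questions: {task}"))
    # 2. Let it produce a first version of the code.
    transcript.append(ask_model(f"Implement the agreed plan for: {task}"))
    # 3. Hand it the house rules and ask it to review its own draft.
    transcript.append(ask_model(f"Review your draft against these standards: {standards}"))
    # 4. Only then let it package the change as a pull request for human review.
    transcript.append(ask_model("Open a pull request summarising the change"))
    return transcript


steps = assisted_change(
    "add retry logic to the payment client",
    "PEP 8, type hints, tests for every branch",
)
print(len(steps))  # four distinct passes: plan, draft, self-review, PR
```

The separation matters more than the tooling: keeping planning, drafting, and self-review as distinct turns is what makes the eventual pull request reviewable like any other colleague's work.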
You'd be right if you felt that this was a bit mechanical; after all, you had to have a conversation about requirements, and also needed to explain coding and test standards. But don't worry, that can also be automated later on, and I'll explain how in later articles.
So far we've just covered common thought experiments about legacy code and AI... but if you're having trouble letting go of controlling every line of code you're writing, you're not alone. I refused to use code generation much, until a very tight deadline came across my desk, and in a last-ditch effort, I spun up Claude Sonnet 4.5 and worked a few late nights... and the backlog disappeared.
Reviewing it, I realised I wouldn't be terribly disappointed if a junior or mid-level engineer wrote a PR like Claude Code did. Realistically, it was not too bad of an effort compared to what I could do at 9pm. And thus, the project was back on track, and I was a convert.
The Real Questions
After this experiment, hopefully you'll have seen some passable code being generated, like I did, and you'll be wondering what's next. After all, this might have just been an easy task, and the code might not be as good as what you'd have written anyway.
Then we have to ask, could the output be as good as yours or mine? Well, probably, given an unlimited amount of time iterating over the code and gathering feedback.
But then, can we go much faster with those new tools, without losing much quality?
I think so: it's just that you don't have to read every line.
You'll have to ask yourself some difficult questions. For example, if you're dealing with legacy code, would you always rewrite the entirety of the solution for each migration? That would often be a waste of time. Therefore, this new tooling requires us to be a bit more creative in the way we validate the output.
Fundamentally, at some point, it doesn't matter who wrote it. It matters that it works. But how do we know that it works? Well, that's where I will share with you some of my experience dealing with legacy code... but that will be for another post.
What's coming
I hope that this has been a pleasant introduction to the challenges that one might face when coding with LLMs. It may be a bit basic for some, and I'm sure you've thought some of those things, but now that we're all on the same page, here are a few things you can expect to see in the next few weeks:
- Spot problems before reading any code
- Read code faster and catch more issues
- Write tests that verify behaviour
- Automate the feedback loop so the LLM reviews itself
- Remove yourself as the bottleneck
See you next week for more.
~ Ben