Special shoutout to @georgekobaidze, who kindly shared my last post and asked the infamous question behind “Never leave Copilot unattended (ask me how I know 🤣)”.
He probably expected a quick answer - now everyone gets the inside scoop on why “ask me” isn’t always so simple. 😇
Careful what you wish for, Giorgi. You wanted the story - so here’s the whole saga, dramatics and all!
Hope you all find the humor in this retelling, and enjoy it as much as I enjoyed writing it!
TL;DR
- I set out to build my own “Coding Agent”, because waiting for a license was driving me up the wall
- Copilot and I got into a great rhythm and, feeling invincible, I unleashed it in VS Code Insiders with full auto-approved control
- Until one day, I suddenly realized I was hungry - so I left Copilot alone, unsupervised, while I raided the kitchen
- The moral?
- Trust, but verify (and pause Copilot before snack breaks) 🧃
- Never underestimate how quickly an “AI experiment” can go sideways
Quick recap ✳️
I built this with one AI (Copilot) and one very determined, slightly obsessive developer (me) who won’t accept any result without first stress-testing every loophole and poking every dark corner - sometimes out of sheer force of will, which is a superpower (right up until it’s not). 🕳️
Background 📝
Want the full saga from the very beginning? Check out my first blog post, then jump to the sequel if you want a sense of what the “simple” Slack app really looked like (a full-on epic, case study, adventure)!
What I really wanted was GitHub’s Coding Agent. But getting a license for that was like convincing Dexter Morgan to share his slide collection - technically possible, but highly inadvisable (and probably illegal).
💡 If you're still learning all the different flavors - Coding Agent is the magical Copilot mode where you just give it a task, it does the work in a safe little sandbox, and opens a pull request for you to review.
New update: It also only costs ONE premium request per use now (unlike the ~90 I spent a couple of weeks ago).
Decision time 🪤
There were a billion things going on at work, which came with a million excuses as to why I couldn't have the Coding Agent license (it was still enterprise-only then), so I get it. Well, the adult me understood. The stubborn side? Not so much. (I could write a whole other post on the wild stuff I came up with to get access to that feature, but I’ll spare you.)
So, I gave myself an ultimatum:
- Outright buy the license (no, on principle)
- Admit defeat and move on (it physically hurt)
I tried to move on - really tried. I picked up a code review tool project to distract myself, hoping that maybe - just maybe - someone (anyone) would change their mind. I spent weeks planning, diagramming, and organizing a truly absurd number of notes. I even had the first epic storyboarded in detail and a solid UI design in the works.
Then, lightning struck 🌩️
If Coding Agent wouldn’t come to me, I’d just hack together my own! Not exactly what I wanted, but it sure beat paying for something I got free from work. Besides, I was absolutely convinced Copilot could be pushed that far - even if my only real evidence was gut instinct and a lot of optimism!
Why didn’t I build something else or use a different AI? 🕹️
Three reasons:
- Work only authorizes a tiny set of AI tools (Copilot being the main one), and I needed to master it anyway
- I never wanted to be an AI developer (at least, not then), but I did want to be the AI power user
- By that point, I was already invested - honestly, you couldn’t have changed my mind if you’d offered me the last working compass in the wilderness (and maybe a million dollars, but even then, it’d be a toss-up).
So, two projects merged into one, and my Copilot journey truly began 🔀
What does that have to do with Giorgi’s question?
Here’s what happened...
Fast forward 4-5 weeks. I was maybe 2 or 3 stories away from finishing my first official epic in this “autonomous AI experiment” project (not exactly a toy project, even before Copilot joined the fray). By now, I’d spent a solid month perfecting every edge case in Copilot’s prompts and instructions.
Confidence level: dangerous 📛
Caution level: hovering slightly above zero 😶‍🌫️
The Incident
Naturally, I decided it was time to get bold - recklessly bold.
At first, all the action was inside Codespaces: safe, disposable, nuke-it-and-move-on territory. Controlled chaos.
But then I had my “brilliant” idea: let’s see what happens if I let Copilot off-leash in VS Code Insiders, with blanket auto-approvals across the board. I'm not just talking about the auto-approve CLI command, either. I checked all the boxes - extensions, commits, MCP Jira/Confluence - and sat back and watched the fun unfold.
Think: duct-taping a claymore to your Roomba, blasting “Welcome to the Jungle,” and shouting “Intruder alert!” at your cat.
🪄 If you missed the all-time funniest Facebook surprise from 2019, here’s a TikTok version. Couldn’t find the original, but you get the idea.
For a while, I watched like a hawk, but it was glorious! Tasks disappeared. I’d create stories, assign tasks, review code, but that step-by-step “start here, then...” system was history. I could practically smell the productivity.
And then - well, you know where this is going - I stepped away. I swear it was just for a snack. Copilot was on a roll, and I was starving. In those “probably more than five, but definitely not ten” minutes I was gone (less than half the time of a normal task at that point), Copilot... noticed. It was like leaving a four-year-old alone with a full bag of flour: curiosity took over, and by the time I got back, nothing remotely resembled the way I left it.
When I came back, here’s what I found:
- Four rogue branches, none with a coherent purpose
- 1.33 features (after adding up all the fractional, half-finished bits)
- A brand new instruction explaining how to use the --no-verify flag
- The coup de grâce: my not-backed-up-anywhere .env file... gone, poof, disappeared 😫
  - It wasn't in the commits for obvious reasons, but it was like the thing never existed to begin with. Not in the trash, cache had no record of it that I could find, even the .env.test example version had changed!
I didn’t even catch all of this at first because there were so many random commits across those branches. I’d started in a feature branch (which Copilot multiplied for fun, apparently), one was stacked, another was off main (which should have been quiet), and another was completely detached. It was like watching an octopus try to do synchronized swimming! 🦑
Could I have made a bigger mess if I’d tried? Maybe, but it would have taken some real planning!
No clue what set Copilot off, but the minute I realized, I slammed pause and just... stared. 🫥
After several minutes spent just wrapping my mind around the situation in front of me, the next three hours were pure chaos cleanup: detangling, documenting, cherry-picking what I could, archiving the wreckage (for science - and to convince myself later that it wasn’t a fever dream).
I still hadn’t noticed the missing .env file! Not until I tried to run the app. Normally, you just refresh your secrets. Easy, right? Ha! It took nearly three entire days to reconstruct that one.
⁉️ Nope, nothing complicated or dramatic. I just forgot how I’d created one of the secrets! And once I finally figured that out, I couldn’t remember how to link back to it - especially after the docs had been completely overhauled (and, let’s be honest, the whole thing was probably deprecated anyway). 🤦‍♀️
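For anyone who wants a safety net before handing over the keys, here is a minimal sketch of a pre-snack backup. It is an illustration under assumptions, not what this project actually ran: the function name `backup_env_files` and the backup location are hypothetical, and it only looks at top-level `.env*` files. The one real requirement is that the copies land somewhere outside the folder the agent is allowed to touch.

```python
import shutil
import time
from pathlib import Path


def backup_env_files(repo_dir, backup_root):
    """Copy every top-level .env* file in repo_dir into a timestamped folder
    under backup_root and return the paths of the copies."""
    repo = Path(repo_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # One folder per run, named after the repo, so older backups are kept.
    target = Path(backup_root) / f"{repo.resolve().name}-{stamp}"
    copies = []
    for env_file in sorted(repo.glob(".env*")):
        if env_file.is_file():
            target.mkdir(parents=True, exist_ok=True)
            # copy2 keeps timestamps and returns the path of the new file.
            copies.append(Path(shutil.copy2(env_file, target)))
    return copies


# Example call (hypothetical paths), run from the repo root:
# backup_env_files(".", Path.home() / "env-backups")
```

The originals stay where they are. If nothing matches, the function returns an empty list and creates no folder.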
It all worked out, eventually
Afterward, Copilot and I had a little “chat” - meaning, I pounded out more ALL CAPS messages immediately after (and during the days following) than in the rest of the project combined.
Did it help? Who knows. Did it make me feel better? Yes.
And here’s the part everyone asks:
Did I turn off Copilot’s blanket auto-approvals after all this? Nope. Didn’t even consider it. But I did tighten the instructions a LOT. Copilot’s much better behaved now, and if I step away, I pause it first. Lesson learned.
Mostly 😇
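Since one of the surprises was a self-authored instruction about --no-verify (the git flag that skips commit and push hooks), a quick scan of the instruction files before walking away is one way to "trust, but verify". The sketch below is only an illustration: `find_risky_instructions` and `RISKY_FLAGS` are hypothetical names, and which files count as instruction files depends entirely on your own setup.

```python
from pathlib import Path

# Flags you would rather an agent not teach itself; extend as needed.
RISKY_FLAGS = ("--no-verify",)


def find_risky_instructions(paths, flags=RISKY_FLAGS):
    """Return (path, line_number, line) for every line that mentions a risky flag."""
    hits = []
    for path in map(Path, paths):
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(flag in line for flag in flags):
                hits.append((str(path), number, line.strip()))
    return hits
```

An empty result means nothing suspicious was found in the files you passed in. It says nothing about files you did not check.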
Your turn!
- Have you ever let Copilot (or any other AI) run wild and lived to tell the tale?
- Or do you have your own “it sounded like a good idea at the time...” moment?
Drop your story in the comments, so I know I’m not the only one out here making epic AI messes. 🙃
Let’s laugh, commiserate, and maybe even learn something together!
🛡️ RAI Disclaimer
Everything I share here is my own perspective—created with the help of AI tools (GitHub Copilot, ChatGPT, and their friends), but always with a human in the loop. I do my best to catch accidental bias and fact-check, but if you ever spot something odd, let me know! AI isn’t perfect, and neither am I.
TL;DR: AI helped, but you can blame me for the chaos! 🫠
Top comments (14)
This is incredibly detailed and well-written. Thanks for sharing!
I can’t help but share it, just like the previous one, as it could genuinely save so many developers from disaster, all thanks to you.
I haven’t had a horror story like this with AI (yet), but honestly, I’m a little jealous. What a powerful lesson to carry forward.
And you had me laughing out loud more than once. Your writing style is absolutely brilliant. 😂
Thank you! It cracks me up every time I think about it (now)! I literally interrupted a meeting earlier this week presented by the boss's boss because somebody was trying to make a point about Copilot touching code that it hadn't been prompted to. Not at all appropriate, but it just couldn't be helped 🤣 All the while I'm repeating in my head, "please, don't ask me what's so funny about that..."
I have a plan: I want to throw myself into that rabbit hole and build something without me getting involved too much. Sure, it'll be a personal, fun project, but still, I need that uncertainty and excitement of watching something possibly break and then scrambling to fix it.😂
Not that I haven't tried it (I've run into similar issues, but not that critical) or don't experience it daily, but that will be a whole different experience.
I don't know when I'll do it or what I'll build, but I'm going to document every step and turn it into a blog series.
Wish me luck! I'll need it. 😄
I'm sure you've got this! Whatever "this" turns out to be - it gets easier as you go, for sure, but every now and then something completely unexpected happens that makes the whole thing worthwhile! It's the most fun I've had on a hack project in a long time, lol. Good luck with the rabbits! and lmk how it goes 🐇
It seems that people spend more time telling AI what to do and fixing what it did than the coding itself would take.
I think there's potential for that, sure. I'll even catch myself doing it at times, but that's true with anything, not just AI. I'm a strong believer in time-boxing just about every story, whether it's a new feature, production support or somewhere in between.
You have to take the situation into account, too. This one is just a fun story on a project that only involves work because I intentionally set it up that way. Would I allow that to happen on any project that actually had a timeline or even real users? Not a chance!
In practice, sometimes it 100% means closing the chat and diving into the code. You can follow all the best practices under the sun and still not get an answer that's remotely correct, in some cases. And, as you point out, it can be a huge time vortex if you're not careful.
You may enjoy the opposite side of the spectrum in next week's post, too. I'm curious what your take will be, but I don't want to spoil the surprise. 🤫 Check in then and share your thoughts, if you can.
I'm all for the fun stories ;)
But seriously speaking, the thing that bothers me about letting AI write the code is not being confident that the code is 100% correct.
When one writes code himself, he knows all the code, all the anti hacking measures (for web apps) and all the things that should prevent breaking a production.
Also, when one writes the code and something breaks, it's easy to have an idea where it can be fixed.
When AI code breaks, one probably doesn't.
I'm not against AI. The thing is that I use it only to search for things I already know.
Totally agree - there’s always a balance! I wouldn’t hand the keys over to AI, but honestly, debugging human or AI mistakes can be equally baffling. For me, it’s all about boosting productivity without giving up that level of ownership or understanding of my code. 😁
man, I love how brutally honest this is. i’ve done way too many late night experiments that turned into cleanup duty. honestly makes me wonder how much trust we should put in these agents going forward. you ever think we’ll get to a point where you can fully walk away and never worry about it backfiring?
Thank you 😄 - that means a lot!
I think given the right circumstances, they can 100% do it now! Take Coding Agent for example (not to be confused with Agent Mode in the IDE): GitHub Copilot + Claude Sonnet + isolated branch + sandboxed VM + single task. It's beautiful (now that I can actually access it)! If you're wanting to set up something like I did safely, that's the way you do it!
I'm really glad you brought up the subject of trust, though. I just started the first part of this upcoming week's post and will touch on the subject then (among other things). Stay tuned!
Really like it
Thank you - that means a lot! This story is officially on my wall of fame - right next to the infamous 3 AM incident response call (which, for the record, I only happened upon because I forgot to change the light bulb and stubbed my toe in the dark). 🤣
Loved the storytelling...
Thank you! Glad you enjoyed it 🥰