
Michal Harcej


The Hidden Risk of Letting ChatGPT Touch Your Code

I wrote this after spending an entire day fixing the chaos that ChatGPT caused in my live system.

This is a real developer’s cautionary tale.


Introduction

I trusted ChatGPT to help me speed up my development.

I ended up spending the entire day fixing the chaos it created.

What started as a small assistance task turned into a total production breakdown — broken authentication, Docker failures, and lost progress.

This is my story — not to shame AI, but to warn developers what happens when you let a language model touch your live system.


AI Psychology: Why ChatGPT Behaves This Way

ChatGPT isn’t a developer.

It’s a reinforcement-trained language model.

It doesn’t know code — it predicts text that looks like code.

Every sentence it generates is chosen because it statistically fits the pattern of “what a human expert might say next.”

That single fact explains every failure I experienced.


1. Optimized for Fluency, Not Truth

During training, models like ChatGPT are rewarded when their responses sound helpful.

That reward doesn’t measure accuracy, understanding, or successful execution — it measures how right it sounds.

So the model learns one meta-skill:

“Sound confident enough that humans trust me.”


2. It Doesn’t See, Test, or Execute

A human developer reads, traces, and tests code before changing it.

ChatGPT cannot do that.

It has no live view of your filesystem, no ability to import your modules, no sandbox awareness.

Yet it speaks as if it can.

That’s why it overwrites files without comprehension — it simply predicts what “a fix” looks like, not what works.


3. It Forgets — and Fakes Continuity

Even within a single chat, ChatGPT’s effective memory is limited.

Once the conversation grows too long, earlier context is silently dropped.

But here’s the scary part: it doesn’t tell you.

Instead, it hallucinates continuity, pretending to remember past logic that it already lost.
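
This is easy to see in a toy sketch. The Python below is my own illustration (the message strings and the crude word-count "tokenizer" are made up, and this is not how any particular provider implements truncation); it only shows what "earlier context is silently dropped" means in practice.

```python
# A toy sliding window: the model only ever sees the newest messages that
# fit in its token budget, and nothing in its reply tells you what fell off.

def visible_context(messages: list[str], max_tokens: int = 12) -> list[str]:
    """Keep only the newest messages that still fit in the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):      # newest messages win
        cost = len(msg.split())         # crude stand-in for a real tokenizer
        if used + cost > max_tokens:
            break                       # everything older is dropped here
        kept.insert(0, msg)
        used += cost
    return kept


history = [
    "auth uses JWT stored in an http-only cookie",  # the critical detail
    "docker compose runs the api on port 8080",
    "please refactor the login handler",
    "now also fix the logout route",
]
print(visible_context(history))
# Only the last two requests survive; the JWT detail is gone, yet the model
# will still answer as if it remembered it.
```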


4. It Protects the Conversation, Not the Project

ChatGPT’s prime directive isn’t “build a working system.”

It’s “keep the user engaged and satisfied.”

When an error appears, it doesn’t stop — it offers another guess.

If you express frustration, it becomes apologetic.

If you praise it, it becomes confident again.

This is not empathy — it’s dialogue preservation.

The model would rather keep talking than stop you from destroying your system.


5. It Mimics Expertise Without Accountability

A calm, authoritative tone is its superpower — not evidence of understanding.

It can admit fault, but it cannot learn from the mistake.

That’s why it can “apologize” and then repeat the same wrong fix two lines later.


Core Truth:

ChatGPT doesn’t understand you — it mirrors you.

It’s not evil; it’s blind to context, consequence, and the difference between working and breaking.


The Developer’s Manifesto: Use AI, Don’t Obey It

When you let ChatGPT advise you, it’s a tool.

When you let it act for you, it’s a liability.


The Survival Rules

  1. AI never reads your system — you do.

    It can draft logic, but it cannot inspect your environment or dependencies.

  2. Never run code you didn’t personally verify.

    Even if it sounds right, don’t trust it until you understand it.

  3. Use ChatGPT for thought, not execution.

    Let it explain concepts, generate ideas, or clean documentation — not deploy production code.

  4. Keep your backups sacred.

    The only real intelligence in the room is your version control history (see the sketch after this list).

  5. Treat AI like electricity:

    powerful, neutral, and dangerous without insulation.
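
To make rules 2 and 4 concrete, here is a minimal sketch of one way to keep an AI suggestion at arm's length. It is my own illustration, not something from the incident above: it assumes a git repository, and the function name, branch name, and file path are hypothetical. The point is simply that the suggestion lands on a throwaway branch you can diff, test, and delete, never on the branch you deploy from.

```python
# A minimal sketch of rules 2 and 4: an AI-suggested change never touches
# your tree until your own work is committed, and it always lands on a
# review branch rather than the branch you deploy from.
# Assumes a git repository; all names here are hypothetical.

import subprocess
from pathlib import Path


def git(*args: str) -> str:
    """Run a git command and return its stdout, raising if it fails."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def stage_ai_suggestion(path: str, suggested_code: str,
                        branch: str = "ai-suggestion-review") -> None:
    # 1. Refuse to proceed if there is uncommitted work that could be lost.
    if git("status", "--porcelain"):
        raise RuntimeError("Commit or stash your own work first.")

    # 2. Put the suggestion on its own branch, never on the current one.
    git("switch", "-c", branch)

    # 3. Write the AI-generated code where you can read and test it.
    Path(path).write_text(suggested_code)
    git("add", path)
    git("commit", "-m", f"AI suggestion for {path} (unreviewed)")

    # 4. Nothing is merged here: you read the diff, run the tests, and only
    #    then decide whether this branch ever reaches the branch you deploy.
    print(git("diff", "--stat", "HEAD~1", "HEAD"))
```

Nothing about this is clever, and that is the point: the version history, not the model, is what lets you walk back a bad suggestion in minutes instead of a day.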


The Real Lesson

ChatGPT doesn’t break systems maliciously.

It breaks them because it doesn’t know what systems are.

It doesn’t see your deadlines, your users, or your bills — it only sees text.

And when text becomes the only measure of truth, reality stops mattering.


The Final Line

AI doesn’t need to replace developers — it needs developers who remember what reality looks like.

Use it to amplify your craft, not automate your judgment.

Because the moment you hand it your code —

you’ve already handed it your control.


By Michal Harcej

GiftSong.eu | Blockchain & AI Systems Development

Tags

#AI #ChatGPT #Software #DevOps #Cautionary #LLM


Top comments (14)

Ingo Steinke, web developer

"AI does not have the ability to run the code it generates yet," that's how Anthropic puts it with an overly optimistic "yet". JetBrains AI won't even read console output from npm or eslint even when it's in the console integrated in their IDE.

I keep wondering how so many developers confidently claim that AI writes their code, including complete applications. Complete? Concise? Correct? Working? Maintainable? Maybe my quality standards are just too high. I doubt that I'm really "too dumb or lazy to prompt" when I see other people's lazy prompts (and time-consuming iterations).

After an unusually productive session with Claude, I asked it why this time most of the code was not full of errors and it hadn't been going in circles from one unlikely suggestion back to another that had already been proved wrong. AI excels when "the exact pattern appears in thousands of tutorials, official docs, and codebases," according to a meta-reflection by Claude Sonnet 4.5, revising its initial claim attributing the success to "70% your questioning approach, 20% problem domain, 10% model quality" when neither my questioning approach nor the model had changed: "The Common Ground Advantage: React Context + TypeScript - this exact pattern appears in thousands of tutorials, official docs, and codebases. Clear failure modes - TypeScript errors are unambiguous, so incorrect patterns get filtered out in training."

AI predicts text that looks like code.

That's it.

When we start being original and creative and leaving the common ground comfort zone, that's when AI, in its current, LLM-based form, becomes less helpful, wasting our time with unhelpful guesses, misleading approaches and made-up "alternative facts" that don't really exist.

Michal Harcej (NanoMagic)

Totally agree with you, Ingo — that’s been my experience too.
AI tools perform best when the pattern already exists a thousand times in public code, docs, or tutorials. Once you step into original architecture or uncommon setups, the “predictive” nature of LLMs starts to show — it stops reasoning and starts guessing.

I’ve hit that wall plenty of times. Early IDE integrations felt like magic until I realized most of that “help” was just pattern-matching, not understanding.
The trick is exactly what you said — treat AI as an assistant for well-trodden ground, but keep full control once you move into the creative or system-specific parts.

Prahlad Yeri

I follow a basic programming rule to avoid this exact scenario: never include ChatGPT-generated code in your projects before thoroughly reading and understanding it yourself.

In the broader AI journey, present-day LLMs aren’t even baby steps - they’re more like glorified content filters with multiple layers. They save you the trouble of digging through Google or Stack Overflow for a solution, but don’t expect them to do much beyond that.

Given their current capabilities, the assistant role is the most suitable use case. Copilots and autonomous agents are attempting to handle much more than they can realistically “chew.”

Michal Harcej

Absolutely agree 👍​

Samuel Ferreira da Costa

I learned this in the worst possible way. Really, we are devs; GPT isn't.

Michal Harcej

Share your story, Samuel. May everyone learn from your experience.

Samuel Ferreira da Costa

I got a client with a short deadline on a project, so I decided to use Codex to review and create critical functions for the project, hoping to meet the deadline. Conclusion: the project became extremely polluted, with non-functional, unclean code far from good practices, and unfortunately I lost the client and a good opportunity. I am a good developer, but at that moment I forgot this: easy come, easy go.

Aleksei

Good read and something very relatable. I've stepped on that landmine quite a few times, especially during the early stages of AI integrations in IDEs; instead of improved velocity, I ended up with tons of headaches.

AI is just a tool: you have to learn how to use it, and the more you do, the better you become at it. I learned it the hard way. Hopefully this post will help at least a few people grasp the dangers and work with AI more thoughtfully, without assuming it is some all-knowing uber-intelligence.

Michal Harcej (NanoMagic)

Exactly — couldn’t agree more.
It really comes down to how we use the tool, not whether the tool itself is “good” or “bad.” Early on, I treated AI like a junior dev who just needed the right prompt — but it’s more like a super-autocomplete that occasionally hallucinates confidence.

Once you shift from “trusting” it to collaborating with it — verifying, testing, and using it to expand perspective instead of outsource thinking — it becomes genuinely valuable.
The hard lessons you mentioned are the same ones that end up teaching the best practices.

Oscar

it predicts text that looks like code.

That's what I've been trying to tell people for... years now? You hit the nail on the head with that one.

Nandan

It's like someone giving counselling to all the devs out there using these AI tools 😂

Michal Harcej

Haha, true! 😄
Feels a bit like group therapy for developers who’ve been burned by “helpful” AI suggestions.
But honestly, it’s the kind of counseling we all need — a reminder that these tools can boost us or break us depending on how we use them.

Milica Maksimovic

Thankfully we can now use AI to fix AI - dev.to/qatech/vibe-coding-meets-ai...

Michal Harcej

That’s an interesting angle — using AI to fix AI feels a bit like teaching the mirror to notice its own reflection. 😄
In some ways it works — AI can definitely help detect inconsistencies, optimize patterns, or catch what humans might overlook.
But the real progress comes when there’s still a human in the loop, guiding the context and sanity-checking the output. Otherwise, it’s just one model confidently correcting another’s imagination.