
I Put Claude AI in Jail

We just shipped working, secure code to production.

It was written by Claude.

But only after I locked it in a container, stripped its freedoms, and told it exactly what to do.

This isn’t an AI-generated brag post.

This is an explanation of what happens when you stop treating LLMs like co-founders and start treating them like extremely clever interns.

The Problem: Vibe Coding Is Chaos

If you’ve ever prompted an AI to “build me a secure backend”, you’ve experienced:

  • Hard-coded secrets
  • No config separation
  • Auth hacked together
  • Layers in the wrong places
  • Database logic in controller methods
  • Security reminiscent of a first-year student project

It _feels_ impressive. But the output is not shippable.
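
To make the first two failure modes concrete, here’s a toy contrast (my illustration, not code from the post):

```python
import os

# Vibe-coded: the secret lives in source control forever.
DB_PASSWORD = "hunter2"

# Shippable: config is separated from code and injected at deploy time.
DB_PASSWORD = os.environ["DB_PASSWORD"]
```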

I once tried building a Monkey-Island-style game with Claude at 2am just for fun. It ended with me screaming at a yellow rectangle on an HTML canvas.

Fun? Yes.

Useful? Not remotely.

The Insight: Claude’s Not the Problem, You Are

Claude is phenomenally good at code generation if you feed it the right prompts, at the right level of granularity, and in the right order.

When I use it personally, it acts as a co-architect. I bounce ideas off it, get help debugging, and sometimes it even surprises me with novel solutions (like using inherited env vars plus process scanning for child-process cleanup across Windows/Linux).
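
That last trick is worth a sketch: tag every child process you spawn with an inherited marker variable, then clean up strays by scanning process environments, which works the same on Windows and Linux. This is my minimal reconstruction of the idea (the `CLAUDE_SESSION` name and the `psutil` dependency are assumptions, not Claude’s actual output):

```python
import os
import subprocess

import psutil  # cross-platform process inspection

MARKER = "CLAUDE_SESSION"  # hypothetical marker variable

def spawn_tagged(cmd: list[str], session_id: str) -> subprocess.Popen:
    # Children inherit the environment, so grandchildren carry the tag too.
    env = {**os.environ, MARKER: session_id}
    return subprocess.Popen(cmd, env=env)

def cleanup_session(session_id: str) -> None:
    # Scan every process we can see and kill the ones carrying our tag.
    for proc in psutil.process_iter():
        try:
            if proc.environ().get(MARKER) == session_id:
                proc.kill()
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue  # not ours to inspect, or already gone
```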

But left to its own devices on a complex problem or wide-open scope?

Chaos.

The gap isn’t capability, it’s orchestration.

So… I put Claude in jail. Here’s what I did:

  1. Claude gets containerized
    A clean, temporary dev environment. No Git credentials. Limited network access. No escape. (A lifecycle sketch appears at the end of this section.)

  2. Start with a user story
    Human developers aren’t expected to work off a one-line mission statement, so why should AI be any different? I feed it a detailed user story that a human developer would be happy with.

  3. Chain-of-thought agent breaks down the work
    “Build a login system” becomes 20+ sub-tasks: token handling, session state, role config, browser caching, etc.

  4. Claude gets micromanaged step-by-step
    Each sub-task is prompted as a mini workflow: analyse → code → fix → verify (see the sketch just after this list).

  5. Final Claude pass reviews everything
    It outputs a structured JSON diff with explanations.

  6. We convert that to a GitHub PR
    A human reviews. If it’s clean, we merge. If not, we loop until we’re happy.
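
To make steps 3–5 concrete, here is a minimal orchestration sketch. Everything in it is illustrative, not our actual pipeline: the `ask_claude` helper, the JSON sub-task schema, and verifying with `pytest` are all assumptions.

```python
import json
import subprocess

def ask_claude(prompt: str) -> str:
    """Hypothetical wrapper around your Claude API/CLI call inside the container."""
    raise NotImplementedError

def run_story(user_story: str) -> dict:
    # Step 3: a planning pass breaks the story into ordered sub-tasks.
    subtasks = json.loads(ask_claude(
        "Break this user story into ordered sub-tasks, "
        f"returned as a JSON list of strings:\n{user_story}"
    ))

    # Step 4: each sub-task is a mini workflow: analyse -> code -> fix -> verify.
    for task in subtasks:
        analysis = ask_claude(f"Analyse this sub-task before writing code:\n{task}")
        ask_claude(f"Now implement it. Your analysis:\n{analysis}")
        for _ in range(3):  # bounded fix loop
            if subprocess.run(["pytest", "-q"]).returncode == 0:  # verify
                break
            ask_claude("The tests failed. Fix the code and try again.")  # fix

    # Step 5: a final review pass over everything, as a structured JSON diff.
    return json.loads(ask_claude(
        "Review all changes and output a JSON diff with an explanation per file."
    ))
```

Step 6 is then mechanical: write the reviewed diff to a branch and open a PR for a human (e.g. with `git` plus `gh pr create`).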

Every time the task ends, the Claude container is destroyed.

No memory of past sins. No rogue commits.

Clean. Contained. Effective.
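
For the jail itself, the lifecycle looks roughly like this. A minimal sketch under my own assumptions: the image name, network name, and timeout are illustrative, not our actual config.

```python
import subprocess
import uuid

def run_jailed_task(task_prompt: str) -> None:
    name = f"claude-jail-{uuid.uuid4().hex[:8]}"
    try:
        # Ephemeral container: no mounted Git credentials, and a restricted
        # network so it can reach the model API but nothing else.
        subprocess.run([
            "docker", "run", "--rm", "--name", name,
            "--network", "claude-restricted",  # assumed pre-created network
            "my-claude-agent:latest",          # hypothetical agent image
            task_prompt,
        ], check=True, timeout=1800)
    finally:
        # --rm already removes it on clean exit; force-kill on timeout/crash.
        subprocess.run(["docker", "rm", "-f", name], capture_output=True)
```

The `--rm` flag plus the explicit `docker rm -f` in the `finally` block is what enforces the “no memory of past sins” property even when a run crashes or times out.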

The Result?

  • 15–20 minutes per story
  • PRs that pass internal review
  • No vibe coding
  • Shippable code with zero hallucinated libraries or misaligned assumptions

It’s slower per interaction than just “ask it to code” – but way faster overall.

Less rework. Less debugging. More trust in what comes out the other end.

Can You Do This Too?

If you're expecting GPT or Claude to magically build your app from a one-line prompt, you're going to be disappointed.

But if you're willing to:

  • Break tasks down
  • Containerize your AI workflows
  • Build orchestration logic
  • And treat your LLM like a task-executing machine, not a co-pilot

...then yes, it can code for you. And you can ship it.

The Big Question

Don’t think of AI as a replacement. AI is the intern. Orchestration is the manager. And humans are still the ones deciding what matters.

But here’s what I keep asking myself, and I’d love to hear your thoughts:

Should we be building AI tools that act more like interns who learn under supervision… or should we keep pushing for AI that acts like senior engineers we can trust outright?

What do you think?

Want to See the Whole Architecture?

I wrote up a full 3-part breakdown of the system, including failures, lessons, and technical design:

Why I Put Claude in Jail

Read Part 1 on Substack → https://powellg.substack.com/

It’s funny, raw, and surprisingly useful. Part 3 includes a detailed breakdown of the orchestration model and how we integrated Claude into our platform.

TL;DR

LLMs aren't co-founders. They're interns.

Give them tight specs, step-by-step instructions, and no keys to prod.

We built a jail for Claude. And now it ships production-ready code.

Let me know if you want beta access - we’re opening testing soon and would love to get your feedback.

Top comments (4)

Jay

The co-architect line hit me because that's often how I feel when using it. I do my workflows with it the same way: explicit, clear detail, no expectation of it perfectly creating what I asked for in one shot. And honestly, I have learned a lot from the smart-intern-like behaviour.

I'm sure those prioritising profit over real experience in a fast-paced business setting would prefer the engineer that doesn't need oversight, but that's also where jobs are lost and people get a little dumber. Maybe that's just me being a little oldskool. But as you know, I have definitely embraced AI. I'm not afraid of the fancy chatbot.

However, I respect the lessons I learned when it failed to complete a task. Even just the soft skills of being able to break down a problem, truly understand what's going on, and communicate it back effectively so that the AI can proceed properly. And sometimes you just have to go solve a problem yourself anyway because it's repeating bad Reddit info or something. We can't pretend these things aren't trained on Reddit lol.

Guy

I'm glad I'm not the only one. The “co-architect” framing is the only way I’ve found to keep my sanity. If you treat the model like a senior engineer you’ll just get frustrated, but if you treat it like a smart intern that occasionally surprises you, you actually build muscle memory around decomposition and orchestration.

And you’re right, the failures are where the real value sits. Every time it blows up on a task, you’re forced to sharpen your own problem-breakdown and communication skills. That’s not wasted time; that’s the part that makes you a better engineer. The profit-driven push for “AI as replacement” misses that entirely. It’s not about eliminating oversight, it’s about leveraging the oversight to get higher-order thinking done.

Reddit-trained hallucinations and all, I’d still rather have an overeager intern with a super brain who occasionally spouts nonsense than nothing at all, so long as I keep them in a sandbox and make sure the sharp tools don’t touch prod lol

Saksham Solanki

That's cool! I'll check it out and test it too once it opens up!
