<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Arun Raghunath</title>
    <description>The latest articles on DEV Community by Arun Raghunath (@arunraghunath).</description>
    <link>https://dev.to/arunraghunath</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3967903%2Ff629642a-ff69-4ca7-8658-8d96ef440efe.png</url>
      <title>DEV Community: Arun Raghunath</title>
      <link>https://dev.to/arunraghunath</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/arunraghunath"/>
    <language>en</language>
    <item>
      <title>I got tired of running Docker manually. So I built a sandbox for AI-generated code.</title>
      <dc:creator>Arun Raghunath</dc:creator>
      <pubDate>Thu, 04 Jun 2026 09:18:31 +0000</pubDate>
      <link>https://dev.to/arunraghunath/i-got-tired-of-running-docker-manually-so-i-built-a-sandbox-for-ai-generated-code-36ia</link>
      <guid>https://dev.to/arunraghunath/i-got-tired-of-running-docker-manually-so-i-built-a-sandbox-for-ai-generated-code-36ia</guid>
      <description>&lt;p&gt;I've been on sabbatical for a few months. Writing code. Building projects.&lt;/p&gt;

&lt;p&gt;And running Docker manually. Again. And again.&lt;br&gt;&lt;br&gt;
&lt;code&gt;docker run&lt;/code&gt;. Check what's up. &lt;code&gt;docker stop&lt;/code&gt;. Forget one. Find it next week eating RAM. Repeat.&lt;/p&gt;

&lt;p&gt;At some point I asked: why is this still manual? Why can't containers just spin up, run, and die when they're done?&lt;/p&gt;

&lt;p&gt;Then I threw AI into the mix.&lt;br&gt;&lt;br&gt;
Now I'm not just running my code. I'm running code a model wrote. Code I haven't audited line by line. Code that might have &lt;code&gt;os.system(f'rm -rf {user_input}')&lt;/code&gt; because the model had a bad day.&lt;/p&gt;

&lt;p&gt;That's a different problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The question nobody wants to answer
&lt;/h2&gt;

&lt;p&gt;Cursor, Claude Code, Windsurf, Copilot. They all generate Python, Node, Go.&lt;/p&gt;

&lt;p&gt;None of them answer: &lt;strong&gt;where does that code actually run?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Best case: you paste it into your terminal and hope.&lt;br&gt;&lt;br&gt;
Worst case: you're feeding untrusted code to &lt;code&gt;eval()&lt;/code&gt; in a process that can read your &lt;code&gt;.env&lt;/code&gt; file, your AWS creds, and your customer database.&lt;/p&gt;

&lt;p&gt;In a startup that's risky.&lt;br&gt;&lt;br&gt;
In fintech that's an FCA fine and a conversation with Legal you do not want to have.&lt;/p&gt;

&lt;p&gt;I spent 18 years in banking. I watched teams ban AI coding tools outright because nobody could answer: "Where does the generated code run, and what can it touch?"&lt;/p&gt;

&lt;p&gt;So I decided to build the answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  What ships today: Petri
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Jhansi.io&lt;/strong&gt; starts with &lt;strong&gt;Petri&lt;/strong&gt;, the execution engine. It's live right now.&lt;/p&gt;

&lt;p&gt;What it does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spins up an isolated Docker container per request&lt;/li&gt;
&lt;li&gt;Runs Python, Node, or Go code&lt;/li&gt;
&lt;li&gt;Returns stdout/stderr&lt;/li&gt;
&lt;li&gt;Tears down the container. Zero state left behind.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The API:&lt;/p&gt;

&lt;p&gt;POST /v1/sandboxes → Create sandbox, get sb_&lt;br&gt;
POST /v1/sandboxes/{id}/exec → Run code, get output&lt;br&gt;
DELETE /v1/sandboxes/{id} → Destroy it. Gone.&lt;/p&gt;
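
&lt;p&gt;As a rough sketch of that lifecycle, here is how a client might call the three endpoints from Python using only the standard library. The base URL, the JSON field names (&lt;code&gt;language&lt;/code&gt;, &lt;code&gt;code&lt;/code&gt;, &lt;code&gt;id&lt;/code&gt;), and the helper names are assumptions for illustration, not the documented Petri API.&lt;/p&gt;

```python
import json
from urllib import request

# Hypothetical base URL; the post does not say where the API is served.
BASE_URL = "http://localhost:8080"


def http_transport(method, url, payload=None):
    """Send one JSON request with the standard library and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else {}


def run_in_sandbox(code, language="python", transport=http_transport, base_url=BASE_URL):
    """Create a sandbox, run code in it, and always destroy it afterwards."""
    # Assumed request and response fields: "language", "id", "code".
    sandbox = transport("POST", f"{base_url}/v1/sandboxes", {"language": language})
    sandbox_id = sandbox["id"]
    try:
        return transport(
            "POST", f"{base_url}/v1/sandboxes/{sandbox_id}/exec", {"code": code}
        )
    finally:
        # Teardown runs even if exec raises, so no sandbox is left behind.
        transport("DELETE", f"{base_url}/v1/sandboxes/{sandbox_id}")
```

&lt;p&gt;The &lt;code&gt;try/finally&lt;/code&gt; is the point: the DELETE fires even when the exec call fails, which is what "zero state left behind" needs from the client side. The transport is passed in as a parameter so the flow can be exercised without a live server.&lt;/p&gt;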

&lt;p&gt;No Docker CLI. No Compose files. No "wait, is &lt;code&gt;sad_fermat&lt;/code&gt; still running from Tuesday?"&lt;/p&gt;

&lt;p&gt;Petri answers "where does code run". That's it. It does not touch secrets. It does not produce compliance audit logs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why existing tools don't cut it
&lt;/h2&gt;

&lt;p&gt;E2B, Modal, and Daytona are great tools. I use them. But they're SaaS only.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;E2B / Modal / Daytona&lt;/th&gt;
&lt;th&gt;Jhansi.io with Petri&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hosting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Public cloud only&lt;/td&gt;
&lt;td&gt;Self-hosted or managed SaaS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data residency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Your code runs on their infra&lt;/td&gt;
&lt;td&gt;Runs in your VPC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Execution model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stateful VMs in many cases&lt;/td&gt;
&lt;td&gt;Ephemeral container per run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Who can use it&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Startups&lt;/td&gt;
&lt;td&gt;Startups, banks, anyone with a regulator&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you're a bank, you cannot send customer PII to a third party to execute. You need to self-host. You need control.&lt;/p&gt;

&lt;p&gt;Petri gives you that. But execution is only 30% of the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The roadmap: What I'm building next
&lt;/h2&gt;

&lt;p&gt;Petri solves "where does it run". It doesn't solve "what can it touch" or "prove it to compliance".&lt;/p&gt;

&lt;p&gt;That's why I'm building &lt;strong&gt;TenantVault&lt;/strong&gt; and the &lt;strong&gt;Audit Layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TenantVault&lt;/strong&gt;: Secrets injection where your AI agent can use a database password to run a query, but it can't read the password, print it, or exfiltrate it.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Audit Layer&lt;/strong&gt;: Full execution traces. What ran, what files it touched, what network calls it made. Stream it to your SIEM.&lt;/p&gt;

&lt;p&gt;I'm building those because 18 years in banking taught me you can't deploy AI codegen without them. "It ran in Docker" isn't enough when the FCA asks questions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full roadmap with ETAs: jhansi.io/roadmap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No vaporware. If it's not on the roadmap with a target date, we're not building it yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where things stand
&lt;/h2&gt;

&lt;p&gt;Petri is running. Python, Node, Go support. REST API. Sub-second cold starts.&lt;/p&gt;

&lt;p&gt;Next up is the SDK so you can do this:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from jhansi import Sandbox

with Sandbox(language="python") as sb:
    result = sb.exec("print('hello from isolation')")
    print(result.output)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;No SDK to share yet. I'm building in public because I want feedback before I lock the API. Especially from teams in fintech, healthtech, or anywhere "oops, it leaked" isn't an option.&lt;/p&gt;

&lt;h2&gt;
  
  
  Follow along
&lt;/h2&gt;

&lt;p&gt;I'll post technical deep-dives here and on GitHub as I ship:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Python + TypeScript SDKs&lt;/li&gt;
&lt;li&gt;Self-hosted Docker Compose setup&lt;/li&gt;
&lt;li&gt;TenantVault and audit streaming&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Jhansi.io: Build it. Run it. Ship it.&lt;/strong&gt;&lt;br&gt;
Because "where does this code run?" shouldn't be a rhetorical question anymore.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;&lt;em&gt;Building in public. Star the repo on &lt;a href="https://github.com/jhansi-io/jhansi"&gt;GitHub&lt;/a&gt; or check the roadmap at &lt;a href="https://jhansi.io/roadmap"&gt;jhansi.io/roadmap&lt;/a&gt;. Questions? Drop them below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>docker</category>
      <category>security</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
