<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Arun Raghunath</title>
    <description>The latest articles on DEV Community by Arun Raghunath (@thearun85).</description>
    <link>https://dev.to/thearun85</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3967903%2Ff629642a-ff69-4ca7-8658-8d96ef440efe.png</url>
      <title>DEV Community: Arun Raghunath</title>
      <link>https://dev.to/thearun85</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thearun85"/>
    <language>en</language>
    <item>
      <title>We fixed output corruption. Then built persistence. Then TTL. All in v0.6</title>
      <dc:creator>Arun Raghunath</dc:creator>
      <pubDate>Thu, 11 Jun 2026 17:25:58 +0000</pubDate>
      <link>https://dev.to/thearun85/we-fixed-output-corruption-then-built-persistence-then-ttl-all-in-v06-55jn</link>
      <guid>https://dev.to/thearun85/we-fixed-output-corruption-then-built-persistence-then-ttl-all-in-v06-55jn</guid>
      <description>&lt;p&gt;Running untrusted AI-generated code safely is the obvious hard problem.&lt;/p&gt;

&lt;p&gt;But sometimes the problems that break an agent workflow look like boring infrastructure work.&lt;/p&gt;

&lt;p&gt;v0.6 began as plumbing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Persistent sandbox registry&lt;/li&gt;
&lt;li&gt;Automatic cleanup with TTL&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Necessary, but not particularly glamorous.&lt;/p&gt;

&lt;p&gt;Then the tests started failing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The output corruption problem
&lt;/h2&gt;

&lt;p&gt;Every execution returned something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WARNING: Running pip as the 'root' user can result in broken permissions...
[notice] A new release of pip is available: 25.0.1 -&amp;gt; 26.1.2
hello world
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The actual program output was buried under dependency-installation noise.&lt;/p&gt;

&lt;p&gt;For a human reading a terminal, that is annoying.&lt;/p&gt;

&lt;p&gt;For an AI agent parsing execution output, it is broken.&lt;/p&gt;

&lt;p&gt;The cause was straightforward: dependency installation and code execution were chained into a single Docker call, with stderr redirected into stdout.&lt;/p&gt;

&lt;p&gt;Everything ended up in the same stream.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: two Docker calls, not one
&lt;/h2&gt;

&lt;p&gt;We separated the operations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Call 1:&lt;/strong&gt; Install dependencies silently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="n"&gt;dependency_install_command&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DEVNULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DEVNULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Call 2:&lt;/strong&gt; Execute the user command and capture its output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="n"&gt;execution_command&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;stdout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PIPE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;STDOUT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is a small change, but the principle matters:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When infrastructure is built for AI agents, clean output is part of the API contract.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agents parse what you return. Installation logs, warnings and runtime output cannot be treated as one undifferentiated stream.&lt;/p&gt;
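&lt;p&gt;The two-call split can be tried without Docker. This is a minimal sketch, not Jhansi's actual code: &lt;code&gt;run_clean&lt;/code&gt; and the two stand-in commands are hypothetical, and unlike the snippet above it returns stderr as its own field instead of merging it into stdout.&lt;/p&gt;

```python
import subprocess
import sys


def run_clean(install_cmd, user_cmd):
    """Run a noisy setup step, then return only the user command's output."""
    # Step 1: setup output is discarded, so installer warnings never reach the caller.
    subprocess.run(
        install_cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    # Step 2: capture the user program's stdout and stderr as separate fields.
    done = subprocess.run(user_cmd, capture_output=True, text=True, check=False)
    return {
        "stdout": done.stdout,
        "stderr": done.stderr,
        "exit_code": done.returncode,
    }


# Stand-ins for the real Docker commands (hypothetical, for illustration only).
noisy_install = [sys.executable, "-c", "import sys; print('WARNING: noise', file=sys.stderr)"]
user_program = [sys.executable, "-c", "print('hello world')"]

result = run_clean(noisy_install, user_program)
print(result["stdout"], end="")
```

&lt;p&gt;Whether stderr should be merged or kept apart is a design choice. Separate fields let an agent tell a traceback from normal output.&lt;/p&gt;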

&lt;h2&gt;
  
  
  Persistence: SQLite instead of an in-memory dictionary
&lt;/h2&gt;

&lt;p&gt;The original sandbox registry was a Python dictionary.&lt;/p&gt;

&lt;p&gt;Restart the service, and every sandbox record disappeared.&lt;/p&gt;

&lt;p&gt;The containers might still exist, but Jhansi no longer knew about them. Any agent workflow expecting to reconnect after a service restart would fail.&lt;/p&gt;

&lt;p&gt;We considered:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;JSON:&lt;/strong&gt; simple, but vulnerable to partial writes and corruption during crashes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis:&lt;/strong&gt; native TTL and a good operational model, but another service for self-hosters to run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite:&lt;/strong&gt; durable, transactional and already included with Python&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We chose SQLite.&lt;/p&gt;

&lt;p&gt;The schema is intentionally small:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="n"&gt;sandboxes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;language&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;container_id&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;workspace_path&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No ORM.&lt;/p&gt;

&lt;p&gt;No migration framework.&lt;/p&gt;

&lt;p&gt;Just SQLite doing what SQLite is good at.&lt;/p&gt;
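&lt;p&gt;A rough sketch of what a registry on top of that schema could look like, using only the standard-library &lt;code&gt;sqlite3&lt;/code&gt; module. The &lt;code&gt;SandboxRegistry&lt;/code&gt; class, its method names and the &lt;code&gt;"running"&lt;/code&gt; status value are assumptions for illustration, not the project's real API.&lt;/p&gt;

```python
import os
import sqlite3
import tempfile
from datetime import datetime, timedelta, timezone

SCHEMA = """
CREATE TABLE IF NOT EXISTS sandboxes (
    id TEXT PRIMARY KEY,
    language TEXT NOT NULL,
    container_id TEXT,
    workspace_path TEXT,
    status TEXT NOT NULL,
    created_at TEXT NOT NULL,
    expires_at TEXT NOT NULL
);
"""


class SandboxRegistry:
    """Hypothetical registry wrapper; not Jhansi's actual class."""

    def __init__(self, db_path):
        self.conn = sqlite3.connect(db_path)
        self.conn.row_factory = sqlite3.Row
        self.conn.execute(SCHEMA)
        self.conn.commit()

    def add(self, sandbox_id, language, ttl_seconds=3600):
        now = datetime.now(timezone.utc)
        expires = now + timedelta(seconds=ttl_seconds)
        # "with self.conn" commits on success and rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT INTO sandboxes (id, language, status, created_at, expires_at)"
                " VALUES (?, ?, ?, ?, ?)",
                (sandbox_id, language, "running", now.isoformat(), expires.isoformat()),
            )

    def get(self, sandbox_id):
        row = self.conn.execute(
            "SELECT * FROM sandboxes WHERE id = ?", (sandbox_id,)
        ).fetchone()
        return dict(row) if row else None

    def close(self):
        self.conn.close()


# Simulate a service restart: write with one connection, read with another.
db_path = os.path.join(tempfile.mkdtemp(), "registry.db")
first = SandboxRegistry(db_path)
first.add("sb_abc123", "python")
first.close()

second = SandboxRegistry(db_path)
record = second.get("sb_abc123")
print(record["language"], record["status"])
second.close()
```

&lt;p&gt;The second connection sees the record because it lives in a file, which is the property the in-memory dictionary lacked.&lt;/p&gt;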

&lt;h2&gt;
  
  
  TTL: last active, not creation time
&lt;/h2&gt;

&lt;p&gt;Each sandbox receives an &lt;code&gt;expires_at&lt;/code&gt; value, initially one hour after creation.&lt;/p&gt;

&lt;p&gt;The important decision is that every execution resets the clock:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;new_expires&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TTL_SECONDS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_expires_at&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;sandbox_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;new_expires&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A background task runs every 60 seconds and removes expired sandboxes.&lt;/p&gt;

&lt;p&gt;This makes the TTL activity-based rather than age-based.&lt;/p&gt;

&lt;p&gt;An agent may perform dozens of small executions during a 20-minute analysis. A creation-time TTL can terminate the sandbox in the middle of an active workflow.&lt;/p&gt;

&lt;p&gt;A last-active TTL does not.&lt;/p&gt;

&lt;p&gt;Active sandboxes remain available. Only idle ones are cleaned up.&lt;/p&gt;
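&lt;p&gt;To make the difference concrete, here is a small sketch of a last-active TTL with a sweep step. It assumes timestamps are stored as ISO-8601 UTC text, as the schema suggests. The &lt;code&gt;touch&lt;/code&gt; and &lt;code&gt;sweep&lt;/code&gt; helpers are hypothetical names, and a real sweep would also remove the container and workspace.&lt;/p&gt;

```python
import sqlite3
from datetime import datetime, timedelta, timezone

TTL_SECONDS = 3600  # one hour, matching the initial TTL described above

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sandboxes (id TEXT PRIMARY KEY, expires_at TEXT NOT NULL)")


def stamp(moment):
    # Fixed-width UTC strings compare chronologically when compared as text.
    return moment.isoformat(timespec="seconds")


def touch(sandbox_id, now):
    """Set or reset the expiry clock; called on creation and on every execution."""
    expires = now + timedelta(seconds=TTL_SECONDS)
    conn.execute(
        "INSERT OR REPLACE INTO sandboxes (id, expires_at) VALUES (?, ?)",
        (sandbox_id, stamp(expires)),
    )


def sweep(now):
    """Delete sandboxes whose expiry has passed and return their ids."""
    cutoff = stamp(now)
    expired = [row[0] for row in conn.execute(
        "SELECT id FROM sandboxes WHERE ? >= expires_at ORDER BY id", (cutoff,)
    )]
    conn.execute("DELETE FROM sandboxes WHERE ? >= expires_at", (cutoff,))
    return expired


start = datetime(2026, 1, 1, tzinfo=timezone.utc)
touch("sb_busy", start)
touch("sb_idle", start)
touch("sb_busy", start + timedelta(minutes=40))  # an execution resets the clock
expired = sweep(start + timedelta(minutes=70))
remaining = [row[0] for row in conn.execute("SELECT id FROM sandboxes")]
print(expired, remaining)
```

&lt;p&gt;Both sandboxes are 70 minutes old at sweep time, but only the idle one is removed.&lt;/p&gt;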

&lt;h2&gt;
  
  
  What this unlocks
&lt;/h2&gt;

&lt;p&gt;With persistence and activity-based TTL, Jhansi sandboxes are becoming reliable execution primitives:&lt;/p&gt;

&lt;p&gt;Create a sandbox once.&lt;/p&gt;

&lt;p&gt;Use it repeatedly.&lt;/p&gt;

&lt;p&gt;Survive service restarts.&lt;/p&gt;

&lt;p&gt;Trust that active work will not disappear underneath the agent.&lt;/p&gt;

&lt;p&gt;That is the foundation longer-running agent workflows need.&lt;/p&gt;

&lt;p&gt;Next in v0.7: &lt;strong&gt;streaming execution through Server-Sent Events.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No more waiting for the entire command to finish before seeing its output.&lt;/p&gt;

&lt;p&gt;Jhansi is an open-source cloud sandbox for running AI-generated code safely.&lt;/p&gt;

&lt;p&gt;Self-host it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI agents need execution, not credentials.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Star it if this problem resonates: &lt;a href="https://github.com/jhansi-io/petri" rel="noopener noreferrer"&gt;https://github.com/jhansi-io/petri&lt;/a&gt;&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>infrastructure</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>pip install jhansi — the SDK is live</title>
      <dc:creator>Arun Raghunath</dc:creator>
      <pubDate>Mon, 08 Jun 2026 19:49:30 +0000</pubDate>
      <link>https://dev.to/thearun85/pip-install-jhansi-the-sdk-is-live-1l03</link>
      <guid>https://dev.to/thearun85/pip-install-jhansi-the-sdk-is-live-1l03</guid>
      <description>&lt;p&gt;Six weeks ago, running code on jhansi.io meant curl + sandbox IDs + manual cleanup.&lt;/p&gt;

&lt;p&gt;Today it looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;jhansi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Sandbox&lt;/span&gt;

&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;Sandbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python main.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;output&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the milestone. The SDK is live.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;The API was always there. Petri — the execution engine underneath — has been running code in isolated Docker containers since v0.1. But you had to understand HTTP, manage container lifecycle, and remember to delete sandboxes or you'd leak resources.&lt;/p&gt;

&lt;p&gt;The SDK removes all of that. You write Python. jhansi.io handles the rest.&lt;/p&gt;

&lt;h2&gt;
  
  
  The context manager was non-negotiable
&lt;/h2&gt;

&lt;p&gt;If you create a sandbox and forget to delete it, you leak containers and workspace storage. That's not acceptable — especially when AI agents are creating sandboxes programmatically.&lt;/p&gt;

&lt;p&gt;The context manager makes cleanup automatic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;Sandbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# sandbox created here
&lt;/span&gt;    &lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upload_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;main.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;python main.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# sandbox deleted here — even if exec raised an exception
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No leaked containers. No cleanup code. No surprises.&lt;/p&gt;
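&lt;p&gt;The guarantee comes from Python's context manager protocol rather than anything SDK-specific. The sketch below uses a hypothetical &lt;code&gt;DemoSandbox&lt;/code&gt; that only records lifecycle events; the real &lt;code&gt;Sandbox&lt;/code&gt; class presumably makes API calls in the same two hooks.&lt;/p&gt;

```python
class DemoSandbox:
    """Hypothetical stand-in for the SDK's Sandbox; it only records events."""

    def __init__(self, language):
        self.language = language
        self.events = []

    def __enter__(self):
        # A real client would create the remote sandbox here.
        self.events.append("created")
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit and when the body raises.
        self.events.append("deleted")
        return False  # do not swallow the exception

    def exec(self, command):
        raise RuntimeError(f"simulated failure running {command!r}")


sb = DemoSandbox(language="python")
try:
    with sb:
        sb.exec("python main.py")
except RuntimeError as err:
    print("caught:", err)

print(sb.events)
```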

&lt;h2&gt;
  
  
  The Docker-in-Docker problem
&lt;/h2&gt;

&lt;p&gt;Self-hosting Petri via &lt;code&gt;docker compose up&lt;/code&gt; uncovered something we hadn't anticipated.&lt;/p&gt;

&lt;p&gt;Petri runs inside a Docker container. But Petri's job is to spin up Docker containers to run your code. So Petri needs access to Docker — from inside Docker.&lt;/p&gt;

&lt;p&gt;Fix one: mount the Docker socket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/var/run/docker.sock:/var/run/docker.sock&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fix two: shared workspace path. Petri creates workspace folders inside its container. When it mounts those into sandbox containers, Docker looks for the path on the host — not inside Petri. The path doesn't exist.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/var/run/docker.sock:/var/run/docker.sock&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;/tmp/petri-workspaces:/tmp/petri-workspaces&lt;/span&gt;
&lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;PETRI_WORKSPACE_ROOT=/tmp/petri-workspaces&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same path on both sides. Docker finds it. Problem solved.&lt;/p&gt;
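&lt;p&gt;Why the identical path matters: the bind-mount source that Petri hands to Docker is resolved by the host daemon, not by Petri's own filesystem. A sketch of building that argument follows. The helper name and the &lt;code&gt;/workspace&lt;/code&gt; target inside the sandbox are assumptions; only the &lt;code&gt;-v host:container&lt;/code&gt; form is standard Docker.&lt;/p&gt;

```python
from pathlib import PurePosixPath


def sandbox_volume_args(workspace_root, sandbox_id, container_dir="/workspace"):
    """Build the docker run volume flag for one sandbox (hypothetical helper)."""
    # The left-hand side is looked up on the HOST, so workspace_root must be
    # a path that exists there too, e.g. the value of PETRI_WORKSPACE_ROOT.
    host_path = PurePosixPath(workspace_root) / sandbox_id
    return ["-v", f"{host_path}:{container_dir}"]


args = sandbox_volume_args("/tmp/petri-workspaces", "sb_abc123")
print(args)
```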

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start Petri&lt;/span&gt;
git clone https://github.com/jhansi-io/petri.git
&lt;span class="nb"&gt;cd &lt;/span&gt;petri
docker compose up

&lt;span class="c"&gt;# Install the SDK&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;jhansi
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full docs at &lt;a href="https://docs.jhansi.io" rel="noopener noreferrer"&gt;docs.jhansi.io&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;v0.6&lt;/strong&gt; — persistent registry so sandboxes survive Petri restarts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;v0.7&lt;/strong&gt; — streaming exec, real-time output as your code runs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server&lt;/strong&gt; — Cursor and Claude Code use Petri directly instead of their own cloud.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The MCP server is the one I'm most excited about. More on that soon.&lt;/p&gt;




&lt;p&gt;Star the repo if you're following the build. ⭐&lt;br&gt;
&lt;a href="https://github.com/jhansi-io/jhansi" rel="noopener noreferrer"&gt;github.com/jhansi-io/jhansi&lt;/a&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>python</category>
      <category>showdev</category>
      <category>tooling</category>
    </item>
    <item>
      <title>We built test mode. Then discovered it was broken.</title>
      <dc:creator>Arun Raghunath</dc:creator>
      <pubDate>Mon, 08 Jun 2026 09:00:11 +0000</pubDate>
      <link>https://dev.to/thearun85/we-built-test-mode-then-discovered-it-was-broken-5a5k</link>
      <guid>https://dev.to/thearun85/we-built-test-mode-then-discovered-it-was-broken-5a5k</guid>
      <description>&lt;p&gt;&lt;em&gt;Part of building jhansi.io in public.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Test mode sounded simple. Upload code, pass a command, jhansi runs it + your test suite. Done.&lt;/p&gt;

&lt;p&gt;Except it wasn't done. First run: empty output. No errors. Just silence.&lt;/p&gt;

&lt;p&gt;Here's what broke — and how it changed how we think about AI-generated code.&lt;/p&gt;




&lt;h2&gt;
  
  
  The original idea
&lt;/h2&gt;

&lt;p&gt;AI writes code. Scripts, APIs, full backends. But code without proof is liability.&lt;/p&gt;

&lt;p&gt;Test mode is the proof. You upload a project to a jhansi sandbox, pass the command that starts your app, and jhansi:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Runs the command&lt;/li&gt;
&lt;li&gt;Waits for the server to come up&lt;/li&gt;
&lt;li&gt;Executes your test suite against it&lt;/li&gt;
&lt;li&gt;Returns results&lt;/li&gt;
&lt;li&gt;Kills everything&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All inside an isolated container. Nothing escapes. Nothing persists.&lt;/p&gt;

&lt;p&gt;This is the verification layer missing from Cursor, Claude Code, Windsurf. They generate. We verify.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem we didn't anticipate
&lt;/h2&gt;

&lt;p&gt;v0.4 of test mode accepted a filename.&lt;/p&gt;

&lt;p&gt;Upload &lt;code&gt;app.py&lt;/code&gt;, call exec with &lt;code&gt;filename: "app.py"&lt;/code&gt;, jhansi figures out how to run it.&lt;/p&gt;

&lt;p&gt;The problem: real projects aren't single files.&lt;/p&gt;

&lt;p&gt;A Flask app is &lt;code&gt;app.py&lt;/code&gt; + &lt;code&gt;tests/&lt;/code&gt; + &lt;code&gt;requirements.txt&lt;/code&gt;. When we uploaded them separately, they landed flat in the workspace. pytest couldn't find &lt;code&gt;tests/&lt;/code&gt;. The installer couldn't find &lt;code&gt;requirements.txt&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We built test mode for the toy world. But AI doesn't generate toys. It generates projects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI agents don't write hello_world.py. They write repos.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The fix: projects are zips, not files
&lt;/h2&gt;

&lt;p&gt;Obvious once you see it. Upload the whole project as a zip.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# From inside your project&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;my_project &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; zip &lt;span class="nt"&gt;-r&lt;/span&gt; ../my_project.zip &lt;span class="nb"&gt;.&lt;/span&gt;

&lt;span class="c"&gt;# Upload to sandbox&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/v1/sandboxes/sb_abc123/upload &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"file=@my_project.zip"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;jhansi extracts it preserving structure. &lt;code&gt;tests/&lt;/code&gt; lands where pytest expects it. &lt;code&gt;requirements.txt&lt;/code&gt; lands where the installer looks.&lt;/p&gt;
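&lt;p&gt;Structure-preserving extraction is mostly what the standard &lt;code&gt;zipfile&lt;/code&gt; module already does. The sketch below is one way the server side might look, under the assumption that uploads arrive as bytes; &lt;code&gt;extract_project&lt;/code&gt; is a hypothetical name, and the explicit path check is a defensive addition, not something the article describes.&lt;/p&gt;

```python
import io
import tempfile
import zipfile
from pathlib import Path


def extract_project(zip_bytes, workspace):
    """Unpack an uploaded project into the workspace, keeping its layout."""
    workspace = Path(workspace).resolve()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            target = (workspace / name).resolve()
            # Fail loudly on entries that would land outside the workspace.
            if workspace != target and workspace not in target.parents:
                raise ValueError(f"unsafe path in archive: {name}")
        archive.extractall(workspace)
    return sorted(
        p.relative_to(workspace).as_posix()
        for p in workspace.rglob("*")
        if p.is_file()
    )


# Build a small project zip in memory to stand in for an upload.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("app.py", "print('hi')\n")
    archive.writestr("requirements.txt", "flask\n")
    archive.writestr("tests/test_app.py", "def test_ok():\n    assert True\n")

files = extract_project(buffer.getvalue(), tempfile.mkdtemp())
print(files)
```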

&lt;p&gt;This also killed the &lt;code&gt;filename&lt;/code&gt; param. You now pass the actual command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/v1/sandboxes/sb_abc123/exec &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"command": "python app.py", "test": true}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Language-agnostic. &lt;strong&gt;Python, Node, Go, Java.&lt;/strong&gt; Same API. jhansi handles the runtime.&lt;/p&gt;




&lt;h2&gt;
  
  
  What test mode actually does
&lt;/h2&gt;

&lt;p&gt;When &lt;code&gt;test: true&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Install deps&lt;/strong&gt; — blocking. Wait for pip install to finish. This was bug #2.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start your app&lt;/strong&gt; — detached, in the background&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wait 2s&lt;/strong&gt; for the server to bind to its port&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run tests&lt;/strong&gt; — pytest, jest, go test, mvn test. Auto-detected.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Return output&lt;/strong&gt; — stdout, stderr, test summary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kill container&lt;/strong&gt; — no state leaks&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Test runner needs zero config. If pytest finds it locally, we find it in the sandbox.&lt;/p&gt;




&lt;h2&gt;
  
  
  The dependency race condition
&lt;/h2&gt;

&lt;p&gt;The first version ran dependency install and app start in one Docker command.&lt;/p&gt;

&lt;p&gt;Container starts → &lt;code&gt;pip install&lt;/code&gt; begins → &lt;code&gt;python app.py&lt;/code&gt; tries to start → pytest fires 2s later.&lt;/p&gt;

&lt;p&gt;But &lt;code&gt;pip install flask&lt;/code&gt; was still downloading. Server wasn't up. Tests hit &lt;code&gt;ConnectionRefused&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The fix: serialize it.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install deps. Block until done.&lt;/li&gt;
&lt;li&gt;Start app. Detach.&lt;/li&gt;
&lt;li&gt;Sleep 2s.&lt;/li&gt;
&lt;li&gt;Test.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Obvious in hindsight. You only learn this by shipping and watching it fail.&lt;/p&gt;
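&lt;p&gt;The serialized order can be sketched with plain &lt;code&gt;subprocess&lt;/code&gt; calls. This is an illustration under assumptions, not the project's code: &lt;code&gt;run_test_mode&lt;/code&gt; is a hypothetical name, the three commands are local stand-ins for what would run inside the container, and the fixed wait mirrors the 2s sleep described above.&lt;/p&gt;

```python
import subprocess
import sys
import time


def run_test_mode(install_cmd, app_cmd, test_cmd, startup_wait=2.0):
    # 1. Install deps and block until the installer exits.
    subprocess.run(
        install_cmd,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=True,
    )
    # 2. Start the app detached; Popen returns immediately.
    app = subprocess.Popen(app_cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        # 3. Give the server time to bind to its port.
        time.sleep(startup_wait)
        # 4. Run the tests and capture their output.
        tests = subprocess.run(test_cmd, capture_output=True, text=True)
        return tests.returncode, tests.stdout
    finally:
        # 5. Always stop the app, even if the test step raised.
        app.terminate()
        app.wait()


# Local stand-ins for "pip install", "python app.py" and "pytest".
install = [sys.executable, "-c", "pass"]
app = [sys.executable, "-c", "import time; time.sleep(60)"]
tests = [sys.executable, "-c", "print('1 passed')"]

code, output = run_test_mode(install, app, tests, startup_wait=0.1)
print(code, output, end="")
```

&lt;p&gt;A fixed sleep is the simplest option. Polling the port until it accepts connections would be a sturdier variation, at the cost of more code.&lt;/p&gt;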




&lt;h2&gt;
  
  
  The honest bit
&lt;/h2&gt;

&lt;p&gt;We shipped test mode in v0.4. It works. All four languages tested end-to-end.&lt;/p&gt;

&lt;p&gt;But it took discovering that AI generates projects, not scripts, to get there.&lt;/p&gt;

&lt;p&gt;The first design was for the demo. The second design is for the world AI actually creates.&lt;/p&gt;

&lt;p&gt;This is why building in public matters. Not to announce features. To document how the problem reveals itself when you touch it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;v0.5 is serve mode — start a server, get a temporary preview URL, share it with your team, kill it when you're done.&lt;/p&gt;

&lt;p&gt;The last verification step before you deploy anywhere real. No more "works on my machine" from an LLM.&lt;/p&gt;

&lt;p&gt;Code is open source at &lt;a href="https://github.com/jhansi-io/jhansi" rel="noopener noreferrer"&gt;github.com/jhansi-io/petri&lt;/a&gt;. Apache 2.0. Self-host today.&lt;/p&gt;

&lt;p&gt;Building AI tooling at a bank or fintech and this sounds familiar? &lt;a href="https://jhansiio.featurebase.app" rel="noopener noreferrer"&gt;I want to hear from you.&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;jhansi.io — the missing runtime layer for AI-generated code.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>buildinpublic</category>
      <category>ai</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Closing the execution gap, Part 2: Dependency management</title>
      <dc:creator>Arun Raghunath</dc:creator>
      <pubDate>Sat, 06 Jun 2026 20:29:27 +0000</pubDate>
      <link>https://dev.to/thearun85/closing-the-execution-gap-part-2-dependency-management-3eah</link>
      <guid>https://dev.to/thearun85/closing-the-execution-gap-part-2-dependency-management-3eah</guid>
      <description>&lt;p&gt;&lt;em&gt;This is Part 2 of &lt;a href="https://dev.to/thearun85/closing-the-execution-gap-a-series-3490"&gt;Closing the execution gap&lt;/a&gt; — a series on building jhansi.io, a cloud sandbox for AI-generated code.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;The first question I got after shipping persistent sandboxes was predictable:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Great — but do I still have to pip install everything myself?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Yes. You did. That was embarrassing.&lt;/p&gt;

&lt;p&gt;If the pitch is "run AI-generated code with zero friction," making users manage deps manually is a contradiction. For regulated teams it's worse: every new package is a supply-chain review. Friction kills adoption.&lt;/p&gt;

&lt;p&gt;So v0.3 fixes it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem with dependencies in sandboxes
&lt;/h2&gt;

&lt;p&gt;Every sandbox starts as a clean container. Upload &lt;code&gt;main.py&lt;/code&gt;, hit run, and you get:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ModuleNotFoundError: No module named 'requests'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The naive fix is to install at exec time. But downloading from PyPI on every run is slow, expensive, and brittle. &lt;code&gt;pandas&lt;/code&gt; + &lt;code&gt;numpy&lt;/code&gt; is a 40s cold start. Run that 100 times and your AI agent burns budget before it does anything useful.&lt;/p&gt;

&lt;p&gt;The right fix: install once, persist forever.&lt;/p&gt;




&lt;h2&gt;
  
  
  Install once, persist forever
&lt;/h2&gt;

&lt;p&gt;jhansi.io gives every sandbox a persistent workspace — a folder that survives across runs. In v0.3, dependencies live there too.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;First exec: we detect deps, install to &lt;code&gt;/sandbox/deps&lt;/code&gt;, run your code.&lt;/li&gt;
&lt;li&gt;Second exec: deps are already there. Cold start drops dramatically.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This matters for AI agents. Humans tolerate a 30s install. Agents that try 5 approaches to solve a task can't. Workspace-scoped cache means failed attempt #1 pays the install tax. Attempts #2–5 run instantly.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# First exec: install + run
$ curl -X POST .../sandboxes/sb_abc123/exec -d '{"filename": "main.py"}'
{
  "output": "Installing requests==2.31.0...\n200\n"
}

# Second exec: just run
$ curl -X POST .../sandboxes/sb_abc123/exec -d '{"filename": "main.py"}'
{
  "output": "200\n"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That's the difference between "AI is too slow" and "AI is faster than a junior dev."&lt;/p&gt;
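&lt;p&gt;The article does not say how jhansi.io decides that deps are "already there". One plausible approach, shown here purely as an assumption, is to store a hash of the manifest next to the installed packages and reinstall only when it changes. The function names and the &lt;code&gt;.deps-stamp&lt;/code&gt; file are hypothetical.&lt;/p&gt;

```python
import hashlib
import tempfile
from pathlib import Path


def manifest_digest(workspace, manifest="requirements.txt"):
    return hashlib.sha256((Path(workspace) / manifest).read_bytes()).hexdigest()


def needs_install(workspace, stamp=".deps-stamp"):
    """True when no install has been recorded for the current manifest."""
    stamp_file = Path(workspace) / stamp
    if not stamp_file.exists():
        return True
    return stamp_file.read_text() != manifest_digest(workspace)


def mark_installed(workspace, stamp=".deps-stamp"):
    """Record the manifest hash after a successful install."""
    (Path(workspace) / stamp).write_text(manifest_digest(workspace))


workspace = Path(tempfile.mkdtemp())
(workspace / "requirements.txt").write_text("requests==2.31.0\n")

first = needs_install(workspace)   # nothing installed yet
mark_installed(workspace)          # pretend the install succeeded
second = needs_install(workspace)  # same manifest, skip the install
(workspace / "requirements.txt").write_text("requests==2.31.0\nflask\n")
third = needs_install(workspace)   # manifest changed, install again
print(first, second, third)
```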




&lt;h2&gt;
  
  
  Manifest first, auto-detect as fallback
&lt;/h2&gt;

&lt;p&gt;How do we know what to install? Both approaches, in the right order.&lt;/p&gt;

&lt;p&gt;If you provide a manifest, we trust you. You know your deps better than any static analyser. If you don't, we fall back to auto-detection.&lt;/p&gt;

&lt;p&gt;Priority for Python:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;pyproject.toml&lt;/code&gt; → &lt;code&gt;pip install&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;requirements.txt&lt;/code&gt; → &lt;code&gt;pip install -r&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Neither → &lt;code&gt;pipreqs&lt;/code&gt; scan&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;code&gt;pipreqs&lt;/code&gt; isn't just &lt;code&gt;import requests&lt;/code&gt; → &lt;code&gt;requests&lt;/code&gt;. It knows &lt;code&gt;import cv2&lt;/code&gt; means &lt;code&gt;opencv-python&lt;/code&gt;, &lt;code&gt;import sklearn&lt;/code&gt; means &lt;code&gt;scikit-learn&lt;/code&gt;. You don't have to remember.&lt;/p&gt;

&lt;p&gt;Using a manifest isn't just faster — it's auditable. Auditors can diff your pinned deps between runs. Auto-detect is for prototyping. More on auditability in Part 5.&lt;/p&gt;
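&lt;p&gt;The Python priority order reduces to a few file checks. A minimal sketch, with a hypothetical function name and return values that simply label the chosen strategy rather than the exact install command:&lt;/p&gt;

```python
import tempfile
from pathlib import Path


def python_install_strategy(workspace):
    """Pick the dependency source for a Python workspace, manifest first."""
    workspace = Path(workspace)
    if (workspace / "pyproject.toml").exists():
        return "pyproject.toml"
    if (workspace / "requirements.txt").exists():
        return "requirements.txt"
    # No manifest at all: fall back to scanning imports with pipreqs.
    return "pipreqs"


workspace = Path(tempfile.mkdtemp())
no_manifest = python_install_strategy(workspace)

(workspace / "requirements.txt").write_text("requests\n")
with_requirements = python_install_strategy(workspace)

(workspace / "pyproject.toml").write_text("[project]\nname = 'demo'\n")
with_both = python_install_strategy(workspace)

print(no_manifest, with_requirements, with_both)
```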




&lt;h2&gt;
  
  
  Four languages, four strategies
&lt;/h2&gt;

&lt;p&gt;AI doesn't just write Python. jhansi.io handles the four languages LLMs generate most:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;Manifest detected&lt;/th&gt;
&lt;th&gt;Install command&lt;/th&gt;
&lt;th&gt;No manifest fallback&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Python&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;pyproject.toml&lt;/code&gt;, &lt;code&gt;requirements.txt&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;&lt;code&gt;pip install --target /sandbox/deps&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;pipreqs&lt;/code&gt; auto-detect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Node&lt;/td&gt;
&lt;td&gt;&lt;code&gt;package.json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;npm install&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Run as-is&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Go&lt;/td&gt;
&lt;td&gt;&lt;code&gt;go.mod&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;go mod download&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;go mod init&lt;/code&gt; + &lt;code&gt;go mod tidy&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Java&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;pom.xml&lt;/code&gt;, &lt;code&gt;build.gradle&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Maven or Gradle&lt;/td&gt;
&lt;td&gt;Direct &lt;code&gt;javac&lt;/code&gt; compile&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each language keeps its own idioms. We don't impose a universal abstraction. Workspace-scoping means one sandbox's &lt;code&gt;torch==2.1.0&lt;/code&gt; can't poison another's &lt;code&gt;torch==1.13&lt;/code&gt;. No dependency hell across AI runs.&lt;/p&gt;




&lt;h2&gt;
  
  
  The trust boundary
&lt;/h2&gt;

&lt;p&gt;One decision worth documenting: we don't vet what gets installed.&lt;/p&gt;

&lt;p&gt;Egress is restricted to official registries — PyPI, npm, Maven Central, &lt;code&gt;proxy.golang.org&lt;/code&gt; — and nothing else. No arbitrary domains. What you install from those registries is your responsibility.&lt;/p&gt;
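
&lt;p&gt;As a rough illustration of that rule, here is a minimal allowlist check. The hostnames are assumptions about which hosts back each registry, and &lt;code&gt;is_egress_allowed&lt;/code&gt; is a hypothetical helper. A real sandbox would presumably enforce this at the network layer rather than in Python.&lt;/p&gt;

```python
from urllib.parse import urlsplit

# Assumed hostnames for the registries named above (illustrative only).
ALLOWED_REGISTRY_HOSTS = frozenset({
    "pypi.org",
    "files.pythonhosted.org",
    "registry.npmjs.org",
    "repo1.maven.org",
    "proxy.golang.org",
})


def is_egress_allowed(url: str) -> bool:
    """Allow only HTTPS requests whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    # Exact match on the hostname, so "pypi.org.evil.example" is rejected.
    return parts.scheme == "https" and parts.hostname in ALLOWED_REGISTRY_HOSTS
```

&lt;p&gt;Exact host matching is deliberate here: suffix or substring matching would let a lookalike domain through.&lt;/p&gt;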

&lt;p&gt;The contract is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;jhansi.io guarantees isolation. You own your code.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the same model as AWS Lambda or Cloud Run. We contain the blast radius. We don't audit your imports.&lt;/p&gt;

&lt;p&gt;SBOM per exec — a full list of every package installed, with versions and licenses — is on the roadmap. Today we contain. Tomorrow we curate.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;Two things didn't make v0.3:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Streaming output&lt;/strong&gt; — dep installs can take 30s. Right now you wait. Soon you'll see output live and know exactly why &lt;code&gt;torch&lt;/code&gt; is taking forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing import detection&lt;/strong&gt; — if your manifest forgets a package, you get an &lt;code&gt;ImportError&lt;/code&gt; today. We should surface the unlisted import in the response. Coming soon.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next in the series: &lt;strong&gt;Isolation&lt;/strong&gt; — you can &lt;code&gt;pip install&lt;/code&gt; safely now. But can you stop that package from exfiltrating your AWS credentials? What "hard-sandboxed" actually means, why Docker isn't enough, and the attacks most sandboxes miss.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;jhansi.io is open source (Apache 2.0) at &lt;a href="https://github.com/jhansi-io" rel="noopener noreferrer"&gt;github.com/jhansi-io&lt;/a&gt;. Follow the series on &lt;a href="https://dev.to/thearun85/closing-the-execution-gap-a-series-3490"&gt;Dev.to&lt;/a&gt;, &lt;a href="https://www.linkedin.com/posts/arun-raghunath_run-ai-generated-code-safely-activity-7469098788822093824-oKzG?utm_source=share&amp;amp;utm_medium=member_desktop&amp;amp;rcm=ACoAAAOmdQMBVGWSljvWa9sZSYfndPCZGwXbz0M" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, and &lt;a href="https://x.com/thearun85/status/2063334004615528556?s=20" rel="noopener noreferrer"&gt;X&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>devtools</category>
      <category>fintech</category>
    </item>
    <item>
      <title>Closing the execution gap: a series</title>
      <dc:creator>Arun Raghunath</dc:creator>
      <pubDate>Sat, 06 Jun 2026 18:51:07 +0000</pubDate>
      <link>https://dev.to/thearun85/closing-the-execution-gap-a-series-3490</link>
      <guid>https://dev.to/thearun85/closing-the-execution-gap-a-series-3490</guid>
      <description>&lt;p&gt;Every AI coding tool can write Python — Cursor, Claude Code, Windsurf. None of them can run it safely in production.&lt;/p&gt;

&lt;p&gt;That gap between "AI wrote the code" and "the code ran safely" is exactly what I'm building &lt;a href="https://jhansi.io" rel="noopener noreferrer"&gt;jhansi.io&lt;/a&gt; to close.&lt;/p&gt;

&lt;p&gt;This series documents the journey. One layer of the problem at a time.&lt;/p&gt;




&lt;h2&gt;
  
  
  The execution gap
&lt;/h2&gt;

&lt;p&gt;When AI generates code, four things still stand between you and prod:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Dependencies&lt;/strong&gt; — Install the right packages, with versions and licenses you trust&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Isolation&lt;/strong&gt; — Run it hard-sandboxed. No host access, no outbound network, no surprises&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secrets&lt;/strong&gt; — Let AI use your API keys without ever letting it see or leak them&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit&lt;/strong&gt; — Log every execution. Prompt, code, result, timestamp. Compliance-grade.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Most teams stop at step 1. Banks and fintechs can't. FCA, SOC2, and the EU AI Act require audit trails for AI actions. You can't &lt;code&gt;eval()&lt;/code&gt; your way through an audit.&lt;/p&gt;

&lt;p&gt;jhansi.io is the missing &lt;code&gt;run()&lt;/code&gt; for AI-generated code. Open core, cloud sandbox, built to close each part of the gap — layer by layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  The series
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Part 1 — Persistent sandboxes&lt;/strong&gt;&lt;br&gt;
Why "ephemeral" breaks debugging, state, and compliance. The case for giving every AI a home directory.&lt;br&gt;
→ &lt;a href="https://dev.to/thearun85/the-case-for-persistent-sandboxes-in-ai-code-execution-3158"&gt;Read Part 1&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Part 2 — Dependency management&lt;/strong&gt;&lt;br&gt;
Detecting, installing, and locking deps across Python, Node, Go, and Java. With SBOMs and policy built in.&lt;br&gt;
→ &lt;a href="https://dev.to/thearun85/closing-the-execution-gap-part-2-dependency-management-3eah"&gt;Read Part 2&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Part 3 — Isolation&lt;/strong&gt; &lt;em&gt;(coming soon)&lt;/em&gt;&lt;br&gt;
What "hard isolation" actually means. Containers, Firecracker, zero trust networking, and the metadata service attacks you haven't thought of yet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Part 4 — Secrets&lt;/strong&gt; &lt;em&gt;(coming soon)&lt;/em&gt;&lt;br&gt;
Kernel-level proxies. AI can call Stripe without the key ever entering the sandbox.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Part 5 — Audit&lt;/strong&gt; &lt;em&gt;(coming soon)&lt;/em&gt;&lt;br&gt;
Who ran what, when, with which prompt. Hash-chained logs that satisfy auditors, not just engineers.&lt;/p&gt;
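
&lt;p&gt;To make "hash-chained" concrete, here is a minimal sketch of the general technique, not the jhansi.io audit format. The record fields, the &lt;code&gt;GENESIS&lt;/code&gt; placeholder, and the function names are all assumptions for illustration.&lt;/p&gt;

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def append_entry(log: list[dict], record: dict) -> dict:
    """Append a record whose hash also covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    entry = {"record": record, "prev_hash": prev_hash, "hash": digest}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True
```

&lt;p&gt;Because each hash depends on the one before it, changing an old record invalidates every later entry, which is what makes tampering detectable.&lt;/p&gt;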




&lt;p&gt;Building this in public. Follow the series on &lt;a href="https://dev.to/thearun85/closing-the-execution-gap-a-series-3490"&gt;Dev.to&lt;/a&gt;, &lt;a href="https://www.linkedin.com/posts/arun-raghunath_run-ai-generated-code-safely-activity-7469098788822093824-oKzG?utm_source=share&amp;amp;utm_medium=member_desktop&amp;amp;rcm=ACoAAAOmdQMBVGWSljvWa9sZSYfndPCZGwXbz0M" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt;, and &lt;a href="https://x.com/thearun85/status/2063334004615528556?s=20" rel="noopener noreferrer"&gt;X&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Code is Apache 2.0 at &lt;a href="https://github.com/jhansi-io" rel="noopener noreferrer"&gt;github.com/jhansi-io&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>fintech</category>
      <category>devops</category>
    </item>
    <item>
      <title>The case for persistent sandboxes in AI code execution</title>
      <dc:creator>Arun Raghunath</dc:creator>
      <pubDate>Fri, 05 Jun 2026 09:56:53 +0000</pubDate>
      <link>https://dev.to/thearun85/the-case-for-persistent-sandboxes-in-ai-code-execution-3158</link>
      <guid>https://dev.to/thearun85/the-case-for-persistent-sandboxes-in-ai-code-execution-3158</guid>
      <description>&lt;p&gt;Every AI coding tool generates code. None of them solve what happens next.&lt;/p&gt;

&lt;p&gt;Cursor writes your Python. Claude Code refactors your script. Windsurf &lt;br&gt;
ships your feature. But running that code safely, in isolation, with &lt;br&gt;
audit trails, without exposing your secrets, is still an unsolved problem.&lt;/p&gt;

&lt;p&gt;That's what Jhansi.io is built for.&lt;/p&gt;
&lt;h2&gt;
  
  
  The mistake we made in v0.1
&lt;/h2&gt;

&lt;p&gt;Our first execution model was simple. Send code as a string, run it in &lt;br&gt;
an isolated container, return the output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /v1/sandboxes/{id}/exec
body: { "code": "print('hello world')" }
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It worked. But it had three fundamental problems.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;Why it breaks&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single file only&lt;/td&gt;
&lt;td&gt;No multi-file projects, no shared modules, no dependencies. Not how production code works.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Full payload on every run&lt;/td&gt;
&lt;td&gt;Even if nothing changed, you resend everything. Wasted bandwidth, added latency.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No foundation for delta sync&lt;/td&gt;
&lt;td&gt;If you're sending everything every time, there's nothing to diff against.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The insight
&lt;/h2&gt;

&lt;p&gt;A sandbox should be a workspace, not a disposable container.&lt;/p&gt;

&lt;p&gt;Give every sandbox a dedicated folder on disk. Files live there between &lt;br&gt;
runs. Execution just says "run this file" — no payload, no resend, no waste.&lt;/p&gt;

&lt;p&gt;This is the architecture shift in Jhansi.io v0.2.&lt;/p&gt;
&lt;h2&gt;
  
  
  What changed
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Workspace per sandbox.&lt;/strong&gt; Every sandbox gets a dedicated folder on disk &lt;br&gt;
at creation time. Zero config locally, overridable in production via &lt;br&gt;
&lt;code&gt;PETRI_WORKSPACE_ROOT&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File upload API.&lt;/strong&gt; Upload once. Upload only when something changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /v1/sandboxes/{id}/files
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Files land in the sandbox workspace and persist between runs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exec by filename.&lt;/strong&gt; No code in the request body. Just a filename.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /v1/sandboxes/{id}/exec
body: { "filename": "main.py" }
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Jhansi.io mounts the workspace into a fresh isolated container and runs &lt;br&gt;
the named file. The container dies. The workspace survives.&lt;/p&gt;
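
&lt;p&gt;One way to picture "the container dies, the workspace survives" is a throwaway container with the workspace bind-mounted into it. The sketch below only builds the &lt;code&gt;docker run&lt;/code&gt; arguments. The image name, the &lt;code&gt;/workspace&lt;/code&gt; mount point, and the helper &lt;code&gt;build_run_command&lt;/code&gt; are assumptions, not how Jhansi.io necessarily does it.&lt;/p&gt;

```python
from pathlib import Path


def build_run_command(workspace: Path, filename: str,
                      image: str = "python:3.12-slim") -> list[str]:
    """Build a `docker run` argv that executes one file from the workspace."""
    root = workspace.resolve()
    target = (root / filename).resolve()
    # Refuse filenames that escape the workspace, e.g. "../secrets.py".
    if not target.is_relative_to(root):
        raise ValueError(f"{filename!r} is outside the workspace")
    return [
        "docker", "run",
        "--rm",                       # the container is removed when it exits
        "-v", f"{root}:/workspace",   # the workspace on disk outlives it
        "-w", "/workspace",
        image,
        "python", filename,
    ]
```

&lt;p&gt;Passing the result to &lt;code&gt;subprocess.run&lt;/code&gt; would start the container. Network and resource limits are left out to keep the sketch short.&lt;/p&gt;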

&lt;h2&gt;
  
  
  The flow
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create once&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST /v1/sandboxes &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"language": "python"}'&lt;/span&gt;

&lt;span class="c"&gt;# Upload when files change&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST /v1/sandboxes/&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;/files &lt;span class="nt"&gt;-F&lt;/span&gt; &lt;span class="s2"&gt;"file=@main.py"&lt;/span&gt;

&lt;span class="c"&gt;# Exec as many times as you need&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST /v1/sandboxes/&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;/exec &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"filename": "main.py"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What this unlocks
&lt;/h2&gt;

&lt;p&gt;The persistent workspace is the foundation for everything coming next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Delta sync&lt;/strong&gt; — detect file changes, upload only diffs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto dependency detection&lt;/strong&gt; — parse imports, install packages invisibly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-file projects&lt;/strong&gt; — real codebases, not toy scripts&lt;/li&gt;
&lt;/ul&gt;
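
&lt;p&gt;Delta sync is not built yet, but the change-detection half could look something like this. It is a sketch under assumptions: it compares whole-file SHA-256 digests rather than computing byte-level diffs, and &lt;code&gt;snapshot&lt;/code&gt; and &lt;code&gt;changed_files&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import hashlib
from pathlib import Path


def snapshot(workspace: Path) -> dict[str, str]:
    """Map each file's workspace-relative path to its SHA-256 digest."""
    digests = {}
    for path in sorted(workspace.rglob("*")):
        if path.is_file():
            name = path.relative_to(workspace).as_posix()
            digests[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def changed_files(previous: dict[str, str], current: dict[str, str]) -> list[str]:
    """Return paths that are new or whose contents differ from `previous`."""
    return sorted(
        name for name, digest in current.items()
        if previous.get(name) != digest
    )
```

&lt;p&gt;A client would keep the last snapshot, take a new one before each run, and upload only the paths that &lt;code&gt;changed_files&lt;/code&gt; returns. Deleted files are not handled here.&lt;/p&gt;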

&lt;p&gt;If you're building AI agents that generate and run code, we want you in &lt;br&gt;
our design partner program. Early access at jhansiio.featurebase.app&lt;/p&gt;




&lt;p&gt;Jhansi.io — Build it. Run it. Ship it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>architecture</category>
      <category>devtools</category>
    </item>
    <item>
      <title>I got tired of running Docker manually. So I built a sandbox for AI-generated code.</title>
      <dc:creator>Arun Raghunath</dc:creator>
      <pubDate>Thu, 04 Jun 2026 09:18:31 +0000</pubDate>
      <link>https://dev.to/thearun85/i-got-tired-of-running-docker-manually-so-i-built-a-sandbox-for-ai-generated-code-36ia</link>
      <guid>https://dev.to/thearun85/i-got-tired-of-running-docker-manually-so-i-built-a-sandbox-for-ai-generated-code-36ia</guid>
      <description>&lt;p&gt;I've been on sabbatical for a few months. Writing code. Building projects.&lt;/p&gt;

&lt;p&gt;And running Docker manually. Again. And again.&lt;br&gt;&lt;br&gt;
&lt;code&gt;docker run&lt;/code&gt;. Check what's up. &lt;code&gt;docker stop&lt;/code&gt;. Forget one. Find it next week eating RAM. Repeat.&lt;/p&gt;

&lt;p&gt;At some point I asked: why is this still manual? Why can't containers just spin up, run, and die when they're done?&lt;/p&gt;

&lt;p&gt;Then I threw AI into the mix.&lt;br&gt;&lt;br&gt;
Now I'm not just running my code. I'm running code a model wrote. Code I haven't audited line by line. Code that might have &lt;code&gt;os.system(f'rm -rf {user_input}')&lt;/code&gt; because the model had a bad day.&lt;/p&gt;

&lt;p&gt;That's a different problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The question nobody wants to answer
&lt;/h2&gt;

&lt;p&gt;Cursor, Claude Code, Windsurf, Copilot. They all generate Python, Node, Go.&lt;/p&gt;

&lt;p&gt;None of them answer: &lt;strong&gt;where does that code actually run?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Best case: you paste it into your terminal and hope.&lt;br&gt;&lt;br&gt;
Worst case: you're piping untrusted code into &lt;code&gt;eval()&lt;/code&gt; with access to your &lt;code&gt;.env&lt;/code&gt; file, your AWS creds, and your customer database.&lt;/p&gt;

&lt;p&gt;In a startup that's risky.&lt;br&gt;&lt;br&gt;
In fintech that's an FCA fine and a conversation with Legal you do not want to have.&lt;/p&gt;

&lt;p&gt;I spent 18 years in banking. I watched teams ban AI coding tools outright because nobody could answer: "Where does the generated code run, and what can it touch?"&lt;/p&gt;

&lt;p&gt;So I decided to build the answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  What ships today: Petri
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;jhansi.io&lt;/strong&gt; starts with &lt;strong&gt;Petri&lt;/strong&gt;, the execution engine. It's live right now.&lt;/p&gt;

&lt;p&gt;What it does:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spins up an isolated Docker container per request&lt;/li&gt;
&lt;li&gt;Runs Python, Node, or Go code&lt;/li&gt;
&lt;li&gt;Returns stdout/stderr&lt;/li&gt;
&lt;li&gt;Tears down the container. Zero state left behind.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The API:&lt;/p&gt;

&lt;p&gt;POST /v1/sandboxes → Create sandbox, get sb_&lt;br&gt;
POST /v1/sandboxes/{id}/exec → Run code, get output&lt;br&gt;
DELETE /v1/sandboxes/{id} → Destroy it. Gone.&lt;/p&gt;

&lt;p&gt;No Docker CLI. No Compose files. No "wait, is &lt;code&gt;sad_fermat&lt;/code&gt; still running from Tuesday?"&lt;/p&gt;

&lt;p&gt;Petri answers "where does code run". That's it. It does not touch secrets. It does not produce compliance audit logs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why existing tools don't cut it
&lt;/h2&gt;

&lt;p&gt;E2B, Modal, and Daytona are great tools. I use them. But they're SaaS only.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;E2B / Modal / Daytona&lt;/th&gt;
&lt;th&gt;jhansi.io with Petri&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hosting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Public cloud only&lt;/td&gt;
&lt;td&gt;Self-hosted or managed SaaS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data residency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Your code runs on their infra&lt;/td&gt;
&lt;td&gt;Runs in your VPC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Execution model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stateful VMs in many cases&lt;/td&gt;
&lt;td&gt;Ephemeral container per run&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Who can use it&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Startups&lt;/td&gt;
&lt;td&gt;Startups, banks, anyone with a regulator&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you're a bank, you cannot send customer PII to a third party to execute. You need to self-host. You need control.&lt;/p&gt;

&lt;p&gt;Petri gives you that. But execution is only 30% of the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The roadmap: What I'm building next
&lt;/h2&gt;

&lt;p&gt;Petri solves "where does it run". It doesn't solve "what can it touch" or "prove it to compliance".&lt;/p&gt;

&lt;p&gt;That's why I'm building &lt;strong&gt;TenantVault&lt;/strong&gt; and the &lt;strong&gt;Audit Layer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TenantVault&lt;/strong&gt;: Secrets injection where your AI agent can use a database password to run a query, but it can't read the password, print it, or exfiltrate it.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Audit Layer&lt;/strong&gt;: Full execution traces. What ran, what files it touched, what network calls it made. Stream it to your SIEM.&lt;/p&gt;

&lt;p&gt;I'm building those because 18 years in banking taught me you can't deploy AI codegen without them. "It ran in Docker" isn't enough when the FCA asks questions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Full roadmap with ETAs: jhansi.io/roadmap&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No vaporware. If it's not on the roadmap with a target date, we're not building it yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where things stand
&lt;/h2&gt;

&lt;p&gt;Petri is running. Python, Node, Go support. REST API. Sub-second cold starts.&lt;/p&gt;

&lt;p&gt;Next up is the SDK so you can do this:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from jhansi import Sandbox

with Sandbox(language="python") as sb:
    result = sb.exec("print('hello from isolation')")
    print(result.output)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No SDK to share yet. I'm building in public because I want feedback before I lock the API. Especially from teams in fintech, healthtech, or anywhere "oops, it leaked" isn't an option.&lt;/p&gt;

&lt;h2&gt;
  
  
  Follow along
&lt;/h2&gt;

&lt;p&gt;I'll post technical deep-dives here and on GitHub as I ship:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Python + TypeScript SDKs&lt;/li&gt;
&lt;li&gt;Self-hosted Docker Compose setup&lt;/li&gt;
&lt;li&gt;TenantVault and audit streaming&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Jhansi.io — Build it. Run it. Ship it.&lt;/strong&gt;&lt;br&gt;
Because "where does this code run?" shouldn't be a rhetorical question anymore.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Building in public. Star the repo on &lt;a href="https://github.com/jhansi-io/jhansi" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; or check the roadmap at &lt;a href="https://jhansi.io/roadmap" rel="noopener noreferrer"&gt;jhansi.io/roadmap&lt;/a&gt;. Questions? Drop them below.&lt;/em&gt;&lt;/p&gt;


</description>
      <category>ai</category>
      <category>docker</category>
      <category>security</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
