Alex72-py

Posted on Jun 7

I Audited an AI Chatbot's Sandbox Like a Black-Box Linux Machine

#ai #infrastructure #kubernetes #security

I spent six hours doing something that probably says worrying things about my hobbies.

Instead of using Kimi 2.6 Instant as a chatbot, I treated it like an unfamiliar Linux machine I'd just SSH'd into. No jailbreaks, no prompt injection, nothing sketchy. Just passive observation and measurement from inside the provided environment.

What I found was more interesting than expected.

Background: Why Ignore the Model's Self-Descriptions?

Early on I noticed something: what the model says it can do and what the runtime actually allows aren't always the same thing.

You'll hear:

"I don't have internet access."
"I can't access system information."

Both can be true at the product layer while sitting on top of something much more capable underneath.

So I stopped asking and started measuring instead.

The Infrastructure

First surprise: this doesn't feel like a tiny chat runtime.

Host: Alibaba Cloud, LifseaOS
Kernel: Linux 5.10.134-18.0.10.lifsea8.x86_64
CPU: Intel Xeon Platinum, 2 logical cores (cgroup throttled — 61 throttle events logged under stress)
RAM: Hard OOM kill at exactly 3,221,225,472 bytes (3 GiB). No swap.
Execution: Kubernetes Pod, Burstable QoS class

This is real cloud infrastructure, not a toy backend.
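
Limits like these are typically visible through the cgroup filesystem. Here is a minimal sketch of how the memory cap and throttle counter could be read from inside a container. The file paths are assumptions (cgroup v2 first, then the common cgroup v1 locations), and the post does not say which interface the numbers above actually came from.

```python
from pathlib import Path

# Candidate locations: cgroup v2 first, then cgroup v1. These paths are
# assumptions about the sandbox layout, not something stated in the post.
MEMORY_LIMIT_PATHS = [
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
]
CPU_STAT_PATHS = [
    "/sys/fs/cgroup/cpu.stat",
    "/sys/fs/cgroup/cpu/cpu.stat",
]


def read_first(paths):
    """Return the text of the first readable file in paths, or None."""
    for p in paths:
        try:
            return Path(p).read_text()
        except OSError:
            continue
    return None


def parse_memory_limit(text):
    """Return the limit in bytes, or None if the file says 'max' or is empty."""
    value = text.strip()
    return None if value in ("", "max") else int(value)


def parse_cpu_stat(text):
    """Parse 'key value' lines from cpu.stat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


if __name__ == "__main__":
    mem = read_first(MEMORY_LIMIT_PATHS)
    cpu = read_first(CPU_STAT_PATHS)
    if mem is not None:
        print("memory limit (bytes):", parse_memory_limit(mem))
    if cpu is not None:
        # nr_throttled counts enforcement periods in which the group was throttled
        print("throttle events:", parse_cpu_stat(cpu).get("nr_throttled"))
```

A limit of 3,221,225,472 bytes is exactly 3 × 1024³, which is what a Kubernetes memory limit of `3Gi` would produce.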

The Credential Finding

Most straightforward discovery:

# Split the NUL-separated environment into lines and search for password-like names
tr '\0' '\n' < /proc/self/environ | grep -i pass
# SSH_PASSWORD=sshpassword

A hardcoded SSH credential sits in the process environment, readable by any code running in the container through its own /proc/self/environ.

Not exploitable in any meaningful way given the network restrictions. But it's a classic container misconfiguration worth documenting.
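
The same check can be scripted so it reports every secret-looking variable name, not just ones containing "pass". This is a rough sketch: the keyword list in `SECRET_NAME` is my own heuristic, not part of the original audit, and it prints names only so the values stay out of logs.

```python
import re

# Heuristic pattern for variable names that often hold secrets (an assumption;
# tune it for your own environment).
SECRET_NAME = re.compile(r"pass|secret|token|key|credential", re.IGNORECASE)


def parse_environ(raw):
    """Split a NUL-separated environ blob (bytes) into a {name: value} dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            name, _, value = entry.partition(b"=")
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


def secret_like_names(env):
    """Return the sorted variable names that match the secret heuristic."""
    return sorted(name for name in env if SECRET_NAME.search(name))


if __name__ == "__main__":
    try:
        with open("/proc/self/environ", "rb") as fh:
            raw = fh.read()
    except OSError:
        raw = b""  # not on Linux, or /proc is unavailable
    for name in secret_like_names(parse_environ(raw)):
        print(name)  # names only, never values
```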

Disk Layout

vda (40GB)
├─ vda1    1MB     BIOS boot
├─ vda2    127MB   EFI/boot
├─ vda3    384MB   Boot partition
├─ vda4    9.5GB   Host root
└─ vda5    30GB    /mnt — ext4, shared with host ← persistent
vdb         1GB    Unmounted
vdc        13GB    Unmounted

Key finding: /mnt survives pod restarts. It's a real ext4 partition shared with the host. The OverlayFS root is ephemeral. /mnt/agents is a FUSE mount (kimi-portal) — appears to be the bridge between container and AI platform layer.
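
To tell overlay, ext4, and FUSE mounts apart without relying on tool output formatting, the mount table can be parsed directly. This is a small sketch that assumes the standard `/proc/mounts` format (device, mount point, filesystem type, options). The post does not say how the layout above was gathered.

```python
def parse_mounts(text):
    """Parse /proc/mounts-style text into a list of dicts."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            mounts.append({
                "device": fields[0],
                "mountpoint": fields[1],
                "fstype": fields[2],
                "options": fields[3].split(","),
            })
    return mounts


def by_fstype(mounts, prefix):
    """Return mount points whose filesystem type starts with prefix."""
    return [m["mountpoint"] for m in mounts if m["fstype"].startswith(prefix)]


def read_mounts(path="/proc/mounts"):
    """Read and parse the live mount table; returns [] if it is unreadable."""
    try:
        with open(path) as fh:
            return parse_mounts(fh.read())
    except OSError:
        return []
```

Calling `by_fstype(read_mounts(), "fuse")` lists the FUSE mounts, and the same call with `"overlay"` or `"ext4"` covers the other two layers. Note that a mount's filesystem type says nothing about how long its contents live. That has to be tested separately.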

Network Architecture

The code execution container is genuinely air-gapped:

curl to external hosts: silent failure
Chromium: can't reach public internet
Raw TCP/UDP egress: firewall blocked

But the built-in web tools do reach the internet — through a rotating residential proxy pool. Probing egress IPs revealed Colombia-based ISPs (Bogotá, Pitalito) via Evomi and NetNut proxy providers.

Code Container    →    egress DENIED
Web Tool Layer    →    Residential Proxy Pool    →    Internet

Internal network visible:

Container: 10.162.57.123
CoreDNS: 192.168.0.10
K8s API: 192.168.0.1

You can bind and serve on ports inside the internal network. Only outbound traffic to the public internet is restricted.
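
A "silent failure" from curl can mean several things, so an explicit TCP probe with a timeout is useful for separating blocked egress from DNS or TLS problems. This is a minimal sketch. The function name and timeout are my own choices, and the targets in the usage comment are only examples.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable networks,
        # and DNS resolution failures.
        return False


# Example usage (targets are illustrative):
#   can_connect("example.com", 443)    # public egress
#   can_connect("192.168.0.10", 53)    # the CoreDNS address listed above
```

A probe like this only shows whether the TCP handshake completes. It cannot tell a firewall drop from a missing route, which is why the timeout matters.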

The Virtual Display

The environment exposed DISPLAY=:99, a virtual graphical display.

Testing confirmed Xvfb running at 1920×1080. I rendered a GUI window using Tkinter, painted content to the screen, and captured a screenshot — all from inside a standard chat interface.
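
For anyone who wants to repeat the display check, here is a hedged sketch using only the standard library. It asks Tk for the screen size and returns `None` when no display is reachable. It does not reproduce the window rendering or the screenshot step, and `probe_display` is a name I made up for illustration.

```python
import os


def probe_display():
    """Return (width, height) of the X display, or None if it is unavailable."""
    if not os.environ.get("DISPLAY"):
        return None
    try:
        import tkinter as tk  # may be missing on minimal Python installs
    except ImportError:
        return None
    try:
        root = tk.Tk()  # raises TclError if the display cannot be opened
    except tk.TclError:
        return None
    try:
        root.withdraw()  # no need to show a window just to read the size
        return root.winfo_screenwidth(), root.winfo_screenheight()
    finally:
        root.destroy()


if __name__ == "__main__":
    print(probe_display())
```

On an Xvfb display configured as described above, this would be expected to report a 1920 by 1080 screen.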

Software Surface Area

Notable installed packages beyond standard utilities:

Automation: Playwright, Selenium, PyAutoGUI, python3-xlib, screenshot tooling

ML: PyTorch 2.8, TensorFlow, scikit-learn (CUDA/NVIDIA packages are present, but no GPU is accessible: a programmatic availability check returns false)

Vision/OCR: OpenCV, EasyOCR, Tesseract, Pillow

Backend: FastAPI, Uvicorn, websockets — enough to run a web server from inside the sandbox

Office: python-docx, python-pptx, openpyxl, reportlab
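
An inventory like this can be taken without importing anything heavy, by asking the import system whether each module can be found. The module names below are my best guess at the import names for the packages listed (`pytesseract` in particular is an assumption, since the post only mentions Tesseract). The CUDA check uses `torch.cuda.is_available()`, which may or may not be the check the post refers to.

```python
import importlib.util

# Import names assumed to correspond to the packages listed above.
CANDIDATES = [
    "playwright", "selenium", "pyautogui",
    "torch", "tensorflow", "sklearn",
    "cv2", "easyocr", "pytesseract", "PIL",
    "fastapi", "uvicorn", "websockets",
    "docx", "pptx", "openpyxl", "reportlab",
]


def installed(names):
    """Map each top-level module name to True/False without importing it."""
    result = {}
    for name in names:
        try:
            result[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            result[name] = False
    return result


def cuda_available():
    """Return torch.cuda.is_available(), or None if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch  # imported lazily because it is slow to load
    return torch.cuda.is_available()


if __name__ == "__main__":
    for name, present in installed(CANDIDATES).items():
        print(f"{name}: {'yes' if present else 'no'}")
```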

Security Summary

| Finding | Notes |
| --- | --- |
| SSH credential in `/proc/self/environ` | Config hygiene issue, not exploitable given air-gap |
| No container audit logging | `/var/log/*` empty, so no local forensic trail |
| Persistent storage at `/mnt` | Survives pod resets |
| Web egress via residential proxy | Rotating IPs, Colombia-based |
| No PID limit | No fork-bomb protection |
| Virtual display active | Xvfb + full automation stack |
| Air-gapped code execution | Working as intended |

Takeaway

Under a standard AI chat interface: a Kubernetes pod on Alibaba Cloud, OverlayFS container root, persistent ext4 partition, FUSE-mounted agent bridge, full automation stack, and web access through a residential proxy pool.

The air-gap works. The credential in the environment and the absent audit logging are the hygiene findings worth noting.

Curious whether others have profiled GPT, Gemini, or Claude's sandbox environments — the infrastructure patterns would be interesting to compare.

Methodology: passive inspection only, standard chat UI, no exploitation attempted.

Top comments (2)

Max Quimby • Jun 12

Great write-up because it measures instead of trusting the model's self-report — the gap between "I can't access system info" at the product layer and what the runtime actually exposes is exactly where these things go wrong.

The /proc/self/environ finding is the one I'd flag hardest. Injecting secrets as env vars is still the default in most container setups, and people forget the workload itself can always read its own environ — so any code-exec sandbox effectively hands that credential to whatever it runs. We moved anything sensitive out of env into a short-lived file with tight perms (or a broker the sandbox can't reach) precisely because of this.

The part that would worry me more than the SSH cred is /mnt surviving pod restarts and being shared with the host. In a multi-tenant setup that's a cross-contamination path: did you ever see another session's artifacts land there? Ephemeral overlay root is the right instinct, but a shared persistent mount needs per-tenant scoping or it quietly becomes a side channel. Loved the "SSH'd into a stranger's box" framing.

Alex72-py • Jun 18 • Edited

Good catch, and I should correct one thing from my write-up: I originally treated "/mnt" surviving as evidence of persistence, but later observations point to a weaker conclusion. What I actually saw was state surviving within what looked like the same runtime lifecycle, not evidence that storage persisted across real pod replacement.

So on your concern: I agree the "/proc/self/environ" part is the more concrete signal. If code executes inside the workload, env vars stop being a meaningful secret boundary and become part of the attack surface.

For "/mnt", though, I never observed another session’s artifacts or anything suggesting cross-tenant visibility. That’s the distinction I’d keep sharp. Persistence-looking behavior alone isn’t enough to conclude shared host storage. In hindsight the cleaner explanation is: same pod or attached workspace surviving longer than expected, then disappearing once the environment was actually recreated.

So I’d downgrade that section from “persistent mount risk” to “runtime lifecycle was less isolated than assumed.” Less dramatic than accidentally SSH’ing into a stranger’s machine.
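
One way to make that distinction testable next time is to store a runtime fingerprint next to the marker file and compare it on a later visit. This is a sketch under my own assumptions: that the hostname, kernel boot ID, and PID 1 start time together are a reasonable proxy for "same container", and that the marker path is whatever directory is under test.

```python
import json
import socket
from pathlib import Path


def pid1_start_ticks(stat_text):
    """Extract field 22 (starttime) from the text of /proc/1/stat."""
    # Field 2 (comm) may contain spaces or ')', so split after the last ')'.
    rest = stat_text.rsplit(")", 1)[1].split()
    return int(rest[19])  # rest[0] is field 3, so field 22 is rest[19]


def fingerprint():
    """Collect identifiers that usually change when a container is recreated."""
    boot_id = Path("/proc/sys/kernel/random/boot_id").read_text().strip()
    start = pid1_start_ticks(Path("/proc/1/stat").read_text())
    return {"hostname": socket.gethostname(), "boot_id": boot_id, "pid1_start": start}


def changed_fields(old, new):
    """Return the sorted fingerprint keys whose values differ."""
    return sorted(k for k in new if old.get(k) != new[k])


def check_marker(marker_path):
    """Write a marker on first run; later, report which fields changed."""
    path = Path(marker_path)
    current = fingerprint()
    if not path.exists():
        path.write_text(json.dumps(current))
        return None  # nothing to compare against yet
    return changed_fields(json.loads(path.read_text()), current)
```

If `check_marker` returns an empty list, the file only survived inside the same runtime. If the marker is still there but the hostname or PID 1 start time changed, the storage outlived the container, which is the stronger claim.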