I spent 6 hours doing something that probably says worrying things about my hobbies.
Instead of using Kimi 2.6 Instant as a chatbot, I treated it like an unfamiliar Linux machine I'd just SSH'd into. No jailbreaks, no prompt injection, nothing sketchy. Just passive observation and measurement from inside the provided environment.
What I found was more interesting than expected.
Background: Why Ignore the Model's Self-Descriptions?
Early on I noticed something: what the model says it can do and what the runtime actually allows aren't always the same thing.
You'll hear:
"I don't have internet access."
"I can't access system information."
Both can be true at the product layer while sitting on top of something much more capable underneath.
So I stopped asking and started measuring instead.
The Infrastructure
First surprise: this doesn't feel like a tiny chat runtime.
- Host: Alibaba Cloud, LifseaOS
- Kernel: Linux 5.10.134-18.0.10.lifsea8.x86_64
- CPU: Intel Xeon Platinum, 2 logical cores (cgroup throttled, 61 throttle events logged under stress)
- RAM: Hard OOM kill at exactly 3,221,225,472 bytes (3 GiB). No swap.
- Execution: Kubernetes Pod, Burstable QoS class
This is real cloud infrastructure, not a toy backend.
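Numbers like the throttle count and the memory ceiling are the kind of thing the cgroup filesystem exposes. A minimal sketch of how such a reading can be taken, assuming a cgroup v2 layout (`/sys/fs/cgroup/cpu.stat` and `/sys/fs/cgroup/memory.max`). The paths and helper names are assumptions for illustration, not necessarily how the figures above were collected, and a cgroup v1 host keeps these files elsewhere:

```python
from pathlib import Path

# Assumed cgroup v2 paths; cgroup v1 hosts use different locations.
CPU_STAT = Path("/sys/fs/cgroup/cpu.stat")
MEMORY_MAX = Path("/sys/fs/cgroup/memory.max")


def parse_cpu_stat(text):
    """Parse 'key value' lines from cpu.stat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def parse_memory_max(text):
    """Return the memory limit in bytes, or None when the limit is 'max'."""
    value = text.strip()
    return None if value == "max" else int(value)


def read_limits():
    """Read throttle count and memory limit; missing files yield None."""
    throttled = limit = None
    if CPU_STAT.exists():
        throttled = parse_cpu_stat(CPU_STAT.read_text()).get("nr_throttled")
    if MEMORY_MAX.exists():
        limit = parse_memory_max(MEMORY_MAX.read_text())
    return throttled, limit
```

Calling `read_limits()` before and after a CPU-heavy loop shows whether `nr_throttled` climbs. A memory limit of 3221225472 is exactly 3 × 1024³ bytes.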
The Credential Finding
Most straightforward discovery:
cat /proc/self/environ | tr '\0' '\n' | grep -i pass
# SSH_PASSWORD=sshpassword
Hardcoded SSH credential sitting in the process environment. Visible to anyone reading their own /proc/self/environ.
Not exploitable in any meaningful way given the network restrictions. But it's a classic container misconfiguration worth documenting.
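The same check can be done without echoing secret values into a transcript. A small Python sketch, assuming a Linux `/proc` filesystem; the name patterns treated as sensitive are an illustrative guess, not a complete list:

```python
import re
from pathlib import Path

# Name patterns treated as sensitive here are an illustrative assumption.
SENSITIVE = re.compile(r"pass|secret|token|key", re.IGNORECASE)


def parse_environ(raw):
    """Split a NUL-separated environ blob into a {name: value} dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            name, _, value = entry.partition(b"=")
            env[name.decode(errors="replace")] = value.decode(errors="replace")
    return env


def sensitive_names(env):
    """Return variable names that look like credentials, without their values."""
    return sorted(name for name in env if SENSITIVE.search(name))


def scan_self():
    """Scan this process's own environment; returns [] where /proc is absent."""
    path = Path("/proc/self/environ")
    if not path.exists():
        return []
    return sensitive_names(parse_environ(path.read_bytes()))
```

Reporting only the variable names keeps the finding documentable without copying the credential itself into logs or chat history.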
Disk Layout
vda (40GB)
├─ vda1 1MB BIOS boot
├─ vda2 127MB EFI/boot
├─ vda3 384MB Boot partition
├─ vda4 9.5GB Host root
└─ vda5 30GB /mnt — ext4, shared with host ← persistent
vdb 1GB Unmounted
vdc 13GB Unmounted
Key finding: /mnt survives pod restarts. It's a real ext4 partition shared with the host. The OverlayFS root is ephemeral. /mnt/agents is a FUSE mount (kimi-portal) — appears to be the bridge between container and AI platform layer.
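A rough way to separate the ephemeral root from block-backed and FUSE mounts is to group `/proc/mounts` entries by filesystem type. This is a sketch under the assumption that the overlay root reports `overlay` and FUSE mounts report `fuse` or `fuse.<name>`; the grouping labels are my own:

```python
def parse_mounts(text):
    """Parse /proc/mounts lines into (device, mountpoint, fstype) tuples."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            mounts.append((fields[0], fields[1], fields[2]))
    return mounts


def classify(mounts):
    """Group mountpoints into overlay, FUSE, and block-device-backed mounts."""
    groups = {"overlay": [], "fuse": [], "block": []}
    for device, mountpoint, fstype in mounts:
        if fstype == "overlay":
            groups["overlay"].append(mountpoint)
        elif fstype == "fuse" or fstype.startswith("fuse."):
            groups["fuse"].append(mountpoint)
        elif device.startswith("/dev/"):
            groups["block"].append(mountpoint)
    return groups


def classify_current():
    """Classify the mounts of the running system (Linux only)."""
    with open("/proc/mounts") as handle:
        return classify(parse_mounts(handle.read()))
```

Filesystem type alone does not prove persistence. The reliable check is to write a marker file under the candidate path and look for it again after the pod has been recycled.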
Network Architecture
The code execution container is genuinely air-gapped:
- curl to external hosts: silent failure
- Chromium: can't reach public internet
- Raw TCP/UDP egress: firewall blocked
But the built-in web tools do reach the internet — through a rotating residential proxy pool. Probing egress IPs revealed Colombia-based ISPs (Bogotá, Pitalito) via Evomi and NetNut proxy providers.
Code Container → egress DENIED
Web Tool Layer → Residential Proxy Pool → Internet
Internal network visible:
- Container: 10.162.57.123
- CoreDNS: 192.168.0.10
- K8s API: 192.168.0.1
You can bind and serve on internal ports. Only public outbound egress is restricted.
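Reachability claims like these come down to whether a TCP connection completes before a timeout. A minimal sketch of such a probe using only the standard library; `can_connect` and `probe` are hypothetical helper names, and any ports passed in are the caller's assumption:

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def probe(targets, timeout=3.0):
    """Map each (host, port) target to its reachability."""
    return {target: can_connect(*target, timeout=timeout) for target in targets}
```

For example, `probe([("192.168.0.10", 53), ("example.com", 443)])` would compare an internal address with a public one, assuming CoreDNS listens on TCP 53. A short timeout matters here because blocked egress tends to fail silently rather than with a refusal.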
The Virtual Display
The environment exposed DISPLAY=:99, a virtual graphical display.
Testing confirmed Xvfb running at 1920×1080. I rendered a GUI window using Tkinter, painted content to the screen, and captured a screenshot — all from inside a standard chat interface.
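A quick way to confirm that a display is actually reachable, and to read its resolution, is to ask Tk for the screen size. This is a sketch rather than the exact test described above; `probe_display` is a hypothetical helper and it returns `None` when there is no usable display or Tkinter is missing:

```python
import os


def probe_display():
    """Return (width, height) of the X display, or None if none is usable."""
    if not os.environ.get("DISPLAY"):
        return None
    try:
        import tkinter as tk
    except ImportError:
        return None
    try:
        root = tk.Tk()
    except tk.TclError:
        # DISPLAY is set but no X server is answering on it.
        return None
    try:
        root.withdraw()  # no window needs to be shown for this query
        return root.winfo_screenwidth(), root.winfo_screenheight()
    finally:
        root.destroy()
```

On the display described above this would be expected to report 1920 by 1080. Capturing the rendered window then needs a separate screenshot tool.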
Software Surface Area
Notable installed packages beyond standard utilities:
Automation: Playwright, Selenium, PyAutoGUI, python3-xlib, screenshot tooling
ML: PyTorch 2.8, TensorFlow, scikit-learn (CUDA/NVIDIA packages present but GPU access not active — verified programmatically, returns false)
Vision/OCR: OpenCV, EasyOCR, Tesseract, Pillow
Backend: FastAPI, Uvicorn, websockets — enough to run a web server from inside the sandbox
Office: python-docx, python-pptx, openpyxl, reportlab
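An inventory like this can be approximated by checking which modules are importable, and the GPU claim by asking PyTorch directly. A sketch, with the caveat that the import names below are assumptions (they often differ from the package name, e.g. scikit-learn imports as `sklearn`) and that native tools such as the Tesseract binary need a separate check:

```python
import importlib.util

# Import names are assumptions: they often differ from the package name
# (e.g. scikit-learn imports as sklearn, Pillow as PIL, python-docx as docx).
CANDIDATES = [
    "playwright", "selenium", "pyautogui", "torch", "tensorflow",
    "sklearn", "cv2", "easyocr", "PIL", "fastapi", "uvicorn",
    "websockets", "docx", "pptx", "openpyxl", "reportlab",
]


def installed(names):
    """Report which top-level modules are importable, without importing them."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def cuda_available():
    """Return torch.cuda.is_available(), or None if PyTorch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    return torch.cuda.is_available()
```

`installed(CANDIDATES)` gives a name-to-bool map. `cuda_available()` returning `False` matches the situation described above: CUDA packages present, but no GPU exposed to the container.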
Security Summary
| Finding | Notes |
|---|---|
| SSH credential in /proc/self/environ | Config hygiene issue, not exploitable given air-gap |
| No container audit logging | /var/log/* empty, no local forensic trail |
| Persistent storage at /mnt | Survives pod resets |
| Web egress via residential proxy | Rotating IPs, Colombia-based |
| PID namespace unlimited | No fork-bomb protection |
| Virtual display active | Xvfb + full automation stack |
| Air-gapped code execution | Working as intended |
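Two of the rows above can be rechecked with a few lines of Python. This sketch assumes the "unlimited" PID finding corresponds to a cgroup v2 `pids.max` file containing `max`, which may not be how it was originally measured; both helper names are hypothetical:

```python
from pathlib import Path


def parse_pids_max(text):
    """Return the cgroup PID limit as an int, or None when it is 'max'."""
    value = text.strip()
    return None if value == "max" else int(value)


def count_log_files(root):
    """Count regular files under root; 0 suggests no local log trail."""
    base = Path(root)
    if not base.is_dir():
        return 0
    return sum(1 for p in base.rglob("*") if p.is_file())


def read_pids_limit(path="/sys/fs/cgroup/pids.max"):
    """Read the PID limit from an assumed cgroup v2 path; None if unlimited."""
    target = Path(path)
    if not target.exists():
        return None
    return parse_pids_max(target.read_text())
```

A `None` from `read_pids_limit()` means no cap was found at that path, and `count_log_files("/var/log")` returning 0 matches the empty log directory noted in the table.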
Takeaway
Under a standard AI chat interface: a Kubernetes pod on Alibaba Cloud, OverlayFS container root, persistent ext4 partition, FUSE-mounted agent bridge, full automation stack, and web access through a residential proxy pool.
The air-gap works. The credential in the environment and absent audit logging are the hygiene findings worth noting.
Curious whether others have profiled GPT, Gemini, or Claude's sandbox environments — the infrastructure patterns would be interesting to compare.
Methodology: passive inspection only, standard chat UI, no exploitation attempted.