Andrew Wiggins

Posted on • Originally published at irexta.com

Stop building insecure "Private" AI assistants. Use this Hardened DevSecOps Stack.

The Problem: "Private" ≠ "Secure"

We’re all moving toward self-hosted AI platforms like Ollama and other local LLM runtimes to protect proprietary code and internal workflows. But here’s the uncomfortable reality:

Most local AI deployments are nothing more than security theater.

If your stack is running:

  • An unauthenticated Redis instance
  • Containers without syscall isolation
  • AI-generated code executing directly on the host kernel

…then your infrastructure is still exposed.

A single SSRF (Server-Side Request Forgery) vulnerability can provide attackers lateral access to internal services, secrets, and execution environments.


What Exactly Is "Hardening"?

In modern DevSecOps, Hardening is the process of minimizing a system’s attack surface by removing insecure defaults and enforcing strict isolation policies.

Instead of deploying a "default install," we harden every layer of the AI stack.

Hardening Principles

Security Layer     | Hardened Approach
-------------------|------------------------------------------------
Authentication     | Require credentials for every internal service
Isolation          | Sandbox untrusted workloads using gVisor
Failure Handling   | Ensure graceful degradation with Lua/OpenResty
Execution Control  | Prevent direct host-kernel interaction
Network Security   | Restrict unnecessary outbound communication
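
As a concrete sketch of that last row (assuming ufw on Ubuntu; the ports are illustrative, adapt them to your environment), a default-deny posture for an AI host might look like this:

# Default-deny network posture (illustrative; assumes ufw)
sudo ufw default deny incoming
sudo ufw default deny outgoing
sudo ufw allow 22/tcp           # SSH management access
sudo ufw allow out 53           # DNS, needed for model pulls
sudo ufw allow out 443/tcp      # HTTPS only, e.g. pulling models
sudo ufw enable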

The iRexta Hardened Architecture

On our iRexta Bare Metal infrastructure, we move away from marketing buzzwords and implement a true Zero-Trust AI blueprint.


1. Authenticated Redis (Stopping SSRF Attacks)

One of the biggest misconceptions in infrastructure security is:

"It’s localhost, so it’s safe."

It isn’t.

Internal services exposed without authentication become high-value SSRF targets. We enforce strict Redis password authentication to prevent lateral movement.

# Install Redis securely
sudo apt install redis-server -y

# Enable authentication
sudo sed -i \
's/# requirepass foobared/requirepass YOUR_COMPLEX_PASSWORD/' \
/etc/redis/redis.conf

# Restart Redis
sudo systemctl restart redis-server

Why This Matters

Without authentication:

  • Internal APIs can query Redis directly
  • SSRF vulnerabilities become infrastructure breaches
  • Session tokens and cached secrets become exposed

With authentication enabled, Redis becomes significantly harder to abuse internally.
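
A quick sanity check that the policy is actually enforced (this is default redis-cli behavior):

# Unauthenticated commands should now be rejected
redis-cli ping
# (error) NOAUTH Authentication required.

# With the credential, access works as before
redis-cli -a 'YOUR_COMPLEX_PASSWORD' ping
# PONG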


2. Resilient Lua Access Control

We use OpenResty + LuaJIT for high-performance request handling and secure gateway enforcement.

Instead of allowing backend failures to crash workers, Lua-based logic ensures graceful failure handling.

-- High-speed, error-aware Redis connection (lua-resty-redis)
local redis = require "resty.redis"
local red = redis:new()
red:set_timeout(1000) -- fail fast instead of hanging the worker

local ok, err = red:connect("127.0.0.1", 6379)
if not ok then
    ngx.log(ngx.ERR, "failed to connect to Redis: ", err)
    return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
end

Benefits of Lua-Based Access Logic

  • Extremely low latency execution
  • Graceful failure handling
  • Better resilience during backend outages
  • Reduced worker instability under load

This architecture keeps the AI gateway stable even during partial infrastructure failures.
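
Here is a minimal sketch of where that logic lives in practice. The location, timeout values, and backend address below are illustrative, not part of the original setup:

# nginx.conf fragment for the OpenResty gateway (illustrative)
location /v1/ {
    access_by_lua_block {
        local redis = require "resty.redis"
        local red = redis:new()
        red:set_timeout(500) -- milliseconds

        local ok, err = red:connect("127.0.0.1", 6379)
        if not ok then
            -- degrade gracefully: reject this request, keep the worker alive
            ngx.log(ngx.ERR, "redis unavailable: ", err)
            return ngx.exit(ngx.HTTP_SERVICE_UNAVAILABLE)
        end

        -- matches the requirepass set in step 1
        local res, err = red:auth("YOUR_COMPLEX_PASSWORD")
        if not res then
            ngx.log(ngx.ERR, "redis auth failed: ", err)
            return ngx.exit(ngx.HTTP_SERVICE_UNAVAILABLE)
        end

        red:set_keepalive(10000, 100) -- return the connection to the pool
    }
    proxy_pass http://127.0.0.1:11434; # e.g. a local Ollama backend
}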


3. gVisor: The Ultimate Sandbox

Traditional Docker containers still share the host kernel.

That becomes dangerous when executing:

  • AI-generated scripts
  • Untrusted automation
  • Dynamically produced code

To solve this, we deploy gVisor using the runsc runtime.

gVisor intercepts system calls and places workloads behind a dedicated user-space kernel boundary.
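
Before the run example below will work, runsc has to be registered as a Docker runtime. A minimal sketch, assuming runsc is installed at /usr/local/bin/runsc (adjust the path to your install, and merge into any existing daemon.json rather than overwriting it):

# Register gVisor's runsc runtime with Docker
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
  "runtimes": {
    "runsc": {
      "path": "/usr/local/bin/runsc"
    }
  }
}
EOF
sudo systemctl restart docker

# Sanity check: dmesg inside a sandboxed container shows gVisor booting
docker run --rm --runtime=runsc ubuntu dmesg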

# Execute untrusted AI code inside gVisor
docker run --rm \
  --runtime=runsc \
  --network=none \
  -v /tmp/ai_eval:/workspace \
  node:20 \
  node /workspace/script.js

Why gVisor Matters

Standard Docker            | gVisor Sandbox
---------------------------|------------------------------
Shares host kernel         | User-space kernel isolation
Larger attack surface      | Reduced syscall exposure
Higher breakout risk       | Hardened execution boundary
Minimal runtime filtering  | Deep syscall interception

For AI-generated code execution, this isolation layer is critical.


Dual-Model Performance Strategy

Security should not come at the expense of performance.

Instead of relying on a single overloaded model, we separate workloads across specialized models.

Model              | Responsibility
-------------------|-----------------------------------------------------
Qwen 2.5 Coder     | Ultra-fast autocomplete and inline suggestions
DeepSeek Coder V2  | Complex reasoning, architecture, and chat workflows

This dual-model approach improves:

  • Latency
  • Resource allocation
  • Context quality
  • Interactive coding performance
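
With Ollama, wiring this up is just a matter of pulling both models. The tags below are assumptions; check the Ollama library for the exact names and pick the sizes that fit your hardware:

# Pull both models into the local Ollama instance
# (tags are illustrative -- verify with the Ollama model library)
ollama pull qwen2.5-coder:7b        # low-latency autocomplete
ollama pull deepseek-coder-v2:16b   # heavier reasoning and chat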

Final Thoughts

A self-hosted AI stack is only as secure as its weakest internal service.

Running AI locally without:

  • Authenticated internal services
  • Sandboxed execution
  • Failure-aware gateways
  • Network isolation

…does not create private infrastructure.

It simply creates a larger attack surface.

By integrating authenticated Redis, resilient Lua access control, and gVisor sandboxing on iRexta Bare Metal, you move from hobby-grade deployments to a true DevSecOps-grade AI platform.

Stop deploying "security theater."

Build infrastructure that is actually hardened.
