How I Added Zero-Trust Guardrails to 4 MCP Servers Using AgentGateway — and Modernized Legacy COBOL Along the Way
Author: Venkat Nagala
GitHub: https://github.com/venkatnagala/Mainframe-Modernization
Competition: Solo.io SOLO AI Hackathon 2026
Demo Videos
▶️ Watch the full pipeline in action:
- Quick Demo (2 min): https://www.youtube.com/watch?v=a7Yfz614d5Y
- Detailed Walkthrough (9 min): https://www.youtube.com/watch?v=5s6MMIfxNf0
- Kubernetes Deployment Demo (3 min): https://youtu.be/05I-q2Ugw5Q
The demo shows:
- All 7 services starting up on Kubernetes with a single command
- Full COBOL→Rust pipeline: SUCCESS - Outputs match! ✅
- AgentGateway RBAC security: authorized: false
- Generated Rust code saved to AWS S3
My Story
I started this project because I was frustrated.
After 30 years working with enterprise systems — from Fortran to COBOL to Python — I watched organizations spend millions on mainframe modernization vendors only to get Java code so complex it needed another team to simplify it before humans could maintain it.
I'm not a professional writer. I once spent $200/hour on GMAT verbal tutoring and still scored 640. But I know mainframes. I know Rust. And I know that 800 billion lines of COBOL aren't going to modernize themselves.
I spent nights debugging Gemini API errors at 2 AM. I switched to Claude (claude-opus-4-6) when the generated Rust kept failing to compile with errors like ".inv() method not found" and invalid RoundingStrategy variants. I fixed Docker port conflicts, Rust compiler errors, cargo permission issues, and AWS credential problems — one by one, night after night.
And then the terminal finally showed:
status : SUCCESS - Outputs match! ✅
That moment made every late night worth it.
This is my story of building an open-source COBOL→Rust modernization pipeline — and accidentally building a zero-trust AI security layer along the way. And then deploying the whole thing on Kubernetes.
The Problem That Started It All
Anthropic just published a blog post titled "How AI helps break the cost barrier to COBOL modernization" — and IBM shares dropped 31% after Claude Code demonstrated COBOL modernization capability. This project was built in parallel, independently proving the same thesis with a complete open-source pipeline secured by AgentGateway.
There are an estimated 800 billion lines of COBOL still running in production globally — powering banks, governments, and insurance companies. The workforce that understands this code is retiring at 50,000 developers per year.
I have worked with these systems my entire career. I have seen the complexity firsthand. I have talked to people at modernization vendors and understood the real challenges. The world needs an open-source, automated, validated modernization pipeline — starting with COBOL, with Assembler (HLASM) support planned for Phase 4.
What surprised me most was this: securing the AI agents doing the modernization turned out to be just as important as the modernization itself.
What I Built
A complete AI-powered COBOL→Rust modernization pipeline with:
- 4 MCP servers — S3, AI Translation (ai_mcp), COBOL, Rust
- AgentGateway — zero-trust JWT authentication and RBAC for every MCP call
- 2 agents — Green Agent (Orchestrator) and Purple Agent (AI Modernizer)
- Claude (claude-opus-4-6) — for reliable, consistent Rust code generation
- Automated validation — outputs compared before any code is saved
- Kubernetes deployment — all 7 services orchestrated in production-grade infrastructure
The result that matters:
task_id : MODERN-DEMO-2026
status : SUCCESS - Outputs match! ✅
match_confirmed: True
rust_code_url : https://mainframe-refactor-lab-venkatnagala.s3.us-east-1.amazonaws.com/...
The Security Problem I Didn't Expect
When I started building, my agents talked directly to services. Green Agent called S3 directly. It worked — but one day I asked myself a simple question:
What happens if the Purple Agent (AI Modernizer) is compromised?
Without guardrails:
- It could read all COBOL source from S3
- It could overwrite modernized Rust code with malicious output
- It could exfiltrate data to external services
- There would be no audit trail of what happened
I had built a pipeline with no security boundaries. Any compromised agent had unlimited access to everything.
That's when I discovered AgentGateway — and everything changed.
With AgentGateway in place, the Purple Agent is now blocked at the gateway:
{
"authorized": false,
"error": "Role Modernizer is not authorized to call fetch_source on s3_mcp",
"audit_trail": {
"agent_id": "purple_agent",
"authorized": false,
"request_id": "cf1d3191-4053-4b8e-b8a8-d4035023f92a"
}
}
Blast radius = limited to translation only. The Purple Agent cannot read source, cannot write output, cannot execute code. This is the zero-trust principle applied to AI agents — and it works.
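The denial above comes down to a small allow-list check inside the gateway. Here is a minimal sketch of that decision logic in Rust; the Role enum and is_authorized function are illustrative names I'm using for the sketch, not the actual AgentGateway internals:

```rust
// Illustrative RBAC decision, mirroring the role table used in this
// pipeline: the Orchestrator may call anything, while the Modernizer
// may only call translate on ai_mcp. Not the real AgentGateway code.
#[derive(Debug, PartialEq)]
enum Role {
    Orchestrator,
    Modernizer,
}

fn is_authorized(role: &Role, target_mcp: &str, operation: &str) -> bool {
    match role {
        // Full access to all four MCP servers
        Role::Orchestrator => true,
        // Exactly one permitted operation on exactly one server
        Role::Modernizer => target_mcp == "ai_mcp" && operation == "translate",
    }
}

fn main() {
    // A compromised Purple Agent cannot reach S3...
    assert!(!is_authorized(&Role::Modernizer, "s3_mcp", "fetch_source"));
    // ...but it can still do its one job:
    assert!(is_authorized(&Role::Modernizer, "ai_mcp", "translate"));
    // The Green Agent orchestrates everything:
    assert!(is_authorized(&Role::Orchestrator, "rust_mcp", "execute"));
    println!("RBAC checks pass");
}
```

The point of keeping the check this dumb is that it is auditable at a glance: a denied call is a single match arm returning false, not a policy engine to debug.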
Architecture Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ Kubernetes (mainframe-modernization) │
│ │
│ ┌──────────────┐ JWT/HTTPS ┌─────────────────────────────────┐ │
│ │ │ ────────────────► │ Agent Gateway │ │
│ │ Green Agent │ │ (AuthN + AuthZ + Audit) │ │
│ │(Orchestrator)│ ◄──────────────── │ Port: 8090 │ │
│ │ Port: 8080 │ Proxy Result └──────────┬──────────────────────┘ │
│ └──────────────┘ │ Authorized calls only │
│ ▼ │
│ ┌──────────────┐ ┌─────────────────────────────────────────────┐ │
│ │ Purple Agent │─────►│ MCP Servers │ │
│ │(AI Modernizer│ │ ┌──────┐ ┌──────────┐ ┌───────┐ ┌────┐ │ │
│ │ Port: 8081 │ │ │ S3 │ │AI Trans. │ │ COBOL │ │Rust│ │ │
│ └──────────────┘ │ │:8081 │ │ :8082 │ │ :8083 │ │:84 │ │ │
│ └─────────────────────────────────────────────┘ │
│ NetworkPolicy: Default DENY ALL — whitelist only │
└─────────────────────────────────────────────────────────────────────────────┘
│
┌──────┴──────┐
│ AWS S3 │
│ programs/ │
│ data/ │
│ modernized/│
└─────────────┘
Every agent-to-MCP call flows through AgentGateway. No exceptions. The NetworkPolicy enforces this at the network level — agents literally cannot reach MCP servers without going through the gateway.
Building the 4 MCP Servers
Each MCP server is a focused Rust microservice built with Actix-web. I chose Rust for everything — not just the modernization target but the entire infrastructure. The compiler catches security issues at build time. Memory safety prevents credential leaks in the gateway. It was the right call.
1. S3 MCP (Port 8081)
Handles all AWS S3 operations — fetch COBOL source, save validated Rust output, generate pre-signed download URLs.
async fn fetch_source(
state: web::Data<AppState>,
body: web::Json<FetchRequest>,
) -> HttpResponse {
match get_s3_object(&state.s3_client, &body.bucket, &body.key).await {
Ok(content) => HttpResponse::Ok().json(FetchResponse {
success: true,
content: Some(content),
..Default::default()
}),
Err(e) => HttpResponse::InternalServerError().json(FetchResponse {
success: false,
error: Some(e.to_string()),
..Default::default()
})
}
}
2. AI Translation MCP — ai_mcp (Port 8082)
This is the heart of the pipeline. I started with Gemini 2.5 Pro — and spent many nights debugging generated Rust code that used non-existent methods like .inv() and .quantize(), and invalid RoundingStrategy variants. Every run produced different errors.
I switched to Claude (claude-opus-4-6). First attempt — it compiled and ran correctly. That was the end of the debate.
const CLAUDE_API_URL: &str = "https://api.anthropic.com/v1/messages";
const CLAUDE_MODEL: &str = "claude-opus-4-6";
async fn call_claude(state: &AppState, prompt: &str) -> Result<String, String> {
let request = ClaudeRequest {
model: CLAUDE_MODEL.to_string(),
max_tokens: 32768,
messages: vec![ClaudeMessage {
role: "user".to_string(),
content: prompt.to_string(),
}],
};
let response = state.http_client
.post(CLAUDE_API_URL)
.header("x-api-key", &state.claude_api_key)
.header("anthropic-version", "2023-06-01")
.header("content-type", "application/json")
.json(&request)
.send()
.await
.map_err(|e| format!("Claude API request failed: {}", e))?;
let claude_response: ClaudeResponse = response.json().await
.map_err(|e| format!("Failed to parse Claude response: {}", e))?;
claude_response.content
.into_iter()
.find(|c| c.content_type == "text")
.and_then(|c| c.text)
.ok_or("Empty response from Claude".to_string())
}
3. COBOL MCP (Port 8083)
Compiles and executes COBOL using GnuCOBOL. Captures stdout for comparison with the Rust output. This is the ground truth — whatever COBOL produces, Rust must match exactly.
let compile_result = Command::new("cobc")
.args(["-x", "-o", &binary_path, &source_path])
.output();
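After cobc succeeds, the COBOL MCP runs the resulting binary and captures stdout, and that captured text becomes the ground truth for validation. A minimal sketch of the capture step — run_and_capture is an illustrative helper name, and the demo below uses echo as a portable stand-in for the compiled COBOL binary:

```rust
use std::process::Command;

// Run an executable and return its trimmed stdout, or an error string.
// In the COBOL MCP this would be called on the binary cobc just built.
fn run_and_capture(binary: &str, args: &[&str]) -> Result<String, String> {
    let output = Command::new(binary)
        .args(args)
        .output()
        .map_err(|e| format!("failed to launch {}: {}", binary, e))?;
    if !output.status.success() {
        // Surface the program's stderr as the error message
        return Err(String::from_utf8_lossy(&output.stderr).into_owned());
    }
    Ok(String::from_utf8_lossy(&output.stdout).trim().to_string())
}

fn main() {
    // echo stands in for the compiled COBOL binary in this sketch
    let out = run_and_capture("echo", &["CALCULATED INTEREST: 550.00"]).unwrap();
    println!("{}", out); // → CALCULATED INTEREST: 550.00
}
```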
4. Rust MCP (Port 8084)
Compiles and executes the Claude-generated Rust code using Cargo. One challenge I solved here was dynamic dependency injection — Claude sometimes generates code using rust_decimal, num-format, or num-traits. The Rust MCP detects which crates are needed and injects them into Cargo.toml automatically.
// Dynamic dependency injection based on generated code
let mut deps = String::from(
"rust_decimal = \"1.34\"\n\
rust_decimal_macros = \"1.34\"\n\
num-format = { version = \"0.4\", features = [\"with-system-locale\"] }\n\
num-traits = \"0.2\"\n"
);
if body.source.contains("chrono") {
deps.push_str("chrono = \"0.4\"\n");
}
if body.source.contains("regex") {
deps.push_str("regex = \"1\"\n");
}
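Those dependency strings are then stitched into a complete Cargo.toml before cargo build runs. A minimal sketch of the manifest assembly, with build_manifest as an illustrative helper name and a trimmed dependency list:

```rust
// Illustrative sketch: assemble a Cargo.toml for the generated code,
// appending optional crates only when the source actually uses them.
fn build_manifest(source: &str) -> String {
    let mut deps = String::from(
        "rust_decimal = \"1.34\"\n\
         rust_decimal_macros = \"1.34\"\n",
    );
    if source.contains("chrono") {
        deps.push_str("chrono = \"0.4\"\n");
    }
    if source.contains("regex") {
        deps.push_str("regex = \"1\"\n");
    }
    format!(
        "[package]\nname = \"generated\"\nversion = \"0.1.0\"\nedition = \"2021\"\n\n[dependencies]\n{}",
        deps
    )
}

fn main() {
    // Generated code that imports chrono gets the crate injected
    let manifest = build_manifest("use chrono::NaiveDate;");
    println!("{}", manifest);
}
```

Substring matching on the source is crude but effective here: the worst case is an unused dependency in Cargo.toml, which costs a little compile time and nothing else.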
Adding Guardrails with AgentGateway
Here is the complete flow that makes every MCP call secure:
Step 1: Agent Authentication
Each agent authenticates with an API key and receives a JWT token:
POST /auth/token
{
"agent_id": "green_agent",
"api_key": "your-api-key",
"requested_role": "orchestrator"
}
Response:
{
"access_token": "eyJ...",
"expires_in": 3600,
"role": "orchestrator"
}
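On the agent side, it helps to cache this token together with its expiry so re-authentication can happen proactively instead of only after a 401. A minimal sketch of that bookkeeping; TokenCache is an illustrative type I'm assuming for the sketch, not part of AgentGateway:

```rust
use std::time::{Duration, Instant};

// Illustrative token cache: stores the JWT plus when it expires,
// refreshing slightly early to avoid racing the expiry window.
struct TokenCache {
    access_token: String,
    expires_at: Instant,
}

impl TokenCache {
    fn new(access_token: String, expires_in_secs: u64) -> Self {
        // Refresh up to 60 seconds before the server-side expiry
        let margin = Duration::from_secs(expires_in_secs.min(60));
        Self {
            access_token,
            expires_at: Instant::now() + Duration::from_secs(expires_in_secs) - margin,
        }
    }

    fn needs_refresh(&self) -> bool {
        Instant::now() >= self.expires_at
    }
}

fn main() {
    let fresh = TokenCache::new("eyJ...".to_string(), 3600);
    assert!(!fresh.needs_refresh());
    println!("token valid, {} bytes", fresh.access_token.len());
}
```

This complements the reactive refresh-on-401 logic described later in the Kubernetes section; doing both means an expired token costs at most one retried request.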
Step 2: RBAC Enforcement
Every MCP call is checked against the role table:
| Agent | Role | S3 MCP | ai_mcp | COBOL MCP | Rust MCP |
|---|---|---|---|---|---|
| Green Agent | Orchestrator | ✅ All | ✅ All | ✅ All | ✅ All |
| Purple Agent | Modernizer | ❌ Blocked | ✅ Translate | ❌ Blocked | ❌ Blocked |
Step 3: Proxied MCP Call
Authorized calls are forwarded to the MCP server:
POST /mcp/invoke
Authorization: Bearer eyJ...
{
"target_mcp": "s3_mcp",
"operation": "fetch_source",
"payload": {
"bucket": "my-bucket",
"key": "programs/interest_calc.cbl"
}
}
Step 4: Audit Trail
Every call — authorized or denied — is logged with a unique request ID:
{
"timestamp": "2026-02-23T21:00:24Z",
"agent_id": "green_agent",
"target_mcp": "s3_mcp",
"operation": "fetch_source",
"authorized": true,
"request_id": "c067e63e-7cda-41dd-a9d0-1aededa00c65"
}
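Emitting these records is straightforward even without a serialization crate. A minimal sketch that mirrors the JSON shape above using only the standard library; a real implementation would likely use serde:

```rust
// Illustrative audit record matching the JSON shape shown above.
// format! is used instead of serde to keep the sketch dependency-free.
struct AuditRecord {
    timestamp: String,
    agent_id: String,
    target_mcp: String,
    operation: String,
    authorized: bool,
    request_id: String,
}

impl AuditRecord {
    fn to_json(&self) -> String {
        format!(
            "{{\"timestamp\":\"{}\",\"agent_id\":\"{}\",\"target_mcp\":\"{}\",\"operation\":\"{}\",\"authorized\":{},\"request_id\":\"{}\"}}",
            self.timestamp, self.agent_id, self.target_mcp,
            self.operation, self.authorized, self.request_id
        )
    }
}

fn main() {
    let rec = AuditRecord {
        timestamp: "2026-02-23T21:00:24Z".into(),
        agent_id: "green_agent".into(),
        target_mcp: "s3_mcp".into(),
        operation: "fetch_source".into(),
        authorized: true,
        request_id: "c067e63e-7cda-41dd-a9d0-1aededa00c65".into(),
    };
    // One line per call, appendable to a log stream
    println!("{}", rec.to_json());
}
```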
The Validation Pipeline
The automated validation is what makes this production-ready. Only verified, functionally equivalent Rust code gets saved:
1. Fetch COBOL source from S3 (via S3 MCP)
2. Compile + execute COBOL (via COBOL MCP)
→ Output: "CALCULATED INTEREST: 550.00"
3. Translate COBOL → Rust via Claude (via ai_mcp)
4. Compile + execute Rust (via Rust MCP)
→ Output: "CALCULATED INTEREST: 550.00"
5. Normalize + compare outputs
→ "calculated interest: 550.00" == "calculated interest: 550.00"
6. MATCH → save Rust code to S3 (via S3 MCP)
7. Return pre-signed download URL
If the outputs don't match — the code is not saved. No human needed to check. The pipeline catches it automatically.
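The comparison itself can stay deliberately simple: trim, lowercase, collapse whitespace, then demand an exact match. A minimal sketch of that step — the exact normalization rules here are an assumption on my part, based on the lowercased strings in step 5:

```rust
// Normalize program output so cosmetic differences (case, trailing
// whitespace, spacing) don't cause false mismatches. Any difference
// in the actual numbers or text still fails the comparison.
fn normalize(output: &str) -> String {
    output
        .trim()
        .to_lowercase()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
}

fn outputs_match(cobol_out: &str, rust_out: &str) -> bool {
    normalize(cobol_out) == normalize(rust_out)
}

fn main() {
    let cobol_out = "CALCULATED INTEREST: 550.00\n";
    let rust_out = "calculated interest: 550.00";
    assert!(outputs_match(cobol_out, rust_out)); // MATCH → safe to save
    println!("SUCCESS - Outputs match!");
}
```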
Why Rust is the Right Target Language
Claude (claude-opus-4-6) generates idiomatic Rust that is:
- ✅ Clean and readable — functions are concise and directly traceable to the original COBOL business logic
- ✅ Immediately maintainable — any Rust developer can understand and modify the generated code without specialized mainframe knowledge
- ✅ Memory safe — no garbage collector, no null pointer exceptions, no buffer overflows
- ✅ High performance — zero runtime overhead, sub-millisecond execution
- ✅ Automatically validated — outputs compared against original COBOL before saving
- ✅ Production ready — compiles with standard Cargo, no custom toolchain needed
- ✅ Serverless ready — native support on all major serverless platforms
The generated Rust preserves the original business logic clarity while bringing it into a modern, safe, and performant language that enterprises can confidently maintain for the next 30 years.
Technical Challenges I Overcame
I want to be honest about the struggles — this was not a smooth journey:
- Rust Compiler Errors: RwLock<Option<String>> doesn't implement Clone — spent hours before finding the fix
- Docker Port Conflicts: Two services both wanted port 8081 — separated external (8086) vs internal (8081) ports
- AWS SDK Requirements: aws-smithy-runtime requires Rust 1.91+ — learned this the hard way
- AI Model Selection: Gemini generated different broken code every run — switched to Claude (claude-opus-4-6), which generates correct, compilable Rust consistently
- Cargo Permissions: Non-root mcpuser couldn't write to /tmp/cargo_home — needed explicit .cargo directory ownership
- Dynamic Dependencies: Rust MCP needed to inject crate dependencies automatically based on what Claude generated
- MCP Naming: Renamed gemini_mcp to ai_mcp — reflects the actual function (AI translation), not the model name
- Rust Compiler Messages: detailed and beginner-friendly — while experienced Assembler programmers read storage dumps and PSW registers with ease, Rust's compiler guides developers who are new to low-level systems programming
Each of these took hours to debug. I'm documenting them here so you don't have to go through the same pain.
Phase 3: Kubernetes Deployment — Taking It to Production
After validating the pipeline on Docker Compose, the next challenge was clear: can this run on Kubernetes? For a product targeting enterprise banks and insurance companies, Docker Compose is a prototype. Kubernetes is a product.
The Goal
Deploy all 7 microservices on Kubernetes with:
- Zero-trust network policies (default DENY ALL)
- Health endpoints on every service
- HorizontalPodAutoscaler on Purple Agent (scales 1→5 replicas automatically)
- Single-command deployment via .\deploy.ps1
- Automated demo and RBAC security test after deployment
Kubernetes Manifests
I organized all manifests under k8s/base/, applied in strict order:
00-namespace-rbac.yaml — namespace + service accounts
01-secrets-config.yaml — ConfigMap + secret templates
02-agent-gateway.yaml — AgentGateway deployment + ClusterIP service
03-agents.yaml — Green Agent + Purple Agent + HPA
04-network-policy.yaml — zero-trust NetworkPolicy
05-mcp-servers.yaml — all 4 MCP servers
The order matters. Agent Gateway must be running before agents start — enforced by an initContainer on both Green Agent and Purple Agent:
initContainers:
- name: wait-for-gateway
image: busybox:1.36
command: ['sh', '-c', 'until wget -qO- http://agent-gateway:8090/health;
do echo "Waiting for agent-gateway..."; sleep 3; done']
This guarantees agents never start before the security gateway is ready.
Three Hard Problems I Hit on Kubernetes
1. GLIBC Mismatch — cobol-mcp and s3-mcp CrashLoopBackOff
Both pods crashed immediately on Kubernetes:
/lib/x86_64-linux-gnu/libc.so.6: version GLIBC_2.38 not found
The Rust binaries were compiled against a newer GLIBC than debian:bookworm-slim provided. The fix was straightforward — change the runtime stage in both Dockerfiles:
# Before
FROM debian:bookworm-slim
# After
FROM rust:latest
I pushed to GitHub, GitHub Actions rebuilt the images automatically, and Kubernetes rolled out the new pods — fixed.
2. JWT Token Expiry — 401 Unauthorized After 3 Hours
Green Agent authenticates with Agent Gateway at startup and caches the JWT token. Works perfectly — until 3 hours later when the token expires and every MCP call returns:
Token decode failed: ExpiredSignature
The fix was adding auto-refresh logic in invoke_mcp(). On a 401 response, Green Agent re-authenticates and retries the request transparently:
if status.as_u16() == 401 {
info!("🔄 JWT expired — refreshing token...");
let api_key = std::env::var("AGENT_API_KEY")
.map_err(|_| "AGENT_API_KEY not set")?;
self.authenticate(&api_key).await?;
// Retry with new token...
}
This is the kind of issue that only surfaces under production-like conditions — Docker Compose never runs long enough to hit it.
3. Missing Health Endpoint — Purple Agent 34 Restarts
Purple Agent was stuck in CrashLoopBackOff with 34 restarts. The logs showed it started successfully:
🟣 Purple Agent (AI Modernizer) Online | Listening on 0.0.0.0:8081
But Kubernetes kept killing it. The reason? The liveness probe was hitting /health — a route that didn't exist in the code. Kubernetes saw timeouts, assumed the pod was unhealthy, and restarted it every 30 seconds.
The fix was a single addition to the Axum router:
// Minimal health handler for the liveness probe (assumed shape)
async fn health() -> &'static str { "OK" }

let app = Router::new()
    .route("/solve", post(handle_modernization))
    .route("/health", get(health)); // ← This one line stopped 34 restarts
Lesson learned: every service needs a health endpoint — not optional on Kubernetes.
Single Command Deployment
deploy.ps1 handles everything end-to-end:
Step 1: Verify kubectl and cluster connection
Step 2: Load environment variables from .env
Step 3: Create namespace and RBAC
Step 4: Create all Kubernetes secrets from .env
Step 5: Apply ConfigMap
Step 6: Deploy Agent Gateway — wait for readiness
Step 7: Deploy MCP servers
Step 8: Deploy Green Agent + Purple Agent
Step 9: Apply zero-trust NetworkPolicy
Step 10: Wait for all 7 pods ready
Step 11: Show pod and service status
Step 12: Port-forward Green Agent (background)
Step 13: Run demo pipeline automatically
Step 14: Run RBAC security test
Step 15: Cleanup port-forwards
No manual steps. No separate terminal windows. One command tells the complete story.
The Final Result
All 7 pods running with 0 restarts:
NAME READY STATUS RESTARTS
agent-gateway-xxx 1/1 Running 0
ai-mcp-xxx 1/1 Running 0
cobol-mcp-xxx 1/1 Running 0
green-agent-xxx 1/1 Running 0
purple-agent-xxx 1/1 Running 0
rust-mcp-xxx 1/1 Running 0
s3-mcp-xxx 1/1 Running 0
Pipeline result:
status=SUCCESS - Outputs match! ✅
match_confirmed=True
rust_code_url= [S3 presigned URL]
RBAC security test — automated in deploy.ps1:
✅ RBAC ENFORCED — Purple Agent blocked from S3 as expected!
🛡️ Role 'Modernizer' is NOT authorized to call fetch_source on s3_mcp
What Kubernetes Adds to the Story
For a product targeting enterprise banks, Kubernetes is not optional:
- HPA on Purple Agent — scales 1→5 replicas under load, handles bursts of COBOL files automatically
- Zero-trust NetworkPolicy — pods cannot talk to each other unless explicitly whitelisted, even inside the cluster
- Rolling updates — GitHub Actions rebuilds images on every push, Kubernetes rolls them out with zero downtime
- Health probes — Kubernetes self-heals pods that stop responding
- Namespace isolation — all 7 services live in mainframe-modernization, isolated from other workloads
This is what separates a hackathon project from a product. 🚀
Quick Start
Step 1 — Clone Repository
git clone https://github.com/venkatnagala/Mainframe-Modernization.git
cd Mainframe-Modernization
Step 2 — Configure Environment
cp .env.example .env
# Edit .env and add:
# CLAUDE_API_KEY → get from console.anthropic.com ($5 minimum)
# AWS_ACCESS_KEY_ID → get from console.aws.amazon.com
# AWS_SECRET_ACCESS_KEY
# S3_BUCKET_NAME
# JWT_SECRET → minimum 32 characters
# AGENT_API_KEY → your agent API key
Step 3 — Upload Sample COBOL to S3
aws s3 cp legacy_source/interest_calc.cbl s3://YOUR_BUCKET/programs/interest_calc.cbl
aws s3 cp data/loan_data.json s3://YOUR_BUCKET/data/loan_data.json
Step 4 — Enable Kubernetes in Docker Desktop
Settings → Kubernetes → Enable Kubernetes → Apply & Restart
Step 5 — Run Everything (One Command!)
.\deploy.ps1
This automatically:
- ✅ Deploys all 7 services on Kubernetes
- ✅ Waits for all pods to be ready
- ✅ Triggers the modernization pipeline
- ✅ Runs the RBAC security test
- ✅ Cleans up port-forwards
Expected output:
🚀 Mainframe Modernization Pipeline - Kubernetes Deployment
============================================================
...
🎉 DEPLOYMENT COMPLETE!
Sending Modernization Task to Green Agent...
Task accepted!
@{task_id=MODERN-DEMO-2026; status=SUCCESS - Outputs match! ✅; match_confirmed=True}
✅ RBAC ENFORCED — Purple Agent blocked from S3 as expected!
🛡️ Role 'Modernizer' is NOT authorized to call fetch_source on s3_mcp
The IBM Connection
IBM shares dropped 31% recently — triggered by Claude Code's ability to modernize COBOL. This single market event confirms what this project demonstrates:
- COBOL modernization is no longer a distant future — it is happening now
- Open-source, automated, validated pipelines are the answer
- Security of AI agents doing the modernization is critical
- Rust is the right target language — memory safe, performant, modern, serverless-ready
- Assembler (HLASM) modernization is the natural next step — planned for Phase 4
This project is a working proof that the entire pipeline — fetch, compile, translate, validate, save — can be fully automated with zero-trust security, deployed on Kubernetes, and triggered with a single command.
Key Lessons Learned
On MCP Security:
Don't let agents talk directly to MCP servers. I learned this the hard way. AgentGateway makes it straightforward to add JWT auth and RBAC without modifying your MCP servers at all.
On AI Model Selection:
I evaluated Gemini 2.5 Pro extensively. It generated different broken Rust code on every run. Claude (claude-opus-4-6) generates correct, compilable Rust consistently — that consistency is critical for an automated pipeline with no human review step.
On Rust for AI Infrastructure:
I chose Rust for the entire stack — not just the modernization target. The compiler catches security issues at build time. Memory safety prevents credential leaks in the gateway. Eight worker threads per service with sub-millisecond JWT validation. It was the right decision.
On Kubernetes vs Docker Compose:
Docker Compose is for development. Kubernetes is for products. Real production issues — GLIBC mismatches, JWT token expiry, missing health endpoints — only surface when you deploy to Kubernetes. Every enterprise customer will ask: "Does it run on Kubernetes?" Now the answer is yes, with a single command.
On MCP vs Agent Skills:
Lin Sun asked a great question about this on LinkedIn. My practical answer from building this pipeline:
- MCP: When the operation crosses a trust boundary (S3, compilers, external APIs)
- Agent Skills: When the operation is deterministic and stateless within a trust boundary
What's Next — Phase 4: Enterprise Features
Kubernetes deployment is now complete ✅ — the pipeline runs in production-grade container orchestration with zero-trust security, HPA, and single-command deployment.
The roadmap ahead:
- IBM HLASM Assembler → Rust — architecture designed, implementation planned
- IBM z/OS COBOL compiler integration (IBM Developer for z/OS Enterprise Edition)
- COBOL-CICS — modernize transaction processing to modern web services
- VSAM to modern databases — ESDS→S3, KSDS→DynamoDB/RDS
- Batch processing — modernize entire COBOL codebases in one run
- Kubernetes production hardening — multi-node cluster, Ingress controller, TLS termination, persistent volumes
About Me
I am Venkat Nagala — 30+ years from Fortran to Rust.
- GATE 1994 AIR 444 — Top 0.4% of India's engineering graduates
- CMU-trained in Big Data Analytics (INSOFE)
- Fortune 500 experience (AIG, RBC, Thrivent Financial)
- Languages lived: Fortran → COBOL → Python → Rust
I started my career writing Fortran. I have written COBOL. I have worked alongside experienced Assembler programmers who read storage dumps and PSW registers as naturally as reading English. I have seen what happens to systems that last 60 years. I built this project because I believe the next 60 years of enterprise computing should be built on Rust — safe, fast, and maintainable.
The late nights were worth it.
Built with Rust. Secured with AgentGateway. Deployed on Kubernetes.
Modernizing mainframes, one COBOL line at a time. Assembler support coming in Phase 4! 🚀
GitHub: https://github.com/venkatnagala/Mainframe-Modernization
Demo Videos:
- Quick Demo (2 min): https://www.youtube.com/watch?v=a7Yfz614d5Y
- Detailed Walkthrough (9 min): https://www.youtube.com/watch?v=5s6MMIfxNf0
- Kubernetes Deployment (3 min): https://youtu.be/05I-q2Ugw5Q