How I Added Zero-Trust Guardrails to 4 MCP Servers Using AgentGateway — and Modernized Legacy COBOL Along the Way
Author: Venkat Nagala
GitHub: https://github.com/venkatnagala/Mainframe-Modernization
Competition: Solo.io SOLO AI Hackathon 2026
Demo Videos
▶️ Watch the full pipeline in action:
- Quick Demo (2 min): https://www.youtube.com/watch?v=a7Yfz614d5Y
- Detailed Walkthrough (9 min): https://www.youtube.com/watch?v=5s6MMIfxNf0
- Kubernetes Deployment Demo (3 min): https://youtu.be/05I-q2Ugw5Q
The demo shows:
- All 7 services starting up on Kubernetes with a single command
- Full COBOL→Rust pipeline: SUCCESS - Outputs match! ✅
- AgentGateway RBAC security: authorized: false
- Generated Rust code saved to AWS S3
My Story
I started this project because I was frustrated.
After 30 years working with enterprise systems — from Fortran to COBOL to Python — I watched organizations spend millions on mainframe modernization vendors only to get Java code so complex it needed another team to simplify it before humans could maintain it.
I'm not a professional writer. I once spent $200/hour on GMAT verbal tutoring and still scored 640. But I know mainframes. I know Rust. And I know that 800 billion lines of COBOL aren't going to modernize themselves.
I spent nights debugging Gemini API errors at 2 AM. I switched to Claude (claude-opus-4-6) when the generated Rust kept failing to compile with errors like ".inv() method not found" and invalid RoundingStrategy variants. I fixed Docker port conflicts, Rust compiler errors, cargo permission issues, and AWS credential problems — one by one, night after night.
And then the terminal finally showed:
status : SUCCESS - Outputs match! ✅
That moment made every late night worth it.
This is my story of building an open-source COBOL→Rust modernization pipeline — and accidentally building a zero-trust AI security layer along the way. And then deploying the whole thing on Kubernetes.
The Problem That Started It All
Anthropic just published a blog post titled "How AI helps break the cost barrier to COBOL modernization" — and IBM shares dropped 31% after Claude Code demonstrated COBOL modernization capability. This project was built in parallel, independently proving the same thesis with a complete open-source pipeline secured by AgentGateway.
There are an estimated 800 billion lines of COBOL still running in production globally — powering banks, governments, and insurance companies. The workforce that understands this code is retiring at 50,000 developers per year.
I have worked with these systems my entire career. I have seen the complexity firsthand. I have talked to people at modernization vendors and understood the real challenges. The world needs an open-source, automated, validated modernization pipeline — starting with COBOL, with Assembler (HLASM) support planned for Phase 4.
What surprised me most was this: securing the AI agents doing the modernization turned out to be just as important as the modernization itself.
What I Built
A complete AI-powered COBOL→Rust modernization pipeline with:
- 4 MCP servers — S3, AI Translation (ai_mcp), COBOL, Rust
- AgentGateway — zero-trust JWT authentication and RBAC for every MCP call
- 2 agents — Green Agent (Orchestrator) and Purple Agent (AI Modernizer)
- Claude (claude-opus-4-6) — for reliable, consistent Rust code generation
- Automated validation — outputs compared before any code is saved
- Kubernetes deployment — all 7 services orchestrated in production-grade infrastructure
The result that matters:
task_id : MODERN-DEMO-2026
status : SUCCESS - Outputs match! ✅
match_confirmed: True
rust_code_url : https://mainframe-refactor-lab-venkatnagala.s3.us-east-1.amazonaws.com/...
The Security Problem I Didn't Expect
When I started building, my agents talked directly to services. Green Agent called S3 directly. It worked — but one day I asked myself a simple question:
What happens if the Purple Agent (AI Modernizer) is compromised?
Without guardrails:
- It could read all COBOL source from S3
- It could overwrite modernized Rust code with malicious output
- It could exfiltrate data to external services
- There would be no audit trail of what happened
I had built a pipeline with no security boundaries. Any compromised agent had unlimited access to everything.
That's when I discovered AgentGateway — and everything changed.
With AgentGateway in place, the Purple Agent is now blocked at the gateway:
{
"authorized": false,
"error": "Role Modernizer is not authorized to call fetch_source on s3_mcp",
"audit_trail": {
"agent_id": "purple_agent",
"authorized": false,
"request_id": "cf1d3191-4053-4b8e-b8a8-d4035023f92a"
}
}
Blast radius = limited to translation only. The Purple Agent cannot read source, cannot write output, cannot execute code. This is the zero-trust principle applied to AI agents — and it works.
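The denial above comes down to a small allow-list check inside the gateway. Here is a minimal sketch of that decision logic in Rust; the Role enum and is_authorized function are illustrative names I'm using for the sketch, not the actual AgentGateway internals:

```rust
// Illustrative RBAC decision, mirroring the role table used in this
// pipeline: the Orchestrator may call anything, while the Modernizer
// may only call translate on ai_mcp. Not the real AgentGateway code.
#[derive(Debug, PartialEq)]
enum Role {
    Orchestrator,
    Modernizer,
}

fn is_authorized(role: &Role, target_mcp: &str, operation: &str) -> bool {
    match role {
        // Full access to all four MCP servers
        Role::Orchestrator => true,
        // Exactly one permitted operation on exactly one server
        Role::Modernizer => target_mcp == "ai_mcp" && operation == "translate",
    }
}

fn main() {
    // A compromised Purple Agent cannot reach S3...
    assert!(!is_authorized(&Role::Modernizer, "s3_mcp", "fetch_source"));
    // ...but it can still do its one job:
    assert!(is_authorized(&Role::Modernizer, "ai_mcp", "translate"));
    // The Green Agent orchestrates everything:
    assert!(is_authorized(&Role::Orchestrator, "rust_mcp", "execute"));
    println!("RBAC checks pass");
}
```

The point of keeping the check this dumb is that it is auditable at a glance: a denied call is a single match arm returning false, not a policy engine to debug.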
Architecture Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ Kubernetes (mainframe-modernization) │
│ │
│ ┌──────────────┐ JWT/HTTPS ┌─────────────────────────────────┐ │
│ │ │ ────────────────► │ Agent Gateway │ │
│ │ Green Agent │ │ (AuthN + AuthZ + Audit) │ │
│ │(Orchestrator)│ ◄──────────────── │ Port: 8090 │ │
│ │ Port: 8080 │ Proxy Result └──────────┬──────────────────────┘ │
│ └──────────────┘ │ Authorized calls only │
│ ▼ │
│ ┌──────────────┐ ┌─────────────────────────────────────────────┐ │
│ │ Purple Agent │─────►│ MCP Servers │ │
│ │(AI Modernizer│ │ ┌──────┐ ┌──────────┐ ┌───────┐ ┌────┐ │ │
│ │ Port: 8081 │ │ │ S3 │ │AI Trans. │ │ COBOL │ │Rust│ │ │
│ └──────────────┘ │ │:8081 │ │ :8082 │ │ :8083 │ │:84 │ │ │
│ └─────────────────────────────────────────────┘ │
│ NetworkPolicy: Default DENY ALL — whitelist only │
└─────────────────────────────────────────────────────────────────────────────┘
│
┌──────┴──────┐
│ AWS S3 │
│ programs/ │
│ data/ │
│ modernized/│
└─────────────┘
Every agent-to-MCP call flows through AgentGateway. No exceptions. The NetworkPolicy enforces this at the network level — agents literally cannot reach MCP servers without going through the gateway.
Building the 4 MCP Servers
Each MCP server is a focused Rust microservice built with Actix-web. I chose Rust for everything — not just the modernization target but the entire infrastructure. The compiler catches security issues at build time. Memory safety prevents credential leaks in the gateway. It was the right call.
1. S3 MCP (Port 8081)
Handles all AWS S3 operations — fetch COBOL source, save validated Rust output, generate pre-signed download URLs.
async fn fetch_source(
state: web::Data<AppState>,
body: web::Json<FetchRequest>,
) -> HttpResponse {
match get_s3_object(&state.s3_client, &body.bucket, &body.key).await {
Ok(content) => HttpResponse::Ok().json(FetchResponse {
success: true,
content: Some(content),
..Default::default()
}),
Err(e) => HttpResponse::InternalServerError().json(FetchResponse {
success: false,
error: Some(e.to_string()),
..Default::default()
})
}
}
2. AI Translation MCP — ai_mcp (Port 8082)
This is the heart of the pipeline. I started with Gemini 2.5 Pro — and spent many nights debugging generated Rust code that used non-existent methods like .inv() and .quantize(), and invalid RoundingStrategy variants. Every run produced different errors.
I switched to Claude (claude-opus-4-6). First attempt — it compiled and ran correctly. That was the end of the debate.
const CLAUDE_API_URL: &str = "https://api.anthropic.com/v1/messages";
const CLAUDE_MODEL: &str = "claude-opus-4-6";
async fn call_claude(state: &AppState, prompt: &str) -> Result<String, String> {
let request = ClaudeRequest {
model: CLAUDE_MODEL.to_string(),
max_tokens: 32768,
messages: vec![ClaudeMessage {
role: "user".to_string(),
content: prompt.to_string(),
}],
};
let response = state.http_client
.post(CLAUDE_API_URL)
.header("x-api-key", &state.claude_api_key)
.header("anthropic-version", "2023-06-01")
.header("content-type", "application/json")
.json(&request)
.send()
.await
.map_err(|e| format!("Claude API request failed: {}", e))?;
let claude_response: ClaudeResponse = response.json().await
.map_err(|e| format!("Failed to parse Claude response: {}", e))?;
claude_response.content
.into_iter()
.find(|c| c.content_type == "text")
.and_then(|c| c.text)
.ok_or("Empty response from Claude".to_string())
}
3. COBOL MCP (Port 8083)
Compiles and executes COBOL using GnuCOBOL. Captures stdout for comparison with the Rust output. This is the ground truth — whatever COBOL produces, Rust must match exactly.
let compile_result = Command::new("cobc")
.args(["-x", "-o", &binary_path, &source_path])
.output();
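After cobc succeeds, the COBOL MCP runs the resulting binary and captures stdout, and that captured text becomes the ground truth for validation. A minimal sketch of the capture step — run_and_capture is an illustrative helper name, and the demo below uses echo as a portable stand-in for the compiled COBOL binary:

```rust
use std::process::Command;

// Run an executable and return its trimmed stdout, or an error string.
// In the COBOL MCP this would be called on the binary cobc just built.
fn run_and_capture(binary: &str, args: &[&str]) -> Result<String, String> {
    let output = Command::new(binary)
        .args(args)
        .output()
        .map_err(|e| format!("failed to launch {}: {}", binary, e))?;
    if !output.status.success() {
        // Surface the program's stderr as the error message
        return Err(String::from_utf8_lossy(&output.stderr).into_owned());
    }
    Ok(String::from_utf8_lossy(&output.stdout).trim().to_string())
}

fn main() {
    // echo stands in for the compiled COBOL binary in this sketch
    let out = run_and_capture("echo", &["CALCULATED INTEREST: 550.00"]).unwrap();
    println!("{}", out); // → CALCULATED INTEREST: 550.00
}
```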
4. Rust MCP (Port 8084)
Compiles and executes the Claude-generated Rust code using Cargo. One challenge I solved here was dynamic dependency injection — Claude sometimes generates code using rust_decimal, num-format, or num-traits. The Rust MCP detects which crates are needed and injects them into Cargo.toml automatically.
// Dynamic dependency injection based on generated code
let mut deps = String::from(
"rust_decimal = \"1.34\"\n\
rust_decimal_macros = \"1.34\"\n\
num-format = { version = \"0.4\", features = [\"with-system-locale\"] }\n\
num-traits = \"0.2\"\n"
);
if body.source.contains("chrono") {
deps.push_str("chrono = \"0.4\"\n");
}
if body.source.contains("regex") {
deps.push_str("regex = \"1\"\n");
}
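Those dependency strings are then stitched into a complete Cargo.toml before cargo build runs. A minimal sketch of the manifest assembly, with build_manifest as an illustrative helper name and a trimmed dependency list:

```rust
// Illustrative sketch: assemble a Cargo.toml for the generated code,
// appending optional crates only when the source actually uses them.
fn build_manifest(source: &str) -> String {
    let mut deps = String::from(
        "rust_decimal = \"1.34\"\n\
         rust_decimal_macros = \"1.34\"\n",
    );
    if source.contains("chrono") {
        deps.push_str("chrono = \"0.4\"\n");
    }
    if source.contains("regex") {
        deps.push_str("regex = \"1\"\n");
    }
    format!(
        "[package]\nname = \"generated\"\nversion = \"0.1.0\"\nedition = \"2021\"\n\n[dependencies]\n{}",
        deps
    )
}

fn main() {
    // Generated code that imports chrono gets the crate injected
    let manifest = build_manifest("use chrono::NaiveDate;");
    println!("{}", manifest);
}
```

Substring matching on the source is crude but effective here: the worst case is an unused dependency in Cargo.toml, which costs a little compile time and nothing else.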
Adding Guardrails with AgentGateway
Here is the complete flow that makes every MCP call secure:
Step 1: Agent Authentication
Each agent authenticates with an API key and receives a JWT token:
POST /auth/token
{
"agent_id": "green_agent",
"api_key": "your-api-key",
"requested_role": "orchestrator"
}
Response:
{
"access_token": "eyJ...",
"expires_in": 3600,
"role": "orchestrator"
}
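On the agent side, it helps to cache this token together with its expiry so re-authentication can happen proactively instead of only after a 401. A minimal sketch of that bookkeeping; TokenCache is an illustrative type I'm assuming for the sketch, not part of AgentGateway:

```rust
use std::time::{Duration, Instant};

// Illustrative token cache: stores the JWT plus when it expires,
// refreshing slightly early to avoid racing the expiry window.
struct TokenCache {
    access_token: String,
    expires_at: Instant,
}

impl TokenCache {
    fn new(access_token: String, expires_in_secs: u64) -> Self {
        // Refresh up to 60 seconds before the server-side expiry
        let margin = Duration::from_secs(expires_in_secs.min(60));
        Self {
            access_token,
            expires_at: Instant::now() + Duration::from_secs(expires_in_secs) - margin,
        }
    }

    fn needs_refresh(&self) -> bool {
        Instant::now() >= self.expires_at
    }
}

fn main() {
    let fresh = TokenCache::new("eyJ...".to_string(), 3600);
    assert!(!fresh.needs_refresh());
    println!("token valid, {} bytes", fresh.access_token.len());
}
```

This complements the reactive refresh-on-401 logic described later in the Kubernetes section; doing both means an expired token costs at most one retried request.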
Step 2: RBAC Enforcement
Every MCP call is checked against the role table:
| Agent | Role | S3 MCP | ai_mcp | COBOL MCP | Rust MCP |
|---|---|---|---|---|---|
| Green Agent | Orchestrator | ✅ All | ✅ All | ✅ All | ✅ All |
| Purple Agent | Modernizer | ❌ Blocked | ✅ Translate | ❌ Blocked | ❌ Blocked |
Step 3: Proxied MCP Call
Authorized calls are forwarded to the MCP server:
POST /mcp/invoke
Authorization: Bearer eyJ...
{
"target_mcp": "s3_mcp",
"operation": "fetch_source",
"payload": {
"bucket": "my-bucket",
"key": "programs/interest_calc.cbl"
}
}
Step 4: Audit Trail
Every call — authorized or denied — is logged with a unique request ID:
{
"timestamp": "2026-02-23T21:00:24Z",
"agent_id": "green_agent",
"target_mcp": "s3_mcp",
"operation": "fetch_source",
"authorized": true,
"request_id": "c067e63e-7cda-41dd-a9d0-1aededa00c65"
}
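Emitting these records is straightforward even without a serialization crate. A minimal sketch that mirrors the JSON shape above using only the standard library; a real implementation would likely use serde:

```rust
// Illustrative audit record matching the JSON shape shown above.
// format! is used instead of serde to keep the sketch dependency-free.
struct AuditRecord {
    timestamp: String,
    agent_id: String,
    target_mcp: String,
    operation: String,
    authorized: bool,
    request_id: String,
}

impl AuditRecord {
    fn to_json(&self) -> String {
        format!(
            "{{\"timestamp\":\"{}\",\"agent_id\":\"{}\",\"target_mcp\":\"{}\",\"operation\":\"{}\",\"authorized\":{},\"request_id\":\"{}\"}}",
            self.timestamp, self.agent_id, self.target_mcp,
            self.operation, self.authorized, self.request_id
        )
    }
}

fn main() {
    let rec = AuditRecord {
        timestamp: "2026-02-23T21:00:24Z".into(),
        agent_id: "green_agent".into(),
        target_mcp: "s3_mcp".into(),
        operation: "fetch_source".into(),
        authorized: true,
        request_id: "c067e63e-7cda-41dd-a9d0-1aededa00c65".into(),
    };
    // One line per call, appendable to a log stream
    println!("{}", rec.to_json());
}
```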
The Validation Pipeline
The automated validation is what makes this production-ready. Only verified, functionally equivalent Rust code gets saved:
1. Fetch COBOL source from S3 (via S3 MCP)
2. Compile + execute COBOL (via COBOL MCP)
→ Output: "CALCULATED INTEREST: 550.00"
3. Translate COBOL → Rust via Claude (via ai_mcp)
4. Compile + execute Rust (via Rust MCP)
→ Output: "CALCULATED INTEREST: 550.00"
5. Normalize + compare outputs
→ "calculated interest: 550.00" == "calculated interest: 550.00"
6. MATCH → save Rust code to S3 (via S3 MCP)
7. Return pre-signed download URL
If the outputs don't match — the code is not saved. No human needed to check. The pipeline catches it automatically.
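The comparison itself can stay deliberately simple: trim, lowercase, collapse whitespace, then demand an exact match. A minimal sketch of that step — the exact normalization rules here are an assumption on my part, based on the lowercased strings in step 5:

```rust
// Normalize program output so cosmetic differences (case, trailing
// whitespace, spacing) don't cause false mismatches. Any difference
// in the actual numbers or text still fails the comparison.
fn normalize(output: &str) -> String {
    output
        .trim()
        .to_lowercase()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
}

fn outputs_match(cobol_out: &str, rust_out: &str) -> bool {
    normalize(cobol_out) == normalize(rust_out)
}

fn main() {
    let cobol_out = "CALCULATED INTEREST: 550.00\n";
    let rust_out = "calculated interest: 550.00";
    assert!(outputs_match(cobol_out, rust_out)); // MATCH → safe to save
    println!("SUCCESS - Outputs match!");
}
```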
Why Rust is the Right Target Language
Claude (claude-opus-4-6) generates idiomatic Rust that is:
- ✅ Clean and readable — functions are concise and directly traceable to the original COBOL business logic
- ✅ Immediately maintainable — any Rust developer can understand and modify the generated code without specialized mainframe knowledge
- ✅ Memory safe — no garbage collector, no null pointer exceptions, no buffer overflows
- ✅ High performance — zero runtime overhead, sub-millisecond execution
- ✅ Automatically validated — outputs compared against original COBOL before saving
- ✅ Production ready — compiles with standard Cargo, no custom toolchain needed
- ✅ Serverless ready — native support on all major serverless platforms
The generated Rust preserves the original business logic clarity while bringing it into a modern, safe, and performant language that enterprises can confidently maintain for the next 30 years.
Technical Challenges I Overcame
I want to be honest about the struggles — this was not a smooth journey:
- Rust Compiler Errors: RwLock<Option<String>> doesn't implement Clone — spent hours before finding the fix
- Docker Port Conflicts: Two services both wanted port 8081 — separated external (8086) vs internal (8081) ports
- AWS SDK Requirements: aws-smithy-runtime requires Rust 1.91+ — learned this the hard way
- AI Model Selection: Gemini generated different broken code every run — switched to Claude (claude-opus-4-6), which generates correct, compilable Rust consistently
- Cargo Permissions: Non-root mcpuser couldn't write to /tmp/cargo_home — needed explicit .cargo directory ownership
- Dynamic Dependencies: Rust MCP needed to inject crate dependencies automatically based on what Claude generated
- MCP Naming: Renamed gemini_mcp to ai_mcp — reflects the actual function (AI translation), not the model name
- Rust Compiler Messages: detailed and beginner-friendly — while experienced Assembler programmers read storage dumps and PSW registers with ease, Rust's compiler guides developers who are new to low-level systems programming
Each of these took hours to debug. I'm documenting them here so you don't have to go through the same pain.
Phase 3: Kubernetes Deployment — Taking It to Production
After validating the pipeline on Docker Compose, the next challenge was clear: can this run on Kubernetes? For a product targeting enterprise banks and insurance companies, Docker Compose is a prototype. Kubernetes is a product.
The Goal
Deploy all 7 microservices on Kubernetes with:
- Zero-trust network policies (default DENY ALL)
- Health endpoints on every service
- HorizontalPodAutoscaler on Purple Agent (scales 1→5 replicas automatically)
- Single-command deployment via .\deploy.ps1
- Automated demo and RBAC security test after deployment
Kubernetes Manifests
I organized all manifests under k8s/base/, applied in strict order:
00-namespace-rbac.yaml — namespace + service accounts
01-secrets-config.yaml — ConfigMap + secret templates
02-agent-gateway.yaml — AgentGateway deployment + ClusterIP service
03-agents.yaml — Green Agent + Purple Agent + HPA
04-network-policy.yaml — zero-trust NetworkPolicy
05-mcp-servers.yaml — all 4 MCP servers
The order matters. Agent Gateway must be running before agents start — enforced by an initContainer on both Green Agent and Purple Agent:
initContainers:
- name: wait-for-gateway
image: busybox:1.36
command: ['sh', '-c', 'until wget -qO- http://agent-gateway:8090/health;
do echo "Waiting for agent-gateway..."; sleep 3; done']
This guarantees agents never start before the security gateway is ready.
Three Hard Problems I Hit on Kubernetes
1. GLIBC Mismatch — cobol-mcp and s3-mcp CrashLoopBackOff
Both pods crashed immediately on Kubernetes:
/lib/x86_64-linux-gnu/libc.so.6: version GLIBC_2.38 not found
The Rust binaries were compiled against a newer GLIBC than debian:bookworm-slim provided. The fix was straightforward — change the runtime stage in both Dockerfiles:
# Before
FROM debian:bookworm-slim
# After
FROM rust:latest
I pushed to GitHub, GitHub Actions rebuilt the images automatically, and Kubernetes rolled out the new pods — fixed.
2. JWT Token Expiry — 401 Unauthorized After 3 Hours
Green Agent authenticates with Agent Gateway at startup and caches the JWT token. Works perfectly — until 3 hours later when the token expires and every MCP call returns:
Token decode failed: ExpiredSignature
The fix was adding auto-refresh logic in invoke_mcp(). On a 401 response, Green Agent re-authenticates and retries the request transparently:
if status.as_u16() == 401 {
info!("🔄 JWT expired — refreshing token...");
let api_key = std::env::var("AGENT_API_KEY")
.map_err(|_| "AGENT_API_KEY not set")?;
self.authenticate(&api_key).await?;
// Retry with new token...
}
This is the kind of issue that only surfaces under production-like conditions — Docker Compose never runs long enough to hit it.
3. Missing Health Endpoint — Purple Agent 34 Restarts
Purple Agent was stuck in CrashLoopBackOff with 34 restarts. The logs showed it started successfully:
🟣 Purple Agent (AI Modernizer) Online | Listening on 0.0.0.0:8081
But Kubernetes kept killing it. The reason? The liveness probe was hitting /health — a route that didn't exist in the code. Kubernetes saw timeouts, assumed the pod was unhealthy, and restarted it every 30 seconds.
The fix was a single addition to the Axum router:
// Minimal health handler for the liveness probe (assumed shape)
async fn health() -> &'static str { "OK" }

let app = Router::new()
    .route("/solve", post(handle_modernization))
    .route("/health", get(health)); // ← This one line stopped 34 restarts
Lesson learned: every service needs a health endpoint — not optional on Kubernetes.
Single Command Deployment
deploy.ps1 handles everything end-to-end:
Step 1: Verify kubectl and cluster connection
Step 2: Load environment variables from .env
Step 3: Create namespace and RBAC
Step 4: Create all Kubernetes secrets from .env
Step 5: Apply ConfigMap
Step 6: Deploy Agent Gateway — wait for readiness
Step 7: Deploy MCP servers
Step 8: Deploy Green Agent + Purple Agent
Step 9: Apply zero-trust NetworkPolicy
Step 10: Wait for all 7 pods ready
Step 11: Show pod and service status
Step 12: Port-forward Green Agent (background)
Step 13: Run demo pipeline automatically
Step 14: Run RBAC security test
Step 15: Cleanup port-forwards
No manual steps. No separate terminal windows. One command tells the complete story.
The Final Result
All 7 pods running with 0 restarts:
NAME READY STATUS RESTARTS
agent-gateway-xxx 1/1 Running 0
ai-mcp-xxx 1/1 Running 0
cobol-mcp-xxx 1/1 Running 0
green-agent-xxx 1/1 Running 0
purple-agent-xxx 1/1 Running 0
rust-mcp-xxx 1/1 Running 0
s3-mcp-xxx 1/1 Running 0
Pipeline result:
status=SUCCESS - Outputs match! ✅
match_confirmed=True
rust_code_url= [S3 presigned URL]
RBAC security test — automated in deploy.ps1:
✅ RBAC ENFORCED — Purple Agent blocked from S3 as expected!
🛡️ Role 'Modernizer' is NOT authorized to call fetch_source on s3_mcp
What Kubernetes Adds to the Story
For a product targeting enterprise banks, Kubernetes is not optional:
- HPA on Purple Agent — scales 1→5 replicas under load, handles bursts of COBOL files automatically
- Zero-trust NetworkPolicy — pods cannot talk to each other unless explicitly whitelisted, even inside the cluster
- Rolling updates — GitHub Actions rebuilds images on every push, Kubernetes rolls them out with zero downtime
- Health probes — Kubernetes self-heals pods that stop responding
- Namespace isolation — all 7 services live in mainframe-modernization, isolated from other workloads
This is what separates a hackathon project from a product. 🚀
Quick Start
Step 1 — Clone Repository
git clone https://github.com/venkatnagala/Mainframe-Modernization.git
cd Mainframe-Modernization
Step 2 — Configure Environment
cp .env.example .env
# Edit .env and add:
# CLAUDE_API_KEY → get from console.anthropic.com ($5 minimum)
# AWS_ACCESS_KEY_ID → get from console.aws.amazon.com
# AWS_SECRET_ACCESS_KEY
# S3_BUCKET_NAME
# JWT_SECRET → minimum 32 characters
# AGENT_API_KEY → your agent API key
Step 3 — Upload Sample COBOL to S3
aws s3 cp legacy_source/interest_calc.cbl s3://YOUR_BUCKET/programs/interest_calc.cbl
aws s3 cp data/loan_data.json s3://YOUR_BUCKET/data/loan_data.json
Step 4 — Enable Kubernetes in Docker Desktop
Settings → Kubernetes → Enable Kubernetes → Apply & Restart
Step 5 — Run Everything (One Command!)
.\deploy.ps1
This automatically:
- ✅ Deploys all 7 services on Kubernetes
- ✅ Waits for all pods to be ready
- ✅ Triggers the modernization pipeline
- ✅ Runs the RBAC security test
- ✅ Cleans up port-forwards
Expected output:
🚀 Mainframe Modernization Pipeline - Kubernetes Deployment
============================================================
...
🎉 DEPLOYMENT COMPLETE!
Sending Modernization Task to Green Agent...
Task accepted!
@{task_id=MODERN-DEMO-2026; status=SUCCESS - Outputs match! ✅; match_confirmed=True}
✅ RBAC ENFORCED — Purple Agent blocked from S3 as expected!
🛡️ Role 'Modernizer' is NOT authorized to call fetch_source on s3_mcp
The IBM Connection
IBM shares dropped 31% recently — triggered by Claude Code's ability to modernize COBOL. This single market event confirms what this project demonstrates:
- COBOL modernization is no longer a distant future — it is happening now
- Open-source, automated, validated pipelines are the answer
- Security of AI agents doing the modernization is critical
- Rust is the right target language — memory safe, performant, modern, serverless-ready
- Assembler (HLASM) modernization is the natural next step — planned for Phase 4
This project is a working proof that the entire pipeline — fetch, compile, translate, validate, save — can be fully automated with zero-trust security, deployed on Kubernetes, and triggered with a single command.
Key Lessons Learned
On MCP Security:
Don't let agents talk directly to MCP servers. I learned this the hard way. AgentGateway makes it straightforward to add JWT auth and RBAC without modifying your MCP servers at all.
On AI Model Selection:
I evaluated Gemini 2.5 Pro extensively. It generated different broken Rust code on every run. Claude (claude-opus-4-6) generates correct, compilable Rust consistently — that consistency is critical for an automated pipeline with no human review step.
On Rust for AI Infrastructure:
I chose Rust for the entire stack — not just the modernization target. The compiler catches security issues at build time. Memory safety prevents credential leaks in the gateway. Eight worker threads per service with sub-millisecond JWT validation. It was the right decision.
On Kubernetes vs Docker Compose:
Docker Compose is for development. Kubernetes is for products. Real production issues — GLIBC mismatches, JWT token expiry, missing health endpoints — only surface when you deploy to Kubernetes. Every enterprise customer will ask: "Does it run on Kubernetes?" Now the answer is yes, with a single command.
On MCP vs Agent Skills:
Lin Sun asked a great question about this on LinkedIn. My practical answer from building this pipeline:
- MCP: When the operation crosses a trust boundary (S3, compilers, external APIs)
- Agent Skills: When the operation is deterministic and stateless within a trust boundary
What's Next — Phase 4: Enterprise Features
Kubernetes deployment is now complete ✅ — the pipeline runs in production-grade container orchestration with zero-trust security, HPA, and single-command deployment.
The roadmap ahead:
- IBM HLASM Assembler → Rust — architecture designed, implementation planned
- IBM z/OS COBOL compiler integration (IBM Developer for z/OS Enterprise Edition)
- COBOL-CICS — modernize transaction processing to modern web services
- VSAM to modern databases — ESDS→S3, KSDS→DynamoDB/RDS
- Batch processing — modernize entire COBOL codebases in one run
- Kubernetes production hardening — multi-node cluster, Ingress controller, TLS termination, persistent volumes
About Me
I am Venkat Nagala — 30+ years from Fortran to Rust.
- GATE 1994 AIR 444 — Top 0.4% of India's engineering graduates
- CMU-trained in Big Data Analytics (INSOFE)
- Fortune 500 experience (AIG, RBC, Thrivent Financial)
- Languages lived: Fortran → COBOL → Python → Rust
I started my career writing Fortran. I have written COBOL. I have worked alongside experienced Assembler programmers who read storage dumps and PSW registers as naturally as reading English. I have seen what happens to systems that last 60 years. I built this project because I believe the next 60 years of enterprise computing should be built on Rust — safe, fast, and maintainable.
The late nights were worth it.
Built with Rust. Secured with AgentGateway. Deployed on Kubernetes.
Modernizing mainframes, one COBOL line at a time. Assembler support coming in Phase 4! 🚀
GitHub: https://github.com/venkatnagala/Mainframe-Modernization
Demo Videos:
- Quick Demo (2 min): https://www.youtube.com/watch?v=a7Yfz614d5Y
- Detailed Walkthrough (9 min): https://www.youtube.com/watch?v=5s6MMIfxNf0
- Kubernetes Deployment (3 min): https://youtu.be/05I-q2Ugw5Q