DEV Community

Swapping Go for Rust: 10x Cheaper K8s Ingress

Syed Ahmer Shah on May 19, 2026

Let me tell you a story that starts in 2013, peaks somewhere around 2019, and ends with me staring at a $4,200 AWS bill at 11pm on a Tuesday. ...

Syed Ahmer Shah • May 19

Thank you! I really wanted to avoid the 'my language is better than yours' debate. At the end of the day, both Go and Rust are phenomenal tools—it’s all about understanding the memory lifecycle and knowing where to apply them. Glad the nuance came through!

Ronan • May 24

Your technical assessment of Go's garbage collection vs. Rust's compile-time memory management is spot on. Go is brilliant for control plane velocity, but continuous HTTP buffer allocations at high frequencies inevitably introduce an infrastructure tax. Using Rust strictly for the critical data plane substrate is the right architectural choice.

Syed Ahmer Shah • May 24

Exactly, Ronan. "Infrastructure tax" is the perfect way to phrase it. Go’s control plane velocity is unmatched for getting things out the door, but when you are pumping high-frequency HTTP buffers through an ingress, those micro-allocations scale up your cloud bill fast. Splitting the architecture to let Rust do the heavy lifting on the data plane while keeping Go where it thrives was the sweet spot. Really appreciate your sharp breakdown here!

Omar Hurain • May 24

This is one of the most balanced perspectives on the Go vs. Rust debate. It completely avoids developer tribalism by recognizing the exact boundaries where each tool shines. Acknowledging that "boring infrastructure is good infrastructure" shows great maturity—optimizing only where the high-throughput proxy bottleneck demands it.

Syed Ahmer Shah • May 24

Thanks, Omar! You nailed exactly what I was hoping to get across. It’s so easy to fall into the "this language is better than that one" trap, but engineering is just about trade-offs. Go is incredible for 90% of what we build, but when you're hitting that specific proxy bottleneck, the GC tax becomes a real financial metric. Keeping the infrastructure "boring" everywhere else is what allowed us to spend those 3 weeks focusing purely on the data plane. Appreciate you reading and bringing out that specific takeaway!

Faique • May 19

"First time you see an Envoy config file you think someone is hazing you."

I felt this in my soul. Moving away from Traefik’s magic annotations into the raw, explicit world of Envoy is a rite of passage. Kudos to you and your co-engineer for surviving those week-one outages and getting it stable!

Syed Ahmer Shah • May 19

Haha, glad I’m not the only one who felt like it was hazing! Moving from the 'magic' of annotations to the explicit configuration of Envoy is definitely a rite of passage. Those first few outages were stressful, but they were the best learning experience I’ve had in a long time. Thanks for the kind words!

Faraz • May 19

"Boring infrastructure is good infrastructure." Words to live by! It’s easy to get sucked into rewriting everything for the sake of hype, but your point about operational complexity is crucial. 3 weeks of engineering time for 2 people is a real cost, but with a 10x savings on a $4,200/month recurring bill, your ROI hit break-even almost immediately. Great execution.

Syed Ahmer Shah • May 19

Exactly! It’s easy to fall into the trap of 'resume-driven development,' but sometimes the most impressive engineering is the kind that just quietly works and saves the company money. Glad you appreciated the ROI breakdown—it's definitely satisfying to see the infrastructure pay for itself so quickly!

Vinod Oad • May 19

The timing of this is perfect. Seeing Cloudflare completely phase out FL1 for their Rust-based FL2 earlier this year really proved that this isn't just a niche optimization anymore—it's the new standard for edge and proxy layers. Go's GC is incredible for rapid development, but constant HTTP buffer allocations will always be its Achilles' heel at scale.

Syed Ahmer Shah • May 19

That’s a great observation about the Cloudflare transition. You hit the nail on the head—the GC overhead in Go is fantastic for velocity, but when you're dealing with high-frequency HTTP buffer allocations at the edge, that 'Achilles' heel' becomes impossible to ignore. Rust’s ownership model changes the game entirely for those layers.

mote • May 23

Ran into this exact problem on my drone's obstacle avoidance system — the agent kept "forgetting" low-priority sensor streams when GPU memory got tight, which is basically what you're describing with token budget pressure.

The real issue isn't just context length, it's prioritization. When you're running a local model on constrained hardware, you need to decide what stays in memory and what gets evicted. Most frameworks don't give you that control.

Curious how you'd handle multi-modal prioritization — like if you have 5 different sensor feeds but can only afford context for 3, how do you decide which ones to compress or drop?

Syed Ahmer Shah • May 23

You hit the exact core of the issue: resource constraints dictate how an intelligent system perceives reality.

For multi-modal prioritization on local hardware, three approaches work well:

Dynamic Gating: Pass a tiny, hyper-fast "meta-stream" first to detect anomalies, then dynamically promote that specific feed's priority.

Semantic Layering: Compress feeds into text/vector summaries before they hit the main model. A drone doesn't need raw frames of a wall; it just needs the string "Obstacle: 1.2m".

Weighted Eviction: Treat context like a ring-buffer where safety-critical tokens have a longer TTL (Time to Live) than low-priority streams.

Most frameworks treat context as a flat sequence rather than a dynamic memory hierarchy. How are you handling the eviction on your drone right now?
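The weighted-eviction idea above can be sketched roughly as follows. This is a minimal illustration, not code from any framework: the `Priority` classes, the TTL values, and the `ContextBuffer` type are all hypothetical, and time is modeled as a plain tick counter so the behavior is easy to follow.

```rust
use std::collections::VecDeque;

/// Priority class of a context entry (hypothetical names for illustration).
#[derive(Debug, Clone, Copy)]
enum Priority {
    SafetyCritical,
    Low,
}

impl Priority {
    /// Time-to-live in logical ticks; the values are arbitrary assumptions.
    fn ttl(self) -> u64 {
        match self {
            Priority::SafetyCritical => 10,
            Priority::Low => 2,
        }
    }
}

struct Entry {
    label: String,
    expires_at: u64,
}

/// Bounded context buffer: expired entries go first, then the entry that
/// is closest to expiry if the buffer is still full.
struct ContextBuffer {
    capacity: usize,
    entries: VecDeque<Entry>,
}

impl ContextBuffer {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { capacity, entries: VecDeque::new() }
    }

    fn push(&mut self, now: u64, label: &str, priority: Priority) {
        // Drop everything whose TTL has elapsed.
        self.entries.retain(|e| e.expires_at > now);
        // If still full, evict the entry that would expire soonest.
        while self.entries.len() >= self.capacity {
            let idx = self
                .entries
                .iter()
                .enumerate()
                .min_by_key(|(_, e)| e.expires_at)
                .map(|(i, _)| i)
                .expect("buffer is non-empty when at capacity");
            self.entries.remove(idx);
        }
        self.entries.push_back(Entry {
            label: label.to_string(),
            expires_at: now + priority.ttl(),
        });
    }

    fn labels(&self) -> Vec<&str> {
        self.entries.iter().map(|e| e.label.as_str()).collect()
    }
}

fn main() {
    let mut ctx = ContextBuffer::new(3);
    ctx.push(0, "lidar: obstacle 1.2m", Priority::SafetyCritical);
    ctx.push(0, "battery telemetry", Priority::Low);
    ctx.push(1, "cabin temperature", Priority::Low);
    // Buffer is full: the low-priority entry nearest expiry is evicted.
    ctx.push(1, "gps fix", Priority::Low);
    println!("{:?}", ctx.labels());
}
```

Because safety-critical entries carry a longer TTL, they survive both expiry sweeps and capacity evictions longer than low-priority streams, which is the property the comment describes.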

Sahil Kumar • May 19

It’s always fascinating to see Rust’s memory efficiency and predictable performance (no GC pauses) yield such massive infrastructure savings when replacing Go in high-throughput network applications. The transition from Go to Rust for an ingress controller makes a ton of sense given how critical low latency and minimal resource footprints are at that layer.

Syed Ahmer Shah • May 20

It really highlights where the language design choices show up in production, Sahil.

Go’s garbage collector is fantastic for getting applications shipped fast without worrying about memory management, but high-throughput network layers are a completely different beast. When you're processing hundreds of thousands of concurrent requests, constantly allocating and freeing network buffers creates a huge volume of short-lived heap allocations that keeps the GC permanently working overtime.

Switching to Rust's resource management strategy completely flips the script. Being able to pass data through user space with zero-copy operations and zero runtime overhead means the ingress can practically flatline both CPU usage and latency. When you aren't over-provisioning clusters just to buffer against GC spikes, those massive infrastructure savings happen naturally. 👍
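As a rough illustration of what "zero-copy" means at the parsing level, the sketch below splits an HTTP request line into fields that borrow from the original buffer instead of allocating new strings. It is a toy using only the standard library; `RequestLine` and `parse_request_line` are hypothetical names, and a real proxy would rely on a proper HTTP parser rather than splitting on spaces.

```rust
/// Borrowed view of an HTTP request line. Every field points into the
/// original buffer, so parsing performs no heap allocation.
#[derive(Debug, PartialEq)]
struct RequestLine<'a> {
    method: &'a str,
    path: &'a str,
    version: &'a str,
}

fn parse_request_line(buf: &str) -> Option<RequestLine<'_>> {
    // Only the first line matters here; `lines()` also strips the "\r\n".
    let line = buf.lines().next()?;
    let mut parts = line.split(' ');
    let method = parts.next()?;
    let path = parts.next()?;
    let version = parts.next()?;
    // Reject lines with extra tokens.
    if parts.next().is_some() {
        return None;
    }
    Some(RequestLine { method, path, version })
}

fn main() {
    let buf = "GET /healthz HTTP/1.1\r\nHost: example.internal\r\n\r\n";
    let req = parse_request_line(buf).expect("valid request line");
    println!("{} {} {}", req.method, req.path, req.version);
    // The method slice starts at the same address as the input buffer.
    println!("borrowed from input: {}", req.method.as_ptr() == buf.as_ptr());
}
```

The lifetime `'a` ties the parsed view to the buffer, so the compiler rejects any code that would let the view outlive the data it points to.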

Amir • May 24

The ROI breakdown here is excellent. Investing 3 weeks of engineering time for two developers is a tangible upfront cost, but dropping a recurring bill from $4,200 to $390 yields immediate break-even. Framing this through financial metrics rather than just language hype makes it a highly practical case study for modern infrastructure teams.

Syed Ahmer Shah • May 24

Thank you, Amir! I really wanted to ground this in cold, hard math rather than just language fandom. At the end of the day, an engineering manager doesn't care about memory safety hype as much as they care about dropping a recurring bill from $4.2k to under $400. Factoring in the 2-developer, 3-week salary cost made the ROI argument undeniable. Glad you appreciated the financial breakdown!
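For anyone who wants to rerun the break-even math with their own numbers, here is a small sketch. The two monthly bills come from the thread; the cost per engineer-week is purely an assumption for illustration, since the post does not state salary figures, and the result is very sensitive to it.

```rust
/// Months until cumulative savings cover a one-time engineering cost.
/// Returns None when the new bill is not lower than the old one.
fn break_even_months(one_time_cost: f64, old_monthly: f64, new_monthly: f64) -> Option<f64> {
    let monthly_savings = old_monthly - new_monthly;
    if monthly_savings <= 0.0 {
        return None;
    }
    Some(one_time_cost / monthly_savings)
}

fn main() {
    // From the thread: the bill dropped from $4,200 to $390 per month.
    let (old_bill, new_bill) = (4200.0, 390.0);

    // Assumption, not from the post: a loaded cost of $3,000 per engineer-week.
    let cost_per_engineer_week = 3000.0;
    let rewrite_cost = 2.0 * 3.0 * cost_per_engineer_week; // 2 engineers x 3 weeks

    println!("monthly savings: ${:.0}", old_bill - new_bill);
    println!("cost ratio: {:.1}x", old_bill / new_bill);
    if let Some(months) = break_even_months(rewrite_cost, old_bill, new_bill) {
        println!("break-even after {:.1} months", months);
    }
}
```

With these inputs the savings are $3,810 per month, so the payback period is simply the rewrite cost divided by that figure. Plugging in a real loaded salary cost is the only way to tell whether break-even lands in weeks or in months.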

Sagar Kumar • May 19

As a CFO, I love that ending, haha! "I learned a new programming language" is the ultimate engineering mic drop. Seriously though, dropping the node requirement from multiple t3.large instances down to a single t3.small while flattening the CPU spikes is a masterclass in modern cost optimization.

Syed Ahmer Shah • May 19

I’m glad a CFO perspective approves! It’s one thing to talk about performance, but when you can show a massive reduction in the AWS bill while simultaneously flattening those CPU spikes, it’s hard to argue with the results. It was definitely a fun 'mic drop' moment for the team.

Kiran Gho • May 25

This is a great case study on when the Rust rewrite is actually justified. Many teams default to Go for microservices and networking because of development velocity, but this proves that for data-plane bottlenecks like an Ingress, the efficiency gains are worth the engineering overhead. I’m curious if you noticed a significant change in your deployment CI/CD pipeline times, given Rust's notoriously slow compile times compared to Go's lightning-fast builds.

Syed Ahmer Shah • May 25

Oh, the CI/CD pipeline slowdown was definitely the elephant in the room! Go builds in seconds, whereas optimization-heavy Rust builds took a massive bite out of our deployment velocity early on. We ended up having to invest quite a bit of time into configuring aggressive Docker layer caching and setting up remote sccache clusters to keep the developer experience from tanking. The infrastructure savings were worth it, but the build-time tax is real.

Dina Khaluj • May 19

While Go is usually the default for the cloud-native ecosystem due to its low concurrency overhead, your results highlight exactly where it hits its limits—garbage collection pauses and memory footprints under heavy, sustained network I/O. Dropping the GC overhead entirely by moving to Rust clearly paid off here.

Syed Ahmer Shah • May 20

Exactly, Dina. The "runtime tax" is a hidden drain on proxies. When you're running a massive reverse proxy or ingress controller, data is basically just passing through user space. Having a garbage collector constantly scanning those transient network buffers means you're burning CPU cycles just on surveillance.

The real killer in Go isn't even the stop-the-world pauses anymore—it’s the "Mark Assist" mechanic. Under heavy, sustained I/O, if allocations outpace the GC, the runtime literally hijacks active worker goroutines and forces them to clean up memory. That's where those random P99 latency spikes come from.

Ditching that entire headache for Rust’s deterministic dropping gives you that beautiful, flatline latency profile, even when traffic surges. Really glad you zeroed in on the memory mechanics of the shift! 👍
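"Deterministic dropping" can be made visible with a small sketch. The `RequestBuffer` type and the live-buffer counter are hypothetical and exist only to show when memory is released: at the end of the owning scope, with no collector deciding the timing.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts buffers that are currently alive, so the release point is observable.
static LIVE_BUFFERS: AtomicUsize = AtomicUsize::new(0);

struct RequestBuffer {
    data: Vec<u8>,
}

impl RequestBuffer {
    fn new(size: usize) -> Self {
        LIVE_BUFFERS.fetch_add(1, Ordering::SeqCst);
        Self { data: vec![0; size] }
    }
}

impl Drop for RequestBuffer {
    fn drop(&mut self) {
        // Runs when the owner goes out of scope; the Vec is freed right after.
        LIVE_BUFFERS.fetch_sub(1, Ordering::SeqCst);
    }
}

fn handle_request() -> usize {
    let buf = RequestBuffer::new(16 * 1024);
    buf.data.len()
    // `buf` is dropped here, as part of leaving this function.
}

fn main() {
    let n = handle_request();
    println!(
        "handled {} bytes, live buffers now: {}",
        n,
        LIVE_BUFFERS.load(Ordering::SeqCst)
    );
}
```

The point of release is fixed by the program's structure rather than by allocation pressure, which is why there is no equivalent of a worker being pulled into collection work mid-request.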

Mira Taimur • May 19

I’m curious about the trade-offs your team experienced during the rewrite. Specifically, how did you find the ecosystem maturity for K8s tooling in Rust (like kube-rs or custom async runtimes) compared to the battle-tested Go control-plane ecosystem? Also, how are you handling the increased complexity of the codebase for day-to-day maintenance now?

Syed Ahmer Shah • May 20

That’s the exact million-dollar question, Mira. While the infrastructure savings look great on a slide, moving away from Go's native, bulletproof K8s ecosystem introduces some real friction.

Ecosystem-wise, kube-rs has come a long way, but it definitely lacks that "plug-and-play" feel of the official Go client-go libraries. You end up having to build more boilerplate yourself, and navigating custom async runtimes under heavy I/O can require some serious fine-tuning.

As for day-to-day maintenance, the complexity is definitely real. The compiler guarantees give us tons of confidence once code hits production, but onboarding engineers who aren't deeply familiar with Rust's borrow checker or async mechanics takes noticeably longer than it did with Go. It’s essentially a trade-off where we accepted a steeper engineering learning curve in exchange for cheaper compute bills. Thanks for bringing up the operational side of this! 👍

Faiza Naseer • May 21

This is a masterclass in modern infrastructure optimization. I love that you avoided the typical 'language tribalism' and focused purely on the architecture—using Go for the control plane and Rust for the data plane is the absolute sweet spot. The trade-off you highlighted between Go's rapid development velocity and its GC overhead during high-frequency HTTP allocations is exactly why we're seeing this shift at the edge layer. Brilliant work flattening those CPU spikes!

Syed Ahmer Shah • May 21

Appreciate the high praise, Faiza! 👍 Glad you appreciated the lack of tribalism. It's too easy to fall into the "X language is better than Y" trap. At the end of the day, GC pauses and memory footprints are just physics. Go is still an absolute joy for writing complex management logic quickly, but when you're sitting at the edge handling hundreds of thousands of concurrent connections, Rust's predictable memory footprint is just unmatched.

Godekwa Takeshi • May 21

This is an incredible write-up. Pointing out the distinction between using Go for the control plane and Rust for the high-performance data plane layer is such a mature architectural take. Go is unmatched for development velocity, but those constant HTTP buffer allocations at the edge really expose GC overhead under intense load. Dropping down to a t3.small while completely flattening the CPU spikes is a massive engineering win!

Syed Ahmer Shah • May 21

Thanks, Takeshi! Flattening those CPU spikes on a modest instance like a t3.small was incredibly satisfying, and you called out the exact technical bottleneck that forced our hand.

However, I'd love to push back a bit on the "Go for control plane, Rust for data plane" strategy as a universal maturity milestone. While it worked beautifully for this specific high-throughput edge use case, do you think it's possible we are collectively over-engineering this split too early?

Rust completely eliminates the garbage collection (GC) overhead, but it introduces a massive complexity tax, steeper learning curves, and slower iteration speeds. In many cases, heavily optimizing Go—using tools like sync.Pool to recycle those HTTP buffers, minimizing allocations, or tuning the GC percentages (GOGC)—can get a team 85% of the performance gains with a fraction of the engineering overhead.

Given how much more expensive engineering time is compared to cloud infrastructure, do you think dropping down to Rust is always a "mature win," or can it sometimes be a premature optimization when simply scaling up to a larger AWS instance might be the more pragmatic business decision? 👍
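The buffer-recycling idea behind the sync.Pool suggestion is language-independent. Below is a minimal sketch of it in Rust, to stay consistent with the rest of the thread. It is not Go's sync.Pool and does not mirror its API or internals; `BufferPool` is a hypothetical type that only shows the core trick of reusing an allocation instead of making a new one per request.

```rust
use std::sync::Mutex;

/// Minimal buffer pool: hands out cleared buffers and takes them back.
struct BufferPool {
    free: Mutex<Vec<Vec<u8>>>,
    buf_capacity: usize,
}

impl BufferPool {
    fn new(buf_capacity: usize) -> Self {
        Self { free: Mutex::new(Vec::new()), buf_capacity }
    }

    /// Reuse a pooled buffer if one exists; otherwise allocate a fresh one.
    fn get(&self) -> Vec<u8> {
        let recycled = self.free.lock().unwrap().pop();
        recycled.unwrap_or_else(|| Vec::with_capacity(self.buf_capacity))
    }

    /// Return a buffer. `clear()` resets the length but keeps the allocation.
    fn put(&self, mut buf: Vec<u8>) {
        buf.clear();
        self.free.lock().unwrap().push(buf);
    }

    /// Number of buffers currently waiting in the pool.
    fn idle(&self) -> usize {
        self.free.lock().unwrap().len()
    }
}

fn main() {
    let pool = BufferPool::new(4096);

    let mut buf = pool.get();
    buf.extend_from_slice(b"GET / HTTP/1.1\r\n");
    let first_ptr = buf.as_ptr();
    pool.put(buf);

    // The second request gets the same allocation back, already emptied.
    let again = pool.get();
    println!("reused allocation: {}", again.as_ptr() == first_ptr);
    println!("idle buffers: {}", pool.idle());
}
```

Whether pooling like this closes most of the gap in Go is an empirical question for a given workload; the sketch only shows why it reduces allocation churn.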

Rohan Junejo • May 22

Dropping the infrastructure bill from $4,200 to $390 while flattening those CPU spikes is an incredible win. What I appreciate most about this write-up is the lack of language tribalism. You hit the nail on the head regarding the architectural sweet spot: Go is amazing for the orchestration and control plane velocity, but Rust absolutely shines when you need deterministic compile-time memory management for high-frequency data planes. Great execution and fantastic ROI breakdown!

Syed Ahmer Shah • May 22

Completely agree, Rohan. The control plane vs. data plane distinction is everything here. No need for language wars when you just put the right tool in the right slot. Appreciate the feedback! 👍

Ali Shey • May 22

This is an incredible case study. Dropping the infrastructure bill from $4,200 to $390 while moving from multiple t3.large instances down to a single t3.small is a massive engineering win.

I really appreciate the nuance you brought here—too many 'Go vs Rust' pieces devolve into tribalism. Pointing out that Go is fantastic for the control plane/velocity, but that its garbage collection overhead hurts high-frequency HTTP buffer allocations at the data plane level, is the exact technical clarity people need. The 3-week engineering ROI paid for itself almost instantly. Brilliant write-up

Syed Ahmer Shah • May 22

Avoiding language tribalism was key. Go is a beast for velocity, but the GC pauses on high-frequency data planes just can't compete with Rust's determinism. That ROI made the 3 weeks feel like a no-brainer!

Bilal Motiwala • May 22

Dropping the cluster footprint down to a single t3.small while completely stabilizing the metrics is an absolute masterclass in cost optimization. A 10x savings on a thousands-of-dollars recurring monthly bill means the engineering time invested hit break-even almost immediately. Excellent execution and great write-up.

Syed Ahmer Shah • May 22

Thanks, Bilal! That quick break-even point was the ultimate validation for the team. It’s always a gamble pulling engineers off feature work for a rewrite, but when you can shrink the footprint down to a single t3.small and sleep soundly through the night without OOM alerts, the math speaks for itself. Appreciate you reading the breakdown!

Hassan • May 22

The memory footprint reduction here is the real hero. Go’s garbage collector is usually fine, but at Kubernetes ingress scale, those GC spikes and the 'GC tax' on CPU really add up when handling hundreds of thousands of concurrent connections. Dropping the overhead by 10x just by moving to Rust’s manual-under-the-hood memory management completely changes the calculus for cluster scaling. Did you run into any major headaches mapping Rust's strict ownership model to the asynchronous nature of a network proxy, or did libraries like Tokio handle most of the heavy lifting out of the box?

Syed Ahmer Shah • May 22

The "GC tax" at scale is exactly what killed us. To your question: Tokio handled about 80% of the heavy lifting, but mapping Rust’s strict ownership to long-lived, asynchronous connection states definitely caused some sleepless nights. We had to rely heavily on explicit Arc-cloning (Arc<T>) and pinned futures to make the proxy happy, but the 10x savings made it entirely worth it.
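A minimal sketch of the Arc-cloning pattern for shared connection state follows. To keep it self-contained it uses OS threads from the standard library as a stand-in for Tokio tasks, and `ConnState` is a hypothetical struct, so treat it as an illustration of the ownership shape rather than the proxy's actual code.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical per-connection state shared between a reader and a writer.
#[derive(Debug, Default)]
struct ConnState {
    bytes_in: u64,
    bytes_out: u64,
}

fn run_connection() -> (u64, u64) {
    let state = Arc::new(Mutex::new(ConnState::default()));

    // Each task gets its own handle. Arc::clone bumps a reference count;
    // it does not copy the ConnState itself.
    let reader_state = Arc::clone(&state);
    let reader = thread::spawn(move || {
        for _ in 0..100 {
            reader_state.lock().unwrap().bytes_in += 512;
        }
    });

    let writer_state = Arc::clone(&state);
    let writer = thread::spawn(move || {
        for _ in 0..100 {
            writer_state.lock().unwrap().bytes_out += 256;
        }
    });

    reader.join().unwrap();
    writer.join().unwrap();

    let s = state.lock().unwrap();
    (s.bytes_in, s.bytes_out)
}

fn main() {
    let (bytes_in, bytes_out) = run_connection();
    println!("in: {} bytes, out: {} bytes", bytes_in, bytes_out);
}
```

Spawned work must own everything it touches, which is why each closure receives a cloned handle via `move` instead of borrowing `state` directly.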

Emnj Marokh • May 24

The cost optimization here is a textbook example of why compute efficiency matters at scale. While Go is usually efficient enough for standard microservices, its garbage collection cycles and runtime overhead can introduce unpredictable latency spikes and higher memory baselines in high-throughput infrastructure like a Kubernetes Ingress.

Moving to Rust and dropping memory consumption by 90% is massive, especially when it translates directly to smaller node sizes and fewer CPU throttles under peak load.

Syed Ahmer Shah • May 24

You summarized the technical reality beautifully, Emnj. For standard microservices, Go's runtime is brilliant. But at the Ingress level, those tiny latency spikes from garbage collection cycles compound under heavy load, forcing you to over-provision nodes just to handle the headroom. Dropping the memory baseline by 90% and eliminating CPU throttling was an incredible win for our infrastructure stability. Appreciate your deep understanding here!

Virat • May 24

This highlights a great architectural pattern. Using Go for the orchestration control plane where development velocity matters, but swapping the high-frequency data plane over to Rust is the sweet spot for cloud infrastructure. The drop in memory footprint without garbage collection pauses perfectly illustrates why the data plane substrate is moving toward compile-time memory management.

Syed Ahmer Shah • May 24

You hit on the most critical point. It is incredibly easy to let generative tools build a house of cards that collapses the moment you hit production edge cases. The real work was keeping Copilot on a tight leash—ensuring robust test coverage and spending my manual engineering hours on the critical paths like the checkout state machine, where determinism and reliability are non-negotiable.

Virat • May 24

This is one of the most balanced takes on the "Go vs. Rust" debate I’ve read in a while. It’s refreshing that you didn't just fall into tribalism, but instead focused on the architectural sweet spot: leveraging Go's strengths for the control plane and utilizing Rust's deterministic memory management for the data plane. Dropping infrastructure requirements from multiple t3.large instances down to a single t3.small while completely flattening those CPU spikes is an incredible case study in true ROI. It really proves that when you're handling high-frequency HTTP buffer allocations at the ingress layer, eliminating GC overhead completely changes the economics of the cloud.

Syed Ahmer Shah • May 24

Keynote demos always hide the painful reality of production environments. When a black-box model is dynamically spawning subagents or generating underlying architecture, traditional debugging workflows fall apart. Tracing execution loops and handling state errors across multi-agent systems is going to require entirely new tooling and a heavy pivot toward systems auditing.

Kiran Gho • May 25

The resource savings here are massive. In a K8s environment, the Go GC overhead and runtime footprint really compound at scale, especially for high-throughput ingress controllers. Dropping infrastructure costs by 10x by shifting to Rust's zero-cost abstractions and predictable memory model is a huge win for platform engineering. Did you hit any major friction points with async Rust or the learning curve when migrating the existing Go concurrency patterns?

Syed Ahmer Shah • May 25

The learning curve was definitely steep, particularly mapping Go’s straightforward goroutine/channel model to async Rust's Future execution and lifetime constraints. Handling data ownership across async boundaries (Arc<...> wrappers vs. Go's easier pointer passing) caused quite a few fights with the compiler initially. It forces you to reason about your data lifecycles much earlier in the design phase, which slowed us down at first, but resulted in a bulletproof runtime.
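To show how the goroutine/channel shape carries over, here is a rough standard-library sketch with one worker thread and two channels. It deliberately uses `std::sync::mpsc` and OS threads instead of async Rust, and `Job` is a hypothetical type; the part that transfers to async code is the ownership rule, where sending a value moves it out of the sender's hands.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical unit of work passed between stages.
struct Job {
    id: u32,
    payload: Vec<u8>,
}

fn process_jobs(count: u32) -> Vec<(u32, usize)> {
    let (job_tx, job_rx) = mpsc::channel::<Job>();
    let (result_tx, result_rx) = mpsc::channel::<(u32, usize)>();

    // Worker: roughly what a goroutine ranging over a channel would do.
    let worker = thread::spawn(move || {
        for job in job_rx {
            // The worker owns `job` here; nobody else can touch its payload.
            result_tx.send((job.id, job.payload.len())).unwrap();
        }
    });

    for id in 0..count {
        let job = Job { id, payload: vec![0u8; (id as usize + 1) * 10] };
        job_tx.send(job).unwrap();
        // `job` was moved into the channel; using it here would not compile.
    }
    // Dropping the sender closes the channel and ends the worker's loop.
    drop(job_tx);

    worker.join().unwrap();
    result_rx.iter().collect()
}

fn main() {
    for (id, len) in process_jobs(3) {
        println!("job {} carried {} bytes", id, len);
    }
}
```

In Go the sender could keep a pointer to the job after sending it; here the compiler rules that out, which is the kind of early lifecycle reasoning the reply describes.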

Tahir • May 19

Fantastic write-up! That drop from $4,200 to $390 is a massive win, and your breakdown of why it happens (Go's GC latency optimization vs. Rust's deterministic compile-time memory management for proxy allocation lifecycles) is spot on.

It’s refreshing to see a nuanced take that doesn’t just blindly bash Go, but instead highlights the right tool for the right job—Go for orchestrating the control plane, and Rust/C++ for the data plane substrate. Envoy configuration definitely feels like ritual hazing the first time around, but those flat memory metrics make the archaeology completely worth it. Thanks for sharing this!

Syed Ahmer Shah • May 19

Thanks, Tahir! I really appreciate the detailed feedback. You summarized the trade-off perfectly—using Go for the control plane and Rust for the high-performance data plane really is the 'sweet spot' for modern architecture. And yes, the archaeology of the Envoy config file is painful, but those flat memory metrics make the headache worth it every single time!

Darin Ma • May 25

This is one of the most balanced "Go vs. Rust" infrastructure perspectives I've read in a while. Usually, these case studies devolve into language tribalism, but you nailed the actual operational nuance here. Go’s runtime and garbage collector are incredible for rapid control plane orchestration, but high-frequency HTTP proxying at the ingress layer is a completely different beast—those constant buffer allocations eventually turn the GC into a massive bottleneck. Flattening those CPU spikes and dropping the node requirement down like that perfectly illustrates why the data plane substrate is shifting heavily toward Rust. Excellent ROI breakdown.

Syed Ahmer Shah • May 25

"Semantic runtime environment" is the perfect way to phrase it. The token efficiency gains of environmentId are massive, but you're entirely right to call out the security side. If we are exposing raw tools and maintaining persistent multi-turn state directly on the client layer, standard scraping defenses like simple CAPTCHAs become obsolete. We have to start thinking about behavioral rate-limiting at the model-interaction level, which is uncharted territory for most web devs.

Ghafar • May 27

Dropping the cloud bill from $4,200 to $390 by switching the data plane substrate to Rust while keeping Go for control plane orchestration is a phenomenal architectural choice. The way you mapped out the real-world trade-offs—acknowledging the 3 weeks of engineering time vs immediate ROI—proves this wasn't just a hype-driven rewrite. Go’s GC is unmatched for development velocity, but those constant HTTP buffer allocations at high-frequency ingress points really show where its boundaries lie. Flattening the CPU spikes and shrinking the node requirements down to a single t3.small is a massive production win. Great breakdown on memory lifecycle management without the usual tribalism!

Syed Ahmer Shah • May 27

Thanks, Ghafar. It was vital to show that this wasn't just a hype-driven rewrite. Go’s GC is fantastic for the control plane, but at high-frequency ingress points, the buffer allocations just couldn't compete with Rust's memory efficiency. The ROI spoke for itself once we dropped to that single node.

Vicky Jaish • May 19

How did the team find the learning curve going from Go to Rust?

Syed Ahmer Shah • May 20

It was definitely a steep mountain to climb initially, Vicky.

The biggest hurdle for engineers coming from Go is unlearning the habit of just spinning up a goroutine and letting the runtime handle the rest. In Rust, you have to explicitly account for memory ownership, lifetimes, and thread-safety up front.

In the first few weeks, the team spent a lot of time "fighting the borrow checker." Concepts like passing references across async boundaries (Tokio tasks) or managing shared state without heavy runtime locks required a major shift in how we architected code.

That said, once the team got over that initial 3-to-4 week hump, the compiler became more of a helpful guardrail than an enemy. We realized that while development speed slowed down slightly, our debugging and troubleshooting time in production dropped to practically zero—because if the code compiles, it just works. 👍

Zohaib • May 19

Were there any specific Rust libraries (like Axum or Tokio) that made building the new ingress easier, or did you write a lot of the low-level networking from scratch?

Syed Ahmer Shah • May 20

We definitely didn't reinvent the wheel with raw sockets, Zohaib! The Rust async ecosystem is too good to ignore for that.

Tokio was the absolute baseline for the async runtime, but the real heavy lifting on the data plane came from Pingora (Cloudflare’s proxy engine library) and Hyper for raw HTTP parsing. Axum is incredible for building standard web APIs, but for an edge ingress dealing with raw streaming traffic, traffic routing, and custom TLS handshakes, you want to operate a layer or two lower.

We used kube-rs strictly on the control-plane side to watch the Kubernetes API for Ingress and EndpointSlice updates, and then fed those backends into an ultra-fast concurrent map (dashmap) that the proxy layer checks on every request. It gave us the perfect mix—high-level ergonomics for watching K8s events, but absolute bare-metal control over the network packets. 👍
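The split described above, where a watcher writes backends into a shared map and the proxy reads it on every request, can be sketched like this. It is a standard-library approximation only: `RwLock<HashMap>` stands in for dashmap, the watcher is reduced to a plain method call instead of kube-rs, and `RouteTable` with its round-robin policy is a hypothetical design, not the one from the post.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

/// Host -> backend addresses, shared between control plane and data plane.
struct RouteTable {
    routes: RwLock<HashMap<String, Vec<String>>>,
    next: AtomicUsize,
}

impl RouteTable {
    fn new() -> Self {
        Self {
            routes: RwLock::new(HashMap::new()),
            next: AtomicUsize::new(0),
        }
    }

    /// Control-plane side: called when a (hypothetical) watcher sees an update.
    /// An empty backend list removes the host entirely.
    fn set_backends(&self, host: &str, backends: Vec<String>) {
        let mut routes = self.routes.write().unwrap();
        if backends.is_empty() {
            routes.remove(host);
        } else {
            routes.insert(host.to_string(), backends);
        }
    }

    /// Data-plane side: per-request lookup with round-robin selection.
    fn pick_backend(&self, host: &str) -> Option<String> {
        let routes = self.routes.read().unwrap();
        let backends = routes.get(host)?;
        let i = self.next.fetch_add(1, Ordering::Relaxed) % backends.len();
        Some(backends[i].clone())
    }
}

fn main() {
    let table = RouteTable::new();
    table.set_backends(
        "shop.example.internal",
        vec!["10.0.0.1:8080".to_string(), "10.0.0.2:8080".to_string()],
    );
    for _ in 0..3 {
        println!("{:?}", table.pick_backend("shop.example.internal"));
    }
    println!("{:?}", table.pick_backend("unknown.example.internal"));
}
```

Reads take a shared lock and updates are rare, so request handling mostly does not contend with the control plane. A sharded map such as dashmap reduces that contention further, which is presumably why it was chosen.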

Bit Wombat • Jun 4

Gotta say, your writing is excellent. Succinct, dense, interesting, and technically bang-on. Thanks for writing!

Jefrey Daneil • May 20

so you mean rust is now better than go ?

Syed Ahmer Shah • May 20

Not across the board! For a high-throughput, memory-constrained bottleneck like a massive K8s Ingress, Rust’s lack of a garbage collector gives it a massive cost and performance edge. But for general backend development, API microservices, and rapid developer velocity, Go's simplicity, fast compilation, and concurrency model still make it the better choice for most teams. Rust is better for this specific infrastructure pain point, not a total replacement for Go. 👍

Nasir Hassan • May 20

This is an incredibly refreshing read. Usually, "Go vs Rust" pieces devolve into standard language tribalism, but your breakdown brings the perfect amount of engineering nuance.

Syed Ahmer Shah • May 21

Thanks, Nasir! I really wanted to avoid the usual "X is better than Y" trap. Both languages are phenomenal; it’s just a matter of choosing the right tool for the specific architectural job. I'm really glad the engineering nuance came through. Appreciate you reading and sharing your thoughts! 👍

mihir mohapatra • May 20

Getting good experience by reading your blog😊

Dina Khaluj • May 19

This is a fascinating case study! A 10x cost reduction at the ingress level is massive, especially when dealing with high-throughput K8s traffic.

Syed Ahmer Shah • May 20

It really is a massive win, Dina, especially when you scale it across multiple clusters. At high throughput, small inefficiencies in the ingress layer compound fast into massive cloud bills.

The coolest part is that most of those savings come directly from how differently the two languages handle memory under heavy I/O. Go is great, but its garbage collector has to keep tracking and collecting transient network buffers, which eats up a lot of CPU. By switching to Rust and utilizing its zero-cost abstractions, the proxy can route traffic with a far smaller memory footprint and no GC pauses.

When you strip out all that overhead, you realize you need a fraction of the compute power to handle the exact same traffic. Really glad you found the case study valuable! 👍