Your Coroutines Work Locally. Then Production Happens.
You wrote the async code. It's elegant, non-blocking, and runs beautifully on your machine. Then you deploy — and somewhere around 3 AM, Grafana wakes you up with a memory graph that looks like a ski slope. Welcome to Kotlin Coroutines in Production.
This guide skips the Hello World phase entirely. It's about what happens when real load hits — thread starvation, silent memory leaks, and exception handlers that don't actually handle anything. The kind of bugs that only show up at scale and only at the worst possible time.
Scopes, Supervisors, and Why the Wrong Choice Crashes Everything
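Most developers treat coroutineScope and supervisorScope as roughly the same thing. They are not. With coroutineScope, one failing child cancels the parent and every sibling: great for all-or-nothing operations, catastrophic for independent tasks. With supervisorScope, a failing child cancels neither its siblings nor the scope, so each child has to deal with its own failure. For independent tasks in production, supervisorScope is usually the right call. (coroutineContext, despite the similar name, is not a scope builder at all: it is the set of elements, such as the Job and the dispatcher, that a coroutine runs with.) Understanding the difference between coroutineContext vs coroutineScope vs supervisorScope is what separates code that survives partial failures from code that doesn't.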
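To make the difference concrete, here is a minimal sketch that runs the same two children under either builder and reports whether the healthy sibling got to finish. The names TaskFailed and siblingSurvives are hypothetical, and the 50 ms delay is just a stand-in for real work.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical failure type. It deliberately does not extend RuntimeException,
// so the catch below cannot swallow a CancellationException by accident.
class TaskFailed(message: String) : Exception(message)

// Returns true if the healthy sibling ran to completion despite the failure.
suspend fun siblingSurvives(useSupervisor: Boolean): Boolean {
    val siblingFinished = AtomicBoolean(false)
    // Under supervisorScope the failing child reports here; under
    // coroutineScope this handler is ignored and the parent gets the failure.
    val handler = CoroutineExceptionHandler { _, _ -> /* log in real code */ }

    val children: suspend CoroutineScope.() -> Unit = {
        launch(handler) { throw TaskFailed("child failed") }
        launch {
            delay(50) // stand-in for independent work
            siblingFinished.set(true)
        }
    }

    try {
        if (useSupervisor) supervisorScope(children) else coroutineScope(children)
    } catch (e: TaskFailed) {
        // Only coroutineScope rethrows the child's failure to the caller.
    }
    return siblingFinished.get()
}
```

Called from a coroutine, siblingSurvives(useSupervisor = false) should come back false because the sibling is cancelled along with the scope, while siblingSurvives(useSupervisor = true) should come back true.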
Exception Handling That Actually Works
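Wrapping await() in a try-catch is not enough. When an async child fails inside a regular scope, the failure propagates to the parent right away, so by the time your catch block runs, the parent scope may already be cancelling. Exceptions in coroutines behave differently depending on whether you used launch or async: launch treats an uncaught exception as unhandled and passes it up the Job hierarchy, while async holds it in the Deferred until someone calls await(). A "swallowed" exception in production means 500 errors with no logs on the backend; an unhandled one on Android means an "App has stopped" dialog. The right pattern is a CoroutineExceptionHandler installed at every root scope, paired with supervisorScope to contain blast radius. Keep in mind that the handler only sees uncaught exceptions from launch, and that it is ignored when installed on a child of a regular scope.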
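A minimal sketch of that pattern, assuming a backend-style root scope built from SupervisorJob plus a handler. BackendError and collectFailures are hypothetical names, and a real handler would log rather than collect strings.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical failure type used only for this illustration.
class BackendError(message: String) : Exception(message)

// Records where each failure surfaced: in the root handler or at await().
suspend fun collectFailures(): List<String> {
    val seen = CopyOnWriteArrayList<String>()
    val handler = CoroutineExceptionHandler { _, e -> seen.add("handler: ${e.message}") }

    // Root scope: SupervisorJob keeps one failed child from cancelling the rest.
    val rootScope = CoroutineScope(SupervisorJob() + Dispatchers.Default + handler)

    // launch: the uncaught exception is delivered to the root handler.
    rootScope.launch { throw BackendError("from launch") }.join()

    // async: the exception is held in the Deferred and rethrown by await();
    // the handler is not invoked for it.
    val deferred = rootScope.async<Int> { throw BackendError("from async") }
    try {
        deferred.await()
    } catch (e: BackendError) {
        seen.add("await: ${e.message}")
    }

    rootScope.cancel() // release the scope once we are done with it
    return seen
}
```

The point of the sketch is that the two builders need two different safety nets: the handler covers launch, and a try-catch around await() covers async, with the supervisor keeping either failure from taking the whole scope down.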
Thread Starvation and the Custom Dispatcher You Actually Need
Dispatchers.IO is a reasonable default. It is not enough when you mix non-blocking code with slow legacy database drivers under serious load: the pool is shared and capped (64 threads by default, or the number of cores if that is larger), so enough blocked driver calls will starve every other IO task in the process. The answer is a custom coroutine dispatcher for heavy IO: an isolated fixed thread pool for the slow stuff, so the rest of your app stays responsive. Pair that with limitedParallelism(n) on Dispatchers.Default to cap background CPU work, and you have a proper bulkhead that keeps your latency-sensitive paths alive when everything else is under pressure.
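A rough sketch of such a bulkhead follows. The pool size of 8, the parallelism of 2, and the names legacyDbDispatcher, backgroundCpu, blockingLegacyQuery, loadRow and checksum are all assumptions made for illustration, not tuned or recommended values. Depending on your kotlinx.coroutines version, limitedParallelism may still be marked experimental and produce an opt-in warning.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.Executors

// Isolated fixed pool for slow, blocking driver calls (size 8 is illustrative).
val legacyDbDispatcher: ExecutorCoroutineDispatcher =
    Executors.newFixedThreadPool(8) { r ->
        Thread(r, "legacy-db").apply { isDaemon = true }
    }.asCoroutineDispatcher()

// A view of Dispatchers.Default that runs at most 2 tasks at once.
val backgroundCpu: CoroutineDispatcher = Dispatchers.Default.limitedParallelism(2)

// Hypothetical blocking call standing in for a legacy JDBC query.
fun blockingLegacyQuery(id: Int): String {
    Thread.sleep(20)
    return "row-$id on ${Thread.currentThread().name}"
}

// Blocking work is confined to the isolated pool.
suspend fun loadRow(id: Int): String =
    withContext(legacyDbDispatcher) { blockingLegacyQuery(id) }

// Background CPU work is capped so it cannot occupy every Default thread.
suspend fun checksum(data: ByteArray): Int =
    withContext(backgroundCpu) { data.sum() }
```

Because legacyDbDispatcher owns real threads, it should be closed (legacyDbDispatcher.close()) when the application shuts down. The limitedParallelism view owns no threads and needs no cleanup.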
Leaks, Ghosts, and the Danger of GlobalScope
A coroutine lives in memory as long as its Job is active. Launch it somewhere nothing can cancel it, and you have a ghost: running, consuming resources, invisible. The most common cause is GlobalScope used for "just a quick task." The diagnostic tool is DebugProbes from kotlinx-coroutines-debug: once DebugProbes.install() has been called, DebugProbes.dumpCoroutines() prints every active coroutine with its state and stack trace, so you can see exactly what's suspended and where. The long-term fix is simpler: never break the parent-child hierarchy, and always bind coroutines to a lifecycle-aware scope.
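On the backend, "lifecycle-aware" can be as simple as a component that owns its scope and cancels it on shutdown. The sketch below assumes a hypothetical MetricsPoller class; on Android the equivalent would be a framework-provided scope such as viewModelScope.

```kotlin
import kotlinx.coroutines.*

// Hypothetical component that owns its coroutines and cancels them on close().
class MetricsPoller : AutoCloseable {
    // Every coroutine started by this component is a child of this scope.
    private val scope = CoroutineScope(SupervisorJob() + Dispatchers.Default)

    fun start(): Job = scope.launch {
        while (isActive) {
            delay(10) // stand-in for a periodic task
        }
    }

    // Cancelling the scope cancels everything launched from it.
    // The same loop started with GlobalScope.launch would have no parent
    // to cancel it and would keep running until the process exits.
    override fun close() {
        scope.cancel()
    }
}
```

With this shape, the owner of a MetricsPoller only has to remember one thing, calling close(), and no polling coroutine can outlive the component that started it.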
If It Works on Your Machine, That Is Not Enough
Structured concurrency pitfalls in large-scale systems, state confinement without locks, the island effect that leaves thousands of zombie tasks burning CPU — it's all in the full article.
Production won't wait for you to finish the docs. But reading this first might mean you actually sleep through the night.