---
title: "Fix Connection Pool Exhaustion: Kotlin Coroutines + HikariCP Under Load"
published: true
description: "Kotlin coroutines exhaust HikariCP pools because JDBC blocks threads coroutines can't reclaim. Here's the dispatcher sizing, R2DBC migration, and circuit-breaker setup that handles 10x spikes."
tags: kotlin, architecture, api, cloud
canonical_url: https://blog.mvp-factory.com/connection-pool-exhaustion-kotlin-coroutines-hikaricp
---
## What We Will Build
In this workshop, I'll walk you through the exact problem that silently kills Kotlin coroutine backends under load — connection pool exhaustion — and three production-tested fixes. By the end, you will have:
- A clear mental model of *why* `suspend fun` + JDBC is a trap
- A dedicated dispatcher configuration sized to your pool
- A circuit-breaker wrapper using Resilience4j
- An understanding of when R2DBC eliminates the problem entirely
Let me show you a pattern I use in every project that runs coroutines against a relational database.
## Prerequisites
- Kotlin 1.7+ with coroutines
- Spring Boot 3.x (WebFlux or MVC with coroutine support)
- HikariCP (ships with Spring Boot by default)
- Resilience4j for circuit breakers
- Familiarity with `suspend fun` and `Dispatchers.IO`
## Step 1: Understand the Mismatch
Here is the gotcha that will save you hours. When a coroutine calls a JDBC driver through HikariCP, it **blocks the underlying thread** until the query completes. The coroutine runtime cannot see this. It cannot reclaim that thread. Your `suspend fun` is suspended in name only.
Under normal load, everything looks fine. Under a 3–5x traffic spike, your connection pool drains within seconds, requests queue behind `getConnection()` timeouts, and latency cascades through every downstream service.
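To see the mismatch in isolation, here is a minimal sketch that needs no database. It assumes `Thread.sleep` is a fair stand-in for a blocking JDBC call and `delay` for a non-blocking one; `timeTasks` is a hypothetical helper, not part of any library.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlin.system.measureTimeMillis

// Runs `tasks` coroutines on a dispatcher limited to `threads` threads
// and returns the wall-clock time in milliseconds.
fun timeTasks(tasks: Int, threads: Int, work: suspend () -> Unit): Long {
    val dispatcher = Dispatchers.Default.limitedParallelism(threads)
    return measureTimeMillis {
        runBlocking {
            repeat(tasks) { launch(dispatcher) { work() } }
        }
    }
}

fun main() {
    // Blocking stand-in: each call holds its thread for the full 50 ms.
    val blockingMs = timeTasks(tasks = 40, threads = 4) { Thread.sleep(50) }
    // Suspending stand-in: the thread is released while the coroutine waits.
    val suspendingMs = timeTasks(tasks = 40, threads = 4) { delay(50) }
    println("blocking: $blockingMs ms, suspending: $suspendingMs ms")
}
```

The blocking run needs at least ten rounds of 50 ms because only four calls can progress at once, while the suspending run should finish in roughly the time of a single call. Exact numbers depend on the machine.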
Consider a service on `Dispatchers.IO` (default 64 threads) with a HikariCP pool of 10 connections and queries averaging 50ms:
| Metric | Normal load (200 rps) | Spike (2,000 rps) |
|---|---|---|
| Concurrent DB calls | ~10 | ~100 |
| Pool wait time (p99) | < 5ms | > 30,000ms (timeout) |
| Thread utilization | 15% | 100% (starvation) |
| Dropped connections | 0 | cascading failures |
The formula that exposes the problem:
```plaintext
max_concurrent_db_calls = request_rate × avg_query_duration
2000 rps × 0.05s = 100 concurrent calls vs. 10 pool connections
```
Of those 100 coroutines, 10 hold a connection. The other 90 cannot get one: up to 54 of them each pin an `IO` dispatcher thread inside `getConnection()` until the timeout fires, and the rest queue behind them waiting for a thread. This is dispatcher starvation.
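The arithmetic can be sketched as a small calculator. `LoadEstimate` and `estimateLoad` are hypothetical names, and the model is deliberately crude: it assumes a steady arrival rate and a fixed query time.

```kotlin
import kotlin.math.min

data class LoadEstimate(
    val concurrentCalls: Int,   // queries in flight if nothing were capped
    val holdingConnection: Int, // calls that actually hold a pooled connection
    val blockedThreads: Int,    // dispatcher threads stuck waiting for a connection
    val queuedCoroutines: Int,  // coroutines still waiting for a dispatcher thread
)

fun estimateLoad(rps: Int, avgQueryMillis: Int, poolSize: Int, dispatcherThreads: Int): LoadEstimate {
    // request_rate × avg_query_duration, in integer math to avoid rounding surprises
    val concurrent = rps * avgQueryMillis / 1000
    val running = min(concurrent, dispatcherThreads) // coroutines that got a thread
    val holding = min(running, poolSize)             // of those, the ones with a connection
    return LoadEstimate(concurrent, holding, running - holding, concurrent - running)
}

fun main() {
    println(estimateLoad(rps = 200, avgQueryMillis = 50, poolSize = 10, dispatcherThreads = 64))
    println(estimateLoad(rps = 2000, avgQueryMillis = 50, poolSize = 10, dispatcherThreads = 64))
}
```

For the spike case this yields 100 concurrent calls, 10 holding a connection, 54 threads blocked, and 36 coroutines queued.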
## Step 2: Isolate JDBC on a Dedicated Dispatcher
The docs do not mention this, but the single most impactful fix if you are staying on JDBC is creating a **bounded dispatcher sized exactly to your connection pool**:
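```kotlin
import io.github.resilience4j.circuitbreaker.CircuitBreaker
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig
import io.github.resilience4j.kotlin.circuitbreaker.executeSuspendFunction
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
import org.springframework.jdbc.core.queryForObject
import java.time.Duration

// Dedicated dispatcher, sized to the connection pool (maximumPoolSize = 10)
val dbDispatcher = Dispatchers.IO.limitedParallelism(10)

// Circuit breaker via Resilience4j
val circuitBreaker = CircuitBreaker.of("db", CircuitBreakerConfig.custom()
    .failureRateThreshold(50f)
    .waitDurationInOpenState(Duration.ofSeconds(5))
    .slidingWindowSize(20)
    .build())

suspend fun getUser(id: Long): User = withContext(dbDispatcher) {
    circuitBreaker.executeSuspendFunction {
        // toUser() is a ResultSet-to-User mapping helper you supply
        jdbcTemplate.queryForObject("SELECT * FROM users WHERE id = ?", id) { rs, _ -> rs.toUser() }
    }
}
```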
That is the minimal setup. The sizing formula I use in production:
```plaintext
dispatcher_parallelism = hikari_max_pool_size
hikari_max_pool_size = (core_count * 2) + effective_spindle_count
```
This caps the number of coroutines entering the JDBC path at the number of connections available. The circuit breaker handles sustained pressure: when the pool is overwhelmed, it trips open and fails fast. A 2ms failure is better than 100 coroutines each waiting 30 seconds for a connection that never comes.
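To check that the cap behaves as described, here is a rough sketch that measures peak concurrency on a limited dispatcher. `poolSizeFor` and `peakConcurrency` are hypothetical helpers, and `Thread.sleep` again stands in for a blocking query.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Sizing helper following the formula above.
fun poolSizeFor(coreCount: Int, effectiveSpindleCount: Int): Int =
    coreCount * 2 + effectiveSpindleCount

// Launches `requests` fake queries on a dispatcher limited to `poolSize`
// and returns the highest number that were running at the same time.
fun peakConcurrency(poolSize: Int, requests: Int): Int {
    val dbDispatcher = Dispatchers.IO.limitedParallelism(poolSize)
    val inFlight = AtomicInteger(0)
    val peak = AtomicInteger(0)
    runBlocking {
        repeat(requests) {
            launch(dbDispatcher) {
                val now = inFlight.incrementAndGet()
                peak.updateAndGet { maxOf(it, now) }
                Thread.sleep(20) // stand-in for a blocking JDBC query
                inFlight.decrementAndGet()
            }
        }
    }
    return peak.get()
}

fun main() {
    val poolSize = poolSizeFor(coreCount = 4, effectiveSpindleCount = 1) // 9
    println("peak concurrent queries: ${peakConcurrency(poolSize, requests = 100)}")
}
```

The reported peak should never exceed `poolSize`, which is the property that keeps callers from piling up inside `getConnection()`. The remaining coroutines wait in the dispatcher queue without holding a thread.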
## Step 3: Evaluate R2DBC for New Services
R2DBC drivers are non-blocking at the protocol level. When a coroutine awaits an R2DBC query, it **truly suspends**. No thread held. Compare:
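```kotlin
import org.springframework.jdbc.core.queryForObject
import org.springframework.r2dbc.core.awaitSingle

// Blocking JDBC: holds a thread for the entire query lifecycle
suspend fun getUserJdbc(id: Long): User = withContext(Dispatchers.IO) {
    jdbcTemplate.queryForObject("SELECT * FROM users WHERE id = ?", id) { rs, _ -> rs.toUser() }
}

// R2DBC: true suspension, no thread pinned
suspend fun getUserR2dbc(id: Long): User {
    // Named parameter, because positional placeholder syntax differs between drivers
    return databaseClient.sql("SELECT * FROM users WHERE id = :id")
        .bind("id", id)
        .map { row -> row.toUser() } // toUser() is a mapping helper you supply
        .awaitSingle()
}
```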
| Factor | HikariCP + JDBC | R2DBC |
|---|---|---|
| Thread held during query | Yes | No |
| Coroutine suspension | Fake (thread blocked) | Real (non-blocking) |
| Pool exhaustion under 10x spike | Likely | Manageable |
| Backpressure support | None | Built-in (Reactive Streams) |
| Driver maturity | Excellent | Good (PostgreSQL, MySQL stable) |
| ORM support | Full (Hibernate, jOOQ) | Limited (Spring Data R2DBC) |
R2DBC will not magically give you infinite connections, but it removes thread starvation as a failure amplifier. That difference matters more than it sounds.
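The distinction can be simulated without a database. In this sketch a coroutine `Semaphore` stands in for a non-blocking connection pool and `delay` for awaiting a query result; it is an analogy for how suspension behaves, not R2DBC itself, and `peakInFlightQueries` is a hypothetical helper.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.sync.Semaphore
import kotlinx.coroutines.sync.withPermit
import java.util.concurrent.atomic.AtomicInteger

// Runs `requests` simulated non-blocking queries against `connections` permits
// on a dispatcher with only `threads` threads. Returns the peak number of
// queries that were in flight at the same time.
fun peakInFlightQueries(requests: Int, connections: Int, threads: Int): Int {
    val pool = Semaphore(connections) // stand-in for connection acquisition
    val dispatcher = Dispatchers.Default.limitedParallelism(threads)
    val inFlight = AtomicInteger(0)
    val peak = AtomicInteger(0)
    runBlocking {
        repeat(requests) {
            launch(dispatcher) {
                pool.withPermit { // suspends, rather than blocks, when no permit is free
                    val now = inFlight.incrementAndGet()
                    peak.updateAndGet { maxOf(it, now) }
                    delay(50) // stand-in for awaiting a query result
                    inFlight.decrementAndGet()
                }
            }
        }
    }
    return peak.get()
}

fun main() {
    println(peakInFlightQueries(requests = 100, connections = 10, threads = 2))
}
```

The peak stays capped by the number of connections, yet it can exceed the number of threads, because a waiting query does not occupy one. Connections remain the bottleneck, but threads no longer are.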
## Step 4: Monitor Aggressively
A healthy production setup needs dedicated dispatchers per resource, circuit breakers at each boundary, and dashboards tracking these metrics:
- `hikaricp_connections_pending` — early warning for exhaustion
- `r2dbc_pool_acquired` vs `r2dbc_pool_max_allocated` — pool pressure
- Coroutine dispatcher queue depth — starvation indicator
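
As a rough illustration of turning those gauges into an alert, the sketch below classifies a pool snapshot. `PoolSnapshot`, `PoolHealth`, and `classify` are hypothetical names and the thresholds are assumptions to tune. With HikariCP, the three values are available as `activeConnections`, `totalConnections`, and `threadsAwaitingConnection` on `HikariPoolMXBean`.

```kotlin
// Point-in-time view of a connection pool.
data class PoolSnapshot(val active: Int, val total: Int, val pending: Int)

enum class PoolHealth { OK, SATURATED, EXHAUSTED }

// `pendingAlert` is the number of waiting callers that counts as exhaustion.
fun classify(s: PoolSnapshot, pendingAlert: Int = 1): PoolHealth = when {
    s.pending >= pendingAlert -> PoolHealth.EXHAUSTED          // callers are already waiting
    s.total > 0 && s.active == s.total -> PoolHealth.SATURATED // no headroom left
    else -> PoolHealth.OK
}

fun main() {
    println(classify(PoolSnapshot(active = 3, total = 10, pending = 0)))
    println(classify(PoolSnapshot(active = 10, total = 10, pending = 0)))
    println(classify(PoolSnapshot(active = 10, total = 10, pending = 7)))
}
```

Alerting on `SATURATED` rather than `EXHAUSTED` gives you a warning before requests start queueing.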
## Gotchas
1. **`Dispatchers.IO` is shared.** If your JDBC calls saturate it, every other coroutine using `IO` — file reads, HTTP calls, logging — starves too. Always isolate with `limitedParallelism`.
2. **Pool size ≠ dispatcher size is a bug.** If your dispatcher allows 64 concurrent coroutines but your pool only has 10 connections, 54 coroutines block on `getConnection()` and hold threads hostage. Match them exactly.
3. **R2DBC does not replace your ORM.** If you rely on Hibernate or complex jOOQ queries, R2DBC with Spring Data is not a drop-in replacement. Evaluate the trade-off honestly before migrating.
4. **Circuit breakers need tuning, not defaults.** A `failureRateThreshold` of 50% with a sliding window of 20 is a starting point. Profile your actual failure patterns and adjust.
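
For the tuning point, here is a sketch of a more explicit Resilience4j configuration that also treats slow queries as a signal. Every value is illustrative rather than recommended, and `tunedConfig` is a hypothetical name.

```kotlin
import io.github.resilience4j.circuitbreaker.CallNotPermittedException
import io.github.resilience4j.circuitbreaker.CircuitBreaker
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig
import java.time.Duration

// Illustrative values only; derive real ones from observed latency and error rates.
val tunedConfig: CircuitBreakerConfig = CircuitBreakerConfig.custom()
    .slidingWindowType(CircuitBreakerConfig.SlidingWindowType.COUNT_BASED)
    .slidingWindowSize(20)
    .minimumNumberOfCalls(10)                          // no verdict before 10 recorded calls
    .failureRateThreshold(50f)
    .slowCallDurationThreshold(Duration.ofMillis(500)) // what counts as a slow query here
    .slowCallRateThreshold(80f)                        // open if 80% of calls are slow
    .waitDurationInOpenState(Duration.ofSeconds(5))
    .permittedNumberOfCallsInHalfOpenState(3)          // probe calls before closing again
    .build()

fun main() {
    val breaker = CircuitBreaker.of("db", tunedConfig)
    // Feed the breaker 10 failures, standing in for connection timeouts.
    repeat(10) {
        runCatching { breaker.executeSupplier<Unit> { throw IllegalStateException("pool timeout") } }
    }
    println(breaker.state) // expected: OPEN, since 10 of 10 recorded calls failed
    val rejected = runCatching { breaker.executeSupplier { "query" } }
        .exceptionOrNull() is CallNotPermittedException
    println("rejected without touching the pool: $rejected")
}
```

The slow-call settings matter for pool exhaustion in particular: queries that succeed but take too long drain the pool just as effectively as queries that fail.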
## Conclusion
Stop using `Dispatchers.IO` directly for JDBC calls: create a `limitedParallelism` dispatcher sized exactly to your connection pool. For new services with simple data access patterns, use R2DBC to eliminate the blocking mismatch entirely. Put circuit breakers at the connection pool boundary, because a tripped breaker that fails in 2ms beats 100 coroutines timing out at 30 seconds each. Together, these three changes turn a system that collapses under a 3x spike into one that degrades predictably under a 10x spike instead of cascading.