DEV Community

Prathamesh Thakre

Unbounded Queues: The Silent Killer of Production Services

Your service runs fine at 2 PM.

At 6 PM, the database experiences a brief latency spike—nothing catastrophic, maybe 200ms slower than usual. Within minutes, your monitoring alerts start lighting up. Memory usage climbs 40%, then 60%. GC pauses increase. Users start timing out.

By 7 PM, you have an OutOfMemoryError.

You check the logs. Nothing unusual. The database recovered. The CPU is fine. The network is fine. So what killed you?

An unbounded queue in your ThreadPoolExecutor.

This is one of those bugs that feels like it shouldn't exist in 2026. It's well-known, thoroughly documented, yet somehow still sneaks past code reviews and deploys into production. The reason is simple: unbounded queues feel safe at first.

You define a thread pool with 10 threads, and you assume the queue is your safety net. When threads are busy, tasks wait. Seems reasonable. Until the queue has 100,000 tasks in it.

The Deceptive Logic of Unbounded Queues

Here's how the trap works:

ExecutorService executor = new ThreadPoolExecutor(
    10,                                    // corePoolSize
    10,                                    // maxPoolSize
    60, TimeUnit.SECONDS,                 // keepAliveTime
    new LinkedBlockingQueue<>()            // ← DANGER: unbounded queue
);

You've created a thread pool with 10 threads. When all 10 threads are busy, new tasks don't get rejected—they get queued. The queue can hold unlimited tasks.

Your mental model: "Threads are busy, tasks queue up, threads finish, queue drains."

The reality during latency: "Threads are busy waiting on slow database calls, tasks keep arriving and queueing, queue grows indefinitely, memory fills up, GC panics, everything crashes."

The core issue is that a queue is not a buffer—it's a pit.

A buffer should have boundaries. It should say "I can hold X items, then I stop accepting more." A queue with no bounds just keeps taking items until your JVM runs out of memory.
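You can watch the pit fill in a few lines. The sketch below (class and method names are mine, purely for illustration) blocks both worker threads on a latch that stands in for a slow downstream call, then keeps submitting — the executor accepts every task:

```java
import java.util.concurrent.*;

public class UnboundedQueueDemo {

    // Submit `tasks` jobs to a 2-thread pool whose workers are blocked,
    // then report how many tasks piled up in the unbounded queue.
    static int queuedAfterSubmitting(int tasks) throws InterruptedException {
        CountDownLatch stall = new CountDownLatch(1);   // stands in for a slow DB call
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
            2, 2, 60, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>());               // unbounded

        for (int i = 0; i < tasks; i++) {
            executor.submit(() -> {
                try { stall.await(); } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // The first 2 tasks went straight to the worker threads;
        // everything else is sitting in the queue, consuming heap.
        int queued = executor.getQueue().size();
        stall.countDown();
        executor.shutdownNow();
        return queued;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Queued tasks: " + queuedAfterSubmitting(10_000));
    }
}
```

Every one of those queued tasks is live on the heap; nothing in this code ever says no.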

When This Goes Wrong

The sneaky part is that unbounded queues don't cause problems under normal load. They cause problems under the exact circumstances when you most need protection:

Scenario 1: Temporary Latency Spike

Your database experiences a brief slowdown. Queries that normally take 10ms now take 500ms. Your 10-thread pool fills up as threads block waiting for results.

Time 0:
  Thread 1-10: Processing requests (blocking on DB)
  Queue: Empty

Time 1 (DB latency spike):
  Thread 1-10: Still waiting for DB responses
  Queue: 100 pending requests

Time 2:
  Thread 1-10: Still waiting
  Queue: 1,000 pending requests

Time 3:
  Thread 1-10: Finally getting responses back
  Queue: 5,000 pending requests and growing

Time 5:
  Thread 1-10: Working through the backlog
  Queue: 100,000+ pending requests
  Memory: 2GB (heap is 4GB)

Time 7:
  Queue: OutOfMemoryError. Your service dies.

The database recovered at Time 3. But your service didn't. It spent the next 4 minutes executing stale work for requests that already timed out on the client side.

Scenario 2: Cascading Failures

Service A depends on Service B. Service B starts degrading. Service A's thread pool queues up requests waiting for responses. The queue grows. Memory spikes. Service A crashes, shifting more load onto Service B, which causes other services that call Service B to queue up their own requests, which...

This is how a cascading failure happens. One slow service takes down three others.

The Root Cause: The Acceptance vs. Execution Mismatch

Here's the fundamental problem:

A ThreadPoolExecutor has two knobs:

  1. Core threads — threads that always exist
  2. Queue — where tasks wait when threads are busy

The issue: When the queue is unbounded, the executor accepts every single task regardless of capacity. Your thread pool can't say "no, I'm overloaded, reject this task." It just queues it. Worse, an unbounded queue makes maxPoolSize meaningless: the pool only grows beyond core size when the queue reports full, and an unbounded queue never does.

This creates a mismatch between accepting work and executing work.

Request comes in → Added to unlimited queue → Task waits → More requests come in
                                               ↓
                                          Task still waiting
                                          Memory growing
                                          GC struggling
                                          JVM dying

The executor accepted the work (queued it), but never had capacity to execute it.

The Solution: Bounded Queues and Backpressure

The fix is to give your queue a hard limit:

ExecutorService executor = new ThreadPoolExecutor(
    10,                                    // corePoolSize
    20,                                    // maxPoolSize (grow if needed)
    60, TimeUnit.SECONDS,                 // keepAliveTime
    new LinkedBlockingQueue<>(1000),      // ← BOUNDED: max 1000 tasks
    new ThreadPoolExecutor.CallerRunsPolicy()  // Rejection policy
);

Now, when the queue fills to 1,000 items, the pool first grows extra threads up to maxPoolSize (20). Once the queue is full and all 20 threads are busy, the next task gets rejected.

By default, a rejected task throws a RejectedExecutionException. But that seems harsh—you want your service to degrade gracefully, not crash.

This is where rejection policies come in:

Rejection Policy 1: CallerRunsPolicy (My Favorite)

new ThreadPoolExecutor.CallerRunsPolicy()

When the queue is full, instead of rejecting the task, execute it in the caller's thread. This creates natural backpressure—the caller slows down, which slows down the request ingestion, which protects the thread pool.

Effect: Your API becomes slower under load instead of crashing. Users see higher latency, not 503s or a dead service.

Incoming requests → Thread pool queue (1000 items) → FULL
                                                      ↓
                              CallerRunsPolicy: Run in caller thread
                                                      ↓
                              Caller gets blocked → Slows down ingestion
                                                      ↓
                              Natural backpressure applied

Rejection Policy 2: AbortPolicy (Explicit Failure)

new ThreadPoolExecutor.AbortPolicy()

Throw an exception. The caller knows immediately that the system is overloaded. They can retry, circuit-break, or fail fast.

try {
    executor.submit(task);
} catch (RejectedExecutionException e) {
    log.warn("Executor is overloaded, backing off");
    return new ServiceUnavailableResponse();
}

This is more explicit but requires the caller to handle the rejection.

Rejection Policy 3: DiscardPolicy (Nuclear Option)

new ThreadPoolExecutor.DiscardPolicy()

Silently drop the task. Use this only for non-critical work where loss is acceptable (e.g., metrics collection, logging).
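For example, a best-effort metrics pipeline might look like this sketch (class and method names are illustrative): one worker, a small queue, and overflow silently dropped rather than thrown back at the caller:

```java
import java.util.concurrent.*;

public class DiscardDemo {

    // Best-effort executor: one worker, a small queue, and DiscardPolicy
    // so overflow is dropped silently instead of thrown back at callers.
    static ThreadPoolExecutor newMetricsExecutor(int queueCapacity) {
        return new ThreadPoolExecutor(
            1, 1, 30, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(queueCapacity),
            new ThreadPoolExecutor.DiscardPolicy());
    }

    // Submit `samples` tasks while the single worker is blocked and
    // return how many actually made it into the queue.
    static int keptUnderOverload(int queueCapacity, int samples) throws InterruptedException {
        ThreadPoolExecutor executor = newMetricsExecutor(queueCapacity);
        CountDownLatch stall = new CountDownLatch(1);
        executor.execute(() -> {
            try { stall.await(); } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        for (int i = 0; i < samples; i++) {
            executor.execute(() -> {});   // never throws, even when the queue is full
        }
        int kept = executor.getQueue().size();
        stall.countDown();
        executor.shutdownNow();
        return kept;
    }
}
```

Dropped metric samples cost you a little observability under load; dropped user requests cost you users. Reserve this policy for the former.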

Tuning the Queue Capacity

How big should your queue be?

This is where it gets nuanced. Too small, and you reject valid requests during normal fluctuations. Too big, and you're back to the original problem.

A practical approach:

// Estimate based on your latency and throughput
int averageRequestsPerSecond = 1000;    // measured throughput
int maxAcceptableLatencySeconds = 10;   // longest a request may wait in the queue
int estimatedQueueSize =
    averageRequestsPerSecond * maxAcceptableLatencySeconds;

// Conservative estimate: 10,000 tasks
// This gives you 10 seconds of buffer at 1,000 req/s

ExecutorService executor = new ThreadPoolExecutor(
    10,
    20,
    60, TimeUnit.SECONDS,
    new LinkedBlockingQueue<>(estimatedQueueSize),
    new ThreadPoolExecutor.CallerRunsPolicy()
);

Then monitor in production:

  • If queue hits capacity regularly, increase it (or increase core threads)
  • If queue rarely exceeds 10% capacity, you can reduce it

A Deeper Problem: Stale Task Execution

Even with bounded queues, there's another issue: stale tasks still get executed.

When a client times out waiting for a response, they've given up. But their task might still be sitting in the queue, waiting to execute. Hours later, when the queue drains, the thread pool dutifully executes it.

This is wasted work—your thread pool is doing something nobody cares about anymore.

One partial solution: Use Future with timeouts:

ExecutorService executor = Executors.newFixedThreadPool(10);

Future<Response> future = executor.submit(() -> {
    return expensiveOperation();
});

try {
    Response response = future.get(5, TimeUnit.SECONDS);
    return response;
} catch (TimeoutException e) {
    future.cancel(true);  // Cancel the task
    return timeoutResponse();
}

The cancel(true) flag attempts to interrupt the task. But this only works if your task respects interrupts. Many database drivers don't.
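If you control the task code, you can make it cooperate by polling the interrupt flag between units of work. A sketch (the class name is mine, and the loop body stands in for real per-item work):

```java
import java.util.concurrent.Callable;

public class CooperativeTask {

    // A task that cooperates with future.cancel(true): it polls the
    // interrupt flag between units of work and bails out promptly.
    static Callable<Integer> newTask(int units) {
        return () -> {
            int processed = 0;
            for (int i = 0; i < units; i++) {
                if (Thread.currentThread().isInterrupted()) {
                    throw new InterruptedException(
                        "cancelled after " + processed + " items");
                }
                processed++; // stand-in for one unit of real work
            }
            return processed;
        };
    }
}
```

The key design choice is checking the flag at natural boundaries (per item, per batch), so cancellation latency is bounded by the size of one unit of work.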

A better approach: Pass a deadline or timeout token to your task itself:

@FunctionalInterface
interface TimeoutAwareTask<T> {
    T execute(Deadline deadline) throws InterruptedException;
}

// Capture the deadline at submission time, not inside the task,
// so time spent waiting in the queue counts against it.
Deadline deadline = Deadline.ofMillis(System.currentTimeMillis() + 5000);

executor.submit(() -> {
    if (deadline.isExpired()) {
        log.debug("Task timed out before execution, skipping");
        return;
    }
    expensiveOperation(deadline);
});

Now your task knows when the client gave up and can bail out early.
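Deadline is not a JDK class — treat it as a small helper you'd write yourself. A minimal version consistent with the usage above:

```java
// A minimal Deadline helper (not part of the JDK), matching the usage above.
final class Deadline {
    private final long expiresAtMillis;

    private Deadline(long expiresAtMillis) {
        this.expiresAtMillis = expiresAtMillis;
    }

    // Wraps an absolute wall-clock expiry time, in epoch millis.
    static Deadline ofMillis(long epochMillis) {
        return new Deadline(epochMillis);
    }

    boolean isExpired() {
        return System.currentTimeMillis() >= expiresAtMillis;
    }

    // Useful for propagating the remaining budget into driver timeouts.
    long remainingMillis() {
        return Math.max(0, expiresAtMillis - System.currentTimeMillis());
    }
}
```

`remainingMillis()` lets the task pass its leftover budget down to query or socket timeouts, so downstream calls also respect the client's deadline.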

Additional Improvements

Beyond bounded queues, consider:

1. Thread Pool Size Optimization

// CPU-bound work
int cpuBoundThreads = Runtime.getRuntime().availableProcessors();

// IO-bound work (database calls, network)
int ioBoundThreads = Runtime.getRuntime().availableProcessors() * 2;

IO-bound tasks spend time waiting, so more threads are useful.
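That ×2 multiplier is a crude default. A more principled rule of thumb (popularized by Java Concurrency in Practice) sizes an IO-bound pool by the ratio of wait time to compute time — sketched here with illustrative numbers:

```java
public class PoolSizing {

    // threads ≈ cores * (1 + waitTime / computeTime)
    // A task that waits 90ms on IO for every 10ms of CPU can keep
    // far more threads productive than there are cores.
    static int ioPoolSize(int cores, double waitMillis, double computeMillis) {
        return (int) (cores * (1 + waitMillis / computeMillis));
    }

    public static void main(String[] args) {
        // 8 cores, 90ms of waiting per 10ms of compute → 80 threads
        System.out.println(PoolSizing.ioPoolSize(8, 90, 10));
    }
}
```

Measure the wait/compute ratio from production traces rather than guessing; the formula is only as good as those two numbers.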

2. Separate Thread Pools by Workload

Don't use one executor for everything. Separate concerns:

// Fast, non-blocking work
ExecutorService fastExecutor = new ThreadPoolExecutor(
    10, 10, 60, TimeUnit.SECONDS,
    new LinkedBlockingQueue<>(1000),
    new CallerRunsPolicy()
);

// Slow IO work (database queries)
ExecutorService slowExecutor = new ThreadPoolExecutor(
    20, 20, 60, TimeUnit.SECONDS,
    new LinkedBlockingQueue<>(5000),
    new CallerRunsPolicy()
);

One slow dependency doesn't starve the fast paths.

3. Spring Boot Configuration (If You Use Spring)

spring.task.execution.pool.core-size: 10
spring.task.execution.pool.max-size: 20
spring.task.execution.pool.queue-capacity: 1000
spring.task.execution.pool.allow-core-thread-timeout: true

Add a custom executor bean for control:

@Configuration
public class ExecutorConfig {

    @Bean(name = "taskExecutor")
    public ThreadPoolTaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(20);
        executor.setQueueCapacity(1000);
        executor.setRejectedExecutionHandler(
            new ThreadPoolExecutor.CallerRunsPolicy()
        );
        executor.initialize();
        return executor;
    }
}

4. Monitoring and Alerting

Track these metrics religiously:

@Component
public class ExecutorMetrics {

    private final ThreadPoolTaskExecutor executor;

    // @Scheduled methods must take no arguments, so inject the executor instead
    public ExecutorMetrics(@Qualifier("taskExecutor") ThreadPoolTaskExecutor executor) {
        this.executor = executor;
    }

    @Scheduled(fixedRate = 5000)
    public void logExecutorStats() {
        log.info(
            "Executor stats - Active: {}, Queue: {}, Completed: {}",
            executor.getActiveCount(),
            executor.getThreadPoolExecutor().getQueue().size(),
            executor.getThreadPoolExecutor().getCompletedTaskCount()
        );
    }
}

Alert if:

  • Queue size consistently > 50% of capacity
  • Active threads = maxPoolSize (means you're at capacity)
  • Task rejection rate increases
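The first alert needs a queue-utilization number. For a bounded queue you can derive it from size() and remainingCapacity() — a sketch (the values are a point-in-time snapshot, which is fine for alerting):

```java
import java.util.concurrent.*;

public class QueueUtilization {

    // Fraction of queue capacity in use, for any bounded work queue.
    // size() and remainingCapacity() are read separately, so the result
    // is approximate under concurrency — good enough for alert thresholds.
    static double utilization(ThreadPoolExecutor executor) {
        BlockingQueue<Runnable> queue = executor.getQueue();
        int used = queue.size();
        int capacity = used + queue.remainingCapacity();
        return capacity == 0 ? 0.0 : (double) used / capacity;
    }
}
```

Feed this into whatever gauge your metrics system uses and alert when it sits above 0.5 across several consecutive samples, not on a single spike.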

The Mindset Shift

Here's what separates services that crash under load from those that degrade gracefully:

Dangerous mindset: "I'll use an unbounded queue as a buffer."

Safe mindset: "My queue has a limit. When I hit that limit, I stop accepting new work and return to the caller that I'm overloaded."

The second approach feels harsh—you're rejecting requests. But that's better than crashing. A rejection is honest; a crash is a lie.

One More Thing

Thread pool tuning is empirical, not theoretical. The "perfect" size for your executor depends on your latency profile, your hardware, and your workload.

Start with bounded queues and reasonable defaults. Deploy. Monitor. Adjust based on production behavior.

And if you see memory climbing during a latency spike, you already know what to look for: check your executor configuration. Odds are, somewhere there's an unbounded queue quietly queuing up requests until your JVM runs out of memory.

The fix is simple. The prevention is simpler. The cost of not doing it is expensive.
