DEV Community

Nitin N.

I Thought I Understood Async — Until I Built a Real Task Runner in Node.js

If you’ve ever written this:

await Promise.all(tasks.map(t => t()));

You’ve probably told yourself:

“That’s concurrency.”

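I used to believe that too.

It is concurrency, but the unbounded kind: every task starts immediately, nothing is retried, and the first rejection rejects the whole batch while the other tasks keep running.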
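To make that concrete, here is a small sketch that counts how many tasks are running at the same moment under the `Promise.all` pattern. The names `measurePeakInFlight` and `sleep` are made up for this illustration and the 10 ms delay is an arbitrary stand-in for real I/O.

```javascript
// Hypothetical helper: resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `taskCount` dummy tasks through Promise.all and reports the highest
// number of tasks that were in flight at the same time.
async function measurePeakInFlight(taskCount) {
  let inFlight = 0; // tasks currently running
  let peak = 0; // highest value inFlight ever reached

  const makeTask = () => async () => {
    inFlight += 1;
    peak = Math.max(peak, inFlight);
    await sleep(10); // stand-in for real I/O
    inFlight -= 1;
  };

  const tasks = Array.from({ length: taskCount }, makeTask);

  // map() invokes every task synchronously, before anything is awaited,
  // so all of them are already running when Promise.all starts waiting.
  await Promise.all(tasks.map((t) => t()));
  return peak;
}

measurePeakInFlight(50).then((peak) => {
  console.log(`peak in-flight tasks: ${peak}`); // prints "peak in-flight tasks: 50"
});
```

The peak always equals the number of tasks, which is exactly the behavior a task runner with a ceiling has to prevent.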

Then I tried to build something that behaved like a real system — not a tutorial.

What I wanted was simple on paper:
A Node.js task runner that could:

  • Run tasks concurrently
  • Retry failures
  • Never overload the system
  • Always reach a terminal state

What I got instead was a lesson in engineering humility.


The Problem That Started It

I wasn’t trying to optimize performance.
I was trying to build trust into my system.

In production, this is what really matters:

  • APIs fail
  • Networks lie
  • Logs deceive
  • Retries spiral out of control

So I asked myself:

If this were a job queue or microservice worker — would I trust it at 3 AM?

That question shaped everything.


What a “Task” Really Is

In my system, a task is not a promise.

It’s a function that returns a promise.

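Why that matters:
Retries require re-execution, not re-awaiting. A promise settles exactly once, so awaiting a rejected promise again returns the same rejection. Only calling the function again starts a new attempt.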

type Task<T> = () => Promise<T>;

This single design choice eliminated:

  • Stale promises
  • Broken retries
  • Phantom concurrency bugs
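
The difference between re-awaiting and re-executing can be shown in a few lines. This is only an illustrative sketch: `makeFlaky`, `reAwait`, `reExecute`, and `demo` are hypothetical names, and the flaky operation is assumed to fail on its first call and succeed on the second.

```javascript
// Hypothetical flaky operation: each instance rejects on its first call
// and resolves on every call after that.
const makeFlaky = () => {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls < 2) throw new Error(`attempt ${calls} failed`);
    return `ok after ${calls} attempts`;
  };
};

// Re-awaiting: the promise has already settled, so every attempt
// observes the same rejection.
async function reAwait(promise, times) {
  let lastError;
  for (let i = 0; i < times; i += 1) {
    try {
      return await promise;
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Re-executing: each attempt calls the task again and gets a fresh promise.
async function reExecute(task, times) {
  let lastError;
  for (let i = 0; i < times; i += 1) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

async function demo() {
  // Pass a promise (already started) versus a function (not started yet).
  const stale = await reAwait(makeFlaky()(), 3).catch((err) => err.message);
  const fresh = await reExecute(makeFlaky(), 3);
  return { stale, fresh };
}

demo().then((result) => console.log(result));
// prints { stale: 'attempt 1 failed', fresh: 'ok after 2 attempts' }
```

Three "retries" of the same promise never get past the first failure, while retrying the function succeeds on the second attempt.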

The Fake API That Told the Truth

Instead of mocking success, I mocked reality:

// Simulates an API call: settles after `delay` ms and logs how long it took.
// `shouldSucceed` decides whether the promise resolves with `id` or rejects.
const fakeApi = (id, delay, shouldSucceed) => {
  const start = Date.now();
  console.info(`STARTING Task ID: ${id}`);

  return new Promise((resolve, reject) => {
    setTimeout(() => {
      const end = Date.now();
      if (shouldSucceed) {
        console.log(`SUCCEEDED Task ID: ${id} | Duration ${end - start}ms`);
        resolve(id);
      } else {
        console.error(`FAILED Task ID: ${id} | Duration ${end - start}ms`);
        reject(new Error(`${id} failed`));
      }
    }, delay);
  });
};

This gave me:

  • Real durations
  • Real failures
  • Real concurrency pressure

No illusions. Only evidence.


The Real Engineering Problem

The hard part wasn’t “running tasks.”

It was enforcing three laws:

1. Concurrency is a Hard Ceiling

Never more than MAX_CONCURRENT_TASKS running — even during retries.

2. Retries Must Be Deterministic

A failed task can retry, but it must never be in flight twice at the same time.

3. Everything Must End

Every task must reach:

  • Success
  • Or retry exhaustion

No zombies. No silent failures.
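
One way to satisfy all three laws is a fixed pool of workers that each own a task for its whole lifetime, retries included. The sketch below is an assumption about how such a runner could look, not the code from the repo: `runTasks`, `maxConcurrent`, `maxRetries`, `delayTask`, and the result shape are all hypothetical, and `maxRetries` is assumed to be zero or more.

```javascript
// A minimal sketch of a bounded task runner with retries.
// tasks: array of functions that each return a promise.
async function runTasks(tasks, { maxConcurrent = 2, maxRetries = 2 } = {}) {
  const results = new Array(tasks.length); // one terminal record per task
  let next = 0; // index of the next unclaimed task

  async function worker() {
    while (next < tasks.length) {
      // Claim a task synchronously, before any await, so that no two
      // workers can ever pick the same index.
      const index = next;
      next += 1;

      // The worker keeps the task through all of its attempts, so the
      // task is never in flight twice (law 2).
      let outcome;
      for (let attempt = 1; attempt <= maxRetries + 1; attempt += 1) {
        try {
          const value = await tasks[index]();
          outcome = { status: "succeeded", value, attempts: attempt };
          break;
        } catch (error) {
          // Provisional record; replaced if a later attempt succeeds.
          outcome = { status: "exhausted", error, attempts: attempt };
        }
      }
      results[index] = outcome;
    }
  }

  // Law 1: only this many workers exist, and each runs one attempt at a
  // time, so the ceiling also holds while tasks are retrying.
  const workerCount = Math.min(maxConcurrent, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));

  // Law 3: workers never throw, so every slot holds a terminal state here.
  return results;
}

// Hypothetical quiet task factory used only for this demo.
const delayTask = (id, ms, ok) => () =>
  new Promise((resolve, reject) => {
    setTimeout(() => (ok ? resolve(id) : reject(new Error(`${id} failed`))), ms);
  });

runTasks(
  [delayTask(1, 30, true), delayTask(2, 10, false), delayTask(3, 20, true)],
  { maxConcurrent: 2, maxRetries: 1 }
).then((results) => {
  console.log(results.map((r) => `${r.status} after ${r.attempts}`));
  // prints [ 'succeeded after 1', 'exhausted after 2', 'succeeded after 1' ]
});
```

The `fakeApi` function from earlier plugs in the same way, for example `() => fakeApi(2, 500, false)`. Keeping retries inside the worker trades some throughput for simplicity: a failing task occupies its slot until it succeeds or runs out of attempts, but there is no separate retry queue that could exceed the ceiling.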


The Breakthrough Moment

The moment it clicked wasn’t when my code worked.

It was when my logs told a story:

STARTING Task ID: 2
FAILED Task ID: 2
STARTING Task ID: 2
FAILED Task ID: 2

That wasn’t noise.

That was controlled failure.

That’s what real systems do.


What This Taught Me About Backend Engineering

Concurrency isn’t about parallelism.
It’s about governance.

Retries aren’t about recovery.
They’re about discipline.

Logs aren’t for debugging.
They’re for truth.


Why This Pattern Matters in Production

This architecture maps directly to:

  • Job queues
  • Worker pools
  • Microservice orchestrators
  • API batch processors
  • Event-driven systems

If you can control this — you can control load, cost, and reliability.


Final Thought

Most developers learn async.

Engineers learn systems behavior.

This project pushed me from one side to the other.


Repo link - Concurrency Task Runner
