Dhruv Patel

Posted on Jun 28

What I Learned After Building a Redis Queue Feature in SyncFlow

#software #architecture #typescript #saas

While working on SyncFlow, a Shopify embedded app, I recently completed US-002 — Redis Queue Foundation.

The purpose of this feature was simple:

Move inventory sync work away from the main request flow and process it in the background using Redis-backed queues.

At first, the feature looked complete. The queue files were added, Redis configuration was created, producers and workers were implemented, retry handling was introduced, and logging was added.

But after reviewing the uncommitted changes properly, I learned something important:

A feature is not complete when the code is written.
A feature is complete when it is reliable, reproducible, testable, and safe to run.

What the Feature Was Supposed to Do

The goal of US-002 was to create a foundation for background job processing.

The feature included:

Redis connection setup
Queue configuration
Primary sync queue
Retry queue
Dead-letter queue
Queue producer
Queue worker
Queue health check
Structured queue logging

The idea was to make inventory sync more reliable.

Instead of blocking the app request while sync logic runs, the app can now publish a job to a queue. A worker can process that job separately. If the job fails, it can be retried. If it keeps failing, it can move to a dead-letter queue.

That sounds good in theory.

But the review showed that implementation details matter a lot.

Lesson 1: TypeScript Errors Can Block the Entire Feature

One of the first issues found was a TypeScript syntax error in the queue health file.

Because of that, npm run typecheck failed before the app could build.

This was a clear reminder that type checking is not optional in a TypeScript project.

Even if the logic looks correct, a single syntax or type error can stop the whole application from building, which means it cannot be production-ready.

Before committing a feature, I should always run:

npm run typecheck
npm run build

For backend-heavy features, I should also run the worker process locally and confirm that the app and worker can both start correctly.

Lesson 2: Dependency Management Must Be Reproducible

The feature added new packages like:

bullmq
ioredis
pino
tsx

But the lockfile was not updated correctly.

There was also an untracked pnpm-lock.yaml, while the project was using package-lock.json.

This creates a serious problem.

If another developer pulls the repo, installs dependencies, and runs the project, their environment may not match mine.

The lesson is simple:

Pick one package manager and stay consistent.

For example, if the project uses npm, commit:

package.json
package-lock.json

And remove accidental lockfiles like:

pnpm-lock.yaml

A feature is not reproducible if dependencies are not locked correctly.

Lesson 3: Health Endpoints Should Report Failure, Not Crash

A health endpoint should help debug the system.

It should not become another reason the app crashes.

In this feature, the health route had two problems.

First, it imported from a package that was not declared in the project. Second, the Redis connection code could throw immediately if REDIS_URL was missing.

That means the health endpoint might crash instead of returning a clean unhealthy response.

A better health endpoint should return something like:

{
  "status": "unhealthy",
  "redis": "missing_config",
  "worker": "not_ready"
}

With an HTTP status like:

503 Service Unavailable

The lesson:

Health checks should be defensive. Their job is to explain what is broken, not fail silently or crash the route.
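A minimal sketch of that defensive shape, under stated assumptions: `checkQueueHealth`, `pingRedis`, and `isWorkerReady` are hypothetical names rather than SyncFlow's actual code, and the `"unreachable"` state is an illustrative addition to the response shown earlier. The Redis ping is passed in so the check can catch its failure instead of letting it escape the route.

```typescript
// Hypothetical types; only "missing_config" and "not_ready" appear in the article's example.
type RedisState = "ok" | "missing_config" | "unreachable";
type WorkerState = "ready" | "not_ready";

interface HealthResult {
  httpStatus: 200 | 503;
  body: { status: "healthy" | "unhealthy"; redis: RedisState; worker: WorkerState };
}

interface HealthDeps {
  redisUrl: string | undefined;              // e.g. the value of REDIS_URL
  pingRedis: (url: string) => Promise<void>; // assumed helper; rejects when Redis is down
  isWorkerReady: () => boolean;              // assumed helper
}

async function checkQueueHealth(deps: HealthDeps): Promise<HealthResult> {
  // Missing configuration is reported as a state, never thrown.
  let redis: RedisState = "missing_config";
  if (deps.redisUrl) {
    try {
      await deps.pingRedis(deps.redisUrl);
      redis = "ok";
    } catch {
      redis = "unreachable";
    }
  }

  // A failing readiness probe counts as "not ready" instead of crashing the route.
  let worker: WorkerState = "not_ready";
  try {
    if (deps.isWorkerReady()) worker = "ready";
  } catch {
    worker = "not_ready";
  }

  const healthy = redis === "ok" && worker === "ready";
  return {
    httpStatus: healthy ? 200 : 503,
    body: { status: healthy ? "healthy" : "unhealthy", redis, worker },
  };
}
```

The route handler then only has to serialize `body` and set `httpStatus`; every failure path has already been turned into data.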

Lesson 4: Every Queue Needs a Consumer

The retry queue was created, and jobs could be added to it.

But only the main sync queue had a worker.

That means retry jobs could sit inside the retry queue forever.

This is a common design mistake in queue-based systems.

Creating a queue is not enough.

Every queue must have a clear data flow:

Who writes to it?
Who reads from it?
When does the job move?
What happens if it fails?
What happens after max retries?
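One way to keep those questions answered is to write the data flow down as data and check it. The sketch below is only an illustration: the queue names, the `QueueSpec` shape, and `queuesWithoutConsumer` are all hypothetical, and the topology mirrors the situation described in this lesson, where the retry queue has no reader.

```typescript
// Hypothetical description of each queue's data flow.
interface QueueSpec {
  producers: string[];      // who writes to it
  consumer: string | null;  // who reads from it (null = nobody)
  onFailure: string | null; // where a failed job goes next
}

// Assumed queue names, for illustration only.
const topology: Record<string, QueueSpec> = {
  "inventory-sync": {
    producers: ["app"],
    consumer: "sync-worker",
    onFailure: "inventory-sync-retry",
  },
  "inventory-sync-retry": {
    producers: ["sync-worker"],
    consumer: null, // the gap found in review
    onFailure: "inventory-sync-dead-letter",
  },
  "inventory-sync-dead-letter": {
    producers: ["sync-worker"],
    consumer: "manual-inspection",
    onFailure: null,
  },
};

// Returns the names of queues that nothing reads from.
function queuesWithoutConsumer(t: Record<string, QueueSpec>): string[] {
  return Object.keys(t).filter((name) => t[name].consumer === null);
}
```

Running a check like this in a test or at startup turns "a queue nobody reads" from a review finding into an immediate failure.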

For a retry queue, there are usually two options.

One option is to use the queue system’s built-in retry mechanism.

Another option is to create a separate retry worker that moves jobs back to the main queue after a delay.

Without that, retry logic only exists on paper.
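Whichever option is chosen, something has to decide what happens after each failure. BullMQ, which this feature added as a dependency, exposes `attempts` and `backoff` job options for the built-in route. The sketch below is not BullMQ's implementation; it is a dependency-free illustration of the decision itself, with a hypothetical `decideAfterFailure` helper and an assumed exponential backoff.

```typescript
// What to do with a job after a failed attempt.
type RetryDecision =
  | { action: "retry"; delayMs: number }
  | { action: "dead_letter" };

// attemptsMade counts the attempt that just failed, so it starts at 1.
function decideAfterFailure(
  attemptsMade: number,
  maxAttempts: number,
  baseDelayMs: number,
): RetryDecision {
  if (attemptsMade >= maxAttempts) {
    // Retries are exhausted: park the job where a human can inspect it.
    return { action: "dead_letter" };
  }
  // Assumed policy: double the delay after every failed attempt.
  return { action: "retry", delayMs: baseDelayMs * 2 ** (attemptsMade - 1) };
}
```

With `maxAttempts = 3` and `baseDelayMs = 1000`, the first failure waits 1000 ms, the second waits 2000 ms, and the third sends the job to the dead-letter queue. A separate retry worker would apply the same decision before moving a job back to the main queue.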

Lesson 5: Job Types Should Be Strict

The job payload allowed the job type to be any string.

But the producer always added the job as an inventory sync job.

That creates a mismatch.

A product sync payload could accidentally be queued as an inventory sync job.

This is where TypeScript should help.

Instead of using a loose string, the job type should be constrained.

Example:

type SyncJobType = "INVENTORY_SYNC" | "PRODUCT_SYNC";

Or, if the allowed values are also needed at runtime (for example, to validate incoming data), use an enum or a constant map.

The lesson:

If the system only supports specific job types, the type system should enforce that.

Loose strings make the code flexible, but also easier to break.
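A small sketch of how the type system can close that gap, assuming a discriminated union for payloads. The payload fields, the `JOB_NAMES` map, and the `QueueLike` interface are hypothetical stand-ins, not the project's real definitions. The key idea is that the producer derives the job name from the payload, so the two cannot drift apart.

```typescript
// Each job type carries its own payload shape (field names are assumptions).
type SyncJob =
  | { type: "INVENTORY_SYNC"; shopId: string; inventoryItemId: string }
  | { type: "PRODUCT_SYNC"; shopId: string; productId: string };

// Constant map from job type to queue job name (names are assumptions).
const JOB_NAMES = {
  INVENTORY_SYNC: "inventory-sync",
  PRODUCT_SYNC: "product-sync",
} as const;

// Minimal stand-in for whatever queue client the project uses.
interface QueueLike {
  add(name: string, data: SyncJob): unknown;
}

// The job name is looked up from the payload instead of being hard-coded.
function enqueueSyncJob(queue: QueueLike, job: SyncJob): string {
  const name = JOB_NAMES[job.type];
  queue.add(name, job);
  return name;
}
```

With this shape, passing `{ type: "ORDER_SYNC", ... }` is a compile-time error, and a product sync payload can no longer be queued under the inventory sync name.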

Lesson 6: Environment Variables Need Validation

Some queue environment variables were typed as required strings, but the runtime code treated them as optional and provided defaults.

There was also a risk with numeric environment variables.

For example:

QUEUE_CONCURRENCY=abc

If the code uses Number(process.env.QUEUE_CONCURRENCY), this becomes:

NaN

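NaN is not a usable concurrency setting. Depending on how the worker handles it, the process may refuse to start or may process jobs in a way nobody intended.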

The better approach is to validate environment variables clearly.

For example:

If REDIS_URL is required, fail startup with a clear error.
If QUEUE_CONCURRENCY is optional, document the default.
If a number is invalid, reject it early.

The lesson:

Environment variables are part of the application contract. They need validation, not just typing.
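The three rules above can be sketched as a small loader. This is an illustration under assumptions: `loadQueueConfig` and its helpers are hypothetical names, and the default concurrency of 5 is made up for the example. The environment is passed in as a plain object so the function is easy to test; in the app it would typically receive `process.env`.

```typescript
type Env = Record<string, string | undefined>;

interface QueueConfig {
  redisUrl: string;
  concurrency: number;
}

// Assumed default, documented in one place.
const DEFAULT_CONCURRENCY = 5;

// Required values fail fast with a message that names the variable.
function requireString(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`${key} is required but was not set`);
  }
  return value;
}

// Optional numbers fall back to a default, but invalid input is rejected.
function optionalPositiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw === "") return fallback;
  if (!/^[1-9]\d*$/.test(raw)) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return Number(raw);
}

function loadQueueConfig(env: Env): QueueConfig {
  return {
    redisUrl: requireString(env, "REDIS_URL"),
    concurrency: optionalPositiveInt(env, "QUEUE_CONCURRENCY", DEFAULT_CONCURRENCY),
  };
}
```

Calling this once at startup means `QUEUE_CONCURRENCY=abc` stops the process with a readable error instead of reaching the worker as NaN.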

Lesson 7: Public Health Routes Can Leak Internal State

The queue health endpoint exposed operational details publicly.

Queue counts may not be secret, but they still reveal internal system information.

For a public Shopify app URL, this matters.

Health endpoints can expose:

Whether Redis is connected
Whether workers are running
Queue names
Job counts
Failure patterns

For internal monitoring, that is useful.

For public access, that can become unnecessary exposure.

The lesson:

Operational routes should be protected.

Possible solutions include:

Shared secret header
Admin-only access
Internal-only route
Reduced public response
Separate public and private health checks
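As a rough sketch of two of those options combined (a shared secret plus a reduced public response): callers without the secret only learn whether the service is up, while callers with it get the operational detail. Everything named here is hypothetical, including `buildHealthResponse` and the idea of reading the secret from a request header. The comparison loop avoids an early exit on the first mismatch; in Node, `crypto.timingSafeEqual` is the usual tool for that job.

```typescript
interface QueueHealthDetail {
  redisConnected: boolean;
  workerRunning: boolean;
  jobCounts: Record<string, number>;
}

type PublicHealth = { status: "ok" | "degraded" };
type PrivateHealth = PublicHealth & QueueHealthDetail;

// Compares every character so the loop does not stop at the first difference.
function secretsMatch(provided: string | undefined, expected: string): boolean {
  if (expected === "" || provided === undefined) return false; // no secret configured or sent
  if (provided.length !== expected.length) return false;
  let diff = 0;
  for (let i = 0; i < expected.length; i++) {
    diff |= provided.charCodeAt(i) ^ expected.charCodeAt(i);
  }
  return diff === 0;
}

function buildHealthResponse(
  providedSecret: string | undefined, // e.g. taken from a request header
  expectedSecret: string,             // e.g. taken from configuration
  detail: QueueHealthDetail,
): PublicHealth | PrivateHealth {
  const status = detail.redisConnected && detail.workerRunning ? "ok" : "degraded";
  if (!secretsMatch(providedSecret, expectedSecret)) {
    return { status }; // reduced public response: no queue names or counts
  }
  return { status, ...detail };
}
```

An unconfigured (empty) secret is treated as "nobody is authorized", so a missing setting cannot accidentally open up the detailed view.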

Lesson 8: Small Untracked Files Matter

There were accidental empty files in the project.

They looked like command artifacts.

This may seem small, but repository hygiene matters.

Before committing, I should always check:

git status

And remove files that do not belong in the project.

A clean commit should contain only intentional changes.

My New Feature Completion Checklist

After this review, my definition of “feature complete” has changed.

Before committing a backend feature, I should verify:

npm install
npm run typecheck
npm run build
git status

Then I should manually confirm:

App starts successfully
Worker starts successfully
Required environment variables are documented
Invalid environment values fail clearly
Health endpoint returns useful output
Every queue has a consumer or processing strategy
Failed jobs have a defined retry path
Dead-letter jobs are visible for debugging
Logs do not expose sensitive data
Lockfile matches the selected package manager

Final Takeaway

Building the Redis Queue Foundation taught me that backend features are not just about writing logic.

They are about designing reliable systems.

A queue feature needs more than a producer and worker. It needs clear failure handling, strict types, reproducible dependencies, safe health checks, clean logs, and predictable runtime behavior.

The biggest lesson:

Code that works locally is not always production-ready.

A production-ready feature must be easy to run, easy to debug, safe to expose, and hard to break.

That is the real difference between just implementing a feature and engineering a feature properly.
