While working on SyncFlow, a Shopify embedded app, I recently completed US-002: Redis Queue Foundation.
The purpose of this feature was simple:
Move inventory sync work away from the main request flow and process it in the background using Redis-backed queues.
At first, the feature looked complete. The queue files were added, Redis configuration was created, producers and workers were implemented, retry handling was introduced, and logging was added.
But after reviewing the uncommitted changes properly, I learned something important:
A feature is not complete when the code is written.
A feature is complete when it is reliable, reproducible, testable, and safe to run.
What the Feature Was Supposed to Do
The goal of US-002 was to create a foundation for background job processing.
The feature included:
- Redis connection setup
- Queue configuration
- Primary sync queue
- Retry queue
- Dead-letter queue
- Queue producer
- Queue worker
- Queue health check
- Structured queue logging
The idea was to make inventory sync more reliable.
Instead of blocking the app request while sync logic runs, the app can now publish a job to a queue. A worker can process that job separately. If the job fails, it can be retried. If it keeps failing, it can move to a dead-letter queue.
That sounds good in theory.
But the review showed that implementation details matter a lot.
Lesson 1: TypeScript Errors Can Block the Entire Feature
One of the first issues found was a TypeScript syntax error in the queue health file.
Because of that, npm run typecheck failed before the app could build.
This was a clear reminder that type checking is not optional in a TypeScript project.
Even if the logic looks correct, a single syntax or type error can stop the whole application from building, let alone being production-ready.
Before committing a feature, I should always run:
npm run typecheck
npm run build
For backend-heavy features, I should also run the worker process locally and confirm that the app and worker can both start correctly.
Lesson 2: Dependency Management Must Be Reproducible
The feature added new packages like:
bullmq
ioredis
pino
tsx
But the lockfile was not updated correctly.
There was also an untracked pnpm-lock.yaml, while the project was using package-lock.json.
This creates a serious problem.
If another developer pulls the repo, installs dependencies, and runs the project, their environment may not match mine.
The lesson is simple:
Pick one package manager and stay consistent.
For example, if the project uses npm, commit:
package.json
package-lock.json
And remove accidental lockfiles like:
pnpm-lock.yaml
A feature is not reproducible if dependencies are not locked correctly.
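The lockfile rule can also be checked mechanically before a commit. The following is only a sketch: `findStrayLockfiles` and the list of known lockfile names are my own illustrative choices, not part of SyncFlow, and the file list would come from wherever you read the repository root.

```typescript
// Hypothetical helper: the names here are illustrative, not from the SyncFlow codebase.
const KNOWN_LOCKFILES: readonly string[] = [
  "package-lock.json",
  "pnpm-lock.yaml",
  "yarn.lock",
];

// Returns every known lockfile in `files` that is not the one the project has chosen.
function findStrayLockfiles(files: string[], expected: string): string[] {
  return files.filter(
    (file) => KNOWN_LOCKFILES.includes(file) && file !== expected,
  );
}

const stray = findStrayLockfiles(
  ["package.json", "package-lock.json", "pnpm-lock.yaml"],
  "package-lock.json",
);
console.log(stray); // logs the accidental pnpm-lock.yaml
```

A non-empty result is a signal to delete the extra lockfile before committing.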
Lesson 3: Health Endpoints Should Report Failure, Not Crash
A health endpoint should help debug the system.
It should not become another reason the app crashes.
In this feature, the health route had two problems.
First, it imported from a package that was not declared in the project. Second, the Redis connection code could throw immediately if REDIS_URL was missing.
That means the health endpoint might crash instead of returning a clean unhealthy response.
A better health endpoint should return something like:
{
"status": "unhealthy",
"redis": "missing_config",
"worker": "not_ready"
}
With an HTTP status like:
503 Service Unavailable
The lesson:
Health checks should be defensive. Their job is to explain what is broken, not fail silently or crash the route.
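A minimal sketch of a defensive health check, assuming the Redis ping and the worker readiness check are passed in as functions. `checkQueueHealth`, `pingRedis`, `isWorkerReady`, and the extra state names (`connected`, `unreachable`, `ready`) are hypothetical; only the `missing_config` / `not_ready` response shape and the 503 status come from the review above.

```typescript
type HealthBody = {
  status: "healthy" | "unhealthy";
  redis: "connected" | "missing_config" | "unreachable";
  worker: "ready" | "not_ready";
};

type HealthResult = { httpStatus: 200 | 503; body: HealthBody };

// The checks are injected so this sketch does not depend on any specific Redis client.
async function checkQueueHealth(
  redisUrl: string | undefined,
  pingRedis: (url: string) => Promise<void>,
  isWorkerReady: () => boolean,
): Promise<HealthResult> {
  let redis: HealthBody["redis"];
  if (!redisUrl) {
    // Missing config is reported as a state instead of being thrown.
    redis = "missing_config";
  } else {
    try {
      await pingRedis(redisUrl);
      redis = "connected";
    } catch {
      redis = "unreachable";
    }
  }

  let worker: HealthBody["worker"] = "not_ready";
  try {
    if (isWorkerReady()) worker = "ready";
  } catch {
    // A failing readiness check counts as "not_ready".
  }

  const healthy = redis === "connected" && worker === "ready";
  return {
    httpStatus: healthy ? 200 : 503,
    body: { status: healthy ? "healthy" : "unhealthy", redis, worker },
  };
}
```

The route handler then only has to serialize `body` and set `httpStatus`, so no failure path inside the check can crash the route.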
Lesson 4: Every Queue Needs a Consumer
The retry queue was created, and jobs could be added to it.
But only the main sync queue had a worker.
That means retry jobs could sit inside the retry queue forever.
This is a common design mistake in queue-based systems.
Creating a queue is not enough.
Every queue must have a clear data flow:
- Who writes to it?
- Who reads from it?
- When does the job move?
- What happens if it fails?
- What happens after max retries?
For a retry queue, there are usually two options.
One option is to use the queue system’s built-in retry mechanism.
Another option is to create a separate retry worker that moves jobs back to the main queue after a delay.
Without that, retry logic only exists on paper.
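For the first option, BullMQ jobs accept `attempts` and `backoff` options, which may remove the need for a separate retry queue altogether. For the second option, the core decision a retry worker has to make can be sketched as a pure function. Everything here is an assumption for illustration: the name `decideAfterFailure`, the default of 5 attempts, and the 1-second base delay are not taken from SyncFlow.

```typescript
type RetryDecision =
  | { target: "retry"; delayMs: number }
  | { target: "dead-letter" };

// `attemptsMade` is the number of attempts that have already failed
// (1 after the first failure).
function decideAfterFailure(
  attemptsMade: number,
  maxAttempts = 5,
  baseDelayMs = 1000,
): RetryDecision {
  if (attemptsMade >= maxAttempts) {
    // Out of attempts: park the job where a human can inspect it.
    return { target: "dead-letter" };
  }
  // Exponential backoff: base, 2x base, 4x base, ...
  return { target: "retry", delayMs: baseDelayMs * 2 ** (attemptsMade - 1) };
}

console.log(decideAfterFailure(1)); // retry with the base delay
console.log(decideAfterFailure(5)); // dead-letter
```

Keeping the decision separate from the queue calls makes the answers to "when does the job move?" and "what happens after max retries?" explicit and easy to test.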
Lesson 5: Job Types Should Be Strict
The job payload allowed the job type to be any string.
But the producer always added the job as an inventory sync job.
That creates a mismatch.
A product sync payload could accidentally be queued as an inventory sync job.
This is where TypeScript should help.
Instead of using a loose string, the job type should be constrained.
Example:
type SyncJobType = "INVENTORY_SYNC" | "PRODUCT_SYNC";
Or better, use an enum or constant map.
The lesson:
If the system only supports specific job types, the type system should enforce that.
Loose strings make the code flexible, but also easier to break.
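One way to sketch the constant-map idea is a discriminated union plus a lookup table keyed by job type. The payload fields (`shopId`, `inventoryItemIds`, `productIds`) and the queue job names are hypothetical; only the two job type strings come from the example above.

```typescript
// Hypothetical payload shapes, for illustration only.
type InventorySyncJob = {
  type: "INVENTORY_SYNC";
  shopId: string;
  inventoryItemIds: string[];
};
type ProductSyncJob = {
  type: "PRODUCT_SYNC";
  shopId: string;
  productIds: string[];
};
type SyncJob = InventorySyncJob | ProductSyncJob;

// The Record type forces an entry for every job type; a missing key is a compile error.
const QUEUE_JOB_NAMES: Record<SyncJob["type"], string> = {
  INVENTORY_SYNC: "inventory-sync",
  PRODUCT_SYNC: "product-sync",
};

// The job name is derived from the payload, so the two cannot disagree.
function jobNameFor(job: SyncJob): string {
  return QUEUE_JOB_NAMES[job.type];
}

// Runtime guard for values that arrive as plain strings (for example from a webhook).
function isSyncJobType(value: string): value is SyncJob["type"] {
  return Object.prototype.hasOwnProperty.call(QUEUE_JOB_NAMES, value);
}
```

With this shape, a producer that calls `jobNameFor(job)` can no longer queue a product sync payload under the inventory sync name.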
Lesson 6: Environment Variables Need Validation
Some queue environment variables were typed as required strings, but the runtime code treated them as optional and provided defaults.
There was also a risk with numeric environment variables.
For example:
QUEUE_CONCURRENCY=abc
If the code uses Number(process.env.QUEUE_CONCURRENCY), this becomes:
NaN
That can create unexpected worker behavior.
The better approach is to validate environment variables clearly.
For example:
- If REDIS_URL is required, fail startup with a clear error.
- If QUEUE_CONCURRENCY is optional, document the default.
- If a number is invalid, reject it early.
The lesson:
Environment variables are part of the application contract. They need validation, not just typing.
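The three rules above can be sketched as a single loader that runs once at startup. `loadQueueConfig` and the default concurrency of 5 are assumptions for illustration; the environment is passed in as a plain object so the function is easy to test.

```typescript
type Env = Record<string, string | undefined>;
type QueueConfig = { redisUrl: string; concurrency: number };

// Assumed default; the real value should be whatever the project documents.
const DEFAULT_CONCURRENCY = 5;

function loadQueueConfig(env: Env): QueueConfig {
  const redisUrl = env["REDIS_URL"];
  if (!redisUrl) {
    throw new Error("REDIS_URL is required but was not set");
  }

  const raw = env["QUEUE_CONCURRENCY"];
  if (raw === undefined || raw === "") {
    // Optional variable: fall back to the documented default.
    return { redisUrl, concurrency: DEFAULT_CONCURRENCY };
  }

  const concurrency = Number(raw);
  // Number("abc") is NaN, which fails the integer check and is rejected here.
  if (!Number.isInteger(concurrency) || concurrency < 1) {
    throw new Error(`QUEUE_CONCURRENCY must be a positive integer, got "${raw}"`);
  }
  return { redisUrl, concurrency };
}
```

Calling `loadQueueConfig(process.env)` before the app or worker starts turns a bad value into one clear startup error instead of odd worker behavior later.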
Lesson 7: Public Health Routes Can Leak Internal State
The queue health endpoint exposed operational details publicly.
Queue counts may not be secret, but they still reveal internal system information.
For a public Shopify app URL, this matters.
Health endpoints can expose:
- Whether Redis is connected
- Whether workers are running
- Queue names
- Job counts
- Failure patterns
For internal monitoring, that is useful.
For public access, that can become unnecessary exposure.
The lesson:
Operational routes should be protected.
Possible solutions include:
- Shared secret header
- Admin-only access
- Internal-only route
- Reduced public response
- Separate public and private health checks
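A minimal sketch combining two of these options: a shared secret plus a reduced public response. The function names, the report fields, and the idea of reading the secret from a request header are all hypothetical. The comparison hashes both values first so that Node's `timingSafeEqual` always receives buffers of equal length.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical report shapes, for illustration only.
type FullReport = {
  status: "healthy" | "unhealthy";
  redis: string;
  worker: string;
  waitingJobs: number;
};
type PublicReport = { status: "healthy" | "unhealthy" };

// Constant-time comparison of two secrets of arbitrary length.
function secretsMatch(provided: string, expected: string): boolean {
  const a = createHash("sha256").update(provided).digest();
  const b = createHash("sha256").update(expected).digest();
  return timingSafeEqual(a, b);
}

// Returns the full report only to callers that present the configured secret.
function healthResponseFor(
  providedSecret: string | undefined,
  expectedSecret: string | undefined,
  report: FullReport,
): FullReport | PublicReport {
  const trusted =
    !!expectedSecret &&
    providedSecret !== undefined &&
    secretsMatch(providedSecret, expectedSecret);
  return trusted ? report : { status: report.status };
}
```

If no secret is configured at all, the sketch falls back to the reduced response rather than exposing everything.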
Lesson 8: Small Untracked Files Matter
There were accidental empty files in the project.
They looked like command artifacts.
This may seem small, but repository hygiene matters.
Before committing, I should always check:
git status
And remove files that do not belong in the project.
A clean commit should contain only intentional changes.
My New Feature Completion Checklist
After this review, my definition of “feature complete” has changed.
Before committing a backend feature, I should run:
npm install
npm run typecheck
npm run build
git status
Then I should manually confirm:
- App starts successfully
- Worker starts successfully
- Required environment variables are documented
- Invalid environment values fail clearly
- Health endpoint returns useful output
- Every queue has a consumer or processing strategy
- Failed jobs have a defined retry path
- Dead-letter jobs are visible for debugging
- Logs do not expose sensitive data
- Lockfile matches the selected package manager
Final Takeaway
Building the Redis Queue Foundation taught me that backend features are not just about writing logic.
They are about designing reliable systems.
A queue feature needs more than a producer and worker. It needs clear failure handling, strict types, reproducible dependencies, safe health checks, clean logs, and predictable runtime behavior.
The biggest lesson:
Code that works locally is not always production-ready.
A production-ready feature must be easy to run, easy to debug, safe to expose, and hard to break.
That is the real difference between just implementing a feature and engineering a feature properly.