
Asad Abdullah Zafar

Posted on • Originally published at kolachitech.com

Scaling Shopify Apps to Millions of Requests: 6 Architecture Layers That Actually Hold

Most Shopify apps are built for the average case. They break at the edge case.

At low traffic, a synchronous webhook handler, a single database connection pool, and a naive retry loop on 429s all look fine. They do not look fine when your app is serving 10,000 stores and a flash sale fires on all of them simultaneously.

This post covers the six architecture layers that determine whether a Shopify app survives genuine scale, from 10K to 1M+ requests per day.


Layer 1: Cost-Aware API Rate Limit Management

The Shopify GraphQL Admin API uses a leaky bucket model: 1,000 cost points per bucket, refilling at 50 points per second on standard plans. At scale, naive consumption drains the bucket and every subsequent request returns a 429 until it refills.

The fix is reading the query cost data returned with each response and throttling proactively, not reactively.

async function shopifyQuery(client, query, variables) {
  const response = await client.query({ data: { query, variables } });

  // The GraphQL Admin API returns cost data in the response body's
  // extensions, not in an HTTP response header
  const throttleStatus = response.body?.extensions?.cost?.throttleStatus;

  if (throttleStatus && throttleStatus.currentlyAvailable < 200) {
    // Wait long enough for the bucket to refill back above the
    // 200-point floor before sending the next query
    const refillTime =
      (200 - throttleStatus.currentlyAvailable) / throttleStatus.restoreRate;
    await new Promise((r) => setTimeout(r, refillTime * 1000));
  }
  return response.body;
}

React to 429s and you are already behind. Track bucket state and you never get there.


Layer 2: Four-Layer Caching Strategy

The fastest API call is the one you never make. A well-designed cache cuts Admin API consumption by 60 to 80 percent in most production apps.

| Cache Layer | What to Cache | TTL | Implementation |
| --- | --- | --- | --- |
| Storefront API | Product data, collections, metafields | 5 to 15 minutes | Built-in response cache |
| Redis (app layer) | Session tokens, shop config, variant inventory | 60 to 300 seconds | ioredis / Upstash |
| Edge cache (CDN) | Storefront pages, static API responses | Minutes to hours | Fastly / Cloudflare |
| In-memory (worker) | Shop plan data, feature flags, rate limit state | Worker lifetime | Node.js Map / LRU |

Important: use webhook events to invalidate cache entries on data changes. TTL-only expiry leaves stale data alive too long under high write volume.
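A minimal sketch of the read-through-plus-invalidation pattern, assuming an ioredis-compatible client. The key scheme and the `fetchProduct` helper are illustrative, not Shopify APIs:

```javascript
const TTL_SECONDS = 300;

// Read-through cache: serve from Redis when possible, otherwise
// fetch from the Admin API and cache the result with a TTL.
async function getProduct(redis, shopDomain, productId, fetchProduct) {
  const key = `cache:${shopDomain}:product:${productId}`;
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);

  const fresh = await fetchProduct(shopDomain, productId);
  await redis.set(key, JSON.stringify(fresh), 'EX', TTL_SECONDS);
  return fresh;
}

// Called from the products/update webhook handler: delete the entry
// immediately instead of waiting for the TTL to expire.
async function invalidateProduct(redis, shopDomain, productId) {
  await redis.del(`cache:${shopDomain}:product:${productId}`);
}
```

The TTL is the safety net; the webhook-driven delete is what keeps reads fresh under heavy write volume.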


Layer 3: Stateless Workers and Connection Pooling

A Shopify app that cannot scale horizontally cannot reach millions of requests without degrading. The architectural requirement is stateless workers: every process must handle any job without local state.

The connection pool is usually the bottleneck before CPU. 50 concurrent workers sharing 10 database connections creates queue pressure that slows every job. Use PgBouncer in transaction pooling mode for PostgreSQL and set explicit pool sizes that match your concurrency limits per queue, not total worker count.
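One way to make "pool size matches concurrency, not worker count" concrete is to derive the pool size from per-queue concurrency limits. The queue names and numbers below are illustrative assumptions, not a prescription:

```javascript
// Each entry: how many jobs in that queue may run concurrently
// and touch the database at the same time.
const queueConcurrency = {
  webhooks: 10,
  syncJobs: 4,
  reports: 2,
};

// With PgBouncer in transaction pooling mode, a server connection is
// held only for the duration of a transaction, so the client pool can
// stay close to the sum of concurrent DB-touching jobs plus headroom.
function poolSize(concurrency, headroom = 1.25) {
  const concurrent = Object.values(concurrency).reduce((a, b) => a + b, 0);
  return Math.ceil(concurrent * headroom);
}

// e.g. with node-postgres: new Pool({ max: poolSize(queueConcurrency) })
```

The point of the calculation is that adding workers without raising queue concurrency should not change the pool size.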


Layer 4: Webhook Deduplication

Shopify guarantees at-least-once delivery. At millions of events, duplicates are not edge cases. Two workers processing the same order event will produce inconsistent state without explicit deduplication.

async function handleWebhook(topic, shopDomain, webhookId, payload) {
  const lockKey = `webhook:${shopDomain}:${webhookId}`;

  // Atomic set-if-not-exists with 24hr TTL
  const acquired = await redis.set(lockKey, '1', 'EX', 86400, 'NX');
  if (!acquired) {
    console.log(`Duplicate webhook skipped: ${webhookId}`);
    return;
  }
  await processWebhookJob(topic, shopDomain, payload);
}

One Redis SET NX call per webhook. Cheap, atomic, and eliminates the entire duplicate processing problem.


Layer 5: Distributed Locking for Race Conditions

At low traffic, race conditions are theoretical. At millions of requests, they are inevitable.

The classic example: two workers read the same inventory level simultaneously. Both see stock available. Both decrement it. Result is negative inventory. This is not a Shopify bug. It is a read-then-write concurrency problem.

Solve it with optimistic database locking or a Redis distributed lock using SET NX before any read-modify-write sequence on shared resources.


Layer 6: Composite Observability Alerting

At this scale, the difference between a 2-minute incident and a 2-hour one is alerting that fires before users notice.

| Signal | Tool | Alert Threshold | What It Catches |
| --- | --- | --- | --- |
| API error rate | Datadog / Sentry | > 1% 4xx / 5xx | Rate limit saturation, auth failures |
| Queue depth | BullMQ / Prometheus | > 500 pending jobs | Under-provisioned workers |
| Job failure rate | BullMQ DLQ depth | > 0 new DLQ jobs | Logic bugs, malformed payloads |
| DB connection pool | PgBouncer metrics | > 80% utilisation | N+1 queries, pool exhaustion |
| p99 job latency | Datadog APM | > 10 seconds | Slow queries, under-provisioned workers |

Set composite alerts that fire when two signals breach simultaneously. High API error rate combined with rising queue depth usually means a rate limit cascade, not an isolated error. That distinction changes your response entirely.
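The composite logic can be sketched as a pure classification function over point-in-time signals. The thresholds mirror the table above; the metric names and severity labels are assumptions, not a Datadog or Prometheus API:

```javascript
// Classify an incident from two signals rather than alerting on
// each in isolation.
function classifyIncident({ apiErrorRate, queueDepth }) {
  const errorsHigh = apiErrorRate > 0.01; // > 1% 4xx / 5xx
  const queueDeep = queueDepth > 500;     // > 500 pending jobs

  if (errorsHigh && queueDeep) return 'rate-limit-cascade';      // page on-call
  if (errorsHigh) return 'isolated-errors';                      // notify channel
  if (queueDeep) return 'under-provisioned-workers';             // scale workers
  return 'healthy';
}
```

In practice this lives in your monitoring tool's composite alert rules rather than application code, but the decision table is the same.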


Scale Decision Matrix

| Request Volume | Priority Patterns | Infrastructure |
| --- | --- | --- |
| Under 10K / day | Basic rate limiting, Redis caching | Single server, managed Redis |
| 10K to 100K / day | Above + async queues, stateless workers | 2 to 4 workers, connection pooling |
| 100K to 1M / day | Above + idempotency, race condition guards | Horizontal worker fleet, PgBouncer |
| 1M+ / day | All patterns + circuit breakers, cost-aware GraphQL | Auto-scaling workers, multi-region Redis, full APM |

Wrapping Up

Scaling is not a single refactor. It is six deliberate decisions made at every layer of your app. Start by identifying your current bottleneck, instrument it, fix it, then move to the next.

Full breakdown with additional code examples, caching invalidation strategy, and fault tolerance patterns here:

👉 https://kolachitech.com/scaling-shopify-apps-millions-of-requests/

Drop a comment if you are hitting a specific layer right now. Happy to go deeper.
