Muhammad Arslan

Posted on • Edited on • Originally published at muhammadarslan.codes

Scaling MERN Stack Applications to 100K+ Users: A Real-World Guide

Your MERN stack app works beautifully with 100 concurrent users. Then one day, traffic spikes to 10,000 — and everything falls apart. Database queries time out, the Node.js process maxes out its memory, and your React frontend takes 8 seconds to load.

I've been through this scenario multiple times across enterprise projects. Scaling isn't about rewriting your entire stack — it's about systematically identifying and eliminating bottlenecks at every layer.

Here's the playbook I use.


Architecture Overview

Before diving in, here's the high-level architecture of a scaled MERN application:

                    ┌──────────────┐
                    │   CDN        │  ← Static assets (React build, images)
                    │  CloudFront  │
                    └──────┬───────┘
                           │
                    ┌──────▼───────┐
                    │ Load Balancer│  ← Distributes traffic
                    │   (Nginx)    │
                    └──────┬───────┘
                           │
              ┌────────────┼────────────┐
              │            │            │
        ┌─────▼────┐ ┌────▼─────┐ ┌────▼─────┐
        │ Node.js  │ │ Node.js  │ │ Node.js  │  ← Multiple instances
        │ Instance │ │ Instance │ │ Instance │
        └─────┬────┘ └────┬─────┘ └────┬─────┘
              │            │            │
              └────────────┼────────────┘
                           │
              ┌────────────┼────────────┐
              │                         │
        ┌─────▼────┐           ┌───────▼──────┐
        │  Redis   │           │   MongoDB    │
        │  Cache   │           │  Replica Set │
        └──────────┘           └──────────────┘

Now let's break down each layer.


1. Database Optimization (MongoDB)

The database is almost always the first bottleneck. Here's how to fix it.

Indexing — The Single Biggest Win

Unindexed queries on large collections can take seconds. With proper indexes, they take milliseconds.

// Check which queries are slow
db.setProfilingLevel(1, { slowms: 100 });

// View slow queries
db.system.profile.find().sort({ ts: -1 }).limit(10);
// Create indexes for your most common queries
// Schema definition with Mongoose
const { Schema } = require('mongoose');

const orderSchema = new Schema({
  userId: { type: Schema.Types.ObjectId, index: true },
  status: { type: String, index: true },
  createdAt: { type: Date, index: true },
  total: Number,
  items: [orderItemSchema],
});

// Compound index for queries that filter on multiple fields
orderSchema.index({ userId: 1, status: 1, createdAt: -1 });

// Text index for search functionality
orderSchema.index({ 'items.name': 'text', 'items.description': 'text' });

Use the Aggregation Pipeline

For complex data processing, the aggregation pipeline runs everything inside MongoDB — far faster than pulling data into Node.js and processing it there:

// ❌ BAD — Pull all orders into Node.js memory
const orders = await Order.find({ status: 'completed' });
const totalByUser = {};
orders.forEach(order => {
  totalByUser[order.userId] = (totalByUser[order.userId] || 0) + order.total;
});

// ✅ GOOD — Let MongoDB do the heavy lifting
const totalByUser = await Order.aggregate([
  { $match: { status: 'completed' } },
  { $group: {
    _id: '$userId',
    totalSpent: { $sum: '$total' },
    orderCount: { $sum: 1 },
    avgOrderValue: { $avg: '$total' },
  }},
  { $sort: { totalSpent: -1 } },
  { $limit: 100 },
]);

Connection Pooling

Don't create a new connection per request. Configure your pool:

// mongoose connection with proper pooling
mongoose.connect(process.env.DATABASE_URL, {
  maxPoolSize: 50,           // Max connections in the pool
  minPoolSize: 10,           // Keep at least 10 connections ready
  serverSelectionTimeoutMS: 5000,
  socketTimeoutMS: 45000,
  bufferCommands: false,     // Fail fast if not connected
});

Pagination — Never Load Everything

// Cursor-based pagination (better than skip/limit for large datasets)
const { Types: { ObjectId } } = require('mongoose');

const getProducts = async (lastId, limit = 20) => {
  const query = lastId
    ? { _id: { $gt: new ObjectId(lastId) } }
    : {};

  return Product.find(query)
    .sort({ _id: 1 })
    .limit(limit)
    .lean(); // .lean() returns plain objects — 5x faster than Mongoose documents
};

2. API Performance (Node.js & Express)

Redis Caching

Cache expensive or frequently-accessed data:

const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL);

// Cache middleware
const cache = (keyPrefix, ttlSeconds = 300) => {
  return async (req, res, next) => {
    const key = `${keyPrefix}:${req.originalUrl}`;

    try {
      const cached = await redis.get(key);
      if (cached) {
        return res.json(JSON.parse(cached));
      }

      // Override res.json to cache the response before sending it
      const originalJson = res.json.bind(res);
      res.json = (data) => {
        // Only cache successful responses; swallow Redis write errors
        if (res.statusCode < 400) {
          redis.setex(key, ttlSeconds, JSON.stringify(data)).catch(() => {});
        }
        return originalJson(data);
      };

      next();
    } catch (err) {
      next(); // If Redis fails, just skip caching
    }
  };
};

// Usage
app.get('/api/products', cache('products', 600), productController.getAll);
app.get('/api/products/:id', cache('product', 300), productController.getById);

Cache Invalidation

The hardest problem in computer science. Keep it simple:

// When a product is updated, clear related cache keys
const updateProduct = async (id, data) => {
  const product = await Product.findByIdAndUpdate(id, data, { new: true });

  // Clear specific cache
  await redis.del(`product:/api/products/${id}`);

  // Clear list caches (pattern matching — note KEYS blocks Redis,
  // so prefer SCAN on large keyspaces)
  const keys = await redis.keys('products:*');
  if (keys.length) await redis.del(...keys);

  return product;
};

Compression

Reduce payload sizes by 60-80%:

const compression = require('compression');

app.use(compression({
  level: 6,                    // Balance between speed and compression ratio
  threshold: 1024,             // Only compress responses > 1KB
  filter: (req, res) => {
    if (req.headers['x-no-compression']) return false;
    return compression.filter(req, res);
  },
}));

Node.js Clustering

Use all CPU cores:

// cluster.js
const cluster = require('cluster');
const os = require('os');

if (cluster.isPrimary) {
  const numCPUs = os.cpus().length;
  console.log(`Primary process ${process.pid} starting ${numCPUs} workers`);

  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died, spawning replacement`);
    cluster.fork();
  });
} else {
  require('./app'); // Each worker runs the Express app
}

Or better yet, use PM2 in production:

# Cluster mode via the PM2 CLI — one worker per core, restart any worker above 500MB
pm2 start app.js -i max --name "api" --max-memory-restart "500M"

3. Frontend Efficiency (React)

Optimistic UI Updates

Don't wait for the server to confirm — update the UI immediately and roll back on failure:

const useOptimisticUpdate = () => {
  const queryClient = useQueryClient();

  const toggleLike = useMutation({
    mutationFn: (postId: string) => api.post(`/posts/${postId}/like`),

    // Optimistically update the cache
    onMutate: async (postId) => {
      await queryClient.cancelQueries({ queryKey: ['posts'] });
      const previousPosts = queryClient.getQueryData(['posts']);

      queryClient.setQueryData(['posts'], (old: Post[]) =>
        old.map(post =>
          post.id === postId
            ? { ...post, liked: !post.liked, likes: post.liked ? post.likes - 1 : post.likes + 1 }
            : post
        )
      );

      return { previousPosts };
    },

    // Roll back on error
    onError: (err, postId, context) => {
      queryClient.setQueryData(['posts'], context.previousPosts);
    },
  });

  return toggleLike;
};
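One design note: the cache-update logic inside `onMutate` is pure, so it's worth extracting into a standalone helper (`toggleLikeInCache` is a hypothetical name) that can be unit tested without React Query or a DOM:

```javascript
// The pure cache-update logic from the mutation above, extracted so it can
// be tested in isolation. Takes the current posts array and returns a new
// one with the target post's like state flipped.
const toggleLikeInCache = (posts, postId) =>
  posts.map(post =>
    post.id === postId
      ? {
          ...post,
          liked: !post.liked,
          likes: post.liked ? post.likes - 1 : post.likes + 1,
        }
      : post
  );

console.log(toggleLikeInCache([{ id: 'a', liked: false, likes: 2 }], 'a'));
// → [ { id: 'a', liked: true, likes: 3 } ]
```

Inside `onMutate` you'd then write `queryClient.setQueryData(['posts'], (old) => toggleLikeInCache(old, postId))`.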

React Query for Server State

Stop using useEffect + useState for API calls:

// ❌ BAD — manual state management
const [products, setProducts] = useState([]);
const [loading, setLoading] = useState(true);
const [error, setError] = useState(null);

useEffect(() => {
  fetch('/api/products')
    .then(res => res.json())
    .then(setProducts)
    .catch(setError)
    .finally(() => setLoading(false));
}, []);

// ✅ GOOD — React Query handles caching, refetching, error states
const { data: products, isLoading, error } = useQuery({
  queryKey: ['products'],
  queryFn: () => fetch('/api/products').then(r => r.json()),
  staleTime: 5 * 60 * 1000,  // Treat data as fresh for 5 minutes before refetching
  retry: 3,
});

Bundle Size Optimization

Analyze and shrink your bundle:

# For Vite projects
npx vite-bundle-visualizer

# For Webpack (CRA) projects
npx source-map-explorer build/static/js/*.js

Common wins:

  • Replace moment.js (300KB) with date-fns (tree-shakeable) or dayjs (2KB)
  • Replace lodash (70KB) with individual imports: import debounce from 'lodash/debounce'
  • Lazy load routes (we covered this in the React performance article)

4. Infrastructure

Load Balancing with Nginx

upstream node_api {
    least_conn;  # Send to the server with fewest active connections
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003;
    server 127.0.0.1:3004;
}

server {
    listen 80;
    server_name api.yourdomain.com;

    # Serve static files directly — don't waste Node.js on this
    location /static/ {
        root /var/www/app/build;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }

    location / {
        proxy_pass http://node_api;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_cache_bypass $http_upgrade;
    }
}

CDN for Static Assets

Serve your React build from a CDN (CloudFront, Cloudflare, Fastly). This alone can cut load times by 50-70% for global users because assets are served from the edge location nearest to the user.

MongoDB Replica Set

Never run a single MongoDB instance in production:

Primary    →  handles all writes
Secondary  →  handles read queries (read preference: secondaryPreferred)
Secondary  →  handles read queries + acts as failover

This gives you:

  • High availability: If primary goes down, a secondary is automatically promoted
  • Read scaling: Distribute read queries across secondaries
  • Data safety: Your data exists on multiple machines
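To actually route reads to the secondaries, the read preference goes in the connection string. A configuration sketch with Mongoose — the hostnames and the replica set name `rs0` are placeholders for your own topology:

```javascript
// Connecting to a three-node replica set with Mongoose. The driver discovers
// the topology from any reachable member; readPreference=secondaryPreferred
// sends reads to secondaries when one is available, falling back to primary.
const mongoose = require('mongoose');

mongoose.connect(
  'mongodb://db1.internal:27017,db2.internal:27017,db3.internal:27017/app'
    + '?replicaSet=rs0&readPreference=secondaryPreferred'
);
```

One caveat: secondaries replicate asynchronously, so reads routed to them can be slightly stale — fine for product listings, not for a just-placed order.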

5. Monitoring & Alerts

You can't fix what you can't see. Set up monitoring from day one:

What to Monitor         Tool                       Alert When
API response times      Datadog / New Relic        p95 > 500ms
Error rate              Sentry                     > 1% of requests
MongoDB slow queries    MongoDB Atlas              > 100ms
Memory usage            PM2 / Docker               > 80%
CPU usage               Cloud provider metrics     Sustained > 70%
Redis hit rate          Redis CLI / Datadog        < 80%
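To make the p95 threshold concrete, here's a minimal in-process latency tracker — a sketch of the idea, not a substitute for an APM agent (`record` and `p95` are hypothetical helper names):

```javascript
// Minimal in-process latency tracker illustrating the "p95 > 500ms" alert.
const samples = [];

const record = (ms) => samples.push(ms);

// 95th percentile: sort the samples and take the value 95% of the way in
const p95 = () => {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length * 0.95)];
};

// Simulate 100 requests: 95 fast ones plus a slow tail
for (let i = 1; i <= 95; i++) record(i);      // 1..95 ms
[800, 850, 900, 950, 1000].forEach(record);   // slow outliers

console.log(p95()); // → 800
```

In Express you'd call `record(Date.now() - start)` from a middleware's `res.on('finish')` handler and alert when `p95()` crosses 500 — the slow tail is exactly what averages hide, which is why the table above alerts on p95 rather than the mean.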

Scaling Checklist

Here's the order I tackle scaling issues:

  1. 📊 Measure first — Don't guess. Profile your app, find the actual bottleneck.
  2. 🗄️ Database indexes — This alone often fixes 80% of performance issues.
  3. 🔴 Redis caching — Cache frequently-read, rarely-changed data.
  4. 📦 API compression — Gzip/Brotli middleware for 60-80% smaller payloads.
  5. ⚛️ Frontend code splitting — Don't ship 2MB of JS on first load.
  6. 🖥️ Node.js clustering — Use all your CPU cores.
  7. ⚖️ Load balancing — Scale horizontally with multiple server instances.
  8. 🌐 CDN — Serve static assets from the edge.
  9. 🔄 Read replicas — Scale database reads independently.

Conclusion

Scaling is an iterative process, not a one-time event. Monitor your application's bottlenecks and address them systematically. The order matters — always start with the database and caching before throwing more hardware at the problem.

The patterns in this guide have helped me scale MERN applications from hundreds to hundreds of thousands of users. They'll work for you too.


Have you scaled a MERN stack application? What was your biggest bottleneck? Share your experience in the comments 👇

For more architecture deep dives and case studies, visit muhammadarslan.codes or connect with me on LinkedIn and GitHub.
