DEV Community: Subhajit Gorai

Wanna Learn Web Development?

Subhajit Gorai — Tue, 02 Jun 2026 07:14:17 +0000

Original article: https://www.opencanvas.institute/p/wanna-learn-web-development-6a19a10bc7d49a7b9a4eda21

So you want to start web dev. Nice :)
What kind of developer do you wanna be?

You got inspired by some great 3D, WebGL website on awwwards and now wanna see yourself doing the same... Creative Developer / Design Engineer / Frontend Engineer

You are thrilled with backend systems and wanna design a highly scalable site capable of handling millions of concurrent requests... Backend Engineer / Distributed Systems Engineer / Site Reliability Engineer (SRE)

You wanna develop something, build your own, or maybe build every dev aspect like a Fullstack Engineer

Or maybe you are into machine learning or other domains and wanna deploy something real quick. Project Need

Find the one which best suits you.

If it is Project Need then you don't have to spend your time learning web dev things. Right now lots of frameworks and AI tools exist. Give a good prompt, tell your target requirement specifically, tell the AI not to overdo it, and it will be done. My suggestion is to focus on your main project goal, not the presentation.

Also worth noting: I mention some general and specialised resources, but this might not be the best article if you are after a serious guide on how to make "awwwards level designs and sites". My journey is the Fullstack/SRE kind, so it leans towards those aspects.

Must Know for Everyone

HTML, CSS, JS -> Must learn

Initially have an idea of HTML and CSS, learn basic things only. You will learn other things while developing projects. Don't try to do it all from the start.

First HTML

quick guide: Bro Code is what I followed as it was short. You can watch any.
Bro code html, apna college
best for in depth: https://web.dev/learn/html/ [go through some of it at first]

Have clarity on basic tags like the h tags, p, div, span, code, pre, strong, ul, ol, li etc. Also know the basic head tags: the viewport and description meta tags, title, icon and link tags. And always follow the semantics and proper order of tags.

Here's a basic boilerplate.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document Title</title>

    <link rel="stylesheet" href="style.css">
  </head>
  <body>

    <h1>Hello, World!</h1>

    <script src="script.js"></script>
  </body>
</html>

VS Code and Zed both ship with Emmet built in, so no extension is needed. Type ! and press Enter or Tab to get this boilerplate.

Then CSS

It's vast. Here it's important that you don't try to go through all of it at first. It's inefficient and not fun that way. You will learn with projects.

quick guide: bro code
best playlist: slaying the dragon. It covers more than enough: layouts, positioning, flexbox, grid etc.
detailed docs: https://web.dev/learn/css, MDN docs, W3Schools

learn SCSS and Tailwind too. Those are simple and at most will take an hour each for understanding. Ask any AI to reference them.

If you make fairly large projects you will understand why Tailwind is a lifesaver. You'll enjoy it.

so, how to center a div? (qn for you)

Now JavaScript

This is the most vital part, and it's gonna be of super use. Do not skip ahead to React until you are comfortable with it. HTML and CSS can be done in a day, but spend at least one week here. It will take 1-6 weeks depending on your current programming knowledge.

note: you need to properly understand Object Oriented Programming for this.

best: https://web.dev/learn/javascript
good if you don't like reading: chai or code js playlist

Must have clarity with anonymous functions, objects, destructuring, realms, prototypal inheritance, js execution, event loop, task queue, micro task queue, promises in JS.
also know IIFE, closures and similar things.
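
To make three of those terms concrete, here is a minimal sketch of a closure, an IIFE and destructuring in plain JavaScript. The names (makeCounter, config, appName) are made up for illustration.

```javascript
// Closure: the inner functions keep access to `count` after makeCounter returns.
function makeCounter(start = 0) {
  let count = start;
  return {
    increment: () => ++count,
    current: () => count,
  };
}

const counter = makeCounter(5);
counter.increment();
counter.increment();
console.log(counter.current()); // 7

// IIFE: runs once, immediately, and keeps `secret` out of the outer scope.
const config = (function () {
  const secret = "not visible outside";
  return { name: "demo", port: 3000, hasSecret: secret.length > 0 };
})();

// Destructuring: pull fields out of an object, with a rename and a default.
const { name: appName, port, host = "localhost" } = config;
console.log(appName, port, host); // demo 3000 localhost
```

Try changing `count` from outside makeCounter. You can't, and that is the whole point of a closure.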

These visualisations by Lydia are great: event loop, closures, execution context, promise execution

Now, Document Object Model (DOM) is an essential part for web dev. You can learn it from anywhere. My recommendations will be: designcourse, apna college

Quick project:
After learning HTML, CSS and JS make sure to build something with it. It's vital. I recommend this: https://www.youtube.com/watch?v=QRrPE9aj3wI&t=2s (html, css)
Or anything you wish. But do make.

Getting less popular in vibe coding era, but here are some icon libraries you'll need for sure while building projects with html & css.

svgrepo: https://www.svgrepo.com
hero icons: https://heroicons.com
lucide: https://lucide.dev

Now that HTML, CSS and JS are done and if you honestly didn't skip JS, you are already a good web developer. Not kidding.

It's good to have some knowledge on browser internals for everyone, maybe watch this: https://youtu.be/5rLFYtXHo9s?si=KaUl08CMwBSbqsSv
For more go to the frontend section below.

Specialisation

Now there is no such bound here, like "for frontend learn all of this, for backend that". No. You can pick any tech from both frontend and backend sides. It's a mix.
Your specialisation will be on your depth of knowledge in the technologies you chose.

So here I'm listing the best resources I found.

Frontend

React

https://react.dev (best source, nothing else needed). Have clarity on useState, useEffect, useRef, useContext, useMemo.

chai or code react playlist is also good.
https://youtube.com/playlist?list=PLC3y8-rFHvwgg3vaYJgHGnModB54rxOk3&si=jY8G_EIAS5Iva5jT this is fine too. Search on YT, watch any.

Must learn Tanstack alongside it.

State management: Once your app grows, prop drilling gets painful. useState alone stops scaling.

Options: Zustand, Redux Toolkit, Jotai. For most projects Zustand is the right choice. Minimal API, no boilerplate. Redux only makes sense if the team is large or the state is genuinely complex. Jotai if you want atomic granularity.

Don't add state management until you actually need it. Premature abstraction gets problematic.

Routing

React Router v6 is standard. Have clarity on nested routes, loaders, and the difference between client-side and server-side navigation. Check HashRouter too.

Browser internals

Important, not optional.
Understand the event loop, call stack, task queue, microtask queue. Async JS knowledge should be concrete. Knowing why Promise.then fires before setTimeout matters when you're debugging real bugs.
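
A small sketch of that ordering, runnable in Node or a browser console. The function name runOrderDemo is made up; the point is only that microtasks (promise callbacks) drain before the next task (the timer) runs.

```javascript
// Records the order in which callbacks actually run.
function runOrderDemo() {
  const order = [];
  return new Promise((resolve) => {
    setTimeout(() => {
      order.push("timeout"); // task queue: runs after all pending microtasks
      resolve(order);
    }, 0);

    Promise.resolve().then(() => order.push("promise")); // microtask queue
    queueMicrotask(() => order.push("microtask")); // also a microtask

    order.push("sync"); // runs first: still on the current call stack
  });
}

runOrderDemo().then((order) => console.log(order.join(" -> ")));
// sync -> promise -> microtask -> timeout
```

Even with a 0 ms delay, the timer callback waits until the call stack is empty and the microtask queue has been drained.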

Understand how the render pipeline works (parse HTML -> build DOM -> CSSOM -> layout -> paint -> composite), and where hydration fits in. This directly helps you write performant UI.

Repaint vs reflow. Avoiding layout thrash. These come up.

check https://browser.engineering, it's really great.

PWAs

Now you can get an almost app-like experience with web dev skills only. Progressive Web Applications are for this. Try adding a simple manifest.json and you'll see the difference.

Best source to learn more: web.dev

You can also use frameworks like capacitor.js to get a proper Android APK.

some useful sites for generating pwa icons:

some for pwa to apk:

pwabuilder

not necessary if you are just starting

TypeScript

When working with large React projects you will understand the need for type safety. TS saves you from this frustration. It's a superset of JS.

Core things to have clarity on: primitive types, interfaces vs types, generics, utility types (Partial, Pick, Omit, Record), and how to type props and hooks in React.

Sources:

best: TS Handbook
for React specifically: Total TypeScript by Matt Pocock, he has free tutorials and they are genuinely the best on typing React components, hooks, and event handlers.
for practice: make projects. You can use this too, it's great: https://github.com/type-challenges/type-challenges

Safe to skip for starters

IndexedDB

In case you need more than the ~5 MB that Local Storage gives you, this offers a lot more. It's async and NoSQL.
Dexie is a great wrapper for it: dexie.js

Definitely skip if you are just starting

WebAssembly (WASM)

When JS performance is a bottleneck. Image processing, audio, physics simulation, in-browser ML inference. If you know Rust you can compile to WASM and call it from JS. The interop is straightforward.

Most projects don't need it.

If you want a real hands-on target: take a computationally heavy function you've written in JS, rewrite it in Rust, compile to WASM, and benchmark the difference. You'll never forget what WASM is for.

animations

It's honestly not my turf. My suggestion: if you want motion, look into Framer Motion for React. GSAP if you want timeline control. Three.js / React Three Fiber for 3D. These are rabbit holes, go in only if that's your direction.

Backend

Here DSA and dev intersect. Try building any core backend tech from scratch. You are gonna love it.

SQL and DBMS

Understand joins deeply, understand indexing (clustered vs non-clustered, when composite indexes help), understand query plans.
Have a clear idea of normalization and denormalization, and use each based on requirements.

Learning PostgreSQL is best for now. MySQL is fine too but Postgres has better extensions, better JSON support, better window functions.

resources: https://www.postgresqltutorial.com, CMU 15-445 lectures on YouTube for depth.

MongoDB

Do SQL first. Only then will you get this properly. Good for flexible schemas, rapid prototyping, and when your data is naturally hierarchical.

Don't use MongoDB only to avoid schema design. That's not a valid reason. If your data is relational, use a relational DB.

sufficient: engineering digest playlist

Have clarity on: Query operators like $match, $group, $lookup, $unwind, aggregation pipelines.
Go through mongoose & mongodb client.

The documents returned by Mongoose have to go through hydration, which takes time and can be a bottleneck at high throughput / RPS. Use lean() in those cases to make data retrieval much faster. Aggregation pipelines return plain objects by default.

Here is one of my repos to see production ready use: opencanvas backend

Flask

Finally, a proper web development framework, right? After all that...
Flask is the best framework I have personally worked with. It is tidy, sufficient, has everything, and gives you proper customisation. Truly gold. But it was a rabbit hole for me. See how it works out for you.

Learn Flask in depth; the rest will be very easy to pick up afterwards.

core things to have clarity on: request-response cycle, templating, file serving, request handling, user sessions, cookies, common attacks like CSRF and XSS, blueprints and the factory pattern, database integration, migrations

Trust me. Knowing these in Flask, you will understand any framework. Everything will be relatable.

sources:

Corey Schafer's playlist
docs: https://flask.palletsprojects.com/en/stable/
sqlalchemy docs: https://docs.sqlalchemy.org/en/20/intro.html

Miguel Grinberg's book is excellent. Give it a read on the portions where you need clarity: web version, pdf

FastAPI

Official FastAPI Documentation: Good, interactive guide covering basics to advanced topics like dependency injection and security.
Fastapi-users: Library for ready to use registration and authentication systems.

Production Templates & Architecture

Full Stack FastAPI Template: Official boilerplate by fastapi creator showing FastAPI, PostgreSQL, Docker, and frontend integration.
FastAPI Best Practices

I recommend learning Flask or Other Frameworks first

Django

Flask is a lightweight micro framework that relies on a collection of external modules, whereas Django provides a complete ecosystem, often said as "batteries-included".

Core Resources

Official Django Documentation: Best reference for Django, featuring detailed explanations of the ORM, routing, and security features.
Django REST Framework (DRF): Standard toolkit for building Web APIs on top of Django.

Production Templates & Architecture

Cookiecutter Django: Most popular framework for jumpstarting production ready Django projects quickly.
HackSoft Django Styleguide: Great set of rules and architectural patterns that separate business logic from models and views for large scale Django apps.

Express

Official & Core Resources

Official Express Documentation: Best guide for routing, middleware, and request handling.

Production Templates & Architecture

Node Express Boilerplate by hagopj13: Features built-in JWT authentication, Jest testing, Docker support, and centralized error handling.
Node.js Boilerplate by foyzulkarim: Best practices for folder architecture, modular routing, and clean code separation.

Building a Production Ready Node.js Boilerplate
This is good. Give it a watch if facing trouble understanding.

Also here is one of my repos to see production ready use: opencanvas backend

Skip for Starters

System design

This is something people treat as interview prep only. Don't do that.
Have genuine clarity on:

when to use serverless (bursty traffic, low baseline, short execution time). Consider cold starts. Do not use serverless for latency sensitive always on services.
horizontal vs vertical scaling
rate limiting, API gateways, load balancers, cdns

Redis

chai or code: https://youtu.be/5YqP18Gyop0?si=H3BC7GZjKuQzX9-w
web dev simplified: https://youtu.be/jgpVdJB2sKQ?si=TX4njkS6h_tf6vh-

real situation example: Read through this section. You will understand the need of redis in real environment. https://www.opencanvas.institute/p/from-1993-to-17007-requests-per-second-how-i-optimised-a-nodejs-mongodb-backend-at-scale-69ad48fc4684635da9b4c72c#7-in-memory-ttl-cache-with-intelligent-invalidation

Next.js

React but with SSR (server side rendering) and SSG (static site generation) built in. Use it when:

SEO matters (rendered HTML, not just husk)
You want your frontend and lightweight API routes in one codebase
You're building a content heavy site

Right now it's getting used everywhere. But don't use it just because of that. If SEO doesn't matter and you don't need SSR, a plain React SPA is simpler and easier to reason about.

sources:

official next js docs are the best: https://nextjs.org/docs
https://nextjs.org/learn/dashboard-app

Deployment

Deploying is the most awaited part. The desire to see your work on a live URL is insane after completing a project. But it gets really messy sometimes. You don't find appropriate free plans, and errors just keep popping up out of nowhere. That's why you need to have some knowledge of production environments and practices. Discussing all that here would make this too long, so for now let's briefly get familiar with these platforms. I'll link a separate article on common error fixes, troubleshooting etc here.

GitHub Pages: Best for static HTML/CSS sites, and supports SPAs built with React etc too.
Vercel: Best for Next.js and any frontend based project. Free analytics.
Netlify: Similar to Vercel for static sites. I don't like it as it charges for analytics.
Cloudflare: Cloudflare Workers and Pages are also great.
PythonAnywhere: Best for small (less than 512mb) Flask/Django projects. Free tier is genuinely good. No Docker needed.
Railway: They give $5 credit for new accounts. So if you need a quick demo for a short period you can get everything for free here. It's almost the new Heroku.
HuggingFace Spaces: For ML demos and Gradio/Streamlit apps. If you want to show a model working, this is the fastest path.
VPS (DigitalOcean, Hetzner, Linode): When you need actual control.
Docker: Learn it. Not optional if you're serious. Containerising your apps means your deployment is reproducible regardless of where it runs. Dockerfile, docker-compose, basic networking between containers. That's sufficient to start.

One last thing.

The best engineers I've seen around me didn't learn the most tools. They got very good at a small set of things, built actual projects, and read enough to understand what was happening under the surface. The intersection of knowing your data structures and knowing your system is where the interesting problems live.

Best of luck 🍀
Keep building!

From 1,993 to 17,007 RPS: How I Optimized a Node.js/MongoDB Backend on a Single Machine

Subhajit Gorai — Thu, 12 Mar 2026 03:13:46 +0000

I seeded a MongoDB database with 1.4 million documents and tried to break my own backend. What started as a curiosity about real-world performance turned into a months long optimisation journey - from 1,993 to 17,007 requests per second, on a single machine, with zero request failures.

Introduction

OpenCanvas is a platform built to bridge the gap between ResearchGate and Reddit, giving college researchers and writers a place to publish, discover, and discuss work that would otherwise never surface. Behind it sits a standard Express and MongoDB stack, nothing exotic. What made this a meaningful engineering exercise was the constraint: the database was seeded with 400,000 users, 100,000 posts, 500,000 interactions, 100,000 follows, and 320,000 comments, and the goal was to squeeze every last request per second out of a single Node.js process before reaching for additional infrastructure.

The result was a journey from 1,993 RPS to 6,138 RPS on a single thread, and then to 17,007 RPS in cluster mode, all measured under aggressive autocannon stress tests against a live MongoDB-backed feed route. This article documents every decision that drove that improvement, with the actual code, the reasoning, and the numbers.

GitHub link for the code shown here: https://github.com/Dream-World-Coder/opencanvas/tree/main/server/src

The Baseline: What I Was Working With

Before any optimization, the primary bottleneck was the /articles feed route. Every request was doing the following:

Running a find query with a skip offset for pagination
Populating the author's data on every post document using Mongoose's .populate()
Returning the full post content field, which for article-type posts could be tens of kilobytes of Markdown
No caching whatsoever

Under a simple autocannon test with 100 connections:

    Requests per second: 1,993
    Average Latency: 49ms

Under a stress test with 500 connections and a pipelining factor of 10:

    Requests per second: 2,504
    Average Latency: 1,609ms

Those numbers with 100,000 posts and half a million interactions in the database meant the platform would fall over under any meaningful traffic. The fixes were architectural, not superficial.

Optimization 1: Denormalization and the authorSnapshot Pattern

The most expensive operation in the original feed query was the .populate() call. Every time a list of posts was fetched, Mongoose had to issue an additional query to the users collection for the authorIds in the result set, hydrate the returned user documents, and stitch them into each post. That is an extra round trip to MongoDB, plus the merge work, on every feed request before the response could be sent.

The fix was to embed a snapshot of the author's display data directly inside the Post document at write time.

// Post.js (schema)
authorSnapshot: {
  username: { type: String, required: true },
  profilePicture: { type: String },
  fullName: String,
},

This authorSnapshot is written when a post is created or updated, refreshed on every save, and never populated on read. The feed query now gets everything it needs from a single query against one collection. There is a known trade-off: if a user changes their username or profile picture, older posts will show stale snapshot data until the snapshot is next refreshed on save. For a content platform, this is an entirely acceptable consistency model.

// post.js (route) - snapshot is always refreshed on save
authorSnapshot: {
  username: req.user.username,
  profilePicture: req.user.profilePicture,
  fullName: req.user.fullName,
},

The same pattern is used for comments. Rather than populating authorId on every comment fetch, the authorSnapshot with username and profilePicture is embedded at comment creation time. At 320,000 comments in the database, the savings are significant.
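
To see the pattern and its trade-off without a database, here is a minimal sketch in plain JavaScript. The helpers toAuthorSnapshot and buildPostDoc are hypothetical names for illustration, not functions from the repo.

```javascript
// Hypothetical helper: picks only the display fields the feed needs.
function toAuthorSnapshot(user) {
  return {
    username: user.username,
    profilePicture: user.profilePicture,
    fullName: user.fullName,
  };
}

// Hypothetical write path: the snapshot is copied in when the post is saved.
function buildPostDoc(user, { title, content }) {
  return {
    title,
    content,
    authorId: user._id,
    authorSnapshot: toAuthorSnapshot(user),
  };
}

const user = {
  _id: "u1",
  username: "ada",
  fullName: "Ada L.",
  profilePicture: "/a.png",
  email: "ada@example.com",
};
const post = buildPostDoc(user, { title: "Hello", content: "..." });

// Read path: the feed card renders from the post alone, no user lookup.
console.log(post.authorSnapshot.username); // ada

// The trade-off: a later rename does not reach posts that were already saved.
user.username = "ada_l";
console.log(post.authorSnapshot.username); // ada (stale until the post is saved again)
```

Note that the snapshot copies only display fields. Private fields like email never leave the users collection.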

Optimization 2: contentPreview and Payload Reduction

The original feed was returning the full content field of every post. A research article might be 8,000 to 15,000 characters of Markdown. Returning that for 10 posts per page, across potentially thousands of concurrent users, is an enormous waste of bandwidth and serialization time.

The solution was a dedicated contentPreview field, populated at write time with the first 700 characters of the content.

// Post.js (schema)
contentPreview: {
  type: String,
  default: "",
  maxlength: 700,
},

// post.js (route) - sliced on save, not on read
contentPreview: content?.slice(0, 700) ?? "",

The feed query then selects only contentPreview, never content. The full content field is only fetched on the individual post page route. This single change reduced per-response payload size by roughly 95 percent for article-type posts, which directly translates to higher RPS and lower average latency.

The field selection in the feed route is explicit and tight:

// feed.js 
.select(
  "title contentPreview slug type tags readTime thumbnailUrl isPremium isPublic authorSnapshot stats createdAt updatedAt",
)

The content field is never included. Every byte not sent is a byte the event loop does not have to serialize.
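
The arithmetic behind that saving can be checked with a short sketch. makePreview and savedFraction are hypothetical helpers, the 15,000 character article is a stand-in at the upper end of the range mentioned above, and only the content field is counted, not the rest of the response.

```javascript
const PREVIEW_LENGTH = 700; // same limit as the schema's maxlength

// Hypothetical helper mirroring the slice-on-save step.
function makePreview(content) {
  return content?.slice(0, PREVIEW_LENGTH) ?? "";
}

// Fraction of the content bytes that no longer travel in the feed response.
function savedFraction(content) {
  const full = Buffer.byteLength(content, "utf8");
  if (full === 0) return 0;
  const preview = Buffer.byteLength(makePreview(content), "utf8");
  return 1 - preview / full;
}

const article = "x".repeat(15000); // stand-in for a long Markdown article
console.log(makePreview(article).length); // 700
console.log((savedFraction(article) * 100).toFixed(1)); // 95.3
console.log(makePreview(undefined) === ""); // true
```

For an 8,000 character article the same calculation gives about 91 percent, so the exact saving depends on how long the posts are.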

Optimization 3: Eliminating .populate() with .lean()

Mongoose documents returned from a query are full class instances. They carry prototype methods, virtual fields, getter and setter logic, and change-tracking overhead. For read-only endpoints, all of that is pure waste.

The .lean() modifier tells Mongoose to return plain JavaScript objects instead of Mongoose document instances, bypassing all of that overhead.

// feed.js 
const posts = await Post.find(query)
      .sort({ createdAt: -1, _id: -1 })
      .limit(limit + 1)
      .select(
        "title contentPreview slug type tags readTime thumbnailUrl isPremium isPublic authorSnapshot stats createdAt updatedAt",
      )
      .lean();

I used it in other read-only parts too, like updateEngagementScore.js and search.js.

The performance gain from .lean() is especially meaningful on high-throughput routes where the same query runs thousands of times per minute. Combined with the removal of .populate() via denormalization, the per-query CPU cost drops substantially.

Optimization 4: Cursor-Based Pagination vs skip()

The skip() approach to pagination is one of the most common performance mistakes in MongoDB applications. When you write .skip(500).limit(10), MongoDB still has to walk through 500 documents before discarding them and returning the next 10. So, for example, a user on page 51 (at 10 posts per page) causes MongoDB to scan and throw away 500 documents every time. The cost of a request grows linearly with the offset, so deep pages get steadily more expensive, and under concurrent load that wasted work adds up fast.

Cursor based pagination replaces the offset with a positional bookmark. The client sends the position of the last seen item, and the server queries for documents that come after that position. MongoDB can use an index to jump directly to the right location.

The feed cursor is encoded as a base64 JSON object containing createdAt and _id of the last document seen:

// feed.js 
nextCursor = Buffer.from(
  JSON.stringify({
    createdAt: last.createdAt.toISOString(),
    lastId: last._id.toString(),
  }),
).toString("base64");

On the next request, the server decodes this cursor and constructs a range query:

// feed.js 
query.$or = [
  { createdAt: { $lt: cursorDate } },
  { createdAt: cursorDate, _id: { $lt: cursorId } },
];

The tie-break on _id handles the edge case where two posts have identical createdAt timestamps, which can happen in test environments with seeded data or under very high write concurrency. This compound condition maps directly onto the compound index defined in the schema:

// Post.js 
postSchema.index({ isPublic: 1, createdAt: -1, _id: -1 });

MongoDB can satisfy this query with a single index scan, no collection scan, no document discard. The cost of fetching page 1 and page 5,000 is identical.
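
Here is a self-contained sketch of the whole cursor round trip, using an in-memory array in place of MongoDB. The function names, the string ids (instead of ObjectIds) and the sample data are assumptions for illustration. The comparison in isAfterCursor mirrors the $or range query above.

```javascript
// Encode the position of the last item on a page as an opaque base64 string.
function encodeCursor(last) {
  return Buffer.from(
    JSON.stringify({ createdAt: last.createdAt.toISOString(), lastId: last._id }),
  ).toString("base64");
}

function decodeCursor(raw) {
  const { createdAt, lastId } = JSON.parse(Buffer.from(raw, "base64").toString("utf8"));
  return { cursorDate: new Date(createdAt), cursorId: lastId };
}

// In-memory stand-in for the range query: strictly "after" the cursor in
// (createdAt desc, _id desc) order, with _id as the tie-break.
function isAfterCursor(doc, { cursorDate, cursorId }) {
  const t = doc.createdAt.getTime();
  const c = cursorDate.getTime();
  return t < c || (t === c && doc._id < cursorId);
}

function getPage(docs, limit, rawCursor) {
  const cursor = rawCursor ? decodeCursor(rawCursor) : null;
  const sorted = [...docs].sort(
    (a, b) => b.createdAt - a.createdAt || (a._id < b._id ? 1 : -1),
  );
  const matches = cursor ? sorted.filter((d) => isAfterCursor(d, cursor)) : sorted;
  const slice = matches.slice(0, limit + 1); // fetch one extra to detect a next page
  const hasMore = slice.length > limit;
  const items = slice.slice(0, limit);
  return { items, nextCursor: hasMore ? encodeCursor(items[items.length - 1]) : null };
}

// Five posts, three of which share the same timestamp.
const same = new Date("2026-01-01T00:00:00.000Z");
const docs = [
  { _id: "a1", createdAt: new Date("2026-01-02T00:00:00.000Z") },
  { _id: "a2", createdAt: same },
  { _id: "a3", createdAt: same },
  { _id: "a4", createdAt: same },
  { _id: "a5", createdAt: new Date("2025-12-31T00:00:00.000Z") },
];

const seen = [];
let cursor = null;
do {
  const page = getPage(docs, 2, cursor);
  seen.push(...page.items.map((d) => d._id));
  cursor = page.nextCursor;
} while (cursor);

console.log(seen.join(",")); // a1,a4,a3,a2,a5
```

The page boundary falls in the middle of the three posts with identical timestamps, and every post still shows up exactly once. Without the _id tie-break, a3 and a2 would be skipped.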

The skip() approach is still used on lower traffic routes like paginated comments, follower lists, and collection browsers where the dataset per user is bounded and the trade-off in code simplicity is justified.

Optimization 5: Proper Indexing Strategy

Indexes are the single highest-leverage optimization in any database-backed application. The wrong indexes will slow writes without helping reads. The right indexes turn expensive collection scans into fast index scans.

The Post collection carries three compound indexes, each targeting a specific query pattern:

// Post.js 
postSchema.index({ authorId: 1, isPublic: 1, createdAt: -1 }); // profile page posts
postSchema.index({ isPublic: 1, createdAt: -1, _id: -1 });  // articles feed
postSchema.index({ tags: 1, isPublic: 1 });  // topic search

The field order within a compound index matters. The feed query always filters by isPublic: true first, then sorts by createdAt descending. Placing isPublic first in the index allows MongoDB to immediately narrow to the public subset before doing any range scan. If the order were reversed, the index would be far less useful.

The Follow and Interaction collections both carry unique compound indexes that serve double duty: they enforce data integrity (no duplication) and provide fast lookup paths for the most common read patterns.

// Follow.js
followSchema.index({ followerId: 1, followingId: 1 }, { unique: true });

// Interaction.js
interactionSchema.index({ userId: 1, targetId: 1, type: 1 }, { unique: true });
interactionSchema.index({ targetId: 1, type: 1 });

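The second index on Interaction (targetId + type) allows efficient queries like "how many likes does this post have". The first index cannot serve that query efficiently because it leads with userId, so without the second one MongoDB would have to scan far more than the matching interactions.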

Optimization 6: Atomic Counters and Denormalized Stats

A naive implementation of a likes counter would count matching Interaction documents every time the stat is needed:

// what not to do on every feed request
const likeCount = await Interaction.countDocuments(
  { targetId: postId, type: "like" }
 );

At 500,000 interactions in the database, this is unacceptably expensive for a field displayed on every post card in the feed.

The solution is to maintain denormalized counters on the parent document and update them atomically using MongoDB's $inc operator. The counter update happens at write time (when a like or dislike action occurs), so read time requires no computation at all.

// post.js (like route)
await Post.findByIdAndUpdate(postId, { $inc: { [statField]: 1 } });

The same pattern governs follow counts on the User document:

// follow.js
await User.findByIdAndUpdate(req.userId, {
  $inc: { "stats.followingCount": 1 },
});
await User.findByIdAndUpdate(targetUserId, {
  $inc: { "stats.followersCount": 1 },
});

These $inc operations are atomic in MongoDB, meaning concurrent requests cannot produce a race condition that results in an incorrect count. The Interaction collection remains the source of truth for deduplication (enforced by the unique index), while the counters on User and Post exist purely to serve read performance.
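
Why this matters is easier to see with a toy simulation. FakeStore below is an in-memory stand-in, not MongoDB, and the function names are made up. It only demonstrates the lost-update problem that a read-then-write counter has and a single-step increment does not.

```javascript
// In-memory stand-in for a document store. `inc` applies the change in one
// step, which is the property $inc gives you inside MongoDB.
class FakeStore {
  constructor() {
    this.likes = 0;
  }
  async read() {
    return this.likes;
  }
  async write(value) {
    this.likes = value;
  }
  async inc(by) {
    this.likes += by;
  }
}

// Read-modify-write: two awaits, so other requests can interleave between them.
async function likeNaive(store) {
  const current = await store.read();
  await store.write(current + 1);
}

// Single-step increment: no gap between reading and writing.
async function likeAtomic(store) {
  await store.inc(1);
}

async function simulate(likeFn, requests) {
  const store = new FakeStore();
  await Promise.all(Array.from({ length: requests }, () => likeFn(store)));
  return store.likes;
}

async function runCounterDemo() {
  const naive = await simulate(likeNaive, 100);
  const atomic = await simulate(likeAtomic, 100);
  console.log({ naive, atomic }); // { naive: 1, atomic: 100 }
  return { naive, atomic };
}

const counterResult = runCounterDemo();
```

In the naive version all 100 requests read 0 before any of them writes, so 99 likes are lost. Against a real database the loss would be less extreme but just as real.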

Optimization 7: In-Memory TTL Cache with Intelligent Invalidation

The /articles feed is the most read route on the platform. Under the Artillery load test configuration, 70 percent of all simulated traffic targeted this single endpoint. Sending every one of those requests to MongoDB would be wasteful given that the feed content changes slowly, not on every request.

The cache implementation is a simple but well designed in-memory TTL store backed by a JavaScript Map:

// cacheService.js
class CacheService {
  constructor() {
    this.store = new Map();
    setInterval(() => this._evictExpired(), 2 * 60 * 1000).unref();
  }

  set(key, value, ttlSeconds) {
    this.store.set(key, {
      value,
      expiresAt: Date.now() + ttlSeconds * 1000,
    });
  }

  get(key) {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key); // lazy eviction on read
      return null;
    }
    return entry.value;
  }

  // invalidatePrefix() and _evictExpired() are not shown in this excerpt
}

Two TTL values govern the two cached routes:

// cacheService.js 
const TTL = {
  ARTICLES_FEED: 30,  // 30 seconds
  TOP_WRITERS: 5 * 60,   // 5 minutes
};

The 30-second TTL on the articles feed is intentionally short. It is long enough that a viral traffic spike serves hundreds of cache hits per second without touching the database, but short enough that new posts appear within half a minute.

Cache keys are constructed to account for pagination state and limit variations:

// feed.js 
const cacheKey = `articles:c${rawCursor}:l${limit}`;

This means articles:c:l10, articles:cABC123:l10, and articles:c:l25 are all separate cache entries. The Artillery load test exploited this deliberately, with 30 percent of traffic using randomized limit values to generate cache misses and force DB hits, simulating worst-case conditions.

The cache also supports prefix-based invalidation. When a post is created, deleted, or toggled between public and private, the entire articles feed cache is invalidated:

// post.js
cache.invalidatePrefix("articles:");

This uses a simple iteration over the Map's keys, deleting any entry whose key starts with the given prefix. It is O(n) over the number of cached keys, but with a 30 second TTL the cache rarely holds more than a few dozen entries, making this cost negligible.
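
Putting the pieces together, a complete cache along these lines could look like the sketch below. The set and get methods follow the excerpt above, but the bodies of invalidatePrefix and _evictExpired, the class name TtlCache and the injectable clock are my assumptions here, not the code from the repo. The periodic sweep timer is left out so the example exits cleanly.

```javascript
// A sketch of a TTL cache with prefix invalidation. The bodies of
// invalidatePrefix and _evictExpired are assumptions for illustration.
class TtlCache {
  constructor(now = () => Date.now()) {
    this.store = new Map();
    this.now = now; // injectable clock so expiry can be tested without waiting
  }

  set(key, value, ttlSeconds) {
    this.store.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key) {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (this.now() > entry.expiresAt) {
      this.store.delete(key); // lazy eviction on read
      return null;
    }
    return entry.value;
  }

  // Delete every entry whose key starts with `prefix`; returns how many were removed.
  invalidatePrefix(prefix) {
    let removed = 0;
    for (const key of this.store.keys()) {
      if (key.startsWith(prefix)) {
        this.store.delete(key);
        removed += 1;
      }
    }
    return removed;
  }

  // Sweep for entries that expired but were never read again.
  _evictExpired() {
    const t = this.now();
    for (const [key, entry] of this.store) {
      if (t > entry.expiresAt) this.store.delete(key);
    }
  }
}

let fakeNow = 0;
const cache = new TtlCache(() => fakeNow);
cache.set("articles:c:l10", ["post1"], 30);
cache.set("articles:cABC123:l10", ["post2"], 30);
cache.set("writers:top", ["ada"], 300);

console.log(cache.invalidatePrefix("articles:")); // 2
console.log(cache.get("writers:top")); // [ 'ada' ]

fakeNow = 301 * 1000; // jump past the 5 minute TTL
console.log(cache.get("writers:top")); // null
```

Keep in mind that a cache like this lives inside one Node.js process. In cluster mode each worker holds its own copy, so an invalidation in one worker does not clear the others; the short TTL is what bounds the staleness there.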

The top writers route uses a single named key with a 5-minute TTL:

// user.js 
const CACHE_KEY = "writers:top";
const cached = cache.get(CACHE_KEY);
if (cached) {
  return res.status(200).json(
    {
      success: true,
      data: cached,
      fromCache: true
    });
}

This key is deleted by name whenever a like or post creation/deletion event changes the underlying ranking data, ensuring correctness without waiting for TTL expiry.

Optimization 8: Aggregation Pipeline for Ranked Queries

Note: This is a worthwhile optimization, but it is not what drove the RPS gain on /articles.

The top writers feature computes a likesPerPost ratio across all users and returns the top five. Doing this in application code would require loading all users with at least one post into memory, computing the ratio for each, sorting them, and slicing the result. In JavaScript, on a 400,000-user dataset, that would be both memory intensive and slow.

MongoDB's aggregation pipeline does this entirely inside the database engine, returning only the five documents the application actually needs:

// user.js 
const topWriters = await User.aggregate([
  { $match: { "stats.postsCount": { $gt: 0 } } },
  {
    $addFields: {
      likesPerPost: {
        $divide: ["$stats.likesReceivedCount", "$stats.postsCount"],
      },
    },
  },
  { $sort: { likesPerPost: -1 } },
  { $limit: 5 },
  {
    $project: {
      _id: 1, username: 1, fullName: 1,
      profilePicture: 1, designation: 1,
      stats: 1, likesPerPost: 1,
    },
  },
]);

The $match stage filters out users with no posts before the sort, dramatically reducing the working set. The $addFields stage computes the virtual likesPerPost field server-side. The $project stage trims the output to only what the frontend needs. Combined with the 5-minute cache on the result, this expensive aggregation runs at most once every five minutes regardless of traffic.

Optimization 9: Streaming Cursor for Batch Jobs

Note: This is a worthwhile optimization, but it is not what drove the RPS gain on /articles.

The engagement score recalculation job runs every 15 minutes via cron and must update all 100,000 posts. The naive approach would load all 100,000 documents into memory at once:

// what not to do
const posts = await Post.find({}).lean();
// posts is now 100k objects in RAM

Instead, a streaming cursor is used to process documents one at a time, batching writes into groups of 1,000 for efficiency:

// updateEngagementScore.js 
const cursor = Post.find({})
  .select("_id stats createdAt updatedAt")
  .lean()
  .cursor();

The .cursor() call returns an async iterable that fetches documents from MongoDB in batches internally, keeping memory usage constant regardless of collection size. The .lean() combined with tight field selection via .select() means each document in memory is a minimal plain object, not a full Mongoose instance.

Writes are batched and dispatched using bulkWrite with ordered: false, which lets MongoDB apply the updates in any order (possibly in parallel) and continue past an individual failure rather than aborting the rest of the batch:

// updateEngagementScore.js 
if (bulkOps.length >= BATCH_SIZE) {
  await Post.bulkWrite(bulkOps, { ordered: false });
  updatedCount += bulkOps.length;
  bulkOps = [];
}

This processes 100,000 posts in 100 round trips to MongoDB instead of 100,000 individual updates.
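The two snippets above show the cursor and the flush condition separately. A minimal sketch of how they might fit together follows. It is not the original job: computeScore, the engagementScore field, the likesCount and viewsCount fields, and the fake model are all assumptions made so the example runs without a database. Note the final flush after the loop, which a size check alone would miss when the document count is not a multiple of the batch size.

```javascript
// Sketch of the cursor loop around the two snippets above.
// `model` stands in for the Mongoose Post model.
const BATCH_SIZE = 1000;

// Hypothetical scoring formula, NOT the one used in the original job.
const computeScore = (post) => post.stats.likesCount * 2 + post.stats.viewsCount;

async function recalculateScores(model, batchSize = BATCH_SIZE) {
  let bulkOps = [];
  let updatedCount = 0;

  // With Mongoose this would be model.find({}).select(...).lean().cursor()
  const cursor = model.cursor();

  for await (const post of cursor) {
    bulkOps.push({
      updateOne: {
        filter: { _id: post._id },
        update: { $set: { engagementScore: computeScore(post) } },
      },
    });

    if (bulkOps.length >= batchSize) {
      await model.bulkWrite(bulkOps, { ordered: false });
      updatedCount += bulkOps.length;
      bulkOps = [];
    }
  }

  // Flush the final partial batch left over after the loop ends.
  if (bulkOps.length > 0) {
    await model.bulkWrite(bulkOps, { ordered: false });
    updatedCount += bulkOps.length;
  }

  return updatedCount;
}

// In-memory stand-in so the sketch runs without a database.
function createFakeModel(docCount) {
  const calls = [];
  return {
    calls, // records the size of every bulkWrite call
    async *cursor() {
      for (let i = 0; i < docCount; i++) {
        yield { _id: i, stats: { likesCount: i, viewsCount: 1 } };
      }
    },
    async bulkWrite(ops) {
      calls.push(ops.length);
    },
  };
}
```

Running recalculateScores(createFakeModel(2500)) issues three bulk writes (1,000, 1,000, and 500 operations), while only one batch of operations is held in memory at a time.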

Scalable Schema Design

Performance is not only about query optimization. The schema design itself determines how well the system handles growth and viral events.

Social graph data lives in a dedicated Follow collection rather than as an embedded array on the User document. A user with 50,000 followers would create an enormous User document if followers were embedded, making every User read slow. The separate collection keeps User documents small and follow lookups indexed.

// Follow.js
followSchema.index({ followerId: 1, followingId: 1 }, { unique: true });

Similarly, all engagement actions (likes, dislikes, saves) live in a single polymorphic Interaction collection rather than per-type arrays on their target documents. A post that goes viral might receive 100,000 likes. Storing those as an embedded array would make the Post document unmanageable. The Interaction collection scales independently, and the unique compound index prevents duplicate votes at the database level, not just the application level.

// Interaction.js
interactionSchema.index({ userId: 1, targetId: 1, type: 1 }, { unique: true });
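Because the unique index rejects a second identical interaction, the application only has to recognize that rejection. MongoDB reports unique-index violations with error code 11000. The sketch below shows one way a handler might treat that as "already recorded" instead of a failure. The recordInteraction helper and the fake model are hypothetical, written so the example runs without a database.

```javascript
// MongoDB reports unique-index violations with error code 11000.
const DUPLICATE_KEY = 11000;

// `Interaction` is passed in so the sketch can run against a stand-in;
// `recordInteraction` is a hypothetical helper name.
async function recordInteraction(Interaction, { userId, targetId, type }) {
  try {
    await Interaction.create({ userId, targetId, type });
    return { created: true };
  } catch (err) {
    if (err && err.code === DUPLICATE_KEY) {
      // The same user already has this interaction on this target.
      return { created: false };
    }
    throw err; // anything else is a real failure
  }
}

// Stand-in that enforces the same uniqueness rule in memory.
function createFakeInteractionModel() {
  const seen = new Set();
  return {
    async create({ userId, targetId, type }) {
      const key = `${userId}:${targetId}:${type}`;
      if (seen.has(key)) {
        const err = new Error("E11000 duplicate key error");
        err.code = DUPLICATE_KEY;
        throw err;
      }
      seen.add(key);
    },
  };
}
```

With this shape, two concurrent "like" requests from the same user cannot both succeed, even if both pass an application-level existence check at the same moment.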

URL design also reflects scalability thinking. Post URLs follow the format /p/<title-slug>-<24-char-objectid>. The ObjectId is always the last segment and is the actual lookup key. The slug portion is cosmetic. This means the title can change without breaking links, and the database lookup is always a single match on the indexed _id field, never a scan or a text match on the title:

// post.js
const extractPostId = (slug) => {
  const parts = slug.split("-");
  const last = parts[parts.length - 1];
  return /^[a-f0-9]{24}$/i.test(last) ? last : null;
};
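A quick usage sketch, using the slug of this article's own URL as input. The helper is repeated so the example is self-contained; the renamed title and the invalid slug are made-up inputs.

```javascript
// Same helper as above, repeated so the example runs on its own.
const extractPostId = (slug) => {
  const parts = slug.split("-");
  const last = parts[parts.length - 1];
  return /^[a-f0-9]{24}$/i.test(last) ? last : null;
};

// Original slug: logs 6a19a10bc7d49a7b9a4eda21
console.log(extractPostId("wanna-learn-web-development-6a19a10bc7d49a7b9a4eda21"));

// Title changed, same post: logs 6a19a10bc7d49a7b9a4eda21
console.log(extractPostId("a-renamed-title-6a19a10bc7d49a7b9a4eda21"));

// No trailing 24-character hex segment: logs null
console.log(extractPostId("no-id-here"));
```

Returning null for a malformed slug lets the route respond with a 404 before any database call is made.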

The Results

After applying all of the above, the autocannon stress tests produced these numbers:

Condition	Pre-Optimization	Post-Optimization (Single Thread)
Simple (100 conn)	1,993 RPS	6,138 RPS (avg latency 15ms)
Stressed (500x10)	2,504 RPS	6,679 RPS (avg latency 353ms)

The Artillery real-world simulation ran three phases totaling 8,600 virtual user requests: a 30-second warm-up at 10 req/sec, a 60-second ramp from 10 to 100 req/sec, and a 10-second viral spike at 500 req/sec.

Metric	Value
Total requests	8,600
HTTP 200 responses	8,600
Failures	0
Overall median	1ms
p95	1ms
p99	2ms
Absolute max latency	18ms (cold cache, first window only)

The 18ms maximum in the entire test occurred in the very first window, when the cache was cold and the first request hit MongoDB directly. After that, the worst response across the entire 102-second test, including the 500 req/sec spike, was 5ms.
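The 8,600 total follows directly from the three phase definitions. The short calculation below reproduces it, approximating the ramp phase by its average arrival rate. The object shape is just for this example and is not Artillery's config format.

```javascript
// Expected request count per load-test phase.
// A ramp is approximated by its average arrival rate.
const phases = [
  { name: "warm-up", durationSec: 30, startRate: 10, endRate: 10 },
  { name: "ramp", durationSec: 60, startRate: 10, endRate: 100 },
  { name: "viral spike", durationSec: 10, startRate: 500, endRate: 500 },
];

const requestsIn = (p) => p.durationSec * ((p.startRate + p.endRate) / 2);

const totalRequests = phases.reduce((sum, p) => sum + requestsIn(p), 0);

// logs: warm-up: 300, ramp: 3300, viral spike: 5000
console.log(phases.map((p) => `${p.name}: ${requestsIn(p)}`).join(", "));

// logs: 8600
console.log(totalRequests);
```

More than half of the requests arrive in the final 10 seconds, which is why the spike phase is the real test of the cache.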

Cluster Mode: The Multiplier

Node.js runs JavaScript on a single thread. No matter how well the application code is optimized, a single process is effectively bounded by one CPU core. The Node.js cluster module solves this by forking one worker process per CPU core, all sharing the same port. By default, on every platform except Windows, the primary process accepts incoming connections and hands them to the workers round-robin.

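// index.js
const os = require("os");
const cluster = require("cluster");
const cron = require("node-cron"); // assumed: cron.schedule matches node-cron's API
// assumed path and export shape for the batch job shown earlier
const { updateEngagementScore } = require("./updateEngagementScore");

const NUM_WORKERS = os.cpus().length;

if (cluster.isPrimary) {
  // one worker per CPU core
  for (let i = 0; i < NUM_WORKERS; i++) {
    cluster.fork();
  }
  cluster.on("exit", (worker, code, signal) => {
    cluster.fork(); // auto-restart on crash
  });

  cron.schedule("*/15 * * * *", () => {
    updateEngagementScore(); // cron runs only in primary
  });
} else {
  require("./server"); // workers just run express
}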

The cron job runs only in the primary process to avoid N simultaneous engagement score recalculations. Each worker maintains its own MongoDB connection pool, which MongoDB Atlas handles without issue.

Cluster results under the same autocannon tests:

Condition	Single Thread	Cluster Mode
Simple (100 conn)	6,138 RPS	15,913 RPS (avg latency 5ms)
Stressed (500x10)	6,679 RPS	17,007 RPS (avg latency 209ms)

The more meaningful number is the error rate: under the stressed test, the single-thread run produced 1,000 timeouts. Cluster mode under identical conditions produced 70. That is roughly a 14x reduction in failures, which is what matters in production.

One important caveat: the in-memory cache in cacheService.js is per-process. Each worker has its own independent cache. Under cluster mode, the first request on each worker will be a cache miss even if another worker has already cached the result. With 8 cores, you could theoretically have 8 simultaneous cold cache DB hits for the same cache key. This is not a correctness problem, but it does reduce cache efficiency compared to a shared external cache. This is the primary reason Redis becomes compelling at scale, which is discussed next.

What Comes Next: Fastify, Redis, and Nginx

The current architecture is Express, Node.js cluster, MongoDB, and an in-memory cache. This is solid for early production but has well-understood ceilings.

Fastify is an alternative to Express with a significantly faster HTTP layer, though it is not a drop-in replacement. It uses a radix tree router instead of a linear scan, can serialize JSON responses via compiled schemas instead of a generic JSON.stringify call when response schemas are declared, and has lower per-request overhead across the board. Published benchmarks often show Fastify handling 30 to 50 percent more requests per second than Express on identical hardware for equivalent workloads. Migrating to Fastify would require porting route and middleware definitions but would not change any MongoDB or business logic.

Redis solves the per-process cache problem. Moving the cacheService from an in-memory Map to Redis means all workers share a single cache. A cache miss on worker 3 populates an entry that worker 7 will then hit. This also unlocks persistence across restarts, distributed invalidation across multiple server instances, and access to Redis data structures like sorted sets, which could power the engagement score ranking query directly instead of running it as a cron job.
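A minimal sketch of what a shared read-through cache might look like. The createSharedCache and getOrSet names are hypothetical and do not come from cacheService.js. The client is assumed to expose get(key) and set(key, value, { EX: seconds }) the way node-redis v4 does; other Redis clients use different signatures. A fake in-memory client is included so the example runs without a Redis server.

```javascript
// Sketch of a shared cache with a read-through shape.
// `client` is assumed to follow node-redis v4: get(key) resolves to a
// string or null, set(key, value, { EX: seconds }) stores with a TTL.
function createSharedCache(client) {
  return {
    async getOrSet(key, ttlSeconds, loader) {
      const cached = await client.get(key);
      if (cached !== null) {
        return JSON.parse(cached); // hit: any worker may have stored this
      }
      const fresh = await loader(); // miss: run the expensive query
      await client.set(key, JSON.stringify(fresh), { EX: ttlSeconds });
      return fresh;
    },
  };
}

// In-memory stand-in for the Redis client so the sketch runs as-is.
// It ignores expiry; a real Redis server enforces the EX option.
function createFakeRedisClient() {
  const store = new Map();
  return {
    async get(key) {
      return store.has(key) ? store.get(key) : null;
    },
    async set(key, value) {
      store.set(key, value);
      return "OK";
    },
  };
}
```

If two workers each call createSharedCache with a client pointing at the same Redis server, the second worker's getOrSet for a given key returns the first worker's result without running the loader again. Values pass through JSON, so Dates and ObjectIds come back as strings.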

Nginx sits in front of Node.js and handles concerns that Node.js is not well-suited for: TLS termination, static file serving, connection rate limiting, request buffering, and load balancing across multiple server instances. Offloading TLS termination to Nginx alone reduces CPU load on Node.js meaningfully under HTTPS traffic.

The target architecture looks like this:

                         CLIENT
                           |
                           | HTTPS
                           v
              +------------------------+
              |         NGINX          |
              |  TLS termination       |
              |  Static file serving   |
              |  Rate limiting         |
              |  Load balancing        |
              +------------------------+
                    |           |
               HTTP/1.1    HTTP/1.1
                    |           |
         +----------+           +----------+
         |  Server Instance 1   |  Server Instance 2   (horizontal scaling)
         |                      |
         |  +----------------+  |  +----------------+
         |  | Node Cluster   |  |  | Node Cluster   |
         |  |                |  |  |                |
         |  | W1  W2  W3 W4  |  |  | W1  W2  W3 W4  |
         |  | (Fastify)      |  |  | (Fastify)      |
         |  +----------------+  |  +----------------+
         |         |            |         |
         +---------+------------+---------+
                         |
          +--------------+--------------+
          |                             |
    +----------+                  +----------+
    |  REDIS   |                  | MongoDB  |
    |          |                  | (Atlas   |
    | Shared   |                  |  Replica |
    | Cache    |                  |  Set)    |
    | Pub/Sub  |                  |          |
    | Sessions |                  | Primary  |
    +----------+                  | Secondary|
                                  | Secondary|
                                  +----------+

In this architecture, Nginx terminates all TLS connections and distributes traffic across multiple server instances. Each instance runs a Node.js cluster with Fastify workers. All workers on all instances share a single Redis instance for caching, session data, and pub/sub notifications. MongoDB runs as a replica set, enabling read scaling by routing read queries to secondary nodes while writes go to the primary. Reads from secondaries can lag slightly behind the primary, so this suits feeds and listings better than read-your-own-write paths.

The path from the current setup to this architecture is incremental. Redis can be introduced first by replacing the CacheService class with a Redis-backed equivalent without changing any route code. Nginx can be added as a reverse proxy with minimal configuration. Fastify migration can happen route by route. None of these changes require schema redesign or data migration.

Conclusion

The optimization journey on OpenCanvas demonstrates that the most impactful performance improvements come from rethinking data shape and query patterns rather than from infrastructure changes. Removing .populate() through denormalization, reducing payload size through field selection and content previews, eliminating skip-based pagination with cursor queries, and adding a small in-memory cache together produced a 3x improvement in RPS and a 100x improvement in tail latency on the most trafficked route.

The database scale, 400,000 users, 100,000 posts, and over a million secondary documents, was not a liability. It was the condition under which these optimizations were proven. A single Node.js thread, with no special hardware, handled 8,600 virtual user requests spanning a warm-up, a ramp-up, and a 500 req/sec viral spike with zero failures and a p99 latency of 2 milliseconds.

This is the ceiling of what application-level optimization can achieve. The architecture described in the final section is what pushes the ceiling higher without changing a line of business logic.