Managing millions of products with their countless variations is one of e-commerce's biggest headaches. A poorly designed catalog system forces you to duplicate data, explodes your database size, and makes updates painfully slow. Today we're exploring how to architect a product catalog service that stays lean and performant, even when handling SKUs in the millions.
Architecture Overview
A robust product catalog system separates concerns into distinct layers. At its core, you have a product service that manages the base product data (name, description, pricing, images), a variant service that tracks attribute combinations without duplicating entire product records, and a search and filtering layer built on technologies like Elasticsearch for real-time querying across millions of items. These services communicate through well-defined APIs and share data through a central cache layer, typically Redis, to minimize database hits during peak traffic.
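To make the cache layer concrete, here is a minimal cache-aside sketch. It is an illustration under stated assumptions, not the article's actual implementation: a plain dict stands in for Redis, and `ProductCache`, `FAKE_DB`, and the field names are hypothetical.

```python
from typing import Callable, Dict, Optional


class ProductCache:
    """Cache-aside reader. A plain dict stands in for Redis in this sketch."""

    def __init__(self, load_from_db: Callable[[str], Optional[dict]]):
        self._load_from_db = load_from_db
        self._store: Dict[str, dict] = {}
        self.db_hits = 0  # counts how often we fall through to the database

    def get_product(self, product_id: str) -> Optional[dict]:
        cached = self._store.get(product_id)
        if cached is not None:
            return cached
        # Cache miss: read from the database and remember the result.
        self.db_hits += 1
        product = self._load_from_db(product_id)
        if product is not None:
            self._store[product_id] = product
        return product

    def invalidate(self, product_id: str) -> None:
        """Drop a cached entry, e.g. after the product is updated."""
        self._store.pop(product_id, None)


# Hypothetical stand-in for the primary database.
FAKE_DB = {"p-1": {"name": "Oxford Shirt", "base_price_cents": 4900}}

cache = ProductCache(FAKE_DB.get)
first = cache.get_product("p-1")   # miss: loads from the database
second = cache.get_product("p-1")  # hit: served from the cache
```

In a real deployment the dict would be replaced by a Redis client and entries would usually carry an expiry, but the read path has the same shape.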
The system also needs to handle bulk imports efficiently. A dedicated import service processes CSV or JSON files asynchronously, validating data and batching inserts to avoid overwhelming the primary database. This keeps your main product API responsive while background workers handle the heavy lifting. Categories and hierarchies are stored in a separate tree structure, allowing for flexible navigation and faceted search without embedding category data in every product record.
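The validate-then-batch step of the import service could look roughly like this. It is a sketch under assumptions: the column names, the `insert_batch` callback, and the batch size are all hypothetical, and the queue or worker wiring that would make it asynchronous is left out.

```python
import csv
import io
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

REQUIRED_FIELDS = ("product_id", "name", "base_price")  # assumed columns


def validate(row: Dict[str, str]) -> bool:
    """Accept a row only if required fields are non-empty and the price is a non-negative number."""
    if any(not (row.get(field) or "").strip() for field in REQUIRED_FIELDS):
        return False
    try:
        return float(row["base_price"]) >= 0
    except ValueError:
        return False


def in_batches(rows: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Group rows into lists of at most `size` items."""
    batch: List[dict] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def import_csv(text: str, insert_batch: Callable[[List[dict]], None],
               batch_size: int = 500) -> Tuple[int, int]:
    """Stream rows, skip invalid ones, and hand valid rows to insert_batch in groups."""
    accepted = 0
    rejected = 0

    def valid_rows() -> Iterator[dict]:
        nonlocal rejected
        for row in csv.DictReader(io.StringIO(text)):
            if validate(row):
                yield row
            else:
                rejected += 1

    for batch in in_batches(valid_rows(), batch_size):
        insert_batch(batch)  # one database round trip per batch
        accepted += len(batch)
    return accepted, rejected


SAMPLE = """product_id,name,base_price
p-1,Oxford Shirt,49.00
p-2,,19.00
p-3,Chino,not-a-number
p-4,Belt,25.50
p-5,Socks,6.00
"""

batches: List[List[dict]] = []
result = import_csv(SAMPLE, batches.append, batch_size=2)
print(result)  # (3, 2): three rows accepted, two rejected
```

Here `batches.append` plays the role of the database writer, so you can inspect exactly what would have been inserted.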
The data model is the real differentiator here. Rather than creating a full product record for every size-color combination, the architecture uses a composition pattern. The base product record contains shared attributes like brand, material, and base price. Variant attributes like size and color are stored as metadata, each with its own allowed values, cardinality, and constraints. SKU generation then becomes a matter of combining these attributes algorithmically, so the set of possible combinations is derived from the attribute definitions rather than stored as duplicated product rows.
Design Insight: Handling Variants Smartly
Here's the key insight that makes this architecture elegant: variants are defined as attribute combinations, not as separate product records. When you have a shirt in five sizes (small through XXL) and five colors, you don't create 25 product entries. Instead, you define the product once, list size and color as variant attributes with their allowed values, and generate SKUs dynamically as combinations. Shared fields are stored once per product instead of once per SKU, which cuts storage substantially for products with many variants and simplifies inventory management.
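The shirt example can be sketched in a few lines. This is one possible shape, not a prescribed schema: the `Product` class, the SKU format, and the attribute values are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product as cartesian_product
from typing import Dict, Iterator, List


@dataclass
class Product:
    product_id: str
    name: str
    base_price_cents: int
    # Variant attribute name -> allowed values, in a fixed order.
    variant_attributes: Dict[str, List[str]]


def sku_for(product: Product, choice: Dict[str, str]) -> str:
    """Build a SKU from one chosen value per variant attribute (hypothetical format)."""
    parts = [product.product_id]
    for attr, allowed in product.variant_attributes.items():
        value = choice[attr]
        if value not in allowed:
            raise ValueError(f"{value!r} is not an allowed {attr}")
        parts.append(value.upper().replace(" ", ""))
    return "-".join(parts)


def iter_skus(product: Product) -> Iterator[str]:
    """Lazily yield every SKU implied by the attribute definitions."""
    names = list(product.variant_attributes)
    for values in cartesian_product(*product.variant_attributes.values()):
        yield sku_for(product, dict(zip(names, values)))


shirt = Product(
    product_id="SHIRT01",
    name="Oxford Shirt",
    base_price_cents=4900,
    variant_attributes={
        "size": ["S", "M", "L", "XL", "XXL"],
        "color": ["red", "blue", "green", "black", "white"],
    },
)

all_skus = list(iter_skus(shirt))
print(len(all_skus))                                   # 25
print(sku_for(shirt, {"color": "red", "size": "L"}))   # SHIRT01-L-RED
```

The product is stored once, and the 25 SKUs are computed from two short lists of allowed values. Because `iter_skus` is a generator, nothing forces you to materialize every combination at once.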
The variant service maintains a mapping from each SKU to its specific combination of attribute values, plus variant-specific data like images, pricing overrides, and stock levels. When a customer searches for red shirts in size large, the search layer filters on those attribute values and returns the matching SKUs. Updates are simpler too: your inventory team can adjust stock for a specific SKU without touching the base product record. This separation of concerns is what allows the system to scale from thousands to millions of SKUs without breaking a sweat.
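A rough sketch of that SKU mapping follows, assuming hypothetical names throughout (`Variant`, `VariantService`, prices in cents). The `find` method is a linear scan standing in for the search layer, which in a real system would be an index rather than a loop.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Variant:
    sku: str
    product_id: str
    attributes: Dict[str, str]
    stock: int = 0
    price_override_cents: Optional[int] = None  # None means "use the base price"


class VariantService:
    def __init__(self, base_prices: Dict[str, int]):
        self._base_prices = base_prices          # product_id -> base price in cents
        self._variants: Dict[str, Variant] = {}  # sku -> variant record

    def add(self, variant: Variant) -> None:
        self._variants[variant.sku] = variant

    def price_cents(self, sku: str) -> int:
        """Return the override if one is set, otherwise the product's base price."""
        variant = self._variants[sku]
        if variant.price_override_cents is not None:
            return variant.price_override_cents
        return self._base_prices[variant.product_id]

    def adjust_stock(self, sku: str, delta: int) -> int:
        """Change stock for one SKU; the base product record is never touched."""
        variant = self._variants[sku]
        new_stock = variant.stock + delta
        if new_stock < 0:
            raise ValueError(f"stock for {sku} cannot go below zero")
        variant.stock = new_stock
        return new_stock

    def find(self, **wanted: str) -> List[str]:
        """Return SKUs whose attributes match every requested value."""
        return sorted(
            sku for sku, variant in self._variants.items()
            if all(variant.attributes.get(k) == v for k, v in wanted.items())
        )


service = VariantService(base_prices={"SHIRT01": 4900})
service.add(Variant("SHIRT01-L-RED", "SHIRT01", {"size": "L", "color": "red"}, stock=12))
service.add(Variant("SHIRT01-XXL-RED", "SHIRT01", {"size": "XXL", "color": "red"},
                    stock=3, price_override_cents=5400))

remaining = service.adjust_stock("SHIRT01-L-RED", -2)
red_large = service.find(color="red", size="L")
```

Only the XXL variant carries a price override, so every other SKU inherits the base price without storing its own copy.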
Watch the Full Design Process
Want to see how this architecture comes together in real time? We used InfraSketch to visualize the entire system design, showing component interactions and data flows as a complete diagram. You can watch the process unfold here:
Try It Yourself
The best way to understand system design is to build one yourself. Head over to InfraSketch and describe your system in plain English. In seconds, you'll have a professional architecture diagram, complete with a design document. Whether you're designing a catalog service or tackling another complex system, this approach removes the friction from moving ideas to diagrams.
This is Day 19 of our 365-day system design challenge. Ready to level up your architecture skills?