SQL vs. NoSQL in System Design

#architecture #database #sql #systemdesign

Foundations of SQL Databases

SQL databases, also known as relational databases, organize data into structured tables with predefined schemas. Each table consists of rows and columns, where columns enforce specific data types and constraints. Relationships between tables are established through foreign keys, enabling complex joins to retrieve interconnected data efficiently. This model follows the principles of normalization to minimize data redundancy and ensure data integrity.

The core strength of SQL lies in its adherence to ACID properties: Atomicity, Consistency, Isolation, and Durability. These guarantees make SQL ideal for applications requiring strict transactional consistency, such as financial systems or e-commerce platforms where partial failures cannot occur.

A complete SQL schema for an e-commerce system illustrates this structure. Consider the following full code snippet using PostgreSQL syntax:

-- Complete SQL schema for e-commerce system
CREATE TABLE users (
    user_id SERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    name VARCHAR(200) NOT NULL,
    description TEXT,
    price DECIMAL(10,2) NOT NULL,
    stock_quantity INTEGER DEFAULT 0,
    category_id INTEGER REFERENCES categories(category_id)
);

CREATE TABLE categories (
    category_id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(user_id) ON DELETE CASCADE,
    order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    total_amount DECIMAL(10,2) NOT NULL,
    status VARCHAR(20) DEFAULT 'pending'
);

CREATE TABLE order_items (
    order_item_id SERIAL PRIMARY KEY,
    order_id INTEGER REFERENCES orders(order_id) ON DELETE CASCADE,
    product_id INTEGER REFERENCES products(product_id),
    quantity INTEGER NOT NULL,
    price_at_purchase DECIMAL(10,2) NOT NULL
);

-- Example transaction ensuring ACID compliance
BEGIN;
INSERT INTO orders (user_id, total_amount, status) 
VALUES (1, 99.99, 'pending');
INSERT INTO order_items (order_id, product_id, quantity, price_at_purchase) 
VALUES (currval('orders_order_id_seq'), 5, 2, 49.99);
UPDATE products SET stock_quantity = stock_quantity - 2 WHERE product_id = 5;
COMMIT;

This structure enforces referential integrity through foreign keys and cascading deletes. The transaction block guarantees that either all operations succeed or none do, maintaining consistency even under concurrent access. In system design, such SQL setups are typically deployed with master-slave replication for read scalability and vertical scaling by upgrading hardware on a single server.

Foundations of NoSQL Databases

NoSQL databases, or non-relational databases, reject the rigid table structure in favor of flexible data models. They store data in formats such as documents, key-value pairs, wide-column stores, or graphs. Schemas are often dynamic or schema-less, allowing applications to evolve without downtime for migrations. This flexibility prioritizes scalability and performance over strict consistency.

NoSQL systems typically follow BASE principles: Basically Available, Soft state, and Eventual consistency. They excel in horizontal scaling across distributed clusters, making them suitable for high-throughput applications like social media feeds, real-time analytics, or content management systems.

NoSQL encompasses four primary types:

Document stores (e.g., MongoDB) store self-contained JSON-like documents.
Key-value stores (e.g., Redis) provide ultra-fast lookups.
Column-family stores (e.g., Cassandra) handle massive sparse data.
Graph databases (e.g., Neo4j) optimize for relationship-heavy queries.

A complete NoSQL implementation example uses MongoDB for the same e-commerce scenario. The following full code snippet demonstrates document-based storage and operations:

// Complete MongoDB setup and operations for e-commerce system
const { MongoClient } = require('mongodb');
const uri = "mongodb://localhost:27017";
const client = new MongoClient(uri);

async function run() {
  try {
    await client.connect();
    const database = client.db("ecommerce");

    // Collections are created implicitly on first insert
    const usersCollection = database.collection("users");
    const productsCollection = database.collection("products");
    const ordersCollection = database.collection("orders");

    // Insert a user document (schema-less)
    const newUser = {
      username: "johndoe",
      email: "john@example.com",
      passwordHash: "hashedpassword123",
      createdAt: new Date(),
      addresses: [  // Embedded array for flexibility
        { street: "123 Main St", city: "Nairobi", country: "Kenya" }
      ]
    };
    const userResult = await usersCollection.insertOne(newUser);

    // Insert product with dynamic fields
    const newProduct = {
      name: "Wireless Headphones",
      description: "Noise-cancelling over-ear",
      price: 99.99,
      stockQuantity: 50,
      category: "Electronics",
      tags: ["audio", "wireless"]  // Flexible array
    };
    await productsCollection.insertOne(newProduct);

    // Complete order as a single document with embedded items (denormalized for speed)
    const newOrder = {
      userId: userResult.insertedId,
      orderDate: new Date(),
      totalAmount: 99.99,
      status: "pending",
      items: [
        {
          productId: "productObjectIdHere",
          name: "Wireless Headphones",
          quantity: 1,
          priceAtPurchase: 99.99
        }
      ],
      // Embedded shipping info without separate table
      shippingAddress: {
        street: "123 Main St",
        city: "Nairobi",
        country: "Kenya"
      }
    };
    await ordersCollection.insertOne(newOrder);

    // Query example with aggregation for analytics
    const analytics = await ordersCollection.aggregate([
      { $match: { status: "completed" } },
      { $group: { _id: null, totalRevenue: { $sum: "$totalAmount" } } }
    ]).toArray();

    console.log("Analytics:", analytics);
  } finally {
    await client.close();
  }
}
run().catch(console.dir);

This code demonstrates denormalization by embedding related data directly into documents, eliminating the need for joins. Operations occur atomically within a single document, and the system scales horizontally by adding more nodes to the replica set or sharded cluster. In system design, MongoDB would use consistent hashing for data partitioning and replication for high availability.

Comparative Analysis in System Design

SQL and NoSQL differ fundamentally in how they handle data, scale, and guarantee consistency, directly impacting architectural decisions.

Aspect	SQL Databases	NoSQL Databases
Data Model	Relational tables with fixed schema	Flexible documents, key-value, etc.
Schema	Rigid and enforced	Dynamic or schema-less
Query Language	Standardized SQL with joins	Database-specific (e.g., MongoDB Query Language)
Consistency	ACID (strong guarantees)	BASE (eventual consistency)
Scalability	Vertical scaling preferred	Horizontal scaling across clusters
Use Case Fit	Transactions, complex relationships	High volume, unstructured data, real-time

SQL shines when data integrity and complex queries are paramount. NoSQL excels when velocity, variety, and volume dominate, as in big data pipelines or global user bases.

In a distributed system design, a hybrid approach often emerges. User authentication and financial transactions might reside in PostgreSQL for ACID compliance, while user activity logs and product recommendations use Cassandra or MongoDB for horizontal scaling and eventual consistency. Data partitioning via sharding in NoSQL contrasts with partitioning strategies in SQL that rely more on read replicas.

Practical Implementation in Distributed Systems

System designers must evaluate trade-offs when selecting databases. For a high-traffic social platform, SQL might handle user profiles with strong consistency via two-phase commits in rare cross-service transactions. NoSQL would manage feeds using event-driven architecture with message queues publishing changes for eventual propagation.

Leader election and consensus algorithms like Raft ensure NoSQL clusters remain available during node failures. SQL clusters often employ multi-master replication with careful conflict resolution.

The choice influences microservices architecture: each service owns its database, enforcing database-per-service patterns. API gateways route requests, while circuit breakers and retry mechanisms handle transient failures across database boundaries.

This comprehensive understanding equips system designers to architect resilient, scalable solutions tailored to specific requirements.

To help visualize the concepts discussed, here is one complete image:

System Design Handbook

To master these concepts and many more, purchase the complete System Design Handbook at https://codewithdhanian.gumroad.com/l/ntmcf.

Buy me coffee to support my content at: https://ko-fi.com/codewithdhanian