Wallace Freitas

System Design Terminologies: Key Concepts Every Engineer Should Know

System design is an essential skill for software developers, particularly those working on large-scale applications. Whether you're building distributed systems or preparing for an interview, knowing the terminology of system design is crucial. In this article, we'll walk through the fundamental terms and concepts that underpin system design. Understanding them will make design discussions easier, help you build scalable architectures, and sharpen your overall design skills.

1. Scalability

Scalability refers to a system's ability to handle increased load without compromising performance. Systems can scale in two ways:

๐Ÿ‘‰๐Ÿป Vertical Scaling (Scaling Up): Increasing the resources (CPU, RAM, storage) of a single machine to handle more load. However, this has hardware limitations.

๐Ÿ‘‰๐Ÿป Horizontal Scaling (Scaling Out): Adding more machines or nodes to distribute the load. This is often preferred in distributed systems for better fault tolerance and capacity.

Example: Scaling an API horizontally by adding multiple servers behind a load balancer to distribute incoming requests.

2. Load Balancer

A load balancer is a component that distributes incoming network traffic across multiple servers to ensure that no single server is overwhelmed. Load balancers can also detect unhealthy servers and reroute traffic to healthy ones.

There are two main types of load balancers:

๐Ÿ‘‰๐Ÿป Layer 4 (Transport Layer): Operates at the TCP/UDP level.

๐Ÿ‘‰๐Ÿป Layer 7 (Application Layer): Operates at the HTTP level and can make routing decisions based on content.

Example: Nginx and AWS Elastic Load Balancer (ELB) are popular load balancing solutions.
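To make the idea concrete, here is a minimal round-robin balancer sketch in Python with a simple health-check hook. The backend addresses are made-up placeholders; real load balancers like Nginx or ELB do this at the network level:

```python
import itertools

class RoundRobinBalancer:
    def __init__(self, backends):
        self.backends = backends
        self.healthy = set(backends)           # assume all healthy at start
        self._cycle = itertools.cycle(backends)

    def mark_unhealthy(self, backend):
        # A health checker would call this when a server stops responding.
        self.healthy.discard(backend)

    def next_backend(self):
        # Skip over unhealthy servers, rerouting traffic to healthy ones.
        for _ in range(len(self.backends)):
            backend = next(self._cycle)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends available")

lb = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
lb.mark_unhealthy("10.0.0.2:8080")
print([lb.next_backend() for _ in range(4)])  # only healthy servers appear
```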

3. Caching

Caching is the process of storing copies of frequently accessed data in a temporary storage location (cache) to reduce latency and load on the database or backend. There are two main types of caching:

๐Ÿ‘‰๐Ÿป Client-Side Caching: Data is cached on the client (e.g., browser cache).

๐Ÿ‘‰๐Ÿป Server-Side Caching: Data is cached on the server, often using services like Redis or Memcached.

Example: Caching user profile data in Redis to reduce repeated database queries when the same user logs in frequently.
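Here's a minimal sketch of that pattern using the redis-py client, assuming a Redis instance on localhost; `query_database` is a hypothetical stand-in for the real database call:

```python
import json
import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379)

def query_database(user_id):
    # Hypothetical placeholder for a real database query.
    return {"id": user_id, "name": "Ada"}

def get_user_profile(user_id):
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)           # cache hit: skip the database
    profile = query_database(user_id)       # cache miss: fetch from database
    r.setex(key, 300, json.dumps(profile))  # cache for 5 minutes
    return profile
```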

4. Database Sharding

Sharding is a database partitioning technique where large datasets are split into smaller, more manageable pieces called shards. Each shard holds a subset of the data and can be distributed across different machines to improve performance and scalability.

๐Ÿ‘‰๐Ÿป Horizontal Sharding: Dividing the data across multiple databases based on a shard key.

๐Ÿ‘‰๐Ÿป Vertical Sharding: Splitting tables into different databases based on table columns.

Example: Splitting a user database into multiple shards based on user location (e.g., users from North America on one shard, Europe on another).
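A minimal sketch of how a shard key routes queries, covering both hash-based and location-based routing; the connection strings are placeholders:

```python
# Hypothetical shard connection strings.
SHARDS = ["postgres://shard0.example.com/users",
          "postgres://shard1.example.com/users",
          "postgres://shard2.example.com/users"]

def shard_by_hash(user_id: int) -> str:
    # Hash-based sharding: spread users evenly across shards.
    return SHARDS[user_id % len(SHARDS)]

REGION_SHARDS = {"NA": SHARDS[0], "EU": SHARDS[1]}

def shard_by_region(region: str) -> str:
    # Location-based sharding, as in the example above.
    return REGION_SHARDS[region]

print(shard_by_hash(42))       # routes by user ID
print(shard_by_region("EU"))   # routes by user location
```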

5. Consistency, Availability, and Partition Tolerance (CAP Theorem)

The CAP Theorem states that a distributed system cannot guarantee all three of the following properties at the same time:

๐Ÿ‘‰๐Ÿป Consistency: All nodes see the same data at the same time.
Availability: Every request receives a response, even if some nodes are down.

๐Ÿ‘‰๐Ÿป Partition Tolerance: The system continues to function despite network partitions.

Depending on the system's requirements, trade-offs must be made between these three properties. For example, a system might favor Availability and Partition Tolerance over strict Consistency.

Example: Distributed databases like Cassandra and DynamoDB provide high availability and partition tolerance but may relax consistency.

6. Latency

Latency is the time it takes for a request to travel from the client to the server and for the server to respond. Lower latency improves user experience, especially in real-time applications like video streaming, gaming, or stock trading platforms.

๐Ÿ‘‰๐Ÿป Network Latency: Time taken by data to travel across the network.

๐Ÿ‘‰๐Ÿป Processing Latency: Time taken by the server to process a request.

Example: Reducing latency by deploying servers closer to users through Content Delivery Networks (CDNs).
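As a simple illustration, here's a minimal Python sketch that measures end-to-end latency from the client's point of view using only the standard library:

```python
import time
import urllib.request

def measure_latency(url):
    # Measure round-trip time: request out, full response body back.
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000  # milliseconds

print(f"{measure_latency('https://example.com'):.1f} ms")
```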

7. Throughput

Throughput refers to the number of requests or transactions a system can process in a given period. It's often measured in requests per second (RPS) or transactions per second (TPS). High throughput systems can handle more requests concurrently.

Example: An e-commerce website can increase throughput by using asynchronous processing for non-critical tasks, such as sending confirmation emails after a purchase.
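Here's a minimal sketch of that idea using an in-process queue and a background worker; `send_confirmation_email` is a hypothetical placeholder for a real email call:

```python
import queue
import threading

email_queue = queue.Queue()

def send_confirmation_email(order_id):
    pass  # placeholder for a slow SMTP or email-API call

def email_worker():
    # Background worker drains the queue off the request path.
    while True:
        order_id = email_queue.get()
        send_confirmation_email(order_id)
        email_queue.task_done()

def handle_purchase(order_id):
    # ...charge the customer, record the order (the critical path)...
    email_queue.put(order_id)  # non-critical work deferred to the worker
    return {"status": "ok", "order_id": order_id}

threading.Thread(target=email_worker, daemon=True).start()
print(handle_purchase(42))  # returns immediately; email is sent later
```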

8. Fault Tolerance

Fault tolerance is the ability of a system to continue operating correctly even in the event of a failure. Fault-tolerant systems are designed to detect and handle failures gracefully without downtime.

๐Ÿ‘‰๐Ÿป Redundancy: Duplicating critical components (e.g., multiple servers, databases).

๐Ÿ‘‰๐Ÿป Failover: Automatically switching to a backup component when a primary component fails.

Example: A database cluster with automatic failover to a replica when the primary database goes down.
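A minimal failover sketch: try the primary first, then fall back to the replica. The hostnames and `connect_to` function are hypothetical placeholders:

```python
DATABASES = ["primary.db.example.com", "replica.db.example.com"]

def connect_to(host):
    if host.startswith("primary"):
        raise ConnectionError(f"{host} is down")  # simulate primary failure
    return f"connection to {host}"

def get_connection():
    last_error = None
    for host in DATABASES:          # redundancy: more than one copy exists
        try:
            return connect_to(host)  # failover: fall through to the backup
        except ConnectionError as err:
            last_error = err
    raise last_error

print(get_connection())  # connection to replica.db.example.com
```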

9. Eventual Consistency

In distributed systems, eventual consistency ensures that all nodes will eventually converge to the same data state, even if they aren't immediately consistent. This is common in systems that prioritize availability over consistency (as per the CAP Theorem).

Example: In Amazon's DynamoDB, when you update a record, it may take some time before the changes are visible across all nodes, but eventually, all nodes will reflect the updated data.
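As a toy illustration of convergence, here's a minimal last-write-wins sketch; real systems like DynamoDB use far more sophisticated replication and conflict resolution:

```python
import time

class Replica:
    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value):
        self.data[key] = (time.time(), value)

    def merge(self, other):
        # Keep whichever write is newer; after enough merges, replicas agree.
        for key, (ts, value) in other.data.items():
            if key not in self.data or self.data[key][0] < ts:
                self.data[key] = (ts, value)

a, b = Replica(), Replica()
a.write("user:1", "Ada")      # update lands on replica A first
print(b.data.get("user:1"))   # None: B hasn't seen it yet (stale read)
b.merge(a)                    # background sync pass
print(b.data.get("user:1"))   # now B reflects the update
```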

10. Microservices

A microservices architecture involves breaking down a large application into smaller, independent services that can be developed, deployed, and scaled individually. Each microservice typically has its own database and handles a specific business function.

Example: An e-commerce system may have separate microservices for user management, order processing, and payment handling.
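As a rough sketch of service independence, here's a hypothetical order service built with Flask that owns its own data and calls a separate payment service over HTTP. The hostnames, ports, `/charge` endpoint, and `amount` field are all assumptions:

```python
from flask import Flask, request, jsonify  # pip install flask
import requests                             # pip install requests

app = Flask(__name__)
orders = {}  # each microservice owns its own data store (in-memory here)

@app.route("/orders", methods=["POST"])
def create_order():
    order = request.get_json()
    order_id = len(orders) + 1
    orders[order_id] = order
    # Call the independent payment microservice over HTTP (hypothetical URL).
    resp = requests.post("http://payment-service:5001/charge",
                         json={"order_id": order_id, "amount": order["amount"]})
    return jsonify({"order_id": order_id,
                    "payment_status": resp.json()["status"]})

if __name__ == "__main__":
    app.run(port=5000)  # deployed and scaled independently of other services
```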

11. Rate Limiting

Rate limiting controls the number of requests a client can make to a server within a given timeframe. It helps protect the server from being overwhelmed by too many requests and can prevent abuse or DoS attacks.

Example: Limiting API users to 100 requests per minute to prevent excessive load on the server.
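Here's a minimal in-memory sliding-window sketch enforcing that 100-requests-per-minute policy; a production limiter would typically live in a shared store like Redis:

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS = 100
request_log = defaultdict(deque)  # client_id -> timestamps of recent requests

def allow_request(client_id):
    now = time.time()
    log = request_log[client_id]
    while log and log[0] <= now - WINDOW_SECONDS:
        log.popleft()    # drop timestamps that fell outside the window
    if len(log) >= MAX_REQUESTS:
        return False     # over the limit: reject (e.g., with HTTP 429)
    log.append(now)
    return True
```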

12. Message Queue

A message queue is a communication mechanism used to enable asynchronous communication between different components or services in a system. Messages are sent to the queue, where they can be processed by the recipient at a later time, decoupling the sender and receiver.

Example: RabbitMQ and Apache Kafka are commonly used message queues that help distribute tasks in distributed systems.
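As a minimal sketch of the producer/consumer flow using RabbitMQ's pika client (assuming a broker running on localhost):

```python
import pika  # pip install pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="tasks")

# Producer: enqueue work and return immediately, decoupled from the consumer.
channel.basic_publish(exchange="", routing_key="tasks", body="resize image 42")

# Consumer: pick up messages whenever it is ready.
def handle(ch, method, properties, body):
    print(f"processing: {body.decode()}")

channel.basic_consume(queue="tasks", on_message_callback=handle, auto_ack=True)
channel.start_consuming()  # blocks; normally runs in a separate worker process
```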

Understanding these key system design terminologies is essential for building robust, scalable, and fault-tolerant systems. Whether you're designing distributed systems, microservices, or handling high-traffic applications, knowing these concepts will help you make informed decisions and improve your overall design skills.
