System Design study guide / interview prep for me
Content here is not original. It is a summary of the following sources:
- freeCodeCamp - System Design Interview Question (really comprehensive, 20/10, would recommend you give it a read)
- Why load balancing
- server selection strategy
- round robin, weighted round robin
- load based server selection
- IP hashing based selection
- Adding servers / handling failing servers
- CAP theorem
- Scaling: vertical VS horizontal
- Relational Database
- Non-relational database
- NoSQL VS relational
- Replication & Sharding
- Use cases
- protecting against DoS
- pub-sub Messaging / MQs
- Use case / purpose
- pub/sub messaging
- comparison to DB
References for getting started
- IGotAnOffer - Database: system design interview concepts
- freeCodeCamp - System Design interview concepts
- medium - 25 java software design questions

### HTTP Methods
- GET: retrieve resources
- POST: submit new data
- PUT: update / replace existing data
- DELETE: removes data
- PATCH: modify resource. only contains changes, not the whole resource
- OPTIONS: requests available communication methods
- HEAD: request resource metadata without retrieving the full resource representation
### HTTP Status Codes

- 400 Bad Request
- client side input fail validation
- 401 Unauthorized
- user not authenticated
- 403 Forbidden
- user authenticated but not privileged to access resources
- 404 Not Found
- resource not found
- 500 Internal Server Error
- generic server error. shouldn't be thrown explicitly.
- 502 Bad Gateway
- invalid response from upstream server
- 503 Service Unavailable
- something unexpected happened on the server side (server overload, some parts failing). reference: SO
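These codes are standardized, so a quick way to sanity-check them is Python's standard library, which enumerates them in `http.HTTPStatus` (nothing here beyond the stdlib):

```python
from http import HTTPStatus

# The status codes above are standardized; the stdlib enumerates them.
assert HTTPStatus.BAD_REQUEST == 400
assert HTTPStatus.UNAUTHORIZED == 401
assert HTTPStatus.FORBIDDEN == 403
assert HTTPStatus.NOT_FOUND == 404
assert HTTPStatus.INTERNAL_SERVER_ERROR == 500
assert HTTPStatus.BAD_GATEWAY == 502

# Each member also carries the human-readable reason phrase.
print(HTTPStatus.SERVICE_UNAVAILABLE.value, HTTPStatus.SERVICE_UNAVAILABLE.phrase)
```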
Disk is persistent. (when power is off, data will "persist").
RAM is transient. When power is lost, memory is wiped.
If data is useful only during a session of the server, then keep it in memory.
Keeping it in RAM is much faster and less expensive than writing to a persistent database.
Latency: duration for action to complete / produce result
Throughput: maximum capacity of a system / work done per unit time.
- load site fast & smoothly
- fast lookups
- avoid pinging distant servers.
- system will only be as fast as its slowest bottleneck
- can increase throughput by improving the slowest bottleneck
Hardware definitions from compritech:
Latency: time taken for packet to be transferred across network
Throughput: quantity of data being sent & received per unit time
Vertical Scaling: adding compute (CPU) and memory (RAM, Disk, SSD) resources to a single computer.
Horizontal Scaling: adding more computers to a cluster
Vertical:
- fairly straightforward
- much lower overall memory capacity
Horizontal:
- higher overall compute and storage capacity
- can be sized dynamically without downtime
- but the most popular DB model (relational) is hard to scale horizontally
Availability is like the resiliency of a system: a fault-tolerant system makes an available system.
Measured using uptime: the percentage of time the system's primary function is available in a given window of time.
Service Level Agreement / Assurance
set of guaranteed service level metrics.
Reduce or eliminate single points of failure
This is done by designing redundancy into the system.
Eg. two or more servers to provide services.
Caching helps reduce latency of system.
- app can perform faster
- faster to retrieve data from memory than disk
Works best when
- storing static / infrequently changing data
- source of change is single operations rather than user-generated operations
If data consistency & freshness is critical, caching may not be optimal unless the cache is refreshed frequently (refreshing the cache frequently may not be ideal either)
- when using certain piece of data often
- backend has computationally intensive work (cache reduces complexity to O(1))
- server makes multiple network requests & API calls
- CDN (content delivery network)
- caches content (images / videos/ webpages) in proxy server located closer to end user than origin server
- can deliver content more quickly.
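A minimal in-process cache sketch using Python's `functools.lru_cache` (the function name and the 10 ms delay are made up for illustration); `maxsize` also demonstrates LRU eviction:

```python
import functools
import time

@functools.lru_cache(maxsize=256)  # evicts the least-recently-used entry when full
def expensive_lookup(key: str) -> str:
    time.sleep(0.01)  # stand-in for a slow DB query or network call
    return key.upper()

expensive_lookup("user:42")  # miss: hits the slow "backend"
expensive_lookup("user:42")  # hit: served from memory in O(1)
print(expensive_lookup.cache_info())  # hits=1, misses=1
```

Note this only helps for the "frequently read, rarely changed" case above; it does nothing for freshness.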
Cache write policies deal with the "write" operations of a cache.
- keep cache & DB in sync
- should it be done synchronously or asynchronously?
- data eviction - Which old records to "kick out" for new data?
- LFU (least frequently used)

### Proxy: basics

proxy: basically a middleman between client & origin server.
There are 2 kinds of proxies
Forward proxy:
- computer makes requests to sites on the internet
- proxy server intercepts the request
- communicates with the web server on behalf of the client
Reverse proxy:
- client sends request to the origin server of a website
- request is intercepted at the network edge by the reverse proxy server
- reverse proxy sends requests to & receives responses from the origin server
How proxies work
Forward: sits in front of the client; no origin server communicates directly with that specific client. The server has no knowledge of the client.
Reverse: sits in front of the origin server; no client communicates directly with the origin server. The client has no knowledge of the server.
- avoid company browsing restrictions.
- clients connect to proxy rather than directly to sites they are visiting
- block access to certain content
- school network, configure to connect to web via proxy that blocks stuff
- IP address of client is harder to trace, only the IP address of the proxy server is visible
- load balancing
- distribute incoming traffic among different origin servers
- protection from attacks
- web site/service never needs to reveal IP address of original server
- harder to have targeted attack against origin server (eg. DDoS). hackers can only target reverse proxy. Reverse proxy will have tighter security & more resources.
- Global Server Load Balancing
- website distributed on several servers around the globe.
- reverse proxy sends clients to the server geographically closest to them.
- reverse proxy can cache content -> faster performance
- SSL encryption
- reverse proxy can decrypt all incoming requests & encrypt all outgoing responses
- free up valuable resources on origin server.
Maintain availability & throughput
By distributing incoming request loads across multiple servers
When a server gets lots of requests it can:
- slow down (throughput reduces, more latency)
- fail (no availability)
Load balancing helps by distributing incoming traffic among origin servers
Server selection strategies
- Round Robin
- loop through servers in fixed sequence
- Weighted Round Robin
- assign different weights / probabilities to each server.
- traffic split up according to weights
- Load-based server selection
- monitor current capacity / performance of servers
- send request to server with highest throughput / lowest latency.
- IP Hashing based selection
- hash IP address of incoming request
- use hash value to allocate server
- Path / Service based selection
- route requests based on path / service provided.
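A sketch of three of the selection strategies above; the server names and weights are arbitrary placeholders:

```python
import hashlib
import itertools
import random

servers = ["server-a", "server-b", "server-c"]

# Round robin: loop through servers in a fixed sequence.
_rr = itertools.cycle(servers)
def round_robin() -> str:
    return next(_rr)

# Weighted selection: split traffic according to per-server weights.
_weights = [5, 3, 2]  # server-a gets ~50% of traffic, etc.
def weighted_pick() -> str:
    return random.choices(servers, weights=_weights, k=1)[0]

# IP hashing: the same client IP is always routed to the same server.
def ip_hash_pick(client_ip: str) -> str:
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

print([round_robin() for _ in range(4)])  # ['server-a', 'server-b', 'server-c', 'server-a']
print(ip_hash_pick("203.0.113.7") == ip_hash_pick("203.0.113.7"))  # True: sticky routing
```

IP hashing gives "sticky" sessions (useful when servers hold session state), at the cost of less even load distribution.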
Any distributed DB can only satisfy 2 of these three features:
- Consistency: every node responds with most recent version of data
- Availability: Any node can send a response
- Partition tolerance: system continues working even if communication between any of the nodes is broken
A DB is usually a CP or AP database. Since we cannot guarantee the stability of the network, P is non-negotiable.
A relational database uses a relational data model: it organizes data in tables, with rows of data entries and columns of predetermined data types.
- many-to-many relationships between entries
- data needs to follow predetermined schema
- consistent transactions are important
- relationship between data always need to be accurate
- Atomicity: a transaction is atomic / the smallest unit
- all instructions in a transaction will execute, or none at all
- Consistency: if the DB is initially consistent, it should remain consistent after every transaction
- eg. a write operation (transfer money from A to B) fails & the transaction is not rolled back. The DB is inconsistent because the amount of money between A & B (A+B) is not equal before & after the transaction
- Isolation: multiple transactions running concurrently should not be affected by each other
- the result should be the same as the result obtained if the transactions ran sequentially
- Durability: changes committed to the DB should remain even in case of software & system failure
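The atomicity & consistency points can be demonstrated with stdlib sqlite3 (the table and amounts are made up; `with conn:` commits on success and rolls back if an exception escapes):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 0)")
conn.commit()

def transfer(amount: int, fail_midway: bool = False) -> None:
    # `with conn:` wraps both UPDATEs in one transaction:
    # either both apply, or neither does (atomicity).
    with conn:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = 'A'", (amount,))
        if fail_midway:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = 'B'", (amount,))

try:
    transfer(40, fail_midway=True)
except RuntimeError:
    pass

# The failed transfer was rolled back: total money (A+B) is unchanged.
print(conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0])  # 100
```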
Also known as a NoSQL database.
At its core, the DB holds data in a hash-table-like structure.
Use cases: caching, environment variables, configuration files / session state
Use environment: in-memory & persistent storage
Since their structure is like a hashtable, there is minimal overhead
- extremely fast
- simple & easy to use
Basically available: system guarantees availability
soft state: state of system may change over time even without input
eventual consistency: the system will become consistent over a very short period of time, provided no new inputs are received
- Graph database
- many-to-many relationships
- fast at following graph edges
- Document Store
- isolated documents, retrieve by a key
- documents with different schemas that are easy to update
- easy to scale
- Key-value store
- like a very large hashtable
- opaque values (DB has no notion of what is stored in value only provides read, overwrite and delete operations)
- simple operations (no schemas, joins or indices)
- minimal overhead - easy to scale
- suitable for caching implementations
- column-family DB
- dynamic schema.
- devs can use "unstructured data" & build applications without defining a schema
- scales horizontally over cheap commodity servers
Use NoSQL when:
- Simple Operations
- data retrieval is simple
- Cheap hardware
- app deployed to commodity hardware like public clouds
- workload volume is consistent
Use a relational DB when:
- ACID guarantees required
- data is predictable & highly structured
- data is best expressed relationally
- write safety required
- app deployed to large high-end hardware
When a system scales horizontally, some tasks require precise coordination between nodes, with a leader node directing follower nodes.
Leader Election algo
how a cluster of nodes without a leader communicates & chooses one node to be the leader.
the algo is executed when the cluster starts or when the leader node goes down.
- Any node can be leader, no single point of failure required to coordinate system
- System doing complex work that need good coordination
- eg. computing how a protein folds: the cluster needs a leader node to assign each node a different part of the problem, then add the results together
- System executes many distributed writes to data & requires strong consistency
- no matter which node handles request user will always have most up-to-date version of data.
- leader creates consistency by being source of truth on what the most recent state of system is.
- split brain
- bad implementation -> 2 nodes controlling system
- single point of failure / bottleneck
- if the leader starts making bad decisions, the entire system will follow
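A deliberately minimal election sketch (bully-style: the highest surviving node ID wins). Real systems use protocols like Raft or ZooKeeper leader election to handle partitions and split brain; everything below is illustrative:

```python
def elect_leader(alive_node_ids: set) -> int:
    # Bully-style rule: the highest-ID node that is still alive wins.
    # Every node applies the same deterministic rule, so all nodes
    # agree on the leader without a pre-existing coordinator.
    if not alive_node_ids:
        raise ValueError("no nodes alive; cluster is down")
    return max(alive_node_ids)

print(elect_leader({1, 3, 7}))  # 7
print(elect_leader({1, 3}))     # 7 went down; re-election picks 3
```

Because the rule is deterministic, re-running the election after the leader dies always converges on a new single leader (avoiding split brain in this toy model).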
### Message Queues

- app-specific faults won't impact the whole system
- if one component fails, all others can continue interacting with queue, processing / producing messages.
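A toy in-process version of this decoupling using the stdlib `queue` module (a real broker like RabbitMQ or Kafka would replace `queue.Queue`; the task names are illustrative):

```python
import queue
import threading

mq = queue.Queue()  # stand-in for the message broker

def producer() -> None:
    for i in range(3):
        mq.put(f"task-{i}")  # producers only talk to the queue

results = []
def consumer() -> None:
    for _ in range(3):
        results.append(mq.get())  # consumers pull when they are ready
        mq.task_done()

# Producer and consumer run independently; if one component fails,
# the other can keep interacting with the queue.
t_prod = threading.Thread(target=producer)
t_cons = threading.Thread(target=consumer)
t_prod.start(); t_cons.start()
t_prod.join(); t_cons.join()
print(results)  # ['task-0', 'task-1', 'task-2']
```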