- Reliability: The system should continue working correctly even when things go wrong, e.g. it should cope with users making mistakes.
- Scalability: The system should be able to keep serving its users as the user base grows.
- Maintainability: Many people will work on a given system over time, and they should all be able to work on it productively.
We can describe the load on a system using load parameters. The right parameters vary from system to system, for example:
- Requests per second to a web server
- Number of concurrent users
- Hit rate on a cache
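As a rough illustration, two of the load parameters above can be derived from simple counters. This is a minimal sketch: the function names and the idea of counting over a fixed time window are assumptions made for the example, not something the book prescribes.

```python
def requests_per_second(request_count, window_seconds):
    """Average request rate over an observation window (hypothetical helper)."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return request_count / window_seconds


def cache_hit_rate(hits, misses):
    """Fraction of cache lookups served from the cache (hypothetical helper)."""
    total = hits + misses
    if total == 0:
        return 0.0  # no lookups yet, so report 0 rather than dividing by zero
    return hits / total


# Illustrative numbers only: 12,000 requests in a 60 second window,
# and 900 cache hits against 100 misses.
rps = requests_per_second(12_000, 60)
hit_rate = cache_hit_rate(900, 100)
```

Which parameter matters most depends on the system: a read-heavy service may care more about the cache hit rate, while a write-heavy one may track something else entirely.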
Latency vs Response Time
Response time is what the client sees: the time taken to process the request (the service time) plus network and queueing delays. Latency is the duration for which a request is waiting to be handled (awaiting service).
When discussing response times, percentiles are more useful than the average (mean), because the average does not tell you how many users actually experienced that delay. High percentiles of response time (e.g. the 99.9th percentile), also known as tail latencies, are important because they directly affect users' experience of the service.
Percentiles are often used in SLAs.
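To see why the average can mislead, here is a small sketch that compares the mean with a few percentiles. The sample data is made up for illustration, and `percentile` is a hypothetical helper using the simple nearest-rank method; real monitoring systems often use approximation algorithms instead.

```python
import math
import statistics


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up response times in milliseconds: most requests are fast,
# a few are slightly slower, and one is a severe outlier.
response_times_ms = [20] * 95 + [30] * 4 + [2000]

mean_ms = statistics.mean(response_times_ms)
p50 = percentile(response_times_ms, 50)
p99 = percentile(response_times_ms, 99)
p999 = percentile(response_times_ms, 99.9)

print(f"mean={mean_ms} p50={p50} p99={p99} p99.9={p999}")
```

With this data the mean is about twice the median even though 95% of requests took 20 ms, and only the 99.9th percentile reveals the 2000 ms outlier.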
Coping with load
- Vertical scaling: moving to a more powerful machine
- Horizontal scaling: distributing load across multiple machines
Elastic systems scale automatically, adding computing resources (typically more machines) once an increase in load is detected.
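The core decision in an elastic system can be sketched as a function from measured load to a machine count. This is a deliberately simplified illustration: `machines_needed`, the per-machine capacity figure, and the minimum fleet size are all assumptions for the example, and a real autoscaler would also smooth its measurements and limit how quickly it scales.

```python
import math


def machines_needed(requests_per_second, capacity_per_machine, min_machines=1):
    """Number of machines required to serve the observed load,
    assuming each machine handles a fixed request rate (hypothetical)."""
    if capacity_per_machine <= 0:
        raise ValueError("capacity_per_machine must be positive")
    required = math.ceil(requests_per_second / capacity_per_machine)
    return max(min_machines, required)


# Illustrative numbers: each machine is assumed to handle 200 requests/s.
quiet = machines_needed(0, 200)      # never drop below the minimum
normal = machines_needed(450, 200)   # 2.25 machines' worth of load, rounded up
peak = machines_needed(1200, 200)    # load increase detected, scale out
```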
A maintainable system is one that is easy to operate (ops teams can keep it running smoothly), simple for new engineers to understand, and easy to extend or evolve.
(summary of chapter 1 from Designing Data-Intensive Applications by Martin Kleppmann)