Intro
I've always found this topic presented in a misleading and overcomplicated way. While I can't deny the version below is a lot simpler than real-life calculations, it still covers 99% of what you can encounter in an interview.
What to estimate?
QPS - queries per second
RPS - reads per second
WPS - writes per second
Peak QPS = QPS * 2 (usually)
RW - read/write ratio
Message size - size of a single message in bytes (assume a value if not given)
Read Throughput - RPS * message size = N bytes per second
Write Throughput - WPS * message size = N bytes per second
💡 Throughput is how much data actually passes through; bandwidth is how much data CAN pass through (a property of the network configuration)
Ex: a 1 Gbps network can pass at most 125 MB/s (1 Gbps / 8 bits per byte)
Storage - usually storage for N years
Replica storage - storage * 2-3 times
Cache storage - usually 20% of storage or so
Cache replica storage - cache storage * 2-3 times
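If it helps to see these formulas in one place, here is a minimal Python sketch of them (the function names and the 2x/20%/3x factors are just the rules of thumb from above, nothing standard):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400, roughly 10^5

def qps(daily_users: int, requests_per_user_per_day: float) -> float:
    """Average queries per second from daily totals."""
    return daily_users * requests_per_user_per_day / SECONDS_PER_DAY

def throughput_bytes_per_sec(rate_qps: float, message_size_bytes: int) -> float:
    """Throughput = request rate * message size."""
    return rate_qps * message_size_bytes

def storage_bytes(daily_users: int, writes_per_user_per_day: float,
                  message_size_bytes: int, years: int, replicas: int = 3):
    """Raw storage for N years, plus the total including replicas."""
    raw = daily_users * writes_per_user_per_day * message_size_bytes * 365 * years
    return raw, raw * replicas

# The usual "peak = 2x average" rule of thumb:
peak_qps = 2 * qps(daily_users=10_000_000, requests_per_user_per_day=100)
```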
Basic Numbers
seconds in a day - 24 * 60 * 60 = 86400, roughly 10^5
1 ASCII character - 1 byte
timestamp - 8 bytes (2^64)
10^3 bytes - 1 KB
10^6 bytes - 1 MB
10^9 bytes - 1 GB
10^12 bytes - 1 TB
10^15 bytes - 1 PB
10^18 bytes - 1 EB
Powers of two
Power   Exact Value            Approx Value     Bytes
---------------------------------------------------------------
7       128
8       256
10      1,024                  1 thousand       1 KB
16      65,536                                  64 KB
20      1,048,576              1 million        1 MB
30      1,073,741,824          1 billion        1 GB
32      4,294,967,296                           4 GB
40      1,099,511,627,776      1 trillion       1 TB
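Since 2^n is `1 << n` in most languages, you can regenerate this table with a bit shift; a quick Python check:

```python
for power in (7, 8, 10, 16, 20, 30, 32, 40):
    print(f"2^{power} = {1 << power:,}")  # e.g. 2^20 = 1,048,576 (~1 MB)
```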
Latency numbers every programmer should know
Latency Comparison Numbers
--------------------------
L1 cache reference                                   0.5 ns
Branch mispredict                                      5 ns
L2 cache reference                                     7 ns                         14x L1 cache
Mutex lock/unlock                                     25 ns
Main memory reference                                100 ns                         20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy                      10,000 ns         10 us
Send 1 KB over 1 Gbps network                     10,000 ns         10 us
Read 4 KB randomly from SSD*                     150,000 ns        150 us             ~1 GB/sec SSD
Read 1 MB sequentially from memory               250,000 ns        250 us
Round trip within same datacenter                500,000 ns        500 us
Read 1 MB sequentially from SSD*               1,000,000 ns      1,000 us      1 ms  ~1 GB/sec SSD, 4x memory
HDD seek                                      10,000,000 ns     10,000 us     10 ms  20x datacenter roundtrip
Read 1 MB sequentially from 1 Gbps network    10,000,000 ns     10,000 us     10 ms  40x memory, 10x SSD
Read 1 MB sequentially from HDD               30,000,000 ns     30,000 us     30 ms  120x memory, 30x SSD
Send packet CA->Netherlands->CA              150,000,000 ns    150,000 us    150 ms
Notes
-----
1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns
Handy metrics based on latency numbers
Read sequentially from HDD at 30 MB/s
Read sequentially from 1 Gbps Ethernet at 100 MB/s
Read sequentially from SSD at 1 GB/s
Read sequentially from main memory at 4 GB/s
6-7 world-wide round trips per second
2,000 round trips per second within a data center
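These metrics fall straight out of the latency table: divide the data size by the time it takes. A small Python sanity check using the numbers above:

```python
# Time to read 1 MB sequentially, taken from the latency table (in seconds).
latency_per_mb = {
    "memory": 250e-6,         # 250 us -> ~4 GB/s
    "SSD": 1e-3,              # 1 ms   -> ~1 GB/s
    "1 Gbps network": 10e-3,  # 10 ms  -> ~100 MB/s
    "HDD": 30e-3,             # 30 ms  -> ~30 MB/s
}
for medium, seconds in latency_per_mb.items():
    # 1 MB read in `seconds` -> (1 / seconds) MB per second
    print(f"{medium}: ~{1 / seconds:,.0f} MB/s")

print(f"world-wide round trips/sec: ~{1 / 150e-3:.1f}")    # ~6.7
print(f"in-datacenter round trips/sec: {1 / 500e-6:,.0f}") # 2,000
```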
How to estimate?
- Clarify the number of daily active users and the number of total users.
- Ask about the average number of requests per user. From here you can get QPS.
- Think about peak QPS, reads, and writes.
- Assume (or clarify) the message size.
- Calculate throughput.
- If possible, think about the average data size, then calculate storage and cache from it.
Estimation example
You have 10M daily active users; each makes 100 read requests per day on average, and new data is created 5 times per day.
RPS = 10M * 100 / 86,400 ≈ 12,000 r/s
WPS = 10M * 5 / 86,400 ≈ 580 w/s
Peak QPS = 24,000 r/s
Let's assume (clarify with the interviewer) that the average read message size is 50 bytes and the written message is 1 KB.
Avg read throughput - 50 bytes * 12,000 r/s = 600 KB/s
Avg write throughput - 1 KB * 580 w/s = 580 KB/s
Here we can think about the type of data, metadata, etc. Let's assume you have clarified with your interviewer that each new data item is 1 KB.
5 years of storage - 10M * 1 KB * 5 times per day * 365 days per year * 5 years ≈ 91 TB; * 3 replicas ≈ 273 TB, round up to 300 TB.
Let's assume that only 10% of the data is hot and you agreed to cache 20% of it.
Cache storage - 10% * 91 TB * 20% * 3 replicas ≈ 5.5 TB
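The same example as a runnable Python sketch, so you can check the arithmetic:

```python
DAU = 10_000_000
SECONDS_PER_DAY = 86_400

rps = DAU * 100 / SECONDS_PER_DAY   # ~11,574, round to 12,000 r/s
wps = DAU * 5 / SECONDS_PER_DAY     # ~579, round to 580 w/s
peak_rps = 2 * rps                  # ~24,000 r/s

read_throughput = 50 * 12_000       # 600,000 B/s = 600 KB/s
write_throughput = 1_000 * 580      # 580,000 B/s = 580 KB/s

storage_5y = DAU * 1_000 * 5 * 365 * 5  # ~9.1e13 B, ~91 TB
with_replicas = storage_5y * 3          # ~273 TB, round up to 300 TB

cache = storage_5y * 0.10 * 0.20 * 3    # 10% hot, 20% cached, 3 replicas: ~5.5 TB
```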
Comments
I had some confusion about calculating the cache size for 5 years, so I asked ChatGPT about it. Here's the relevant part of its answer:
A More Accurate Estimation Approach:
Instead of considering 5 years' worth of data, you should estimate the cache size based on the data's access patterns and the working set size. Here's how you might adjust the calculation:
1. Determine the working set period: decide on a time frame that represents the typical period during which data remains "hot." For many applications, this could be the last few days or weeks.
2. Calculate data generated in the working set period. For example, if the working set period is one month:
   - Daily data generated: 10M users * 1 KB * 5 times/day = 50 GB/day
   - Monthly data generated: 50 GB/day * 30 days = 1.5 TB
3. Estimate the hot data percentage. If 10% of this monthly data is hot:
   - Hot data size: 1.5 TB * 10% = 150 GB
4. Calculate cache size with replication and overhead. If you use 20% of the hot data size for cache (which might represent a cache hit ratio you're aiming for):
   - Cache size: 150 GB * 20% = 30 GB
   - With 3 replicas, total cache storage: 30 GB * 3 = 90 GB
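And the working-set version of the math as a quick sketch, using the numbers from this comment:

```python
daily_data = 10_000_000 * 1_000 * 5  # 10M users * 1 KB * 5 writes/day = 50 GB/day
working_set = daily_data * 30        # one-month working set: ~1.5 TB
hot_data = working_set * 0.10        # 10% hot: 150 GB
cache = hot_data * 0.20              # cache 20% of hot data: 30 GB
total_cache = cache * 3              # 3 replicas: 90 GB
```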