DEV Community

Bhushan Rane

System Design : Basics

Back-of-the-Envelope Estimation

Before starting the high-level design (HLD), begin with a back-of-the-envelope estimation (BOE).

BOE drives the decisions in a system design.

Consideration: these are rough estimates

  • Keep assumed values simple, for example by rounding to powers of 10.
  • Cheat sheet: each step up is a multiple of 1,000 (3 more zeros).

| Value    | Traffic     | Storage |
|----------|-------------|---------|
| 3 zeros  | Thousand    | KB      |
| 6 zeros  | Million     | MB      |
| 9 zeros  | Billion     | GB      |
| 12 zeros | Trillion    | TB      |
| 15 zeros | Quadrillion | PB      |

Other Considerations

char - 2 bytes (as in Java)
long/double - 8 bytes
Image - 300 KB (assumed average size)

Formula:
X million users * Y MB = XY TB
(6 zeros + 6 zeros = 12 zeros)

So,
5 million users * 2 KB = 10 GB
(6 zeros + 3 zeros = 9 zeros)
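
The "add the zeros" rule can be sketched in a few lines of Python. This is only an illustration: the names `COUNT_ZEROS`, `UNIT_ZEROS`, and `estimate` are hypothetical helpers, not part of any library, and the units are decimal (1 KB = 1,000 bytes) as in the cheat sheet.

```python
# Zeros contributed by each count word and each storage unit (from the cheat sheet).
COUNT_ZEROS = {"thousand": 3, "million": 6, "billion": 9, "trillion": 12}
UNIT_ZEROS = {"B": 0, "KB": 3, "MB": 6, "GB": 9, "TB": 12, "PB": 15}
ZEROS_TO_UNIT = {zeros: unit for unit, zeros in UNIT_ZEROS.items()}


def estimate(count, count_word, size, unit):
    """Return (value, unit) for `count` <count_word> items of `size` <unit> each."""
    # Multiplying powers of ten is the same as adding their zeros.
    zeros = COUNT_ZEROS[count_word] + UNIT_ZEROS[unit]
    if zeros not in ZEROS_TO_UNIT:
        raise ValueError("result is larger than the units in the cheat sheet")
    return count * size, ZEROS_TO_UNIT[zeros]


print(estimate(5, "million", 2, "KB"))  # (10, 'GB')
print(estimate(3, "million", 4, "MB"))  # (12, 'TB')
```

The function multiplies the small numbers and looks up the unit from the summed zeros, which is the same mental shortcut the formula describes.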

Let's take an example now: any social media platform.

Given data:

Total users -> 1 billion
DAU (Daily Active Users) -> 25% of total users -> 250 million users
Reads and writes: 5 reads and 2 writes per user per day

a) Traffic:

Calculation:

  • 250 million users * (5 + 2) queries / 1 day (86,400 sec ~= 100,000 sec) = 17,500 ~= 18K queries/sec.
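
As a quick check of the traffic arithmetic, here is a small Python sketch. The constant and function names are made up for illustration. It also shows what the rounding of 86,400 to 100,000 seconds costs: the unrounded figure comes out closer to 20K queries/sec, which is still the same order of magnitude.

```python
DAILY_ACTIVE_USERS = 250_000_000
QUERIES_PER_USER = 5 + 2          # 5 reads + 2 writes per user per day
SECONDS_PER_DAY = 86_400
ROUNDED_SECONDS_PER_DAY = 100_000  # the rounding used in the estimate


def queries_per_second(users, queries_per_user, seconds_per_day):
    """Average queries per second, assuming load is spread evenly over the day."""
    return users * queries_per_user / seconds_per_day


rough_qps = queries_per_second(DAILY_ACTIVE_USERS, QUERIES_PER_USER, ROUNDED_SECONDS_PER_DAY)
exact_qps = queries_per_second(DAILY_ACTIVE_USERS, QUERIES_PER_USER, SECONDS_PER_DAY)

print(rough_qps)         # 17500.0
print(round(exact_qps))  # 20255
```

Both numbers are averages; real traffic has peaks, so a design would normally leave headroom above either figure.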

b) Storage assumptions:

Every user writes 2 posts per day (let's say each is 250 chars).
Assume 10% of users upload 1 image per day (1 image = 300 KB).

1 post -> 250 chars => 250 * 2 bytes = 500 bytes.
2 posts -> 500 chars => 1,000 bytes => 1 KB.

Text: 250 million users * 1 KB = (250 * 10^6) * (1 * 10^3) bytes = 250 GB/day

Images: 25 million users * 300 KB = 7.5 TB ~= 8 TB/day

Let's say we store data for 5 years; then the total storage required is:

5 years = 1,825 days ~= 2,000 days
Total storage: 2,000 * 250 GB + 2,000 * 8 TB = 0.5 PB + 16 PB = 16.5 PB
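
The storage estimate can be reproduced with a short Python sketch. All names here are illustrative, and the units are decimal (1 KB = 1,000 bytes). The sketch prints both the total with the rounded 8 TB/day image figure used above and the total with the unrounded 7.5 TB/day, to show how much the rounding adds.

```python
KB, GB, TB, PB = 10**3, 10**9, 10**12, 10**15  # decimal units, in bytes

DAU = 250_000_000
POSTS_PER_USER = 2
CHARS_PER_POST = 250
BYTES_PER_CHAR = 2
IMAGE_BYTES = 300 * KB
RETENTION_DAYS = 2_000  # 5 years ~= 1,825 days, rounded up

# Text: every active user writes 2 posts of 250 two-byte chars per day.
text_per_day = DAU * POSTS_PER_USER * CHARS_PER_POST * BYTES_PER_CHAR
# Images: 10% of active users upload one 300 KB image per day.
image_per_day = (DAU // 10) * IMAGE_BYTES

total_exact = RETENTION_DAYS * (text_per_day + image_per_day)
total_rounded = RETENTION_DAYS * (text_per_day + 8 * TB)  # images rounded to 8 TB/day

print(text_per_day / GB)   # 250.0
print(image_per_day / TB)  # 7.5
print(total_exact / PB)    # 15.5
print(total_rounded / PB)  # 16.5
```

Images dominate the total: text contributes only 0.5 PB of the 16.5 PB.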

c) RAM estimation:

  • For each user we keep their last 5 posts in memory. 1 post = 500 bytes from the calculation above, so 5 * 500 = 2,500 bytes/user ~= 3,000 bytes.
  • 250 million * 3,000 bytes = 750 GB of memory.
  • If 1 machine holds 75 GB of data in memory, we need 10 machines for the cache.
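
The cache sizing follows the same pattern and can be sketched as below. The variable names are illustrative, and the 75 GB per machine is the assumption stated in the estimate, not a measured value.

```python
import math

GB = 10**9  # decimal gigabyte, in bytes

DAU = 250_000_000
POSTS_CACHED_PER_USER = 5
POST_BYTES = 500
PER_USER_BYTES_ROUNDED = 3_000  # 5 * 500 = 2,500, rounded up for headroom
MACHINE_RAM_BYTES = 75 * GB     # assumed usable memory per cache machine

per_user_bytes = POSTS_CACHED_PER_USER * POST_BYTES
cache_bytes = DAU * PER_USER_BYTES_ROUNDED
# Round up: a fraction of a machine still means one more machine.
machines = math.ceil(cache_bytes / MACHINE_RAM_BYTES)

print(per_user_bytes)    # 2500
print(cache_bytes / GB)  # 750.0
print(machines)          # 10
```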

Latency: p95 of 500 ms (rough estimate)

  • 18K queries/sec
  • 1 server => 50 threads => 100 requests/sec (at 500 ms per request, each thread serves about 2 requests/sec)
  • Number of servers: 18K / 100 = 180 servers
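
A minimal sketch of the server-count estimate follows. It assumes each thread handles one request at a time and every request takes about 500 ms, which is one way to read the 100 requests/sec per server figure. The function name `servers_needed` is hypothetical.

```python
import math


def servers_needed(total_qps, threads_per_server, latency_seconds):
    """Servers required if each thread serves one request at a time."""
    # Each thread completes 1 / latency requests per second.
    per_server_qps = threads_per_server / latency_seconds
    return math.ceil(total_qps / per_server_qps)


print(servers_needed(18_000, 50, 0.5))  # 180
```

Halving the latency to 250 ms would double the per-server throughput and halve the server count, which shows how sensitive this estimate is to the latency assumption.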

Trade-offs

Since this is a read-heavy system, we can relax strong consistency and accept eventual consistency,
so in terms of the CAP theorem we choose AP (availability and partition tolerance).
