Bhushan Rane

System Design : Basics

Back-of-the-Envelope Estimation

Before starting the high-level design (HLD), start with back-of-the-envelope (BOE) estimations.

BOE estimates drive system design decisions:

Consideration: Rough Estimation

  • Keep assumed values simple, for example by rounding to powers of 10.
  • Let's look at a cheat sheet: every step is a multiple of 3 zeros.

| Value    | Traffic     | Storage |
|----------|-------------|---------|
| 3 zeros  | Thousand    | KB      |
| 6 zeros  | Million     | MB      |
| 9 zeros  | Billion     | GB      |
| 12 zeros | Trillion    | TB      |
| 15 zeros | Quadrillion | PB      |

Other Considerations

char - 2 bytes
long/double - 8 bytes
Image - 300 KB

Formula:
X million users * Y MB = XY TB
(6 zeros + 6 zeros = 12 zeros)

So:
5 million users * 2 KB = 10 GB
(6 zeros + 3 zeros = 9 zeros)
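The "add the zeros" rule can be sketched in a few lines of Python. This is only an illustration: the `estimate` function and the `STORAGE` table are hypothetical names, and decimal units (1 KB = 1,000 bytes) are assumed, as in the cheat sheet.

```python
# Storage unit for each "number of zeros" (decimal units, as in the cheat sheet).
STORAGE = {0: "bytes", 3: "KB", 6: "MB", 9: "GB", 12: "TB", 15: "PB"}

def estimate(count, count_zeros, size, size_zeros):
    """Multiply the leading values and add the zero counts."""
    value = count * size
    zeros = count_zeros + size_zeros
    # Move to the next unit while the leading value is 1000 or more.
    while value >= 1000 and zeros + 3 in STORAGE:
        value /= 1000
        zeros += 3
    return value, STORAGE[zeros]

print(estimate(5, 6, 2, 3))  # 5 million users * 2 KB -> (10, 'GB')
```

The zero counts passed in are expected to be multiples of 3 up to the PB row; anything else is outside this sketch.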

Let's take an example now: any social media platform.

Given data:

Total users -> 1 billion
DAU (daily active users) -> 25% of total users -> 250 million users
Reads and writes: 5 reads and 2 writes per user per day

a) Traffic:

Calculations:

  • 250 million users * (5 + 2) queries / 1 day (86,400 sec ≈ 100,000 sec) = 17,500 ≈ 18K queries/sec.
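As a quick check of the traffic figure, here is a minimal Python sketch. The variable names are illustrative, and the exact-seconds result is included only to show how much the rounding to 100,000 seconds shifts the estimate.

```python
DAU = 250_000_000                  # daily active users
QUERIES_PER_USER_PER_DAY = 5 + 2   # 5 reads + 2 writes
SECONDS_PER_DAY_ROUNDED = 100_000  # 86,400 rounded up for easy arithmetic

qps_rounded = DAU * QUERIES_PER_USER_PER_DAY / SECONDS_PER_DAY_ROUNDED
qps_exact = DAU * QUERIES_PER_USER_PER_DAY / 86_400

print(qps_rounded)        # 17500.0
print(round(qps_exact))   # 20255
```

With the exact number of seconds the average comes out closer to 20K queries/sec, so the rounded figure is on the low side, which is acceptable for a rough estimate.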

b) Storage assumptions:

Every user writes 2 posts (let's say each is 250 characters).
Assume 10% of users upload 1 image (1 image = 300 KB).

1 post -> 250 chars => 250 * 2 bytes = 500 bytes.
2 posts -> 500 chars => 1,000 bytes => 1 KB.

Text: 250 million users * 1 KB = (250 * 10^6) * (1 * 10^3) bytes = 250 GB/day

Images: 25 million users * 300 KB = 7.5 TB ≈ 8 TB/day
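The daily storage numbers can be reproduced with a short Python sketch. The constants mirror the assumptions above, the names are illustrative, and decimal units are assumed (1 GB = 10^9 bytes, 1 TB = 10^12 bytes).

```python
DAU = 250_000_000
BYTES_PER_CHAR = 2
CHARS_PER_POST = 250
POSTS_PER_USER = 2
IMAGE_BYTES = 300_000        # 300 KB per image
IMAGE_UPLOADERS = DAU // 10  # 10% of daily active users

text_bytes_per_user = POSTS_PER_USER * CHARS_PER_POST * BYTES_PER_CHAR
text_per_day = DAU * text_bytes_per_user       # bytes of text per day
image_per_day = IMAGE_UPLOADERS * IMAGE_BYTES  # bytes of images per day

print(text_bytes_per_user)     # 1000 bytes = 1 KB
print(text_per_day / 10**9)    # 250.0 (GB/day)
print(image_per_day / 10**12)  # 7.5 (TB/day)
```

Images account for far more of the daily volume than text, so the image assumption is the one worth refining first.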

Let's say we store data for 5 years. The total storage required is then:

5 years = 1,825 days ≈ 2,000 days
Total storage: 2,000 * 250 GB + 2,000 * 8 TB = 500 TB + 16 PB = 16.5 PB
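The five-year total can be verified the same way. This sketch simply reuses the rounded daily figures from the estimate (250 GB of text and 8 TB of images per day); the variable names are illustrative.

```python
DAYS = 2000              # 5 years (1,825 days) rounded up
TEXT_GB_PER_DAY = 250
IMAGE_TB_PER_DAY = 8

text_total_tb = DAYS * TEXT_GB_PER_DAY / 1000  # GB -> TB
image_total_tb = DAYS * IMAGE_TB_PER_DAY
total_pb = (text_total_tb + image_total_tb) / 1000  # TB -> PB

print(text_total_tb)   # 500.0
print(image_total_tb)  # 16000
print(total_pb)        # 16.5
```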

c) RAM estimation:

  • For each user we cache their last 5 posts. 1 post = 500 bytes from the calculation above, so 5 * 500 = 2,500 bytes/user ≈ 3,000 bytes.
  • 250 million users * 3,000 bytes = 750 GB of memory.
  • If 1 machine holds 75 GB of data in memory, we need 10 machines for the cache.
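A minimal Python sketch of the cache sizing, assuming the rounded 3,000 bytes per user and 75 GB of usable memory per machine from the bullets above (names are illustrative):

```python
import math

DAU = 250_000_000
BYTES_PER_USER = 3000   # 5 posts * 500 bytes = 2,500, rounded up
MACHINE_GB = 75         # usable in-memory data per machine

cache_gb = DAU * BYTES_PER_USER / 10**9
machines = math.ceil(cache_gb / MACHINE_GB)

print(cache_gb)   # 750.0
print(machines)   # 10
```

`math.ceil` is used so that any fractional result rounds up to a whole machine.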

Latency: 95th percentile of 500 ms (rough estimation)

  • Load: 18K queries/sec
  • 1 server => 50 threads => 100 requests/sec (at 500 ms per request, each thread handles about 2 requests/sec)
  • Number of servers: 18K / 100 = 180 servers
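The server count can be sketched as follows. Note the assumption, which the estimate implies but does not state outright: each thread handles one request at a time and every request takes the full 500 ms. The names are illustrative.

```python
import math

P95_LATENCY_S = 0.5   # 500 ms per request (assumed worst case for sizing)
THREADS_PER_SERVER = 50
TARGET_QPS = 18_000

requests_per_thread = 1 / P95_LATENCY_S               # 2.0 requests/sec
requests_per_server = THREADS_PER_SERVER * requests_per_thread
servers = math.ceil(TARGET_QPS / requests_per_server)

print(requests_per_server)  # 100.0
print(servers)              # 180
```

This covers average load only; headroom for traffic peaks would be added on top.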

Trade-off

Since this is a read-heavy system, we can relax strict consistency,
so in CAP theorem terms we choose AP (availability and partition tolerance).
