Pranjal Sharma

Back-of-the-envelope Estimation System Design

Back-of-the-envelope estimation is a technique used to quickly approximate values and make rough calculations using simple arithmetic and basic assumptions.


Estimation Techniques

1) Rule of Thumb →

General principles applied to make good estimates, e.g. one user generates about 1 MB of data per day on social media.

2) Approximations →

Rounding complex calculations off to powers of 10 or 2 to simplify the arithmetic and reach an estimate quickly, e.g. 1 day = 86,400 seconds ≈ 10^5 seconds.
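To get a feel for how much accuracy the rounding costs, the sketch below compares the exact number of seconds in a day with the 10^5 shortcut. The function name `relative_error` is only illustrative.

```python
# Compare the exact number of seconds in a day with the 10^5 shortcut.
SECONDS_PER_DAY_EXACT = 24 * 60 * 60   # 86,400
SECONDS_PER_DAY_APPROX = 10 ** 5       # rounded up to a power of 10


def relative_error(exact: float, approx: float) -> float:
    """Fraction by which the approximation deviates from the exact value."""
    return abs(approx - exact) / exact


error = relative_error(SECONDS_PER_DAY_EXACT, SECONDS_PER_DAY_APPROX)
print(f"exact={SECONDS_PER_DAY_EXACT}, approx={SECONDS_PER_DAY_APPROX}, error={error:.1%}")
```

Because the shortcut overstates the length of a day, per-second rates computed with it come out roughly 14% lower than the exact figure, which is usually acceptable for a rough estimate.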

3) Breakdown and aggregation →

Breaking a bigger problem down into smaller components, estimating each one individually, and then combining the results, e.g. Social media data = User data + Multimedia data + Metadata.

4) Sanity check →

Finally, check that the estimates do not stray far from reality, e.g. the numbers obtained should be of the same order of magnitude as real-life data.


Types of Estimations

1) Load Estimations

Example: designing a social media platform where users create posts.
Daily Active Users (DAU) → 100 Million
Avg. posts → 10 per user per day
Total posts → 100 M * 10 = 1B posts/day

Hence request rate = 1B / 10^5 requests/second = 10,000 req/sec.
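The load calculation above can be sketched in a few lines. The helper `requests_per_second` is a hypothetical name, and the exact figure (using 86,400 seconds) is included only to show how close the rounded estimate is.

```python
# Load estimate: posts per day converted to requests per second.
DAILY_ACTIVE_USERS = 100_000_000       # 100 million DAU
POSTS_PER_USER_PER_DAY = 10
SECONDS_PER_DAY_APPROX = 10 ** 5       # rounded value used in the estimate
SECONDS_PER_DAY_EXACT = 24 * 60 * 60   # 86,400, for comparison


def requests_per_second(dau: int, per_user: int, seconds_per_day: int) -> float:
    """Average request rate, assuming load is spread evenly over the day."""
    return dau * per_user / seconds_per_day


posts_per_day = DAILY_ACTIVE_USERS * POSTS_PER_USER_PER_DAY
approx_rps = requests_per_second(DAILY_ACTIVE_USERS, POSTS_PER_USER_PER_DAY, SECONDS_PER_DAY_APPROX)
exact_rps = requests_per_second(DAILY_ACTIVE_USERS, POSTS_PER_USER_PER_DAY, SECONDS_PER_DAY_EXACT)

print(f"posts/day = {posts_per_day:,}")
print(f"approx req/sec = {approx_rps:,.0f}, exact req/sec = {exact_rps:,.0f}")
```

Note that this is an average rate; peak traffic is typically higher, so a multiplier for peak load would be an additional assumption.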

2) Storage Estimations

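Twitter Storage
DAU → 500 M
Tweets → 3 per user per day (avg)
1 tweet text ~ 250 B
1 photo ~ 200 KB [10% of tweets contain a photo]
1 video ~ 300 MB [5% of tweets contain a video]

Total tweets/day = 500 M * 3 = 1500 M
Total storage/day ~ 1500 M * (250 B + 10% * 200 KB + 5% * 300 MB)
~ 1500 M * (250 B + 20 KB + 15 MB)
~ 375 GB + 30 TB + 22.5 PB ~ 22.5 PB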
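A minimal sketch of the storage calculation, using the per-item sizes and percentages listed above and assuming decimal units (1 KB = 1000 B). Integer arithmetic is used so the results are exact; the variable names are illustrative only. Since 5% of 300 MB is 15 MB per tweet on average, video dominates the total.

```python
# Daily storage estimate, broken down by content type (decimal units assumed).
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

DAILY_ACTIVE_USERS = 500_000_000
TWEETS_PER_USER_PER_DAY = 3

TEXT_BYTES = 250
PHOTO_BYTES, PHOTO_PERCENT = 200 * KB, 10   # 10% of tweets contain a photo
VIDEO_BYTES, VIDEO_PERCENT = 300 * MB, 5    # 5% of tweets contain a video

tweets_per_day = DAILY_ACTIVE_USERS * TWEETS_PER_USER_PER_DAY

# Estimate each component separately, then aggregate.
text_bytes = tweets_per_day * TEXT_BYTES
photo_bytes = tweets_per_day * PHOTO_BYTES * PHOTO_PERCENT // 100
video_bytes = tweets_per_day * VIDEO_BYTES * VIDEO_PERCENT // 100
total_bytes = text_bytes + photo_bytes + video_bytes

print(f"text  = {text_bytes / GB:,.0f} GB")
print(f"photo = {photo_bytes / TB:,.0f} TB")
print(f"video = {video_bytes / TB:,.0f} TB")
print(f"total = {total_bytes / TB:,.1f} TB per day")
```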

3) Bandwidth requirements

  • Estimate the daily amount of incoming data to the service.
  • Estimate the daily amount of outgoing data from the service.
  • Estimate the bandwidth in Gbps (gigabits per second) by dividing the incoming and outgoing data by the number of seconds in a day and multiplying by 8 to convert bytes to bits.
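The three steps above can be sketched as follows. The daily volumes are hypothetical placeholders (30 TB in, ten times that out), not figures from this article, and `to_gbps` is an illustrative helper name.

```python
# Convert daily data volume (bytes) into average bandwidth (gigabits per second).
SECONDS_PER_DAY = 24 * 60 * 60
TB = 10**12


def to_gbps(bytes_per_day: float, seconds_per_day: int = SECONDS_PER_DAY) -> float:
    """Average bandwidth in Gbps: bytes -> bits (x8), per second, scaled to giga."""
    bits_per_second = bytes_per_day * 8 / seconds_per_day
    return bits_per_second / 10**9


# Hypothetical inputs: assumed for illustration only.
incoming_bytes_per_day = 30 * TB
outgoing_bytes_per_day = 10 * incoming_bytes_per_day  # assumed read-heavy workload

incoming_gbps = to_gbps(incoming_bytes_per_day)
outgoing_gbps = to_gbps(outgoing_bytes_per_day)
print(f"incoming ~ {incoming_gbps:.2f} Gbps, outgoing ~ {outgoing_gbps:.2f} Gbps")
```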

4) Latency Estimation

For example, an API consists of REST call 1, REST call 2 and REST call 3, which take 50 ms, 100 ms and 150 ms respectively.

Total latency → 50ms + 100ms + 150ms = 300ms [if the calls are sequential]
→ max(50ms, 100ms, 150ms) = 150ms [if the calls run in parallel]
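As a small sketch of the two cases: sequential calls add up, while parallel calls are bounded by the slowest one. The call names are placeholders, and the parallel figure assumes the calls are fully independent with no coordination overhead.

```python
# Latency estimate for an API composed of three downstream calls.
call_latencies_ms = {
    "rest_call_1": 50,
    "rest_call_2": 100,
    "rest_call_3": 150,
}

# Sequential: each call waits for the previous one, so latencies add up.
sequential_ms = sum(call_latencies_ms.values())

# Parallel: all calls start together, so the slowest call sets the total.
parallel_ms = max(call_latencies_ms.values())

print(f"sequential ~ {sequential_ms} ms, parallel ~ {parallel_ms} ms")
```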

5) Resource Estimation

1 req ~ 10 ms of CPU time
Total requests ~ 10,000 req/sec
Total CPU time ~ 10,000 * 10 = 100,000 ms of CPU time per second
1 CPU core provides 1,000 ms of CPU time per second
Total CPU cores = 100,000 / 1,000 = 100
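The same calculation as a short sketch. The `target_utilization` parameter is an assumption added here to show how headroom could be factored in; at its default of 1.0 it reproduces the figure above.

```python
import math

# CPU core estimate from request rate and per-request CPU time.
MS_PER_CORE_PER_SECOND = 1000  # one core provides 1000 ms of CPU time per second


def cores_needed(requests_per_second: float,
                 cpu_ms_per_request: float,
                 target_utilization: float = 1.0) -> int:
    """Cores required; target_utilization < 1.0 leaves headroom (an assumption)."""
    cpu_ms_per_second = requests_per_second * cpu_ms_per_request
    return math.ceil(cpu_ms_per_second / MS_PER_CORE_PER_SECOND / target_utilization)


cores = cores_needed(10_000, 10)
cores_with_headroom = cores_needed(10_000, 10, target_utilization=0.5)
print(f"cores = {cores}, cores at 50% target utilization = {cores_with_headroom}")
```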
