Rajat

Backend Performance 101: Understanding Performance Part 1.

What is Performance

How fast and responsive a system is under a given workload on given hardware.

🔈 Workload(Volume)

  • The volume of processing data
  • The volume of requests

⚙️ Hardware

  • The type of hardware
  • Capacity of the hardware

To measure the performance of a system, these two parameters should be fixed.
When we design a system, our goal is that as the workload increases, performance remains stable. It should not severely degrade.
On the hardware side, we expect that if we increase the capacity of our hardware, or bring in superior hardware, system performance should improve.
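As a minimal sketch of measuring performance with both parameters fixed (`handle_request` is a hypothetical stand-in for real request handling; the fixed workload is the request count, the fixed hardware is whatever machine runs the script):

```python
import time

def handle_request():
    # Stand-in for real request handling; simulated CPU work.
    total = 0
    for i in range(10_000):
        total += i
    return total

def benchmark(num_requests):
    """Measure latency and throughput for a fixed workload on fixed hardware."""
    start = time.perf_counter()
    for _ in range(num_requests):
        handle_request()
    elapsed = time.perf_counter() - start
    return {
        "requests": num_requests,
        "total_seconds": elapsed,
        "avg_latency_ms": elapsed / num_requests * 1000,
        "throughput_rps": num_requests / elapsed,
    }

print(benchmark(1000))
```

Running the same script with a larger `num_requests` (more workload) or on a faster machine (better hardware) shows how each parameter moves the numbers.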

Identifying Performance Problems

Every performance problem is the result of some queue build-up in the system.

  • HTTP Request IO queue
  • Database IO queue
  • Network queue
  • Task Queue

Reason for queue build-up

  • Slow processing
  • Limited resource capacity
  • Serial Requests
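As a rough illustration of the "slow processing" case (the arrival and service rates are made-up numbers), here is a minimal sketch: when requests arrive faster than they are served, the queue grows without bound.

```python
from collections import deque

def simulate(arrival_rate, service_rate, seconds):
    """Return queue length after `seconds` of requests arriving and being served."""
    queue = deque()
    arrived = 0
    for _ in range(seconds):
        # Requests arriving this second join the queue.
        for _ in range(arrival_rate):
            arrived += 1
            queue.append(arrived)
        # The server drains at most `service_rate` requests per second.
        for _ in range(min(service_rate, len(queue))):
            queue.popleft()
    return len(queue)

# Slow processing: 100 req/s arriving, only 80 req/s served.
print(simulate(arrival_rate=100, service_rate=80, seconds=10))  # 200 queued
```

Every second the server falls 20 requests further behind; double the duration and the backlog doubles too.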

Performance Tenets

Efficient Resource Utilisation

Efficient Resources

  • Memory
  • Network
  • Disk
  • CPU

Efficient Code

  • Algorithms
  • Data Structure
  • Database Queries

Efficient Data Storage

  • Type of Database
  • Database Configuration
  • Database Schema
  • Caching

Concurrency: Handling multiple requests simultaneously within a system

  • Hardware: Should support concurrency
  • Software: Frameworks, multi-threading/multi-processing, queuing mechanisms
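As a minimal sketch of software-level concurrency using Python's standard thread pool (`handle_request` and its 0.1 s sleep are stand-ins for real I/O-bound work such as a database call):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(req_id):
    # Simulated I/O-bound work (e.g. a database or network call).
    time.sleep(0.1)
    return f"response-{req_id}"

# Serially, 10 requests * 0.1 s would take about 1 s.
# With 10 worker threads they overlap and finish in roughly 0.1 s.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(handle_request, range(10)))
elapsed = time.perf_counter() - start
print(f"{len(results)} requests in {elapsed:.2f}s")
```

Threads help here because the work is I/O-bound; for CPU-bound work a process pool would be the analogous sketch.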

Capacity: If we increase the hardware capacity, performance will increase.

If, while designing a system, we take care of its efficiency and concurrency, we can make it highly performant.

Objectives while designing a highly performant system

  • Minimize Request-Response Latency: Latency refers to the time between when a request is received by the system and when the response is delivered to the client.

Latency depends on:

  • Wait/idle time: the request is waiting to be picked up for processing
  • Processing time: the system is processing the request
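A minimal sketch of how total latency splits into wait time and processing time (the queue, request IDs, and 10 ms processing delay are all made up for illustration):

```python
import time
from collections import deque

def drain(queue):
    """Process queued requests one by one, splitting latency into wait + processing."""
    results = []
    while queue:
        req_id, arrived = queue.popleft()
        picked_up = time.perf_counter()
        time.sleep(0.01)  # simulated processing work
        done = time.perf_counter()
        results.append({
            "request": req_id,
            "wait_s": picked_up - arrived,     # time spent idle in the queue
            "processing_s": done - picked_up,  # time spent being processed
            "latency_s": done - arrived,       # total = wait + processing
        })
    return results

# Five requests all arrive at once; later ones wait longer in the queue.
queue = deque((i, time.perf_counter()) for i in range(5))
for r in drain(queue):
    print(f"req {r['request']}: wait={r['wait_s']*1000:.1f}ms "
          f"processing={r['processing_s']*1000:.1f}ms "
          f"latency={r['latency_s']*1000:.1f}ms")
```

Even though each request's processing time is the same, latency grows down the queue because wait time accumulates.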

  • Maximize Throughput: Throughput is measured as the rate of requests a system can handle.

Throughput depends on

  • Latency
  • Capacity
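The dependence on latency and capacity can be sketched with a back-of-the-envelope formula: if the system can process `concurrent_workers` requests in parallel (capacity) and each request takes `avg_latency_seconds`, maximum throughput is roughly their ratio. This is a simplified model that ignores queuing and contention effects:

```python
def max_throughput(concurrent_workers, avg_latency_seconds):
    """Each worker completes roughly 1/latency requests per second."""
    return concurrent_workers / avg_latency_seconds

# 10 workers at 50 ms average latency -> roughly 200 requests/second.
print(max_throughput(10, 0.05))
```

The formula makes both levers visible: lowering latency or adding capacity raises the ceiling on throughput.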

Performance Metrics

  • Latency
    • Affects user experience
    • As low as possible
  • Throughput
    • Number of users that can be supported
    • Should be greater than the max request rate
  • Errors
    • Affect the system functionality
    • Should be zero
  • Hardware Consumption
    • Shouldn't be under- or over-utilized
  • Tail Latency

Tail Latency

  • Tail latency is an indication of queuing of requests
    • Gets worse with a higher number of requests
  • Average latency hides the effects of tail latency

In the provided graph, the majority of requests exhibit low latency, but a small percentage of requests experience significantly higher latency. Compared to the average latency, these high-latency outliers are quite poor. Approximately 99% of the requests have low latency, implying that roughly every 100th request encounters high latency, which is unacceptable for a high-performing system. Furthermore, if the workload is increased, the tail latency will worsen, and a greater percentage of requests will experience delays. The cause of tail latency is the queuing of requests. Therefore, to assess performance, it is insufficient to rely solely on the average latency; measuring the tail latency is also crucial.
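A small illustration of why the average hides the tail (the latency values are fabricated): 99 fast requests plus one slow one give a modest average but a terrible 99th percentile.

```python
# 99 requests at 10 ms and one at 1000 ms.
latencies_ms = [10] * 99 + [1000]

avg = sum(latencies_ms) / len(latencies_ms)
# p99: the value below which 99% of requests fall (simple index method).
p99 = sorted(latencies_ms)[int(0.99 * len(latencies_ms))]

print(f"average={avg:.1f}ms p99={p99}ms")  # average=19.9ms p99=1000ms
```

A dashboard showing only the 19.9 ms average would completely miss that one in every hundred users waits a full second.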

In Part 2, we'll discuss:

  • Network Latency
  • Disk Latency
  • Memory Latency
  • Processing Latency
