Disaster recovery (DR) is a process of regaining access and functionality of the infrastructure after events like a natural disaster, cyber attack, or even business disruptions.
Disaster recovery relies upon the replication of data and computer processing in an off-premises location not affected by the disaster. When servers go down because of a disaster, a business needs to recover lost data from a second location where the data is backed up. Ideally, an organization can transfer its computer processing to that remote location as well in order to continue operations.
Disaster Recovery is often not actively discussed during system design interviews but it's important to have some basic understanding of this topic. You can learn more about disaster recovery from AWS Well-Architected Framework.
Why is disaster recovery important?
Disaster recovery can have the following benefits:
- Minimize interruption and downtime
- Limit damages
- Fast restoration
- Better customer retention
Terms
Let's discuss some important terms relevantly for disaster recovery:
RTO
Recovery Time Objective (RTO) is the maximum acceptable delay between the interruption of service and restoration of service. This determines what is considered an acceptable time window when service is unavailable.
RPO
Recovery Point Objective (RPO) is the maximum acceptable amount of time since the last data recovery point. This determines what is considered an acceptable loss of data between the last recovery point and the interruption of service.
Strategies
A variety of disaster recovery (DR) strategies can be part of a disaster recovery plan.
Back-up
This is the simplest type of disaster recovery and involves storing data off-site or on a removable drive.
Cold Site
In this type of disaster recovery, an organization sets up basic infrastructure in a second site.
Hot site
A hot site maintains up-to-date copies of data at all times. Hot sites are time-consuming to set up and more expensive than cold sites, but they dramatically reduce downtime.
This article is part of my open source System Design Course available on Github.
karanpratapsingh / system-design
Learn how to design systems at scale and prepare for system design interviews
System Design
Hey, welcome to the course. I hope this course provides a great learning experience.
This course is also available on my website and as an ebook on leanpub. Please leave a β as motivation if this was helpful!
Table of contents
-
Getting Started
-
Chapter I
-
Chapter II
-
Chapter III
- N-tier architecture
- Message Brokers
- Message Queues
- Publish-Subscribe
- Enterprise Service Bus (ESB)
- Monoliths and Microservices
- Event-Driven Architecture (EDA)
- Event Sourcing
- Command and Query Responsibility Segregation (CQRS)
- API Gateway
- REST, GraphQL, gRPC
- Long polling, WebSockets, Server-Sent Events (SSE)
-
Chapter IV
- β¦
Top comments (0)