DEV Community

Manoj Kumar Patra
Manoj Kumar Patra

Posted on

AWS Basics - Beginner's Guide to RDS

RDS

RDS stands for Relational Database Service

RDS is generally used for Online Transaction Processing (OLTP) workloads.

RDS Datatypes include: SQLServer, Oracle, MySQL, PostgreSQL, MariaDB and Amazon Aurora.

How is OLTP different from OLAP (Online Analytics Processing)?

OLTP OLAP
Processes data from transactions in real time Processes complex queries to analyse historical data
It is all about data processing and completing large number of small transactions in real time It is all about data analysis using large amounts of data, and complex queries that take a long time to complete

Is RDS suitable for analyzing large amounts of data?

RDS is not suitable for analyzing large amounts of data. Instead use a data warehouse like RedShift which is optimized for OLAP.

Steps to connect an RDS MySQL instance to an EC2 instance

  1. Launch a RDS instance from Services/Database/RDS/Create Database.
  2. Launch an EC2 instance
  3. Install the MySQL database client on the EC2 instance with User Data.

    #!/bin/bash
    yum update -y
    yum install mysql -y
    
  4. Check the mysql install status on EC2 instance with mysql --version from the command line.

  5. Edit the security groups for the RDS instance to include a new Inbound Rule with type MySQL/Aurora and source set to EC2 instance's security group that the RDS instance should connect to.

  6. Connect from the EC2 instance to RDS instance using the MySQL database client:

mysql -u <DB_username> -p -h <DB_endpoint> <DB_name>.

Some common MySQL commands are as follows:

  • status: To check the status of the database
  • show databases;: To list all databases
  • exit: To quit the database connection

Backup RDS Data

Database snapshot Automated Backup
Manual, ad-hoc and user initiated Enabled by default
Provides a snapshot of the storage volume attached to the RDB instance Creates daily backups or snapshots that run during a backup window we define
Creates transaction logs used to replay transactions during the restore process

Automated Backup

Point-in-time recovery: Recover database to any point in time within a defined retention period (1 - 35 days).

In order to do this, it takes a full daily backup (or snapshot) storing it in S3 and stores transaction logs throughout the day.

During the recovery, AWS will choose the most recent daily backup and then run the transaction logs for that day up to the recovery point.

Free storage space for automated backup is equal to the size of the database.

Database snapshot

Database snapshots are not automated. They don't have a retention period, which means, they are not deleted even after deleting the RDS instance.

Database instances can be backed up and restored to a known state frequently and at any time.

The restored version of the database will always be a new RDS instance with a new DNS endpoint.

Database encryption

Encryption can only be enabled at creation time.

Encryption is done using AWS Key Management Service (KMS) and is of type AES-256.

Encrypting a database instance also encrypts all underlying storage, automated backups, snapshots, logs and read replicas.

How can we encrypt a database instance at a later stage?

To encrypt a database instance at a later stage, create a snapshot, then encrypt the snapshot and finally, perform a database restore using the encrypted snapshot.

RDS Multi-AZ

Multi-AZ is an exact copy of the production database (primary instance) in another Availability Zone. These copies are also referred to as standby instances.

Multi-AZ is for disaster recovery, and not for improving performance. So, standby instances are accessible from EC2 instances only in case of failure in connection to the primary instance or when a maintenance is going on.

To improve performance, use Read replicas.

Almost all RDS types can be configured as Multi-AZ - SQL Server, Oracle, MySQL, Postgres, MariaDB.

Read replicas

A read replica is a read-only copy of the primary database.

Read replicas can be located in the same availability zone as the primary database, in a different availability zone or even be in a different region.

Read replicas will have their own DNS endpoints.

Read replicas can be changed to their own databases. If we do so, it will break the connection to it's primary database thus, stopping further replication.

Automatic backups must be enabled in order to deploy a read replica.

Multiple read replicas: MySQL, MariaDB, PostgreSQL, Oracle and SQL server allow up to 5 read replicas on each DB instance.

Read replicas are good for read-heavy workloads.

Top comments (0)