DEV Community

Msaghu
Msaghu

Posted on • Edited on

Free AWS BootCamp WEEK 4 PART 2: PostgreSQL Databases in AWS

Hi guys, we will now cover how to install PostgreSQL on our local machine for use to run SQL commands. But first, we will have go through some basic definitions:

What is AWS Relational Database Service(RDS)?

What are the advantages of AWS Relational Database Service(RDS)?

  • Allows you to create and scale relational databases in the cloud.
  • RDS runs on virtual machines (can’t log in to the OS or SSH in).
  • AWS handles admin tasks for you like hardware provisioning, patching & backups.
  • RDS is not serverless — (one exception Aurora Serverless)
  • Allows you to control network access to your database Offers encryption at rest — done with KMS (data stored, automated backups, read replicas and snapshots all encrypted)
  • AWS Aurora is the managed version of SQL databases that is fully managed by AWS.
  • When create a database we need to be sure that it has been created in the same region where our account is.
  • You can create and modify a DB instance by using the AWS Command Line Interface (AWS CLI), the Amazon RDS API, or the AWS Management Console.
  • Amazon RDS is responsible for hosting the software components and infrastructure of DB instances and DB cluster. We are responsible for query tuning, which is the process of adjusting SQL queries to improve performance.
  • We can run our DB instance in several Availability Zones, an option called a Multi-AZ deployment. When we choose this option, Amazon automatically provisions and maintains one or more secondary standby DB instances in a different Availability Zone. Your primary DB instance is replicated across Availability Zones to each secondary DB instance.
  • We use security groups to control the access to a DB instance. It does so by allowing access to IP address ranges or Amazon EC2 instances that you specify.

Supported RDS Database engines

A DB engine is the specific relational database software that runs on your DB instance. Amazon RDS currently supports the following engines:
-MariaDB

-Microsoft SQL Server

-MySQL

-Oracle

-PostgreSQL

RDS Main Features

Multi AZ Recovery

  • Have a primary and secondary database, if you lose the primary database, AWS would detect and automatically update the DNS to point at the secondary database.
  • Used for DISASTER RECOVERY, it doesn’t improve performance.

Read Replicas

  • Every time you write to the main database, it is replicated in the secondary database.
  • If you lose the primary database there is no automatic failover, you need to manually update the URL to it yourself
  • IMPROVES PERFORMANCE
  • Used for scaling
  • Automatic backups must be turned on
  • Up to 5 read replicas of any database
  • It is possible to have read replicas of read replicas - but this can introduce latency.
  • Each read replica has its own DNS
  • Can have multi AZ

RDS Backups

Automated Backups

  • Allows you to recover your database to any point in time within the specified retention period (Max 35 days)
  • Takes daily snapshots and stores transition logs
  • When recovering AWS will choose the most recent backup
  • Enabled by default
  • Backup data is stored in S3
  • May experience latency when backup is being taken
  • Backups are deleted once you remove the original RDS instances

Database Snapshot

  • User-initiated, must be manually done by yourself
  • Stored until you explicitly delete them, even after you delete the original RDS instance they are still persisted. However this is not the case with automated backups.

Step 1: Install PostgreSQL

  • Go to the link below
    Follow the link

  • Select click on the installer and choose the latest version to begin the download process.

  • Follow the prompts, and finish download(do not choose the Stack driver for this purpose).

  • From Windows , open Postgresql and copy the shell and pgadmin4 onto the desktop.

Step 2: Connect to the DB client.

  • We can connect using 2 options:
    1. GUI Client - i.e using DataGrip, Postico(for MAC users), pgAdmin(for Windows users).
    2. Terminal/CMD - allows us to use the commands to access the DB.
    3. Application - we write a server side application to return data from our database.

Step 3: Startup the Database Server to connect to our local server.

  • We will open the PostgreSQL shell, we can also use this to connect to a remote server by pasting in the url during
Server[localhost]: --paste in url here--
Enter fullscreen mode Exit fullscreen mode
  • So we will press enter to cycle through the following
Server[localhost]:
Database[postgres]:  
Port[5432]: 
Username[postgres]:
Password for user postgres:  (now enter the password we used when configuring our system)
Enter fullscreen mode Exit fullscreen mode
  • To use the GUI , click on pgAdmin4

Step 4: Create a Database

  • While using the Postgres shell, type the following commands
psql: \l                    --------> lists all databases
psql: CREATE DATABASE test; --------> creates a database with the name test
Enter fullscreen mode Exit fullscreen mode

Step 5: Connect to a Database

  • To be able to use Postgresql in our cmd, we need to add Postgresql to PATH, we can do this by adding C:\Program Files\PostgreSQL\15\bin and C:\Program Files\PostgreSQL\15\lib to path

  • Therefore to access the postgres client from the CMD type

Users> psql --help

Enter fullscreen mode Exit fullscreen mode
  • In the command below postgres is the default username and test is the database we created above. The command will drop us into the postgres shell for the existing DB.
Users> psql -h localhost -p 5432 -U postgres test
Enter fullscreen mode Exit fullscreen mode

-To connect to the database when we are in the postgres shell, we will run:

test=# \l                      ----> lists all databases
test=# \c test
test=# \c killmonger           ----> to connect to another database called killmonger, we simply use the command
test=# DROP DATABASE test;     ----> to permanently delete a database

Step 6: Creating tables and adding data in PostgreSQL
- While logged into our database above we will run the following
Enter fullscreen mode Exit fullscreen mode

test=# CREATE TABLE person(
test(# id INT,
test(# first_name VARCHAR(50),
test(# last_name VARCHAR(50),
test(# gender VARCHAR(7),
test(# date_of_birth DATE);

-- To se the table we will run
test=# \d ----> to describe the table
test=# \d person ----> shows the table schema


- To create a table with constraints i.e we must have values
Enter fullscreen mode Exit fullscreen mode

test=# CREATE TABLE person(
test(# id BIGSERIAL NOT NULL PRIMARY KEY,
test(# first_name VARCHAR(50) NOT NULL,
test(# last_name VARCHAR(50) NOT NULL,
test(# gender VARCHAR(7) NOT NULL,
test(# date_of_birth DATE) NOT NULL,
test(# email VARCHAR(150) );

-- To se the table we will run
test=# \d ----> to describe the table
test=# \d person ----> shows the table schema


- To add records(which are rows) into tables
INSERT INTO person(
    first_name,
    last_name,
    gender,
    date_of_birth)
VALUES ('Anne', 'Smith', 'FEMALE', DATE '1988-01-09');

Enter fullscreen mode Exit fullscreen mode

test=# \dt ----> to see just the table
test=# INSERT INTO person(first_name, last_name, gender, date_of_birth)
test=# VALUES( 'Anne', 'Smith', 'FEMALE', '1988-01-09');

--To add a value with the email column:
test=# \dt ----> to see just the table
test=# INSERT INTO person(first_name, last_name, gender, date_of_birth, email)
test=# VALUES( 'Jake', 'James', 'MALE', '1988-01-09', 'jake@gmail.com');


- As we have seen so far, physically typing in the data is time-consuming, therefore, if we need to work with large datasets , the exercise will be almost impossible to achieve, therefore to add multiple people we can use websites such as **Mockaroo**

- We will create our file there and open it with VSCode to ensure that some values are NOT NULL as below
Enter fullscreen mode Exit fullscreen mode

create table person (
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(150),
gender VARCHAR(15) NOT NULL,
date_of_birth DATE NOT NULL,
country_of_birth VARCHAR(50)
);


- To import/ open it in postgres we will run
Enter fullscreen mode Exit fullscreen mode

--be sure to copy the file path from where our file is located and make use of the slash as shown below and wrap in single quotes

test=# \i 'C:/path/path/person.sql'


- To make sure that we only have current values, we can drop the previous table and create a new table 
Enter fullscreen mode Exit fullscreen mode

test=# \d person
test=# DROP TABLE person; ---> since some values in our imported table are not in the existing table, we will delete the existing table
test=# \d
test=# \i 'C:/path/path/person.sql'
test=# SELECT * FROM person; --->all values are now pulled in


- To ensure that they are automatically numbered, add id into the code in VSCode
Enter fullscreen mode Exit fullscreen mode

create table person (
id BIGSERIAL NOT NULL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(150),
gender VARCHAR(15) NOT NULL,
date_of_birth DATE NOT NULL,
country_of_birth VARCHAR(50)
);


- We will repeat the following commands:
Enter fullscreen mode Exit fullscreen mode

test=# DROP TABLE person; ---> since some values in our imported table are not in the existing table, we will delete the existing table
test=# \i 'C:/path/path/person.sql'
test=# SELECT * FROM person; --->all values are now pulled including the id




**Step 7: Using commands**
test=# SELECT * FROM person;  ----> shows all columns/fields in the table
test=# SELECT first_name, last_name FROM person;

***Sorting using the ORDER BY keyword***

test=# SELECT * FROM person ORDER BY country_of_birth;  ----> sort by ascending order by default
test=# SELECT DISTINCT(country_of_birth) FROM person ORDER BY country_of_birth;        ----> to only get the country value once

***Sorting using the WHERE keyword***
test=# SELECT * FROM person WHERE gender = 'Female';
test=# SELECT * FROM person WHERE gender = 'Female' AND (country_of_birth='Poland' OR country_of_birth='China');
test=# SELECT * FROM person WHERE gender = 'Female' AND (country_of_birth='Poland' OR country_of_birth='China') AND last_name = 'Kimble';

***Comparison operators***
test=# SELECT 1 < 1;
test=# SELECT 1 >= 2;
test=# SELECT 1 <> 1;   ----> 1 not equal to 2
test=# SELECT 1 + 1;

***LIMIT and OFFSET keywords***
test=# SELECT * FROM person LIMIT 12;
test=# SELECT * FROM person OFFSET 5 LIMIT 5;   ----> starts from id 6 to 10
SELECT * FROM person OFFSET 5 FETCH FIRST 5 ROW ONLY;
test=# SELECT * FROM person OFFSET 5 FETCH FIRST 1 ROW ONLY;

***IN keywords***
test=# SELECT * FROM person WHERE country_of_birth='China' OR country_of_birth='Brazil' OR country_of_birth='France';
test=# SELECT * FROM person WHERE country_of_birth IN('China', 'Brazil', 'France');
test=# SELECT * FROM person WHERE country_of_birth IN('China', 'Brazil', 'France', 'Kenya', 'Cambodia') ORDER BY gender;
test=# SELECT * FROM person WHERE country_of_birth IN('China', 'Brazil', 'France', 'Kenya', 'Cambodia') ORDER BY country_of_birth;

***BETWEEN keyword to get data from a range***
test=# SELECT * FROM person WHERE date_of_birth BETWEEN DATE '2022-01-01' AND '2022-03-30';

***LIKE keyword values against a pattern using wildcards***
- % is a wildcard thus accepts all entries
- ILIKE accepts both uppercase and lowercase
test=# SELECT * FROM person WHERE email LIKE '%@google.com'; 
test=# SELECT COUNT(*) FROM person WHERE email LIKE '%@google.%'; 
test=# SELECT COUNT(*) FROM person WHERE country_of_birth ILIKE 'p%';       ----> ignores the case

***GROUP BY ***
test=# SELECT country_of_birth, COUNT(*) FROM person GROUP BY country_of_birth;
test=# SELECT country_of_birth, COUNT(*) FROM person GROUP BY country_of_birth ORDER BY country_of_birth; 

***HAVING keyword***
test=# SELECT country_of_birth,COUNT(*) FROM person GROUP BY country_of_birth HAVING COUNT(*)<=140 ORDER BY country_of_birth;





####RESOURCES
1. [Adding to environment variables to PATH in Windows](https://www.youtube.com/watch?v=2oAM4Q-9DMU)
2. [PostgreSQL Data types - Documentation](https://www.postgresql.org/docs/current/datatype.html)
3. [PostgreSQL Aggregate Function Documentation](https://www.postgresql.org/docs/9.5/functions-aggregate.html)
Enter fullscreen mode Exit fullscreen mode

Top comments (0)