Thy Pham

Posted on Jan 27, 2022

What we need to know about the database

#programming #database #computerscience

In this post, I'm not trying to explain every point in detail, but rather giving an overview with a brief description of some important aspects that we need to keep in mind while working with databases.

Types of databases

Relational databases

Data are structured using tables with columns and rows. The relationship between tables is predefined. SQL is used to query the data in the database.

E.g. PostgreSQL, MySQL, MariaDB, SQLite.

Non-Relational Databases (NoSQL - aka Not Only SQL)

Data are not organized in tables but rather have a more flexible structure. These databases are designed for easy scaling horizontally and ease of use. There are some main types of Non-Relational databases:

Document databases

Document databases store each item in a JSON, BSON, or XML encoded object called a document.

E.g. MongoDB, CouchDB,

Key-Value store

Each item has a key and a value. E.g. name=John, last_name=Doe.

E.g., Redis, Memcached

Column-oriented databases

Data is stored in columns. It is normally used when our data items have a huge amount of properties (each property is stored in one column), and we just want to query some of the properties to avoid unnecessary data.

E.g., Cassandra, HBase

Graph databases

Data is held in nodes and edges. The main focus of this type of database is the connection between nodes (using edges).

E.g., Neo4j, Amazon Neptune

Consistency model

ACID

Atomic, Consistent, Isolated, Durable

BASE

Basic Availability, Soft-state, Eventual consistent

Database replication

The data has its copies on multiple interconnected machines. These machines can be in different locations. To reduce access latency, increase availability and read throughput. Some typical database replication techniques are:

Single-Leader

New data is written on only one single machine called Leader. The data will then be replicated to other machines. These machines are called Replicas. Reading data can be done on both Leader and Replicas.

Multi-Leader

Same as Single-Leader, but there are multiple Leaders. Each data center has one Leader. Data written to one Leader will be replicated to the other Leaders.

Leaderless

There is no Leader. The Replicas receive read and write requests directly.

Transaction

We use transactions when we want to execute a set of operations that can either be successful altogether or not at all.

Index

An index is a data structure that helps read data faster with a charge of slower write operations and more space needed.

Database backup

Database backup is a way to make a copy of the data. In case of a disaster that causes data loss, we can use this copy to restore our important data.

Full backup

Backup everything

Differential backup

Only backup what was changed since the last full backup

Incremental backup

Backup what was changed since the last incremental backup

Database migration

Our software often requires changes in the code and the database schema. When we deploy new code to production, the database should be updated as well. This process is called database migration. We have to make sure that during the migration progress, the database can always be rolled back to its original state if something wrong happens.

(Bonus) Serverless databases

Cloud providers like Amazon or Google will manage everything (provisioning, scaling, maintenance, etc.) for us. We just need to pay for what we use (pay-per-use).

Top comments (1)

LarryWilliams • Jan 28 '22

What do I need to know about database?
powerful islamic mantras

Forem