Sabiha Ali

Posted on Jan 28, 2023 • Edited on Jan 30, 2023

Columnar database and Row database — how are they different?

#aws #database #analytics #olap

We want to make our databases fast so that the queries which we run should return the results fast.

By changing how we store the data in our database we can make the database respond to the queries and optimize our database to either serve the analytical workload or the transactional workload which we call OLAP and OLTP.

Computers in general, even the databases, read the data from disks, hard disks. We are not talking about the RAM data which gets deleted once the computer is shut down but we are talking about the hard disks where the data is stored permanently.

On these storage disks data is stored in the form of blocks. it is evident that if the computer has to read fewer blocks, it is going to take less time and if the computer is going to read data from more blocks, it is going to take more time.

If the data which we query are on fewer blocks, then the response of the database would be faster.

Let us understand this with an example

This is a sample sales table

Let us see how this data will be stored in a row-based data store.

In the row-based format, the entire record is stored in one block meaning as you can see in the figure, the record of the first customer — — -the customer name, item ID, and sales amount is stored in the first block.

Therefore, if we are looking for the first customers details you can find all that in one single block. This type of data stores will be great for transactional databases where we need the details of a single entity.

If I have to take the average of all the sales then I would have scan through all the five blocks of data, resulting in more IO and more cost.

Let us see how this data will be stored in a row-based data store.

Data in the column stores are stored differently. Here entire columns are written down inside one block so here we have all the customer names in one block all the item IDS in the next block and all the sales amount in the third block.

So, instead of having each customer’s data together we have the whole column of data together.

If I want to access the sum or the average of all the sales amount this database would be fast because everything would be saved in one single block.

I will use this type of database for analytical queries like sum or avarages etc

Columnar database is also very suitable for compression because each block will have the same type of data either numbers or strings etc. so it would be very easy to compress the data in the block efficiently.

How would queries work with these different storage methods
Transactional style query
Eg:

SELECT Customer_name , Sales

FROM data

WHERE Item_ID=3000

This is a transactional style query where we need all details of one person.

In the row database, we need to scan the data from just one block whereas in the columnar database we would have to scan through a lot of blocks.

Analytical style query
Eg:

SELECT Item_ID, count(1)

FROM data

GROUP BY Item_ID

Here we want to count how many observations we have from each Item_ID, which could be done by reading a single block in the Columnar database, whereas in the row database we would have to scan through all the blocks

Happy Databasing !!!!

By, Sabiha Ali, Solutions Architect, ScaleCapacity Inc.

Deploy and scale your apps on AWS and GCP with a world class developer experience

Coherence makes it easy to set up and maintain cloud infrastructure. Harness the extensibility, compliance and cost efficiency of the cloud.

Learn more

DEV Community

Columnar database and Row database — how are they different?

Deploy and scale your apps on AWS and GCP with a world class developer experience

Top comments (0)

A Workflow Copilot. Tailored to You.

Read next

How to Fix Next.js CloudFront Permission Denied: Complete Guide to Static Export URL Issues

Automating Data Analysis with Python: A Hands-On Guide to My Project

Amazon Q Developer Tips: No.13 Generating perfect functions

🚀 "Building an Enterprise-Grade 💼 Generative AI Chatbot 🤖 Using Amazon Q" 🔍

Okay