DEV Community

Cover image for Day 2 - Importing data and practicing queries
uguremirmustafa
uguremirmustafa

Posted on

Day 2 - Importing data and practicing queries

Today we will learn how to find some structured data and import it into our database.

Generating some dummy data

There are some online tools for generating dummy data for testing purposes. Mockaroo is my favorite.

Let's create some user data and download it as CSV which is one of the most popular ways to store data in a text file.

Make sure your CSV headers match your table column names, or be ready to map them manually. Handle data types carefully - dates, numbers, and text need to be in the right format. Mockaroo has options to create your data in the correct format.

Mockaroo screenshot

Preparing docker container

We need to run some sql statement to import the data.

That's great but how am I going to pass the data to docker?

Volume mapping in a nutshell:

It's like creating a shared folder between your computer and the Docker container.

The Problem: Docker containers are isolated - they can't see files on your computer by default.

The Solution: Volume mapping creates a "bridge" so both your computer and the container can access the same files.

We need to re-create our docker container. Luckly we don't have any crucial data to save so we can delete the container and create it with a volume mapping.

# stop container
docker stop travel_mania_container

# delete container
docker rm travel_mania_container

# create a folder for the project and
# for the data to share between the host and container
cd ~ && mkdir travel_mania_project \
&& cd ./travel_mania_project \
&& mkdir container_data

# re-create container with a volume mapping
docker run --name travel_mania_container \
  -e POSTGRES_USER=ugur \
  -e POSTGRES_PASSWORD=ugur1234 \
  -e POSTGRES_DB=travel_mania \
  -p 5432:5432 \
  -v ./container_data:/data \
  -d postgres
Enter fullscreen mode Exit fullscreen mode

After this operation don't forget to re-create the users table.

Importing CSV

You can import your data using SQL statements or using GUI of your SQL client. But what happens if your column names does not match 1:1 with the table definition?

My sample csv does not match with the actual table definition

In that case, we need to map our column names somehow.

Using column order

First option is to rely on column order. The HEADER option tells PostgreSQL to skip the first row, and you're explicitly mapping the column order by delimiting with comma.

-- Your CSV has:   id, email_addr, username, fname,      lname,     creation_time, update_time
-- Your table has: id, email,      username, first_name, last_name, created_at,    updated_at

COPY users(id, email, username, first_name, last_name, created_at, updated_at)
FROM '/data/MOCK_DATA.csv'
DELIMITER ','
CSV HEADER;
Enter fullscreen mode Exit fullscreen mode

Now you can run the COPY statement we wrote above. Here is the result:

psql select result

Notice how the column names are mapped correctly.

Using a temp table

You can create a temporary table and insert the csv there. Later insert into actual table from that temp table. In this method you don't import into your main table directly and have an opportunity to map columns more granully.

-- Create temp table matching your CSV exactly
CREATE TEMP TABLE temp_users (
  id INTEGER,
  user_email VARCHAR(255),
  username VARCHAR(50),
  fname VARCHAR(100),
  lname VARCHAR(100),
  creation_time TIMESTAMP,
  update_time TIMESTAMP
);

-- Import to temp table
COPY temp_users FROM '/data/MOCK_DATA.csv' DELIMITER ',' CSV HEADER;

-- Insert into your real table with proper mapping
INSERT INTO users
  (id, email, username, first_name, last_name, created_at, updated_at)
SELECT
  id, user_email, username, fname, lname, creation_time, update_time
FROM temp_users;

-- Clean up
DROP TABLE temp_users;
Enter fullscreen mode Exit fullscreen mode

Practicing queries

Question: Find all users whose first name starts with 'J' and last name starts with 'F'. Order by first name ascending.

Solution:

select
    username,
    first_name,
    last_name,
    email
from
    users
where
    first_name like 'J%'
    and last_name like 'F%'
order by
    first_name;
Enter fullscreen mode Exit fullscreen mode

Question: Find users whose username contains numbers and group them by email domain (gmail.com, yahoo.com, etc.).

Solution:

select
    SPLIT_PART(email, '@', 2) as email_domain,
    COUNT(*) as users_with_numbers_in_username
from
    users
where
    username ~ '[0-9]'
    -- Contains at least one number
group by
    SPLIT_PART(email, '@', 2)
having
    COUNT(*) > 0
order by
    users_with_numbers_in_username desc;
Enter fullscreen mode Exit fullscreen mode

End of Day 2

Today we learned how to import large dataset into the database using COPY statement.

Tomorrow we will learn about string functions like SPLIT_PART, LENGTH, TRIM and date functions AGE, DATE_PART, TO_CHAR.

Top comments (0)