Mwenda Harun Mbaabu

Posted on Nov 26, 2024

Exploring Advanced SQL Techniques: Pivoting, Data Modeling, and Interview Challenges

I have been polishing my SQL skills and have been practicing many SQL interview questions, and here is one of the most interesting questions I have come across. The challenge involves working with a sales dataset and using SQL pivot operations to analyze and summarize data.

Question 1.
We have a table sales_transactions that contains records of product sales, including details like transaction_id, product_name, category, region, sales_amount, and sales_date. The task is to write an SQL query that pivots the data to show the total sales for each product category by region.

Table Structure:
The sales_transactions table has the following columns:

The goal:
The goal is to pivot the sales data so that for each region, you can see the total sales for each category.

The result should look something like this:

1). Write an SQL query to pivot this data and return the total sales amount for each product category by region.

2). Explain your approach to pivoting this data in SQL and the reasoning behind the steps.

HINT: Note that in this case, you can use the PIVOT operator, which is available only in SQL Server and Oracle, or alternatively, you can use the CASE statement.

You can use this interactive web-based compiler to complete the solution: https://onecompiler.com/mysql/42z9a9edr

More Questions.

1). Database Normalization and Denormalization.

Question:
You are tasked with designing a database for an e-commerce system. The system has users, products, orders, and reviews.'
What normalization steps would you take to design the schema? What are the potential trade-offs if you decide to denormalize the schema for better query performance?

Key Concepts to Discuss:

Normalization (1NF, 2NF, 3NF, BCNF)
Denormalization and its impact on performance
Redundancy vs. query optimization trade-offs
Use of indexing and partitioning

2). Designing a Fact and Dimension Model for Data Warehousing.

Question:
Imagine you are designing a data warehouse for a retail company. You need to model the sales data. Design a fact table and dimension tables, and explain how you would handle slowly changing dimensions (SCDs).

Key Concepts to Discuss:

Fact table design (granularity, foreign keys)
Dimension tables (static vs. slowly changing dimensions)
Types of SCDs (SCD Type 1, SCD Type 2, SCD Type 3)
Surrogate keys vs. natural keys
Schema design: Star schema vs. Snowflake schema

3). Handling JSON Data in SQL.

Question:
You have a user_data table that stores user profiles as JSON data in the profile column. The JSON structure includes first_name, last_name, and preferences (which is a nested JSON object containing theme, notifications, etc.).

Write an SQL query to extract the first_name and theme from the JSON data.

Expected SQL Query (for databases like MySQL or PostgreSQL):

SELECT 
    profile->>'$.first_name' AS first_name,
    profile->>'$.preferences.theme' AS theme
FROM 
    user_data;

Key Concepts to Discuss:

Handling JSON data types in SQL (JSON functions in MySQL/PostgreSQL)
Querying nested JSON objects
Performance considerations for querying JSON fields

4). SQL for Data Quality Checks

Question:
You are tasked with performing data quality checks on a customer database that contains a table customers with the following columns: customer_id,email, phone_number, date_of_birth, and registration_date.

Write an SQL query to identify all customers who have either no phone number or an invalid email address.

Expected SQL Query:

SELECT customer_id, email, phone_number
FROM customers
WHERE phone_number IS NULL 
   OR email NOT LIKE '%@%.%';

Key Concepts to Discuss:

Handling missing values (IS NULL)
Basic data validation using string patterns (e.g., LIKE for email validation)
Data cleaning techniques in SQL

4). Transaction Management and ACID Properties.

Question:
You are designing a banking system and need to handle transfers between accounts.

Write an SQL transaction that transfers money from one account to another, ensuring that the transaction follows the ACID properties (Atomicity, Consistency, Isolation, Durability).

Expected SQL Query:

START TRANSACTION;

UPDATE accounts 
SET balance = balance - 500 
WHERE account_id = 1;

UPDATE accounts 
SET balance = balance + 500 
WHERE account_id = 2;

COMMIT;

If there is an error in the middle of the transaction, the changes should be rolled back.

Key Concepts to Discuss:

The use of START TRANSACTION, COMMIT, and ROLLBACK
Isolation levels: READ COMMITTED, SERIALIZABLE, REPEATABLE READ
Ensuring consistency during transactions

I understand that some of the problems and questions might be challenging, but mastering advanced SQL and data modeling techniques is essential for tackling complex real-world data challenges. Whether you're optimizing queries, designing efficient database schemas, or managing large datasets, the skills you develop will be key to your success in data analysis, engineering, or software development roles. Keep practicing, stay curious, and continue to build your expertise

Top comments (3)

Aasir Waseer • Nov 27 '24

This exceptional SQL guide brilliantly demystifies advanced database techniques, offering a comprehensive roadmap for professionals seeking to elevate their technical skills. By presenting intricate interview-style challenges across data pivoting, normalization, and transaction management, the document transforms complex concepts into actionable strategies.

Mwenda Harun Mbaabu • Nov 27 '24

Thank you for the feedback.

Shawn Guillemette • Nov 27 '24

No windowing functions? Those are great analytical parts of SQL that i would suggest before a pivot operator is taught