interviewgpt

Posted on May 17

20 Tricky SQL Interview Questions That Separate Good Candidates from Great Ones (2026)

#interview #sql

20 Tricky SQL Interview Questions That Separate Good Candidates from Great Ones (2026)

SQL interviews at top tech companies have evolved. Forget basic SELECT queries — modern data engineering and backend interviews test your ability to handle window functions, complex aggregations, recursive CTEs, and real-world analytical scenarios.

This guide covers the most challenging SQL interview questions you'll face, each linked to a full solution.

What SQL Interviewers Are Really Testing

Most candidates can write a GROUP BY query. What separates top performers is their ability to:

Use window functions (ROW_NUMBER, RANK, LAG, LEAD, SUM OVER) fluently
Write self-joins for hierarchical or sequential data problems
Handle NULLs carefully and intentionally
Think about performance — indexes, query plans, and avoiding full scans
Pivot and reshape data dynamically
Apply set theory to solve complex membership and exclusion problems

The SQL Questions (With Solutions)

📊 Window Functions & Rankings

Nth Highest Distinct Salary

Given an 'Employee' table with 'id' (Primary Key) and 'salary' (Integer), write a SQL query to retrieve the Nth highest distinct salary. If there are fewer than N distinct salaries in the table, the query should return NULL. The solution should be flexible enough to handle any integer input N.
Monthly Department vs Company Salary Benchmarking

You are given two tables: 'Salary' (containing 'id', 'employee_id', 'amount', and 'pay_date') and 'Employee' (containing 'employee_id' and 'department_id'). For every month present in the data, determine if the average salary within each department was higher, lower, or equal to the average salary of the entire company for that same month. Return a result table with the month (formatted as 'YYYY-MM'), the 'department_id', and a 'comparison' column indicating 'higher', 'lower', or 'same'.

🔁 Sequential & Time-Based Problems

Consecutive Product Usage Streak

Given a table filed_taxes with columns filing_id, user_id, filing_date, and product, find all users who have filed their taxes using any 'TurboTax' product for at least three consecutive years. Note that 'TurboTax' products may have different version names (e.g., 'TurboTax Basic', 'TurboTax Deluxe'). Assume each user files at most once per calendar year. Return the results sorted by user_id.
Consecutive SaaS Subscription Retention

Given a table filed_taxes with columns filing_id, user_id, filing_date, and product, identify users who have demonstrated loyalty by using any version of the 'TurboTax' product suite for at least three consecutive calendar years. Note that 'TurboTax' may appear in the product column under various names (e.g., 'TurboTax Deluxe 2023', 'TurboTax Free Edition'). A user may have multiple entries, but only one filing per year should be considered for the continuity count. Output the list of unique user IDs sorted numerically/alphabetically.
Consecutive High Attendance Records

You are given a table Stadium containing stadium visit data with columns id (integer), visit_date (date), and people (integer). The id is an auto-incrementing value that corresponds with the chronological order of the visit_date. Write a SQL query to identify all sequences of three or more consecutive IDs where each visit in that sequence had a population of 100 or more people. The result should include the original row details and be ordered by the visit date.

🏅 Customer Retention & Loyalty

Identifying Multi-Year Customer Loyalty

Find customers who made purchases in every year of a multi-year window. Tests intersection and set-based thinking.
Identifying Multi-Year Customer Retention

A variation focusing on customer cohort retention rates across annual periods.

🔄 Data Transformation & Pivoting

Finding Median Rows per Partition Given a table Employee with columns id (INT, PK), company (VARCHAR), and salary (INT), write a SQL query to identify the specific rows that represent the median salary for each company.

Rules for median calculation:

Sort salaries in ascending order for each company.
In case of identical salaries, use id as a secondary ascending sort key to break ties.
If a company has an odd number of employees, return the single middle row.
If a company has an even number of employees, return the two middle rows.

The result should include the id, company, and salary columns for these specific median records.

Title Case Transformation with Hyphen Sensitivity You are given a table user_content with a column content_text. Your task is to generate a report that transforms the text into Title Case based on specific business rules.

The first letter of every word must be capitalized, and all subsequent letters in that word must be lowercase.
The transformation must recognize a hyphen (-) as a word boundary, meaning segments on both sides of a hyphen (e.g., 'high-speed') must be capitalized ('High-Speed').
All original spacing, including multiple consecutive spaces, must be preserved exactly as they appear in the source data.

Return the original content_id, content_text, and the newly transformed_text column.

Correcting Swapped Sequences in Delivery Logs A delivery platform experienced a technical glitch where every pair of consecutive items in their order logs was swapped (i.e., the item intended for ID 1 was recorded for ID 2, and vice versa). You are given a table orders with columns order_id (unique, sequential integers starting from 1) and item (the name of the food).

Write a query to restore the correct mapping. If the total number of orders is odd, the final order ID should remain paired with its original item. The output should be sorted by order_id in ascending order.

Alphabetical Continent Pivot

Given a table Student containing name and continent columns (with potential duplicates), pivot the table so that student names are grouped under their respective continent headers: 'America', 'Asia', and 'Europe'. Within each column, names must be sorted alphabetically. If one continent has more students than another, the extra rows for the smaller continents should be filled with NULL. The result should show the first alphabetical student for each continent in the first row, the second in the second row, and so on.
Identifying Consistent Median Performers

Given a Student table containing names and IDs, and an Exam table containing scores for various exams, define a 'Quiet Student' as one who has participated in at least one exam but has never achieved the highest or lowest score in any exam they sat for. Write a SQL query to find the student_id and student_name of all quiet students, ensuring that students who never took an exam are excluded. Note: If an exam has multiple students with the same top score, all are considered 'loud' for that exam. Sort the output by student_id.

🏢 Organizational Hierarchy

Identifying Senior Management Tiers Given a table employees with columns emp_id, manager_id, and manager_name, define a 'Senior Manager' as an employee who manages at least one individual who is himself/herself a manager, but does not manage anyone who qualifies as a Senior Manager.

Write a SQL query to return the name of every Senior Manager and the total count of their direct reports. If an employee reports to multiple managers, they should be counted as a report for each. Sort the results by the report count in descending order.

Note: A manager is any employee who has at least one direct report.

🔧 Data Quality & Deduplication

Identifying Incomplete Assembly Parts Find assembly records where required component parts are missing. A manufacturing/logistics favorite.

How to Study SQL for Interviews

The three patterns that cover 80% of hard SQL questions:

1. Window Functions

SELECT 
  employee_id,
  salary,
  RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank
FROM employees;

Master RANK, DENSE_RANK, ROW_NUMBER, NTILE, LAG, LEAD, SUM OVER, AVG OVER.

2. CTEs for Readability and Recursion

WITH ranked AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS rn
  FROM sales
)
SELECT * FROM ranked WHERE rn = 1;

3. Self-Joins for Sequential Logic

SELECT a.user_id
FROM activity a
JOIN activity b 
  ON a.user_id = b.user_id 
  AND DATEDIFF(b.activity_date, a.activity_date) = 1;

CRM Marketing Touch Streak Analysis A marketing team wants to reward highly engaged potential customers. Given a marketing_touches table (event_id, contact_id, event_type, event_date) and a crm_contacts table (contact_id, email), find the email addresses of all contacts who meet two conditions:
They had at least one marketing touch per week for at least three consecutive weeks (weeks are defined as starting on Monday).
They have performed at least one touch of type 'trial_request' at any point in their history.

Ensure the solution handles contacts with multiple touches in a single week correctly and accounts for streaks that might cross year boundaries.

Best-Selling Products per Category with Tie-Breakers

You are given two tables: 'products' (containing product_id, product_name, and category_name) and 'product_sales' (containing product_id, sales_quantity, and rating). Write a PostgreSQL query to identify the top-performing product in every category. The primary metric for performance is 'sales_quantity'. In the event of a tie in sales, the product with the higher 'rating' should be prioritized. If multiple products remain tied after checking both metrics, include all of them. The final result should display the category name and product name, sorted alphabetically by the category name.
Server Fleet Utilization Analysis

A cloud provider tracks server uptime through a series of status logs. You are given a table server_utilization with columns server_id (int), status_time (timestamp), and session_status (string, either 'start' or 'stop'). Each server can start and stop multiple times throughout the period.

Calculate the total uptime for the entire fleet of servers. The result should be returned as the total number of full days (24-hour periods) of cumulative uptime across all servers, rounded down to the nearest integer.

Ensure your solution accounts for the chronological order of events per server and correctly pairs start events with their subsequent stop events.

Identify Retried Payment Transactions

Given a transactions table with columns transaction_id, merchant_id, credit_card_id, amount, and transaction_timestamp, write a query to identify the total number of 'accidental' repeat payments. A repeat payment is defined as a transaction that occurs at the same merchant, with the same credit card, for the exact same amount, within a 10-minute window of a previous transaction. Note: In a sequence of three identical transactions each within 10 minutes of the last, the second and third should be counted as repeats, but the first should not.
Daily Odd and Even Sensor Measurement Totals

Given a table measurements containing measurement_id, measurement_value, and measurement_time, calculate the daily sum of values based on their chronological order within each day. A measurement is 'odd-numbered' if it is the 1st, 3rd, or 5th recorded event of that day, and 'even-numbered' if it is the 2nd, 4th, or 6th. Your output should contain the date, the total sum for odd-numbered measurements, and the total sum for even-numbered measurements for each day represented in the dataset.
Top Grossing Products by Category

Given a product_spend table documenting customer transactions, write a query to retrieve the top two highest-grossing products within each category for the calendar year 2022. Gross spend is defined as the sum of the spend column for a given product. Your output should include the category name, the product name, and the total calculated spend, ordered by category and then by the highest spend.
Top-Selling Products by Category

Given a table product_spend representing Amazon transactions, write a query to identify the top 2 products in terms of total revenue for each category during the year 2022. Your output should display the category name, the product name, and the total spend associated with that product. In cases of identical spend within a category, ensure the ranking logic is consistent with standard business reporting (e.g., handling ties).
Manufacturing Pipeline Bottlenecks

You are analyzing production data for a high-volume manufacturing facility. The table parts_assembly tracks individual steps in the assembly process for various components.

A part is considered 'In Progress' if it has at least one entry in the table. A part is considered 'Unfinished' if there is at least one step in its sequence where the finish_date is missing (NULL).

Write a query to return a unique list of all parts that are currently in the assembly process but have not yet been completed.

Practice Tips

Always think about NULL handling first — COUNT(*) vs COUNT(col) matters
Use EXPLAIN to validate your solution isn't doing full table scans
For "consecutive" problems, the LAG/LEAD approach is almost always cleaner than self-joins
For "per group" questions, think PARTITION BY before GROUP BY
Time-window problems almost always need BETWEEN or DATEDIFF with careful boundary thinking

DEV Community

20 Tricky SQL Interview Questions That Separate Good Candidates from Great Ones (2026)

20 Tricky SQL Interview Questions That Separate Good Candidates from Great Ones (2026)

What SQL Interviewers Are Really Testing

The SQL Questions (With Solutions)

📊 Window Functions & Rankings

🔁 Sequential & Time-Based Problems

🏅 Customer Retention & Loyalty

🔄 Data Transformation & Pivoting

🏢 Organizational Hierarchy

🔧 Data Quality & Deduplication

How to Study SQL for Interviews

The three patterns that cover 80% of hard SQL questions:

Practice Tips

Top comments (0)