20 Tricky SQL Interview Questions That Separate Good Candidates from Great Ones (2026)
SQL interviews at top tech companies have evolved. Forget basic SELECT queries — modern data engineering and backend interviews test your ability to handle window functions, complex aggregations, recursive CTEs, and real-world analytical scenarios.
This guide covers the most challenging SQL interview questions you'll face, each linked to a full solution.
What SQL Interviewers Are Really Testing
Most candidates can write a GROUP BY query. What separates top performers is their ability to:
-
Use window functions (
ROW_NUMBER,RANK,LAG,LEAD,SUM OVER) fluently - Write self-joins for hierarchical or sequential data problems
- Handle NULLs carefully and intentionally
- Think about performance — indexes, query plans, and avoiding full scans
- Pivot and reshape data dynamically
- Apply set theory to solve complex membership and exclusion problems
The SQL Questions (With Solutions)
📊 Window Functions & Rankings
Nth Highest Distinct Salary
Given an 'Employee' table with 'id' (Primary Key) and 'salary' (Integer), write a SQL query to retrieve the Nth highest distinct salary. If there are fewer than N distinct salaries in the table, the query should return NULL. The solution should be flexible enough to handle any integer input N.Monthly Department vs Company Salary Benchmarking
You are given two tables: 'Salary' (containing 'id', 'employee_id', 'amount', and 'pay_date') and 'Employee' (containing 'employee_id' and 'department_id'). For every month present in the data, determine if the average salary within each department was higher, lower, or equal to the average salary of the entire company for that same month. Return a result table with the month (formatted as 'YYYY-MM'), the 'department_id', and a 'comparison' column indicating 'higher', 'lower', or 'same'.
🔁 Sequential & Time-Based Problems
Consecutive Product Usage Streak
Given a tablefiled_taxeswith columnsfiling_id,user_id,filing_date, andproduct, find all users who have filed their taxes using any 'TurboTax' product for at least three consecutive years. Note that 'TurboTax' products may have different version names (e.g., 'TurboTax Basic', 'TurboTax Deluxe'). Assume each user files at most once per calendar year. Return the results sorted byuser_id.Consecutive SaaS Subscription Retention
Given a tablefiled_taxeswith columnsfiling_id,user_id,filing_date, andproduct, identify users who have demonstrated loyalty by using any version of the 'TurboTax' product suite for at least three consecutive calendar years. Note that 'TurboTax' may appear in the product column under various names (e.g., 'TurboTax Deluxe 2023', 'TurboTax Free Edition'). A user may have multiple entries, but only one filing per year should be considered for the continuity count. Output the list of unique user IDs sorted numerically/alphabetically.Consecutive High Attendance Records
You are given a tableStadiumcontaining stadium visit data with columnsid(integer),visit_date(date), andpeople(integer). Theidis an auto-incrementing value that corresponds with the chronological order of thevisit_date. Write a SQL query to identify all sequences of three or more consecutive IDs where each visit in that sequence had a population of 100 or more people. The result should include the original row details and be ordered by the visit date.
🏅 Customer Retention & Loyalty
Identifying Multi-Year Customer Loyalty
Find customers who made purchases in every year of a multi-year window. Tests intersection and set-based thinking.Identifying Multi-Year Customer Retention
A variation focusing on customer cohort retention rates across annual periods.
🔄 Data Transformation & Pivoting
-
Finding Median Rows per Partition
Given a table
Employeewith columnsid(INT, PK),company(VARCHAR), andsalary(INT), write a SQL query to identify the specific rows that represent the median salary for each company.
Rules for median calculation:
- Sort salaries in ascending order for each company.
- In case of identical salaries, use
idas a secondary ascending sort key to break ties. - If a company has an odd number of employees, return the single middle row.
- If a company has an even number of employees, return the two middle rows.
The result should include the id, company, and salary columns for these specific median records.
-
Title Case Transformation with Hyphen Sensitivity
You are given a table
user_contentwith a columncontent_text. Your task is to generate a report that transforms the text into Title Case based on specific business rules.
- The first letter of every word must be capitalized, and all subsequent letters in that word must be lowercase.
- The transformation must recognize a hyphen (-) as a word boundary, meaning segments on both sides of a hyphen (e.g., 'high-speed') must be capitalized ('High-Speed').
- All original spacing, including multiple consecutive spaces, must be preserved exactly as they appear in the source data.
Return the original content_id, content_text, and the newly transformed_text column.
-
Correcting Swapped Sequences in Delivery Logs
A delivery platform experienced a technical glitch where every pair of consecutive items in their order logs was swapped (i.e., the item intended for ID 1 was recorded for ID 2, and vice versa). You are given a table
orderswith columnsorder_id(unique, sequential integers starting from 1) anditem(the name of the food).
Write a query to restore the correct mapping. If the total number of orders is odd, the final order ID should remain paired with its original item. The output should be sorted by order_id in ascending order.
Alphabetical Continent Pivot
Given a tableStudentcontainingnameandcontinentcolumns (with potential duplicates), pivot the table so that student names are grouped under their respective continent headers: 'America', 'Asia', and 'Europe'. Within each column, names must be sorted alphabetically. If one continent has more students than another, the extra rows for the smaller continents should be filled withNULL. The result should show the first alphabetical student for each continent in the first row, the second in the second row, and so on.Identifying Consistent Median Performers
Given aStudenttable containing names and IDs, and anExamtable containing scores for various exams, define a 'Quiet Student' as one who has participated in at least one exam but has never achieved the highest or lowest score in any exam they sat for. Write a SQL query to find thestudent_idandstudent_nameof all quiet students, ensuring that students who never took an exam are excluded. Note: If an exam has multiple students with the same top score, all are considered 'loud' for that exam. Sort the output bystudent_id.
🏢 Organizational Hierarchy
-
Identifying Senior Management Tiers
Given a table
employeeswith columnsemp_id,manager_id, andmanager_name, define a 'Senior Manager' as an employee who manages at least one individual who is himself/herself a manager, but does not manage anyone who qualifies as a Senior Manager.
Write a SQL query to return the name of every Senior Manager and the total count of their direct reports. If an employee reports to multiple managers, they should be counted as a report for each. Sort the results by the report count in descending order.
Note: A manager is any employee who has at least one direct report.
🔧 Data Quality & Deduplication
- Identifying Incomplete Assembly Parts Find assembly records where required component parts are missing. A manufacturing/logistics favorite.
How to Study SQL for Interviews
The three patterns that cover 80% of hard SQL questions:
1. Window Functions
SELECT
employee_id,
salary,
RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank
FROM employees;
Master RANK, DENSE_RANK, ROW_NUMBER, NTILE, LAG, LEAD, SUM OVER, AVG OVER.
2. CTEs for Readability and Recursion
WITH ranked AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS rn
FROM sales
)
SELECT * FROM ranked WHERE rn = 1;
3. Self-Joins for Sequential Logic
SELECT a.user_id
FROM activity a
JOIN activity b
ON a.user_id = b.user_id
AND DATEDIFF(b.activity_date, a.activity_date) = 1;
-
CRM Marketing Touch Streak Analysis
A marketing team wants to reward highly engaged potential customers. Given a
marketing_touchestable (event_id, contact_id, event_type, event_date) and acrm_contactstable (contact_id, email), find the email addresses of all contacts who meet two conditions: - They had at least one marketing touch per week for at least three consecutive weeks (weeks are defined as starting on Monday).
- They have performed at least one touch of type 'trial_request' at any point in their history.
Ensure the solution handles contacts with multiple touches in a single week correctly and accounts for streaks that might cross year boundaries.
Best-Selling Products per Category with Tie-Breakers
You are given two tables: 'products' (containing product_id, product_name, and category_name) and 'product_sales' (containing product_id, sales_quantity, and rating). Write a PostgreSQL query to identify the top-performing product in every category. The primary metric for performance is 'sales_quantity'. In the event of a tie in sales, the product with the higher 'rating' should be prioritized. If multiple products remain tied after checking both metrics, include all of them. The final result should display the category name and product name, sorted alphabetically by the category name.Server Fleet Utilization Analysis
A cloud provider tracks server uptime through a series of status logs. You are given a tableserver_utilizationwith columnsserver_id(int),status_time(timestamp), andsession_status(string, either 'start' or 'stop'). Each server can start and stop multiple times throughout the period.
Calculate the total uptime for the entire fleet of servers. The result should be returned as the total number of full days (24-hour periods) of cumulative uptime across all servers, rounded down to the nearest integer.
Ensure your solution accounts for the chronological order of events per server and correctly pairs start events with their subsequent stop events.
Identify Retried Payment Transactions
Given atransactionstable with columnstransaction_id,merchant_id,credit_card_id,amount, andtransaction_timestamp, write a query to identify the total number of 'accidental' repeat payments. A repeat payment is defined as a transaction that occurs at the same merchant, with the same credit card, for the exact same amount, within a 10-minute window of a previous transaction. Note: In a sequence of three identical transactions each within 10 minutes of the last, the second and third should be counted as repeats, but the first should not.Daily Odd and Even Sensor Measurement Totals
Given a tablemeasurementscontainingmeasurement_id,measurement_value, andmeasurement_time, calculate the daily sum of values based on their chronological order within each day. A measurement is 'odd-numbered' if it is the 1st, 3rd, or 5th recorded event of that day, and 'even-numbered' if it is the 2nd, 4th, or 6th. Your output should contain the date, the total sum for odd-numbered measurements, and the total sum for even-numbered measurements for each day represented in the dataset.Top Grossing Products by Category
Given aproduct_spendtable documenting customer transactions, write a query to retrieve the top two highest-grossing products within each category for the calendar year 2022. Gross spend is defined as the sum of thespendcolumn for a given product. Your output should include the category name, the product name, and the total calculated spend, ordered by category and then by the highest spend.Top-Selling Products by Category
Given a tableproduct_spendrepresenting Amazon transactions, write a query to identify the top 2 products in terms of total revenue for each category during the year 2022. Your output should display the category name, the product name, and the total spend associated with that product. In cases of identical spend within a category, ensure the ranking logic is consistent with standard business reporting (e.g., handling ties).Manufacturing Pipeline Bottlenecks
You are analyzing production data for a high-volume manufacturing facility. The tableparts_assemblytracks individual steps in the assembly process for various components.
A part is considered 'In Progress' if it has at least one entry in the table. A part is considered 'Unfinished' if there is at least one step in its sequence where the finish_date is missing (NULL).
Write a query to return a unique list of all parts that are currently in the assembly process but have not yet been completed.
Practice Tips
- Always think about NULL handling first —
COUNT(*)vsCOUNT(col)matters - Use
EXPLAINto validate your solution isn't doing full table scans - For "consecutive" problems, the LAG/LEAD approach is almost always cleaner than self-joins
- For "per group" questions, think
PARTITION BYbeforeGROUP BY - Time-window problems almost always need
BETWEENorDATEDIFFwith careful boundary thinking
Top comments (0)