DEV Community

interviewgpt
interviewgpt

Posted on

20 Tricky SQL Interview Questions That Separate Good Candidates from Great Ones (2026)

20 Tricky SQL Interview Questions That Separate Good Candidates from Great Ones (2026)

SQL interviews at top tech companies have evolved. Forget basic SELECT queries — modern data engineering and backend interviews test your ability to handle window functions, complex aggregations, recursive CTEs, and real-world analytical scenarios.

This guide covers the most challenging SQL interview questions you'll face, each linked to a full solution.


What SQL Interviewers Are Really Testing

Most candidates can write a GROUP BY query. What separates top performers is their ability to:

  • Use window functions (ROW_NUMBER, RANK, LAG, LEAD, SUM OVER) fluently
  • Write self-joins for hierarchical or sequential data problems
  • Handle NULLs carefully and intentionally
  • Think about performance — indexes, query plans, and avoiding full scans
  • Pivot and reshape data dynamically
  • Apply set theory to solve complex membership and exclusion problems

The SQL Questions (With Solutions)

📊 Window Functions & Rankings

  • Nth Highest Distinct Salary

    Given an 'Employee' table with 'id' (Primary Key) and 'salary' (Integer), write a SQL query to retrieve the Nth highest distinct salary. If there are fewer than N distinct salaries in the table, the query should return NULL. The solution should be flexible enough to handle any integer input N.

  • Monthly Department vs Company Salary Benchmarking

    You are given two tables: 'Salary' (containing 'id', 'employee_id', 'amount', and 'pay_date') and 'Employee' (containing 'employee_id' and 'department_id'). For every month present in the data, determine if the average salary within each department was higher, lower, or equal to the average salary of the entire company for that same month. Return a result table with the month (formatted as 'YYYY-MM'), the 'department_id', and a 'comparison' column indicating 'higher', 'lower', or 'same'.


🔁 Sequential & Time-Based Problems

  • Consecutive Product Usage Streak

    Given a table filed_taxes with columns filing_id, user_id, filing_date, and product, find all users who have filed their taxes using any 'TurboTax' product for at least three consecutive years. Note that 'TurboTax' products may have different version names (e.g., 'TurboTax Basic', 'TurboTax Deluxe'). Assume each user files at most once per calendar year. Return the results sorted by user_id.

  • Consecutive SaaS Subscription Retention

    Given a table filed_taxes with columns filing_id, user_id, filing_date, and product, identify users who have demonstrated loyalty by using any version of the 'TurboTax' product suite for at least three consecutive calendar years. Note that 'TurboTax' may appear in the product column under various names (e.g., 'TurboTax Deluxe 2023', 'TurboTax Free Edition'). A user may have multiple entries, but only one filing per year should be considered for the continuity count. Output the list of unique user IDs sorted numerically/alphabetically.

  • Consecutive High Attendance Records

    You are given a table Stadium containing stadium visit data with columns id (integer), visit_date (date), and people (integer). The id is an auto-incrementing value that corresponds with the chronological order of the visit_date. Write a SQL query to identify all sequences of three or more consecutive IDs where each visit in that sequence had a population of 100 or more people. The result should include the original row details and be ordered by the visit date.


🏅 Customer Retention & Loyalty


🔄 Data Transformation & Pivoting

  • Finding Median Rows per Partition Given a table Employee with columns id (INT, PK), company (VARCHAR), and salary (INT), write a SQL query to identify the specific rows that represent the median salary for each company.

Rules for median calculation:

  1. Sort salaries in ascending order for each company.
  2. In case of identical salaries, use id as a secondary ascending sort key to break ties.
  3. If a company has an odd number of employees, return the single middle row.
  4. If a company has an even number of employees, return the two middle rows.

The result should include the id, company, and salary columns for these specific median records.

  1. The first letter of every word must be capitalized, and all subsequent letters in that word must be lowercase.
  2. The transformation must recognize a hyphen (-) as a word boundary, meaning segments on both sides of a hyphen (e.g., 'high-speed') must be capitalized ('High-Speed').
  3. All original spacing, including multiple consecutive spaces, must be preserved exactly as they appear in the source data.

Return the original content_id, content_text, and the newly transformed_text column.

  • Correcting Swapped Sequences in Delivery Logs A delivery platform experienced a technical glitch where every pair of consecutive items in their order logs was swapped (i.e., the item intended for ID 1 was recorded for ID 2, and vice versa). You are given a table orders with columns order_id (unique, sequential integers starting from 1) and item (the name of the food).

Write a query to restore the correct mapping. If the total number of orders is odd, the final order ID should remain paired with its original item. The output should be sorted by order_id in ascending order.

  • Alphabetical Continent Pivot

    Given a table Student containing name and continent columns (with potential duplicates), pivot the table so that student names are grouped under their respective continent headers: 'America', 'Asia', and 'Europe'. Within each column, names must be sorted alphabetically. If one continent has more students than another, the extra rows for the smaller continents should be filled with NULL. The result should show the first alphabetical student for each continent in the first row, the second in the second row, and so on.

  • Identifying Consistent Median Performers

    Given a Student table containing names and IDs, and an Exam table containing scores for various exams, define a 'Quiet Student' as one who has participated in at least one exam but has never achieved the highest or lowest score in any exam they sat for. Write a SQL query to find the student_id and student_name of all quiet students, ensuring that students who never took an exam are excluded. Note: If an exam has multiple students with the same top score, all are considered 'loud' for that exam. Sort the output by student_id.


🏢 Organizational Hierarchy

  • Identifying Senior Management Tiers Given a table employees with columns emp_id, manager_id, and manager_name, define a 'Senior Manager' as an employee who manages at least one individual who is himself/herself a manager, but does not manage anyone who qualifies as a Senior Manager.

Write a SQL query to return the name of every Senior Manager and the total count of their direct reports. If an employee reports to multiple managers, they should be counted as a report for each. Sort the results by the report count in descending order.

Note: A manager is any employee who has at least one direct report.


🔧 Data Quality & Deduplication


How to Study SQL for Interviews

The three patterns that cover 80% of hard SQL questions:

1. Window Functions

SELECT 
  employee_id,
  salary,
  RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS salary_rank
FROM employees;
Enter fullscreen mode Exit fullscreen mode

Master RANK, DENSE_RANK, ROW_NUMBER, NTILE, LAG, LEAD, SUM OVER, AVG OVER.

2. CTEs for Readability and Recursion

WITH ranked AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY category ORDER BY revenue DESC) AS rn
  FROM sales
)
SELECT * FROM ranked WHERE rn = 1;
Enter fullscreen mode Exit fullscreen mode

3. Self-Joins for Sequential Logic

SELECT a.user_id
FROM activity a
JOIN activity b 
  ON a.user_id = b.user_id 
  AND DATEDIFF(b.activity_date, a.activity_date) = 1;
Enter fullscreen mode Exit fullscreen mode
  • CRM Marketing Touch Streak Analysis A marketing team wants to reward highly engaged potential customers. Given a marketing_touches table (event_id, contact_id, event_type, event_date) and a crm_contacts table (contact_id, email), find the email addresses of all contacts who meet two conditions:
  • They had at least one marketing touch per week for at least three consecutive weeks (weeks are defined as starting on Monday).
  • They have performed at least one touch of type 'trial_request' at any point in their history.

Ensure the solution handles contacts with multiple touches in a single week correctly and accounts for streaks that might cross year boundaries.

  • Best-Selling Products per Category with Tie-Breakers

    You are given two tables: 'products' (containing product_id, product_name, and category_name) and 'product_sales' (containing product_id, sales_quantity, and rating). Write a PostgreSQL query to identify the top-performing product in every category. The primary metric for performance is 'sales_quantity'. In the event of a tie in sales, the product with the higher 'rating' should be prioritized. If multiple products remain tied after checking both metrics, include all of them. The final result should display the category name and product name, sorted alphabetically by the category name.

  • Server Fleet Utilization Analysis

    A cloud provider tracks server uptime through a series of status logs. You are given a table server_utilization with columns server_id (int), status_time (timestamp), and session_status (string, either 'start' or 'stop'). Each server can start and stop multiple times throughout the period.

Calculate the total uptime for the entire fleet of servers. The result should be returned as the total number of full days (24-hour periods) of cumulative uptime across all servers, rounded down to the nearest integer.

Ensure your solution accounts for the chronological order of events per server and correctly pairs start events with their subsequent stop events.

  • Identify Retried Payment Transactions

    Given a transactions table with columns transaction_id, merchant_id, credit_card_id, amount, and transaction_timestamp, write a query to identify the total number of 'accidental' repeat payments. A repeat payment is defined as a transaction that occurs at the same merchant, with the same credit card, for the exact same amount, within a 10-minute window of a previous transaction. Note: In a sequence of three identical transactions each within 10 minutes of the last, the second and third should be counted as repeats, but the first should not.

  • Daily Odd and Even Sensor Measurement Totals

    Given a table measurements containing measurement_id, measurement_value, and measurement_time, calculate the daily sum of values based on their chronological order within each day. A measurement is 'odd-numbered' if it is the 1st, 3rd, or 5th recorded event of that day, and 'even-numbered' if it is the 2nd, 4th, or 6th. Your output should contain the date, the total sum for odd-numbered measurements, and the total sum for even-numbered measurements for each day represented in the dataset.

  • Top Grossing Products by Category

    Given a product_spend table documenting customer transactions, write a query to retrieve the top two highest-grossing products within each category for the calendar year 2022. Gross spend is defined as the sum of the spend column for a given product. Your output should include the category name, the product name, and the total calculated spend, ordered by category and then by the highest spend.

  • Top-Selling Products by Category

    Given a table product_spend representing Amazon transactions, write a query to identify the top 2 products in terms of total revenue for each category during the year 2022. Your output should display the category name, the product name, and the total spend associated with that product. In cases of identical spend within a category, ensure the ranking logic is consistent with standard business reporting (e.g., handling ties).

  • Manufacturing Pipeline Bottlenecks

    You are analyzing production data for a high-volume manufacturing facility. The table parts_assembly tracks individual steps in the assembly process for various components.

A part is considered 'In Progress' if it has at least one entry in the table. A part is considered 'Unfinished' if there is at least one step in its sequence where the finish_date is missing (NULL).

Write a query to return a unique list of all parts that are currently in the assembly process but have not yet been completed.


Practice Tips

  • Always think about NULL handling first — COUNT(*) vs COUNT(col) matters
  • Use EXPLAIN to validate your solution isn't doing full table scans
  • For "consecutive" problems, the LAG/LEAD approach is almost always cleaner than self-joins
  • For "per group" questions, think PARTITION BY before GROUP BY
  • Time-window problems almost always need BETWEEN or DATEDIFF with careful boundary thinking

Top comments (0)