If you've ever practiced SQL interview questions, you have almost certainly crossed paths with this classic prompt: “Find the second highest salary in the company.”
When the problem lands on the table, your first instinct is usually to smile, open your editor, and type this in four seconds:
SELECT salary
FROM employees
ORDER BY salary DESC
LIMIT 1 OFFSET 1;
And just like that, you are probably out of the interview. Not because the syntax is broken—that runs perfectly in PostgreSQL—but because you just committed the cardinal sin of data engineering: you dove right into the code before understanding the business.
Today, I want to break down this problem (inspired by the Second Highest Salary challenge on DataLemur). The goal isn't to teach you basic SQL syntax, but to show you how an analyst actually thinks when handed an ambiguous requirement.
Phase 1: Ask the Awkward Questions
Picture this: you aren't sitting on a coding platform; the CFO just walked into your office on a Tuesday at 9:00 AM asking for this number. If you run the OFFSET 1 query, you will immediately step on three landmines:
- The Tie Trap: Suppose the CEO makes $12,000, and both the CTO and the Principal Architect make $10,000. If you sort descending, Row 1 is $12k, and Row 2 is $10k. Perfect. But what if the CTO also makes $12,000? Your query returns $12,000 as the "second salary." That is wrong. That is the first salary tier, repeated. The CFO wants the second monetary tier, not the second individual on the payroll.
- The Phantom Hierarchy: The prompt says "employees". Do Managers count as employees in this database, or do they sit on a separate pay scale? Are Directors in this table? If you don’t ask, you risk handing the executive a Vice President's salary when they actually wanted data on the general staff. (For our exercise, we will assume everyone coexists in one table).
-
The Invisible Edge Case: What if the company is a newly founded startup with only one employee? Your
OFFSET 1query will return an empty table (0 rows). In software engineering, a database returning "nothing" causes the backend consuming your API to throw a lovelyNullPointerException. The SQL standard dictates that if a requested metric does not exist, the database should returnNULL.
Once we reconcile the prompt with real-world logic, the requirement translates to:
“Give me the second-highest unique value from the salary column; if that tier does not exist, return
NULL.”
Phase 2: Ditch the Subquery, Embrace Window Functions
The classic textbook solution to fix the tie issue is a nested subquery:
SELECT MAX(salary) AS second_highest_salary
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
It works, but it isn't scalable engineering.
If your boss comes back tomorrow and says, "Great job... now give me the 5th highest salary," this approach devolves into an unreadable Russian nesting doll of subqueries.
This is where we call upon Window Functions. However, SQL gives us three different ways to rank data, and we have to pick the right tool for the job. Given the salary set [12000, 12000, 9000]:
-
ROW_NUMBER()assigns:1, 2, 3(Discards the 9000 as the 3rd rank. Wrong). -
RANK()assigns:1, 1, 3(Skips rank 2 entirely because of the tie. Wrong). -
DENSE_RANK()assigns:1, 1, 2(Ties share rank 1, and the next unique value gets rank 2. Bingo).
Phase 3: Build the Solution in Blocks (The Power of CTEs)
Instead of writing a monolithic 10-line query where everything happens at once, write it using CTEs (Common Table Expressions). CTEs don’t make your query run faster; they do something much more important: they make your logic auditable.
Let’s build our first logical block to map the entire company:
WITH ranked_salaries AS (
SELECT
salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees
)
If we run just this CTE in our heads, we get a clean virtual table:
| salary | salary_rank |
|---|---|
| 12000 | 1 |
| 12000 | 1 |
| 9500 | 2 |
| 8000 | 3 |
Now, we write the main query to consume this block:
WITH ranked_salaries AS (
SELECT
salary,
DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank
FROM employees
)
SELECT MAX(salary) AS second_highest_salary
FROM ranked_salaries
WHERE salary_rank = 2;
The Technical "Magic Trick"
Look closely at that final SELECT. I wrapped the column in MAX(salary), even though every row caught by WHERE salary_rank = 2 holds the exact same number. Why?
To defeat Landmine #3 (the single-employee company). If the database only holds one person, WHERE salary_rank = 2 finds zero rows.
- If I had written
SELECT salary..., SQL outputs an empty table. - By writing
SELECT MAX(salary)..., we force the engine to calculate the maximum of an empty set. By mathematical definition, the maximum of nothing in SQL is `NULL`.
We satisfied all three business requirements with a single line of defensive code.
When solving challenges on platforms like DataLemur, the compiler simply spits out a green "Accepted". But in the real world, the engine validating your code is an exhausted user trying to break your dashboard at 3:00 AM.
Writing Senior-level SQL isn't about memorizing obscure syntax; it’s about operating under the assumption that your data will always be messier than the ticket promises.
Top comments (0)