Placements | HackerRank | MSSQL

#mssql #hackerrank #database #tutorial

The Problem

In this SQL problem, we're given three tables: Students, Friends, and Packages.

The Students table contains two columns:

ID (the student's ID), and
Name (the student's name).

The Friends table contains two columns:

ID (the student's ID), and
Friend_ID (ID of the ONLY best friend of the student).

The Packages table contains two columns:

ID (the student's ID), and
Salary (offered salary to the student, in $ thousands per month).

Here's a visual representation of these three tables:

The task is to write an SQL query that outputs the names of those students whose best friends got offered a higher salary than them. The names should be ordered by the salary amount offered to the best friends. It's also guaranteed that no two students received the same salary offer.

Explanation

Consider the following input:

The expected output should be:



Samantha
Julia
Scarlet

Here's why:

Samantha, Julia, and Scarlet's best friends all got offered a higher salary than they did, while Ashley's best friend did not.

The Solution

Here, I present two SQL solutions with slightly different approaches. Each has its own strengths, weaknesses, and applicability.

Direct Join Approach

The first source code uses direct JOIN operations to link the necessary data from all tables:



SELECT s.name
FROM Students s JOIN Friends f ON s.ID = f.ID
JOIN Packages p1 ON p1.ID = s.ID
JOIN Packages p2 ON p2.ID = f.Friend_ID
WHERE p1.Salary < p2.Salary
ORDER BY p2.Salary

In this approach, we are directly joining the Students, Friends, and Packages tables based on the respective IDs. After joining the tables, we filter out the students whose salary (from p1) is less than their friend's salary (from p2), and finally order the result by their friend's salary. This is a straightforward and easy-to-understand method.

Subquery Join Approach

The second solution uses subqueries to pre-filter the tables before joining:



SELECT s1.Name
FROM 
    (
        SELECT s.ID, s.Name, p.Salary
        FROM Students s
        JOIN Packages p ON s.ID = p.ID
    ) s1
JOIN 
    (
        SELECT f.ID, p.Salary [Friend_Salary]
        FROM Friends f
        JOIN Packages p ON f.Friend_ID = p.ID
    ) s2 ON s1.ID = s2.ID
WHERE s1.Salary < s2.Friend_Salary
ORDER BY s2.Friend_Salary

Here, instead of joining all the tables at once, we are creating two subqueries s1 and s2 to pre-join the Students and Packages tables, and Friends and Packages tables, respectively. Then, we join these two subquery results on the ID, filter the rows where the student's salary is less than their friend's salary, and order the result by the friend's salary. This approach can be a bit more efficient if there's a significant size difference between the tables, as it reduces the size of the data before the join operation.

Conclusion

Both methods are effective in solving this problem, with the difference mainly lying in their performance and scalability with varying table sizes. Depending on the actual data distribution and size, the subquery join approach might offer better performance due to reduced data size during joins.

You can find the original problem at HackerRank.

For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.

👉 all the links on my beacons.ai page 👈