The Problem
Given a table, Functions, containing two columns: X and Y.
Two pairs (X1, Y1) and (X2, Y2) are said to be symmetric pairs if X1 = Y2 and X2 = Y1. The task is to write a SQL query that outputs all such symmetric pairs in ascending order by the value of X, while ensuring X1 ≤ Y1.
Explanation
Consider the following input:
The expected output would be:
20 20
20 21
22 23
Here, (20, 20), (20, 21) with its symmetric pair (21, 20), and (22, 23) with its symmetric pair (23, 22) are the symmetric pairs in the table, sorted in ascending order by X.
The Solution
Here are two SQL solutions for the problem, each highlighting a different approach and their respective strengths, weaknesses, and structures.
Direct Join with ROW_NUMBER()
The first approach uses a common table expression (CTE) and a direct join:
WITH CTE AS (
SELECT
X,
Y,
ROW_NUMBER() OVER(ORDER BY X) [rn]
FROM Functions
)
SELECT DISTINCT
f1.X,
f1.Y
FROM CTE f1 JOIN CTE f2 ON f1.X = f2.Y AND f1.Y = f2.X
WHERE f1.X <= f1.Y
AND f1.rn != f2.rn
ORDER BY f1.X
In this method, a CTE is first created, and each row in the table is assigned a unique number using the ROW_NUMBER()
function. Then, a self join is performed on the CTE to match the rows where X and Y are symmetric. The f1.rn != f2.rn
condition ensures that the same row is not matched with itself, which would lead to false positive results.
EXISTS with ROW_NUMBER()
The second approach also uses a CTE and the ROW_NUMBER()
function, but instead of a direct join, it uses the EXISTS
operator:
WITH CTE AS (
SELECT
X,
Y,
ROW_NUMBER() OVER(ORDER BY X, Y) [rn]
FROM Functions
)
SELECT DISTINCT
f1.X,
f1.Y
FROM CTE f1
WHERE EXISTS (
SELECT 1
FROM CTE f2
WHERE f1.X = f2.Y AND f1.Y = f2.X AND f1.rn <> f2.rn
)
AND f1.X <= f1.Y
ORDER BY f1.X
Here, a subquery is used in the WHERE
clause to check for the existence of symmetric pairs in the CTE. The EXISTS
operator returns true if the subquery returns at least one row, which in this case, would indicate the existence of a symmetric pair.
Conclusion
Both methods effectively solve the problem, but their performances can vary depending on the size and distribution of the data in the table. Generally, the method using EXISTS
can be faster if the subquery returns few rows, as it stops executing as soon as it finds a match. However, if the subquery often returns many rows, the direct join approach can be more efficient.
The performance may also vary depending on the actual RDBMS used, and the system resources available, including memory and CPU.
You can find the original problem at HackerRank.
For more insightful solutions and tech-related content, feel free to connect with me on my Beacons page.
Top comments (1)
I'm going through MySQL exercises now, on HackerRank, and just came across this very problem. I got close, getting all the symmetric pairs except for (13,13), and none of my code modifications retrieved that pair.
So, thank you for posting this, because now I can look it over, and see where my code wasn't sufficient.
Note: For anyone else reading this, the code below is not a solution for this problem. I merely included it to emphasize that I got sooooo close, but couldn't get the final pair, (13,13)