DEV Community

Pranav Bakare


Self Join in SQL | Best Explanation with Examples

What is a Self-Join in SQL?

A self-join in SQL is a type of join where a table is joined with itself. It is useful when you want to compare rows within the same table or retrieve related data from the same dataset. Self-joins are often used to model hierarchical relationships (like employee-manager structures) or to find combinations within a set (like possible match-ups between teams).


Definition:

A self-join is a regular join in which both sides of the join are the same table, distinguished by different aliases. It compares each row of the table with other rows of that same table.

Syntax:

SELECT a.column1, b.column2
FROM table_name a
JOIN table_name b ON a.common_column = b.common_column;

Explanation:

  • table_name a: Creates an alias (a) for the table.
  • table_name b: Creates another alias (b) for the same table.
  • ON a.common_column = b.common_column: The condition to join the two aliases based on common columns.
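
The syntax above can be tried without an Oracle instance. The following is a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for Oracle; the products table and its columns are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table for illustration; SQLite stands in for Oracle here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT)")
conn.executemany(
    "INSERT INTO products (id, category) VALUES (?, ?)",
    [(1, "fruit"), (2, "fruit"), (3, "dairy")],
)

# The same table appears twice; the aliases a and b make the database
# treat it as two independent row sources.
pairs = conn.execute(
    """
    SELECT a.id, b.id
    FROM products a
    JOIN products b ON a.category = b.category
    ORDER BY a.id, b.id
    """
).fetchall()

print(pairs)
```

Because the join compares the same column on both sides, every row also matches itself, so the result contains pairs such as (1, 1) and (3, 3). This is why practical self-joins usually compare different columns or add a condition such as < or != on the key.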

1. Self-Join Example: Employee and Manager Scenario

Scenario:

You have an Employees table, and you need to find out which employee reports to which manager. Each row describes one employee, and its ManagerID column holds the EmployeeID of that employee's manager.

Sample Table Creation and Data Insertion:


-- Create the Employees table
CREATE TABLE Employees (
    EmployeeID NUMBER PRIMARY KEY,
    EmployeeName VARCHAR2(50),
    ManagerID NUMBER
);


-- Insert sample data
INSERT INTO Employees (EmployeeID, EmployeeName, ManagerID) 
VALUES (1, 'John', NULL);
INSERT INTO Employees (EmployeeID, EmployeeName, ManagerID) 
VALUES (2, 'Mike', 1);
INSERT INTO Employees (EmployeeID, EmployeeName, ManagerID) 
VALUES (3, 'Sarah', 1);
INSERT INTO Employees (EmployeeID, EmployeeName, ManagerID) 
VALUES (4, 'Kate', 2);
INSERT INTO Employees (EmployeeID, EmployeeName, ManagerID) 
VALUES (5, 'Tom', 2);


-- Commit the changes
COMMIT;


Self-Join Query in Oracle:

SELECT e1.EmployeeName AS Employee, 
       e2.EmployeeName AS Manager
FROM Employees e1
LEFT JOIN Employees e2 ON e1.ManagerID = e2.EmployeeID;


Explanation:

  • e1 is an alias representing employees.
  • e2 is another alias representing managers.

The LEFT JOIN keeps every employee in the result, including those who don’t have a manager (ManagerID is NULL). For those employees, the Manager column is NULL.

Output:

Employee  Manager
John      NULL
Mike      John
Sarah     John
Kate      Mike
Tom       Mike
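
To see what the LEFT JOIN contributes, the query can be run once with LEFT JOIN and once with INNER JOIN. This is a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for Oracle, with NUMBER and VARCHAR2 swapped for INTEGER and TEXT and an ORDER BY added so the row order is predictable.

```python
import sqlite3

# SQLite stands in for Oracle; the data matches the Employees table above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "EmployeeID INTEGER PRIMARY KEY, EmployeeName TEXT, ManagerID INTEGER)"
)
conn.executemany(
    "INSERT INTO Employees (EmployeeID, EmployeeName, ManagerID) VALUES (?, ?, ?)",
    [(1, "John", None), (2, "Mike", 1), (3, "Sarah", 1), (4, "Kate", 2), (5, "Tom", 2)],
)

# Only the join type changes between the two runs.
query = """
    SELECT e1.EmployeeName, e2.EmployeeName
    FROM Employees e1
    {join_type} JOIN Employees e2 ON e1.ManagerID = e2.EmployeeID
    ORDER BY e1.EmployeeID
"""
left_rows = conn.execute(query.format(join_type="LEFT")).fetchall()
inner_rows = conn.execute(query.format(join_type="INNER")).fetchall()

print(left_rows)   # includes John with no manager
print(inner_rows)  # John is missing: NULL never equals any EmployeeID
```

With the INNER JOIN, John disappears from the result because his ManagerID is NULL and matches no row in e2.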

2. Self-Join Example: IPL Matches (Every Team Plays Against Every Other Team Once)

Scenario:

You have a list of IPL teams, and you want to generate a list of matches where each team plays against every other team once.

Sample Table Creation and Data Insertion:

-- Create the Teams table
CREATE TABLE Teams (
    TeamID NUMBER PRIMARY KEY,
    TeamName VARCHAR2(100)
);
-- Insert sample data
INSERT INTO Teams (TeamID, TeamName) 
VALUES (1, 'Mumbai Indians');
INSERT INTO Teams (TeamID, TeamName) 
VALUES (2, 'Chennai Super Kings');
INSERT INTO Teams (TeamID, TeamName) 
VALUES (3, 'Royal Challengers Bangalore');
INSERT INTO Teams (TeamID, TeamName) 
VALUES (4, 'Kolkata Knight Riders');

-- Commit the changes
COMMIT;

Self-Join Query in Oracle:

SELECT t1.TeamName AS Team1, 
       t2.TeamName AS Team2
FROM Teams t1
JOIN Teams t2 ON t1.TeamID < t2.TeamID;

Explanation:

  • t1 and t2 are aliases for the Teams table.

The condition t1.TeamID < t2.TeamID ensures each match pairing is listed only once (avoiding duplicates like Team A vs. Team B and Team B vs. Team A).

Output:

Team1                        Team2
Mumbai Indians               Chennai Super Kings
Mumbai Indians               Royal Challengers Bangalore
Mumbai Indians               Kolkata Knight Riders
Chennai Super Kings          Royal Challengers Bangalore
Chennai Super Kings          Kolkata Knight Riders
Royal Challengers Bangalore  Kolkata Knight Riders
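
With n teams, the < condition should produce n(n-1)/2 pairings, which is 6 for the four teams here. The following is a minimal sketch that cross-checks the query against Python's itertools.combinations, assuming SQLite (through the built-in sqlite3 module) as a stand-in for Oracle and an added ORDER BY for a predictable row order.

```python
import sqlite3
from itertools import combinations

# Same sample data as the Teams table above; SQLite stands in for Oracle.
teams = [
    (1, "Mumbai Indians"),
    (2, "Chennai Super Kings"),
    (3, "Royal Challengers Bangalore"),
    (4, "Kolkata Knight Riders"),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Teams (TeamID INTEGER PRIMARY KEY, TeamName TEXT)")
conn.executemany("INSERT INTO Teams (TeamID, TeamName) VALUES (?, ?)", teams)

# Self-join with '<' on the key: each unordered pair appears exactly once.
sql_pairs = conn.execute(
    """
    SELECT t1.TeamName, t2.TeamName
    FROM Teams t1
    JOIN Teams t2 ON t1.TeamID < t2.TeamID
    ORDER BY t1.TeamID, t2.TeamID
    """
).fetchall()

# The same pairs computed in plain Python for comparison.
names = [name for _, name in teams]
expected_pairs = list(combinations(names, 2))

print(len(sql_pairs), sql_pairs == expected_pairs)
```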

3. Self-Join Example: IPL Matches (Every Team Plays Against Every Other Team Twice)

Scenario:

You want to generate a list where each IPL team plays against every other team twice (once as the home team, and once as the away team).

Self-Join Query in Oracle:

SELECT t1.TeamName AS Team1, 
       t2.TeamName AS Team2
FROM Teams t1
JOIN Teams t2 ON t1.TeamID != t2.TeamID;

Explanation:

  • t1 and t2 are aliases for the Teams table.

The condition t1.TeamID != t2.TeamID ensures that all possible match-ups are listed, including both Team A vs. Team B and Team B vs. Team A.

Output:

Team1                        Team2
Mumbai Indians               Chennai Super Kings
Mumbai Indians               Royal Challengers Bangalore
Mumbai Indians               Kolkata Knight Riders
Chennai Super Kings          Mumbai Indians
Chennai Super Kings          Royal Challengers Bangalore
Chennai Super Kings          Kolkata Knight Riders
Royal Challengers Bangalore  Mumbai Indians
Royal Challengers Bangalore  Chennai Super Kings
Royal Challengers Bangalore  Kolkata Knight Riders
Kolkata Knight Riders        Mumbai Indians
Kolkata Knight Riders        Chennai Super Kings
Kolkata Knight Riders        Royal Challengers Bangalore
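
If Team1 is read as the home side, the != condition should give every team n-1 home games and n(n-1) fixtures in total, which is 12 for four teams. The following is a minimal sketch of that check, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for Oracle; the HomeTeam and HomeGames column names are illustrative and not part of the original query.

```python
import sqlite3

# Same sample data as the Teams table above; SQLite stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Teams (TeamID INTEGER PRIMARY KEY, TeamName TEXT)")
conn.executemany(
    "INSERT INTO Teams (TeamID, TeamName) VALUES (?, ?)",
    [
        (1, "Mumbai Indians"),
        (2, "Chennai Super Kings"),
        (3, "Royal Challengers Bangalore"),
        (4, "Kolkata Knight Riders"),
    ],
)

# Count how many rows each team gets on the t1 (home) side of the join.
home_counts = conn.execute(
    """
    SELECT t1.TeamName AS HomeTeam, COUNT(*) AS HomeGames
    FROM Teams t1
    JOIN Teams t2 ON t1.TeamID != t2.TeamID
    GROUP BY t1.TeamID, t1.TeamName
    ORDER BY t1.TeamID
    """
).fetchall()

total_fixtures = sum(games for _, games in home_counts)
print(home_counts, total_fixtures)
```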

4. Self-Join Example: Finding Duplicate Customer Records

Scenario:
You have a Customers table where each customer should have a unique combination of FirstName, LastName, and DateOfBirth. However, there may be accidental duplicates, and you want to identify them using a self-join.

Sample Table Creation and Data Insertion:

-- Create the Customers table
CREATE TABLE Customers (
    CustomerID NUMBER PRIMARY KEY,
    FirstName VARCHAR2(50),
    LastName VARCHAR2(50),
    DateOfBirth DATE
);
-- Insert sample data (including duplicates)
INSERT INTO Customers (CustomerID, FirstName, LastName, DateOfBirth) VALUES (1, 'John', 'Doe', TO_DATE('1990-01-01', 'YYYY-MM-DD'));
INSERT INTO Customers (CustomerID, FirstName, LastName, DateOfBirth) VALUES (2, 'Jane', 'Smith', TO_DATE('1992-02-02', 'YYYY-MM-DD'));
INSERT INTO Customers (CustomerID, FirstName, LastName, DateOfBirth) VALUES (3, 'John', 'Doe', TO_DATE('1990-01-01', 'YYYY-MM-DD'));
INSERT INTO Customers (CustomerID, FirstName, LastName, DateOfBirth) VALUES (4, 'Alice', 'Johnson', TO_DATE('1995-03-03', 'YYYY-MM-DD'));
INSERT INTO Customers (CustomerID, FirstName, LastName, DateOfBirth) VALUES (5, 'John', 'Doe', TO_DATE('1990-01-01', 'YYYY-MM-DD'));

-- Commit the changes
COMMIT;

Self-Join Query to Find Duplicates:

SELECT c1.CustomerID AS DuplicateRecordID1, 
       c2.CustomerID AS DuplicateRecordID2, 
       c1.FirstName, 
       c1.LastName, 
       c1.DateOfBirth
FROM Customers c1
JOIN Customers c2 ON c1.FirstName = c2.FirstName
                 AND c1.LastName = c2.LastName
                 AND c1.DateOfBirth = c2.DateOfBirth
                 AND c1.CustomerID < c2.CustomerID;

Explanation:

  • c1 and c2 are aliases for the same Customers table.
  • The condition c1.FirstName = c2.FirstName AND c1.LastName = c2.LastName AND c1.DateOfBirth = c2.DateOfBirth checks for matching values across multiple columns, indicating a duplicate.
  • c1.CustomerID < c2.CustomerID ensures that each duplicate pair is shown only once, avoiding repetition like Customer A vs. Customer B and Customer B vs. Customer A.

Output:

DuplicateRecordID1  DuplicateRecordID2  FirstName  LastName  DateOfBirth
1                   3                   John       Doe       1990-01-01
1                   5                   John       Doe       1990-01-01
3                   5                   John       Doe       1990-01-01
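
The self-join reports duplicates as pairs, so a group of k identical records produces k(k-1)/2 rows: the three John Doe records yield three pairs. The following is a minimal sketch that shows this next to a GROUP BY ... HAVING query, a different technique that returns one row per duplicate group. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Oracle, with dates stored as text.

```python
import sqlite3

# Same sample data as the Customers table above; dates are kept as text
# because SQLite, used here in place of Oracle, has no DATE type.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers ("
    "CustomerID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT, DateOfBirth TEXT)"
)
conn.executemany(
    "INSERT INTO Customers (CustomerID, FirstName, LastName, DateOfBirth) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "John", "Doe", "1990-01-01"),
        (2, "Jane", "Smith", "1992-02-02"),
        (3, "John", "Doe", "1990-01-01"),
        (4, "Alice", "Johnson", "1995-03-03"),
        (5, "John", "Doe", "1990-01-01"),
    ],
)

# Self-join: one row per pair of matching records.
duplicate_pairs = conn.execute(
    """
    SELECT c1.CustomerID, c2.CustomerID
    FROM Customers c1
    JOIN Customers c2 ON c1.FirstName = c2.FirstName
                     AND c1.LastName = c2.LastName
                     AND c1.DateOfBirth = c2.DateOfBirth
                     AND c1.CustomerID < c2.CustomerID
    ORDER BY c1.CustomerID, c2.CustomerID
    """
).fetchall()

# Aggregate alternative: one row per group of matching records.
duplicate_groups = conn.execute(
    """
    SELECT FirstName, LastName, DateOfBirth, COUNT(*)
    FROM Customers
    GROUP BY FirstName, LastName, DateOfBirth
    HAVING COUNT(*) > 1
    """
).fetchall()

print(duplicate_pairs)
print(duplicate_groups)
```

The self-join is the better fit when you need the IDs of the conflicting records; the aggregate form is more compact when you only need to know which combinations are duplicated and how often.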

Conclusion:

A self-join connects rows of a table to other rows of the same table by giving the table two aliases. It is useful whenever data needs to be compared within a single dataset. In the examples above:

  • The employee-manager example shows how to use self-joins for hierarchical data.
  • The IPL match-ups show how to generate combinations within a single dataset, either one match per pair or two matches per pair (home and away games).
  • The duplicate-customer example shows how to find rows that share the same values across several columns.
