In this blog, we will solve one MySQL question and understand errors while solving the problem.
Problem :
Delete Duplicate Emails
Table: Person
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| id | int |
| email | varchar |
+-------------+---------+
id is the primary key column for this table.
Each row of this table contains an email. The emails will not contain uppercase letters.
Write an SQL query to delete all the duplicate emails, keeping only one unique email with the smallest id.
Input:
Person table:
+----+------------------+
| id | email |
+----+------------------+
| 1 | john@example.com |
| 2 | bob@example.com |
| 3 | john@example.com |
+----+------------------+
Output:
+----+------------------+
| id | email |
+----+------------------+
| 1 | john@example.com |
| 2 | bob@example.com |
+----+------------------+
Explanation: john@example.com is repeated two times. We keep the row with the smallest Id = 1.
If We want No. of count which is having same email then we should use GROUP BY method.
The GROUP BY statement groups rows that have the same values into summary rows, like "find the number of id in each email".
The GROUP BY statement is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result-set by one or more columns.
Here we group each row that has the same email and find a count of the row with COUNT() function. email john@example.com occurs two times so we get a count as 2 and for bob@example.com we get a count as 1 because it occurs only one time.
Here email john@example.com has two ids 1 and 3. If We want to get minimum id from these two ids or minimum id from any no. of ids for particular email then we have to use MIN() function.
MySQL query should be like :
Here we get id 1 for email john@example.com because min(1,3) should be 1 and get 2 for bob@example.com because it has only one id that should be returned in MIN() function.
Now we have completed our major task in this problem. We want only id that is not in this min id because we want to keep the smallest id in our table and delete all ids that are greater than small id which has same email. so we first group row that has same email and then finds the smallest id from that to keep smallest id in table and delete all the row that has same email with id > small(id).
so the query will delete all id that is not equal to smallest id of particular email.
Oops...
We have got an Error.
ERROR #1093 : You can't specify the target table table_name for an update in FROM clause
What you can do is change the query to something like this below.
Okay, let me explain how the magic happens here.
You can not delete the rows from the same data source to which your sub-query refers. This above-mentioned query is a working, but it’s ugly for several reasons, including performance. Here nested subquery makes a temporary table. So it doesn’t count as the same table you’re trying to delete data from.
This is because your update could be cyclical. what if updating that record causes something to happen which made the WHERE condition FALSE? You know that isn’t the case, but the engine doesn’t. There also could be opposing locks on the table in the operation.
Top comments (0)