Upsert is a database operation used to insert data or update it if the data already exists.
Upsert is very useful for performing insert and update operations in a single query, for example when importing a large amount of data.
In MySQL, upsert can be done by adding the ON DUPLICATE KEY UPDATE
clause to an INSERT
statement. For example:
INSERT INTO users
(email, name)
VALUES
('adam@gmail.com', 'Adam'),
('samir@gmail.com', 'Samir'),
('bokek@gmail.com', 'Bokek')
AS new
ON DUPLICATE KEY UPDATE
name = new.name;
In MySQL versions below 8.0.19,
AS new
andnew.name
are not supported. In that case, useVALUES(name)
instead. In MySQL versions starting above 8.0.20,VALUES(name)
is deprecated.
The query above will insert three new rows. However, if an email already exists (because the email
column is unique), the corresponding row will be updated instead.
To perform an upsert, two conditions must be met:
- There must be a unique column. In the previous example,
email
is the unique column. - Each column that will be updated during the upsert must be specified in the
ON DUPLICATE KEY UPDATE
statement.
Example
There is a users
table, structured like this:
DESC users;
-- +-------+-----------------+------+-----+---------+----------------+
-- | Field | Type | Null | Key | Default | Extra |
-- +-------+-----------------+------+-----+---------+----------------+
-- | id | bigint unsigned | NO | PRI | NULL | auto_increment |
-- | email | varchar(100) | NO | UNI | NULL | |
-- | name | varchar(50) | NO | | NULL | |
-- +-------+-----------------+------+-----+---------+----------------+
In the users
table, there are 3 rows:
SELECT * FROM users;
-- +----+-----------------+-------+
-- | id | email | name |
-- +----+-----------------+-------+
-- | 1 | adam@gmail.com | Adam |
-- | 2 | samir@gmail.com | Samir |
-- | 3 | bokek@gmail.com | Bokek |
-- +----+-----------------+-------+
3 rows in set (0.00 sec)
Now, five new rows need to be inserted: two rows for new users (with new emails), and three rows for existing users (emails already exist but with different names).
If this operation is performed using a regular INSERT
statement, it will throw an error, because three of the new rows already exist (based on the email).
INSERT INTO users
(email, name)
VALUES
('adam@gmail.com', 'Adamson'),
('samir@gmail.com', 'Samirni'),
('bokek@gmail.com', 'Bokek'),
('rutop@gmail.com', 'Rutoptop'),
('admin@gmail.com', 'Admin');
-- ERROR 1062 (23000): Duplicate entry 'adam@gmail.com' for key 'users.email'
To solve this, use an upsert. New users will be inserted as usual, while existing users will be updated.
INSERT INTO users
(email, name)
VALUES
('adam@gmail.com', 'Adamson'),
('samir@gmail.com', 'Samirni'),
('bokek@gmail.com', 'Bokek'),
('rutop@gmail.com', 'Rutoptop'),
('admin@gmail.com', 'Admin')
AS new
ON DUPLICATE KEY UPDATE
name = new.name;
-- Records: 5 Duplicates: 2 Warnings: 0
As a result, two new rows will be inserted, and the names of the three existing rows will be updated.
SELECT * FROM users;
-- +----+-----------------+----------+
-- | id | email | name |
-- +----+-----------------+----------+
-- | 1 | adam@gmail.com | Adamson |
-- | 2 | samir@gmail.com | Samirni |
-- | 3 | bokek@gmail.com | Bokek |
-- | 24 | rutop@gmail.com | Rutoptop |
-- | 25 | admin@gmail.com | Admin |
-- +----+-----------------+----------+
-- 5 rows in set (0.00 sec)
Top comments (0)