DEV Community

Cover image for UPDATE Queries - Advanced CRUD explanation part 3
DbVisualizer
DbVisualizer

Posted on • Originally published at dbvis.com

UPDATE Queries - Advanced CRUD explanation part 3

UPDATE queries are another necessary component of any database management system – in this blog, we walk you through them. Join us!


Tools used in this tutorial

DbVisualizer, top rated database management tool and SQL client


UPDATE queries are a necessary part of every application. Everyone needs to update data once in a while – everyone makes mistakes, right?

Update queries are just that – they’re queries that help developers update their data. In this blog we’re walking you through them as well.

Why UPDATE?

In its most basic form, UPDATE queries look something along those lines:


UPDATE Queries.

UPDATE Queries.



Do note that by default, UPDATE queries must have only the SET clause – however, additional parameters can also be included, these being:

  • Parameters regulating priority – both LOW_PRIORITY and HIGH_PRIORITY are acceptable values.
  • The IGNORE clause to tell the query to ignore any and all errors that are encountered.
  • The WHERE clause regulating what parts of data within a table should be updated.

The most likely clause you’re going to encounter is WHERE, followed by the IGNORE clause. The priority clauses are used very rarely and if they’re used, they’re usually used in environments where there are a lot of queries running simultaneously and developers need to update data in chunks.

UPDATE Query From the Inside

Here’s how UPDATE queries look like from the inside (to learn how to profile a query, turn back to the second or first part of these tutorials):


Profiling an UPDATE Query.

Profiling an UPDATE Query.



We can see that there are multiple things going on:

  1. starting – the query starts.
  2. checking permissions – the query checks for permissions and if they are not sufficient, the query stops here and provides the user an error.
  3. Opening tables – the query is opening tables – tables need to be open in order for the query to complete.
  4. Init – the query initializes its processes.
  5. System lock – the query is checking whether there are any reading / writing locks in place.
  6. updating – the query updates data.
  7. end – the query ends all of its processes.
  8. query end – the query stops running (not to be confused with the process in the step #7 – there the query just ends all of its processes inside of itself, but still continues to run.)
  9. closing tables – all of the affected tables are being closed.
  10. cleaning up – the database is cleaning up and preparing for the next query to be run.

With UPDATE queries, everything works in a simple fashion – we must have sufficient permissions, InnoDB will create row-level locks in the process, update the data, and end.

Everything may look relatively simple from the outset, but when we dig into the processes we quickly see that everything’s a little more complex than we could imagine.

Advanced DBAs will know that both indexes and partitions slow DELETE, INSERT, and UPDATE queries down because of the fact that data existing in indexes and partitions need to be updated together with the data itself. That’s not it – UPDATE queries can also update multiple columns at once if we run them by specifying multiple columns like so:


Updating Multiple Values at Once.

Updating Multiple Values at Once.



Do note that we specified multiple columns after the “,” sign before the WHERE clause. There’s a downside – when updating a lot of data such queries will definitely take up a lot of time, so there are a couple of other tricks we can pull from our sleeves: the first one is using the IGNORE keyword to ignore all of the errors, but that’s not it either – did you know that the DEFAULT keyword can also be used to update bigger data sets? Yes, that‘s not an usual practice, but the DEFAULT statement can also double as an UPDATE statement in very specific cases:

  • Assume we have a lot of data (like billions – one of the biggest & fastest data breach search engines in the world, BreachDirectory, allows people to check if they’re at risk of identity theft and runs on tens of billions of records without any issues)
  • We’re also pretty low on disk space (perhaps not this low, but you get what we mean) – to free up data on the disk, we have to load data in chunks, then delete the first file straight after it’s uploaded into the database (we need to track when the LOAD DATA INFILE queries finish):


    Low on Disk Space.

    Low on Disk Space.
  • We also have a table that we desperately need to insert data into. We have around 100 million records.

  • We make use of LOAD DATA INFILE for our database to load data in a quick fashion (you’ve read our previous blog about INSERT queries in these series, right?)

  • We start loading data into the database, and... face an error. We run out of storage space because before loading data into the database many DBMS need to make a copy of the table and since we also have to store the file that we load into the database onto the same disk. With only 2GB of free space, that’s not a possibility.

Now consider the same scenario, just with the DEFAULT keyword in use:

  • We create our table in such a way that includes the DEFAULT keyword and at the same time we specify the necessary values to avoid running the UPDATE query once the table is already created. All UPDATE queries would make a copy of the table on the disk if the table is big:

The DEFAULT keyword when creating a table.<br>

The DEFAULT keyword when creating a table.
  • We make use of LOAD DATA INFILE to only load the necessary data into our specified fields – the IGNORE keyword will ignore all encountered errors:

LOAD DATA INFILE with an IGNORE keyword.<br>

LOAD DATA INFILE with an IGNORE keyword.
  • Our data is loaded into the database without any issues! The email field will be left blank, and the purchases_list column will be pre-filled with “Coca Cola, Fanta” values meaning that we won’t need to run an additional UPDATE on that table – the results will be pre-filled. How cool is that?

The reason why our DBMS behaved in this way is pretty self-explanatory: some of the fields have already been filled-in prior to inserting the data. We shot two rabbits with one shot: the file was smaller since we didn’t specify other values (think of how much space we could save if there would’ve been billions of records?), and we didn’t have to run an UPDATE query after the LOAD DATA INFILE query – the values were already filled in by the database itself.

Neat, right? Now quick – grab a free trial of DbVisualizer and start being in charge of your database!

Summary

In this blog, we have walked you through the third query in the CRUD acronym – the UPDATE query that’s used to update data within tables. We bet you didn’t know that the DEFAULT keyword can also be used to update data!

Follow us on Twitter, read our blog to stay updated around what’s happening in the database world, and until next time.

FAQs

What Does the UPDATE Query Do? Do Indexes Help it?

The UPDATE query updates data within a table. Indexes and partitions make it slower at the expense of speeding up SELECTs.

What’s the Best Way to Optimize an UPDATE Query?

The exact way depends on the DBA – consider removing indexes and partitions, and in cases where you find yourself having a lot of data, the DEFAULT keyword as well.

About the author

Lukas Vileikis is an ethical hacker and a frequent conference speaker. He runs one of the biggest & fastest data breach search engines in the world - BreachDirectory.com, frequently speaks at conferences and blogs in multiple places including his blog over at lukasvileikis.com.

Top comments (0)