
Fixing Messy Queries with DbVisualizer – a Guide for MySQL

Messy queries are nothing new to any DBA. As time goes on, more work piles up and more data is written to the database supporting the application your company sells; the database and, consequently, the queries that run against it are bound to become harder to manage.

Messy queries can be spotted by assessing:

  • Database performance – SHOW PROCESSLIST will help (see the sketch after this list), but do you have time to figure out exactly which part of a query is causing havoc?
  • The performance of the web application – if it's nosediving, that can be a direct result of slow, messy queries; in fact, application performance is a very frequent indicator of issues on the database front.
  • Complaints from users about the performance of the application are a good indicator too – if many customers are complaining about the service being slow, out of order, and so on, it may be time to check up on the database.
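If you want to take that first look by hand, MySQL's process list will surface long-running statements. The sketch below is a minimal example; the 5-second threshold is just an illustrative assumption:

```sql
-- List every running statement together with how long it has been running.
SHOW FULL PROCESSLIST;

-- The same information from information_schema, filtered to statements
-- that have been running for more than 5 seconds (threshold is arbitrary).
SELECT id, user, db, time, state, info
FROM information_schema.PROCESSLIST
WHERE command <> 'Sleep' AND time > 5
ORDER BY time DESC;
```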

You get the point – messy, poorly thought-out queries are a problem on many fronts: customers will complain, applications will grind to a halt, and the work of developing the application will be impacted. However, there is a simple way to fix these issues – a professional SQL client like DbVisualizer will help us write efficient queries and make sure that our application runs like the wind.

What Are SQL Clients And How Can They Help?

SQL clients target multiple database management systems (think MySQL, PostgreSQL, MongoDB, Oracle, and the like) with the aim of simplifying our work with databases. All of them are, of course, built differently, but good SQL clients come with the following features:

  • SQL editors (the part of the application that lets us run SQL queries) – some SQL clients can also complete queries automatically as we type;
  • The ability to let us view and modify our databases and the data within them – good SQL clients encompass the following features:
    • The ability to add and modify databases and tables.
    • The ability to view the data types and indexes.
    • The ability to interact with our data – yep, just like in Microsoft Excel.
    • The ability to visualize our data – powerful SQL clients are able to build various charts based on the data within our databases. The charts can be very useful to encourage visual communication in meetings and present information to stakeholders, team members, etc.
    • Some SQL clients also let their users encrypt all of the data moving back and forth through the client – as more and more companies adhere to privacy and security standards (GDPR, ISO 27001, etc.) and need to secure their clients' most sensitive data, SQL clients are no exception.
  • Most good SQL clients also support multiple database management systems at once – we've already mentioned MySQL, PostgreSQL, Oracle, and MongoDB, but some also support more “exotic” types of DBMS: as far as DbVisualizer is concerned, Cassandra, Azure SQL, SQLite, Db2, Mimer SQL, and even Redshift and Snowflake are all on the list. No one is left out – SQL clients fit those working on personal projects as well as those managing data on big-data platforms such as MongoDB.
  • Finally – and this is not news to anyone acquainted with the software engineering world – SQL clients usually provide free 30-day trials to let their users try the software before buying it. Some software solutions upgrade their users to paid versions automatically; others let them stay on the free version indefinitely or upgrade to access premium features.

At the end of the day, SQL clients kill several birds with one stone – they help us manage the data within our databases and fix messy queries too, all under one roof. Keep reading and we'll tell you exactly how it all works.

Fixing Messy Queries with DbVisualizer

To help DbVisualizer gain insight into our queries, we first need to connect a database instance to the tool. If you're running DbVisualizer for the first time, you can do that by following the instructions on the screen; if you've been working with DbVisualizer for a while, select the database you want to work with on the left-hand side or create a new database connection as shown below and you should be good to go:

Image 1 - Creating a Database Connection with DbVisualizer

Once you've connected the database to DbVisualizer, confirm that it's able to reach the database instance by looking for the checkmark to the left of the name of the DBMS or by expanding the connection in question. If the connection is successful, you should see a list of databases:

Image 2 - A List of Databases

Expand a database you want to work with, then head over to the table that is causing problems.

Image 3 - Listing the Columns in a Table

Now we can get to work. Big things start small, remember? Messy queries start with data types and indexes being defined improperly or not at all. In this case, the table we are looking at (a rough recreation is sketched after the list below):

  • Is named "demo."
  • Has five columns:
    • One column denotes the ID of the row and is of the integer type.
    • One column stores usernames as VARCHAR values.
    • One column stores email addresses as text.
    • One column stores metadata related to user accounts, also as VARCHAR.
    • Finally, the last column stores the IP addresses of users as VARCHAR values too – the most likely reason IPs are not stored as integer values is the dots (".") between the numbers.
  • Has five indexes: one of them is primary, the others are secondary.
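To make that structure easier to picture, here is a rough recreation of what such a table might look like. Apart from the table name "demo" and the username column discussed later, every column and index name here is an assumption for illustration only:

```sql
-- A hypothetical recreation of the "demo" table described above.
-- Column and index names other than "demo" and "username" are assumptions.
CREATE TABLE demo (
  id       INT UNSIGNED NOT NULL AUTO_INCREMENT,
  username VARCHAR(20)  NOT NULL,
  email    TEXT         NOT NULL,
  metadata VARCHAR(255) NOT NULL,
  ip       VARCHAR(15)  NOT NULL,
  PRIMARY KEY (id),                -- the primary index
  UNIQUE KEY username (username),  -- secondary, unique
  KEY metadata_idx (metadata),     -- secondary
  KEY ip_idx (ip),                 -- secondary
  FULLTEXT KEY email_ft (email)    -- secondary, full-text
) ENGINE=InnoDB;
```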

Understanding the database structure is key to fixing messy queries because the data that is selected and the indexes on the columns containing that data directly impact query speed. Let's assume we're looking into the performance of a SELECT query. Suppose our problematic query looks like so:

Image 4 - Demo Query with DbVisualizer
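The exact query is shown in the image above; a rough reconstruction that exhibits the same problems might look something like the sketch below (the search terms are made up, and the column names follow the assumed structure from earlier):

```sql
-- Illustration only: SELECT *, wildcards on both sides of the search string,
-- a full-text MATCH() AGAINST() on the email column, and AND chaining.
SELECT *
FROM demo
WHERE username LIKE '%john%'
  AND MATCH(email) AGAINST('john@example.com');
```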

A DBA will instantly notice a few things that this query does:

  1. SELECT * selects all of the columns.
  2. Wildcards are used on both sides of the search string to return all rows matching an expression.
  3. A full-text index is used.
  4. The AND operator is used.

DBAs will know that all of these are problematic in their own regard, since all of them make the query slow. Here's why:

  1. Selecting all columns means scanning through more data – the more data we have, the slower the query becomes.
  2. Wildcards only allow an index to be used when they appear at the end of the search string – in this case, two wildcards surround the string, so the index on the column cannot be used.
  3. Attentive readers will notice the MATCH() AGAINST() clause – such a clause makes use of a full-text index present on the column, and this approach can be especially problematic with email addresses when MySQL 5.7 and big data sets are in use: it's the cause of bug #104263, where an "@" sign inside the query slows it to a halt on bigger data sets.
  4. The AND operator isn't a problem in and of itself; however, the more conditions are chained together, the more work MySQL has to do to find matching rows – if possible, keep them to a minimum.
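With those points in mind, a leaner version of the same lookup might look like the sketch below – selecting only the columns we need, keeping the wildcard at the end of the string, and dropping the extra condition. Column names again follow the assumed structure from earlier:

```sql
-- Select only what we need, keep the wildcard trailing so the
-- index on username can be used, and cap the number of rows returned.
SELECT id, username, email
FROM demo
WHERE username LIKE 'john%'
LIMIT 50;
```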

However, chances are that you're not a DBA and you might not know all of these details – that's where SQL clients such as DbVisualizer step in.

Fixing Issues with DbVisualizer

Head back to the left side of the tool – remember the columns and indexes? Start by exploring those – click on each of the columns one by one to see how they are built. You will see something similar to the screenshot below:

Image 5 - The Username Column in DbVisualizer

For performance reasons, note down the values of "size," "type," and "nulls." Then explore the indexes and triggers, if there are any (for this specific example, note that the name of the index and the column can differ):

Image 6 - Indexes on a Column

Performing basic check-ups on the structure of the table, indexes, and triggers within them is key to fixing the mess our queries have made – couple that with basic knowledge on the database front, and you should be good to go!
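If you prefer to double-check the same details with plain SQL, SHOW CREATE TABLE and SHOW INDEX surface the column definitions, nullability, and index cardinality that DbVisualizer displays (the table name here follows the assumed example from earlier):

```sql
-- Column definitions, types, sizes, and NULL-ability in one place.
SHOW CREATE TABLE demo;

-- Index names, uniqueness, column order, and cardinality for the table.
SHOW INDEX FROM demo;
```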

For now, though, let's come back to our example above. On the username column, we see that there is an index called username, that it's unique and ascending, and that there aren't any unique values in it yet. All of those things should raise two questions:

  1. Why doesn't the column have any cardinality?
  2. Why is the column constrained to a relatively low length (20)?

Answer those questions – the answers can lead to actions that improve query speed or help us in other areas. In this specific case, raising the length from 20 to something like 255 could mean fewer customer complaints about longer usernames not being registered, without putting a strain on the database – see the sketch below.
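Widening the column is a single statement. Here's a sketch of that change, again under the assumed table structure from earlier:

```sql
-- Widen the username column so longer usernames can be registered.
-- The existing unique index on the column stays in place.
ALTER TABLE demo
  MODIFY COLUMN username VARCHAR(255) NOT NULL;
```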

As we inspect other columns, we may see a different pattern – the columns may have unique values in them (cardinality), their length could be bigger, the default value of the column could be NULL instead of an empty one, and so on. That's why, no matter how powerful a SQL client is, you still need to combine it with your own knowledge of databases to push it to its maximum potential. When you find yourself rebuilding messy queries, keep the following things in mind:

  • SELECT queries:
    • Indexes are not all the same – B-Tree indexes are not the same as FULLTEXT indexes, and the wildcards used with them differ too (we use the "%" wildcard with B-Tree indexes and the "*" operator with FULLTEXT indexes.)
    • Indexes speed up SELECT queries.
    • Contrary to popular belief, the length of a column doesn't have much impact on performance until we're working with billions of rows.
    • The fewer rows and columns are selected, the faster the query will be (e.g. avoid SELECT * if you only need one column returned, and consider using LIMIT to cap the number of rows returned if necessary.)
  • INSERT queries:
    • When dealing with bigger data sets, LOAD DATA INFILE is always faster than INSERT.
    • Running one multiple-value INSERT will generally yield better results than running multiple separate INSERT operations (see the sketch after this list.)
    • More indexes = slower INSERT.
  • UPDATE queries:
    • For better performance, lock tables before updating data, then unlock them.
    • UPDATE statements are optimized like SELECT statements, with a write overhead.
    • More indexes = slower UPDATE.
  • DELETE queries:
    • More indexes = slower DELETE.
    • TRUNCATE is always faster than DELETE when you need to remove all rows.
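To illustrate a couple of those points, here is a short sketch of a multi-value INSERT and a LIMIT-ed SELECT against the assumed table from earlier; the values themselves are made up:

```sql
-- One multi-value INSERT is generally faster than three separate INSERTs:
-- the statement is parsed once and the indexes are updated in one pass.
INSERT INTO demo (username, email, metadata, ip) VALUES
  ('alice', 'alice@example.com', '{}', '10.0.0.1'),
  ('bob',   'bob@example.com',   '{}', '10.0.0.2'),
  ('carol', 'carol@example.com', '{}', '10.0.0.3');

-- Only fetch the columns and rows we actually need.
SELECT id, username
FROM demo
ORDER BY id DESC
LIMIT 10;
```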

Keep the advice above in mind when reworking your queries with DbVisualizer – many of the issues you're facing can be solved by wrapping your head around one or more of these points. However, understanding them alone is not enough to fix the mess your queries have made; understanding the structure of your tables (the left side of DbVisualizer), properly writing queries (the SQL Commander in DbVisualizer), and monitoring the speed of your queries (the bottom right side of DbVisualizer) are all essential for a brighter future for your database and application.

We hope that this article has been informative and helped you fix the issues surrounding your queries, or at least wrap your head around them. Come back to the blog for more content like this, and until next time!

About the author

Lukas Vileikis is an ethical hacker and a frequent conference speaker. He runs one of the biggest and fastest data breach search engines in the world – BreachDirectory.com – and blogs in multiple places, including his own blog at lukasvileikis.com.
