When working with databases, optimizing SQL queries is essential to achieve better performance and reduce response times. In this post, we'll cover...
ORDER BY with LIMIT still sorts all the records - otherwise you'd never be able to do paged sorted lists.
Not sure how it can be "bad practice" to use a %searchitem% if that's how the data is! Presumably it's a good idea to use other ANDed conditions to narrow the list first, so that the final scanned set is as small as possible - assuming such filters are viable in the scenario.
I'm confused by your DISTINCT example; it's definitely not true for MySQL, which can optimize DISTINCT queries by using an index to quickly find and skip duplicate rows. This can be a significant performance improvement for queries that return a large number of rows.
MySQL cannot optimize GROUP BY queries in the same way, because the GROUP BY clause requires MySQL to calculate aggregate functions for each group of rows. This can be a significant performance penalty for queries that return a large number of rows.
In general, you should use the DISTINCT clause whenever you only need to return unique rows from a table. You should use the GROUP BY clause whenever you need to calculate aggregate functions for groups of rows.
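The distinction can be sketched with a couple of queries (the `users` table and `city` column here are hypothetical, just for illustration):

```sql
-- DISTINCT: you only need each unique value once
SELECT DISTINCT city FROM users;

-- GROUP BY: you need an aggregate computed per group
SELECT city, COUNT(*) AS user_count
FROM users
GROUP BY city;
```

For the first query the optimizer only has to deduplicate values (and can walk an index on `city` if one exists); the second also has to maintain a counter for every group, which is where the extra cost comes from.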
Simply put, the database engine reads the first 10 rows in ASC/DESC order, so there's no need to sort 100,000 rows; once the SQL engine has found 10 rows, it stops.
MySQL is really smart and can figure out when you only want a few rows from a big table. When you use LIMIT, sometimes MySQL can use a special trick to find those rows really fast. If you use LIMIT with an ORDER BY clause, MySQL will stop sorting the rows as soon as it finds enough rows to show you. This is really fast if the table is already sorted in a certain way.
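A minimal sketch of the pattern being described, using a hypothetical `players` table with an index on `score`:

```sql
-- If an index on score exists, MySQL can read the first 10 entries in
-- index order and stop, instead of sorting the whole table.
SELECT name, score
FROM players
ORDER BY score DESC
LIMIT 10;
```

Even without a usable index, MySQL may still run a top-N sort (keeping only the best 10 candidates in memory) rather than a full filesort of every row.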
DISTINCT and GROUP BY - sometimes the scenario changes, but the behavior and implementation are different: DISTINCT uses a temp table, while GROUP BY uses a hash.
@miketalbot
ORDER + LIMIT sounds like it would be of limited use. Usually with ORDER, I'm looking for the "Top X" of something, but always out of the ENTIRE population. I can't think of a scenario where I'd be interested in the "Top X" of a random sample of results. So the query may be optimized, no argument there - but I would not offer this as standard advice because the utility is so limited outside of testing scenarios.
In my experience ORDER + LIMIT is of common use and it's a good practice, if the sorting column is an ordered primary key (which is a very good practice). For example, if you need the records created by a user in the last week, but not more than 10, instead of
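The original code snippet appears to have been stripped here; presumably it was something along these lines (the `posts` table and column names are hypothetical):

```sql
-- Fetch everything in the time range, leaving the "max 10" cap
-- to the application code.
SELECT *
FROM posts
WHERE user_id = 42
  AND created_at >= NOW() - INTERVAL 7 DAY;
```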
and then filter out the extra results in the backend code, you can do
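Again the snippet is missing; a sketch of what the improved query likely looked like, with the same hypothetical `posts` table:

```sql
-- Let the database stop after the 10 newest matching rows.
SELECT *
FROM posts
WHERE user_id = 42
  AND created_at >= NOW() - INTERVAL 7 DAY
ORDER BY id DESC
LIMIT 10;
```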
It's not only more efficient because it skips the records not needed, but with the DESC order it scans the rows starting from the last created and stops when it reaches the 10th matching result. Without the DESC order it has to scan all the first records created by the user. The only downside is that the results are reversed, but if you need them in ASC order you can sort them with a parent query or on the backend code side.
Bonus tip
If you don't need to limit the results, this example doesn't work well, because the DB engine does not know that created_at follows the same order as the id column. The best thing you can do in this case is to split the extraction into 2 queries:
The subquery extracts the last record id that doesn't meet the desired time range.
The main query directly uses the primary key to skip all the not needed records in a very efficient way.
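The code for the two queries seems to have been lost; a sketch of the approach just described, using the same hypothetical `posts` table and assuming `id` is an auto-increment primary key (so id order matches insertion order):

```sql
-- Query 1: find the newest id that is *older* than the desired range.
SELECT MAX(id) AS boundary_id
FROM posts
WHERE created_at < NOW() - INTERVAL 7 DAY;

-- Query 2: use the primary key to skip everything up to that boundary.
SELECT *
FROM posts
WHERE user_id = 42
  AND id > :boundary_id;
```

Query 2 becomes a simple primary-key range scan, so all the older records are skipped without ever being read.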