
Kipngeno Gregory

Beyond Functional: Writing Professional and Performant SQL Queries

Structured Query Language (SQL) is one of the most widely used languages for interacting with databases, yet even experienced developers often make subtle mistakes that affect performance, readability, and security. Writing high-quality SQL queries is critical for scalability, maintainability, and efficiency.

1. The Siren Call of SELECT *

Problem: SELECT * retrieves every column from a table, even those not needed. This increases network load, slows down queries, and can break applications if the schema changes.

Solution: Always specify the exact columns you need:

SELECT id, name, created_at FROM users;

-- Instead of:
SELECT * FROM orders;

-- Use:
SELECT order_id, customer_id, order_date, total_amount
FROM orders;

This improves performance and keeps your queries predictable.
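Because an explicit column list also pins down column *order*, SELECT * is especially brittle inside INSERT ... SELECT. A quick sketch (the users_archive table is hypothetical):

-- Hypothetical archive table with the same layout as users.
-- This INSERT depends on the column order of users and breaks
-- (or silently misaligns data) when a column is added or reordered:
INSERT INTO users_archive SELECT * FROM users;

-- Explicit columns keep it stable across schema changes:
INSERT INTO users_archive (id, name, created_at)
SELECT id, name, created_at FROM users;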

2. Neglecting the Power of Indexes

Problem: Queries that filter large tables without proper indexes often result in full table scans, significantly slowing down performance.

The Mistake: Writing predicates that prevent index usage.

Common culprits include:

  • Wrapping a column in a function: WHERE YEAR(order_date) = 2023
  • Using a wildcard at the beginning of a LIKE pattern: WHERE customer_name LIKE '%Smith%'
  • Using OR conditions on different columns without appropriate indexes (one common rewrite is sketched just below)

The Impact: The query forces a full table scan, which becomes increasingly slow as the table grows.
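For the OR case specifically, one common workaround is to split the query with UNION so each branch can use its own index. A sketch, assuming customers has separate indexes on last_name and city (both columns are hypothetical here):

-- Instead of (the OR may force a full scan even with both columns indexed):
SELECT customer_id, customer_name FROM customers
WHERE last_name = 'Smith' OR city = 'Nairobi';

-- Use (each branch can use its own index; UNION removes duplicate rows):
SELECT customer_id, customer_name FROM customers WHERE last_name = 'Smith'
UNION
SELECT customer_id, customer_name FROM customers WHERE city = 'Nairobi';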

Solution: Ensure that columns used in WHERE, JOIN, and ORDER BY clauses are indexed where appropriate. Always analyze query execution plans to confirm indexes are being used.

-- Instead of (non-sargable):
SELECT * FROM orders WHERE YEAR(order_date) = 2023;

-- Use (sargable - Search ARGument ABLE):
SELECT * FROM orders WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01';
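To confirm the rewrite pays off, index the filter column and inspect the plan. A minimal sketch, assuming MySQL- or PostgreSQL-style syntax (the index name is arbitrary):

-- Index the column used in the range filter:
CREATE INDEX idx_orders_order_date ON orders (order_date);

-- Inspect the execution plan; a range scan on the index should replace the full scan:
EXPLAIN
SELECT order_id, total_amount
FROM orders
WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01';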

3. Mishandling NULL Values

Problem: Failing to account for NULL values can produce unexpected results, especially with comparison operators.

Solution: Handle NULLs explicitly: use IS NULL / IS NOT NULL in predicates, and COALESCE() (or MySQL's IFNULL()) to supply defaults:

SELECT COALESCE(email, 'kipngenogregory@gmail.com') AS safe_email FROM users;

or

-- Instead of (will not find NULL phone numbers):
SELECT * FROM customers WHERE phone_number = NULL;

-- Use:
SELECT * FROM customers WHERE phone_number IS NULL;

-- To safely perform calculations:
SELECT product_id, price * COALESCE(quantity, 0) AS estimated_value FROM order_items;
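The same three-valued logic bites inequality filters too. A short sketch, assuming a nullable status column on orders (hypothetical here):

-- Rows where status IS NULL match *neither* filter, because
-- comparisons with NULL evaluate to UNKNOWN, not TRUE or FALSE:
SELECT * FROM orders WHERE status = 'shipped';
SELECT * FROM orders WHERE status <> 'shipped';

-- To keep the NULL rows in the "not shipped" result, say so explicitly:
SELECT * FROM orders WHERE status <> 'shipped' OR status IS NULL;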

4. Improper Filtering in GROUP BY and HAVING

Problem: Confusing the WHERE and HAVING clauses leads to inefficient queries.

  • The Mistake: Using the HAVING clause to filter rows before aggregation.
  • The Impact: The HAVING clause filters groups after they have been aggregated. Filtering individual rows first is the job of the WHERE clause, which is far more efficient because it shrinks the working data set before the costly aggregation step.

The Professional Approach: Use WHERE to filter rows. Use HAVING to filter groups (a legitimate use of HAVING is sketched after the examples below).

-- Inefficient:
SELECT customer_id, COUNT(order_id) AS order_count
FROM orders
GROUP BY customer_id
HAVING customer_id > 100; -- Filtering on a row-level column *after* grouping

-- Efficient:
SELECT customer_id, COUNT(order_id) AS order_count
FROM orders
WHERE customer_id > 100  -- Filter rows *before* grouping
GROUP BY customer_id;
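HAVING earns its keep when the condition is on the aggregate itself, which WHERE cannot express. A small sketch combining both (the date bound and threshold are illustrative):

-- Row filter first, then a group filter on the aggregate:
SELECT customer_id, COUNT(order_id) AS order_count
FROM orders
WHERE order_date >= '2023-01-01'   -- WHERE trims rows before grouping
GROUP BY customer_id
HAVING COUNT(order_id) > 5;        -- HAVING filters the aggregated groups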

Conclusion

By eschewing SELECT *, respecting indexes, handling NULLs correctly, and filtering strategically, you elevate your code from merely functional to truly exceptional. This leads to systems that are faster, more reliable, and easier to debug, the hallmark of a true data professional.

by gregory.tech
