Tuning SQL Server Performance, Not a Science

#dotnet

Originally published on lavkesh.com

I've spent many nights analyzing execution plans, trying to determine why a query that previously ran in milliseconds now takes seconds. Performance tuning is detective work that requires intuition, experience, and the right tools. Most importantly, you need to think like the query optimizer.

Before tuning, you must identify what's actually wrong. The usual suspects include bad queries with missing indexes, poor joins, or table scans instead of seek operations. A query that should access 100 rows but scans a million instead severely impacts performance.

Indexing issues are another common problem. This includes missing indexes, too many unused indexes slowing down writes, or fragmented indexes that cause seeks to become slow scans. Resource contention occurs when multiple queries compete for CPU, memory, or I/O, bottlenecking resources and slowing everything down under load.

During a 2023 migration of a high‑traffic e‑commerce platform, we discovered over 5,000 indexes across 200 tables. The index count looked impressive on paper, but the write latency jumped by roughly 15 % because each insert had to touch dozens of secondary indexes. We ran sp_Blitz and sp_Indexes to flag indexes that hadn't been used in the last 30 days. Dropping about 800 of those unused indexes cut the insert latency back down and reduced the transaction log growth rate from 2 GB per hour to 1.2 GB. The lesson was clear: more indexes are not always better; each one adds CPU and I/O overhead on DML operations.

Fragmentation is a subtle issue; as indexes fragment, seeking becomes slower due to increased disk I/O operations. Rebuilding indexes can suddenly improve performance. Stale statistics also cause problems; the query optimizer relies on data distribution statistics to decide on execution plans. Outdated statistics lead to poor decisions, causing a query that once ran fine to become slow after new data is inserted.

The tuning process starts with profiling. Open SQL Server Management Studio and examine execution plans. This graphical representation of how SQL Server executes your query is invaluable. Look for table scans when index seeks are expected, excessive sorts, or spills to disk, as these are red flags.

One tool that saved us countless late‑night fires was Query Store. By enabling it on the primary databases, we got a history of actual plans and their runtime statistics. When a plan regression appeared - say a hash join replaced a nested loop and the duration doubled - we could query sys.query_store_plan and force the known‑good plan back with ALTER DATABASE SCOPED CONFIGURATION. The catch is to avoid over‑forcing; a plan that works for one parameter set may cause parameter sniffing on another, so we always paired forcing with a review of the parameter distribution.

Use SQL Server Profiler or Extended Events to capture slow-running queries in production. Don't guess; capture actual workload data. Focus on queries with high CPU usage, high I/O, or long durations that impact your application.

For each slow query, analyze its execution plan. Check for discrepancies between estimated and actual rows, which indicate outdated statistics. Look for missing indexes; the optimizer may suggest them. Use the Database Engine Tuning Advisor but treat its suggestions as recommendations, not absolute solutions.

Rewrite queries if necessary. Sometimes, changing the join strategy, adding a covering index, or breaking a complex query into simpler parts can help. Test changes in a staging environment that mimics production and measure the difference.

Testing query changes in isolation is tempting, but in production the interaction with concurrent workloads matters. We built a lightweight tSQLt suite that runs the rewritten query against a copy of the production data set and asserts that row counts match. To gauge impact we used sp_WhoIsActive during a controlled load test with HammerDB, capturing average CPU time and I/O per execution. In one case a covering index reduced the average CPU from 120 ms to 45 ms and cut logical reads from 3,200 to 900, which translated to a 30 % reduction in overall request latency under load.

Monitor fragmentation using DMVs. Rebuild indexes if fragmentation exceeds 30 percent; between 10-30 percent, reorganizing might suffice. Below 10 percent, no action is needed. Avoid creating indexes indiscriminately, as each index slows inserts and updates. Regularly review and drop unused indexes.

Monitor Performance Monitor counters for SQL Server, such as CPU, memory pages per second, and disk queue length. Sustained high values indicate resource contention. Use Resource Governor to throttle resource-intensive queries or sessions and prioritize critical workloads.

Enable automatic updates for statistics, but note it's not perfect. For tables with significant changes, update statistics more frequently, as they drive the query optimizer's decisions. Tools like SSMS, SQL Server Profiler, and DMVs provide valuable insights into performance issues.

Performance tuning is an ongoing process. Your workload changes, data grows, and patterns shift. Set up monitoring and alerting to catch performance issues early. Track query performance over time and foster a culture that values execution plans, not just functional code.

DEV Community

Tuning SQL Server Performance, Not a Science

Top comments (0)