SQL Server Databases (4 Part Series)
This is a continuation of a series of articles I've written on SQL Server concepts.
When you work with SQL Server long enough on a database with enough traffic, you're eventually going to encounter deadlocks.
This article discusses what deadlocks are, how to interpret deadlock graphs, and some options for handling deadlocks.
A deadlock occurs when two processes are competing for multiple resources in a way that does not resolve itself. When this occurs, SQL Server must terminate one of the two processes, resulting in the query failing to execute and the transaction failing. The error is then propagated to the executing code which can determine how to proceed.
So, what does this look like?
In this scenario, we have a query that needs to update a
People entry as well as a
PersonSkill entry associated with that person and a skill. In order to do this, the query needs a lock on the
People table and a lock on a range of data in the
PersonSkill table, which the query already has a lock for.
Unfortunately, the row in question in the
People table is already exclusively locked by the Update User Skill Count query, creating a deadlock.
Update User Skill Count query would complete, release its lock and then the
Update PersonSkill query could acquire its lock on
People and complete its task.
Update User Skill Count is currently waiting for
Update PersonSkill to complete and release its lock on
PersonSkills so we now have a scenario where two queries each have something the other needs and will not release their locks until they complete.
Thankfully, SQL Server has a deadlock resolution mechanism to prevent processes from keeping the database busy waiting for something that will never happen. In deadlock resolution, SQL Server will choose a victim at random (more on this later) and kill the process.
In this case, the
Update PersonSkill query was chosen as the deadlock victim and terminated. An exception propagated up to the caller informing it that the process was terminated due to deadlock. Meanwhile, the locks held by the query are released and
Update User Skill Count is able to acquire the lock it needed and complete without any knowledge that anything happened.
We'll talk more about how to deal with deadlocks later on, but next, let's look at how to get deadlock information from SQL Server.
You can run the following query to get a list of deadlock event XML and the time the deadlock occurred (Query and image taken from Microsoft's documentation).
SELECT xdr.value('@timestamp', 'datetime') AS [Date], xdr.query('.') AS [Event_Data] FROM (SELECT CAST([target_data] AS XML) AS Target_Data FROM sys.dm_xe_session_targets AS xt INNER JOIN sys.dm_xe_sessions AS xs ON xs.address = xt.event_session_address WHERE xs.name = N'system_health' AND xt.target_name = N'ring_buffer' ) AS XML_Data CROSS APPLY Target_Data.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData(xdr) ORDER BY [Date] DESC
The XML can then be opened via SQL Server Management Studio by clicking on it, which will then open the deadlock graph.
Note the similarities in the deadlock graph to our examples above. The range of information on the database objects in a deadlock graph is much deeper in terms of IDs, object names, index names, lock types, etc. but at its core, it is just telling you that two queries needed the same two locks and couldn't resolve their differences.
So now that we've looked into what deadlocks are and how you get basic information about them, we need to shift our focus to resolving those deadlocks.
There are a few ways of approaching deadlock problems, and I present to you a few options from most preferred to least preferred.
First of all, deadlocks are a result of objects being locked. Sometimes the objects locked are wider than they need to be. For example, if a query needs to scan a table to find a particular row or set of rows, the query might lock the table or a page of data when locking individual rows would be more efficient.
By improving the indexes that are used to join to the table in question, you can potentially make the set of locks so specific that they drastically reduce the odds of a deadlock occurring to begin with.
This is the most preferred way of dealing with a deadlock, but keep in mind that there are tradeoffs made whenever you add a new index.
Next, while we said that the deadlock victim is random, it's random within a priority bucket. You can set the
DEADLOCK_PRIORITY of a transaction manually. If a deadlock involves two transactions with differing priorities, it will always kill the process with the lower priority.
You can set priority with the following SQL:
SET DEADLOCK_PRIORITY LOW -- or NORMAL or HIGH
You can also set priority numerically from a -10 to 10 range with -10 being lowest and 10 being highest priority.
SET DEADLOCK_PRIORITY -5 -- Equivalent to LOW
By default all transactions will have a priority of
NORMAL or 0.
Bear in mind that this does not prevent the deadlock from occurring, but it does state a preference for which query will survive and which will be terminated.
In combination with the other modes, you can design routines with deadlocks in mind. If you expect a deadlock may occur in some scenarios, you can look for deadlock exceptions and retry the query in question up to a certain number of times until a success is reached. This is done outside of SQL and is a programming language-specific detail, so an example will not be provided.
Generally speaking this option is a longer-lasting solution, but takes longer to implement and is much more difficult to test during development and quality assurance.
Finally, you can change the transaction's isolation mode so that it is less aggressive about the locks it requests. See my article on isolation modes for more details on this process.
When changing isolation modes to avoid deadlocks, you typically will move from
READ COMMITTED to
READ UNCOMMITTED, but keep in mind that
READ COMMITTED protects you from dirty reads or reads of potentially incomplete information in rows. If you're reading something infrequently updated such as a name, this is probably fine, but if you require an accurate numerical total of a frequently updated field or your data needs to be completely accurate, you probably can't go to
READ UNCOMMITTED and you're better off with another solution.
In summary, deadlocks are a naturally occurring problem for busy databases, but there are a few things you can do to reduce their likelihood or strategically handle them if they do occur.
Next in this series I'll discuss database indexes and performance.