Manage database concurrency
- 10/9/2017
- Skill 3.1: Implement transactions
- Skill 3.2: Manage isolation levels
- Skill 3.3: Optimize concurrency and locking behavior
- Skill 3.4: Implement memory-optimized tables and native stored procedures
- Thought experiment
- Thought experiment answers
Skill 3.3: Optimize concurrency and locking behavior
SQL Server uses locks to control the effect of concurrent transactions on one another. Part of your job as an administrator is to improve concurrency by properly managing locking behavior. That means you need to understand how to uncover performance problems related to locks and lock escalations. Additionally, you must know how to use the tools available to you for identifying when and why deadlocks happen and the possible steps you can take to prevent deadlocks from arising.
Troubleshoot locking issues
Before you can troubleshoot locking issues, you must understand how SQL Server uses locks, which we describe in detail in Skill 3.1. As part of the troubleshooting process, you need to determine which resources are locked, why they are locked, and the lock type in effect.
You can use the following dynamic management views (DMVs) to view information about locks:
sys.dm_tran_locks Use this DMV to view all current locks, the lock resources, lock mode, and other related information.
sys.dm_os_waiting_tasks Use this DMV to see which tasks are waiting for a resource.
sys.dm_os_wait_stats Use this DMV to see how often processes are waiting while locks are taken.
Before we look at these DMVs in detail, let’s set up our environment as shown in Listing 3-7 so that we can establish some context for locking behavior.
LISTING 3-7 Create a test environment for testing locking behavior
CREATE TABLE Examples.LockingA ( RowId int NOT NULL CONSTRAINT PKLockingARowId PRIMARY KEY, ColumnText varchar(100) NOT NULL ); INSERT INTO Examples.LockingA(RowId, ColumnText) VALUES (1, 'Row 1'), (2, 'Row 2'), (3, 'Row 3'), (4, 'Row 4'); CREATE TABLE Examples.LockingB ( RowId int NOT NULL CONSTRAINT PKLockingBRowId PRIMARY KEY, ColumnText varchar(100) NOT NULL ); INSERT INTO Examples.LockingB(RowId, ColumnText) VALUES (1, 'Row 1'), (2, 'Row 2'), (3, 'Row 3'), (4, 'Row 4');
sys.dm_tran_locks
The sys.dm_tran_locks DMV provides you with information about existing locks and locks that have been requested but not yet granted in addition to details about the resource for which the lock is requested. You can use this DMV only to view information at the current point in time. It does not provide access to historical information about locks. Table 3-2 describes each column in sys.dm_tran_locks.
TABLE 3-2 sys.dm_tran_locks
COLUMN |
DESCRIPTION |
resource_type |
One of the following types of resources: DATABASE, FILE, OBJECT, PAGE, KEY, EXTENT, RID, APPLICATION, METADATA, HOBT, or ALLOCATION_UNIT. |
resource_subtype |
If a resource has a subtype, this column displays it. |
resource_database_id |
The ID of the database containing the resource. |
resource_description |
Additional information, if available, about the resource not found in other resource columns. |
resource_associated_entity_id |
The ID of the entity with which the resource is associated, such as an object ID, HoBT ID, or Allocation Unit ID. |
resource_lock_partition |
The ID of the lock partition for partitioned lock resource. The value is 0 for a non-partitioned lock resource. |
request_mode |
The lock mode requested by waiting requests or granted for other requests. |
request_type |
This value is always LOCK. |
request_status |
One of the following values to reflect the current status of the request: GRANTED, CONVERT, WAIT, LOW_PRIORITY_CONVERT, LOW_PRIORITY_WAIT, or ABORT_BLOCKERS. |
request_reference_count |
The approximate number of times the requestor has requested the resource. |
request_lifetime |
This column is not supported. |
request_session_id |
The ID of the session that owns the request. An orphaned distributed transaction has a value of -2. A deferred recovery transaction has a value of -3. |
request_exec_context_id |
The ID of the execution context for the process that owns the request. |
request_request_id |
The ID of the request for the process that owns the request. This value changes when the active MARS connection for the transaction changes. |
request_owner_type |
The type of entity that owns the request: TRANSACTION, CURSOR, SESSION, SHARED_TRANSACTION_WORKSPACE, EXCLUSIVE_TRANSACTION_WORKSPACE, or NOTIFICATION_OBJECT. |
request_owner_id |
The ID of the owner of the request. |
request_owner_guid |
The GUID of the owner of the request. |
request_owner_lockspace_id |
This column is not supported. |
lock_owner_address |
The memory address of the internal data structure that is tracking the request. Join it with the resource_address column in sys.dm_os_waiting_tasks. |
pdw_node_id |
The ID for node in the Analytics Platform System (formerly known as Parallel Data Warehouse). |
Let’s start some transactions to observe the locks that SQL Server acquires. In one session, execute the following statements:
BEGIN TRANSACTION; SELECT RowId, ColumnText FROM Examples.LockingA WITH (HOLDLOCK, ROWLOCK);
In a separate session, start another transaction:
BEGIN TRANSACTION; UPDATE Examples.LockingA SET ColumnText = 'Row 2 Updated' WHERE RowId = 2;
Now let’s use the sys.dm_tran_locks DMV to view some details about the current locks:
SELECT request_session_id as s_id, resource_type, resource_associated_entity_id, request_status, request_mode FROM sys.dm_tran_locks WHERE resource_database_id = db_id('ExamBook762Ch3');
Although your results might vary, especially with regard to identifiers, the DMV returns results similar to the example below. Notice the wait for the exclusive lock for session 2. It must wait until session 1 releases its shared range (RangeS-S) locks that SQL Server takes due to the HOLDLOCK table hint. This table hint is equivalent to setting the isolation level to SERIALIZABLE. SQL Server also takes intent locks on the table (which appears on the OBJECT rows of the results) and the page, with session 1 taking intent shared (IS) locks and session 2 taking intent exclusive (IX) locks.
s_id resource_type resource_associated_entity_id request_status request_mode ---- ------------- ----------------------------- -------------- -------------- 1 DATABASE 0 GRANT S 2 DATABASE 0 GRANT S 1 PAGE 72057594041729024 GRANT IS 2 PAGE 72057594041729024 GRANT IX 1 KEY 72057594041729024 GRANT RangeS-S 1 KEY 72057594041729024 GRANT RangeS-S 1 KEY 72057594041729024 GRANT RangeS-S 1 KEY 72057594041729024 GRANT RangeS-S 1 KEY 72057594041729024 GRANT RangeS-S 2 KEY 72057594041729024 WAIT X 1 OBJECT 933578364 GRANT IS 2 OBJECT 933578364 GRANT IX
Connect to the ExamBook762Ch3 database containing the resource and use one of the resource_associated_entity_id values from the previous query in the WHERE clause to see which object is locked, like this:
SELECT object_name(object_id) as Resource, object_id, hobt_id FROM sys.partitions WHERE hobt_id=72057594041729024;
When you view the results of this latter query, you can see the name of the resource that is locked, like this:
Resource object_id hobt_id -------- ---------- ------------------- LockingA 933578364 72057594041729024
In the previous example, you can also see the object_id returned from sys.partitions corresponds to the resource_associated_entity_id associated with the OBJECT resource_type in the DMV.
When troubleshooting blocking situations, look for CONVERT in the request_status column in this DMV. This value indicates the request was granted a lock mode earlier, but now needs to upgrade to a different lock mode and is currently blocked.
sys.dm_os_waiting_tasks
Another useful DMV is sys.dm_os_waiting_tasks. Whenever a user asks you why a query is taking longer to run than usual, a review of this DMV should be one of your standard troubleshooting steps. You can find a description of each column in this DMV in Table 3-3.
TABLE 3-3 sys.dm_os_waiting_tasks
COLUMN |
DESCRIPTION |
waiting_task_address |
The address of the waiting task. |
session_id |
The ID of the session that owns the task. |
exec_context_id |
The ID of the execution context of the task. |
wait_duration_ms |
The total wait time for this wait type in milliseconds. This value in- cludes signal_wait_time_ms. |
wait_type |
The type of wait. |
resource_address |
The address of the resource for which the task is waiting. |
blocking_task_address |
The task that is currently holding the requested resource. |
blocking_session_id |
The ID of the session that is blocking the request. This column is NULL if the task is not blocked, -2 if the blocking resource is owned by an orphaned transaction, -3 if the blocking resource is owned by a deferred recovery transaction, and -4 if the session ID of the blocking latch owner cannot be determined due to internal latch state transitions. |
blocking_exec_context_id |
The ID of the execution context of the blocking task. |
resource_description |
The description of the resource consumed. See https://msdn.micro- soft.com/en-us/library/ms188743.aspx for more information. |
pdw_node_id |
The ID for node in the Analytics Platform System (formerly known as Parallel Data Warehouse). |
In particular, you can use the sys.dm_trans_locks DMV in conjunction with the sys.dm_os_waiting_tasks DMV to find blocked sessions, as shown in Listing 3-8.
LISTING 3-8 Use system DMV sys.dm_tran_locks and sys.dm_os_waiting_tasks to display blocked sessions
SELECT t1.resource_type AS res_typ, t1.resource_database_id AS res_dbid, t1.resource_associated_entity_id AS res_entid, t1.request_mode AS mode, t1.request_session_id AS s_id, t2.blocking_session_id AS blocking_s_id FROM sys.dm_tran_locks as t1 INNER JOIN sys.dm_os_waiting_tasks as t2 ON t1.lock_owner_address = t2.resource_address;
Whereas the earlier query showing existing locks is helpful for learning how SQL Server acquires locks, the query in Listing 3-8 returns information that is more useful on a day-to-day basis for uncovering blocking chains. In the query results shown below, you can see that session 2 is blocked by session 1.
res_typ res_dbid res_entid mode s_id blocking_s_id ------- -------- ------------------ ---- ------ ----------------- KEY 27 72057594041729024 X 2 1
Execute the following statement in both sessions to release the locks:
ROLLBACK TRANSACTION;
sys.dm_os_wait_stats
The sys.dm_os_wait_stats DMV is an aggregate view of all waits that occur when a requested resource is not available, a worker thread is idle typically due to background tasks, or an external event must complete first. Table 3-4 explains the columns in sys.dm_os_wait_stats.
TABLE 3-4 sys.dm_os_wait_stats
COLUMN |
DESCRIPTION |
wait_type |
The type of wait. The wait types associated with locks all begin with LCK. |
waiting_tasks_count |
The number of waits having this wait type. The start of a new wait increments this value. |
wait_time_ms |
The total wait time for this wait type in milliseconds. This value includes signal_wait_time_ms. |
max_wait_time_ms |
The highest wait time for this wait type in milliseconds. |
signal_wait_time_ms |
The amount of time in milliseconds between the time the waiting thread was signaled and the time it started running. |
pdw_node_idpdw_node_id |
The ID for node in the Analytics Platform System (formerly known as Parallel Data Warehouse |
There are many wait types unrelated to locks, so when using the sys.dm_os_wait_stats DMV, you should apply a filter to focus on lock waits only, like this:
SELECT wait_type as wait, waiting_tasks_count as wt_cnt, wait_time_ms as wt_ms, max_wait_time_ms as max_wt_ms, signal_wait_time_ms as signal_ms FROM sys.dm_os_wait_stats WHERE wait_type LIKE 'LCK%' ORDER BY wait_time_ms DESC;
The partial results of this query on our computer shown in the following example indicate that our SQL Server instance have the longest waits when threads are waiting for an exclusive (X) lock. On the other hand, the greatest number of waits is a result of waiting for a schema modification (SCH-M) lock. In both cases, the waits are caused because SQL Server has already granted an incompatible lock to the resource on another thread. This information is useful for identifying long-term trends, but does not show you details about the locked resources.
wait wt_cnt wt_ms max_wt_ms signal_ms ------------- ------- -------- ---------- ---------- LCK_M_X 6 1170670 712261 114 LCK_M_S 28 19398 2034 43 LCK_M_SCH_M 449 92 28 46 LCK_M_SCH_S 1 72 72 0
You can reset the cumulative values in the sys.dm_os_wait_stats DMV by executing the following statement: DBCC SQLPERF (N’sys.dm_os_wait_stats’, CLEAR);. Otherwise, these values are reset each time that the SQL Server service restarts.
Identify lock escalation behaviors
Lock escalation occurs when SQL Server detects too much memory, or too many system resources are required for a query’s locks. It then converts one set of locks to another set of locks applied to resources higher in the lock hierarchy. In other words, SQL Server tries to use fewer locks to cover more resources. As an example, SQL Server might choose to escalate a high number of row locks to a table lock. This capability can reduce overhead on the one hand, but can impact performance on the other hand because more data is locked. As a result, there is greater potential for blocking.
Lock escalation occurs when more than 40 percent of the available database engine memory pool is required by lock resources, or at least 5,000 locks are taken in a single T-SQL statement for a single resource. SQL Server converts an intent lock to a full lock, as long as the full lock is compatible with existing locks on the resource. It then releases system resources and locks on the lower level of the lock hierarchy. If the new lock is taken on a row or a page, SQL Server adds an intent lock on the object at the next higher level. However, if other locks prevent lock escalation, SQL Server continues attempting to perform the escalation for each new 1,250 locks it takes.
In most cases, you should let SQL Server manage the locks. If you implement a monitoring system, take note of Lock:Escalation events to establish a benchmark. When the number of Lock:Escalation events exceeds the benchmark, you can take action at the table level or at the query level.
Another option for monitoring lock escalation is to benchmark the percentage of time that intent lock waits (LCK_M_I*) occur relative to regular locks in the sys.dm_os_wait_stats DMV by using a query like this:
SELECT wait_type as wait, wait_time_ms as wt_ms, CONVERT(decimal(9,2), 100.0 * wait_time_ms / SUM(wait_time_ms) OVER ()) as wait_pct FROM sys.dm_os_wait_stats WHERE wait_type LIKE 'LCK%' ORDER BY wait_time_ms DESC;
Capture and analyze deadlock graphs
Usually the process of locking and unlocking SQL Server is fast enough to allow many users to read and write data with the appearance that it occurs simultaneously. However, sometimes two sessions block each other and neither can complete, which is a situation known as deadlocking. Normally, the database engine terminates a thread of a deadlocked transaction with error 1205 and suggests a remedy, such as running the transaction again.
Let’s deliberately create a deadlock between two transactions. Start two sessions and add the following statements to the first session:
BEGIN TRANSACTION; UPDATE Examples.LockingA SET ColumnText = 'Row 1 Updated' WHERE RowId = 1; WAITFOR DELAY '00:00:05'; UPDATE Examples.LockingB; SET ColumnText = 'Row 1 Updated Again' WHERE RowId = 1;
Next, in the second session, add the following statements:
BEGIN TRANSACTION; UPDATE Examples.LockingB SET ColumnText = 'Row 1 Updated' WHERE RowId = 1; WAITFOR DELAY '00:00:05'; UPDATE Examples.LockingA; SET ColumnText = 'Row 1 Updated Again' WHERE RowId = 1;
Now execute the statements in the first session, and then, within five seconds, execute the second session’s statements. Only one of the transaction completes and the other was terminated with a rollback by SQL Server as shown by the following message:
Msg 1205, Level 13, State 51, Line 6 Transaction (Process ID 70) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
In this example, both transactions need the same table resources. Both transactions can successfully update a row without conflict and have an exclusive lock on the updated data. Then they each try to update data in the table that the other transaction had updated, but each transaction is blocked while waiting for the other transaction’s exclusive lock to be released. Neither transaction can ever complete and release its lock, thereby causing a deadlock. When SQL Server recognizes this condition, it terminates one of the transactions and rolls it back. It usually chooses the transaction that is least expensive to rollback based on the number of transaction log records. At that point, the aborted transaction’s locks are released and the remaining open transaction can continue.
Of course, deadlocks are not typically going to happen while you watch, so how can you know when and why they occur? You can use either SQL Server Profiler or Extended Events to capture a deadlock graph, an XML description of a deadlock.
SQL Server Profiler deadlock graph
If you use SQL Server Profiler to capture a deadlock graph, you must configure the trace before deadlocks occur. Start by creating a new trace, and connect to your SQL Server instance. In the Trace Properties dialog box, select the Events Selection tab, select the Show All Events check box, expand Locks, and then select the following events:
Deadlock graph
Lock:Deadlock
Lock:Deadlock Chain
On the Events Extraction Settings tab, select the Save Deadlock XML Events Separately option, navigate to a directory into which SQL Server Profiler saves deadlock graphs, and supply a name for the graph. You can choose whether to save all deadlock graphs in a single .xdl file or save multiple deadlock graphs as a separate .xdl file.
Now set up the deadlock scenario again to generate the deadlock graph. In one session, add the following statements:
BEGIN TRANSACTION; UPDATE Examples.LockingA SET ColumnText = 'Row 2 Updated' WHERE RowId = 2; WAITFOR DELAY '00:00:05'; UPDATE Examples.LockingB SET ColumnText = 'Row 2 Updated Again' WHERE RowId = 2;
Next, in the second session, add the following statements:
BEGIN TRANSACTION; UPDATE Examples.LockingB SET ColumnText = 'Row 2 Updated' WHERE RowId = 2; WAITFOR DELAY '00:00:05'; UPDATE Examples.LockingA SET ColumnText = 'Row 2 Updated Again' WHERE RowId = 2;
When a deadlock occurs, you can see the deadlock graph as an event in SQL Server Profiler, as shown in Figure 3-1. In the deadlock graph, you see the tables and queries involved in the deadlock, which process was terminated, and which locks led to the deadlock. The ovals at each end of the deadlock graph contain information about the processes running the deadlocked queries. The terminated process displays in the graph with an x superimposed on it. Hover your mouse over the process to view the statement associated with it. The rectangles labeled Key Lock identify the database object and index associated with the locking. Lines in the deadlock graph show the relationship between processes and database objects. A request relationship displays when a process waits for a resource while an owner relationship displays when a resource waits for a process.
FIGURE 3-1 A deadlock graph
Extended Events deadlock graph
In Extended Events, you can use the continuously running system_health session to discover past deadlocks. As an alternative, you can set up a new session dedicated to capturing deadlock information. The system_health session automatically captures detected deadlocks without requiring special configuration. That means you can analyze a deadlock after it has occurred.
To find deadlock information in the Extended Events viewer, open SQL Server Management Studio, connect to the database engine, expand the Management node in Object Explorer, expand the Extended Events node, expand the Sessions node, and then expand the System_health node. Right-click Package0.event_file, and select View Target Data. In the Extended Events toolbar, click the Filters button. In the Filters dialog box, select Name in the Field drop-down list, type xml_deadlock_report in the Value text box, as shown in Figure 3-2, and then click OK. Select Xml_deadlock_report in the filtered list of events, and then click the Deadlock tab below it to view the deadlock graph.
FIGURE 3-2 An Extended Events filter for xml_deadlock_report
Identify ways to remediate deadlocks
Deadlocks are less likely to occur if transactions can release resources as quickly as possible. You can also lock up additional resources to avoid contention between multiple transactions. For example, you can use a hint to lock a table although this action can also cause blocking.
Usually the best way to resolve a deadlock is to rerun the transaction. For this reason, you should enclose a transaction in a TRY/CATCH block and add retry logic. Let’s revise the previous example to prevent the deadlock. Start two new sessions and add the statements in Listing 3-9 to both sessions.
LISTING 3-9 Add retry logic to avoid deadlock
DECLARE @Tries tinyint SET @Tries = 1 WHILE @Tries <= 3 BEGIN BEGIN TRANSACTION BEGIN TRY UPDATE Examples.LockingB SET ColumnText = 'Row 3 Updated' WHERE RowId = 3; WAITFOR DELAY '00:00:05'; UPDATE Examples.LockingA SET ColumnText = 'Row 3 Updated Again' WHERE RowId = 3; COMMIT TRANSACTION; END TRY BEGIN CATCH SELECT ERROR_NUMBER() AS ErrorNumber; ROLLBACK TRANSACTION; SET @Tries = @Tries + 1; CONTINUE; END CATCH END
Next, execute each session. This time the deadlock occurs again, but the CATCH block captured the deadlock. SQL Server does not automatically roll back the transaction when you use this method, so you should include a ROLLBACK TRANSACTION in the CATCH block. The @@TRANCOUNT variable resets to zero in both transactions. As a result, SQL Server no longer cancels one of the transactions and you can also see the error number generated for the deadlock victim:
ErrorNumber ------------- 1205
Re-execution of the transaction might not be possible if the cause of the deadlock is still locking resources. To handle those situations, you could need to consider the following methods as alternatives for resolving deadlocks.
Use SNAPSHOT or READ_COMMITTED_SNAPSHOT isolation levels. Either of these options avoid most blocking problems without the risk of dirty reads. However, both of these options require plenty of space in tempdb.
Use the NOLOCK query hint if one of the transactions is a SELECT statement, but only use this method if the trade-off of a deadlock for dirty reads is acceptable.
Add a new covering nonclustered index to provide another way for SQL Server to read data without requiring access to the underlying table. This approach works only if the other transaction participating in the deadlock does not use any of the covering index keys. The trade-off is the additional overhead required to maintain the index.
Proactively prevent a transaction from locking a resource that eventually gets locked by another transaction by using the HOLDLOCK or UPDLOCK query hints.