Microsoft SQL Server 2008 Internals: Transactions and Concurrency
Row-Level Locking vs. Page-Level Locking
Although SQL Server 2008 fully supports row-level locking, in some situations, the lock manager decides not to lock individual rows and instead locks pages or the whole table. In other cases, many smaller locks are escalated to a table lock, as I discuss in the upcoming section entitled “Lock Escalation.”
Prior to SQL Server 7.0, the smallest unit of data that SQL Server could lock was a page. Even though many people argued that this was unacceptable and that it was impossible to maintain good concurrency while locking entire pages, many large and powerful applications were written and deployed using only page-level locking. If they were well designed and tuned, concurrency was not an issue, and some of these applications supported hundreds of active user connections with acceptable response times and throughput. However, when the page size changed from 2 KB to 8 KB in SQL Server 7.0, the issue became more critical: locking an entire page now meant locking four times as much data as in previous versions. Beginning with SQL Server 7.0, SQL Server implements full row-level locking, so any potential problems due to lower concurrency with the larger page size should not be an issue. However, locking isn’t free. Resources are required to manage locks. Recall that a lock is an in-memory structure of 64 or 128 bytes (on 32-bit or 64-bit machines, respectively), with another 32 or 64 bytes for each process holding or requesting the lock. If you need a lock for every row and you scan a million rows, you need more than 64 MB of RAM just to hold the locks for that one process.
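As a quick sanity check on that arithmetic (a sketch only, using the structure sizes just quoted), you can compute the estimate directly:

-- Rough lock-memory arithmetic for scanning one million rows, using the
-- sizes quoted above: a 64-byte (32-bit) or 128-byte (64-bit) lock
-- structure, plus a 32- or 64-byte block for the one process holding it.
SELECT CAST(1000000 AS bigint) * (64 + 32) / (1024 * 1024)  AS EstimatedMB_32bit,
       CAST(1000000 AS bigint) * (128 + 64) / (1024 * 1024) AS EstimatedMB_64bit;
-- Returns roughly 91 MB and 183 MB, consistent with "more than 64 MB."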
Beyond memory consumption, locking is a fairly processing-intensive operation that requires substantial bookkeeping. Recall that, internally, SQL Server uses a lightweight mutex called a spinlock to guard resources, and it uses latches (also lighter than full-blown locks) to protect non-leaf index pages. These performance optimizations avoid the overhead of full locking. If a page contains 50 rows of data, all of which are used, it is obviously more efficient to issue and manage one lock on the page than to manage 50 separate row locks. That’s the obvious benefit of page locking: a reduction in the number of lock structures that must exist and be managed.
Let’s say two processes each need to update a few rows of data, and even though the rows are not the same ones, some of them happen to exist on the same page. With page-level locking, one process would have to wait until the page locks of the other process were released. If you use row-level locking instead, the other process does not have to wait. The finer granularity of the locks means that no conflict occurs in the first place because each process is concerned with different rows. That’s the obvious benefit of row-level locking. Which of these obvious benefits wins? Well, the decision isn’t clear-cut, and it depends on the application and the data. Each type of locking can be shown to be superior for different types of applications and usage.
The ALTER INDEX statement lets you manually control the unit of locking within an index with options to disallow page locks or row locks within an index. Because these options are available only for indexes, there is no way to control the locking within the data pages of a heap. (But remember that if a table has a clustered index, the data pages are part of the index and are affected by a value set with ALTER INDEX.) The index options are set for each table or index individually. Two options, ALLOW_ROW_LOCKS and ALLOW_PAGE_LOCKS, are both set to ON initially for every table and index. If both of these options are set to OFF for a table, only full table locks are allowed.
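For example, here is a minimal sketch against the Production.Product table in AdventureWorks2008, using the ALL keyword so that no individual index names are needed:

-- Disallow page locks on every index of the table; SQL Server must then
-- use row locks or a full table lock.
ALTER INDEX ALL ON Production.Product SET (ALLOW_PAGE_LOCKS = OFF);
-- Restore the default behavior.
ALTER INDEX ALL ON Production.Product SET (ALLOW_PAGE_LOCKS = ON);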
As mentioned earlier, during the optimization process, SQL Server determines whether to lock rows, pages, or the entire table initially. The locking of rows (or keys) is heavily favored. The type of locking chosen is based on the number of rows and pages to be scanned, the number of rows on a page, the isolation level in effect, the update activity going on, the number of users on the system needing memory for their own purposes, and so on.
Lock Escalation
SQL Server automatically escalates row, key, or page locks to coarser table or partition locks as appropriate. This escalation protects system resources by preventing the system from using too much memory to keep track of locks, and it increases efficiency. For example, after a query acquires many row locks, the lock level can be escalated because it probably makes more sense to acquire and hold a single lock than many individual row locks. When lock escalation occurs, the many locks on smaller units (rows or pages) are released and replaced by one lock on a larger unit, which reduces locking overhead. Because a finite amount of memory is available for lock structures, escalation is sometimes necessary to keep the memory used for locks within reasonable limits.
The default in SQL Server is to escalate to table locks. However, SQL Server 2008 introduces the ability to escalate to a single partition using the ALTER TABLE statement. The LOCK_ESCALATION option of ALTER TABLE can be set to TABLE (the default, meaning escalation is always to the table level), AUTO (escalation can be to either the table or the partition level), or DISABLE (escalation is prevented in most cases). Here’s an example of altering the TransactionHistory table (which you may have created if you ran the partitioning example in Chapter 7) so that locks can be escalated to either the table or partition level:
ALTER TABLE TransactionHistory SET (LOCK_ESCALATION = AUTO);
Lock escalation occurs in the following situations:
- The number of locks held by a single statement on one object, or on one partition of one object, exceeds a threshold. Currently that threshold is 5,000 locks, but it might change in future service packs. The lock escalation does not occur if the locks are spread over multiple objects in the same statement (for example, 3,000 locks in one index and 3,000 in another).
- Memory taken by lock resources exceeds 40 percent of the non-AWE (32-bit) or regular (64-bit) memory pool while the locks configuration option is set to 0. (In this case, lock memory is allocated dynamically as needed, so the 40 percent value is not a constant.) If the locks option is set to a nonzero value, memory for locks is statically reserved when SQL Server starts, and escalation occurs when SQL Server is using more than 40 percent of that reserved lock memory for lock resources.
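If you want to see how a running workload relates to these thresholds, here is a sketch using two dynamic management views available in SQL Server 2008; the memory clerk name used for the lock manager is an assumption on my part:

-- Count the locks each session currently holds or is waiting for.
SELECT request_session_id, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
GROUP BY request_session_id
ORDER BY lock_count DESC;

-- Memory consumed by the lock manager (clerk name assumed).
SELECT type, SUM(single_pages_kb + multi_pages_kb) AS lock_memory_kb
FROM sys.dm_os_memory_clerks
WHERE type = 'OBJECTSTORE_LOCK_MANAGER'
GROUP BY type;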
When the lock escalation is triggered, the attempt might fail if there are conflicting locks. So, for example, if an X lock on a RID needs to be escalated and there are concurrent X locks on the same table or partition held by a different process, the lock escalation attempt fails. However, SQL Server continues to attempt to escalate the lock every time the transaction acquires another 1,250 locks on the same object. If the lock escalation succeeds, SQL Server releases all the row and page locks on the index or the heap.
Controlling Lock Escalation
Lock escalation can potentially block future concurrent access to the index or the heap by other transactions that need row or page locks on the object, and SQL Server cannot de-escalate a lock once it has been escalated, even as new requests arrive. So lock escalation is not always desirable for all applications.
SQL Server 2008 also supports disabling lock escalation for a single table using the ALTER TABLE statement. Here is an example of disabling lock escalation on the TransactionHistory table:
ALTER TABLE TransactionHistory SET (LOCK_ESCALATION = DISABLE);
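You can confirm the setting afterward through the lock_escalation_desc column that SQL Server 2008 adds to sys.tables:

SELECT name, lock_escalation_desc
FROM sys.tables
WHERE name = 'TransactionHistory';
-- Returns DISABLE here; the other possible values are TABLE (the default) and AUTO.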
SQL Server 2008 also supports disabling lock escalation using trace flags. Note that these trace flags affect lock escalation on all tables in all databases in a SQL Server instance.
Trace flag 1211 completely disables lock escalation. It instructs SQL Server to ignore the memory acquired by the lock manager, up to the maximum statically allocated lock memory (specified using the locks configuration option) or 60 percent of the non-AWE (32-bit) or regular (64-bit) dynamically allocated memory. Beyond that point, an out-of-lock-memory error is generated. Exercise extreme caution when using this trace flag, because a poorly designed application can exhaust lock memory and seriously degrade the performance of your SQL Server instance.
Trace flag 1224 also disables lock escalation based on the number of locks acquired, but it still allows escalation based on memory consumption: escalation is enabled when the lock manager acquires 40 percent of the statically allocated memory (as set by the locks option) or 40 percent of the non-AWE (32-bit) or regular (64-bit) dynamically allocated memory. Note that if SQL Server cannot allocate memory for locks because of memory use by other components, lock escalation can be triggered earlier. As with trace flag 1211, SQL Server generates an out-of-memory error when the memory allocated to the lock manager exceeds the total statically allocated memory, or 60 percent of the non-AWE (32-bit) or regular (64-bit) memory for dynamic allocation.
If both trace flags (1211 and 1224) are set at the same time, trace flag 1211 takes precedence. Remember that these trace flags affect the entire SQL Server instance. In many cases, it is desirable to control the escalation threshold at the object level, so you should consider using the ALTER TABLE command when possible.
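Trace flags are set with standard DBCC commands. For example, the following sketch enables trace flag 1224 for the whole instance (the -1 argument makes it global), verifies it, and turns it off again:

DBCC TRACEON (1224, -1);   -- enable instance-wide
DBCC TRACESTATUS (1224);   -- verify: Status = 1, Global = 1
DBCC TRACEOFF (1224, -1);  -- disable again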
Deadlocks
A deadlock occurs when two processes are each waiting for a resource that the other holds, so neither process can advance. A true deadlock is a Catch-22 in which, without intervention, neither process can ever make progress. When a deadlock occurs, SQL Server intervenes automatically. Here I refer mainly to deadlocks arising from conflicting locks, although deadlocks can also be detected on worker threads, memory, and parallel query resources.
In SQL Server, two main types of deadlocks can occur: a cycle deadlock and a conversion deadlock. Figure 10-5 shows an example of a cycle deadlock. Process A starts a transaction, acquires an exclusive table lock on the Product table, and requests an exclusive table lock on the PurchaseOrderDetail table. Simultaneously, process B starts a transaction, acquires an exclusive lock on the PurchaseOrderDetail table, and requests an exclusive lock on the Product table. The two processes become deadlocked—caught in a “deadly embrace.” Each process holds a resource needed by the other process. Neither can progress, and, without intervention, both would be stuck in deadlock forever. You can actually generate the deadlock in SQL Server Management Studio, as follows:
Open a query window, and change your database context to the AdventureWorks2008 database. Execute the following batch for process A:
BEGIN TRAN
UPDATE Production.Product
SET ListPrice = ListPrice * 0.9
WHERE ProductID = 922;
Open a second window, and execute this batch for process B:
BEGIN TRAN
UPDATE Purchasing.PurchaseOrderDetail
SET OrderQty = OrderQty + 200
WHERE ProductID = 922
  AND PurchaseOrderID = 499;
Go back to the first window, and execute this UPDATE statement:
UPDATE Purchasing.PurchaseOrderDetail
SET OrderQty = OrderQty - 200
WHERE ProductID = 922
  AND PurchaseOrderID = 499;
At this point, the query should block. It is not deadlocked yet, however. It is waiting for a lock on the PurchaseOrderDetail table, and there is no reason to suspect that it won’t eventually get that lock.
Go back to the second window, and execute this UPDATE statement:
UPDATE Production.Product
SET ListPrice = ListPrice * 1.1
WHERE ProductID = 922;
Figure 10-5. A cycle deadlock resulting from two processes, each holding a resource needed by the other
At this point, a deadlock occurs. The first connection never gets its requested lock on the PurchaseOrderDetail table because the second connection does not give it up until it gets a lock on the Product table. Because the first connection already has the lock on the Product table, we have a deadlock. One of the processes receives the following error message. (Of course, the actual process ID reported will probably be different.)
Msg 1205, Level 13, State 51, Line 1
Transaction (Process ID 57) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Figure 10-6 shows an example of a conversion deadlock. Process A and process B each hold a shared lock on the same page within a transaction. Each process wants to promote its shared lock to an exclusive lock but cannot do so because of the other process’s lock. Again, intervention is required.
Figure 10-6. A conversion deadlock resulting from two processes wanting to promote their locks on the same resource within a transaction
SQL Server automatically detects deadlocks and intervenes through the lock manager, which provides deadlock detection for regular locks. In SQL Server 2008, deadlocks can also involve resources other than locks. For example, if process A is holding a lock on Table1 and is waiting for memory to become available and process B has some memory that it can’t release until it acquires a lock on Table1, the processes deadlock. When SQL Server detects a deadlock, it terminates one process’s batch, rolling back the active transaction and releasing all that process’s locks to resolve the deadlock. In addition to deadlocks on lock resources and memory resources, deadlocks can also occur with resources involving worker threads, parallel query execution–related resources, and MARS resources. Latches are not involved in deadlock detection because SQL Server uses deadlock-proof algorithms when it acquires latches.
In SQL Server, a separate thread called LOCK_MONITOR checks the system for deadlocks every five seconds. As deadlocks occur, the deadlock detection interval is reduced and can go as low as 100 milliseconds. In fact, the first few lock requests that cannot be satisfied after a deadlock has been detected will immediately trigger a deadlock search rather than wait for the next deadlock detection interval. If the deadlock frequency declines, the interval can go back to every five seconds.
This LOCK_MONITOR thread checks for deadlocks by inspecting the list of waiting locks for any cycles, which indicate a circular relationship between processes holding locks and processes waiting for locks. SQL Server attempts to choose as the victim the process that would be least expensive to roll back, considering the amount of work the process has already done. That process is killed and error message 1205 is sent to the corresponding client connection. The transaction is rolled back, meaning all its locks are released, so other processes involved in the deadlock can proceed. However, certain operations are marked as golden, or unkillable, and cannot be chosen as the deadlock victim. For example, a process involved in rolling back a transaction cannot be chosen as a deadlock victim because the changes being rolled back could be left in an indeterminate state, causing data corruption.
Using the SET DEADLOCK_PRIORITY statement, a process can determine its priority for being chosen as the victim if it is involved in a deadlock. There are 21 different priority levels, from –10 to 10. You can also specify the value LOW (equivalent to –5), NORMAL (equivalent to 0), or HIGH (equivalent to 5). Which session is chosen as the deadlock victim depends on each session’s deadlock priority: if the sessions have different deadlock priorities, the session with the lowest deadlock priority is chosen as the deadlock victim; if both sessions have the same deadlock priority, SQL Server selects as the victim the session that is less expensive to roll back.
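For example, a session doing easily repeatable work can volunteer itself as the victim:

SET DEADLOCK_PRIORITY LOW;    -- named value, equivalent to -5
SET DEADLOCK_PRIORITY -10;    -- any integer from -10 through 10 is allowed
SET DEADLOCK_PRIORITY NORMAL; -- back to the default of 0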
In the example in Figure 10-5, the cycle deadlock could have been avoided if the processes had decided on a protocol beforehand—for example, if they had decided always to access the Product table first and the PurchaseOrderDetail table second. Then one of the processes gets the initial exclusive lock on the table being accessed first, and the other process waits for the lock to be released. One process waiting for a lock is normal and natural. Remember, waiting is not a deadlock.
You should always try to have a standard protocol for the order in which processes access tables. If you know that the processes might need to update the row after reading it, they should initially request an update lock, not a shared lock. If both processes request an update lock rather than a shared lock, the process that is granted an update lock is assured that the lock can later be promoted to an exclusive lock. The other process requesting an update lock has to wait. The use of an update lock serializes the requests for an exclusive lock. Other processes needing only to read the data can still get their shared locks and read. Because the holder of the update lock is guaranteed an exclusive lock, the deadlock is avoided.
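Here is a minimal sketch of that read-then-update pattern, using the UPDLOCK table hint on the initial SELECT so that the read acquires an update lock rather than a shared lock:

BEGIN TRAN
-- The UPDLOCK hint requests update locks for the read, serializing
-- would-be writers while still allowing plain readers.
SELECT ListPrice
FROM Production.Product WITH (UPDLOCK)
WHERE ProductID = 922;

UPDATE Production.Product
SET ListPrice = ListPrice * 1.1
WHERE ProductID = 922;
COMMIT TRAN;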
In many systems, deadlocks cannot be completely avoided, but if the application handles them appropriately, the impact on the users involved, and on the rest of the system, should be minimal. (Appropriate handling implies that when error 1205 occurs, the application resubmits the batch, which most likely succeeds on the second try. After one process is killed, its transaction is aborted, and its locks are released, so the other process involved in the deadlock can finish its work and release its locks; the environment is then not conducive to another deadlock.) Although you might not be able to avoid deadlocks completely, you can minimize their occurrence. For example, write your applications so that processes hold locks for a minimal amount of time; that way, other processes won’t have to wait long for locks to be released. Although you don’t usually invoke locking directly, you can influence locking by keeping transactions as short as possible. For example, don’t ask for user input in the middle of a transaction. Instead, get the input first and then quickly perform the transaction.
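As a sketch of that resubmit logic in Transact-SQL, using TRY/CATCH and the Production.Product row from the earlier exercise (real applications usually retry from client code, and the retry count of 3 here is an arbitrary choice):

DECLARE @retries int = 3;
WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRAN;
        UPDATE Production.Product
        SET ListPrice = ListPrice * 1.1
        WHERE ProductID = 922;
        COMMIT TRAN;
        BREAK;  -- success, so stop retrying
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRAN;
        IF ERROR_NUMBER() = 1205 AND @retries > 1
            SET @retries = @retries - 1;  -- chosen as the deadlock victim; try again
        ELSE
        BEGIN
            -- Not a deadlock, or out of retries: surface an error instead.
            RAISERROR('Update failed after deadlock retries.', 16, 1);
            BREAK;
        END
    END CATCH
END;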