Tuesday, October 12, 2010

Consistency Problems

A transaction must exhibit the ACID properties, but apart from the ACID properties it can exhibit other behaviours. These behaviours are called consistency problems. They are DIRTY READS, NON-REPEATABLE READS, PHANTOMS and LOST UPDATES.
An application should always avoid LOST UPDATES; the other problems can be ignored or not depending on the application.
You can allow or disallow these behaviours by using isolation levels.
LOST UPDATES :
A lost update occurs when two processes read the same data and both manipulate and update it at the same time. The second process might overwrite the first update completely.
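A lost update can be sketched in T-SQL as two sessions running the same read-then-write pattern; the table and column names below are hypothetical:

```sql
-- Hypothetical table: Accounts(AccountId INT, Balance MONEY)
-- Sessions 1 and 2 both run this read-then-write pattern:
DECLARE @bal MONEY;
SELECT @bal = Balance FROM Accounts WHERE AccountId = 1;      -- both read 100
UPDATE Accounts SET Balance = @bal - 10 WHERE AccountId = 1;  -- last writer wins

-- One way to avoid the lost update: hold an update lock across the read
BEGIN TRAN;
SELECT @bal = Balance FROM Accounts WITH (UPDLOCK) WHERE AccountId = 1;
UPDATE Accounts SET Balance = @bal - 10 WHERE AccountId = 1;
COMMIT TRAN;
```

The UPDLOCK hint makes the second session wait at its SELECT until the first session commits, so neither update is silently overwritten.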
DIRTY READS :
A dirty read occurs when one process reads uncommitted data. If one process has changed data but not yet committed it, a second process reading that data sees it in an inconsistent state.
By default, dirty reads are not allowed.
By using the isolation level READ UNCOMMITTED you can allow dirty reads.
NON REPEATABLE READS :
A non-repeatable read occurs when a process gets different values when reading the same data in two separate reads within the same transaction. This is also called inconsistent analysis.
Phantoms :
A phantom can happen only when a query with a range predicate such as WHERE col <> 10 is involved.
Two SELECT operations using the same predicate in the same transaction return a different number of rows.

The behaviour of your transaction depends on which isolation level is in effect.
You can set the isolation level by using SET TRANSACTION ISOLATION LEVEL (isolation level name).
Your concurrency model determines how the isolation level is implemented.
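For example, switching a session to SERIALIZABLE (any of the level names can be substituted; the table name is hypothetical):

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
    -- Reads here take locks that are held until the transaction ends
    SELECT * FROM dbo.SomeTable;
COMMIT TRAN;
```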

Concurrency and Types

What is Concurrency ?
It can be defined as multiple processes accessing or changing the same data at the same time.
Concurrency problems can be controlled by using isolation levels.

For details about isolation levels, see my post:
http://ravisql2k5.blogspot.com/2010/10/isolation-levels.html

There are two concurrency models in SQL Server 2005 and 2008:
Pessimistic and Optimistic
The pessimistic concurrency model was the only one available before SQL Server 2005; the optimistic model was introduced in SQL Server 2005.
Pessimistic : It is the default behaviour of SQL Server to acquire locks and to block access to data that another process is using. In other words, readers block writers and writers block readers in a pessimistic concurrency environment.
The pessimistic concurrency model supports four isolation levels:
Read Committed, Read Uncommitted, Repeatable Read, and Serializable
Optimistic Concurrency Model :
This model was newly introduced in SQL Server 2005 and is also supported in 2008. The default behaviour of this model is to use row versioning.
Here, writers do not block readers and readers do not block writers, but writers can still block writers.
The optimistic model supports two isolation levels:
Read Committed Snapshot and Snapshot Isolation.
Both use row versioning.
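Both row-versioning options are enabled at the database level; a minimal sketch, assuming a hypothetical database named MyDb:

```sql
-- Changes the behaviour of the default READ COMMITTED level to use row versioning
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;
-- Permits sessions to run SET TRANSACTION ISOLATION LEVEL SNAPSHOT
ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
```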
To see the details of isolation levels, check my post below:
http://ravisql2k5.blogspot.com/2010/10/isolation-levels.html

Monday, October 11, 2010

Isolation Levels

Isolation Levels :
What are isolation levels?
They control the behaviour of your read operations, such as:
  • Whether locks are taken when data is read, and what type of locks are requested.
  • How long the read locks are held.
Types of Isolation Levels
SQL Server supports four traditional isolation levels:
READ UNCOMMITTED
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
SQL Server 2005 introduced two row-versioning-based options:
READ COMMITTED SNAPSHOT and SNAPSHOT ISOLATION. Both use row versioning.

The lowest isolation level is READ UNCOMMITTED.
READ UNCOMMITTED :
In this isolation level your read operations do not take any locks, and because SQL Server isn't trying to acquire locks, it won't be blocked by conflicting locks acquired by other processes. Your process is able to read data that another process has modified but not yet committed.
The READ UNCOMMITTED isolation level does not give transactional consistency because locks are simply ignored.
At this level DIRTY READS, PHANTOMS and NON-REPEATABLE READS can occur.
Here is an example to understand the READ UNCOMMITTED isolation level.
STEP 1:
Here I am updating a table but not yet committing.
Open a query window by pressing CTRL+N in Management Studio and run the script below:
use pepsi
begin tran
select * from clustered_dupes
update clustered_dupes set col4 = 'ravi' where col1 = 1

The table is now updated inside an open transaction. To see the effect:
STEP 2 :
Open one more query window by pressing CTRL+N in Management Studio and run the script below:
Use Pepsi
Begin Tran
Select * from dbo.clustered_dupes
If you run the above script, the process will be blocked, and it is blocked by the STEP 1 process, because SQL Server tries to put a shared lock on the table but an exclusive lock is already held by the STEP 1 process. The data is modified but still not committed, and until COMMIT TRAN is run in STEP 1, SQL Server won't release the lock.

Now open another window and run sp_who2; you will see there is a process which is blocked by the STEP 1 process.
Until you commit or roll back the transaction in the STEP 1 process, the STEP 2 process won't complete; otherwise you need to kill the STEP 2 process.
This is the default behaviour of SQL Server when the isolation level is READ COMMITTED.
Now we change the isolation level by using
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Issue the same script as the STEP 2 process:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
use pepsi
begin tran
select * from clustered_dupes
Now you can read the uncommitted data of the STEP 1 process without being blocked by it.

I think this example clearly illustrates the use of the READ UNCOMMITTED isolation level.

READ COMMITTED :   
This is the default isolation level in SQL Server 2005. 
It ensures that an operation never reads data that another application has changed but not yet committed. At this level dirty reads cannot occur, but phantoms and non-repeatable reads can.
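You can verify the current isolation level of your session with DBCC USEROPTIONS:

```sql
DBCC USEROPTIONS;
-- The "isolation level" row of the result set shows the session's
-- current level, e.g. "read committed" for a default session.
```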
REPEATABLE READ : 
When this option is set, as data is read, locks are placed and held on the data for the duration of the transaction. These locks prevent other transactions from modifying the data you have read. This isolation level is more restrictive than READ COMMITTED and READ UNCOMMITTED, and can block other transactions. It prevents non-repeatable reads but does not prevent phantom rows.
Here is an example of REPEATABLE READ.
STEP 1:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Use pepsi
Begin Tran
Select * from clustered_dupes
After running the above script, shared locks are placed on the table clustered_dupes and held until the transaction ends. Other transactions cannot update the rows you have read, but they can still read the data.
STEP 2:
Open one more query window by pressing CTRL+N in Management Studio and run the script below:
Begin tran
use  pepsi
update clustered_dupes set col4 = 'ravi' where col1 = 1


It won't allow you to update the table until the STEP 1 process commits; the STEP 2 process is blocked by STEP 1.
STEP 3 :
Open one more query window by pressing CTRL+N in Management Studio and run the script below:
Begin tran
Use pepsi
Select * from clustered_dupes
Now you can still read the data, because shared locks are compatible with each other.


SERIALIZABLE :
This option is the most restrictive isolation level because it places key-range locks on the data. This prevents any modifications to the data being read until the end of the transaction. It also avoids phantom reads by preventing rows from being added to or removed from the data range set.
In addition, enforcing the SERIALIZABLE isolation level requires that you not only lock data that has been read, but also lock data that does not exist!
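A sketch of SERIALIZABLE preventing phantoms, using a hypothetical Orders table:

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRAN;
    -- Takes key-range locks covering the qualifying range
    SELECT COUNT(*) FROM Orders WHERE OrderDate = '2010-10-01';
    -- Another session's INSERT into this date range is now blocked,
    -- so repeating the COUNT(*) returns the same number of rows.
    SELECT COUNT(*) FROM Orders WHERE OrderDate = '2010-10-01';
COMMIT TRAN;
```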
READ COMMITTED SNAPSHOT : 
When this option is set, SQL Server uses row versioning instead of locking.
It can be enabled by using the database option
ALTER DATABASE [database_name] SET READ_COMMITTED_SNAPSHOT ON
After enabling this option, you can examine the column is_read_committed_snapshot_on in the sys.databases catalog view.
At this level dirty reads cannot occur, but phantoms and non-repeatable reads can.
This option cannot be set on the master, msdb and tempdb system databases.
When you are enabling the READ_COMMITTED_SNAPSHOT option for a database, please ensure that there are no active transactions in that database.
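One way to enable the option and confirm it took effect (the database name MyDb is hypothetical; the WITH ROLLBACK IMMEDIATE termination clause rolls back any active transactions so the change does not wait):

```sql
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

SELECT name, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'MyDb';   -- the column shows 1 when the option is enabled
```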
When this option is enabled for a database, the Database Engine maintains versions of each row that is modified. Whenever a transaction modifies a row, an image of the row before the modification is copied into a page in the version store. The version store is a collection of data pages in tempdb. If multiple transactions modify a row, multiple versions of the row are linked in a version chain. Read operations using row versioning retrieve the last version of each row that had been committed when the transaction or statement started.
 For transactions using an isolation level based on row versioning, read operations do not request shared locks on the data. This means readers using row versioning do not block other readers or writers accessing the same data. Similarly, the writers do not block readers. However, writers will block each other even while running under row versioning-based isolation levels. Two write operations cannot modify the same data at the same time.

SNAPSHOT ISOLATION :
Snapshot isolation specifies that data read by any statement will only see data modifications that were committed before the start of the transaction. The effect is as if the statements in a transaction see a snapshot of the committed data as it existed at the start of the transaction. The ALLOW_SNAPSHOT_ISOLATION database option must be set to ON for a transaction to specify the SNAPSHOT isolation level.

It makes a snapshot copy of the data being updated. During the update, readers can continue to read the original data. When the update is committed, it’s written back over the original data.
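A sketch of using SNAPSHOT isolation, assuming ALLOW_SNAPSHOT_ISOLATION has already been set ON for the database:

```sql
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
    -- All reads in this transaction see the committed data
    -- as it existed when the transaction started.
    SELECT * FROM dbo.clustered_dupes;
    -- Changes committed by other sessions in the meantime
    -- are NOT visible to this second read.
    SELECT * FROM dbo.clustered_dupes;
COMMIT TRAN;
```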

Monday, October 4, 2010

Index Internals Part 2

SQL Server uses four basic operations on an index.
It can scan, seek, lookup and update the index.
Scans :
Scans are divided into index scans, clustered index scans and table scans.
Index Scans : Reading all the leaf pages of a non-clustered index, using the next and previous page pointers to navigate. Because non-clustered indexes generally have fewer pages in the leaf level than a clustered index, this operation is usually cheaper than a clustered index scan.
Table Scans : This operation occurs on a heap table (a table without a clustered index). The first page of the heap is located based on info in the system tables, and then the pages are read one by one, using the next and previous pointers in the page headers. This is generally an expensive operation and should be avoided wherever possible.
Clustered Index Scans : Basically the same operation as a table scan, just on a table that has a clustered index. This operation reads the leaf pages of the clustered index, using the next and previous page pointers in the page headers. Like the table scan, this can be an expensive operation and should, wherever possible, be avoided.

Note : For any type of scan, SQL Server does not use the index b-tree structure to locate data; it directly reads all the leaf pages.

Seeks :
Seeks are of two types: clustered index seeks and non-clustered index seeks.
Clustered Index Seek : This operation uses the clustered index's b-tree structure. The seek starts at the root of the tree and navigates down the levels of the index until it reaches the leaf page(s) with the desired data.
Non-Clustered Index Seek : Much the same as the clustered index seek, just using a non-clustered index.

LOOKUPS :
Lookups are of two types: key lookups and RID lookups.
Key Lookup : It indicates that the non-clustered index that was used to locate the rows affected by the query did not have all the columns required by the query. The missing columns must be looked up from the clustered index.
A key lookup can be a bottleneck because for each row returned by the index seek, the key lookup does a clustered index seek returning a single row. For a small number of rows that's not a problem; for hundreds of rows it can be a major bottleneck in the query.
RID Lookup : This lookup occurs when the seek is performed against a heap. In this case there is no clustered index to fetch the rows from, so SQL Server has to use the row identifier (RID).
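A common way to remove a key lookup is to cover the query by adding the missing columns as included columns; the index and column names below are illustrative:

```sql
-- Query: SELECT col1, col4 FROM clustered_dupes WHERE col2 = 42;
-- A non-clustered index on col2 alone forces a key lookup for col4.
-- Adding col4 as an INCLUDE column makes the index covering,
-- so the seek alone satisfies the query:
CREATE NONCLUSTERED INDEX ix_clustered_dupes_col2
    ON clustered_dupes (col2)
    INCLUDE (col4);
```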

Updates : 
Anytime that a row is changed, those changes must be made not only in the base table (clustered index or heap) but also in any index that contains the columns that were affected by the change.  This applies to insert, update and delete operations.


Tools for Analyzing Indexes :
To understand index structures fully, there are some tools that we are going to use.
sys.dm_db_index_physical_stats: This DMV is used to find the fragmentation level and the clustered and non-clustered indexes that exist in a table.
The syntax of this DMV is:
select * from sys.dm_db_index_physical_stats (database_id,object_id,Index_id,Partition_number,mode)
For example, to see the information for the table Production.Product in the AdventureWorks database:
select * from sys.dm_db_index_physical_stats (6,1429580131,null,null,'limited')
Here 'limited' is the mode; the DMV has three modes:
Limited, Detailed, Sampled
Limited Mode : The default operating mode of the DMV is LIMITED. The LIMITED mode can only return the logical fragmentation of the leaf level plus the page count. The idea behind this option is to allow you to find the fragmentation of an index by reading the minimum number of pages, i.e. in the smallest amount of time. This option is faster than a DETAILED mode scan. It scans all pages for a heap, but only the parent-level pages for an index, which are the pages above the leaf level.
Detailed Mode :  The DETAILED mode does two things:
•    Calculate fragmentation by doing a LIMITED mode scan
•    Calculate all other statistics by reading all pages at every level of the index
And so it's obviously the slowest.  
Sampled Mode : It does two things
  • LIMITED mode scan
  • If the number of leaf level pages is < 10000, read all the pages, otherwise read every 100th pages (i.e. a 1% sample)


However, if the index has less than 10,000 pages total, SQL Server converts SAMPLED to DETAILED.  
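Instead of hard-coding the database and object IDs as in the example above, the DB_ID() and OBJECT_ID() functions can supply the parameters; avg_fragmentation_in_percent is the column usually examined:

```sql
-- Run from within the AdventureWorks database so OBJECT_ID resolves
SELECT index_id, index_type_desc, avg_fragmentation_in_percent, page_count
FROM sys.dm_db_index_physical_stats
     (DB_ID('AdventureWorks'),
      OBJECT_ID('Production.Product'),
      NULL, NULL, 'LIMITED');
```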

The next DMV is sys.dm_db_index_usage_stats :

It shows you the number of seeks, scans, lookups and updates on an index, and the last user seek, scan, lookup and update on the index, from user queries or system queries.
select * from sys.dm_db_index_usage_stats
No parameters are required.
The index usage stats DMV is cleared when SQL Server is restarted, as well as when you detach/attach or restore the database.
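For example, combining it with sys.indexes to spot indexes that are maintained on writes but never read (a common use of this DMV; the zero-reads threshold is a judgment call):

```sql
SELECT OBJECT_NAME(i.object_id) AS table_nm,
       i.name AS index_nm,
       s.user_seeks, s.user_scans, s.user_lookups, s.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id   = i.object_id
      AND s.index_id    = i.index_id
      AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND COALESCE(s.user_seeks + s.user_scans + s.user_lookups, 0) = 0
ORDER BY s.user_updates DESC;
```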


One more DMV is sys.dm_db_index_operational_stats :
This DMV is used for lock escalations, page splits, lock contention and I/O statistics.
I will explain them one by one.
The following query demonstrates identifying the top 3 objects associated with waits on page locks:
SELECT TOP 3
    OBJECT_NAME(o.object_id, o.database_id) AS object_nm,
    o.index_id,
    o.partition_number,
    o.page_lock_wait_count,
    o.page_lock_wait_in_ms,
    CASE WHEN mid.database_id IS NULL THEN 'N' ELSE 'Y' END AS missing_index_identified
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL) AS o
LEFT OUTER JOIN (SELECT DISTINCT database_id, object_id
                 FROM sys.dm_db_missing_index_details) AS mid
    ON mid.database_id = o.database_id AND mid.object_id = o.object_id
ORDER BY page_lock_wait_count DESC

 Lock Escalations: 
You can use sys.dm_db_index_operational_stats to track how many attempts were made to escalate to table locks (index_lock_promotion_attempt_count), as well as how many times escalations actually succeeded (index_lock_promotion_count).  The following query shows the top three objects with the highest number of escalations:
SELECT TOP 3
    OBJECT_NAME(object_id, database_id) AS object_nm,
    index_id,
    partition_number,
    index_lock_promotion_attempt_count,
    index_lock_promotion_count
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL)
ORDER BY index_lock_promotion_count DESC

Page Split Tracking : Excessive page splitting can have a significant effect on performance. The following query identifies the top 10 objects involved with page splits (ordering by leaf_allocation_count and referencing both the leaf_allocation_count and nonleaf_allocation_count columns). The leaf_allocation_count column represents page splits at the leaf level and nonleaf_allocation_count represents splits at the non-leaf levels of an index:
SELECT TOP 10
    OBJECT_NAME(object_id, database_id) AS object_nm,
    index_id,
    partition_number,
    leaf_allocation_count,
    nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats (DB_ID(), NULL, NULL, NULL)
ORDER BY leaf_allocation_count DESC
 

Note : To run these DMVs you need the CONTROL, VIEW SERVER STATE or VIEW DATABASE STATE permission.


sys.indexes :
Another way to view the existing indexes defined for a given table is to use the system view sys.indexes.
select * from sys.indexes gives the list of indexes in a database.
You can also query the indexes on a single table in a database. Here I used the AdventureWorks database:


SELECT * FROM sys.indexes WHERE object_id = (
   SELECT object_id FROM sys.tables WHERE name = 'product')



sp_helpindex :
Returns information about the indexes on a table or view.
Example : USE AdventureWorks;
GO
EXEC sp_helpindex N'Sales.Customer';
GO



Saturday, October 2, 2010

Indexes Internals

What is Index ?
Indexes are used to retrieve data more quickly and efficiently. They reduce disk I/O and logical reads.
An index allows SQL Server to find data in a table without scanning the entire table.

A database index is similar to the index in a book.

Where can indexes be created ?
Indexes can be created on a single column or on multiple columns.

What is an Index Key?
An index is defined on one or more columns, called the index key or key columns. The index key can be compared to the terms listed in a book's index; they are the values that the index will be used to search for, just as you find index terms at the back of a text book. Index keys are stored in the leaf level.



Simple Index : An index created with only one key column is called a simple index.
Composite Index : An index created with more than one key column is called a composite index.

Indexes are organized in a B-tree (balanced tree) structure.

What is a Balanced Tree ?
It is a method of organizing and locating the index rows in a database.

About B-Trees : Unlike a normal tree, a B-tree is always drawn inverted, with its root node at the top and its leaf nodes at the bottom. There can be intermediate levels between the root node and the leaf nodes, depending on multiple factors.

Root Node : The first page of the index is called the root node (starting node).
Branch Node : The pages between the root node and the leaf nodes are called branch nodes.
Leaf Node : The lowest level of the index is called the leaf node.

Note : The size and width of the tree depend on the definition of the index and on the number and size of the rows in the table.

To understand these structures I will explain some basic terms and definitions.
Indexes have two primary components: the LEAF LEVEL and the NON-LEAF LEVELS.
The LEAF LEVEL contains every row of the table in indexed order.
The NON-LEAF LEVELS are used for navigating to the LEAF LEVEL.
Each non-leaf level stores something for every page of the level below, and levels are added until the index builds up to a root of one page.

Types of Indexes :
There are two types of indexes: clustered and non-clustered.
Clustered Index :
It sorts and stores the data rows of the table based on the index key, and the leaf level of the clustered index contains the actual data pages of the table. Because of this there can only be one clustered index per table.
Non Clustered Index :
A nonclustered index does not affect the order of the data in the table pages, because the leaf pages of a nonclustered index and the data pages of the table are separate. A pointer is required to navigate from an index row to the data row. The structure of the row locator depends on whether the data pages are stored in a heap or in a clustered index. For a heap, the row locator is a pointer to the RID (row identifier) of the data row; for a table with a clustered index, the row locator is the clustered index key.

Heap table : A table without a clustered index is called a heap table.
ROW IDENTIFIER : The RID is an 8-byte structure comprised of file ID, page number and slot index, and it uniquely identifies a row in the heap.

 Index Limits :
There are a number of built-in limitations on indexes
Key size
The size of an index key is limited to a maximum of 900 bytes and a maximum of 16 columns. This is definitely a limit, not a goal, as the larger the index key gets, the more pages in the index and the deeper the index tree. As the number of pages and the depth of the tree increases so the index becomes less efficient to use. Larger indexes also use more storage space and result in less efficient use of SQL’s data cache.
Number of indexes
In SQL 2005 and earlier there was a limitation of 250 indexes per table, one clustered and 249 non-clustered. In SQL 2008, with the addition of filtered indexes, that limitation was increased to 1000, one clustered and 999 non-clustered indexes.
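The SQL 2008 filtered indexes mentioned above are ordinary non-clustered indexes with a WHERE clause; a sketch with illustrative table and column names:

```sql
-- Indexes only the rows matching the predicate, keeping the index
-- small and cheap to maintain for queries that target active rows.
CREATE NONCLUSTERED INDEX ix_orders_active
    ON dbo.Orders (OrderDate)
    WHERE IsActive = 1;
```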

How SQL uses indexes:
If a table does not have an index, the only way to find all occurrences of a value within the table is to read the entire table. If a table has an index, it speeds up the locating of values within that index in two ways.
1. The index is sorted in the order of the key columns. This means that once all the matching values have been found, the remaining portion of the table can be ignored. This is the same as a telephone directory.
2. The tree structure of the index allows a divide-and-conquer approach to locating rows, where large portions of the table can be quickly excluded from the search.
There are four basic operations that SQL Server can do on an index: it can scan the index, seek on the index, do lookups against the index, and update the index.

Friday, October 1, 2010

How to watch what a checkpoint operation is doing ?

Hi, in my company training one of my colleagues asked me: is it possible to see what a checkpoint operation is doing?

There are some trace flags that allow you to watch what a checkpoint operation is doing.

  • 3502: writes to the error log when a checkpoint starts and finishes
  • 3504: writes to the error log information about what is written to disk
You can enable these trace flags by using DBCC TRACEON

DBCC TRACEON (3502, 3504, -1)

After enabling the trace flags, run any DML transaction followed by a CHECKPOINT command and you can see the output in the ERROR LOG. Below I am posting some of the information logged after running a transaction followed by a checkpoint.

2010-10-02 01:32:00.03 spid11s     Ckpt dbid 1 started (0)
2010-10-02 01:32:00.06 spid11s     About to log Checkpoint begin.
2010-10-02 01:32:00.09 spid11s     Ckpt dbid 1 phase 1 ended (0)
2010-10-02 01:32:00.10 spid11s     FlushCache: cleaned up 6 bufs with 6 writes in 1 ms (avoided 0 new dirty bufs)
2010-10-02 01:32:00.10 spid11s                 average throughput:  46.88 MB/sec, I/O saturation: 5
2010-10-02 01:32:00.10 spid11s                 last target outstanding: 2
2010-10-02 01:32:00.10 spid11s     About to log Checkpoint end.
2010-10-02 01:32:00.20 spid11s     Ckpt dbid 1 complete
2010-10-02 01:32:01.62 spid11s     Ckpt dbid 1 started (0)
2010-10-02 01:32:01.62 spid11s     About to log Checkpoint begin.
2010-10-02 01:32:01.63 spid11s     Ckpt dbid 1 phase 1 ended (0)
2010-10-02 01:32:01.63 spid11s     FlushCache: cleaned up 4 bufs with 4 writes in 1 ms (avoided 0 new dirty bufs)
2010-10-02 01:32:01.63 spid11s                 average throughput:  31.25 MB/sec, I/O saturation: 2
2010-10-02 01:32:01.63 spid11s                 last target outstanding: 2
2010-10-02 01:32:01.63 spid11s     About to log Checkpoint end.
2010-10-02 01:32:01.66 spid11s     Ckpt dbid 1 complete
2010-10-02 01:32:01.94 spid11s     Ckpt dbid 8 started (8)
2010-10-02 01:32:01.94 spid11s     About to log Checkpoint begin.
2010-10-02 01:32:01.95 spid11s     Ckpt dbid 8 phase 1 ended (8)
2010-10-02 01:32:02.23 spid11s     FlushCache: cleaned up 1224 bufs with 64 writes in 281 ms (avoided 57 new dirty bufs)
2010-10-02 01:32:02.23 spid11s                 average throughput:  34.03 MB/sec, I/O saturation: 70
2010-10-02 01:32:02.23 spid11s                 last target outstanding: 2
2010-10-02 01:32:02.23 spid11s     About to log Checkpoint end.
2010-10-02 01:32:02.23 spid11s     Ckpt dbid 8 complete
2010-10-02 01:32:02.23 spid55      Ckpt dbid 8 started (8)
2010-10-02 01:32:02.23 spid55      About to log Checkpoint begin.
2010-10-02 01:32:02.23 spid55      Ckpt dbid 8 phase 1 ended (8)
2010-10-02 01:32:02.27 spid55      FlushCache: cleaned up 87 bufs with 8 writes in 31 ms (avoided 0 new dirty bufs)
2010-10-02 01:32:02.27 spid55                  average throughput:  21.93 MB/sec, I/O saturation: 9
2010-10-02 01:32:02.27 spid55                  last target outstanding: 2
2010-10-02 01:32:02.27 spid55      About to log Checkpoint end.
2010-10-02 01:32:02.27 spid55      Ckpt dbid 8 complete
2010-10-02 01:36:32.94 spid11s     Ckpt dbid 8 started (0)
2010-10-02 01:36:32.94 spid11s     About to log Checkpoint begin.
2010-10-02 01:36:32.94 spid11s     Ckpt dbid 8 phase 1 ended (0)
2010-10-02 01:36:33.20 spid11s     FlushCache: cleaned up 1288 bufs with 54 writes in 250 ms (avoided 33 new dirty bufs)
2010-10-02 01:36:33.20 spid11s                 average throughput:  40.25 MB/sec, I/O saturation: 61
2010-10-02 01:36:33.20 spid11s                 last target outstanding: 2
2010-10-02 01:36:33.20 spid11s     About to log Checkpoint end.
2010-10-02 01:36:33.20 spid11s     Ckpt dbid 8 complete
2010-10-02 01:48:00.17 spid11s     Ckpt dbid 4 started (0)
2010-10-02 01:48:00.17 spid11s     About to log Checkpoint begin.
2010-10-02 01:48:00.18 spid11s     Ckpt dbid 4 phase 1 ended (0)
2010-10-02 01:48:00.18 spid11s     FlushCache: cleaned up 20 bufs with 18 writes in 1 ms (avoided 0 new dirty bufs)
2010-10-02 01:48:00.18 spid11s                 average throughput: 156.25 MB/sec, I/O saturation: 15
2010-10-02 01:48:00.18 spid11s                 last target outstanding: 2
2010-10-02 01:48:00.18 spid11s     About to log Checkpoint end.
2010-10-02 01:48:00.19 spid11s     Ckpt dbid 4 complete
2010-10-02 02:00:23.77 spid54      DBCC TRACEON 3406, server process ID (SPID) 54. This is an informational message only; no user action is required.
2010-10-02 02:10:00.75 spid11s     Ckpt dbid 4 started (0)
2010-10-02 02:10:00.75 spid11s     About to log Checkpoint begin.
2010-10-02 02:10:00.75 spid11s     Ckpt dbid 4 phase 1 ended (0)
2010-10-02 02:10:00.76 spid11s     FlushCache: cleaned up 20 bufs with 19 writes in 15 ms (avoided 0 new dirty bufs)
2010-10-02 02:10:00.76 spid11s                 average throughput:  10.42 MB/sec, I/O saturation: 18
2010-10-02 02:10:00.76 spid11s                 last target outstanding: 2
2010-10-02 02:10:00.76 spid11s     About to log Checkpoint end.
2010-10-02 02:10:00.76 spid11s     Ckpt dbid 4 complete

Internal Process When a checkpoint occurs

Hi, today I am explaining how a checkpoint works and what gets logged.

When a checkpoint occurs, no matter how it is triggered:
  • All dirty data pages for the database are written to disk.
  • Log records describing the checkpoint are generated.
  • The LSN of the checkpoint is recorded in the database boot page.
  • Before a page is written to disk, all log records up to and including the most recent log record describing a change to that page are written to disk. This guarantees recovery can work and is called write-ahead logging. Log records are written to the log sequentially, and log records from multiple transactions are mixed together in the log. The log cannot be selectively written to disk, so writing a dirty page that only has a single log record affecting it may mean writing many earlier log records to disk as well.
  • In the SIMPLE recovery model, the VLFs in the log are checked to see whether they can be marked inactive.
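You can trigger this sequence manually; the CHECKPOINT command optionally takes a target duration in seconds:

```sql
USE pepsi;       -- checkpoints are per-database
CHECKPOINT;      -- flush dirty pages for the current database now
CHECKPOINT 10;   -- ask SQL Server to spread the work over about 10 seconds
```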

Lazy Writer Process

There are two built-in automatic mechanisms SQL Server uses to scan the buffer cache periodically and write dirty pages to disk:
the lazy writer and the checkpoint process.
Lazy Writer Process :
It scans the buffer cache periodically and writes dirty pages to disk. It sleeps for a specific interval of time; when it wakes up, it examines the size of the free buffer list. If the free buffer list is below a certain point, the lazy writer thread scans the buffer pool to repopulate the free buffer list. As buffers are added to the free list, they are also written to disk if they are dirty.
The lazy writer expands or shrinks the data cache to keep the operating system's free physical memory at 5 MB (plus or minus 200 KB) to prevent paging. If the operating system has less than 5 MB free, the lazy writer releases memory to the operating system instead of adding it to the free list. If more than 5 MB of physical memory is free, the lazy writer recommits memory to the buffer pool by adding it to the free list.

Checkpoint Process

Checkpoint Process:
It scans the buffer cache periodically and writes any dirty data pages for a particular database to disk.

Difference between checkpoint and lazy writer :
The checkpoint process never puts buffers on the free list, but the lazy writer does.

The only purpose of the checkpoint process is to guarantee that pages written before a certain time are written to disk, so that the number of dirty pages in memory is kept to a minimum and the length of time SQL Server requires to recover a database after a failure is kept to a minimum.

When does a checkpoint occur ?
It occurs automatically at regular intervals, but can also be requested manually. Automatic checkpoints are triggered when:
  • The log is getting full and the database is in autotruncate mode. A checkpoint is triggered to truncate the transaction log and free up space, but if no space can be freed because of long-running transactions, no checkpoint occurs.
  • The recovery time is estimated to be longer than the recovery interval.
  • SQL Server is shut down in an orderly fashion; a checkpoint is issued in every database on the instance.
  • A backup is taken.
  • Database files have been added or removed by using ALTER DATABASE.

Permissions required for checkpoint : members of the sysadmin server role and of the db_owner and db_backupoperator database roles.

Note : The frequency of checkpoints in each database depends on the amount of data modifications made, not on any time-based measure. A database used primarily for read-only operations will not have many checkpoints.
The recovery interval option controls when SQL Server issues a checkpoint in each database. Checkpoints are done on a per-database basis.
When a checkpoint occurs, SQL Server writes a checkpoint record to the transaction log, which lists all the active transactions.
A recovery interval setting of 1 means that a checkpoint occurs about every minute.
When you install SQL Server, the recovery interval value is set to 0 by default, which indicates that SQL Server chooses an appropriate value dynamically.
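The recovery interval described above is a server-level setting changed via sp_configure (it is an advanced option, so show advanced options must be enabled first):

```sql
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'recovery interval', 0;  -- 0 = let SQL Server decide (the default)
RECONFIGURE;
```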
The SHUTDOWN WITH NOWAIT statement shuts down SQL Server without executing a checkpoint in each database