SQL Explained In Simple Way: Indexes Internals

What is Index ?

Indexes are used to retrieve the data more quickly and efficiently.It reduces disk I/O and logical reads.
It allows to find the data in a table without scanning the entire table.

A Database Index is similar to Index in a book.

Where Indexes can be created ?

Indexes can be created on single column or multiple columns.

What is an Index Key?
An Index is defined on one or more columns called Index Key or Key columns. The Index Key can be compared to the terms listed in a book index.They are the values that the index will be used to search for.You can find these Index keys at the back of the test book.Indexes Keys are stored in the leaves.

Simple Index : An Index created with only one key column is called Simple Index
Composite Index : An Index created with more than one key column is called composite Index.

Indexes are organized in a B-tree structure called Balanced tree.

What is Balanced Tree ?
It is the method of placing and locating files in a database.

About B-Tree : Unlike a normal tree B-Tree always Inverted with their root node at top and leaf nodes at the bottom. There are Intermediate levels between root node and leaf node but it depends on multiple factors.

Root Node : The first page of the Index is called Root Node.(Starting node)
Branch Node : The pages between starting node and leaf node are called branch nodes.
Leaf Node : The lowest level of the Index is called Leaf node.

Note : The Size and width of the tree based on the definition of Index and the number of rows and size of rows in the table.

To Understand these structures I will explain some basic terms and definitions.
Indexes have two primary components LEAF LEVEL and NON-LEAF LEVEL.
LEAF LEVEL contains Every row of the table in Indexed order.
NON LEAF LEVEL used for navigating to the LEAF LEVEL.
Each non-leaf level stores something for every page of the level below—and levels are added until the index builds up to a root of one page.

Types of Indexes :

There are two types of Indexes Clustered and Non-Clustered.

Clustered Index :

It Sorts and stores at the leaf node. The leaf level of the clustered index has the actual data pages of the table. Because of this there can only be one clustered index per table.

Non Clustered Index :

A nonclustered index does not affect the order of the data in the table pages, because the leaf pages of a nonclustered index and the data pages of the table are separate. A pointer is required to navigate from an index row to the data row.The structure of row locator depends on the data pages are stored in a heap or clustered Index.For a heap the row locator is pointer to the RID(ROW IDENTIFIER) for the data row,for a table with a clustered index, the row locator is the clustered index key.

Heap table : A table without the clustered index is called Heap table.
ROW IDENTIFIER : The RID is an 8-byte structure comprised of File ID, Page Number and Slot Index and will uniquely identify a row in the basic heap.

Index Limits :
There are a number of built-in limitations on indexes
Key size
The size of an index key is limited to a maximum of 900 bytes and a maximum of 16 columns. This is definitely a limit, not a goal, as the larger the index key gets, the more pages in the index and the deeper the index tree. As the number of pages and the depth of the tree increases so the index becomes less efficient to use. Larger indexes also use more storage space and result in less efficient use of SQL’s data cache.
Number of indexes
In SQL 2005 and earlier there was a limitation of 250 indexes per table, one clustered and 249 non-clustered. In SQL 2008, with the addition of filtered indexes, that limitation was increased to 1000, one clustered and 999 non-clustered indexes.

How SQL uses indexes:
If a table does not have index, the only way to find all occurrences of a value within a table is to read the entire table. If a table has an index, it speeds up the locating of values within that index in two ways.
1. The index is sorted in the order of the key columns. This means that once all the matching values have been found, the remaining portion of the table can be ignored. This is the same as a telephone directory.
2. The tree structure of the index allows a divide-and-conquer approach to locating rows, where large portions of the table can be quickly excluded from the search.
There are four basic operations that SQL can do on an index. It can scan the index, it can seek on the index, it can do lookups to the index and it can update the index

SQL Explained In Simple Way

Saturday, October 2, 2010

Indexes Internals

No comments:

Post a Comment

Welcome to My Blog

Search This Blog

Total Pageviews