DBMS Unit 5
A database consists of a huge amount of data. The data is grouped into tables, and each table holds related records. A user sees the data stored in the form of tables, but in reality this huge amount of data is stored as files on storage devices.
Data is stored on external storage devices such as disks and tapes, and fetched into main memory
as needed for processing.
The cost of page I/O (input from disk to main memory and output from memory to disk) dominates
the cost of typical database operations.
Disks are the most important external storage devices. They allow us to retrieve any page at a
(more or less) fixed cost per page. However, if we read several pages in the order that they are
stored physically, the cost can be much less than the cost of reading the same pages in a random
order.
• Tapes are sequential access devices and force us to read data one page after the other. They are
mostly used to archive data that is not needed on a regular basis.
• Each record in a file has a unique identifier called a record id, or rid. A rid has the property that
we can identify the disk address of the page containing the record by using it.
Data is read into memory for processing, and written to disk for persistent storage, by a layer of software called the buffer manager. When the files and access methods layer needs to process a page, it asks the buffer manager to fetch the page (the page containing a given record can be identified from the record's rid). The buffer manager fetches the page from disk if it is not already in memory. Space on disk is managed by the disk space manager.
File – A file is a collection of related information that is recorded on secondary storage such as magnetic disks, magnetic tapes and optical disks.
A file can be created, destroyed, and have records inserted into and deleted from it.
File organization: Method of arranging a file of records on external storage is known as file
organization.
Methods of organizing a file in a database
1. Heap file/Unsorted file- The simplest file structure is a heap file. Records in a heap file
are stored in random order across the pages of the file. A heap file organization supports
retrieval of all records, or retrieval of a particular record specified by its rid.
2. Sorted File Method – In this method, whenever a new record has to be inserted, it is always inserted in sorted (ascending or descending) order. Sorting of records may be based on the primary key or on any other key. This organization is best if records must be retrieved in some order, or if only a 'range' of records is needed.
3. Hash File Organization – In a hash file, records are not stored sequentially; instead, a hash function is used. The output of the hash function determines the location of the disk block where the record is to be placed. The hash function can be anything from a simple mathematical function to a complex one, and most of the time it uses the primary key to generate the address of the data block.
4. B+ Tree File Organization – A B+ tree uses a tree-like structure to store records in a file. It uses the concept of key indexing, where the primary key is used to sort the records. In this method, all the records are stored only at the leaf nodes. Intermediate nodes act as pointers to the leaf nodes; they do not contain any records.
INDEXING
Indexing is a secondary or alternative method to access a file in a time-efficient manner. Indexes can be built on either sorted or unsorted files.
An index is a data structure that organizes data records on disk for retrieval operations. An index
allows us to efficiently retrieve all records that satisfy search conditions on the search key fields
of the index.
There are three main alternatives for what to store as a data entry in an index:
1. A data entry k* is an actual data record (with search key value k).
2. A data entry is a (k, rid) pair, where rid is the record id of a data record with
search key value k.
3. A data entry is a (k, rid-list) pair, where rid-list is a list of record ids of data records with search key value k.
In Alternative (1), each entry k* is a data record with search key value k. In Alternatives (2) and (3), the data entries point to the data records.
Clustered Indexes
When a file is organized so that the ordering of data records is the same as to the ordering
of data entries in some index, we say that the index is clustered. That is the order of records
in the file matches the order of data entries in the index. Each table has onlyone clustered
index because data rows can be only sorted in one order. They are faster because both
records and indexes are equally ordered.
Non Clustered Indexes
If a file contains records in sequential order and the index whose data entries specifies an
order different from the sequential order of the file it is known as non clustered index. That
is the order of records in the file does not matches the order of data entries in the index. A
single table can have many non-clustered indexes. They are slow because both records and
indexes are not equally ordered.
Indexing methods
Primary Indexing
Secondary Indexing
Primary Index
i An index on a set of fields that includes the primary key of the table is known as a primary index.
ii A primary index is an ordered file of fixed-length entries with two fields.
iii The first field is the primary key (the key on which the file is ordered) and the second field is a pointer to the corresponding data block.
iv This is a type of clustered index and is guaranteed not to contain duplicates.
Dense Index
Sparse Index
Dense Index
i The dense index contains an index record for every search key value in the data file.
ii The number of records in the index table is the same as the number of records in the main table.
iii This makes searching faster but needs more space to store the index records.
Sparse Index
i To address the space overhead of dense indexing, sparse indexing is introduced. The sparse index contains an index record for only some of the search key values in the data file.
ii A range of key values shares the same data block address; when data is to be retrieved, the block is scanned linearly from that address until the requested data is found.
iii To locate a record, we find the index record with the largest search key value less than or equal to the search key value we are looking for.
iv We start at the record pointed to by that index record and proceed along the records in the file (that is, sequentially) until we find the desired record.
v It needs less space and less maintenance overhead for insertions and deletions, but it is slower than a dense index for locating records.
In the above diagram, indexes are not stored for all the records; instead, index entries are stored for only 3 records. Now if we have to search for a student with ID 102, the address for the largest ID less than or equal to 102 is looked up, which returns the address of ID 100. That location is then scanned linearly until the record for 102 is found. Hence searching is fast and the storage space for indexes is reduced.
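A minimal Python sketch of this lookup; the index contents and block names below are illustrative, not taken from the diagram.
from bisect import bisect_right

# Sparse index: one (search key, block address) entry per data block, kept sorted.
index_keys = [100, 103, 106]          # hypothetical first student IDs of each block
index_blocks = ['block-1', 'block-2', 'block-3']

def sparse_lookup(key):
    """Return the block whose index key is the largest key <= the search key."""
    pos = bisect_right(index_keys, key)    # number of index keys <= key
    if pos == 0:
        return None                        # key is smaller than every indexed key
    return index_blocks[pos - 1]           # scan this block onward sequentially

# Searching for student ID 102 lands on the entry for ID 100 (block-1);
# that block is then scanned linearly until the record for 102 is found.
print(sparse_lookup(102))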
Secondary Index
i An index that is not a primary index is a secondary index, i.e., an index on a set of fields that does not include the primary key of the table is known as a secondary index.
ii A secondary index can be defined on an alternate key or on non-key attributes.
iii This is a type of non-clustered index and will be dense.
iv In this method, initially large ranges of the column values are selected so that the first level of mapping is small. Each range is then further divided into smaller ranges.
v The first level of mapping is stored in primary memory so that address fetches are fast. The second level of mapping and the actual data are stored in secondary memory (hard disk).
INDEX DATA STRUCTURES
In a huge database, it is very inefficient to search through all the index values to reach the desired data.
Hashing is a technique used to calculate the direct location of a data record on the disk without using an index structure. In this technique, data is stored in data blocks whose addresses are generated using a hash function. The memory locations where these records are stored are known as data buckets or data blocks.
Most of the time, the hash function uses the primary key to generate the address of the data block. The hash function can be anything from a simple mathematical function to a complex one. We can even use the primary key itself as the address of the data block.
The above diagram shows data block addresses that are the same as the primary key values. The hash function can also be a simple mathematical function such as mod, exponential, sin or cos. Suppose we use a mod(5) hash function to determine the address of the data block. In this case, mod(5) is applied to the primary keys and generates 3, 3, 1, 4 and 2 respectively, and the records are stored at those data block addresses.
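A minimal sketch of this idea in Python; the key values below are made up so that mod(5) produces the addresses 3, 3, 1, 4 and 2 mentioned above.
def hash_address(primary_key, num_buckets=5):
    """Map a primary key to a data block (bucket) address using mod hashing."""
    return primary_key % num_buckets

# Hypothetical primary keys; each record is stored in the block its key hashes to.
for key in [103, 108, 111, 114, 117]:
    print(key, '->', hash_address(key))    # prints 3, 3, 1, 4, 2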
TYPES OF HASHING
STATIC HASHING
In static hashing, the number of data buckets in memory always remains constant. Therefore, if you generate an address for, say, Student_ID = 10 using the hash function mod(3), the resultant bucket address will always be 1; you will not see any change in the bucket address. In other words, in static hashing the resultant data bucket address always remains the same.
Static Hash Functions
Searching: To search for a data entry, we apply a hash function h to identify the bucket to which
it belongs and then search this bucket.
Deleting: To delete a data entry, we use the hash function to identify the correct bucket, locate the
data entry by searching the bucket, and then remove it. If this data entry is the last in an overflow
page, the overflow page is removed from the overflow chain of the bucket.
Inserting: To insert a data entry, we use the hash function to identify the correct bucket and then
put the data entry there.
Suppose we have to insert some records into the file, but the data bucket address generated by the hash function is full or the data already exists at that address. This situation in static hashing is called bucket overflow and is caused by collisions.
A collision occurs when the hash field value of a record to be inserted hashes to an address that is already occupied by another record. It must be resolved by finding some other location to place the new record. This process of finding another location is called collision resolution. Some methods for collision resolution are as follows:
Closed hashing
In this method we introduce a new data bucket with the same address and link it after the full data bucket. This method of overcoming bucket overflow is called closed hashing or overflow chaining.
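A minimal Python sketch of closed hashing (overflow chaining), assuming a fixed number of buckets and a fixed page capacity; the sizes and keys are illustrative.
BUCKETS = 3       # number of primary buckets (fixed, as in static hashing)
CAPACITY = 2      # data entries that fit on one bucket page

# Each bucket is a chain of pages; extra pages form its overflow chain.
buckets = [[[]] for _ in range(BUCKETS)]

def insert(key):
    chain = buckets[key % BUCKETS]
    for page in chain:            # look for a page with free space
        if len(page) < CAPACITY:
            page.append(key)
            return
    chain.append([key])           # every page is full: add an overflow page

def search(key):
    chain = buckets[key % BUCKETS]
    return any(key in page for page in chain)

for k in (3, 6, 9, 12):           # all hash to bucket 0, forcing an overflow page
    insert(k)
print(buckets[0], search(9))      # [[3, 6], [9, 12]] True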
Open Hashing
When the hash function generates an address at which data is already stored, the next free bucket is allocated to the new record. This mechanism is called open hashing or linear probing.
In the example below, 6 is a new record which needs to be inserted. The hash function generates address 1, but that bucket is already full, so the system searches for the next available data bucket and assigns 6 to it.
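A minimal Python sketch of open hashing (linear probing), assuming one record per bucket; the table size and keys are illustrative.
SIZE = 5
table = [None] * SIZE             # one record slot per bucket

def insert(key):
    pos = key % SIZE
    for _ in range(SIZE):         # probe successive buckets until a free one is found
        if table[pos] is None:
            table[pos] = key
            return pos
        pos = (pos + 1) % SIZE
    raise RuntimeError('hash table is full')

insert(1)                         # hashes to bucket 1
print(insert(6))                  # also hashes to bucket 1, which is full, so it goes to bucket 2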
DYNAMIC HASHING
The dynamic hashing method is used to overcome the problems of static hashing, such as bucket overflow.
In dynamic hashing, the number of data buckets in memory grows or shrinks as the number of records increases or decreases.
This makes hashing dynamic and allows insertions and deletions without degrading performance.
Dynamic hashing is of two types
Extendible hashing
Linear hashing
EXTENDIBLE HASHING
It is a type of dynamic hashing which uses a directory of pointers to buckets, and doubles the size
of the number of buckets by doubling just the directory and splitting only the bucket that
overflowed.
Consider the following extendible hashed file.
The directory consists of an array of size 4, with each element being a pointer to a bucket.
Search: To locate a data entry, we apply a hash function to the search field and take the last 2 bits
of its binary representation to get a number. The pointer in this array position gives us the desired
bucket.
E.g., to locate a data entry with hash value 5 (binary 101), we look at directory element 01 and
follow the pointer to the data page (bucket B in the figure).
Insert: To insert a data entry, we search to find the appropriate bucket. The data entry is placed in
the bucket to which it belongs and the bucket is split if necessary to make space.
For example, to insert a data entry with hash value 13 (denoted as 13*), we examine directory element 01 and go to the page containing data entries 1*, 5*, and 21*. Since the page has space for an additional data entry, we insert it there.
Next, let us consider insertion of data entry 20* (binary 10100). Looking at directory element 00, we are led to bucket A, which is already full. We must first split the bucket by allocating a new bucket and redistributing the contents (including the new entry to be inserted) across the old bucket and its split image.
To redistribute entries across the old bucket and its split image, we consider the last three bits of the hash value; the last two bits are 00, indicating a data entry that belongs to one of these two buckets, and the third bit discriminates between them.
We need three bits to discriminate between these two data pages (A and A2), but the directory has only enough slots to store all two-bit patterns. The solution is to double the directory. The bucket is split, and the new directory element 000 points to one of the split versions while the new element 100 points to the other.
Consider that we now insert 9*; it belongs in bucket B, which is already full. We can deal with this situation by splitting the bucket and using directory elements 001 and 101 to point to the
bucket and its split image. However, if either bucket A or A2 grows full and an insert then forces
a bucket split, we are forced to double the directory again.
To differentiate between these cases and determine whether a directory doubling is needed, we
maintain a local depth for each bucket. If a bucket whose local depth is equal to the global depth
is split, the directory must be doubled.
Going back to the example, when we inserted 9* into the index, it belonged to bucket B with local depth 2, whereas the global depth was 3. Even though the bucket was split, the directory did not have to be doubled. Buckets A and A2, on the other hand, have local depth equal to the global depth, and, if they grow full and are split, the directory must then be doubled.
Delete: To delete a data entry, we search to find the appropriate bucket. The data entry is located
and removed. If the delete leaves the bucket empty, it can be merged with its split image.
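The directory lookup and the split-versus-double decision can be sketched in Python as follows. This is a simplified illustration of the bookkeeping, not a complete extendible hashing implementation; the bucket names and depths mirror the directory state after the doubling described above.
global_depth = 3
directory = {0b000: 'A', 0b100: 'A2', 0b001: 'B', 0b101: 'B',
             0b010: 'C', 0b110: 'C', 0b011: 'D', 0b111: 'D'}
local_depth = {'A': 3, 'A2': 3, 'B': 2, 'C': 2, 'D': 2}

def bucket_for(hash_value):
    """Index the directory with the last global_depth bits of the hash value."""
    return directory[hash_value & ((1 << global_depth) - 1)]

def split_doubles_directory(bucket):
    """A split doubles the directory only if the bucket's local depth equals the global depth."""
    return local_depth[bucket] == global_depth

print(bucket_for(5))                   # 5 = 101 -> bucket B
print(split_doubles_directory('B'))    # False: B can split using elements 001 and 101
print(split_doubles_directory('A'))    # True: splitting A forces another directory doubling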
LINEAR HASHING
Linear Hashing is a dynamic hashing technique, like Extendible Hashing, adjusting gracefully to
inserts and deletes. In contrast to Extendible Hashing, it does not require a directory, deals
naturally with collisions, and offers a lot of flexibility with respect to the timing of bucket splits.
Working:
Suppose we start with N buckets and put records in buckets 0 to N-1.
Let this be round i, which is 0 initially.
We start with an initial mod hash function h_i(K) = K mod (2^i * N).
When there is a collision, the first bucket, i.e., bucket 0, is split into two buckets: bucket 0 and a new bucket N at the end of the file.
The records in bucket 0 are redistributed between the two buckets using another hash function h_(i+1)(K) = K mod (2^(i+1) * N).
We use a split pointer s that points to the next bucket to be split. It is initially set to 0 and incremented every time a split occurs.
When s becomes N after incrementing, this signals that all buckets of the round have been split and hash function h_(i+1) applies to all buckets.
At this point the split pointer s is reset to 0, the round becomes i+1, and on the next collision the redistribution hash function would be h_(i+2)(K) = K mod (2^(i+2) * N).
Searching:
Searching for the bucket for hash key K is done as follows. Apply the current hash function h_i. If bucket b = h_i(K) < s, then apply hash function h_(i+1), because bucket b has already been split.
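The addressing rule can be sketched in Python under the assumptions above (N initial buckets, round number i, split pointer s); the values below are illustrative.
N = 4      # initial number of buckets
i = 0      # current round number
s = 1      # split pointer: buckets 0 .. s-1 have already been split in this round

def bucket_of(key):
    """Linear hashing address: use h_i, and re-hash with h_(i+1) if that bucket is already split."""
    b = key % (2 ** i * N)              # h_i(key)
    if b < s:
        b = key % (2 ** (i + 1) * N)    # h_(i+1)(key)
    return b

print(bucket_of(44))   # h_0(44) = 0 < s, so h_1(44) = 44 mod 8 = 4
print(bucket_of(43))   # h_0(43) = 3 >= s, so the record stays addressed to bucket 3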
Example:
1) Insert record with keys 32, 44, 36, 9, 25, 5, 14, 18, 10, 30, 31, 35, 7 and 11
Current round i = 0.
Split pointer s = 0.
Hash function to be used h0(K) = K mod 4.
h0(32) = 32 mod 4 = 0. Insert record in bucket 0.
h0(44) = 44 mod 4 = 0. Insert record in bucket 0.
Similarly you can find out hash values for other keys and insert in appropriate buckets as
shown in the following diagram:
2) Insert record with key 43.
Current round i = 0.
Hash function to be used h0(K) = K mod 4
h0(43) = 43 mod 4 = 3. Insert record in bucket 3. But the bucket is full.
Add an overflow page and chain it to 3.
Because there is an overflow, split bucket 0 (pointed to by s) and add a new bucket 4.
Redistribute the contents of bucket 0 between buckets 0 and 4 using the next hash function h1(K) = K mod 2 * 4 = K mod 8.
Increment the split pointer.
3) Insert record with key 37.
Current round i = 0.
Hash function to be used h0(K) = K mod 4
h0(37) = 37 mod 4 = 1. Insert record in bucket 1.
4) Insert record with key 29.
Current round i = 0.
Hash function to be used h0(K) = K mod 4
h0(29) = 29 mod 4 = 1. Insert record in bucket 1. But the bucket is full.
Split bucket 1 pointed to by s.
Add new bucket 5.
Redistribute bucket 1 contents and 29 between buckets 1 and 5 with new hash function
h1(K) = K mod 2 * 4.
Increment split pointer
No overflow page is needed because we are splitting the same bucket to which the key
was mapped with the original hash function.
5) Insert record with key 22.
Current round i = 0.
Hash function to be used h0 (K) = K mod 4.
h0 (22) = 22 mod 4 = 2. Insert record in bucket 2. But the bucket is full.
Split bucket 2 pointed to by s.
Add new bucket 6.
Redistribute bucket 2 contents and 22 between buckets 2 and 6 with new hash function
h1(K) = K mod 2 * 4.
Increment split pointer
No overflow page is needed because we are splitting the same bucket to which the key
was mapped with the original hash function.
6) Insert record with key 66.
Current round i = 0.
Hash function to be used h0(K) = K mod 4
h0(66) = 66 mod 4 = 2. Insert record in bucket 2.
Since 2 < s, we have to use h1 hash function.
h1(66) = 66 mod 8 = 2. The same bucket number and hence insert record in bucket 2.
7) Insert record with key 34.
Current round i = 0.
Hash function to be used h0(K) = K mod 4
h0(34) = 34 mod 4 = 2. Insert record in bucket 2.
2 < s, hence need to use h1 hash function.
h1(34) = 34 mod 8 = 2. Again the same bucket. Insert record in bucket 2.
8) Insert record with key 50.
Current round i = 0.
Hash function to be used h0(K) = K mod 4
h0(50) = 50 mod 4 = 2. Insert record in bucket 2.
As 2 < s, use h1 for hashing.
h1(50) = 50 mod 8 = 2. The same bucket. Insert record in bucket 2.
But the bucket is full.
Split bucket 3 (pointed to by s).
Add new bucket 7.
Redistribute bucket 3 contents between buckets 3 and 7 with the new hash function h1(K) = K mod 2 * 4.
Add an overflow bucket, insert 50 in it and chain it to bucket 2.
Increment the split pointer. It will be 4.
Since it is now equal to the initial number of buckets N = 4, reset it to 0.
Use hash function h1(K) = K mod 2 * 4 = K mod 8 for further insertion of records.
Set next round i to i + 1. Now the round i will be 1.
9) Insert record with key 45.
Current round i = 1.
Hash function to be used h1(K) = K mod 2 * 4.
h1(45) = 45 mod 8 = 5. Insert record in bucket 5.
10) Insert record with key 53.
Current round i = 1.
Hash function to be used h1(K) = K mod 2^1 * 4 = K mod 8.
h1(53) = 53 mod 8 = 5. Insert record in bucket 5. But the bucket is full.
Therefore, split bucket 0 as pointed to by split pointer.
Add new bucket 8.
Redistribute its contents between buckets 0 and 8 with the new hash function h2(K) = K mod 2^2 * 4 = K mod 16.
h2(32) = 32 mod 16 = 0. Hence 32 remains in the same bucket.
Add an overflow bucket, insert 53 in it and chain it to bucket 5
Increment split pointer. It will be 1.
SEARCHING
EXAMPLES
EXTENDIBLE HASHING VS. LINEAR HASHING
i Linear hashing avoids a directory by splitting buckets in a predefined order, whereas extendible hashing uses a directory.
ii A disadvantage of linear hashing is that space utilization can be lower, because bucket splits are not concentrated where the data density is highest; in extendible hashing, splits happen exactly at the buckets that overflow, so they are concentrated where the data density is highest.
iii A directory-based implementation of linear hashing can improve space occupancy, but it is still likely to be inferior to extendible hashing in extreme cases.
iv The choice of hash function is actually very similar: moving from h_i to h_(i+1) in linear hashing corresponds to doubling the directory in extendible hashing.
v In extendible hashing the directory is doubled in a single step, whereas in linear hashing the number of buckets is doubled gradually over the course of a round.
vi Linear hashing tends to perform more splits, whereas extendible hashing performs fewer splits and gives higher bucket occupancy.
STATIC HASHING | DYNAMIC HASHING
A bucket address table is not used. | A bucket address table is used.
The resultant data bucket address is always the same; the bucket address does not change. | The data bucket address can change as records are inserted and deleted.
Open hashing and closed hashing are forms of it. | Extendible hashing and linear hashing are forms of it.
No complex implementation. | Implementation is complex.
The system directly accesses the bucket. | The bucket address table is accessed before accessing the bucket.
Overflow chaining is used. | Overflow chaining is not used.
An alternative to hash-based indexing is to organize records using a treelike data structure. The
data entries are arranged in sorted order by search key value, and a hierarchical search data
structure is maintained that directs searches to the correct page of data entries.
This structure allows us to efficiently locate all data entries with search key values in a desired
range. All searches begin at the topmost node, called the root, and the contents of pages in non-
leaf levels direct searches to the correct leaf page. Non-leaf pages contain node pointers separated
by search key values. The node pointer to the left of a key value k points to a subtree that contains
only data entries less than k. The node pointer to the right of a key value k points to a subtree that
contains only data entries greater than or equal to k.
The lowest level of the tree contains the data entries. Additional records with age less than 22 would appear in leaf pages to the left of page L1, and records with age greater than 50 would appear in leaf pages to the right of page L3.
Suppose we want to find all data entries with 24 < age < 50. Each edge from the root node to a child node has a label that explains what the corresponding subtree contains. We look for data entries with search key value > 24 and are directed to the middle child, node A. Examining the contents of this node, we are directed to node B. Examining the contents of node B, we are directed to leaf node L1, which contains the data entries we are looking for.
Observe that leaf nodes L2 and L3 also contain data entries that satisfy our search criterion. To
facilitate retrieval of such qualifying entries during search, all leaf pages are maintained in a
doubly-linked list. Thus, we can fetch page L2 using the 'next' pointer on page Ll, and then fetch
page L3 using the 'next' pointer on L2.
Thus, the number of disk I/Os incurred during a search is equal to the length of a path from the
root to a leaf, plus the number of leaf pages with qualifying data entries.
NOTE: Tree-structured indexing techniques support both range searches and equality searches efficiently. Equality search is efficient because the contents of pages at non-leaf levels direct the search to the correct leaf page via pointers. And since all leaf pages are maintained in a doubly-linked list, retrieval of data entries with search key values in a desired range is easy.
The cost of the following basic operations on files is used to compare file organizations:
Scan: Fetch all records in the file. The pages in the file must be fetched from disk into the buffer pool.
Search with Equality Selection: Fetch all records that satisfy an equality selection; for
example, "Find the employee record for the employee with age 23 and sal 50." Pages that
contain qualifying records must be fetched from disk, and qualifying records must be
located within retrieved pages.
Search with Range Selection: Fetch all records that satisfy a range selection; for example,
"Find all employee records with age greater than 35."
Insert a Record: Insert a given record into the file. We must identify the page in the file
into which the new record must be inserted, fetch that page from disk, modify it to include
the new record, and then write back the modified page. Depending on the file organization,
we may have to fetch, modify, and write back other pages as well.
Delete a Record: Delete a record that is specified using its rid. We must identify the page
that contains the record, fetch it from disk, modify it, and write it back. Depending on the
file organization, we may have to fetch, modify, and write back other pages as well.
Comparison of I/O Costs
Heap Files
Scan: The cost is BD because we must retrieve each of B pages taking time D per page.
Search with Equality Selection: Suppose that we know in advance that exactly one record
matches the desired equality selection, that is, the selection is specified on a candidate key.
On average, we must scan half the file, assuming that the record exists and the distribution
of values in the search field is uniform. The cost is 0.5BD. If no record satisfies the
selection, however, we must scan the entire file to verify this.
Search with Range Selection: The entire file must be scanned because qualifying records could appear anywhere in the file, and we do not know how many qualifying records exist. The cost is BD.
Insert: We assume that records are always inserted at the end of the file. We must fetch
the last page in the file, add the record, and write the page back. The cost is 2D
Delete: We must find the record, remove the record from the page, and write the modified
page back. The cost is the cost of searching plus D.
Sorted Files
Search with Equality Selection: The selection condition is on the sort key, and the file is sorted. Therefore we can locate the first page containing the desired record or records, should any qualifying records exist, with a binary search in log2 B steps. Each step requires a disk I/O and two comparisons. Therefore the cost is D log2 B.
Search with Range Selection: The cost is the cost of the search plus the cost of retrieving the set of records that satisfy the search. Therefore the cost is D log2 B plus the cost of reading the matching pages (data pages are retrieved sequentially until the condition becomes false).
Insert: To insert a record while preserving the sort order, we must first find the correct
position in the file, add the record, and then fetch and rewrite all subsequent pages.
Therefore, we must read the latter half of the file and then write it back after adding the
new record. Therefore the cost is that of searching to find the position of the new record
plus BD
Delete: We must search for the record, remove the record from the page, and write
the modified page back. We must also read and write all subsequent pages because all
records that follow the deleted record must be moved up to compact the free space.
Therefore the cost is the same as for an insert, that is, search cost plus BD
Clustered Files
In a clustered file, extensive empirical study has shown that the number of physical data pages is
about 1.5B
Scan: The cost of a scan is 1.5BD because all data pages must be examined.
Search with Equality Selection: We fetch pages from the root down to the appropriate leaf, each step requiring one disk I/O. Once the page is known, the first qualifying record can be located by a binary search within the page. Therefore the cost is D logF(1.5B).
Search with Range Selection: The first record that satisfies the selection is located as it is
for search with equality. Subsequently, data pages are sequentially retrieved until a record
is found that does not satisfy the range selection; this is similar to an equality search with
many qualifying records. Therefore the cost is D logF(1.5B) plus the cost of reading the pages of qualifying records.
Insert: To insert a record, we must first find the correct leaf page in the index, reading
every page from root to leaf. Then, we must add the new record. Therefore the cost is that
of searching to find the position of the new record plus D.
Delete: We must search for the record, remove the record from the page, and write the
modified page back. Therefore the cost is the same as for an insert, that is, search cost plus
D.
Heap File with Unclustered Tree Index
An extensive empirical study has shown that the number of leaf pages in the index is about 0.1(1.5B) = 0.15B.
Scan: To fetch all records we read all the data entries (0.15BD) and then fetch each record with one page I/O (BRD, where R is the number of records per data page), giving a total cost of BD(R + 0.15).
Search with Equality Selection: We fetch pages from the root down to the appropriate leaf, each step requiring one disk I/O. Once the leaf page is known, the qualifying data entry is located and one more I/O fetches the record from the file. Therefore the cost is D logF(0.15B) plus D, because the record must also be fetched from the file.
Search with Range Selection: The first record that satisfies the selection is located as it is
for search with equality. Subsequently, data pages are sequentially retrieved until a record
is found that does not satisfy the range selection; this is similar to an equality search with
many qualifying records. Therefore the cost is D logF(0.15B) plus the cost of retrieving the qualifying records.
Insert: We must first insert the record in the heap file (cost 2D). In addition, we must insert the corresponding data entry in the index: finding the right leaf page costs D logF(0.15B), and writing it out after adding the new data entry costs another D. Therefore the cost is D(3 + logF(0.15B)).
Delete: We need to locate the data record in file and the data entry in the index. Now, we
need to write out the modified pages in the index and the data file, at a cost of 2D. Therefore
the total cost is the search cost plus 2D.
Heap File with Unclustered Hash Index
The number of pages required to store the data entries is about 1.25 times the number of pages when the entries are densely packed, that is, 1.25(0.10B) = 0.125B.
Scan: As for an unclustered tree index, all the data entries can be read at a cost of 0.125BD; fetching each record adds BRD, for a total cost of BD(R + 0.125).
Search with Equality Selection: The data entry is found with the help of the hash function. Reading it from the index costs D and reading the record from the file costs another D. Therefore the total cost is 2D.
Search with Range Selection: The hash structure offers no help, and the entire heap file
of employee records must be scanned at a cost of BD.
Insert: We must first insert the record in the heap file at a cost of 2D. In addition, the appropriate page in the index must be located, modified to insert the new data entry, and written back (2D). Therefore the total cost is 4D.
Delete: We need to locate the data record in the employee file and the data entry in the index. Then, we need to write out the modified pages in the index and the data file, at a cost of 2D. Therefore the total cost is the search cost plus 2D.
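The formulas above can be collected into a small cost model; here B is the number of data pages, D the time to read or write one page, R the number of records per page, and F the fan-out of the tree index. The Python sketch below only restates the formulas given above (the scan cost of a sorted file, BD, is the same as for a heap file); the sample parameter values are arbitrary, not measurements.
from math import log

def io_costs(B, D, R, F):
    """Approximate cost of a scan and an equality search for each file organization."""
    return {
        'heap file':        {'scan': B * D,               'equality': 0.5 * B * D},
        'sorted file':      {'scan': B * D,               'equality': D * log(B, 2)},
        'clustered file':   {'scan': 1.5 * B * D,         'equality': D * log(1.5 * B, F)},
        'unclustered tree': {'scan': B * D * (R + 0.15),  'equality': D * (1 + log(0.15 * B, F))},
        'unclustered hash': {'scan': B * D * (R + 0.125), 'equality': 2 * D},
    }

for org, cost in io_costs(B=1000, D=15, R=100, F=100).items():
    print(f"{org:16s} scan = {cost['scan']:>10.0f}   equality = {cost['equality']:>8.1f}")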
In general an index supports efficient retrieval of data entries that satisfy a given selection
condition.
Hash based indexing techniques are optimized only for equality selections and fare poorly on
range selections where they are typically worse than scanning the entire file of records.
In contrast to simply maintaining the data entries in a sorted file, tree-structured indexes offer two important advantages over sorted files:
1. We can handle inserts and deletes of data entries efficiently.
2. Finding the correct leaf page when searching for a record by search key value is much faster than a binary search of the pages in a sorted file.
There can be at most one clustered index on a given collection of records. On the other hand, we can build several unclustered indexes on a data file.
Suppose that employee records are sorted by age, or stored in a clustered file with search key age. If in addition we have an index on the sal field, the latter must be an unclustered index. We can also build an unclustered index on department, if there is such a field.
Clustered indexes are nevertheless expensive to maintain. When a new record has to be inserted into a full leaf page, a new leaf page must be allocated and some existing records have to be moved to the new page; references to the moved records must also be updated to point to the new location.
Therefore, clustering must be used sparingly, and there is no good reason to build a clustered file using hashing, since range queries cannot be answered using hash indexes.
In dealing with the limitation that at most one index can be clustered, it is often useful to consider
whether the information in an index's search key is sufficient to answer the query.
For example, if we have an index on age, and we want to compute the average age of employees,
the DBMS can do this by simply examining the data entries in the index. This is an example of
an index-only evaluation. In an index-only evaluation of a query we need not access the data
records in the files that contain the relations in the query; we can evaluate the query completely
through indexes on the files
An important benefit of index-only evaluation is that it works equally efficiently with only
unclustered indexes, as only the data entries of the index are used in the queries.
SELECT E.dno
FROM Employees E
WHERE E.age > 40
If we have a B+ tree index on age, we can use it to retrieve only the tuples that satisfy the selection E.age > 40.
If virtually everyone is older than 40, we gain little by using an index on age; a sequential scan of
the relation would do almost as well. However, suppose that only 10 percent of the employees are
older than 40. Now, is an index useful? The answer depends on whether the index is clustered.
If the index is unclustered, we could have one page I/O per qualifying employee, and this could
be more expensive than a sequential scan, even if only 10 percent of the employees qualify.
On the other hand, a clustered B+ tree index on age requires only 10 percent of the I/Os of a sequential scan.
SELECT E.dno, COUNT(*)
FROM Employees E
WHERE E.age > 10
GROUP BY E.dno
If a B+ tree index is available on age, we could retrieve tuples using it, sort the retrieved tuples on dno, and so answer the query. However, this may not be a good plan if virtually all employees are more than 10 years old.
Let us consider whether an index on dno might suit our purposes better. We could use the index to retrieve all tuples, grouped by dno, and for each dno count the number of tuples with age > 10. Again, the efficiency depends crucially on whether the index is clustered. If the index is not clustered, we could perform one page I/O per tuple in Employees, and this plan would be terrible.
Clustering is also important for an index on a search key that does not include a candidate key.
SELECT E.dno
FROM Employees E
WHERE E.hobby='Stamps'
If many people collect stamps, retrieving tuples through an unclustered index on hobby can be very
inefficient. Therefore, if such a query is important, we should consider making the index on hobby
a clustered index.
On the other hand, if we assume that eid is a key for Employees, and replace the condition E.hobby = 'Stamps' by E.eid = 552, we know that at most one Employees tuple will satisfy this
selection condition.
The next query shows how aggregate operations can influence the choice of indexes (an index on dno).
SELECT E.dno, COUNT(*)
FROM Employees E
GROUP BY E.dno
A straightforward plan for this query is to sort Employees on dno to compute the count of
employees for each dno.
However, if an index (hash or B+ tree) on dno is available, we can answer this query by scanning only the index. For each dno value, we simply count the number of data entries in the index with this value for the search key. Note that it does not matter whether the index is clustered, because we never retrieve tuples of Employees.
The search key for an index can contain several fields; such keys are called composite search keys
or concatenated keys.
As an example, consider a collection of employee records, with fields name, age, and sal, stored
in sorted order by name.
The above figure illustrates the difference between a composite index with key (age, sal), a
composite index with key (sal, age), an index with key age, and an index with key sal.
If the search key is composite, an equality query is one in which each field in the search key is
bound to a constant. For example, we can ask to retrieve all data entries with age = 20 and sal =
10, whereas in a range query not all fields in the search key are bound to constants. For example, we can ask to retrieve all data entries with age = 20; this query implies that any value is acceptable for the sal field. As another example of a range query, we can ask to retrieve all data entries with age < 30 and sal > 40.
Note that the index cannot help on the query sal > 40, because, intuitively, the index organizes
records by age first and then sal. If age is left unspecified, qualifying records could be spread
across the entire index
A composite key index can support a broader range of queries because it matches more selection
conditions. Further, since data entries in a composite index contain more information about the
data record, the opportunities for index-only evaluation strategies are increased.
On the negative side, a composite index must be updated in response to any operation (insert,
delete, or update) that modifies any field in the search key. A composite index is also likely to be
larger than a single-attribute search key index because the size of entries is larger.
Design Examples of Composite Keys
Consider the following query, which returns all employees with 20 < age < 30 and 3000 < sal <
5000:
SELECT E.eid
FROM Employees E
WHERE E.age > 20 AND E.age < 30 AND E.sal > 3000 AND E.sal < 5000
A composite index on (age, sal) could help if the conditions in the WHERE clause are fairly
selective. For this query, in which the conditions on age and sal are equally selective, a composite,
clustered B+ tree index on (age, sal) is as effective as a composite, clustered B+ tree index on (sal,
age).
SELECT E.eid
FROM Employees E
WHERE E.age = 25
In this query a composite, clustered B+ tree index on (age, sal) will give good performance because
records are sorted by age first and then (if two records have the same age value) by sal. Thus, all
records with age = 25 are clustered together. On the other hand, a composite, clustered B+ tree
index on (sal, age) will not perform as well. In this case, records are sorted by sal first, and
therefore two records with the same age value (in particular, with age = 25) may be quite far apart.
Composite indexes are also useful in dealing with many aggregate queries.
Consider:
SELECT AVG(E.sal)
FROM Employees E
WHERE E.age = 25
A composite B+ tree index on (age, sal) allows us to answer the query with an index-only scan.
The disadvantages of composite search key indexes are:
Composite search key indexes tend to have large entries. This means fewer index entries per index page and more index pages to read.
An update to any attribute of a composite index causes the index to be modified. The columns you choose should not be those that are updated often.
For example, a statement of the following form (the index name is illustrative) creates such an index:
CREATE INDEX ind_age_gpa ON Students
WITH STRUCTURE = BTREE, KEY = (age, gpa)
This specifies that a B+ tree index is to be created on the Students table using the concatenation of the age and gpa columns as the key. Thus, key values are pairs of the form (age, gpa).
Once created, the index is automatically maintained by the DBMS, which adds or removes data entries in response to inserts or deletes of records on the Students relation.
INTUITION FOR TREE INDEXES
Consider a file of Students records sorted by gpa. To answer a range selection such as "Find all
students with a gpa higher than 3.0," we must identify the first such student by doing a binary
search of the file and then scan the file from that point on. If the file is large, the initial binary
search can be quite expensive.
One idea is to create a second file with one record per page in the original (data) file, of the form
(first key on page, pointer to page), again sorted by the key attribute (which is gpa in our example).
We refer to pairs of the form (key, pointer) as index entries or just entries. Note that each index
page contains one pointer more than the number of keys: each key serves as a separator for the
contents of the pages pointed to by the pointers to its left and right.
We can do a binary search of the index file to identify the page containing the first key (gpa)
value that satisfies the range selection (in our example, the first student with gpa over 3.0) and
follow the pointer to the page containing the first data record with that key value. We can then
scan the data file sequentially from that point on to retrieve other qualifying records. This
example uses the index to find the first data page containing a Students record with gpa greater
than 3.0, and the data file is scanned from that point on to retrieve other such Students records.
Because the size of an entry in the index file (key value and page id) is likely to be much smaller
than the size of a page, and only one such entry exists per page of the data file, the index file is
likely to be much smaller than the data file; therefore, a binary search of the index file is much
faster than a binary search of the data file.
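A small Python sketch of this one-level index idea; the data pages, gpa values, and helper name are illustrative.
from bisect import bisect_right

# Data file: pages of Students records sorted by gpa (values are made up).
pages = [[2.1, 2.4, 2.8], [2.9, 3.1, 3.3], [3.4, 3.7, 3.9]]

# Index file: one (first key on page, page number) entry per data page.
index = [(page[0], page_no) for page_no, page in enumerate(pages)]

def first_page_for(gpa):
    """Binary-search the small index file instead of the large data file."""
    keys = [key for key, _ in index]
    pos = bisect_right(keys, gpa)      # first page whose first key exceeds gpa
    return max(pos - 1, 0)             # qualifying records may start one page earlier

start = first_page_for(3.0)
print('scan the data file sequentially from page', start)   # page 1 holds the first gpa > 3.0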
The potentially large size of the index file motivates the tree indexing idea: why not apply the previous step of building an auxiliary structure to the collection of index records, and so on recursively, until the smallest auxiliary structure fits on one page? This repeated construction of a one-level index leads to a tree structure with several levels of non-leaf pages.
ISAM (INDEXED SEQUENTIAL ACCESS METHOD)
The ISAM structure is completely static (except for the overflow pages). The data entries of the
ISAM index are in the leaf pages of the tree and additional overflow pages are chained to some
leaf page if there are more entries inserted into it than will fit onto a single page.
File creation: When the file is created, all leaf pages are allocated sequentially and sorted on the
search key value. The non-leaf level pages (index pages) are then allocated and then the overflow
pages.
Search: For an equality selection search, we start at the root node and determine which
subtree to search by comparing the value in the search field of the given record with the
key values in the node. For a range query, the starting point in the data (or leaf) level is
determined similarly, and data pages are then retrieved sequentially.
Example
Consider the tree shown above. All searches begin at the root. For example, to locate a
record with the key value 27, we start at the root and follow the left pointer, since 27 <
40. We then follow the middle pointer, since 20 <= 27 < 33. For a range search, we find
the first qualifying data entry as for an equality selection and then retrieve primary leaf
pages sequentially (also retrieving overflow pages as needed by following pointers from
the primary pages).
Insert: The appropriate leaf page is determined as for a search, and the record is inserted if space is available; otherwise an overflow page is allocated, the record is put there, and the overflow page is linked to the leaf.
Example
We assume that each leaf page can contain two entries. If we now insert a record with
key value 23, the entry 23* belongs in the second data page, which already contains 20*
and 27* and has no more space. We deal with this situation by adding an overflow page
and putting 23* in the overflow page. Inserting 48*, 41*, and 42* leads to an overflow
chain of two pages.
Delete: The appropriate page is determined and the record is deleted. If this entry is on an
overflow page and the overflow page becomes empty, the page can be removed. If the entry
is on a primary page and deletion makes the primary page empty, the simplest approach is
to simply leave the empty primary page; it serves as a placeholder for future insertions.
Example:
Deleting 42*, 51*, 97*
Deleting 42* makes the overflow page empty. Hence we should remove it. Deleting 51*,
97* is simple. The resultant tree after deleting 42*, 51*, and 97* is shown below.
Pros
• Relatively simple
• Great for true sequential access
Cons
• Inserts can create long overflow chains, which degrade performance
• The structure is static: not suitable for files that grow and shrink
B+ TREE
The B+ tree is a dynamic index structure and is the most widely used index. It is a multilevel index. It is a balanced tree in which the internal nodes direct the search and the leaf nodes
contain the data entries. To retrieve all leaf pages efficiently, we link them using page pointers. By
organizing them into a doubly linked list, we can easily traverse the sequence of leaf pages in either
direction.
The node pointer to the left of a key value k points to a subtree that contains only
data entries less than k. The node pointer to the right of a key value k points to a
subtree that contains only data entries greater than or equal to k.
A minimum occupancy of 50 percent is guaranteed for each node except the root
if the deletion algorithm is implemented.
Searching for a record requires just a traversal from the root to the appropriate leaf. We refer to the length of a path from the root to a leaf (any leaf, because the tree is balanced) as the height of the tree.
Every node contains m entries, where d <=m <= 2d. The value d is a parameter of the B+ tree,
called the order of the tree, and is a measure of the capacity of a tree node. The root node is the
only exception to this requirement on the number of entries; for the root, it is simply required that
1<= m <=2d.
Format of a Node
The format of a node is the same as for ISAM. Non-leaf nodes with m index entries contain m+1 pointers to children.
Pointer Pi points to a subtree in which all key values K are such that Ki <= K < Ki+1.
P0 points to a subtree in which all key values are less than K1, and Pm points to a subtree in which all key values are greater than or equal to Km.
For leaf nodes, entries are denoted as k*, as usual.
Advantages of B+ Trees
Since all records are stored only in the leaf nodes, which form a sorted, sequentially linked list, searching becomes very easy.
Using a B+ tree, we can perform range retrieval or partial retrieval. Traversing the tree structure makes this easy and quick.
As the number of records increases or decreases, the B+ tree structure grows or shrinks. There is no restriction on B+ tree size, as there is in ISAM.
Since it is a balanced tree structure, inserts, deletes, and updates do not degrade performance.
Since all the data is stored in the leaf nodes and the internal nodes have high fan-out, the height of the tree stays short. This reduces disk I/O, so B+ trees work well on secondary storage devices.
Disadvantages of B+ Trees
SEARCH
ALGORITHM
This B+ tree is of order d = 2. That is, each node contains between 2 and 4 entries. Each non-leaf entry is a (key value, node pointer) pair; at the leaf level, the entries are data records that we denote by k*. To search for entry 5*, we follow the left-most child pointer, since 5 < 13. To search for the entries 14* or 15*, we follow the second pointer, since 13 <= 14 < 17 and 13 <= 15 < 17. (We do not find 15* on the appropriate leaf and can conclude that it is not present in the tree.) To find 24*, we follow the fourth child pointer, since 24 <= 24 < 30.
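The descent just described can be sketched recursively in Python; the node layout is assumed (a dictionary with keys and child pointers, following the node format above), and the small tree is consistent with the example.
def search(node, key):
    """Descend from the root to the leaf that would contain the search key."""
    if node['leaf']:
        return node                               # data entries k* live at the leaf level
    keys, children = node['keys'], node['children']
    i = 0
    while i < len(keys) and key >= keys[i]:       # stop at the first key greater than the search key
        i += 1
    return search(children[i], key)               # follow the pointer to the left of that key

# A root with keys 13, 17, 24, 30 and five leaves, consistent with the example above.
leaves = [{'leaf': True, 'entries': e} for e in
          ([2, 3, 5, 7], [14, 16], [19, 20, 22], [24, 27, 29], [33, 34, 38, 39])]
root = {'leaf': False, 'keys': [13, 17, 24, 30], 'children': leaves}

print(search(root, 5)['entries'])    # 5 < 13: left-most leaf
print(search(root, 15)['entries'])   # 13 <= 15 < 17: second leaf, and 15* is not found there
print(search(root, 24)['entries'])   # 24 <= 24 < 30: fourth leaf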
INSERT
The algorithm for insertion takes an entry, finds the leaf node where it belongs, and inserts it there.
Usually, this procedure results in going down to the leaf node where the entry belongs, placing the
entry there, and returning all the way back to the root node. Occasionally if a node is full, it must
be split. When the node is split, an entry pointing to the node created by the split must be inserted
into its parent; this entry is pointed to by the pointer variable newchildentry. If the (old) root is
split, a new root node is created and the height of the tree increases by 1.
ALGORITHM
If we insert entry 8*, it belongs in the left-most leaf, which is already full. This insertion causes a
split of the leaf page. The tree must now be adjusted to take the new leaf page into account, so
we insert an entry consisting of the pair (5, pointer to new page) into the parent node. Note how
the key 5, which discriminates between the split leaf page and its newly created sibling, is 'copied
up.' We cannot just 'push up' 5, because every data entry must appear in a leaf page. Since the
parent node is also full, another split occurs. In general, we have to split a non-leaf node when it is full, containing 2d keys and 2d+1 pointers. The middle key is 'pushed up' the tree, in contrast to the case for a split of a leaf page.
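The difference between 'copy up' (leaf split) and 'push up' (non-leaf split) can be sketched in Python as follows; nodes are simplified to plain lists of keys, so this only illustrates where the middle key ends up.
def split_leaf(entries):
    """Leaf split: the new leaf's low key is COPIED up and also stays in the leaf."""
    mid = len(entries) // 2
    return entries[:mid], entries[mid:], entries[mid]       # old leaf, new leaf, key copied up

def split_non_leaf(keys):
    """Non-leaf split: the middle key is PUSHED up and appears in neither half."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid + 1:], keys[mid]             # left node, right node, key pushed up

print(split_leaf([2, 3, 5, 7, 8]))          # ([2, 3], [5, 7, 8], 5): 5 is copied up
print(split_non_leaf([5, 13, 17, 24, 30]))  # ([5, 13], [24, 30], 17): 17 is pushed up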
Step 1
Step 2
Step 3
Variation of the insert algorithm (method 2)
This method is usually not used as it increases I/O, especially if we check both siblings.
One variation of the insert algorithm tries to redistribute entries of a node N with a sibling before
splitting the node; this improves average occupancy. The sibling of a node N, in this context, is a
node that is immediately to the left or right of N and has the same parent as N.
Reconsider insertion of entry 8* into the tree shown above. The entry belongs in the left-most leaf,
which is full. However, the (only) sibling of this leaf node contains only two entries and can thus
accommodate more entries. We can therefore handle the insertion of 8* with a redistribution. Note
how the entry in the parent node that points to the second leaf has a new key value; we 'copy up'
the new low key value on the second leaf.
DELETE
The algorithm for deletion begins by finding the leaf node where the entry belongs, and then
deleting it. The basic idea behind the algorithm is that we recursively delete the entry by calling
the delete algorithm on the appropriate child node. We usually go down to the leaf node where the
entry belongs, remove the entry from there, and return all the way back to the root node.
Occasionally a node is at minimum occupancy before the deletion, and the deletion causes it to
go below the occupancy threshold. When this happens, we must either redistribute entries from an adjacent sibling or merge the node with a sibling to maintain minimum occupancy. If entries are redistributed between two nodes, their parent node must be updated to reflect this: the key value in the index entry pointing to the second node must be changed to be the lowest search key in the second node. If two nodes are merged, their parent must be updated to reflect this by deleting the index entry for the second node; this index entry is pointed to by the pointer variable oldchildentry when the delete call returns to the parent node. If the last entry in the root node is deleted in this manner because one of its children was deleted, the height of the tree decreases by 1.
ALGORITHM
To delete entry 19*, we simply remove it from the leaf page on which it appears, and we are
done because the leaf still contains two entries.
If we subsequently delete 20*, however, the leaf contains only one entry after the deletion. The
(only) sibling of the leaf node that contained 20* has three entries, and we can therefore deal with
the situation by redistribution; we move entry 24* to the leaf page that contained 20* and copy up
the new splitting key (27, which is the new low key value of the leaf from which we borrowed
24*) into the parent.
Suppose that we now delete entry 24*. The affected leaf contains only one entry (22*) after the
deletion, and the (only) sibling contains just two entries (27* and 29*). Therefore, we cannot
redistribute entries. However, these two leaf nodes together contain only three entries and can be
merged. While merging, we can 'toss' the entry (27, pointer to the second leaf page) in the parent, which pointed to the second leaf page, because the second leaf page is empty after the merge and can be discarded.
Deleting the entry 27 (from the non-leaf page) has created a non-leaf-level page with just one entry, which is below the minimum of d = 2. To fix this problem, we must either redistribute or merge.
In either case, we must fetch a sibling. The only sibling of this node contains just two entries (with
key values 5 and 13), and so redistribution is not possible. We must therefore merge. Together, the
nonleaf node and the sibling to be merged contain only three entries, and they have a total of five
pointers to leaf nodes. To merge the two nodes, we also need to pull down the
index entry in their parent that currently discriminates between these nodes. This index entry has
key value 17, and so we create a new entry (17, left-most child pointer in sibling). Now we have
a total of four entries and five child pointers, which can fit on one page in a tree of order d = 2.
Note that pulling down the splitting key 17 means that it will no longer appear in the parent node
following the merge. After we merge the affected non-leaf node and its sibling by putting all the
entries on one page and discarding the empty sibling page, the new node is the only child of the
old root, which can therefore be discarded.
The above examples illustrated redistribution of entries across leaves and merging of both leaf-
level and non-leaf-level pages. The remaining case is that of redistribution of entries between non-
leaf-level pages. To understand this case, consider the following tree
When we delete 24* from the tree (the leaf nodes will merge as 22, 27, 29 hence resulting in
deletion of 27 from non leaf node) the non-leaf level node now contains key value 30 and has a
sibling that can spare entries (the entries with key values 17 and 20). We move these entries over
from the sibling.
A B+ Tree during a Deletion
Note that, in doing so, we essentially push them through the splitting entry in their parent node
(the root), which takes care of the fact that 17 becomes the new low key value on the right and
therefore must replace the old splitting key in the root (the key value 22).
Note: Intuitively, entries are redistributed by pushing through the splitting entry in the
parent node. It suffices to redistribute only the index entry with key 20, but we have redistributed 17 as well, just for illustration.
Example problem:
INSERTION
DELETION
ISAM AND B+ TREE COMPARISON
ISAM | B+ TREE
ISAM is a static structure; the number of leaf nodes is fixed. | The B+ tree is a dynamic structure; the number of leaf nodes changes.
May use overflow pages. | Overflow pages are not used in B+ trees.
Long overflow chains lead to poor performance. | The most widely used index in database management systems because of its better performance.
Not suitable for files that grow and shrink. | Suitable for files that grow and shrink.
The leaf pages are assumed to be allocated sequentially, so no pointer is needed in a leaf node to point to the next leaf node. | All the leaf nodes are linked to each other via pointers.
INDEXING | HASHING
Indexing allows us to efficiently retrieve all records that satisfy search conditions on the search key fields of the index. | Hashing is used to calculate the direct location of a data record on the disk without using an index structure.
Uses data references that hold the address of the disk block. | Uses a hash function to calculate the direct location of a record on the disk.
Does not work well for large databases. | Works well for large databases.
To perform the indexing we need a primary key on the table with a unique value. | The hash function can select any column value to generate the address.
Supports both equality and range searches. | Supports only equality searches.
The disadvantage of a sequential indexed file is that we must access an index to locate data. | Hashing allows us to avoid accessing an index, as the address of a record is directly computed by the hash function.
Performance degrades as the file grows. | A bad hash function may result in a lookup taking time proportional to the number of search keys in the file.
HASH BASED INDEX VS. TREE BASED INDEX
HASH BASED INDEX | TREE BASED INDEX
The data entries are arranged in random order. | The data entries are arranged in sorted order by search key value.
A hash function is used to calculate the direct location of a data record. | A hierarchical search data structure is maintained that directs searches to the correct page of data entries.
Hash based indexing techniques are optimized only for equality selections and fare poorly on range selections, where they are typically worse than scanning the entire file of records. | Tree-based indexing techniques support both kinds of selection conditions efficiently, explaining their widespread use.
Does not require intermediate page fetches for internal nodes. | Requires intermediate page fetches for internal nodes.
The hash function can select any column value to generate the address, but most of the time uses the primary key. | Uses only the primary key on the table, with a unique value.
A bad hash function may result in a lookup taking time proportional to the number of search keys in the file. | Performance degrades as the file grows.
Static and dynamic hashing techniques are forms of it. | ISAM and B+ trees exist, with trade-offs similar to static and dynamic hashing respectively.