DBMS-Unit 4
RAID – File Organization – Organization of Records in Files – Data dictionary Storage – Column
Oriented Storage– Indexing and Hashing –Ordered Indices – B+ tree Index Files – B tree Index Files –
Static Hashing – Dynamic Hashing – Query Processing Overview – Algorithms for Selection, Sorting
and join operations – Query optimization using Heuristics - Cost Estimation.
1. RAID
RAID (redundant array of independent disks; originally redundant array of inexpensive disks) is a
way of storing the same data in different places on multiple hard disks to protect data in the case of a
drive failure.
RAID: Redundant Arrays of Independent Disks
Disk organization techniques that manage a large number of disks, providing a view of a
single disk of high capacity and high speed by using multiple disks in parallel, and high reliability by
storing data redundantly, so that data can be recovered even if a disk fails.
• RAID levels have the following characteristics:
• It contains a set of physical disk drives.
• In this technology, the operating system views these separate disks as a single logical disk.
• In this technology, data is distributed across the physical drives of the array.
• Redundancy disk capacity is used to store parity information.
• In case of disk failure, the parity information can be used to help recover the data.
RAID LEVEL 0:
This configuration uses striping across disks, with no redundancy.
Advantages
• I/O performance is greatly improved by spreading the I/O load across many channels &
drives.
• Best performance is achieved when data is striped across multiple controllers with only one
drive per controller.
Disadvantages
• It is not fault-tolerant, failure of one drive will result in all data in an array being lost
RAID LEVEL 2:
This configuration uses striping across disks, with some disks storing error checking and correcting
(ECC) information. It has no advantage over RAID 3 and is no longer used.
• Each bit of data word is written to a data disk drive (4 in this example: 0 to 3).
• Each data word has its Hamming Code ECC word recorded on the ECC disks.
• On Read, the ECC code verifies correct data or corrects single disk errors.
Advantages-
• 'On the fly' data error correction
• Extremely high data transfer rates possible
• The higher the data transfer rate required, the better the ratio of data disks to ECC disks.
Disadvantages-
• Very high ratio of ECC disks to data disks with smaller word sizes
• Entry level cost very high
• Requires a very high transfer rate to justify; no commercial implementations exist
RAID LEVEL 3:
This configuration uses byte-level striping with a dedicated parity disk.
Disadvantages
• Transaction rate equal to that of a single disk drive at best (if spindles are synchronized).
• Controller design is fairly complex.
RAID LEVEL 4:
This configuration uses block-level striping with a dedicated parity disk.
Disadvantages
• Quite complex controller design
• Worst Write transaction rate and Write aggregate transfer rate
• Difficult and inefficient data rebuild in the event of disk failure
• Block Read transfer rate equal to that of a single disk
RAID LEVEL 5:
• RAID 5 uses striping as well as parity for redundancy. It is well suited for heavy read and
low write operations.
• Block-Interleaved Distributed Parity; partitions data and parity among all N + 1 disks,
rather than storing data in N disks and parity in 1 disk.
RAID LEVEL 6:
• This technique is similar to RAID 5, but includes a second parity scheme that is distributed
across the drives in the array. The use of additional parity allows the array to continue to
function even if two disks fail simultaneously. However, this extra protection comes at a cost.
• P+Q Redundancy scheme; similar to Level 5, but stores extra redundant information to guard
against multiple disk failures.
• Better reliability than Level 5 at a higher cost; not used as widely.
FILE ORGANIZATION
The database is stored as a collection of files.
o Each file is a sequence of records.
o A record is a sequence of fields.
Classifications of records
o Fixed length record
o Variable length record
(i) Fixed length record approach:
Assume the record size is fixed, each file has records of one particular type only, and different
files are used for different relations.
o Simple approach
o Record access is simple
Example pseudocode:
type
account = record
account_number char(10);
branch_name char(22);
balance numeric(8);
end
Total bytes 40 for a record
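The fixed-length layout above can be sketched with Python's struct module (a minimal illustration: the field widths follow the pseudocode, and balance is kept as a digit string rather than a packed numeric):

```python
import struct

# 10-byte account_number, 22-byte branch_name, 8-byte balance -> 40 bytes,
# matching the pseudocode record above.
RECORD_FMT = "10s22s8s"
RECORD_SIZE = struct.calcsize(RECORD_FMT)   # 40

def pack_record(account_number, branch_name, balance):
    # Space-pad each field to its fixed width before packing.
    return struct.pack(RECORD_FMT,
                       account_number.encode().ljust(10),
                       branch_name.encode().ljust(22),
                       balance.encode().ljust(8))

def unpack_record(raw):
    fields = struct.unpack(RECORD_FMT, raw)
    return tuple(f.decode().rstrip() for f in fields)
```

Because every record is exactly RECORD_SIZE bytes, record i of a file starts at byte offset i * RECORD_SIZE, which is what makes record access simple.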
Two problems:
- It is difficult to delete a record from this structure.
- Some records will cross block boundaries, that is, part of the record will be stored in one
block and part in another. It would then require two block accesses to read or write the record.
Alternatives for reusing the free space when record i is deleted:
– Move records i + 1, . . . , n to i, . . . , n – 1.
– Do not move records, but link all free records on a free list.
– Move the final record to the deleted record's place.
Free Lists
Store the address of the first deleted record in the file header.
Use this first record to store the address of the second deleted record, and so on
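The free-list idea can be sketched in memory as follows; slot contents and class names are illustrative (a real implementation would store the links inside the fixed-length slots on disk):

```python
# The "file header" holds the index of the first deleted slot, and each
# deleted slot holds the index of the next one, forming a free list.
class FixedRecordFile:
    def __init__(self):
        self.slots = []        # each slot: a record, or a link when free
        self.free_head = None  # header: first deleted slot, if any

    def insert(self, record):
        if self.free_head is None:
            self.slots.append(record)        # no free slot: grow the file
            return len(self.slots) - 1
        slot = self.free_head
        self.free_head = self.slots[slot]    # follow the free-list chain
        self.slots[slot] = record
        return slot

    def delete(self, slot):
        self.slots[slot] = self.free_head    # link freed slot into the list
        self.free_head = slot
```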
Organization of Records in Files
Sequential file organization:
Suppose there is a preexisting sorted sequence of four records R1, R3, and so on up to R6 and R7.
Suppose a new record R2 has to be inserted in the sequence; then it will be inserted at the end of
the file, and then the sequence will be sorted again.
Heap file organization:
If we want to search, update or delete the data in heap file organization, then we need to
traverse the data from the start of the file till we get the requested record.
If the database is very large then searching, updating or deleting a record will be time-
consuming because there is no sorting or ordering of records. In heap file organization, we
need to check all the data until we get the requested record.
INDEXING
An index entry consists of a search-key value and a pointer to the records with that value.
Index files are typically much smaller than the original file.
Indexing techniques are evaluated on the following factors:
• Access time: The time it takes to find a particular data item, or set of items, using the technique in
question.
• Insertion time: The time it takes to insert a new data item. This value includes the time it takes to
find the correct place to insert the new data item, as well as the time it takes to update the index structure.
• Deletion time: The time it takes to delete a data item. This value includes the time it takes to find the
item to be deleted, as well as the time it takes to update the index structure.
• Space overhead: The additional space occupied by an index structure. Provided that the amount of
additional space is moderate, it is usually worthwhile to sacrifice the space to achieve improved
performance.
Ordered Indices:
Each index structure is associated with a particular search key. An ordered index stores the
values of the search keys in sorted order, and associates with each search key the records that contain
it. A file may have several indices, on different search keys. If the file containing the records is
sequentially ordered, a clustering index is an index, whose search key also defines the sequential order
of the file. Clustering indices are also called primary indices. The search key of a clustering index is
often the primary key. Indices whose search key specifies an order different from the sequential order
of the file are called nonclustering indices, or secondary indices.
Index-sequential file: ordered sequential file with a primary index.
Sparse index: In a sparse index, an index entry appears for only some of the search-key values. Sparse
indices can be used only if the relation is stored in sorted order of the search key, that is, if the index is
a clustering index. To locate a record, we find the index entry with the largest search-key value that is
less than or equal to the search-key value for which we are looking.
Figure: Sparse index.
• Less space and less maintenance overhead for insertions and deletions.
• Generally slower than dense index for locating records.
• Good tradeoff: sparse index with an index entry for every block in file, corresponding to
least search-key value in the block.
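The lookup rule above (find the largest index entry less than or equal to the search key) can be sketched as a binary search over the index entries; the block numbers and branch names here are illustrative:

```python
import bisect

# One sparse-index entry per block: the least search-key value in the block.
index_keys = ["Brighton", "Mianus", "Redwood"]   # sorted least keys
block_ids  = [0, 1, 2]                           # block each entry points to

def locate_block(search_key):
    # Largest index entry whose key is <= search_key.
    pos = bisect.bisect_right(index_keys, search_key) - 1
    if pos < 0:
        return None   # search key precedes every indexed block
    return block_ids[pos]
```

The sparse index only narrows the search to one block; the record itself must then be found by scanning within that block.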
Multilevel Indices
Indices with two or more levels are called multilevel indices. Searching for records with a
multilevel index requires significantly fewer I/O operations than does searching for records by binary
search.
• If primary index does not fit in memory, access becomes expensive.
• To reduce number of disk accesses to index records, treat primary index kept on disk as a
sequential file and construct a sparse index on it.
o outer index – a sparse index of primary index
o inner index – the primary index file
• If even outer index is too large to fit in main memory, yet another level of index can be
created, and so on.
• Indices at all levels must be updated on insertion or deletion from the file.
Insertion.
Dense indices:
1. If the search-key value does not appear in the index, the system inserts an index entry with the
search-key value in the index at the appropriate position.
2. Otherwise the following actions are taken:
a. If the index entry stores pointers to all records with the same search key value, the system
adds a pointer to the new record in the index entry.
b. Otherwise, the index entry stores a pointer to only the first record with the search-key
value. The system then places the record being inserted after the other records with the same
search-key values.
Sparse indices: We assume that the index stores an entry for each block. If the system creates a new
block, it inserts the first search-key value appearing in the new block into the index. On the other hand,
if the new record has the least search-key value in its block, the system updates the index entry pointing
to the block; if not, the system makes no change to the index.
Deletion.
Dense indices:
1. If the deleted record was the only record with its particular search-key value, then the system
deletes the corresponding index entry from the index.
2. Otherwise the following actions are taken:
a. If the index entry stores pointers to all records with the same search key value, the system deletes
the pointer to the deleted record from the index entry.
b. Otherwise, the index entry stores a pointer to only the first record with the search-key value. In this
case, if the deleted record was the first record with the search-key value, the system updates the index
entry to point to the next record.
Sparse indices:
1. If the index does not contain an index entry with the search-key value of the deleted record, nothing
needs to be done to the index.
2. Otherwise the system takes the following actions:
a. If the deleted record was the only record with its search key, the system replaces the corresponding
index record with an index record for the next search-key value (in search-key order). If the next search-
key value already has an index entry, the entry is deleted
instead of being replaced.
b. Otherwise, if the index entry for the search-key value points to the record being deleted, the system
updates the index entry to point to the next record with the same search-key value.
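The dense-index maintenance rules above (variant (a), where the index entry stores pointers to all records with the search-key value) can be sketched with an in-memory dictionary; names are illustrative:

```python
from collections import defaultdict

# Dense index: search-key value -> list of record pointers.
dense_index = defaultdict(list)

def index_insert(key, rid):
    # New search-key value creates an entry; an existing one gains a pointer.
    dense_index[key].append(rid)

def index_delete(key, rid):
    dense_index[key].remove(rid)
    if not dense_index[key]:   # last record with this search-key value:
        del dense_index[key]   # delete the index entry itself
```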
Secondary Indices:
Secondary indices must be dense, with an index entry for every search-key value, and a pointer
to every record in the file. A secondary index on a candidate key looks just like a dense clustering index,
except that the records pointed to by successive values in the index are not stored sequentially. In
general, however, secondary indices may have a different structure from clustering indices. If the search
key of a clustering index is not a candidate key, it suffices if the index points to the first record with a
particular value for the search key, since the other records can be fetched by a sequential scan of the
file.
In contrast, if the search key of a secondary index is not a candidate key, it is not enough to
point to just the first record with each search-key value. The remaining records with the same search-
key value could be anywhere in the file, since the records are ordered by the search key of the clustering
index, rather than by the search key of the secondary index. Therefore, a secondary index must contain
pointers to all the records.
Advantages:
Secondary indices improve the performance of queries that use keys other than the search key of the
clustering index.
B+-Tree Index Files
Defn: A B+-tree index takes the form of a balanced tree in which every path from the root of the tree
to a leaf of the tree is of the same length. Each nonleaf node in the tree has between ⌈n/2⌉ and n children,
where n is fixed for a particular tree.
Structure of a B+-Tree:
Figure shows a typical node of a B+-tree. It contains up to n − 1 search-key values K1, K2, . . . ,
Kn−1, and n pointers P1, P2, . . . , Pn. The search-key values within a node are kept in sorted order;
thus, if i < j, then Ki < Kj.
Non-Leaf Nodes:
Non leaf nodes form a multi-level sparse index on the leaf nodes.
For a non-leaf node with m pointers:
▪ All the search keys in the subtree to which P1 points are less than K1.
▪ For 2 ≤ i ≤ m − 1, all the search keys in the subtree to which Pi points have
values greater than or equal to Ki−1 and less than Ki.
▪ All the search keys in the subtree to which Pm points are greater than or equal to Km−1.
• The number of pointers in a node is called the fanout of the node. Nonleaf nodes are also
referred to as internal nodes.
Example of a B+-tree
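B+-tree search follows the non-leaf rule just described: in a node with keys K1..Km−1 and pointers P1..Pm, follow Pi where Ki−1 ≤ key < Ki. A minimal sketch (the Node class is illustrative, not a complete B+-tree):

```python
import bisect

class Node:
    def __init__(self, keys, children=None, records=None):
        self.keys = keys
        self.children = children   # non-leaf node: list of child nodes
        self.records = records     # leaf node: records aligned with keys

def search(node, key):
    while node.children is not None:
        # Number of keys <= key selects the correct child pointer:
        # key < K1 -> P1, K(i-1) <= key < Ki -> Pi, key >= Km-1 -> Pm.
        i = bisect.bisect_right(node.keys, key)
        node = node.children[i]
    j = bisect.bisect_left(node.keys, key)     # search within the leaf
    if j < len(node.keys) and node.keys[j] == key:
        return node.records[j]
    return None
```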
Deletion:
1. Find the record to be deleted, and remove it from the main file and from the bucket (if present)
2. Remove (search-key value, pointer) from the leaf node if there is no bucket or if the bucket has
become empty
3. If the node has too few entries due to the removal, and the entries in the node and a sibling fit
into a single node, then
o Insert all the search-key values in the two nodes into a single node (the one on the left),
and delete the other node.
o Delete the pair (Ki–1, Pi), where Pi is the pointer to the deleted node, from its parent,
recursively using the above procedure.
• Otherwise, if the node has too few entries due to the removal, but the entries in the node and a
sibling do not fit into a single node, then
o Redistribute the pointers between the node and a sibling such that both have more than
the minimum number of entries.
o Update the corresponding search-key value in the parent of the node.
• The node deletions may cascade upwards till a node which has ⌈n/2⌉ or more pointers is found.
If the root node has only one pointer after deletion, it is deleted and the sole child becomes the
root.
Fig: Before and after deleting "Downtown"
B-Tree Index Files:
B-tree indices are similar to B+-tree indices. The primary distinction between the two
approaches is that a B-tree eliminates the redundant storage of search-key values. Search keys in nonleaf
nodes appear nowhere else in the B-tree; an additional pointer field for each search key in a nonleaf
node must be included.
Figure Typical nodes of a B-tree. (a) Leaf node. (b) Nonleaf node.
HASHING
Static Hashing:
One disadvantage of sequential file organization is that we must access an index structure to
locate data, or must use binary search, and that results in more I/O operations. File organizations based
on the technique of hashing allow us to avoid accessing an index structure. Hashing also provides a
way of constructing indices.
1. A bucket is a unit of storage containing one or more records (a bucket is typically a disk block).
2. In a hash file organization we obtain the bucket of a record directly from its search-key value
using a hash function.
3. Hash function h is a function from the set of all search-key values K to the set of all bucket
addresses B.
4. Hash function is used to locate records for access, insertion as well as deletion.
5. Records with different search-key values may be mapped to the same bucket; thus entire bucket
has to be searched sequentially to locate a record.
Hash file organization of account file, using branch-name as key.
a) There are 10 buckets,
b) The binary representation of the ith character is assumed to be the integer i.
c) The hash function returns the sum of the binary representations of the characters modulo 10
➢ E.g., h(Perryridge) = 5, h(Round Hill) = 3, h(Brighton) = 3
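A sketch of the hash function described above: the "binary representation" of the i-th letter of the alphabet is taken to be the integer i (a = 1, ..., z = 26; non-letters are skipped here, an assumption for names like "Round Hill"), and the function returns the sum modulo the 10 buckets:

```python
NUM_BUCKETS = 10

def h(branch_name):
    # Sum the alphabet positions of the letters, modulo the bucket count.
    return sum(ord(c) - ord('a') + 1
               for c in branch_name.lower() if c.isalpha()) % NUM_BUCKETS
```

With this scheme h("Perryridge") = 5 and h("Round Hill") = h("Brighton") = 3, matching the example values above.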
Hash function:
The worst possible hash function maps all search-key values to the same bucket. Such a function is
undesirable because all the records have to be kept in the same bucket. An ideal hash function distributes
the stored keys uniformly across all the buckets, so that every bucket has the same number of records.
The distribution is uniform: That is, the hash function assigns each bucket the same number of search-
key values from the set of all possible search-key values.
The distribution is random: That is, in the average case, each bucket will have nearly the same number
of values assigned to it, regardless of the actual distribution of search-key values.
Typical hash functions perform computation on the internal binary representation of the search-
key. For example, for a string search-key, the binary representations of all the characters in the string
could be added and the sum modulo the number of buckets could be returned.
Handling of Bucket Overflows:
If the bucket does not have enough space, a bucket overflow is said to occur. Bucket overflow
can occur for several reasons:
➢ Insufficient buckets
➢ Skew in distribution of records. This can occur due to two reasons:
o multiple records have the same search-key value
o the chosen hash function produces a non-uniform distribution of key values
Although the probability of bucket overflow can be reduced, it cannot be eliminated; it is handled by
using overflow buckets.
There are two types of hashing
1. Closed hashing (also called overflow chaining): the overflow buckets of a given bucket
are chained together in a linked list.
2. Open hashing: The set of buckets is fixed, and there are no overflow chains. Instead, if a bucket
is full, the system inserts records in some other bucket in the initial set of buckets B. It is not
suitable for database applications.
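Closed hashing with overflow chaining can be sketched in a few lines; the bucket count, capacity, and use of integer keys are illustrative choices (a real bucket would be a disk block):

```python
B = 4          # number of primary buckets
CAPACITY = 2   # records per primary bucket

buckets  = [[] for _ in range(B)]   # primary buckets
overflow = [[] for _ in range(B)]   # chained overflow records per bucket

def insert(key, record):
    b = hash(key) % B
    # Full primary bucket: record goes to that bucket's overflow chain.
    target = buckets[b] if len(buckets[b]) < CAPACITY else overflow[b]
    target.append((key, record))

def lookup(key):
    # The whole bucket (and its chain) must be scanned, since different
    # keys may hash to the same bucket.
    b = hash(key) % B
    return [r for k, r in buckets[b] + overflow[b] if k == key]
```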
Hash Indices:
Hashing can be used not only for file organization, but also for index-structure creation.
Defn: A hash index organizes the search keys, with their associated pointers, into a hash file structure.
Hash indices are always secondary indices
o if the file itself is organized using hashing, a separate primary hash index on it using the
same search-key is unnecessary.
o However, we use the term hash index to refer to both secondary index structures and
hash organized files.
Deficiencies of Static Hashing
In static hashing, function h maps search-key values to a fixed set B of bucket addresses.
➢ Databases grow with time. If the initial number of buckets is too small, performance will
degrade due to too many overflows.
➢ If file size at some point in the future is anticipated and number of buckets allocated
accordingly, significant amount of space will be wasted initially.
➢ If database shrinks, again space will be wasted.
➢ One option is periodic re-organization of the file with a new hash function, but it is very
expensive.
These problems can be avoided by using techniques that allow the number of buckets to be modified
dynamically.
Dynamic Hashing:
Defn: Dynamic hashing techniques allow the hash function to be modified dynamically to
accommodate the growth or shrinkage of the database.
Extendable hashing – one form of dynamic hashing
➢ Hash function generates values over a large range, typically b-bit integers, with b =
32.
➢ At any time use only a prefix of the hash function to index into a table of bucket
addresses.
➢ Let the length of the prefix be i bits, 0 ≤ i ≤ 32.
➢ Bucket address table size = 2^i. Initially i = 0.
➢ Value of i grows and shrinks as the size of the database grows and shrinks.
➢ Multiple entries in the bucket address table may point to a bucket.
➢ Thus, the actual number of buckets is ≤ 2^i.
The number of buckets also changes dynamically due to coalescing and
splitting of buckets.
- Each bucket j stores a value ij; all the entries that point to the same bucket have the same
values on the first ij bits.
- To locate the bucket containing search-key Kj:
1. Compute h(Kj) = X
2. Use the first i high order bits of X as a displacement into bucket address table, and follow the
pointer to appropriate bucket
To insert a record with search-key value Kj
➢ follow same procedure as look-up and locate the bucket, say j.
➢ If there is room in the bucket j insert record in the bucket.
➢ Else the bucket must be split and insertion re-attempted
To split a bucket j when inserting record with search-key value Kj
1. If i > ij (more than one pointer to bucket j):
➢ allocate a new bucket z, and set ij and iz to the old ij + 1.
➢ make the second half of the bucket address table entries pointing to j point to z.
➢ remove and reinsert each record in bucket j.
➢ recompute the new bucket for Kj and insert the record in the bucket (further splitting is
required if the bucket is still full).
2. If i = ij (only one pointer to bucket j)
➢ increment i and double the size of the bucket address table.
➢ replace each entry in the table by two entries that point to the same bucket.
➢ recompute new bucket address table entry for Kj
Now i > ij so use the first case above.
When inserting a value, if the bucket is full after several splits (that is, i reaches some limit b) create
an overflow bucket instead of splitting bucket entry table further.
To delete a key value,
➢ locate it in its bucket and remove it.
➢ The bucket itself can be removed if it becomes empty (with appropriate updates to the
bucket address table).
➢ Coalescing of buckets can be done (can coalesce only with a “buddy” bucket having
same value of ij and same ij –1 prefix, if it is present)
➢ Decreasing bucket address table size is also possible
➢ Note: decreasing bucket address table size is an expensive operation and should
be done only if the number of buckets becomes much smaller than the size of the
table.
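The insertion and splitting steps above can be sketched compactly. This is an illustrative toy, not production code: it treats the key itself as the hash value, indexes the table by the low-order i bits instead of the high-order prefix described above (equivalent in spirit), uses buckets of size 2, and omits overflow buckets and coalescing:

```python
BUCKET_SIZE = 2

class Bucket:
    def __init__(self, depth):
        self.depth = depth            # local depth i_j
        self.keys = []

class ExtendableHash:
    def __init__(self):
        self.i = 0                    # global depth
        self.table = [Bucket(0)]      # bucket address table, size 2^i

    def _index(self, key):
        return key & ((1 << self.i) - 1)   # i bits of the hash value

    def insert(self, key):
        b = self.table[self._index(key)]
        if len(b.keys) < BUCKET_SIZE:
            b.keys.append(key)
            return
        if b.depth == self.i:              # only one pointer to b:
            self.table = self.table + self.table   # double the table
            self.i += 1
        b.depth += 1                       # split b into b and new bucket z
        z = Bucket(b.depth)
        for j in range(len(self.table)):   # redirect half of b's entries
            if self.table[j] is b and (j >> (b.depth - 1)) & 1:
                self.table[j] = z
        old, b.keys = b.keys + [key], []   # remove and reinsert records
        for k in old:
            self.insert(k)

    def lookup(self, key):
        return key in self.table[self._index(key)].keys
```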
Use of Extendable Hash Structure: Example
QUERY PROCESSING
A query expressed in a high-level query language such as SQL must first be scanned, parsed, and
validated. The scanner identifies the language tokens, such as SQL keywords, attribute names, and
relation names, whereas the parser checks the query syntax. The query must also be validated, by
checking that all attribute and relation names are valid.
An internal representation of the query is then created, usually as a tree data structure called a query
tree. It is also possible to represent the query using a graph data structure called a query graph.
The DBMS then selects an execution strategy to execute the query.
Defn: Query Optimization
A query typically has many possible execution strategies, and the process of choosing a suitable one
for processing a query is known as query optimization.
The query optimizer module has the task of producing an execution plan, and the code
generator generates the code to execute that plan. The runtime database processor has the task of
running the query code, whether in compiled or interpreted mode, to produce the query result.
QUERY OPTIMIZATION
There are two main techniques for implementing query optimization.
1. Heuristic rules for ordering the operations in a query execution strategy
A heuristic is a rule that works well in most cases but is not guaranteed to work well in every
possible case. The rules typically reorder the operations in a query tree.
2. Systematically estimating the cost of different execution strategies
The optimizer estimates the cost of each candidate strategy and selects the one with the lowest
cost estimate.
These two techniques are usually combined in a query optimizer.
TRANSLATING SQL QUERIES INTO RELATIONAL ALGEBRA
An SQL query is first translated into an equivalent extended relational algebra expression represented
as a query tree data structure.
SQL queries are decomposed into query blocks. Nested queries within a query are identified as separate
query blocks.
Consider the following SQL query on the EMPLOYEE relation:
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > (SELECT MAX (SALARY) FROM EMPLOYEE WHERE DNO=5);
This query includes a nested sub query and hence would be decomposed into two blocks.
The inner block is
(SELECT MAX (SALARY) FROM EMPLOYEE WHERE DNO=5)
and the outer block is
SELECT LNAME, FNAME FROM EMPLOYEE WHERE SALARY > C
where C represents the result returned from the inner block. The inner block could be
translated into the extended relational algebra expression
ℱMAX SALARY (σDNO=5 (EMPLOYEE))
and the outer block into the expression
πLNAME, FNAME (σSALARY>C(EMPLOYEE))
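The two-block evaluation can be mirrored in a few lines of Python over an in-memory EMPLOYEE relation (the sample rows are invented for illustration):

```python
EMPLOYEE = [
    {"LNAME": "Smith", "FNAME": "John",     "SALARY": 30000, "DNO": 5},
    {"LNAME": "Wong",  "FNAME": "Franklin", "SALARY": 40000, "DNO": 5},
    {"LNAME": "Borg",  "FNAME": "James",    "SALARY": 55000, "DNO": 1},
]

# Inner block: F MAX(SALARY) (sigma DNO=5 (EMPLOYEE))
C = max(e["SALARY"] for e in EMPLOYEE if e["DNO"] == 5)

# Outer block: pi LNAME,FNAME (sigma SALARY>C (EMPLOYEE))
result = [(e["LNAME"], e["FNAME"]) for e in EMPLOYEE if e["SALARY"] > C]
```

The optimizer evaluates the inner block once, binds its result to C, and then evaluates the outer block against that constant.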
(OP2): σDNUMBER>5(DEPARTMENT)
(OP3): σDNO=5(EMPLOYEE)