L4 Indexing
Accessing a Disk Page
Index files are typically much smaller than the original file
Two basic kinds of indices:
Ordered indices: search keys are stored in sorted order
Hash indices: search keys are distributed uniformly
across “buckets” using a “hash function”.
06/05/2025
Index Evaluation Metrics
Ordered Indices
In an ordered index, index entries are stored sorted on
the search key value. E.g., author catalog in library.
Primary index: in a sequentially ordered file, the index
whose search key specifies the sequential order of the file.
Also called clustering index
The search key of a primary index is usually but not necessarily the primary key.
Dense Index Files
Sparse Index Files
To locate a record with search-key value K we:
Find index record with largest search-key value ≤ K
Search file sequentially starting at the record to which the index record points
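The lookup steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the `blocks` data, the one-entry-per-block index layout, and the names `index_keys` and `lookup` are all assumptions made for the example.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (key, value) records, sorted on the key.
blocks = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]

# Assumed sparse index: one entry per block, holding the block's least key.
index_keys = [block[0][0] for block in blocks]


def lookup(k):
    """Return the record with search key k, or None if it is absent."""
    # Step 1: find the index entry with the largest search-key value <= k.
    pos = bisect_right(index_keys, k) - 1
    if pos < 0:
        return None  # k is smaller than every key in the file
    # Step 2: scan the file sequentially, starting at the block pointed to.
    for block in blocks[pos:]:
        for key, value in block:
            if key == k:
                return (key, value)
            if key > k:
                return None  # passed the position where k would be stored
    return None
```

For example, `lookup(50)` starts its scan at the second block because 40 is the largest indexed key not exceeding 50.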
Sparse Index Files (Cont.)
Good tradeoff: sparse index with an index entry for
every block in file, corresponding to least search-key
value in the block.
Multilevel Index
Multilevel Index (Cont.)
Index Classification
Summary
Primary vs. secondary: whether or not the search key
specifies the sequential order of the file.
Clustered vs. unclustered: whether or not the order of data
records is the same as the order of data entries.
Dense vs. sparse: whether or not there is an entry in the
index for each key value.
Single-level vs. multilevel: whether or not an index is built on the index itself.
Hash-Based Indexes
Good for equality selections.
Index is a collection of buckets. Bucket = primary
page plus zero or more overflow pages.
Hashing function h: h(r) = bucket in which
record r belongs. h looks at the search key fields
of r.
Buckets may contain the data records or just
the rids.
Hash-based indexes are best for equality
selections; they cannot support range searches.
So what is the difference between hashing and
indexing?
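A small Python sketch may help show why a hash index suits equality selections but not range searches. Everything here is assumed for illustration: the bucket count, the modulo hash function `h`, and the use of plain lists in place of primary and overflow pages.

```python
NUM_BUCKETS = 4  # assumed number of buckets


def h(key):
    # Toy hash function on an integer search key (an assumption).
    return key % NUM_BUCKETS


# Each bucket is modelled as one list; a real bucket would be a primary
# page plus zero or more overflow pages.
buckets = [[] for _ in range(NUM_BUCKETS)]


def insert(key, rid):
    # Store (key, rid) in the bucket chosen by the hash function.
    buckets[h(key)].append((key, rid))


def find(key):
    # Equality selection: only one bucket has to be examined.
    return [rid for k, rid in buckets[h(key)] if k == key]


def range_find(lo, hi):
    # Range search: hashing scatters neighbouring keys, so every bucket
    # must be scanned; the index gives no help here.
    return sorted(rid for b in buckets for k, rid in b if lo <= k <= hi)


for key in (3, 7, 11, 12):
    insert(key, key * 10)  # hypothetical rids
```

With this data, `find(7)` touches a single bucket, while `range_find(5, 11)` has to visit all four.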
Index Update: Deletion
If deleted record was the only record in the file with its
particular search-key value, the search-key is deleted from the
index also.
Single-level index deletion:
Dense indices – deletion of search-key: similar to file record
deletion.
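Sparse indices – if the index holds an entry for the deleted search key, replace it with the next search-key value in the file (in search-key order); if that next value already has an index entry, delete the entry instead of replacing it.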
Index Update: Insertion
Secondary Indices Example
disk
Block fetch requires about 5 to 10 micro
B+-Tree Index Files
B+-Tree Index Files (Cont.)
B+ Tree Example
B+-Tree Node Structure
Typical node
Ki are the search-key values
Pi are pointers to children (for non-leaf nodes) or pointers
to records or buckets of records (for leaf nodes).
The search-keys in a node are ordered
K1 < K2 < K3 < . . . < Kn–1
Leaf Nodes in B+-Trees
Properties of a leaf node:
For i = 1, 2, . . ., n–1, pointer Pi either points to a file record with
search-key value Ki, or to a bucket of pointers to file records, each
record having search-key value Ki.
If Li, Lj are leaf nodes and i < j, Li’s search-key values are less than
Lj’s search-key values
Pn points to next leaf node in search-key order
Non-Leaf Nodes in B+-Trees
Sample non-leaf
Keys: 120, 150, 180
Pointers, left to right: to keys k < 120; to keys 120 ≤ k < 150; to keys 150 ≤ k < 180; to keys k ≥ 180
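The routing rule of the sample non-leaf node can be sketched in Python. This is an illustrative sketch only: the function names are hypothetical, and the four leaves are assumed example contents chosen to match the key ranges of the node.

```python
from bisect import bisect_right

# Keys of the sample non-leaf node.
node_keys = [120, 150, 180]

# Assumed leaf contents, one leaf per child pointer (hypothetical data).
leaves = [
    [100, 101, 110],   # keys k < 120
    [120, 130],        # keys 120 <= k < 150
    [150, 156, 179],   # keys 150 <= k < 180
    [180, 200],        # keys k >= 180
]


def child_index(k):
    # Number of node keys <= k, which is the position of the child
    # pointer to follow for search key k.
    return bisect_right(node_keys, k)


def contains(k):
    # Follow one pointer from the non-leaf node, then search that leaf.
    return k in leaves[child_index(k)]
```

A key equal to a separator, such as 150, is routed to the right of that separator, which matches the `150 ≤ k < 180` range.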
Sample leaf node
Keys: 120, 130
Each key's pointer leads to the record with that key (e.g. to the record with key 120); the last pointer leads to the next leaf in sequence.
B+ Tree Example
[Figure: B+ tree. Keys shown: 3, 5, 11, 30, 35, 100, 101, 110, 120, 130, 150, 156, 179, 180, 200. Leaf pointers lead to records.]
B+ Tree
Insert into B+ tree
(a) Insert key = 32 (n=3)
[Figure: B+ tree. Keys shown: 100, 30, 3, 5, 11, 30, 31, 32.]
(b) Insert key = 7 (n=3)
[Figure: B+ tree. Keys shown: 100, 30, 7, 3, 5, 7, 11, 30, 31.]
(c) Insert key = 160 (n=3)
[Figure: B+ tree. Keys shown: 100, 120, 150, 156, 160, 179, 180, 200.]
(d) New root, insert 45 (n=3)
[Figure: B+ tree. Keys shown: 1, 2, 3, 10, 12, 20, 25, 30, 32, 40, 45. A new root is created => balance maintained.]
Deletion from B+ tree
(b) Delete 50 (n=4)
=> min # of keys in a leaf = ⌊(n+1)/2⌋ = ⌊5/2⌋ = 2
[Figure: B+ tree. Keys shown: 10, 20, 30, 35, 40, 50, 100.]
(c) Leaf Underflow Delete 50 (n=4)
[Figure: B+ tree. Keys shown: 20, 30, 40, 50, 100.]
(d) Non-leaf underflow Delete 37 (n=4)
=> min # of keys in a non-leaf = ⌈(n+1)/2⌉ - 1 = 3 - 1 = 2
[Figure: B+ tree. Keys shown: 1, 3, 10, 14, 20, 22, 25, 26, 30, 37, 40, 45. One node is labelled "new root".]
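The minimum-key arithmetic used in the deletion examples can be checked with a short Python sketch. It assumes the rounding the slide arithmetic implies (floor for leaves, ceiling for non-leaf nodes); the function names are hypothetical.

```python
import math


def min_leaf_keys(n):
    # Assumed rule for leaves: floor((n + 1) / 2).
    return (n + 1) // 2


def min_nonleaf_keys(n):
    # Assumed rule for non-leaf nodes: ceil((n + 1) / 2) - 1.
    return math.ceil((n + 1) / 2) - 1


def underflows(num_keys, n, is_leaf):
    # A node underflows when a deletion leaves it below its minimum.
    minimum = min_leaf_keys(n) if is_leaf else min_nonleaf_keys(n)
    return num_keys < minimum
```

With n = 4 both minimums are 2, so a leaf left with a single key after a deletion underflows and must borrow from or merge with a sibling.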
Home task
• Construct a B+ tree with n = 4 or 5, up to
3 levels, by inserting random keys; cover each
of the insertion cases.
• How can you perform a range query on a
B+ tree?
Queries on B+-Trees (Cont.)
B-Tree Index Files (Cont.)
Create an index
create index <index-name> on <relation-name>
(<attribute-list>)
E.g.: create index b_index on branch(branch_name)
Use create unique index to indirectly specify and
enforce the condition that the search key is a
candidate key.
Not really required if SQL unique integrity constraint is
supported
To drop an index
drop index <index-name>
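These statements can be tried against an in-memory SQLite database using Python's standard `sqlite3` module. This is a sketch under stated assumptions: the `branch_city` column, the sample rows, and the index name `b_unique` are hypothetical, and the first index is named `b_index` because an unquoted hyphen is not valid in a SQLite identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-column version of the branch relation.
conn.execute("create table branch (branch_name text, branch_city text)")

# Create an ordinary index, then drop it again.
conn.execute("create index b_index on branch(branch_name)")
conn.execute("drop index b_index")

# A unique index also enforces that branch_name is a candidate key.
conn.execute("create unique index b_unique on branch(branch_name)")
conn.execute("insert into branch values ('Downtown', 'Brooklyn')")
try:
    conn.execute("insert into branch values ('Downtown', 'Queens')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The second 'Downtown' violates the unique index.
    duplicate_rejected = True

row_count = conn.execute("select count(*) from branch").fetchone()[0]
```

After the run, the relation still holds a single row, since the unique index rejects the duplicate branch name.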
Index Selection Guidelines
Attributes in WHERE clause are candidates for
index keys.
Exact match condition suggests cluster/sparse/hash
index.
Range query suggests tree index.
Clustering is especially useful for range queries;
can also help on equality queries if there are
many duplicates.
Multi-attribute search keys should be considered
when a WHERE clause contains several conditions.
Try to choose indexes that benefit as many queries
as possible.
If only one index can be clustered per relation,
choose it based on important queries that would
benefit the most from clustering.
Index Selection Guidelines (Cont.)
SELECT E.dno
FROM Emp E
WHERE E.age>40
B+ tree index on E.age can be used to get
qualifying tuples.
Things to consider
How selective is the condition?
If 99% are over 40, index is less useful
If 10%, an index is useful
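The selectivity question can be made concrete with a few lines of Python. The age lists below are invented purely for illustration, and `selectivity` is a hypothetical helper, not part of any database system.

```python
def selectivity(ages, threshold):
    # Fraction of tuples that satisfy the condition age > threshold.
    return sum(1 for age in ages if age > threshold) / len(ages)


# Two hypothetical Emp relations with 100 tuples each.
mostly_older = [45] * 99 + [30]        # 99% of employees are over 40
mostly_younger = [30] * 90 + [45] * 10  # 10% of employees are over 40

high = selectivity(mostly_older, 40)
low = selectivity(mostly_younger, 40)
```

When nearly every tuple qualifies, as in the first relation, following the index retrieves almost the whole file anyway, so a plain scan is likely to be about as good.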
Index Selection Guidelines (Cont.)
SELECT E.dno, COUNT (*)
FROM Emp E
WHERE E.age>20
GROUP BY E.dno
Thank You