Weekly Exercises 01

Uploaded by

naga manasa
IN 3020/4020

Week 1 Exercises
02 February 2021

DBMS architecture & Indexing

DBMS Architecture
Question

• (From exam 2020) Sketch the typical DBMS architecture that we saw a few times at the beginning of the course, and explain the role and job of each path and component of the architecture.

Indexing
Question 1

1. Suppose blocks hold either three records or ten key-pointer pairs. As a function of n, the number of records, how many blocks do we need to hold a data file and
a) a dense index
b) a sparse index

Indexing
Brief review before solving

[Figure 17.4: A dense secondary index (with block pointers) on a nonordering key field of a file. Source: Fundamentals of Database Systems, 7th Edition, Elmasri and Navathe.]

Dense index: In a dense index, an index entry appears for every search key value in the file.
• The search-key (index field) column of the index file generally contains unique values taken from one field (column) of the data file. It is a copy of a data file column of your choice.
• "Generally", because an index file may sometimes contain duplicate key values and/or multiple index field columns.
• The choice of column to index is often obvious: choose the attribute that most queries on the data file start from, e.g., ID no., Name, Zip.
• You create the index with a command (in SQL, say) when you create the DB, or later by altering the DB.
• Some DBs create an index automatically (e.g., by generating a serial no. for each record), but you may often need additional indexes, perhaps more than one.
Indexing
Brief review before solving

[Figure 14.2: Dense index (index file and data file).]
[Figure 14.4: Dense [clustering] index with search key dept name (index file and data file).]
[Figures from Database System Concepts, 7th Edition, Silberschatz, Korth, and Sudarshan.]

• The other column in the index file is the pointer column. It contains a unique reference, identifiable by the software that created the DB, which points to the location of the data file row (or of the block that contains the row) corresponding to the search key value.
Say we have chosen a field (column) of the data file to use as the index field. Let's study a few cases of dense indexing:
1. The field has unique values. In this case, the dense index has one pointer to each record. Ex. Fig. 14.2. We may, but need not, sort the data file on the field values.
2. The field has duplicate values. We sort the data file on the field values and create the index. The dense index then has one pointer for each group of consecutive records that share the same field value. The pointer carries additional offset information for locating each record in that group relative to the first record. Ex. Fig. 14.4. We call it a dense clustering index.
3. The field has duplicate values, but we cannot or do not want to sort the data file on the field values. In this case, the dense index also contains duplicate values and has one pointer to each record (row) in the data file. We call it a dense non-clustering index.
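Cases 1 and 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_dense_index`, the block capacity of 3 records, and the sample rows are all hypothetical, and the sketch groups the pointers of equal key values in one list instead of repeating the key once per pointer.

```python
from collections import defaultdict

RECORDS_PER_BLOCK = 3  # hypothetical block capacity, for illustration only


def build_dense_index(records, key):
    """Return {key value: [(block_no, offset), ...]} with one pointer per record."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        # A record's pointer is the block it lives in plus its offset in that block.
        block_no, offset = divmod(position, RECORDS_PER_BLOCK)
        index[record[key]].append((block_no, offset))
    return dict(index)


# Hypothetical, unsorted data file; "dept" has duplicate values, "id" does not.
data_file = [
    {"id": 10101, "dept": "Comp. Sci."},
    {"id": 12121, "dept": "Finance"},
    {"id": 15151, "dept": "Music"},
    {"id": 22222, "dept": "Physics"},
    {"id": 32343, "dept": "History"},
    {"id": 45565, "dept": "Comp. Sci."},
]

by_id = build_dense_index(data_file, "id")      # case 1: unique field values
by_dept = build_dense_index(data_file, "dept")  # case 3: duplicates, unsorted file
```

Because the file is not sorted on `dept`, the two "Comp. Sci." records sit in different blocks, which is why case 3 needs a pointer to every record.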
Indexing
Brief review before solving

[Figure from Fundamentals of Database Systems, 7th Edition, Elmasri and Navathe.]

Sparse index: In a sparse index, by contrast, an index entry appears for only some of the field values. That is, not every unique value in the chosen data file column enters the index file as a search-key value; only some selected values do. Normally, each search key value in the index file has one block pointer to a block of the data file. The pointer essentially points to the first record of the block and contains offset information for the other records in that block relative to the first record.

How does the search work?
Suppose we want to find the record for Angel, Joe in the data file. We start in the index file and skip the entries Aaron, Ed; …; Anderson, Zach, because these values are smaller than our target, until we reach Arnold, Mach, which is the first index entry to exceed the value Angel, Joe. Then we go back to Anderson, Zach, which is the largest value less than or equal to Angel, Joe, and follow its pointer to the corresponding block of the data file. Finally, by using the offset in the pointer, we locate the 2nd entry in the block, i.e., our target.

Do you see that a sparse index can be used only if the data file is sorted by the search key?
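The search just described can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `sparse_lookup` is a hypothetical helper, the blocks hold three names each, and the names other than Aaron, Ed; Anderson, Zach; Angel, Joe; and Arnold, Mach are made up to fill the blocks. A binary search replaces the linear skip through the index entries, but it finds the same anchor entry.

```python
from bisect import bisect_right


def sparse_lookup(index_keys, blocks, target):
    """Find target via a sparse index.

    index_keys[i] is the first (anchor) key of blocks[i]; the data file,
    and therefore index_keys, must be sorted on the search key.
    Returns (block_no, offset) or None if the key is absent.
    """
    # Largest index entry that is less than or equal to the target.
    i = bisect_right(index_keys, target) - 1
    if i < 0:
        return None  # target sorts before the first record of the file
    # Scan the single candidate block for the target.
    for offset, key in enumerate(blocks[i]):
        if key == target:
            return i, offset
    return None


# Sorted data file, three records per block (block contents partly hypothetical).
blocks = [
    ["Aaron, Ed", "Abbot, Diane", "Acosta, Marc"],
    ["Anderson, Zach", "Angel, Joe", "Archer, Sue"],
    ["Arnold, Mach", "Arnold, Steven", "Atkins, Timothy"],
]
index_keys = [block[0] for block in blocks]  # one index entry per block

found = sparse_lookup(index_keys, blocks, "Angel, Joe")
```

Here `found` identifies the block anchored by Anderson, Zach and the second slot in it. If the file were not sorted, the anchor entry would say nothing about which block holds the target, which is why a sparse index requires a sorted data file.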
Indexing: Solution to Question 1
1. Suppose blocks hold either three records or ten key-pointer pairs. As a function of n, the number of records, how many blocks do we need to hold a data file and
a) a dense index
b) a sparse index

a) Dense index. Assumption: one key-pointer pair for each data record.
No. of blocks to hold the data file = n/3
No. of blocks to hold the index file = n/10
Total no. of blocks = n/3 + n/10 = 13n/30

b) Sparse index. Assumption: one key-pointer pair for each data block.
No. of blocks to hold the data file = n/3
No. of blocks to hold the index file = (n/3)/10 = n/30
Total no. of blocks = n/3 + n/30 = 11n/30

These counts ignore rounding; for an exact count, round each quotient up to a whole block.
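As a quick check of the two formulas, here is a small Python sketch that counts whole blocks. The function name `blocks_needed` and its parameter names are assumptions made for this illustration; the block capacities are the ones given in the question.

```python
def ceil_div(a, b):
    """Integer division rounded up (a partly filled block still counts)."""
    return -(-a // b)


def blocks_needed(n, dense, records_per_block=3, pairs_per_block=10):
    """Return (data_blocks, index_blocks, total_blocks) for n records."""
    data_blocks = ceil_div(n, records_per_block)
    # Dense: one key-pointer pair per record. Sparse: one pair per data block.
    index_entries = n if dense else data_blocks
    index_blocks = ceil_div(index_entries, pairs_per_block)
    return data_blocks, index_blocks, data_blocks + index_blocks


dense_300 = blocks_needed(300, dense=True)
sparse_300 = blocks_needed(300, dense=False)
```

For n = 300 the totals are 130 and 110 blocks, matching 13n/30 and 11n/30. For values of n that are not multiples of 30, the exact totals are slightly larger than the formulas because of the rounding.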
Indexing
Question 2

2. Suppose that blocks can hold either ten records or 99 keys and 100 pointers. Also,
assume that the average B-tree node is 70% full; i.e., it will have 69 keys and 70
pointers. We can use B-trees as part of several different structures. For each structure
described below, determine
1. the total number of blocks needed for a 1,000,000-record file, and
2. the average number of disk I/O’s to retrieve a record given its search key.
You may assume nothing is in memory initially, and the search key is the primary key for
the records.
(a) The data file is a sequential file, sorted on the search key, with 10 records per block.
The B-tree is a dense index.
(b) The same as (a), but the data file consists of records in no particular order, packed 10
to a block.
(c) The same as (a), but the B-tree is a sparse index.
(d) Instead of the B-tree leaves having pointers to data records, the B-tree leaves hold
the records themselves. A block can hold ten records, but on average, a leaf block is
70% full; i.e., there are seven records per leaf block.
(e) The data file is a sequential file, and the B-tree is a sparse index, but each primary
block of the data file has one overflow block. On average, the primary block is full,
and the overflow block is half full. However, records are in no particular order within
a primary block and its overflow block.
B-Tree
Brief review before solving

[Figure from Fundamentals of Database Systems, 7th Edition, Elmasri and Navathe.]
Indexing: Solution to Question 2a
(a) The data file is a sequential file, sorted on the search key, with 10 records per block.
The B-tree is a dense index.
Data file: 1,000,000/10 = 100,000 blocks
B-tree leaf level: 1,000,000/70 ≈ 14,286 blocks
B-tree level 2: 14,286/70 ≈ 205 blocks
B-tree level 3: 205/70 ≈ 3 blocks
Root: 3/70 ≈ 1 block
Note: 1 million records require 1 million pointers from the leaf level to the data records. Also, in effect, each tree block has 70, not 100, pointers. Remember that the data records are still grouped 10 to a block.
Total no. of blocks = 100,000 + 14,286 + 205 + 3 + 1 = 114,495
Average no. of disk I/Os = 5 (4 B-tree levels + 1 data block)
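The level-by-level division can be sketched in Python. This is only an illustration: `btree_level_sizes` is a hypothetical helper, and it assumes, as the solution does, that every node including the root holds 70 pointers on average and that each quotient is rounded up.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def btree_level_sizes(pointers_needed, fanout=70):
    """Blocks per B-tree level, leaf level first and root last."""
    blocks = ceil_div(pointers_needed, fanout)
    levels = [blocks]
    # Each higher level needs one pointer per block of the level below it.
    while blocks > 1:
        blocks = ceil_div(blocks, fanout)
        levels.append(blocks)
    return levels


records = 1_000_000
data_blocks = ceil_div(records, 10)       # 10 records per data block
levels = btree_level_sizes(records)       # dense index: one pointer per record
total_blocks = data_blocks + sum(levels)
disk_ios = len(levels) + 1                # one read per tree level, plus the data block
```

Passing `data_blocks` instead of `records` to `btree_level_sizes` gives the level sizes for a sparse index, where the leaves need only one pointer per data block.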
Indexing: Solution to Question 2b

(b) The same as (a), but the data file consists of records in no particular order, packed
10 to a block.
The solution is the same as (a), because the ordering of the records does not affect the number of blocks or pointers needed: a dense index has one pointer per record either way.
Indexing: Solution to Question 2c
(c) The data file is a sequential file, sorted on the search key, with 10 records per block.
The B-tree is a sparse index.
Data file: 1,000,000/10 = 100,000 blocks
B-tree leaf level: 100,000/70 ≈ 1,429 blocks
B-tree level 2: 1,429/70 ≈ 21 blocks
Root: 21/70 ≈ 1 block
Note: In a sparse index, 100,000 data blocks require 100,000 pointers, i.e., one pointer for each block of data.
Total no. of blocks = 100,000 + 1,429 + 21 + 1 = 101,451
Average no. of disk I/Os = 4 (3 B-tree levels + 1 data block)
Indexing: Solution to Question 2d
(d) Instead of the B-tree leaves having pointers to data records, the B-tree leaves hold
the records themselves. A block can hold ten records, but on average, a leaf block is
70% full; i.e., there are seven records per leaf block.
Leaf level (holds the records): 1,000,000/7 ≈ 142,858 blocks
B-tree level 2: 142,858/70 ≈ 2,041 blocks
B-tree level 3: 2,041/70 ≈ 30 blocks
Root: 30/70 ≈ 1 block
Note: The leaves hold the data records, and the data records are grouped into 70% of 10, or 7, per block, while the pointers in the other tree levels are grouped into 70% of 100, or 70, per block.
Total no. of blocks = 142,858 + 2,041 + 30 + 1 = 144,930
Average no. of disk I/Os = 4 (one per tree level; the leaf already contains the record)
Indexing: Solution to Question 2e
(e) The data file is a sequential file, and the B-tree is a sparse index, but each primary
block of the data file has one overflow block. On average, the primary block is full, and
the overflow block is half full. However, records are in no particular order within a
primary block and its overflow block.
Records held by each primary block together with its overflow block = 10 + 50% of 10 = 15
No. of primary blocks = 1,000,000/15 ≈ 66,667 (each primary block itself still holds 10 records)
No. of overflow blocks = no. of primary blocks ≈ 66,667 (each overflow block holds 50% of 10, or 5, records)
B-tree leaf level = 66,667/70 ≈ 953 blocks (remember, the overflow block is an annex to the primary block, and so does not have a separate block pointer)
B-tree level 2 = 953/70 ≈ 14 blocks
Root = 14/70 ≈ 1 block
Total no. of blocks = data blocks + index blocks = 66,667 × 2 + 953 + 14 + 1 = 134,302

Average no. of disk I/Os = 3 (B-tree levels) + 1 (primary block) + 1/3 (overflow block, read only for the 5 out of every 15 records stored there) = 13/3 ≈ 4.33
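The expected cost can be double-checked with exact fractions. The following Python sketch is an illustration only; the variable names are hypothetical, and it assumes that a lookup always reads the primary block first and reads the overflow block only when the record is not found in the primary block.

```python
from fractions import Fraction


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


records = 1_000_000
primary_records = 10   # a primary block is full on average
overflow_records = 5   # an overflow block is half full on average
fanout = 70            # average pointers per B-tree node

# One primary block (plus its overflow block) per 15 records.
primary_blocks = ceil_div(records, primary_records + overflow_records)
overflow_blocks = primary_blocks

# Sparse index: the leaves need one pointer per primary block.
index_levels = []
blocks = primary_blocks
while True:
    blocks = ceil_div(blocks, fanout)
    index_levels.append(blocks)
    if blocks == 1:
        break

total_blocks = primary_blocks + overflow_blocks + sum(index_levels)

# Probability that the wanted record lives in the overflow block.
p_overflow = Fraction(overflow_records, primary_records + overflow_records)
expected_ios = len(index_levels) + 1 + p_overflow
```

Using `Fraction` keeps the result as the exact value 13/3 instead of a rounded decimal.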
Quiz on PostgreSQL Indexing

• Exercise/test
Take the 3-minute test at
https://use-the-index-luke.com/3-minute-test/postgresql
It has 5 exercises. The solutions are explained after you try them. You can also find many examples in the book and on the Internet.

Questions? Please email to
smrashid@math.uio.no

Thank you 
