
DBMS Theory Book

Uploaded by

Hrushikesh Patil
Copyright
© © All Rights Reserved

DATABASE MANAGEMENT SYSTEM

CONTENTS
CHAPTER - 1
Functional Dependency and Normalization

• Introduction
• Relational Database Management System
• Functional Dependency
• Properties of FD’s
• Attribute Set Closure (X+)
• Super Key (OR) Candidate Key
• Minimal Super-key is Candidate Key
• Membership Test
• Schema Refinement (Normalization)
• Testing Condition to check whether lossy (or) lossless
• When to merge 2 tables
• Dependency Preserving
• Normal Form
• Possible ways a non-trivial FD can suffer from Redundancy
• How to convert relation into BCNF
• Equality of FD’s Set
• Canonical Cover (OR) Minimal Cover
• Foreign Key

CHAPTER - 2
E-R Model

• E-R model

CHAPTER - 3
Query Language

• SQL
• Relational Algebra
• SET Operation
• Join Operation
• Variation of Join
• Outer Join
• FULL-OUTER JOIN
• Tuple Relational Calculus (TRC)
• A General Expression of TRC

CHAPTER - 4
Transaction and Concurrency Control

• Transaction
• Classification of Schedule
• Classification of schedule based on serializability
• Two types of schedule based on serializability
• Conflict Serializable Schedule
• View Serializable Schedule
• Concurrency Control Protocol
• Locking Protocol
• Basic Time-stamp Ordering
• Thomas’s Write Time Stamp Protocol
• Deadlock Prevention Protocol

CHAPTER - 5
Indexing

• Categories of Indexing
• B+-Tree
1 Functional Dependency and Normalization
Introduction
Database: A collection of inter-related (logically related) data.
DBMS: A collection of interrelated data together with a set of programs to access that data.
Primary Goal
The primary goal of a DBMS is to provide a way to store and retrieve database information that is
both convenient and efficient.
Aim of DBMS
To provide data independence: hide the physical details from the external user.
According to Codd, the basic aim is to provide data independence, for which there should be at least two levels of abstraction, i.e., at least two levels of interface between the user and the database.
To provide this, three levels of abstraction are used.

[Figure: three-level architecture. The user works with External Schema-1, External Schema-2, and so on; these map onto the Conceptual Schema, which maps onto the Physical Schema stored in the DB. The DBMS manages all three levels.]

Logical level: What data exists in the DB.

Low level: How the data is physically stored in the DB. The description of this is also called META-DATA (data about data).
Physical Schema
(Internal level): Complete detail of data storage and access paths for the DB. The physical level describes complex low-level data structures in detail. At this level, a record can be considered a block of consecutive storage locations. The database system hides many of the lowest-level storage details from database programmers. Database administrators, on the other hand, may be aware of specifics of the data's physical organization.
Example: File name, location, type, access method (indexing).

CREATE TABLE STUDENT
(Sid INTEGER,
 Sname CHAR(30));
Conceptual Schema: Hides the physical detail, i.e., describes only the structure of the whole DB: what data exists in the DB. The data kept in the database and the relationships among them are described at this level of abstraction, also called the logical schema. Thus, the logical level defines the entire database in terms of a small number of relatively simple structures.
Example: Student (Sid, Sname), i.e., only abstract detail.
External Schema (View level): Used to provide different levels of security.
* Each external schema describes the part of the DB that a particular user is interested in and hides the rest of the DB from that user. A large database contains a variety of data, and many users need access to only a portion of it, not all of the information. The view level of abstraction exists to simplify their interaction with the system. For the same database, the system may offer many different views.
View
It is a virtual (OR) derived relation: a relation that does not necessarily exist in its own right, but may be dynamically derived from one or more base relations.
* A view is a relation that appears to the user to exist and can be manipulated as if it were a base relation, but it does not necessarily exist in storage in the sense that a base relation does.
* Views are dynamic, meaning that changes made to the base relations that affect the view are immediately reflected in the view.
Relational Database Management System
The entire DB is stored in tables (collections of rows and columns).

Sid   Sname   Branch      ← Attributes (fields OR columns)
S1    A       CS          ← Tuple (row OR record)
S2    A       CS
S3    B       IT
S4    B       CS
Figure 1
Cardinality: Number of records (tuples) in the table.
Arity (OR) Degree: Number of attributes in the table.
Relational Schema: Abstract detail of the table, for example:

Student(Sid, Sname, Branch)


Relational Instance: A table along with some set of tuples, i.e., the contents of the relation at a given moment.
CODD Rule: No two records may hold the same data, i.e., every tuple should be different from every other tuple.
This is made possible by maintaining a KEY.
KEY: A minimal set of attributes used to identify every tuple of the relation uniquely.

Sid is the Key in Figure-1.


1. By default, key means candidate key.
2. The candidate key of a table is not necessarily unique.
3. More than one candidate key is possible for a table.
4. A candidate key may consist of more than one attribute.
Simple Key: Candidate key with one attribute.
Compound Key: Candidate key with more than one attribute.
Prime Attribute Set OR Key Attribute Set: Attributes that belong to any candidate key.
Non-Prime Attribute: An attribute that does not belong to any candidate key (C.K.).
Primary Key: One of the candidate keys, chosen by the designer.
Alternate Key: All the remaining candidate keys are alternate keys.
Super Key: A candidate key with zero or more additional attributes.
Note: Every candidate key is a super key, but a super key need not be a candidate key.
A minimal super key is always a candidate key.
Primary Key & Alternate Key
Condition for P.K.: NULL values are not allowed.
NULL Value: An unknown or non-existent value.
(1) NULL is not zero or the empty string.
(2) No two NULL values are considered the same.
At most one P.K. is allowed per relation. The ‘PRIMARY KEY’ clause is used to define the PK of a table.

Let R be a relation schema with 3 attributes, R(A1, A2, A3). How many super keys are possible?
(i) With only candidate key {A1}
Sol. {A1, A1A2, A1A3, A1A2A3}, i.e., 4 super keys are possible.

Let R be a relation schema with n attributes, R(A1, A2, ..., An). How many super keys are possible with only candidate key {A1}?
Sol. Each of the other n-1 attributes may or may not be added to A1, so the total number of super keys is 2^(n-1).
(ii) With candidate keys {A1} and {A2} in R(A1, A2, A3)
Sol. Super keys containing A1: {A1, A1A2, A1A3, A1A2A3}. Super keys containing A2: {A2, A1A2, A2A3, A1A2A3}. A1A2 and A1A2A3 are common to both sets, so count them only once. The total number of super keys is 4 + 4 - 2 = 6.

Let R be a relation schema with n attributes, R(A1, A2, ..., An). How many super keys are possible with candidate keys {A1} and {A2}?
Sol. Total number of super keys is 2^(n-1) + 2^(n-1) - 2^(n-2).
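The counting formulas above can be cross-checked by brute force. The following is a minimal sketch, assuming attributes are given as names and candidate keys as collections of those names; the function name count_super_keys is illustrative and not part of the original notes.

```python
from itertools import combinations

def count_super_keys(attributes, candidate_keys):
    """Count the attribute subsets that contain at least one candidate key."""
    attrs = list(attributes)
    keys = [set(k) for k in candidate_keys]
    count = 0
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            chosen = set(subset)
            # A subset is a super key if some candidate key lies inside it.
            if any(key <= chosen for key in keys):
                count += 1
    return count

three = ["A1", "A2", "A3"]
print(count_super_keys(three, [["A1"]]))          # 4  = 2^(3-1)
print(count_super_keys(three, [["A1"], ["A2"]]))  # 6  = 2^2 + 2^2 - 2^1
```

Enumerating all subsets is exponential in the number of attributes, so this is only meant for checking small exercises against the closed-form answer.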
Functional Dependency
A functional dependency is a constraint between two sets of attributes of the database: it describes how the values of one set of attributes determine the values of another.
Let R be a relational schema and X, Y be sets of attributes of R. X → Y holds in R if, for any two tuples t1, t2 of R, t1.X = t2.X implies t1.Y = t2.Y.

X    Y          X    Y          X    Y
X1   Y1         X1   Y1         X1   Y1
X1   Y1         X1   Y2         X1   Y1
X2   Y1         X2   Y2         X2   Y2
X → Y holds     X → Y fails     X → Y holds
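The definition can be tested directly on a table instance. This is a small sketch, assuming each tuple is a Python dict from attribute name to value; fd_holds is a hypothetical helper name. Note that an instance can only refute an FD: an FD that holds on one instance is not thereby proven for the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in the given list of dict rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first Y value seen for an X value must match all later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True

# The first two example tables above.
table1 = [{"X": "X1", "Y": "Y1"}, {"X": "X1", "Y": "Y1"}, {"X": "X2", "Y": "Y1"}]
table2 = [{"X": "X1", "Y": "Y1"}, {"X": "X1", "Y": "Y2"}, {"X": "X2", "Y": "Y2"}]
print(fd_holds(table1, ["X"], ["Y"]))  # True
print(fd_holds(table2, ["X"], ["Y"]))  # False
```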

If X ⊇ Y, then X → Y is a trivial FD; otherwise it is a non-trivial FD.

Sid → Sid (Trivial FD)
Sid Sname → Sname (Trivial FD)
Sid Sname → Sid Sname (Trivial FD)
Sid → Sname (Non-Trivial): the two sides have different attributes.
Cid → Cname (Non-Trivial): the two sides have different attributes.
Properties of FD’s:
(1) Reflexivity: If X ⊇ Y, then X → Y
(2) Transitivity: If X → Y and Y → Z, then X → Z
(3) Augmentation: If X → Y, then XZ → YZ
(4) Union: If X → Y and X → Z, then X → YZ
(5) Decomposition: If X → YZ, then X → Y and X → Z
(6) Pseudo-transitivity: If X → Y and WY → Z, then WX → Z
(7) Composition: If X → Y and Z → W, then XZ → YW
But XY → Z does not imply X → Z or Y → Z.
Attribute Set Closure (X+)
The set of all attributes that are functionally determined by X.

R(ABCD) FD’s = {A → B, B → C, C → D}

A+ = {ABCD} B+ = {BCD} C+ = {CD} D+ = {D}
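The closures above can be computed mechanically by repeatedly applying every FD whose left side is already covered. The sketch below assumes single-letter attribute names and FDs written as (lhs, rhs) string pairs; both conventions are choices made for this illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when lhs is covered and rhs adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("A", "B"), ("B", "C"), ("C", "D")]
for x in "ABCD":
    print(x + "+ =", "".join(sorted(closure(x, fds))))
# A+ = ABCD
# B+ = BCD
# C+ = CD
# D+ = D
```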
Super Key (OR) Candidate Key
Let R be a relation schema and X be a set of attributes of R.
If X+ (the closure of X) contains all the attributes of R, then X is a super key of R.
Minimal Super-key is Candidate Key
Candidate Key: If X is a super key and no proper subset of X determines all the attributes of R, then X is a candidate key.

R(ABCDE) FD = {AB → C, C → D, B → E}
(AB)+ = {ABCDE}
So, AB is a super key.
Test for C.K.: A+ = {A}, B+ = {BE}
i.e., no proper subset of the super key determines all the attributes of R. So the super key AB is minimal.
So, AB is a candidate key.
{A, B} = Prime attributes

R(ABCDE) FD = {AB → C, C → D, B → AE}

(AB)+ = {ABCDE}
A+ = {A}
B+ = {ABCDE}
So, B is a candidate key; AB is a super key but not a candidate key.
Note: If a super key consists of only one attribute, it is surely a candidate key.
If a super key has 2 or more attributes, it may or may not be a candidate key.
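Both examples above can be reproduced by searching attribute subsets in order of increasing size and keeping each super key that contains no smaller key. This is a brute-force sketch, assuming single-letter attributes and (lhs, rhs) string pairs for FDs; candidate_keys is an illustrative name.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Return all candidate keys of schema as a list of sets."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(x, fds) == set(schema):
                keys.append(x)
    return keys

print(candidate_keys("ABCDE", [("AB", "C"), ("C", "D"), ("B", "E")]))   # one key: A, B
print(candidate_keys("ABCDE", [("AB", "C"), ("C", "D"), ("B", "AE")]))  # one key: B
```

Because subsets are visited smallest first, any non-minimal super key is skipped by the time it is reached.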
Membership Test:
Let F be a set of functional dependencies and let X → Y be an FD. X → Y is logically implied by F (is a member of F+) only if the closure of X with respect to F contains Y.

Check whether each of the following FDs is implied by the given FD set.

(a) Let R(wxyz) with FD set F = {w → y, x → z}. Is wx → y a member of F+?
Sol. (wx)+ = wxyz, which contains y.
So, wx → y is implied by F.
(b) Let R(wxyz) with FD set F = {x → y, y → z}. Is y → x a member of F+?
Sol. (y)+ = yz, which does not contain x.
So, y → x is not implied by F.
Note: y+ = yz means only that y → z (and the trivial y → y) can be derived.
Schema Refinement (Normalization):
Goal of Schema Refinement: To eliminate or reduce redundancy.
Redundancy: Duplicate copies of the same data.
Redundancy arises in a DB when two (OR) more independent relations are kept in the same table.

Sid   Sname   Cid   Cname   Fee
S1    A       C1    DB      8K
S1    A       C2    Java    7K
S2    A       C2    Java    7K
S3    B       C2    Java    7K
S3    B       C1    DB      8K
PK: {Sid Cid}

In the above example, two independent tables, student and course, are merged into a single table, which results in redundancy.
Problems because of Redundancy:
(1) Wastage of space
(2) Anomalies:
(a) Updation Anomaly: When some data is updated, every duplicate copy of it must be updated as well. This takes more time because of the duplication, so the update is a costly operation.
(b) Insertion Anomaly: Suppose we insert (C3, DS, 10K) as a new course for which there is no student yet.
We cannot insert (NULL, NULL, C3, DS, 10K), because Sid is part of the primary key and NULL is not allowed there.
We could insert (*, *, C3, DS, 10K), where * is some dummy data, but this causes inconsistency, for example when we count the total number of Sid values in the relation. So dummy data results in inconsistency.
(c) Deletion Anomaly: Deleting some data forces the deletion of other useful data.

Suppose we delete course C2. Then tuples t2, t3 and t4 are deleted.
So we lose other data, such as the only record of student S2.
To eliminate these anomalies, we eliminate the redundancy.
Redundancy can be eliminated by decomposition of the table.
Decomposition: Splitting a relation into 2 or more sub-relations.

Sid   Sname        Sid   Cid        Cid   Cname   Fee
S1    A            S1    C1         C1    DB      8K
S2    A            S1    C2         C2    Java    7K
S3    B            S2    C2         PK: Cid
PK: Sid            S3    C2
                   S3    C1
                   PK: Sid Cid
Properties of Decomposition:
(1) Lossless Decomposition
(2) Dependency preserving
(1) Lossless Join Decomposition:
When the decomposed relations are joined again, the join should not create any extra (spurious) tuples,
i.e., if R1 ⋈ R2 = R (natural join), the decomposition is lossless.

R:
A   B   C
1   1   2
2   1   3
3   2   3

a) Let the decomposition be R1(AB), R2(BC).

R1 × R2:
A   R1.B   R2.B   C
1   1      1      2
1   1      1      3
1   1      2      3
2   1      1      2
2   1      1      3
2   1      2      3
3   2      1      2
3   2      1      3
3   2      2      3

Rows with R1.B = R2.B:          R1 ⋈ R2:
A   R1.B   R2.B   C             A   B   C
1   1      1      2             1   1   2
1   1      1      3             1   1   3
2   1      1      2             2   1   2
2   1      1      3             2   1   3
3   2      2      3             3   2   3

So, R1 ⋈ R2 ⊃ R (the tuples (1, 1, 3) and (2, 1, 2) are spurious), so it is a lossy decomposition.


b) Let the decomposition be R1(AB), R2(AC).

R1 × R2:
R1.A   B   R2.A   C
1      1   1      2
1      1   2      3
1      1   3      3
2      1   1      2
2      1   2      3
2      1   3      3
3      2   1      2
3      2   2      3
3      2   3      3

Rows with R1.A = R2.A:
R1.A   B   R2.A   C
1      1   1      2
2      1   2      3
3      2   3      3

R1 ⋈ R2:
A   B   C
1   1   2
2   1   3
3   2   3

R1 ⋈ R2 = R, so it is a lossless decomposition.
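The two decompositions above can be replayed in code by projecting the instance, joining the projections, and comparing with the original. This is a sketch for one concrete instance only, assuming rows are dicts and attribute names are single letters; the helper names are hypothetical. A lossless result on one instance does not prove the decomposition lossless for every instance of the schema.

```python
def project(rows, attrs):
    """Project dict rows onto attrs, removing duplicate tuples."""
    return [dict(t) for t in {tuple((a, r[a]) for a in attrs) for r in rows}]

def natural_join(left, right):
    """Join two lists of dict rows on their shared attribute names."""
    joined = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent representation of a relation instance."""
    return {tuple(sorted(r.items())) for r in rows}

def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly rows."""
    rejoined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return as_set(rejoined) == as_set(rows)

R = [dict(A=1, B=1, C=2), dict(A=2, B=1, C=3), dict(A=3, B=2, C=3)]
print(is_lossless(R, "AB", "BC"))  # False: the join on B adds spurious tuples
print(is_lossless(R, "AB", "AC"))  # True: the join on A returns R
```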


Testing Condition to check whether lossy (or) lossless
Let R be a relational schema decomposed into sub-relations R1 and R2. The decomposition is lossless if
(1) Attr.(R1) ∪ Attr.(R2) = Attr.(R)
(2) Attr.(R1) ∩ Attr.(R2) ≠ ∅
(3) Attr.(R1) ∩ Attr.(R2) is a super key (for example, a candidate key) of R1 (OR) R2 (OR) both.

When to merge 2 tables
Tables Ri and Rj can be merged only if
(1) Ri ∩ Rj ≠ ∅
(2) Ri ∩ Rj → Ri (OR) Ri ∩ Rj → Rj (i.e., the common attributes should be a key of Ri (OR) Rj).

R(ABCDEG) {AB → C, AC → B, AD → E, B → D, BC → A, E → G}

(a) Decomposition = {AB, BC, ABDE, EG}
Sol. Merge R1(AB) and R3(ABDE), as the pair satisfies both conditions (the common attributes are AB, and AB is a key of R1).
So, R13(ABDE) R2(BC) R4(EG)
Now merge R13 and R4 (the common attribute is E, and E → G makes E a key of R4).
So, R134(ABDEG) R2(BC)
Now merging R2 and R134 is NOT possible: the common attribute is B, and B+ = BD covers neither relation.
So, lossy decomposition.
(b) Decomposition = {ABC, ACDE, ADG}
Sol. Merge tables (1) and (2) (the common attributes are AC, and AC → B makes AC a key of ABC).
R12(ABCDE) R3(ADG)
Merge R12 and R3 (the common attributes are AD, and (AD)+ = ADEG makes AD a key of ADG).
R123(ABCDEG)
So, lossless decomposition.
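The merging procedure used in the two solutions above can be sketched as follows. It assumes single-letter attributes and (lhs, rhs) string pairs for FDs, and the name lossless_by_merging is made up for this illustration. The sketch follows the notes' pairwise merging method; it is not the general chase algorithm, so a True result shows the decomposition is lossless, while False only means that no further pair could be merged under this method.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def lossless_by_merging(schema, fds, parts):
    """Merge sub-relations pairwise while the common attributes are a key of one side."""
    parts = [set(p) for p in parts]
    merged = True
    while merged and len(parts) > 1:
        merged = False
        for i in range(len(parts)):
            for j in range(i + 1, len(parts)):
                common = parts[i] & parts[j]
                if not common:
                    continue
                reach = closure(common, fds)
                if parts[i] <= reach or parts[j] <= reach:
                    parts[i] |= parts[j]   # merge Rj into Ri
                    del parts[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan with the merged list
    return len(parts) == 1 and parts[0] == set(schema)

FDS = [("AB", "C"), ("AC", "B"), ("AD", "E"), ("B", "D"), ("BC", "A"), ("E", "G")]
print(lossless_by_merging("ABCDEG", FDS, ["AB", "BC", "ABDE", "EG"]))  # False
print(lossless_by_merging("ABCDEG", FDS, ["ABC", "ACDE", "ADG"]))      # True
```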
Dependency Preserving
A decomposition is dependency preserving if each functional dependency X → Y in the FD set F of relation R either appears directly in one of the sub-relations of the decomposition or can be inferred from the dependencies of the sub-relations.

Let R with FD set F be decomposed into R1 with FD set F1 and R2 with FD set F2.

Then every FD of F must be preserved by F1, by F2, or by both together, i.e.,

F1 ∪ F2 ≡ F : dependency preserving
F1 ∪ F2 ⊂ F : not dependency preserving (some FD of F cannot be derived)

R(ABCD)
FD = {A → B, B → C, C → D, D → A}
Decomposition = {AB, BC, CD}

R1(AB)     R2(BC)     R3(CD)
A → B      B → C      C → D
B → A      C → B      D → C

The FDs A → B, B → C and C → D are directly preserved.

D → A is not directly preserved, but if we compute D+ with respect to the FD sets F1, F2 and F3, then D+ = DCBA.
So, D → A is preserved.

R(ABCD)
FD = {AB → C, C → D, D → B}
Dec. = {ABD, CD}
Sol.
R1(ABD)     R2(CD)
AB → D      C → D
AD → B
D → B

Now check for AB → C using only these FDs:
(AB)+ = ABD, which does not contain C.
So, the decomposition is not dependency preserving.
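Both examples can be checked without writing out the projected FD sets F1, F2, ... by hand. The sketch below uses the standard textbook test: grow the left side of each FD by closing it inside one sub-relation at a time. It assumes single-letter attributes and (lhs, rhs) string pairs, and the function names are illustrative.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(fd, fds, parts):
    """True if fd can be derived using only FDs that live inside single sub-relations."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            part = set(part)
            # Attributes of this sub-relation determined by what we already have in it.
            gained = closure(result & part, fds) & part
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

def is_dependency_preserving(fds, parts):
    return all(preserved(fd, fds, parts) for fd in fds)

F1 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
F2 = [("AB", "C"), ("C", "D"), ("D", "B")]
print(is_dependency_preserving(F1, ["AB", "BC", "CD"]))  # True
print(is_dependency_preserving(F2, ["ABD", "CD"]))       # False: AB -> C is lost
```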
Normal Form
To eliminate (OR) reduce redundancy and minimize the insertion, deletion and updation anomalies.
Types of Normal Form:
1NF, 2NF, 3NF, BCNF : based on single-valued FDs
4NF, 5NF : based on multi-valued dependencies

First Normal Form: A relational schema R is in 1NF iff no multi-valued attribute exists in R
(OR)
the relation contains only atomic (OR) single-valued attributes.

Sid   Sname   Cname
S1    A       C/C++
S2    B       C/Java
S3    B       DB
CK: Sid

Cname is a multi-valued attribute.
So, the relation is not in 1NF.
To bring R into 1NF, split the multi-valued entries into separate tuples.

Sid   Sname   Cname
S1    A       C
S1    A       C++
S2    B       C
S2    B       Java
S3    B       DB
CK: Sid Cname

* BCNF may still suffer from redundancy because of multi-valued dependencies (MVDs).
* BCNF is the highest normal form for single-valued FDs (SVFDs).
* An RDBMS does not accept multi-valued attributes.
* By default, an RDBMS relation is always in 1NF.
Condition for checking which FD causes redundancy.
Definition: A non-trivial FD (X → Y) suffers from redundancy if X is not a super key.

X    Y
X1   Y2
X1   Y2    ← redundant copy
X2   Y3
X1   Y2    ← redundant copy
X → Y where X is not a key

If X is a key in a non-trivial FD (X → Y), then the attributes X, Y do not suffer from redundancy.

X    Y
X1   Y1
X2   Y1
X3   Y1
X is a key, so no redundancy is possible.

Possible ways a non-trivial FD can suffer from redundancy

(1) Proper subset of C.K. → Non-prime attribute

Sid   Sname   Cname
S1    A       C
S1    A       C++
S2    B       C
S2    B       Java
S3    B       DB
C.K. = Sid Cname

Here, Sid → Sname is an FD, but it suffers from redundancy: (S1, A) and (S2, B) are each stored twice.
(2) Non-prime attribute → Another non-prime attribute
Example: R(ABCD) {AB → C, C → D}
Here, the FD C → D suffers from redundancy.
(3) Proper subset of one C.K. → Proper subset of another C.K.
R(ABCD) {AB → C, C → D, C → B}
C.K. = {AB, AC}
Here, the FD C → B suffers from redundancy.
Also, the FD C → D suffers from redundancy, which is Case-1.

A    B    C    D
A1   B1   C1   D1
A1   B1   C1   D2
A2   B2   C2   D3
A2   B2   C2   D4
A3   B3   C3   D5
A2   B2   C2   D6

{A → C, C → B, B → A}
C.K. = {AD, CD, BD}
Here, the FD A → C suffers from redundancy.
Second Normal Form (2NF): Relation R is in 2NF if
(1) R is in 1NF, AND
(2) R does not contain any partial dependency.

Partial Dependency: Let R be a relational schema and X, Y, A be attribute sets, where X is any candidate key, Y is a proper subset of X, and A is a non-prime attribute. If the FD Y → A exists, it is a partial dependency.

Sid   Sname   Cname
S1    A       C
S1    A       C++
S2    B       C
S2    B       Java
S3    B       DB
CK: Sid Cname

Here, Sid → Sname is a partial dependency, so the relation is not in 2NF.
Third Normal Form (3NF):
Relation R is in 3NF iff R is in 2NF and every non-trivial FD (X → Y) satisfies at least one of the following conditions:
(1) X is a super key, (OR)
(2) Y is a prime attribute.

(OR)

Relation R is in 3NF iff
(1) R is in 2NF, AND
(2) R does not contain any TRANSITIVE DEPENDENCY.
Transitive Dependency: X → Y is a transitive dependency only if
(i) X is not a super key, AND
(ii) Y is not a prime attribute.

R(ABCD) FD = {A → B, B → C, C → D}

A    B    C    D
A1   B1   C1   D1
A2   B2   C1   D1
A3   B3   C2   D2
A4   B1   C1   D1
A5   B2   C1   D1

Here A is the only candidate key, so B → C and C → D are transitive dependencies and R is not in 3NF.

BCNF A Relation R is in BCNF only if


(1) R should be in 3NF,and
(2) Every non-trivial FD (XY) with X is Super-key
Q)Find Highest Normal Form (HNF)of the following

R(ABCDE) FD={ABC, CD, BE}


Sol. Primary key is {AB} but ,BE is partial dependency So,R in 1NF.
So, HNF = 1NF

R(ABCDE) FD={ABC, CD, DE, EA, DB}


Sol. Primary key is {AB,C,D,BE} but , EA suffer from redundancy. So,R in 3NF
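The two HNF answers above can be checked with a short program that applies the BCNF, 3NF and 2NF conditions in that order. This is a sketch under stated assumptions: attributes are single letters, FDs are (lhs, rhs) string pairs, 1NF is taken for granted, and the function names are illustrative. Candidate keys are found by brute force, so it is only suited to small exercises.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            if not any(k <= x for k in keys) and closure(x, fds) == set(schema):
                keys.append(x)
    return keys

def highest_normal_form(schema, fds):
    schema = set(schema)
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)
    # Split right sides so every FD is X -> A with a single non-trivial attribute A.
    singles = [(set(l), a) for l, r in fds for a in r if a not in l]

    def is_super(x):
        return closure(x, fds) == schema

    if all(is_super(x) for x, _ in singles):
        return "BCNF"
    if all(is_super(x) or a in prime for x, a in singles):
        return "3NF"
    # Partial dependency: a proper subset of a key determines a non-prime attribute.
    partial = any(closure(sub, fds) - prime
                  for key in keys
                  for size in range(1, len(key))
                  for sub in combinations(sorted(key), size))
    return "1NF" if partial else "2NF"

print(highest_normal_form("ABCDE", [("AB", "C"), ("C", "D"), ("B", "E")]))  # 1NF
print(highest_normal_form(
    "ABCDE", [("AB", "C"), ("C", "D"), ("D", "E"), ("E", "A"), ("D", "B")]))  # 3NF
```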

How to convert relation into BCNF
Question: R(ABCDE) FD = {AB → C, C → D, B → E}
CK: AB
Not in 2NF, as B → E is a partial dependency.
So, decompose R.

R1(ABCD)     R2(BE)
AB → C       B → E
C → D
CK: AB       CK: B
Lossless, dependency preserving, and in 2NF.

Note: The above decomposition is in 2NF but not in 3NF, because of the FD C → D in R1.

So, decompose R1.

R1(ABC)      R2(CD)      R3(BE)
AB → C       C → D       B → E
CK: AB       CK: C       CK: B
Now the decomposition is in 3NF (and also in BCNF, since every determinant is a key), lossless, and dependency preserving.

R(ABCDEF) with FD = {A → BCDEF, BC → ADEF, B → F, D → E}

Sol. Candidate keys = {A, BC}
Not in 2NF due to the partial dependency B → F.
So, decompose R.

R1(ABCDE)       R2(BF)
A → BCDE        B → F
BC → ADE
D → E
CK: A, BC       CK: B
Lossless, dependency preserving, 2NF.

But not in 3NF due to the FD D → E.

So, decompose R1.

R1(ABCD)        R2(DE)      R3(BF)
A → BCD         D → E       B → F
BC → AD
CK: A, BC       CK: D       CK: B
Lossless, dependency preserving, 3NF, BCNF.

Counter example for BCNF.

R(ABCD) {AB → CD, D → A}
C.K. = AB, DB
Not in BCNF because of D → A. So, decompose.

R1(BCD)      R2(DA)
BD → C       D → A
CK: BD       CK: D
BCNF and lossless, but NOT dependency preserving (AB → CD cannot be derived from the FDs of R1 and R2).

Design Goal        1NF   2NF   3NF   BCNF
0% redundancy      no    no    no    yes (for SVFDs only)
Lossless           yes   yes   yes   yes
Dep. preserving    yes   yes   yes   not always

Note: 3NF is often the more practical normal form, as a BCNF decomposition is not always dependency preserving.
(1) BINARY RELATION: Relation with 2 attributes
Example: For R(AB), the possible FD sets, with their keys and highest normal form, are:

     FDs                 CK      HNF
(1)  A → B               A       BCNF
(2)  B → A               B       BCNF
(3)  A → B, B → A        A, B    BCNF
(4)  No non-trivial FD   AB      BCNF
Note: A relation with 2 attributes is always in BCNF.
(2) Relation with only simple CKs: If every candidate key is simple, a partial dependency is not possible. So a relation whose candidate keys are all simple is always in 2NF, but it may or may not be in 3NF (OR) BCNF.

R(ABCD) {A → B, B → C, C → D, D → A}

The keys are {A, B, C, D}, so R is in 2NF.

R(ABCDE)   A → B   B → AC   C → DE
2NF        yes     yes      yes
3NF        yes     yes      no

(3) Relation with only prime attributes (no non-prime attribute)
If a relation R has only prime attributes, then R is always in 3NF.

R(ABCDE) {AB → C, C → D, D → E, E → A}
CK = AB, EB, DB, CB

        AB → C   C → D   D → E   E → A
3NF     yes      yes     yes     yes
BCNF    yes      no      no      no
AB is a candidate key, so AB → C on its own satisfies the BCNF condition; the other three FDs do not, so R is in 3NF but not in BCNF.

Equality of FD’s Set


Let F and G be two FD sets of relational schema R. F and G are equivalent if and only if F covers G and G covers F:
(1) F covers G: all FDs of G are logically implied by the FD set F, written F ⊇ G, AND
(2) G covers F: all FDs of F are logically implied by the FD set G, written G ⊇ F.
A set of functional dependencies F is said to cover another set of functional dependencies G if every FD in G is also in F+, that is, if every dependency in G can be inferred from F; alternatively, we can say that G is covered by F.

F = {A → B, AB → C, D → AC, D → E}
G = {A → BC, D → AE}
Which is TRUE?
(a) F ⊂ G (b) G ⊂ F (c) F ≡ G (d) none
Sol.
Check F covers G (closures computed with respect to F):
A → BC: A+ = ABC, which contains BC
D → AE: D+ = ABCDE, which contains AE
So, F covers G.
Check G covers F (closures computed with respect to G):
A → B: A+ = ABC, which contains B
AB → C: (AB)+ = ABC, which contains C
D → AC: D+ = ABCDE, which contains AC
D → E: D+ = ABCDE, which contains E
So, G covers F.
So, F ≡ G, i.e., option (c).
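The two cover checks above are membership tests applied to every FD of the other set. The sketch below assumes single-letter attributes and (lhs, rhs) string pairs; covers and equivalent are illustrative names.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g is logically implied by f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    return covers(f, g) and covers(g, f)

F = [("A", "B"), ("AB", "C"), ("D", "AC"), ("D", "E")]
G = [("A", "BC"), ("D", "AE")]
print(covers(F, G), covers(G, F), equivalent(F, G))  # True True True
```

The same covers function also answers a single membership test: covers(F, [("wx", "y")]) asks whether wx → y follows from F.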
Canonical Cover (OR) Minimal Cover
A functional dependency set F is said to be a minimal cover of a functional dependency set E if every FD of E is implied by F (F covers E) and removing any FD from F leaves a set that no longer covers every FD of E.
It is used to eliminate redundant FDs.
Extraneous Attribute: An attribute of an FD that can be removed without changing the closure of the FD set is called an extraneous attribute.

Finding a minimal cover (OR) canonical cover means eliminating extraneous attributes (OR) redundant FDs.
Note: A minimal cover may not be unique, but all minimal covers are logically equivalent.
Possible Cases:
(1) {AB → C, A → B} ≡ {A → C, A → B}        B is an extraneous attribute
(2) {AB → C, A → C} ≡ {A → C}               B is an extraneous attribute
(3) {A → BC, B → C} ≡ {A → B, B → C}        C is an extraneous attribute
(4) {AB → CD, BC → D} ≡ {AB → C, BC → D}    D is an extraneous attribute
How to find whether an attribute is extraneous (redundant)

(1) For each functional dependency X → A in F, and for each attribute B that is an element of X:
If {{F − {X → A}} ∪ {(X − {B}) → A}} is equivalent to F,
then replace X → A with (X − {B}) → A in F.

Possible Case-1
{AB → C, A → B} ≡ {A → C, A → B}
Here, B is an extraneous attribute.
Proof: First suppose ‘A’ is the redundant attribute.
Then {AB → C, A → B} would reduce to {B → C, A → B}.
Algorithm:
F = {A → B, B → C}, G = {AB → C, A → B}
Check F covers G (closures w.r.t. F)     Check G covers F (closures w.r.t. G)
AB → C: (AB)+ = ABC                      A → B: A+ = ABC
A → B:  A+ = ABC                         B → C: B+ = B
B → C cannot be derived from G, so G does not cover F.
So, A is not an extraneous attribute.

Now suppose B is the extraneous attribute.
Then {AB → C, A → B} reduces to {A → C, A → B}.
Algo. F = {A → B, A → C}, G = {AB → C, A → B}
Check F covers G (closures w.r.t. F)     Check G covers F (closures w.r.t. G)
AB → C: (AB)+ = ABC                      A → B: A+ = ABC
A → B:  A+ = ABC                         A → C: A+ = ABC
F and G are equivalent.
So, B is an extraneous attribute.

Algo for finding a minimal cover (F) for a set of functional dependencies (E)
(1) Set F := E.
(2) Replace each FD X → {A1, A2, …, An} in F by the n functional dependencies X → A1, X → A2, …, X → An.
(3) For each functional dependency X → A in F, and for each attribute B that is an element of X: if {{F − {X → A}} ∪ {(X − {B}) → A}} is equivalent to F, then replace X → A with (X − {B}) → A in F.
(4) For each remaining functional dependency X → A in F: if {F − {X → A}} is equivalent to F, then remove X → A from F.
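The four steps can be sketched as below. The sketch assumes single-letter attributes and (lhs, rhs) string pairs, and it replaces the set-equivalence checks of steps (3) and (4) with the usual closure shortcuts: B is extraneous in X → A when A is in (X − {B})+ under F, and X → A is redundant when A is in X+ under F − {X → A}. The name minimal_cover and the output format are choices made for this illustration; since a minimal cover need not be unique, a different processing order may give a different but equivalent result.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Steps 1 and 2: copy the FDs with right sides split into single attributes.
    cover = [(frozenset(l), a) for l, r in fds for a in r]
    # Step 3: drop extraneous attributes from left sides.
    for i, (lhs, a) in enumerate(cover):
        for b in sorted(lhs):
            reduced = cover[i][0] - {b}
            if reduced and a in closure(reduced, cover):
                cover[i] = (reduced, a)
    cover = list(dict.fromkeys(cover))  # remove exact duplicates, keep order
    # Step 4: drop FDs that follow from the remaining ones.
    for fd in list(cover):
        rest = [other for other in cover if other != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return [("".join(sorted(l)), a) for l, a in cover]

print(minimal_cover([("AB", "C"), ("A", "B")]))    # [('A', 'C'), ('A', 'B')]
print(minimal_cover([("AB", "C"), ("A", "C")]))    # [('A', 'C')]
print(minimal_cover([("A", "BC"), ("B", "C")]))    # [('A', 'B'), ('B', 'C')]
print(minimal_cover([("AB", "CD"), ("BC", "D")]))  # [('AB', 'C'), ('BC', 'D')]
```

The four calls correspond to the four "Possible Cases" listed earlier.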
Functional Dependency Closure (F+)
* The set of all trivial and non-trivial FDs determined by a given FD set (F) is called F+.
We can also find F+ by attribute set closure: a left side X determines exactly the subsets of X+, so X contributes 2^|X+| FDs (counting X → ∅).

R(ABC) F = {A → B, B → C}

Sol.
∅+ = ∅ (1 FD): ∅ → ∅
A+ = ABC (8 FDs): A → ∅, A → A, A → B, A → C, A → AB, A → BC, A → AC, A → ABC
B+ = BC (4 FDs): B → ∅, B → B, B → C, B → BC
C+ = C (2 FDs): C → ∅, C → C
(AB)+ = ABC (8 FDs): AB → ∅, AB → A, AB → B, AB → C, AB → AB, AB → BC, AB → AC, AB → ABC
(BC)+ = BC (4 FDs): BC → ∅, BC → B, BC → C, BC → BC
(AC)+ = ABC (8 FDs): AC → ∅, AC → A, AC → C, AC → B, AC → AB, AC → AC, AC → BC, AC → ABC
(ABC)+ = ABC (8 FDs): ABC → ∅, ABC → A, ABC → B, ABC → C, ABC → AB, ABC → BC, ABC → AC, ABC → ABC
Total = 43

R(AB) FD’s = {A → B, B → A}

Sol.
∅+ = ∅ (1 FD): ∅ → ∅
A+ = AB (4 FDs): A → ∅, A → A, A → B, A → AB
B+ = AB (4 FDs): B → ∅, B → A, B → B, B → AB
(AB)+ = AB (4 FDs): AB → ∅, AB → A, AB → B, AB → AB
Total = 13

Foreign Key
Foreign Key: A set of attributes that references the primary key (OR) an alternate key of the same (OR) some other table.
* Used to relate data in one table to data in another table.
* Used to maintain consistency among the tuples of the two relations.
Referential integrity: Referential integrity is a constraint between two relation schemas R1 and R2, specified by the conditions for a foreign key given below. A set of attributes FK in relation schema R1 is a foreign key of R1 that references relation R2 if it satisfies both conditions:
1. The attributes in FK have the same domain(s) as the primary key attributes PK of R2; the attributes FK are then said to reference or refer to the relation R2.
2. A value of FK in a tuple t1 of the current state r1(R1) either occurs as a value of PK for some tuple t2 in the current state r2(R2), or is NULL. In the former case, t1[FK] = t2[PK], and the tuple t1 is said to reference or refer to the tuple t2.
R1 is the referencing relation and R2 is the referenced relation.
If these two conditions hold, a referential integrity constraint from R1 to R2 is said to hold.
For example, in an enroll table (referencing table), the sid attribute is a foreign key referencing the student table (referenced table).

Note: A foreign key is not unique, i.e., the same foreign key value may appear in many tuples.


Integrity constraints of Foreign Key(Referenced Relation and Referencing Relation)
(1) Referenced Relation (Student) integrity constraint
(a) Insertion in Referenced Relation: Whenever insert new record in referenced relation,
no violation occur in referencing table due to insertion operation, so allow to execute the
operation.
(b) Deletion in Referenced Relation: Whenever delete some record in referenced relation
,there may be violation occur in referencing table due to delete operation so ,it may cause
violation.
So, if any violation occurs, then it can be removed by different integrity constraint given
below:-
(i) On Delete No Action: Delete operation restricted if integrity constraints violation occurs.
(ii) On Delete Cascade: After deleting, if violation occur then delete corresponding tuples
from both relations.
(iii) On Delete Set Null:After deleting, if violation occur then set the Null value in
referencing table attribute.

----------------------------  DATABASE MANAGEMENT SYSTEM  ----------------------------


22 | DBMS : CS
Note: If a referencing attribute that cause a violation is part of PK then it cannot be set as
NULL.
Note:-If nothing mention in definition , then by default constraints is on Delete No Action.
(c) Updation: Whenever update some record in referenced relation ,there may be violation
occur in referencing table due to update operation.
So, if any violation occurs, then it can be removed by different integrity constraint given
below:-
(i) On update No Action: update operation restricted if integrity constraints violation occurs.
(ii) On update Cascade: After deleting, if violation occur then update corresponding tuples
from both relations.
(iii) On update Set Null:After update, if violation occur then set the Null value in referencing
table attribute.
Note: If a referencing attribute that cause a violation is part of PK then it cannot be set as
NULL.
Note:-If nothing mention in definition , then by default constraints is on Update No Action.
(2) Referencing Relation (Enroll) integrity constraint
In the referencing relation, an insert or update that would cause an integrity violation is
rejected.
Note:-* The referenced table behaves like the parent relation
* The referencing table behaves like the child relation
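
The three delete options above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module and two hypothetical tables, Student (referenced) and Enroll (referencing); the table names, column names and the helper enroll_after_delete are illustrative only and are not part of the text.

```python
import sqlite3

def enroll_after_delete(on_delete):
    """Delete student 1 and return the remaining Enroll rows, or 'restricted'."""
    con = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    con.execute("PRAGMA foreign_keys = ON")
    con.execute("CREATE TABLE Student (sid INTEGER PRIMARY KEY, sname TEXT)")
    con.execute(
        "CREATE TABLE Enroll (eid INTEGER PRIMARY KEY, "
        f"sid INTEGER REFERENCES Student(sid) ON DELETE {on_delete})"
    )
    con.executemany("INSERT INTO Student VALUES (?, ?)", [(1, "A"), (2, "B")])
    con.executemany("INSERT INTO Enroll VALUES (?, ?)", [(10, 1), (11, 2)])
    try:
        con.execute("DELETE FROM Student WHERE sid = 1")
    except sqlite3.IntegrityError:
        # NO ACTION: the delete is rejected because Enroll still refers to sid 1.
        return "restricted"
    return con.execute("SELECT eid, sid FROM Enroll ORDER BY eid").fetchall()

for option in ("NO ACTION", "CASCADE", "SET NULL"):
    print(option, enroll_after_delete(option))
```

With CASCADE the referencing row disappears together with the referenced row, while with SET NULL the referencing row stays and its foreign key becomes NULL.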


PRACTICE QUESTIONS

1. Let R be a relational schema with 5 attributes, i.e. R(a1, a2, ..., a5). If a1 and a2 are candidate
keys, then how many super keys are possible?
Min : 24.0
Max : 24.0
Answer 24
Sol. 24
2^(5-1) + 2^(5-1) - 2^(5-2) = 2^4 + 2^4 - 2^3 = 32 - 8 = 24
2. Let R(ABCD) be a relation schema with the following dependencies. The total number of candidate
keys possible is _________
FD's = {A → B, B → C, C → D, D → A}

Min : 4.0
Max : 4.0
Answer 4
Sol. 4
A+ = ABCD, B+ = BCDA, C+ = CDAB, D+ = DABC, so each of A, B, C and D is a candidate key.
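
The closure computation used in this solution can be sketched in a few lines of Python. The function names closure and candidate_keys are illustrative assumptions; FDs are written as (left side, right side) pairs of attribute strings, and candidate keys are found by brute force over attribute subsets, which is only practical for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the attribute set closure X+ of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Enumerate subsets by increasing size and keep the minimal super keys."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(schema):
                keys.append(combo)
    return keys

fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(candidate_keys("ABCD", fds))
```

For the FD set of question 2 this lists the four single-attribute keys A, B, C and D.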
3. Let R(A, B, C, D) be a relational schema where AB and BC are candidate keys. The total number
of super keys possible is __________
Min : 6.0
Max : 6.0
Answer 6
Sol. 6
When AB is a candidate key, the number of super keys containing AB is 2^(4-2) = 2^2 = 4.
When BC is a candidate key, the number of super keys containing BC is
2^(4-2) = 2^2 = 4, but the super keys containing ABC are counted in both.
So, the total number of super keys is
2^(4-2) + 2^(4-2) - 2^(4-3) = 2^2 + 2^2 - 2^1 = 4 + 4 - 2 = 6
4. Given a relation R(A, B, C) and the set of functional dependencies FD = {A → BC, B → C, C
→ B}, what are the candidate keys of R?
(A) A only (B) AB only
(C) AB, AC only (D) A, B, C
Sol. (A)
A+ = ABC
B+ = BC
(AB)+ = ABC, but AB is only a super key because its subset A is already a key.
5. Consider a table for relation R

Name    Rank                Room number   Shift
Ram     Lecturer            104           Morning
Shyam   Student counselor   105           Afternoon
Hari    Clerical            115           Morning
Ram     Student counselor   105           Evening

Which of the following is not a candidate key for the above relation?

(A) {Name, Rank} (B) {Room number, Shift}
(C) {Rank, Room number} (D) {Name, Shift}
Sol. (C)
{Rank, Room number} is not a candidate key because (Student counselor, 105) occurs twice.
6. Let R be a relational schema with n attributes. The minimum number of super keys possible is
(A) 0 (B) 1 (C) n (D) n-1
Sol. (B)
The minimum is 1, i.e. when only the combination of all the attributes forms a key.
7. Consider the relational schema R(ABCDE) with the following FD's:
FD's = {AB → C, BC → D, D → E, E → B}
The total number of candidate keys possible is ___________
Min : 3.0
Max : 3.0
Sol. 3
(AB)+ = ABCDE
(AE)+ = AEBCD
(AD)+ = ADEBC
8. Consider the following statements:
1) Every super key is a candidate key
2) Every candidate key is a super key
Which of the statements are true? ___________
(A) 1 only (B) 2 only (C) both (D) none
Sol. (B)
Every candidate key is a super key, but a super key need not be a candidate key.

9. Consider a relational schema with 5 attributes. The maximum number of super keys possible
is ________

Min : 31.0
Max : 31.0
Answer 31
Sol. 31
2^5 - 1 = 31
10. Consider the relational schema R(ABCDE) with the following FD's:
FD's = {AB → CD, C → D, D → B, B → A}
The total number of candidate keys possible is _
(A) 0 (B) 1 (C) 2 (D) 3
Sol. (D)
E does not appear on the right side of any FD, so every candidate key contains E.
The candidate keys of R are EB, EC and ED:
(EB)+ = EBACD
(EC)+ = ECDBA
(ED)+ = EDBAC
11. The following functional dependencies hold for relations R(A, B, C) and S(B, D, E):
B → A, A → C
Relation R contains 500 tuples and relation S contains 100 tuples. What is the maximum
number of tuples possible in the natural join R ⋈ S? _______
Min : 100.0
Max : 100.0
Answer 100
Sol. 100
Since B → A and A → C, B is a candidate key of R(A, B, C), so each tuple of S can join with at most
one tuple of R on B. S contains 100 tuples, so the join contains at most 100 tuples.
12. Given a relation R(A, B, C) and FD set {A → BC, B → C, C → B}, the highest normal form
of R is
(A) 1NF (B) 2NF (C) 3NF (D) BCNF
Sol. (B)
Candidate key of R is {A}.
The key is a single attribute, so no partial dependencies exist and R is in 2NF. B → C is a
transitive dependency (B is not a super key and C is non-prime), so R is not in 3NF.
13. Let the relation be R(A, B, C, D, E, F) with candidate keys AB and AE. The number of super keys
is _________
Min: 24.0
Max : 24.0

Answer 24
Sol. 24
Number of super keys = Superkeys(AB) + Superkeys(AE) - Superkeys(ABE)
= 2^(6-2) + 2^(6-2) - 2^(6-3)
= 2^4 + 2^4 - 2^3
= 16 + 16 - 8 = 24
13. Consider the relation R(ABCDEF) and functional dependencies {AB → C, C → D, D → EB,
E → F, F → A}. The total number of candidate keys possible is _______
Min: 5.0
Max : 5.0
Answer 5
Sol. 5
i.e. {C, D, AB, BE, BF}
14. Consider the relation R(ABCD) with functional dependencies
F.D = {AB → CD, D → A}, decomposed into relations R1(AD) and R2(BCD). Which of the
following statements is correct?
(A) The decomposition is lossy decomposition
(B) The decomposition is lossless decomposition
(C) In some cases, decomposition is lossless but sometimes lossy
(D) none
Sol. (B)
R1(AD) has the FD {D → A}. So, D is the candidate key of R1.
R2(BCD) has the FD {BD → C}, because (BD)+ = ABCD. So, BD is the candidate key of R2.
The common attribute D of R1 and R2 is a key of R1.
So, the decomposition is lossless but not dependency preserving (AB → CD is lost).
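
The lossless test for a decomposition into two relations can be written as a short sketch: the decomposition is lossless when the closure of the common attributes covers all of R1 or all of R2. The names closure and is_lossless are illustrative, and the test applies to binary decompositions only.

```python
def closure(attrs, fds):
    """Return the attribute set closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Binary decomposition test: (R1 ∩ R2)+ must contain R1 or R2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

fds_q14 = [("AB", "CD"), ("D", "A")]
print(is_lossless("AD", "BCD", fds_q14))
```

For question 14 the common attribute is D and D+ = DA covers R1(AD), so the check succeeds.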
15. Consider the relation R(ABCDEF) with FD's {A → FC, C → D, B → E}. To convert the
relation into 3NF, the minimum number of tables required is ________
Min : 4.0
Max : 4.0
Answer 4
Sol. 4
Candidate key of R is {AB}.
So, A → FC and B → E are partial dependencies, and C → D is a transitive dependency.
So, the tables required are
CD, ACF, BE, AB
16. Consider the following FD sets F and G:
F = {A → B, AB → C, D → AC, D → E}
G = {A → BC, D → AB}
Which of the following is correct?
(A) F covers G (B) G covers F
(C) F & G are equivalent (D) None of these
Sol. (A)
Every FD of G can be derived from F, but the FD D → E of F is not covered by FD set G.
17. If table R has only one candidate key, then which of the following is always true?
(A) R is in 2NF but not in 3NF
(B) R is in 3NF, also in BCNF
(C) R is in 2NF, but may not be in 3NF
(D) None
Sol. (D)
Consider a relation with FD's {AB → C, B → D}
Candidate key = AB
B → D is a partial dependency, so R is not even in 2NF
Consider a relation with FD's {A → BC, B → E}
Candidate key = A
So, R is in 2NF but not in 3NF
Consider a relation with FD's {A → C, B → D}
Candidate key = AB. So, R is not in 2NF
18. Consider schema R(ABCDE) and functional dependencies
{A → BC, CD → E, B → D, E → A}. The decomposition of R into R1(ABC) and R2(ADE)
is
(A) dependency preserving and lossless join
(B) lossless join but not dependency preserving
(C) dependency preserving, but not lossless join
(D) none
Sol. (B)
The common attribute of R1 and R2 is A,
and A is a candidate key of R1.
So, the decomposition is lossless,
but it is not dependency preserving, i.e. CD → E and B → D are lost.
19. Consider the following table, where A is the primary key and B is a foreign key which references
A with ON DELETE CASCADE. What are the additional tuples deleted when tuple (2,
3) is deleted?

A B
1 2
2 3
4 3
5 1
6 4

(A) (1, 2), (4, 1) (B) (1, 2), (5, 1)
(C) (4, 3), (5, 1) (D) (1, 2), (4, 3), (5, 1)
Sol. (B)
When (2, 3) is deleted, the tuple whose column B has value 2 is also deleted, i.e. (1, 2). Now,
the tuple whose column B has value 1 is also deleted, i.e. (5, 1).
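
The chain of deletions can be traced with a small simulation. This is only a sketch of the cascade idea on a list of (A, B) pairs, not how a DBMS implements it; cascade_delete is a hypothetical helper.

```python
def cascade_delete(rows, start):
    """Delete `start`, then repeatedly delete rows whose B refers to a deleted A.

    rows is a list of (A, B) pairs where B references column A of the same table.
    Returns (additionally deleted rows, remaining rows).
    """
    remaining = [r for r in rows if r != start]
    extra = []
    frontier = [start[0]]          # key values that no longer exist
    while frontier:
        key = frontier.pop()
        victims = [r for r in remaining if r[1] == key]
        for r in victims:
            remaining.remove(r)
            extra.append(r)
            frontier.append(r[0])  # its own key value is gone as well
    return extra, remaining

table = [(1, 2), (2, 3), (4, 3), (5, 1), (6, 4)]
print(cascade_delete(table, (2, 3)))
```

Deleting (2, 3) removes (1, 2) and then (5, 1); the rows (4, 3) and (6, 4) are untouched because nothing they reference was deleted.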
20. Given relation R(ABCD) with FD set {AB → C, C → A, C → D, D → A}. The total number of
candidate keys is _________
Min : 3.0
Max : 3.0
Answer 3
Sol. 3
B does not appear on the right side of any FD, so every candidate key contains B.
(BC)+ = BCAD
(AB)+ = ABCD
(BD)+ = BDAC (D → A, then AB → C)
21. Relation R(ABCDE) with FD set {AB → D, A → D, C → D, ABC → DE}.
The total number of FD's which violate 2NF is _____________
Min : 3.0
Max : 3.0
Answer 3
Sol. 3
Candidate key is ABC
So, partial dependencies are
AB → D, A → D, C → D
22. Which of the following is a correct statement?
(A) Every relation in BCNF is also in 3NF (B) Every 2NF relation is in BCNF
(C) Every 1NF relation is in 3NF (D) Every relation with a transitive dependency is
in 3NF

Sol. (A)
Every BCNF relation is also in 3NF, but not vice versa.
23. Consider a relation R(MNPQT) with FD sets
F1 = {M → TN, P → QM}
F2 = {M → N, P → Q, P → MT, MN → T}
Which of the following is correct?
(A) F1 and F2 are equivalent (B) F1 and F2 are not equivalent
(C) F1 ⊂ F2 (D) F2 ⊂ F1
Sol. (A)
F1 covers F2 and F2 covers F1. So, both FD sets are equivalent.
24. Consider the relation R(A, B, C, D, E, F) with FD's {AB → C, C → D, D → E, E → F, F →
B}
The maximum number of candidate keys possible is __________
Min : 5.0
Max : 5.0
Answer 5
Sol. 5
(AB)+ = ABCDEF
(AC)+ = ABCDEF
(AD)+ = ABCDEF
(AF)+ = ABCDEF
(AE)+ = ABCDEF
25. Consider the following relation
R(MNOPQR) with FD set
{M → N, O → R, Q → R}. The minimum number of relations required to decompose relation
R into 2NF with a lossless join and dependency preserving decomposition
is _________?
Min: 4.0
Max : 4.0
Answer 4
Sol. 4
M, O, P and Q do not appear on the right side of any FD, so the candidate key of R is {MOPQ}
So, the partial dependencies are M → N, O → R, Q → R
So, decompose relation R into
R1 (MN) R2 (OR) R3 (QR) R4 (MOPQ)

26. Consider the following relation R(ABCDE) with functional dependency set F = {AB → C,
AB → D, D → A, BC → D, BC → E}. What is the highest normal
form satisfied by R?
(A) 1NF (B) 2NF (C) 3NF (D) BCNF
Sol. (C)
The candidate keys of relation R are AB, BD, BC.
D → A does not satisfy the condition of BCNF (D is not a super key), but A is a prime attribute.
So, R is in 3NF but not in BCNF
27. Consider the following two relations R1(ABC) and R2(ABC) with the following FD sets
respectively
FD1 = {A → B, B → C, A → C}
FD2 = {A → B, B → C, AB → C}
What is the relationship between R1 and R2?
(A) R1 ⊂ R2 (B) R2 ⊂ R1 (C) R2 ≡ R1 (D) no relation

Sol. (C)
Both relations R1 and R2 are equivalent because FD1 covers FD2 and FD2 covers FD1
28. Consider the following relation schema R(ABCD) with the following FD's
FD = {A → B, B → C, C → D, AB → D}
then,
(a) A → D is a member of the closure of the FD set
(b) BC → D is a member of the closure of the FD set
The total number of statements which are true is _________
Min : 2.0
Max : 2.0
Answer 2
Sol. 2
find A+
A+ = ABCD i.e. A → D
find (BC)+ = BCD i.e. BC → D
So, both statements are true
29. Consider the following set F of functional dependencies on relational schema R(ABC)
{A → BC, B → C, A → B, AB → C}
The canonical cover for F is:-

(A) {A → B, B → C, A → C} (B) {A → B, B → C, AB → C}
(C) {A → B, B → C} (D) none
Sol. (C)
A canonical cover contains no extraneous attribute and no redundant FD.
i.e.,
F = {A → B, A → C, B → C, A → B, AB → C}
= {A → B, B → C}
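
The reduction above can be sketched as the usual three-step procedure: split the right sides, drop extraneous left-side attributes, then drop FDs implied by the rest. The function names are illustrative, and a minimal cover is not unique in general, so the result may depend on the order in which FDs are examined.

```python
def closure(attrs, fds):
    """Return the attribute set closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: split every right side into single attributes.
    work = []
    for lhs, rhs in fds:
        for a in rhs:
            fd = (frozenset(lhs), a)
            if fd not in work:
                work.append(fd)
    # Step 2: remove extraneous attributes from each left side.
    reduced = []
    for lhs, a in work:
        lhs = set(lhs)
        for b in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {b}, work):
                lhs.discard(b)
        fd = (frozenset(lhs), a)
        if fd not in reduced:
            reduced.append(fd)
    # Step 3: remove every FD that follows from the remaining ones.
    result = list(reduced)
    for fd in reduced:
        rest = [f for f in result if f != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return {("".join(sorted(l)), r) for l, r in result}

F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
print(minimal_cover(F))
```

For the FD set of question 29 the sketch keeps only A → B and B → C, which matches option (C).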
30. Consider the following R(A, B, C, D) with FD set {AB → C, C → A, C → D}. The total number
of candidate keys is ___________
Min : 2.0
Max : 2.0
Answer 2
Sol. 2
(AB)+ = ABCD
(BC)+ = BCAD
31. Consider the following relation R(ABCDE) with FD sets
F1 = {A → EB, C → DA}
F2 = {A → B, C → D, C → AE, AB → E}
Which of the following is true?
(A) F1 ⊂ F2 (B) F1 ⊃ F2

(C) F1 ≡ F2 (D) Can't say

Sol. (C)
Check that F1 covers F2 (closures taken under F1):
A → B, A+ = AEB
C → D, C+ = CDAEB
C → AE, C+ = CDAEB
AB → E, (AB)+ = ABE
Check that F2 covers F1 (closures taken under F2):
A → EB, A+ = ABE
C → DA, C+ = CDAEB
So, F1 ≡ F2
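
The cover test used in this solution can be automated with closures: F covers G when every FD of G follows from F. The helper names below are illustrative assumptions.

```python
def closure(attrs, fds):
    """Return the attribute set closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """F covers G if, for every X -> Y in G, Y is contained in X+ under F."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    return covers(f, g) and covers(g, f)

F1 = [("A", "EB"), ("C", "DA")]
F2 = [("A", "B"), ("C", "D"), ("C", "AE"), ("AB", "E")]
print(equivalent(F1, F2))
```

The same two calls settle question 16: there F covers G, but G does not cover F because D → E cannot be derived from G.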

32. Consider the following relational instance

A B C D
1 b1 c1 d1
1 b1 c2 d2
2 b1 c3 d1
3 b2 c2 d3

Which of the following functional dependencies holds on the above relational instance?
(A) A → B (B) B → D
(C) C → B (D) D → A
Sol. (A)
If the FD x → y holds, then any two tuples that agree on x must also agree on y.

A B
1 b1
1 b1
2 b1
3 b2

In the above question, only A → B satisfies this condition.
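
The check "same x implies same y" can be sketched directly on the instance. The function holds is an illustrative helper; note that an instance can only refute an FD, it cannot prove that the FD holds on the schema in general.

```python
def holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs is satisfied by this instance."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first y seen for each x; a different y later violates the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True

instance = [
    {"A": 1, "B": "b1", "C": "c1", "D": "d1"},
    {"A": 1, "B": "b1", "C": "c2", "D": "d2"},
    {"A": 2, "B": "b1", "C": "c3", "D": "d1"},
    {"A": 3, "B": "b2", "C": "c2", "D": "d3"},
]
for lhs, rhs in [("A", "B"), ("B", "D"), ("C", "B"), ("D", "A")]:
    print(lhs, "->", rhs, holds(instance, lhs, rhs))
```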


33. Which of the following properties of functional dependencies is not valid?
(A) If x → y then xz → yz
(B) If x → y & wy → z then wx → z
(C) If xy → z then x → z & y → z
(D) If x → y & z → w then xz → yw
Min : 3.0
Max : 3.0
Answer 3
Sol. (C)
Property 1 is the Augmentation property.
Property 2 is the Pseudotransitivity property.
Property 4 is the Composition property.
But if xy → z, it does not follow that x → z & y → z.
34. Consider the relation schema R(ABCDE) with FD set {AD → B, BC → E, ED → A}.
Which of the following is a candidate key of R?
(A) ABCDE (B) ABCD (C) ACD (D) ACE
Sol. (C)
(ACD)+ = ACDBE
35. Consider the following FD set of the given relational schema R(ABCD):
{AB → CD, BC → D}. Which of the following attributes is extraneous in the FD set?
(A) A in AB → CD (B) B in BC → D
(C) C in AB → CD (D) D in AB → CD
Sol. (D)
After removing attribute D from the FD AB → CD, the resulting
FD set is {AB → C, BC → D}
The new FD set and the original FD set cover each other. So, both FD sets are
equivalent.


EXERCISE QUESTIONS

1. Consider the relational schema R(ABC) with FD set {A → B, B → A, B → C}.
Which of the following statements is true?
1) FD set {A → B, B → A, A → C} is a minimal cover
2) FD set {A → B, B → A, B → C} is a minimal cover
(A) 1 only (B) 2 only (C) both (D) none
Ans. (C)
2. Consider a relation R with attribute set {A, B}. The highest normal form of R is
(A) 1NF (B) 2NF (C) 3NF (D) BCNF
Ans. (D)
3. Which of the normal forms is more accurate in a relational database management system with
single-valued FD's?
(A) 1NF (B) 2NF (C) 3NF (D) BCNF
Ans. (C)
4. Consider the relation R(ABCDE) with the following FD's:
{AB → C, C → D, D → E, E → A}
The highest normal form of the above relation is
(A) 1NF (B) 2NF (C) 3NF (D) BCNF
Ans. (C)
5. Consider the following relation R(ABCDEF) with FD's
{A → BC, A → D, BC → ADEF, B → F, A → E, A → F, D → E}. The highest normal
form of the given relation is
(A) 1NF (B) 2NF (C) 3NF (D) BCNF
Ans. (A)
6. Consider the following relation R(ABCDEFG) with FD set
{A → G, E → F, C → D, B → E, AB → C}
The highest normal form of relation R is
(A) 1NF (B) 2NF (C) 3NF (D) BCNF
Ans. (A)
7. Consider a relation R with some FD set. If all candidate keys of R are simple (i.e. each candidate
key consists of a single attribute), then R is always in
(A) 1NF (B) 2NF (C) 3NF (D) BCNF

8. Consider the following relation R(ABCDE) with FD's {A → B, B → AC, C → DE}.
Which of the following statements is correct?
(A) R is in 1NF but not in 2NF (B) R is in 2NF but not in 3NF
(C) R is in 3NF but not in BCNF (D) R is in BCNF
Ans. (B)
9. If all the attributes of a relation R are prime, then R is always in
(A) 1NF (B) 2NF (C) 3NF (D) BCNF
Ans. (C)
10. The total number of serial schedules possible with 5 transactions is _________
(A) 115 (B) 118 (C) 125 (D) 120
Ans. D
11. Complete the sentence: Logical Data Independence is the ability to modify…
A. physical-level schema without affecting the logical-level schema
B. the logical-level schema with no effect on view-level schema.
C. view-level schema without affecting logical -level schema.
D. logical-level schema without affecting physical-level schema.
Ans. B
12. Complete the sentence: Physical Data Independence is the ability to modify…
A. physical-level schema without affecting the logical-level schema
B. the logical-level schema with no effect on view-level schema.
C. view-level schema without affecting logical-level schema.
D. logical-level schema without affecting physical-level schema.
Ans. A
13. Suppose A is a foreign key in R that refers to tuples of S using values of the key attribute B of
S. Let X be the set of all non-null values of column A and let Y be the set of all values of
column B. Identify the correct relationship between X and Y.
A. X is a subset of Y
B. X is a proper subset of Y
C. Y is a subset of X
D. X need not be a subset of Y and Y need not be a subset of X
Ans. A
14. Consider the two statements given below
S1: Each record of referencing relation can relate to at-most one record of referenced relation.
S2: Each record of referenced relation can relate to many (0 or more) records of referencing
relation.
Choose correct option:
A. S1 is true, S2 is true B. S1 is false, S2 is true
C. S1 is false, S2 is false D. S1 is true, S2 is false
Ans. A
15. Complete the following statement by picking the appropriate for the blanks: An alternative to
establish that a set of attributes X, form a key of a relation R, is by checking if R is equal to
the ….. of the set X.
A. attributes B. closure
C. functional dependencies D. prime attributes
Ans. B
16. Given F = {A → B, B → C, C → D}, which of the following represents F+?
A. {A → B, B → C, C → D, A → BCD}
B. {A → B, B → C, C → D, A → C, B → D, A → D, B → CD, A → BCD}
C. {A → C, B → D, A → D}
D. {A → C, B → D, A → D, A → BCD}
Ans. B
17. If F = {AB → CD}, then F entails which of the following?
A. ABE → CDE B. AB → CDE
C. AB → C D. A → CD
Ans. A, C
18. Given that A, B, C are attributes (Note: they are not sets of attributes) of a relation, consider
the following functional dependencies:
S1: A → AB,
S2: ABC → AB.
Which of the following options is correct?
A. S1 is trivial FD and S2 is a non-trivial FD
B. S1 is a non-trivial FD and S2 is a trivial FD
C. Both S1 and S2 are trivial FDs
D. Both S1 and S2 are non-trivial FDs
Ans. B
19. If F = {X → Y, WY → Z}, then F entails which of the following?
A. W → Z B. WX → Z
C. X → Z D. Y → Z
Ans. B

CHAPTER-2
2 E-R MODEL
E-R Model
The entity-relationship (E-R) data model identifies the entities to be represented
in the database and the relationships among them. So, the E-R model is the diagrammatic representation
of the database design.
Entity: It is a "thing" or "object" in the real world that is distinguishable from all other objects. For
example, each employee in an organization is an entity. Each entity has a set of properties,
and the values of some of these properties may uniquely identify the entity.
Entity set: An entity set is a set of entities of the same type that share the same
properties, or attributes. It is represented by a rectangle.

Attribute: The descriptive properties possessed by each member of an entity set are known as
attributes. An attribute is represented by an oval.

Key Attribute: A key attribute is represented by underlining the attribute name.

Multi-valued Attribute: A multi-valued attribute can hold more than one value at a time
for an entity instance, taken from a set of possible values. It is represented by a double oval.

Derived Attribute: The value of the attribute is derived from another stored attribute. It is represented by
a dotted oval.

Age is a derived attribute, which is derived from the current date and the date of birth.
Date of Birth is a stored attribute.

Composite Attribute: An attribute that is divided into sub-parts. Example:- Name is a
composite attribute which is further divided into first name, middle name and last name.


Relationship set: A group of relationships of the same type is known as a relationship set. It
is denoted by a diamond.

Self-Referential Relationship Set:- A relationship set in which an entity set is related to itself
is called a self-referential relationship set.

Domain:- The value set (or domain of values) that corresponds to each simple attribute of an
entity type describes the range of values that may be assigned to that attribute for each
particular entity.
Degree of Relationship Type: The degree of a relationship type is the number of
participating entity types.

[Figure: Student -- Enroll -- Course]

Here, 2 entity sets are related through the relationship set 'Enroll'. So, the degree is 2.
Descriptive Attributes:- A relationship may also have attributes, called descriptive attributes.
Constraints on Binary Relationship Types:- Relationship types usually carry constraints that
limit the combinations of entities that may participate in the corresponding relationship
set. The 2 main types of constraints are:-
(1) Participation
(2) Cardinality
(1) PARTICIPATION:- The participation constraint states whether the existence of an entity
depends on its being related (via the relationship type) to another entity. The constraint
specifies the minimum number of relationship instances in which each entity may
participate.

2 types of participation:-
(a) Total (b) Partial

Let the constraint be: one employee may manage multiple departments, but each department must
be managed by a single employee.
Total Participation: Every entity of the entity set is related to some entity of the other entity set via the
relationship set. Total participation is also called existence dependency.
In the example, Department participates totally. It is denoted by double lines.

Partial Participation: Not every entity of the entity set is related to an entity of the other entity set via the
relationship set.
In the example, Employee participates partially. It is denoted by a single line.

(2) CARDINALITY: The cardinality ratio of a binary relationship gives the maximum number of
relationship instances in which an entity can take part, i.e. with how many entities of the
other entity set each entity of the first set can be related.
4 types of cardinality:
(1) 1:1 (2) 1:M (3) M:1 (4) M:N
(1) 1:1:- An entity in A is associated with at most one entity in B, and an entity in B is
associated with at most one entity in A.
Course
Cid Cname
C1 DS
C2 C
C3 C++

Enroll
Sid Cid
S1 C2
S2 C3
Candidate keys of the Enroll table: Sid and Cid (each one alone is a key)
Student
Sid Sname
S1 A
S2 B
S3 A
(2) 1:M:- An entity in A is associated with any number (zero or more) of entities in B. An entity
in B, however, can be associated with at most one entity in A.
Student
Sid Sname
S1 A
S2 B
S3 A

Course
Cid Cname
C1 DS
C2 C
C3 C++

Enroll
Sid Cid
S1 C1
S1 C2
S2 C3
Key for enroll table is {cid}

(4) M:N:- An entity in A is connected to any number of entities in B (zero or more), and an
entity in B is connected to any number of entities in A.
Student
Sid Sname
S1 A
S2 B
S3 A

Course
Cid Cname
C1 DS
C2 C
C3 C++
Enroll
Sid Cid
S1 C2
S1 C3
S2 C1
S3 C2
Key for enroll table is {sid, cid}
Minimization of E-R Diagram:-
Case 1:- When cardinality is 1:M
EMP(Sid, Ename)    Manages(Sid, did, Since)    Dept(did, dname)
E1                 E1   D1                     D1
E2                 E1   D2                     D2
E3                 E2   D3                     D3
                                               D4
P.K = Sid          P.K = did                   P.K = did
                   F.K = Sid, did

Here, the primary key of one table (Manages) references the primary key of another table (Dept). In such a
scenario, merge both tables (Manages and Dept).

Dept_Manages(Did, dname, Sid, since)
Did  Sid
D1   E1
D2   E1
D3   E2
D4   NULL
F.K = Sid

If the participation is partial, NULL values appear in the merged table.
If the participation is total, no NULL values appear.
Case 2:- When cardinality is 1:1
a) E1 and E2 are two entity sets and R is a relationship set; let participation be partial on both sides.

E1            R            E2
A   A1        A   B        B    B1
1   a1        1   12       11   b1
2   a1        2   13       12   b2
3   a2                     13   b2

In this case, we can merge either E1 and R, or E2 and R.
Let us merge E1 and R:

E1_R                    E2
A   A1   B              B    B1
1   a1   12             11   b1
2   a1   13             12   b2
3   a2   NULL           13   b2
A: PK                   B: PK
B: Alternate Key

So, a minimum of 2 tables is required.


If we merge all of them into a single table, then

A     A1    B     B1
1     a1    12    b2
2     a1    13    b2
3     a2    NULL  NULL
NULL  NULL  11    b1

No key is possible, because both A and B contain NULL. So, merging into a single table is not allowed.


b) E1 and E2 are entity sets, R is a relationship set, and at least one side has total participation.

E1            R            E2
A   A1        A   B        B    B1
1   a1        1   12       11   b1
2   a1        2   13       12   b2
                           13   b3

In this case, we make only one table.

A     A1    B    B1
NULL  NULL  11   b1
1     a1    12   b2
2     a1    13   b3

C.K. = A, B
P.K = B
Alternate key = A
The minimum number of tables required is one.
If both sides have total participation, then no NULL values appear in the merged table.
Case 3:- When cardinality is M:N
Note:- Merging is not possible when cardinality is M:N. So, 3 tables are required to represent the
information.
Note:
(1) 1:M = R and E2 can be merged.
So, a minimum of 2 tables is required.
(2) 1:1 = When at least one side has total participation, 1 table is required; otherwise 2 tables are
required.
(3) M:N = We cannot merge, because after merging the candidate key would have to be either
the candidate key of E1 (or) the candidate key of E2.
So, a minimum of 3 tables is required.
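
The three rules in the note can be summarised as a small lookup. This is only a restatement of the note in code form for a single binary relationship without multivalued attributes; min_tables and its parameter names are hypothetical.

```python
def min_tables(cardinality, any_total_participation=False):
    """Minimum number of tables for entity sets E1, E2 and relationship set R."""
    if cardinality == "M:N":
        return 3                      # R cannot be merged with E1 or E2
    if cardinality in ("1:M", "M:1"):
        return 2                      # R is merged with the entity set on the many side
    if cardinality == "1:1":
        # One table is enough only if at least one side participates totally.
        return 1 if any_total_participation else 2
    raise ValueError(f"unknown cardinality: {cardinality}")

print(min_tables("1:M"), min_tables("1:1", True), min_tables("1:1"), min_tables("M:N"))
```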
Weak Entity Set:-
It is a theoretical concept. No such concept exists in an RDBMS.
A weak entity set is an entity set which has no primary key, i.e. the tuples of a weak entity set
cannot be differentiated using only the attributes of the weak entity set.
It is denoted by a double rectangle.

Child
Cname  age  gender
A      20   F
A      22   F
B      22   M
Here, we cannot differentiate a child based on Cname, age and gender alone.

Employee     Belongsto        Child
Eid          Eid   Cname      Cname  age  gender
E1           E1    A          A      20   F
E2           E2    A          A      22   F
E3           E1    B          B      22   M

A weak entity type is also called a child entity (or) subordinate entity.
This relationship set is not strong: both E1 and E2 have a child with Cname A, and we cannot tell
which A is meant. So, the relationship is weak.
So, we always merge the weak relationship set and the weak entity set.
Eid  Cname  age  gender
E1   A      20   F
E2   A      22   F
E1   B      22   M
Now, it becomes a strong relationship set.
So, exactly 2 tables are required (maximum and minimum).
Note:- (1) Participation of the weak entity set in the relationship set is always total.
But participation of Employee (the strong entity set) in the relationship set may be total
(or) partial.
(2) In most cases, the cardinality between the strong entity and the weak entity is 1:M.
But sometimes M:N is also possible.
Self-Referential Entity Set:
An entity set can be related to itself through a relationship set.

An employee can supervise more than one subordinate.
Each subordinate reports to a single employee.
According to the above constraints:
(1) Subssn is the primary key
(2) Subssn and supssn are foreign keys
So, merge both tables, because all the merging conditions are satisfied.
Employee(SSn, Ename, rating, supssn)
This type of entity set is called a recursive entity set.

PRACTICE QUESTIONS

1. The minimum number of tables required to represent the following E-R diagram in the relational model
is __________

Min : 5.0
Max : 5.0
Answer 5
Sol. 5
One for E1, one for E2, one for R, and two tables for the multivalued attributes of E1 and E2.
2. Consider a business rule, i.e. "Each department is managed by at most one employee".
What is the cardinality from employee to department?
(A) 1:1 (B) 1: M
(C) M : 1 (D) M : N
Sol. (B)
The E-R diagram for the above constraint is

3. Consider the following E-R diagram

Let the cardinality of each relationship be 1:1. The minimum number of tables possible in the relational model
to represent the above E-R diagram (let the participation of A and B be total) is _______
Min : 3.0
Max : 3.0
Sol. 3
One table for A, R1 and B
One table for R2
One table for R3

EXERCISE QUESTIONS
1. Suppose A is a set with 4 elements. The number of elements in the power set of A is
A. 15 B. 16 C. 24 D. 4
Ans. B (2^4 = 16)
2. Consider the statements:
S1: The key of an entity type always consists of a single attribute.
S2: The key of an entity type may have more than one attribute.
S3: An entity type has exactly one key.
S4: An entity type may have more than one key.
Which of the following is correct?
A. S1 and S3 are TRUE B. S1 and S4 are TRUE
C. S2 and S3 are TRUE D. S2 and S4 are TRUE
Ans. D
3. Suppose X is a composite attribute of an entity type and has three components – C1, C2 and
C3, where only C2 is multi-valued. If domain sets of C1, C2 and C3 have 5, 3 and 4 elements
respectively, what is the size of the domain of x?
A. 60 B. 12 C. 120 D. 160
Ans. D
4. Suppose isPartOf is a relationship type with two participating entity types District and State.
What is the appropriate cardinality ratio for District: State?
A. 1:N B. N:1 C. M:N D. 1:1
Ans. B
5. Suppose Author is a relationship type with two participating entity types Person and Book.
What is the appropriate cardinality ratio for Person:Book?
A. 1:N B. N:1 C. M:N D. 1:1
Ans. C
6. Consider the binary relationship type BiologicalMother between entity types Person and
Woman. Suppose the cardinality ratio (Person: Woman) constraint of the relationship is
expressed using (min, max) notation as (u,v) on the line connecting Person to BiologicalMother
and (x,y) on the line connecting Woman to BiologicalMother. Which one of the following is
correct?
A. (u,v) = (1,1); (x,y) = (1,N) B. (u,v) = (1,N); (x,y) = (1,N)
C. (u,v) = (1,1); (x,y) = (0,N) D. (u,v) = (1,N); (x,y) = (0,N)
Ans. C
7. Suppose entity set A = {a,b,c,d,e} and entity set B = {w,x,y,z} and they participate in a
relationship R and the instances in R are {(a,w), (b,w), (c,x), (d,x), (e,y)}. Which one of the
following is correct?
A. Cardinality ratio A:B is many-to-one; A participates partially; B participates completely .
B. Cardinality ratio A:B is one-to-many; A participates partially; B participates partially .
C. Cardinality ratio A:B is many-to-one: A participates completely; B participates partially
D. Cardinality ratio A:B is many-to-many; A participates completely; B participates partially.
Ans. C
8. Consider the following sets :
C = {p: weak entity type; q: multi-valued attribute; r: derived attribute: s: relationship type}
D = {w: dashed-line ellipse; x: diamond box; y: double-line rectangle; z: double-line ellipse}
The correct match between elements of C and D is
A. p--z; q--y; r--w; s--x B. p--y; q--z; r--x; s--w
C. p--y; q--z; r--w; s--x D. p--z; q--w; r--z; s—y
Ans. C
9. A foreign key in a relation R can NOT be used to refer to tuples in R itself.
A. True
B. False
Ans. B
10. There are two weak entities E1 and E2, where E1 is the owner entity for E2. Consider relational
representation of E1 and E2 and choose correct options:
A. Mapping of E1 should be done before E2
B. Mapping of E2 should be done before E1
C. E1 and E2 can be mapped in any order
D. Partial key of E1 includes the partial key of E2
Ans. A

CHAPTER-3
3 QUERY LANGUAGE
SQL
Formal (Procedural) Q.L.: - What & how to retrieve from DB.

R.A.
Informal (non-procedural) Q.L: - What to retrieve from DB
Example: TRC, SQL
SQL: - By default duplicate is not eliminated. So, SQL table is not a ‘set of tuples’ but it is a
multiset (bag) of tuples.
Keyword ‘DISTINCT’ is used to get only distinct values.
The basics clause in SQL is:
SELECT <attribute list>
FROM <table list>
WHERE <condition>

Consider the relation R:

A   B   C    D
1   X   C1   D1
2   X   C2   D2
3   Y   C3   D3
4   Z   C3   D4
5   K   C4   D5

Retrieve A and D for the tuples whose B value is X.

Sol. SELECT A, D
FROM R
WHERE B = 'X'

A   D
1   D1
2   D2

The equivalent relational algebra query is:

π_{A,D}(σ_{B='X'}(R))

Note: A simple SQL query with a single relation name in the FROM clause is similar to a
SELECT-PROJECT pair of relational algebra operation.
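
The SELECT-PROJECT correspondence can be tried out with a small sketch. It assumes Python's built-in `sqlite3` module as a stand-in SQL engine and loads the five-row example relation R into an in-memory database; the variable names are illustrative only.

```python
import sqlite3

# In-memory copy of the example relation R(A, B, C, D).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE R (A INTEGER, B TEXT, C TEXT, D TEXT)")
conn.executemany(
    "INSERT INTO R VALUES (?, ?, ?, ?)",
    [(1, "X", "C1", "D1"), (2, "X", "C2", "D2"), (3, "Y", "C3", "D3"),
     (4, "Z", "C3", "D4"), (5, "K", "C4", "D5")],
)

# WHERE plays the role of sigma (row filter), the SELECT list that of pi.
rows = conn.execute("SELECT A, D FROM R WHERE B = 'X' ORDER BY A").fetchall()

# Bag semantics: C3 is returned twice unless DISTINCT is requested.
all_c = [r[0] for r in conn.execute("SELECT C FROM R ORDER BY C")]
distinct_c = [r[0] for r in conn.execute("SELECT DISTINCT C FROM R ORDER BY C")]

print(rows)
print(all_c)
print(distinct_c)
```

The second and third queries show the multiset behaviour described above: without DISTINCT the duplicate C3 is kept.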


Student
Sid  Sname  Dob    Courseid
1    A      10/11  1
2    B      10/11  4
3    A      10/12  1
4    C      10/13  1
5    D      10/14  2

Course
Cid  Cname  Fee
1    C      10K
2    C++    15K
3    DB     20K
4    Java   30K

Retrieve the Sname and Dob of the students who enrolled in course 'C'.


Sol. SELECT Sname, Dob
FROM Student, Course
WHERE Student.Courseid = Course.Cid AND Cname = 'C'
Note: An SQL query with multiple relation names in the FROM clause is similar to a SELECT-PROJECT-JOIN sequence of relational algebra operations.
Note: If the WHERE clause is omitted from an SQL query, all the tuples of the relation specified in the FROM clause are selected for the result.
* If more than one relation is specified in the FROM clause and there is no WHERE clause, the CROSS PRODUCT of these relations is selected.
Use of Asterisk (*)
* The asterisk retrieves all the attributes of the selected tuples.
Example:
SELECT *
FROM Student
WHERE Courseid = 2

SELECT *
FROM Employee, Department
Sol. The result is the cross product of the Employee and Department relations.
Note: SELECT DISTINCT eliminates duplicates; SELECT ALL does not eliminate duplicates. If neither DISTINCT nor ALL is mentioned, the query is equivalent to SELECT ALL.

Set Operations: UNION (∪), INTERSECT (∩), EXCEPT (set difference, −; written MINUS in some systems)
Note: By default, the result of these operations eliminates duplicates, because they are set operations and a set does not contain duplicate values.
* To keep duplicates (multiset behaviour), use the keywords UNION ALL, INTERSECT ALL and EXCEPT ALL.
Note: To apply a set operation, the two relations must have the same attributes, and the attributes must appear in the same order in both relations, i.e., the relations must be union compatible.
For two queries S1 and S2:
(1) UNION ALL: the number of copies of a tuple in the result is the sum of the number of copies in S1 and in S2.
(2) INTERSECT ALL: the number of copies of a tuple in the result is the minimum of the number of copies in S1 and in S2.
(3) EXCEPT ALL: the number of copies of a tuple in the result is the number of copies in S1 minus the number of copies in S2, provided the difference is positive.
Example: relations R(A) and S(B)

R.A: 1, 1, 1, 2, 2, 4, 6, 6
S.B: 1, 1, 2, 4, 5, 5

R UNION S: distinct tuples from R and S.
1, 2, 4, 6, 5

R UNION ALL S: all tuples from R and S.
1, 1, 1, 2, 2, 4, 6, 6, 1, 1, 2, 4, 5, 5

R INTERSECT S: distinct common tuples of R and S.
1, 2, 4

R INTERSECT ALL S: common tuples of R and S, with duplicates.
1, 1, 2, 4

R EXCEPT S (R − S): distinct tuples that are in R but not in S.
6

R EXCEPT ALL S: tuples in R but not in S, with duplicates.
1, 2, 6, 6
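
The counting rules for UNION ALL, INTERSECT ALL and EXCEPT ALL can be checked with a short sketch. It is not SQL: it assumes Python's `collections.Counter` as a model of a bag, where `+`, `&` and `-` happen to implement the sum, minimum and positive-difference rules stated above. R and S are the bags from the example.

```python
from collections import Counter

# Bags (multisets) from the example: value -> number of copies.
R = Counter([1, 1, 1, 2, 2, 4, 6, 6])
S = Counter([1, 1, 2, 4, 5, 5])

# ALL variants keep duplicates.
union_all = sorted((R + S).elements())       # copies add up
intersect_all = sorted((R & S).elements())   # minimum number of copies
except_all = sorted((R - S).elements())      # positive difference of copies

# Plain variants work on the distinct values only.
union = sorted(set(R) | set(S))
intersect = sorted(set(R) & set(S))
except_ = sorted(set(R) - set(S))

print(union_all, intersect_all, except_all)
print(union, intersect, except_)
```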
IS / IS NOT Clause:
• To check whether an attribute value is NULL, SQL uses IS NULL (or IS NOT NULL).
• We cannot use the = or <> (not equal to) operators, because SQL considers each NULL value as being distinct from every other NULL value.
• When a join condition is specified, tuples with NULL values for the join attribute are not included in the result.


Employee
Eid  dno
1    A
2    NULL
3    B
4    NULL
5    A

Retrieve the Eid of the employees who do not have a dno.

Sol. SELECT Eid
FROM Employee
WHERE dno IS NULL

O/P =
Eid
2
4
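
A minimal sketch of why IS NULL is needed, assuming Python's built-in `sqlite3` module and a two-column Employee table (Eid, dno) holding the values of the example. Comparing with `= NULL` yields UNKNOWN for every row, so that query returns nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Eid INTEGER, dno TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [(1, "A"), (2, None), (3, "B"), (4, None), (5, "A")],  # None is stored as NULL
)

# Correct test for a missing department.
with_is_null = [r[0] for r in conn.execute(
    "SELECT Eid FROM Employee WHERE dno IS NULL ORDER BY Eid")]

# '= NULL' never evaluates to TRUE, so no row qualifies.
with_equals = [r[0] for r in conn.execute(
    "SELECT Eid FROM Employee WHERE dno = NULL ORDER BY Eid")]

print(with_is_null)
print(with_equals)
```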


Use of = and IN / NOT IN
In a nested query, if the inner query returns a single attribute and a single tuple, its result is a single (scalar) value. In such a case, the = comparison operator can be used between the outer and the inner query.
But if a single value is compared with a set or multiset of values (the result of the inner query), the = operator cannot be used.
For this, we use IN / NOT IN, which compares a value with a set (multiset) of values; IN evaluates to TRUE if the given value is an element of the set (multiset).
ANY / SOME / ALL
• op ANY (equivalently op SOME) is a comparison operator that compares a single value with a set or multiset of values (the output of a nested query). It returns TRUE if the comparison op holds for at least one value in the result of the inner query.
• op ALL returns TRUE if the comparison op holds for all values in the result of the inner query.
Note: (1) x op ANY (S), where S is the empty set, is always FALSE.
(2) x op ALL (S), where S is the empty set, is always TRUE.
Note:
= SOME is equivalent to IN.
<> ALL is equivalent to NOT IN.
<> SOME is not equivalent to NOT IN.
= ALL is not equivalent to IN.
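
The four equivalence notes and the empty-set rules can be illustrated with a small model. This is a sketch in plain Python, not SQL: `op_any` and `op_all` are hypothetical helper names, the inner query result is modelled as a list, and NULL (three-valued logic) is deliberately left out.

```python
import operator

def op_any(x, op, values):
    """x op ANY (values): TRUE if the comparison holds for at least one value."""
    return any(op(x, v) for v in values)

def op_all(x, op, values):
    """x op ALL (values): TRUE if the comparison holds for every value."""
    return all(op(x, v) for v in values)

inner = [10, 20, 20, 30]   # stands for the result of an inner query
empty = []

# = ANY behaves like IN, and <> ALL behaves like NOT IN.
eq_any_is_in = all(op_any(x, operator.eq, inner) == (x in inner) for x in range(40))
ne_all_is_not_in = all(op_all(x, operator.ne, inner) == (x not in inner) for x in range(40))

# <> ANY is TRUE for 20 although 20 is IN the list; = ALL is FALSE although 20 is IN the list.
ne_any_20 = op_any(20, operator.ne, inner)
eq_all_20 = op_all(20, operator.eq, inner)

# Empty inner result: ANY is always FALSE, ALL is always TRUE.
any_on_empty = op_any(5, operator.eq, empty)
all_on_empty = op_all(5, operator.eq, empty)

print(eq_any_is_in, ne_all_is_not_in, ne_any_20, eq_all_20, any_on_empty, all_on_empty)
```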
Correlated Nested Queries:
Whenever a condition in the WHERE clause of a nested query references some attribute of a relation declared in the outer query, the two queries are said to be correlated.
Note: Conceptually, the nested query is evaluated once for each tuple of the outer query.
* In other words, when a table name (or alias) from the outer FROM clause is used in the WHERE clause of a nested query, the nested query is called a correlated subquery.
Test for empty relation: EXISTS / NOT EXISTS
The EXISTS function in SQL is used to check whether the result of a correlated nested query is empty (contains no tuples) or not.
* EXISTS returns TRUE if there is at least one tuple in the result of the nested query; otherwise it returns FALSE.
* NOT EXISTS returns TRUE if there are no tuples in the result of the nested query; otherwise it returns FALSE.

Branch
Bno. City Code
1 Jaipur XX
2 Chandigarh YY
3 Delhi ZZ
4 Jalandhar AA
5 Jairpur XXY
Staff: -
Sid Sname Branch no. Position
11 A 1 Manager
12 B 3 Assistant
13 C 3 Supervisor
14 D 2 Assistant
15 E 3 Manager
16 F 1 Assistant

Find all staff who work in a ‘Jaipur’ branch office.


Sol. SELECT Sid, Sname, Position
FROM Staff S
WHERE EXISTS (SELECT *
FROM Branch B
WHERE S.Branchno = B.Bno
AND B.City = 'Jaipur')
O/P:
Sid  Sname  Position
11   A      Manager
16   F      Assistant
Note: The above query is equivalent to
SELECT Sid, Sname, Position
FROM Staff S, Branch B
WHERE S.Branchno = B.Bno AND
B.City = 'Jaipur'
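
The equivalence of the EXISTS form and the join form can be checked on the example data. The sketch below assumes Python's built-in `sqlite3` module and the column names Bno (in Branch) and Branchno (in Staff); the two forms agree here because every Bno value occurs only once in Branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Branch (Bno INTEGER, City TEXT, Code TEXT);
CREATE TABLE Staff (Sid INTEGER, Sname TEXT, Branchno INTEGER, Position TEXT);
INSERT INTO Branch VALUES (1,'Jaipur','XX'), (2,'Chandigarh','YY'),
  (3,'Delhi','ZZ'), (4,'Jalandhar','AA'), (5,'Jairpur','XXY');
INSERT INTO Staff VALUES (11,'A',1,'Manager'), (12,'B',3,'Assistant'),
  (13,'C',3,'Supervisor'), (14,'D',2,'Assistant'), (15,'E',3,'Manager'),
  (16,'F',1,'Assistant');
""")

# Correlated subquery: the inner query refers to S from the outer query.
exists_rows = conn.execute("""
    SELECT Sid, Sname, Position
    FROM Staff S
    WHERE EXISTS (SELECT * FROM Branch B
                  WHERE S.Branchno = B.Bno AND B.City = 'Jaipur')
    ORDER BY Sid
""").fetchall()

# Join formulation of the same request.
join_rows = conn.execute("""
    SELECT Sid, Sname, Position
    FROM Staff S, Branch B
    WHERE S.Branchno = B.Bno AND B.City = 'Jaipur'
    ORDER BY Sid
""").fetchall()

print(exists_rows)
print(join_rows)
```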

Find all courses taught in both the Fall 2009 and the Spring 2010 semesters.

SELECT course_id
FROM section AS S
WHERE semester = 'Fall' AND year = 2009 AND
EXISTS (SELECT *
FROM section AS T
WHERE semester = 'Spring' AND
year = 2010 AND S.course_id = T.course_id)
NULL VALUES:
A NULL value is treated as unknown or non-existent.
• If the WHERE clause evaluates to either FALSE or UNKNOWN for a tuple, the tuple is not added to the result.
• When a query uses SELECT DISTINCT, duplicate tuples are eliminated.
Aggregate Functions:
COUNT: returns the number of values in a specified column
SUM: returns the sum of the values in a specified column
AVG: returns the average of the values in a specified column
MIN: returns the smallest value in a specified column
MAX: returns the largest value in a specified column
Note: Apart from COUNT(*), each function eliminates NULLs first and operates only on the remaining non-null values.
COUNT(*) is a special use of COUNT that counts all the rows of a table, regardless of whether NULLs or duplicate values occur.
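
A short sketch of how aggregates treat NULL, assuming Python's built-in `sqlite3` module and a hypothetical two-column table T(A, B) in which one B value is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (A INTEGER, B INTEGER)")
conn.executemany("INSERT INTO T VALUES (?, ?)",
                 [(1, 4), (2, 5), (3, 6), (4, None)])  # last B is NULL

# COUNT(B), SUM(B) and AVG(B) skip the NULL; COUNT(*) counts every row.
count_a, count_b, count_star, sum_b, avg_b = conn.execute(
    "SELECT COUNT(A), COUNT(B), COUNT(*), SUM(B), AVG(B) FROM T"
).fetchone()

print(count_a, count_b, count_star, sum_b, avg_b)
```

AVG(B) divides by the three non-null values, not by the four rows.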

Note: Aggregate functions can be used only in the SELECT list and in the HAVING clause.
Note: If the SELECT clause contains an aggregate function together with some other attribute, that other attribute must be specified in the GROUP BY clause.

SELECT Sid, COUNT(Salary)
FROM Staff
is illegal in SQL.
So, if a SELECT clause uses an aggregate operation and the query has no GROUP BY clause, the SELECT clause must contain only aggregate operations.
GROUP BY:
The GROUP BY clause partitions the tuples into groups.
It specifies the grouping attributes, which should also appear in the SELECT clause, so that the value resulting from applying each aggregate function to a group of tuples appears along with the value of the grouping attributes.
Note: If NULLs exist in the grouping attribute, a separate group is created for all tuples with a NULL value in the grouping attribute.
HAVING:
The HAVING clause provides a condition on the group of tuples associated with each value of the grouping attributes. Only the groups that satisfy the condition are retrieved in the result of the query.
• The WHERE clause selects individual rows for the final result.
• The HAVING clause selects groups for the final result.
• The HAVING clause imposes a condition on the groups formed by the GROUP BY clause.
‘With’:
- The WITH clause defines a temporary relation that is available only to the query in which the WITH clause occurs.

Find all departments where the total salary is at least (greater than or equal to) the average of the total salary at all departments.
Sol. WITH dept_total (dept_name, value) AS
(SELECT dept_name, SUM(salary)
FROM instructor
GROUP BY dept_name),
dept_total_avg (value) AS
(SELECT AVG(value)
FROM dept_total)
SELECT dept_name
FROM dept_total, dept_total_avg
WHERE dept_total.value >= dept_total_avg.value
LIKE
LIKE allows comparison of one string value with another string value (pattern matching).
• Partial strings are specified using two reserved characters (wildcard characters):
(a) %
(b) _ (underscore)
% matches any sequence of zero or more characters.
_ matches exactly one character.
LIKE 'H%' means the first character must be H, but the rest of the string can be anything.
LIKE 'H___' means there must be exactly four characters in the string, the first of which must be an H.
LIKE '%e' means any sequence of characters of length at least 1, with the last character an e.
LIKE '%GATE%' means a sequence of characters of any length containing GATE.
NOT LIKE 'H%' means the first character cannot be an H.
Note: If the search string can include a pattern-matching character itself, an ESCAPE character can be used to represent that character literally.
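
The patterns above can be tried on a few sample strings. The sketch assumes Python's built-in `sqlite3` module, a hypothetical one-column table Words, and `!` as the escape character. Note that SQLite's LIKE ignores case for ASCII letters by default, which is an engine-specific behaviour; the sample words are chosen so that it makes no difference here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Words (w TEXT)")
conn.executemany("INSERT INTO Words VALUES (?)",
                 [(w,) for w in ["Hari", "Home", "H", "Joe", "GATE2024", "pre%fix"]])

def like(pattern):
    """Return the words matching the LIKE pattern, sorted."""
    return sorted(r[0] for r in conn.execute(
        "SELECT w FROM Words WHERE w LIKE ?", (pattern,)))

starts_with_h = like("H%")        # H followed by anything (including nothing)
four_chars_h = like("H___")       # exactly four characters, first is H
ends_with_e = like("%e")          # last character is e
contains_gate = like("%GATE%")    # GATE anywhere in the string
not_h = sorted(r[0] for r in conn.execute(
    "SELECT w FROM Words WHERE w NOT LIKE 'H%'"))

# '!%' stands for a literal percent sign because '!' is declared as ESCAPE.
literal_percent = sorted(r[0] for r in conn.execute(
    "SELECT w FROM Words WHERE w LIKE '%!%%' ESCAPE '!'"))

print(starts_with_h, four_chars_h, ends_with_e)
print(contains_gate, not_h, literal_percent)
```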
Testing for Absence of Duplicate Tuples
UNIQUE is a Boolean function for testing whether a subquery has duplicate tuples in its result.
TRUE: if the argument subquery contains no duplicate tuples.
Note: If the subquery returns an empty result, UNIQUE also returns TRUE.
FALSE: if the result of the subquery contains duplicate tuples.
Note: UNIQUE returns TRUE even if there are multiple copies of a tuple, as long as at least one of the attributes of the tuple is NULL.

The following query finds the courses that were offered at most once in 2009:
SELECT T.course_id
FROM course AS T
WHERE UNIQUE (SELECT R.course_id
FROM section AS R
WHERE T.course_id = R.course_id
AND R.year = 2009)

An equivalent query without UNIQUE:
SELECT T.course_id
FROM course AS T
WHERE 1 >= (SELECT COUNT(R.course_id)
FROM section AS R
WHERE T.course_id = R.course_id
AND R.year = 2009)
NULL Value
(1) The result of an arithmetic expression (+, -, *, /) is NULL if any of the input values is NULL.
Example: r.A + 5 = NULL (if the value of r.A is NULL)
(2) SQL treats the result of any comparison involving a NULL value as UNKNOWN.
(3) When comparing the values of corresponding attributes from two tuples (for example, for SELECT DISTINCT or a set operation), the values are treated as identical if either both are non-null and equal in value, or both are NULL.

('A', NULL) and ('A', NULL) are treated as identical, even though one of the attributes has a NULL value.
(4) The treatment of NULL in point (3) is different from the way NULLs are treated in predicates, where the comparison 'NULL = NULL' returns UNKNOWN.
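
The four rules can be observed directly. This is a small sketch assuming Python's built-in `sqlite3` module, where SQL NULL is returned to Python as `None`; the table T and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Rules (1) and (2): arithmetic and comparison with NULL both give NULL/UNKNOWN.
null_plus_five, null_eq_null = conn.execute(
    "SELECT NULL + 5, NULL = NULL").fetchone()

conn.execute("CREATE TABLE T (x TEXT, y TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [("A", None), ("A", None)])

# Rule (3): for duplicate elimination the two ('A', NULL) tuples are identical.
distinct_rows = conn.execute("SELECT DISTINCT x, y FROM T").fetchall()

# Rule (4): in a predicate, y = y is UNKNOWN when y is NULL, so no row qualifies.
where_rows = conn.execute("SELECT * FROM T WHERE y = y").fetchall()

print(null_plus_five, null_eq_null, distinct_rows, where_rows)
```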
Join Condition:
ON: The ON predicate is written like a WHERE clause predicate, except that the keyword ON is used rather than WHERE.
The ON condition appears at the end of the join expression.

SELECT *
FROM student JOIN takes ON student.ID = takes.ID
• The difference between JOIN ... ON and NATURAL JOIN is that in the result of JOIN ... ON the ID attribute appears twice, whereas in the result of NATURAL JOIN it appears once.
Equivalent query without ON:
SELECT *
FROM student, takes
WHERE student.ID = takes.ID


Relational Algebra
Relational algebra is a formal, procedural query language.
Note: By default, duplicates are eliminated.
The operations of relational algebra are:
SELECT, PROJECT, RENAME, SET operations (UNION, INTERSECTION, MINUS), CARTESIAN PRODUCT, INNER JOIN (THETA, EQUI, NATURAL JOIN), OUTER JOIN (LEFT, RIGHT, FULL), DIVISION
(1) SELECT: It is a unary operation that performs a horizontal partition of the relation.
• It is used to select the tuples from a relation that satisfy a selection condition.
Denoted by: σ_{<selection condition>}(R)
• Here, the selection condition is a Boolean expression.
Selection condition format:
<attribute name> <comparison op> <constant value>
(OR)
<attribute name> <comparison op> <attribute name>
• The relation resulting from a SELECT operation has the same attributes as R.
• The condition is applied to each tuple individually.
• The degree (number of attributes) of the relation resulting from a SELECT operation is the same as the degree of R.
• The number of tuples in the resulting relation is always less than or equal to the number of tuples in R, i.e.,
|σ_c(R)| ≤ |R|
• The fraction of tuples selected by a selection condition is referred to as the selectivity of the condition.
• The SELECT operation is commutative, i.e.,
σ_{cond1}(σ_{cond2}(R)) = σ_{cond2}(σ_{cond1}(R))
So, a cascade of SELECT operations can always be combined into a single SELECT operation with a conjunctive (AND) condition, i.e.,
σ_{cond1}(σ_{cond2}(...(σ_{condn}(R))...)) = σ_{cond1 AND cond2 AND ... AND condn}(R)

(2) PROJECT operation
• It is a unary operation that performs a vertical partition of the relation.
• It selects columns (attributes).
Denoted by: π_{<attribute list>}(R)
• The result contains only the attributes specified in the attribute list, in the same order as they appear in the list.
• The degree of the result is equal to the number of attributes in <attribute list>.
• Because duplicate tuples are eliminated, the number of tuples in a relation resulting from a PROJECT operation is always less than or equal to the number of tuples in R.
• If the projection list is a superkey of R, the resulting relation has the same number of tuples as R.
• The PROJECT operation is NOT commutative. However,
π_{<list1>}(π_{<list2>}(R)) = π_{<list1>}(R)
if <list2> contains the attributes in <list1>.
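
The properties of SELECT and PROJECT listed above can be checked with a small model. This is a sketch in plain Python, assuming a relation is represented as a pair (attribute names, set of tuples); `select` and `project` are hypothetical helper names, and R is the five-row example relation from the SQL section.

```python
def select(rel, pred):
    """sigma: keep the tuples for which pred(row-as-dict) is true."""
    attrs, rows = rel
    return attrs, {r for r in rows if pred(dict(zip(attrs, r)))}

def project(rel, keep):
    """pi: keep only the listed attributes; the set removes duplicates."""
    attrs, rows = rel
    idx = [attrs.index(a) for a in keep]
    return tuple(keep), {tuple(r[i] for i in idx) for r in rows}

R = (("A", "B", "C", "D"),
     {(1, "X", "C1", "D1"), (2, "X", "C2", "D2"), (3, "Y", "C3", "D3"),
      (4, "Z", "C3", "D4"), (5, "K", "C4", "D5")})

is_x = lambda t: t["B"] == "X"
small = lambda t: t["A"] < 2

# SELECT is commutative, and a cascade equals one conjunctive condition.
commutes = select(select(R, is_x), small) == select(select(R, small), is_x)
cascade = select(select(R, is_x), small) == select(R, lambda t: is_x(t) and small(t))

# PROJECT on a non-key loses duplicates; on the key A it keeps all 5 tuples.
size_c = len(project(R, ["C"])[1])
size_a = len(project(R, ["A"])[1])

# pi_list1(pi_list2(R)) = pi_list1(R) when list2 contains list1.
nested = project(project(R, ["A", "D"]), ["A"]) == project(R, ["A"])

# pi_{A,D}(sigma_{B='X'}(R))
answer = project(select(R, is_x), ["A", "D"])[1]
print(commutes, cascade, size_c, size_a, nested, sorted(answer))
```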
RENAME operation:
• It is used to give a name to a relation that holds an intermediate result.
• It is also used to rename the attributes of intermediate and result relations.
• It is a unary operation.
• Denoted by ρ (rho).
Notation-1: ρ_{S(B1, B2, ..., Bn)}(R) renames both the relation (R to S) and its attributes (to B1, B2, ..., Bn).
Notation-2: ρ_S(R) renames the relation only.
Notation-3: ρ_{(B1, B2, ..., Bn)}(R) renames the attributes only.

Example:
CSEBRANCH ← σ_{branch='CSE'}(Student)
RESULT ← π_{Sname}(CSEBRANCH)

R(name, middle, package) ← π_{Fname, Lname, Salary}(EMPLOYEE)


SET Operation
(1) UNION (2) INTERSECTION (3) MINUS
UNION: R ∪ S is a relation that includes all tuples that are in R, in S, or in both R and S; duplicate tuples are eliminated.
INTERSECTION: R ∩ S is a relation that includes all tuples that are in both R and S.
SET DIFFERENCE (R − S or S − R):
R − S is a relation that includes all tuples that are in R but not in S.
S − R is a relation that includes all tuples that are in S but not in R.
Note: (1) All set operations are binary operations.

(2) Set operations can be applied only if both relations are union compatible.
Two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bn) are said to be union compatible if they have the same degree n and dom(Ai) = dom(Bi) for 1 ≤ i ≤ n. This means that the two relations have the same number of attributes and each corresponding pair of attributes has the same domain.
Note: UNION and INTERSECTION are commutative operations.
R ∪ S = S ∪ R, R ∩ S = S ∩ R
Note: UNION and INTERSECTION are associative operations.
R ∪ (S ∪ T) = (R ∪ S) ∪ T
R ∩ (S ∩ T) = (R ∩ S) ∩ T
Note: MINUS is NOT commutative.
R − S ≠ S − R
Note: R ∩ S = (R ∪ S) − (R − S) − (S − R)
R ∩ S = R − (R − S)
Minimum and maximum number of tuples in the result of a set operation:
Let relation R have m tuples and relation S have n tuples. Then:
Operation   Min. number of tuples   Max. number of tuples
R ∪ S       max(m, n)               m + n
R ∩ S       0                       min(m, n)
R − S       0                       m
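
The identities and the size bounds can be verified on a small instance. The sketch below uses Python's built-in `set` type as a model of union-compatible relations with a single attribute; the two sample sets are arbitrary.

```python
R = {1, 2, 3, 4}
S = {3, 4, 5}
m, n = len(R), len(S)

# Intersection expressed through union and difference only.
identity_1 = (R & S) == (R | S) - (R - S) - (S - R)
identity_2 = (R & S) == R - (R - S)

# MINUS is not commutative.
minus_differs = (R - S) != (S - R)

# Size bounds from the table above.
union_ok = max(m, n) <= len(R | S) <= m + n
intersect_ok = 0 <= len(R & S) <= min(m, n)
minus_ok = 0 <= len(R - S) <= m

print(identity_1, identity_2, minus_differs, union_ok, intersect_ok, minus_ok)
```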

CARTESIAN PRODUCT (CROSS PRODUCT) operation:

• Denoted by ×.
• It is a binary operation.
• The two relations need not be union compatible.
• It combines every tuple of one relation with every tuple of the other relation.
• If the degree of R is n and the degree of S is m, then the degree of R × S is n + m.
• If R has p tuples and S has q tuples, then R × S has p * q tuples.


Let relation R(A,B) and S(C,D)


A B
A1 B1
A2 B2

C D
C1 D1
C2 D2
Then R X S=
A B C D
A1 B1 C1 D1
A1 B1 C2 D2
A2 B2 C1 D1
A2 B2 C2 D2

Join Operation
• Denoted by ⋈.
• It is used to combine related tuples from two relations into single tuples.
• It is a binary operation.
Note: A JOIN operation can be stated in terms of a CARTESIAN PRODUCT followed by a SELECT operation.
The general form of a JOIN operation on two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bm) is
R ⋈_{<join condition>} S
The result of the JOIN is a relation Q with n + m attributes, Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order.
• A general join condition is of the form
<condition> AND <condition> AND ... AND <condition>
where each condition is of the form Ai θ Bj:

Ai is an attribute of R
Bj is an attribute of S
Ai and Bj have the same domain

θ is one of the comparison operators {=, <, ≤, >, ≥, ≠}.
A JOIN operation with such a general join condition is called a THETA JOIN.
Note: Tuples whose join attributes are NULL, or for which the join condition is FALSE, do not appear in the result.
Variation of Join
(1) EQUIJOIN (2) NATURAL JOIN
EQUIJOIN: A JOIN in which only the = (equal to) comparison operator is used is called an EQUIJOIN.
So, in the result of an EQUIJOIN, we always have one or more pairs of attributes that have identical values in every tuple.
NATURAL JOIN (denoted by *): It is used to get rid of the second (superfluous, identical) attribute of each pair in an EQUIJOIN condition.
Note: (1) NATURAL JOIN is basically an EQUIJOIN followed by removal of the superfluous attributes.
(2) The standard definition of NATURAL JOIN requires that the two join attributes (or each pair of join attributes) have the same name in both relations. If this is not the case, a renaming operation is applied first.
(3) If no combination of tuples satisfies the JOIN condition, the result of the JOIN is an empty relation with zero tuples.
(4) If relation R has n tuples and relation S has m tuples, then the result of R ⋈_{<join condition>} S has between 0 and m * n tuples.

(5) If there is no JOIN condition, all combinations of tuples qualify and the JOIN degenerates into a CARTESIAN PRODUCT, also called CROSS PRODUCT or CROSS JOIN.
(6) These JOIN operations are also called INNER JOINs.
(7) If R and S have no common attribute, then
Natural Join = Cross Product
R ⋈ S = R × S
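
The notes on THETA JOIN and NATURAL JOIN can be illustrated with a compact model. This is a sketch in plain Python, assuming a relation is a list of dicts (attribute name to value); `theta_join` and `natural_join` are hypothetical helper names, and NULL handling is left out.

```python
from itertools import product

def theta_join(r, s, cond):
    """sigma_cond(R x S); the attribute names of r and s are assumed disjoint."""
    return [{**t, **u} for t, u in product(r, s) if cond(t, u)]

def natural_join(r, s):
    """Equijoin on all common attribute names, keeping one copy of each."""
    common = set(r[0]) & set(s[0]) if r and s else set()
    return [{**t, **u} for t, u in product(r, s)
            if all(t[a] == u[a] for a in common)]

# Common attributes B and C: only one pair of tuples matches.
R = [{"A": 1, "B": 2, "C": 1}, {"A": 2, "B": 2, "C": 2}]
S = [{"B": 2, "C": 1, "D": 4}, {"B": 4, "C": 2, "D": 1}]
joined = natural_join(R, S)

# No common attribute: the natural join degenerates into the cross product.
P = [{"A": "A1", "B": "B1"}, {"A": "A2", "B": "B2"}]
Q = [{"C": "C1", "D": "D1"}, {"C": "C2", "D": "D2"}]
cross = theta_join(P, Q, lambda t, u: True)
same_as_cross = natural_join(P, Q) == cross

# Theta join with a '<' condition.
X = [{"A": 1}, {"A": 5}]
Y = [{"E": 3}, {"E": 4}]
less_than = theta_join(X, Y, lambda t, u: t["A"] < u["E"])

print(joined, len(cross), same_as_cross, less_than)
```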

Complete Set of R.A. operations:

• {σ, π, ∪, −, ×} is a complete set, i.e., any other RA operation can be expressed as a sequence of operations from this set.
• A JOIN operation can be specified as a CARTESIAN PRODUCT followed by a SELECT operation:
R ⋈_{<condition>} S ≡ σ_{<condition>}(R × S)
• A NATURAL JOIN can be specified as a CARTESIAN PRODUCT preceded by RENAME and followed by SELECT and PROJECT operations.

Division Operation: R(Z) ÷ S(X), where X ⊆ Z. Let Y = Z − X.
The tuples in the denominator relation S restrict the numerator relation R: the result contains those Y values of R that are paired in R with all the values present in S.
The DIVISION operation can be expressed as a sequence of π, × and − operations as follows:
T1 ← π_Y(R)
T2 ← π_Y((S × T1) − R)
T ← T1 − T2
• The ÷ operator is used for queries that involve universal quantification.
Enroll (E)
Sid  Cid
S1   C1
S1   C2
S2   C1
S2   C2
S1   C3
S3   C3

Course (C)
Cid
C1
C2
C3

Retrieve the Sid of the students who enrolled in every course.


Sol. A(X, Y) / B(Y) returns the X values for which A contains a tuple <x, y> for every y value of relation B.
E(Sid, Cid) / C(Cid)
= π_{Sid}(E) − π_{Sid}((π_{Sid}(E) × C) − E)

(1) π_{Sid}(E) × C: every student paired with every course.

Sid  Cid
S1   C1
S1   C2
S1   C3
S2   C1
S2   C2
S2   C3
S3   C1
S3   C2
S3   C3

(2) (π_{Sid}(E) × C) − E: the (student, course) pairs that are missing from E.

Sid  Cid
S2   C3
S3   C1
S3   C2

(3) π_{Sid} of the result of step 2:

Sid
S2
S3

i.e., the ids of all students who are not enrolled in every course.
Note: "Not enrolled in every course" is not the same as "enrolled in some course", because the latter also covers the students who enrolled in all courses.

(4) π_{Sid}(E) − π_{Sid}((π_{Sid}(E) × C) − E)

i.e., from all student ids, subtract the ids of the students who are not enrolled in every course.

Sid
S1
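
The four steps of the division can be followed in a short sketch. It uses plain Python sets as a model of the relations, with E holding the (Sid, Cid) pairs of Enroll and C the course ids of Course; the variable names are illustrative only.

```python
# Enroll(Sid, Cid) and Course(Cid) from the example.
E = {("S1", "C1"), ("S1", "C2"), ("S2", "C1"),
     ("S2", "C2"), ("S1", "C3"), ("S3", "C3")}
C = {"C1", "C2", "C3"}

# Step 1: all student ids, paired with every course.
t1 = {sid for sid, _ in E}
all_pairs = {(sid, cid) for sid in t1 for cid in C}

# Step 2: pairs that would be needed but are missing from E.
missing = all_pairs - E

# Step 3: students with at least one missing course.
t2 = {sid for sid, _ in missing}

# Step 4: students with no missing course, i.e. E divided by C.
result = t1 - t2

print(sorted(missing))
print(sorted(t2), sorted(result))
```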

Note: If R has n tuples and S has m tuples (m > 0), then R ÷ S has between 0 and ⌊n/m⌋ tuples.

Outer Join
When a user wants to retain all of the tuples of R, of S, or of both relations in the result of the JOIN, regardless of whether there are matching tuples in the other relation, a set of operations known as outer joins can be used.
3 types of outer join
1. Left outer join
2. Right outer join
3. Full outer join

Let the tables R and S be as given below.

R
A  B  C
1  2  1
2  2  2

S
B  C  D
2  1  4
4  2  1

LEFT OUTER JOIN: R ⟕ S

Step-1: The output of the natural join is

A  B  C  D
1  2  1  4
Step-2: Add those tuples of R that fail the join condition, padded with NULL.
A  B  C  D
1  2  1  4
2  2  2  NULL

RIGHT OUTER JOIN: R ⟖ S

Step-1: The output of the natural join is

A  B  C  D
1  2  1  4
Step-2: Add those tuples of S that fail the join condition, padded with NULL.
A     B  C  D
1     2  1  4
NULL  4  2  1

FULL OUTER JOIN: R ⟗ S
Step-1: The output of the natural join is
A  B  C  D
1  2  1  4
Step-2: Add those tuples of R and S that fail the join condition, padded with NULL.
A     B  C  D
1     2  1  4
2     2  2  NULL
NULL  4  2  1
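
The two-step construction (natural join first, then the padded non-matching tuples) can be sketched as follows. This is plain Python, assuming relations are lists of dicts and `None` stands for NULL; `matches`, `pad` and `as_rows` are hypothetical helper names.

```python
R = [{"A": 1, "B": 2, "C": 1}, {"A": 2, "B": 2, "C": 2}]
S = [{"B": 2, "C": 1, "D": 4}, {"B": 4, "C": 2, "D": 1}]
ATTRS = ["A", "B", "C", "D"]   # attributes of the result
COMMON = ["B", "C"]            # join attributes shared by R and S

def matches(t, u):
    return all(t[a] == u[a] for a in COMMON)

def pad(t):
    """Extend a tuple to all result attributes, filling the gaps with NULL."""
    return {a: t.get(a) for a in ATTRS}

# Step 1: natural join.
inner = [{**t, **u} for t in R for u in S if matches(t, u)]

# Step 2: tuples that found no partner in the other relation.
left_only = [pad(t) for t in R if not any(matches(t, u) for u in S)]
right_only = [pad(u) for u in S if not any(matches(t, u) for t in R)]

left_outer = inner + left_only
right_outer = inner + right_only
full_outer = inner + left_only + right_only

def as_rows(rel):
    return [tuple(t[a] for a in ATTRS) for t in rel]

print(as_rows(left_outer))
print(as_rows(right_outer))
print(as_rows(full_outer))
```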

Tuple Relational Calculus (TRC)


TRC is a non-procedural query language: a query is a single declarative statement that specifies what is to be retrieved, not how or in what sequence the retrieval is to be carried out.
Note: The expressive power of TRC and of relational algebra is equal, i.e., any retrieval that can be stated in relational calculus can also be specified in basic relational algebra.
Format: {T | P(T)}
T = tuple variable
P(T) = formula over the tuple variable T

The query returns the set of all tuples T for which P(T) is satisfied, i.e., the result of the query is the set of all tuples t that satisfy COND(t).

Retrieve the names of the students who scored more than 90.


Sol. {t.name | STUDENT(t) AND t.marks > 90}

Retrieve the suppliers whose rating is greater than 10.


Sol. {s | s ∈ supplier ∧ s.rating > 10}
Here "s ∈ supplier" and "s.rating > 10" are atomic formulas.
A formula P consists of one or more atomic formulas.

A General Expression of TRC is


{t1.Aj, t2.Ak, ..., tn.Am | COND(t1, t2, ..., tn, tn+1, ..., tn+m)}
where t1, t2, ..., tn+m are tuple variables,
Aj, Ak, ... are attributes of the relations over which the ti range,
and COND is a conditional formula.
Quantifiers can also be used in a formula:
(1) Existential quantifier (∃)
(2) Universal quantifier (∀)
Free variable: A tuple variable that is not bound by a quantifier is called a free variable.
Example: s in {s | s ∈ supplier}
Bound variable: A tuple variable that is preceded by a quantifier is called a bound variable.
Example: s in ∃s ∈ supplier (...)
Note: (1) In a TRC query of the form {T | P(T)}, at most one free variable can be used.
(2) The result of a TRC query is given by the free variable.

Retrieve the sid of the suppliers who supplied some red part.

Relational algebra:
π_{sid}(σ_{c.id = p.id}(Catalog × σ_{col='red'}(Parts)))

TRC:
{T | ∃c ∈ Catalog (∃p ∈ Parts (p.col = 'Red' ∧ p.id = c.id) ∧ T.sid = c.sid)}

Here T is the result tuple.


Retrieve the names of all employees who work for the CSE department.
Solution: {t.name | Employee(t) AND (∃d)(Department(d) AND d.name = 'CSE' AND d.number = t.Dno)}
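
A TRC expression reads almost literally as a set comprehension: the free variable becomes the loop variable and an existential quantifier becomes `any(...)`. The sketch below is plain Python over hypothetical sample data (the employee and department rows are invented for illustration) and mirrors the CSE query above.

```python
# Hypothetical sample relations; only the attribute names come from the query.
employees = [
    {"name": "Asha", "Dno": 1},
    {"name": "Ravi", "Dno": 2},
    {"name": "Meena", "Dno": 1},
]
departments = [
    {"name": "CSE", "number": 1},
    {"name": "ECE", "number": 2},
]

# {t.name | Employee(t) AND (exists d)(Department(d) AND d.name = 'CSE'
#                                      AND d.number = t.Dno)}
result = {
    t["name"]
    for t in employees                      # t is the free variable
    if any(d["name"] == "CSE" and d["number"] == t["Dno"]
           for d in departments)            # d is bound by the quantifier
}

# A universal quantifier can be rewritten as NOT EXISTS NOT.
forall_form = all(d["number"] > 0 for d in departments)
not_exists_not_form = not any(not (d["number"] > 0) for d in departments)

print(sorted(result), forall_form == not_exists_not_form)
```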
Safe and Unsafe Queries
A safe expression in relational calculus is one that is guaranteed to yield a finite number of tuples as its result.
Unsafe query: a query whose result is an infinite set of tuples.

{s | ¬(s ∈ supplier)}
The tuples that do not belong to supplier,
i.e., all possible tuples except those in supplier.
i.e., whenever we take the complement of the range of a free variable, we get an unsafe query.
Note: Expressive power of RA = expressive power of (safe) TRC
An equivalent expression can be obtained by transforming a universal quantifier into an existential quantifier and vice versa, i.e.,
(∀x) (P(x)) ≡ NOT (∃x) (NOT (P(x)))
(∃x) (P(x)) ≡ NOT (∀x) (NOT (P(x)))
(∀x) (P(x) AND Q(x)) ≡ NOT (∃x) (NOT (P(x)) OR NOT (Q(x)))
(∀x) (P(x) OR Q(x)) ≡ NOT (∃x) (NOT (P(x)) AND NOT (Q(x)))
(∃x) (P(x) OR Q(x)) ≡ NOT (∀x) (NOT (P(x)) AND NOT (Q(x)))
(∃x) (P(x) AND Q(x)) ≡ NOT (∀x) (NOT (P(x)) OR NOT (Q(x)))

(Query 1) Get the names of all employees in department 5 who work more than 10 hours per week on the ProductX project.
Sol. { t.fname, t.minit, t.lname | employee(t) and t.dno = 5 and (Exists w)(Exists p)(works_on(w) and project(p) and t.ssn = w.essn and w.pno = p.pnumber and w.hours > 10 and p.pname = 'ProductX') }
(Query 2) Get the names of all employees who have a dependent with the same first name as themselves.
Sol. { t.fname, t.minit, t.lname | employee(t) and (Exists d)(dependent(d) and t.ssn = d.essn and t.fname = d.dependent_name) }
(Query 3) Get the names of all employees who are directly supervised by Franklin Wong.
Sol. { t.fname, t.minit, t.lname | employee(t) and (Exists e)(employee(e) and t.superssn = e.ssn and e.fname = 'Franklin' and e.lname = 'Wong') }
(Query 4) Get the names of all employees who work on every project.
Sol. { t.fname, t.minit, t.lname | employee(t) and (Forall p)(project(p) -> (Exists w)(works_on(w) and w.essn = t.ssn and w.pno = p.pnumber)) }
(Query 5) Get the names of the employees who do not work on any project.
Sol. { t.fname, t.minit, t.lname | employee(t) and not (Exists w)(works_on(w) and w.essn = t.ssn) }

PRACTICE QUESTIONS
1. Consider the following query
SELECT Studentid, Studentname
FROM student
WHERE birthyear >= ALL (SELECT birthyear FROM student)
(A) Return studentid and studentname of youngest student
(B) Return studentid and studentname of oldest student
(C) Return studentid and studentname of all student
(D) None
Sol. (A)
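The inner query returns the birth years of all students. The outer query keeps the students whose birth year is greater than or equal to every one of them, so it returns the studentid and studentname of the youngest student.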
2. Which of the following operation is not a derived operation in relational algebra?
(A) Minus (B) Intersection (C) Join (D) Division
Sol. (A)
Intersection: R ∩ S = ((R ∪ S) − (R − S)) − (S − R)
OR
R ∩ S = R − (R − S)
Join: R ⋈_{<condition>} S = σ_{<condition>}(R × S)

Division is derived using projection, cross product and set difference.


3. Consider the following relation R, S and T

A B C A B A C
R 1 a S
10 2 a 1 NULL
2 b 20 3 b 2 10
3 c 30 6 c 3 20
4 d 40 6 d 4 NULL
5 e 50 7 E 5 NULL
NULL 6 30
NULL 6 40
NULL 7 50

If T = R ? S, then find the missing operation at ??


(A) Product (B) Natural join
(C) Left outer join (D) full outer join

Sol. (D)
Full outer join of R and S will give T relation, Here, Null entries are taken for R and S to
include all missing entries of common attribute while joining R and S
4. Consider a schema with two relation R(A, B) and S (B, C) where all values are integers. Make
no assumption about the key. Consider the following two relational algebra expression:
(1) A,C ( BC S)

(2) A (B1R)  C (B1S)

which of the following is true


(A) Both queries are equivalent (B) Output of query 1 c output of query 2
(C) output of query 2 tz output of query 1 (D) No relation
Min : 1.0
Max : 1.0
Sol. 1

then R B 1 S

5. Consider the following sequence of operation given below on the relation Employee (Eid,
Ename, address, Bdate, Superssn, Dno, Sex)
1) male-emp ← σ_{sex='m'}(Employee)

2) Result1 ← π_{Eid}(male-emp)

3) Result2 ← π_{superssn}(male-emp)

4) Result ← Result1 ∪ Result2


What will be the above sequence of operations performed on the given relation produce?
(A) Eid of an employee who is either male (OR) supervisor
(B) Eid of all male supervisor
(C) Eid of male employee and eid of their supervisor
(D) None
Sol. (C)
Result of first operation is the relation which contain all male employee.
Result of second operation is the relation which contain the list of all Eid of all male employee
Result of third operation is the relation which contain the list of all superssn of all male
employee
6. Consider two relations R1(A, B, C, D), where A is the primary key, and R2(D, E), where D is the primary key. Let R1 have 1500 tuples and R2 have 1000 tuples. The maximum number of tuples in R1 ⋈ R2 is ________
Sol. 1500
Every tuple of R1 can join with at most one tuple of R2, since D is the primary key of R2.
7. Consider the following relational schemas:
Student(Fname, Sid, Sex, Age)
Courseinfo(Courseid, Coursename, Hours, Instructor, Name)
Enroll(Sid, Courseid, Sem)
Which of the following is the most efficient relational algebra expression for the following query?
"Find the fname of the students who have enrolled for the course having id 100"
(A) π_{fname}(σ_{courseid=100}(Enroll ⋈ Student))

(B) π_{fname}(σ_{courseid=100}(Enroll ⋈ Courseinfo ⋈ Student))

(C) π_{fname}(σ_{courseid=100}(Enroll) ⋈ Student)

(D) π_{fname}(σ_{courseid=100}(Enroll ⋈ Student))

Sol. (C)
Option (C) is the most efficient relational algebra expression, since it performs the selection on Enroll first and only then joins the resulting table with the Student relation to get the student names.
8. Which of the following statement is correct in aggregation with NULL values in SQL?
(A) All aggregate function ignore NULL values in their collection
(B) All aggregate function ignore NULL values except Count() function
(C) All aggregate function ignore NULL values except Count(*) function
(D) All aggregate function ignore NULL values except Count(distinct column-name)
Sol. (C)

A B

1 4
2 5
3 6
4 Null

Then Count (A) = 4


Count (B) = 3
Count(*) = 4
9. Which of the following operation results same output table as given input table.
Input table: R
A B

3 4

6 8

9 77

10 14

I) A,B (ABR) II) AB (A,BR) III) AB (A,BR)

(A) I & II (B) II & III (C) III only (D) I & III
Sol. (C)

In the given input, A < B in every row. So, σ_{A<B} selects all rows, and projecting on A, B returns the input table unchanged.

10. Consider the following relation
R1 (P, Q, R) and R2 (R, S, T)
Let R1 have 1000 records and R2 have 2000 records. The non-null attribute 'P' in R2 references attribute 'P' in R1. Let X be the minimum number of records in R1 ⋈ R2 and Y be the maximum number of records in R1 ⋈ R2. Then X + 2Y is ____________
Sol. 6000
P is not a key of R2, so its values in R2 may or may not be unique. However, P in R2 is a non-null foreign key referencing P in R1, so every entry under 'P' in R2 matches a value of 'P' in R1.
Hence, the maximum is 2000, and because of the foreign key the minimum is also 2000.
So, X + 2Y = 2000 + 2 * 2000 = 6000
11. Consider the following query
Doctor (Did, Dname, Dfee, Specialization)
SELECT Dname
FROM Doctor
WHERE NOT Dfee >= ANY (SELECT Dfee FROM Doctor WHERE Specialization = 'Cardio')
What is the output of the above query?
(A) Names of doctors whose fee is less than that of every doctor with the cardio specialization
(B) Names of doctors whose fee is greater than that of every doctor with the cardio specialization
(C) Names of doctors whose fee is less than or equal to that of doctors with the cardio specialization
(D) None
Sol. (A)
The inner query finds the fees of all doctors with the cardio specialization.
The outer query checks every doctor record and rejects it if the fee is greater than or equal to
any inner fee, so only doctors whose fee is less than every cardio fee remain.
12. Consider a relation Employee (Eid, Ename, Salary)
SELECT Ename, SUM(Salary)
FROM Employee
GROUP BY Ename
HAVING SUM(Salary) < 10000
What is the output of the above query?

(A) Names of all employees whose individual salary is less than 10,000
(B) Names of employees sharing the same name whose individual salaries are less than 10,000
(C) Names of employees sharing the same name whose total salary is less than 10,000
(D) None
Sol. (C)
The query groups the records by name and returns each name whose group has a total salary of less than 10,000.
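To see the grouping behaviour concretely, the query can be run on a few made-up rows. This is a sketch with Python's built-in sqlite3 module; the employee names and salaries are invented for illustration, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (Eid INTEGER PRIMARY KEY, Ename TEXT, Salary INTEGER)")
# Hypothetical sample data: two employees named Asha, two named Ravi, one Meena.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Asha", 4000), (2, "Asha", 5000),
     (3, "Ravi", 7000), (4, "Ravi", 6000),
     (5, "Meena", 9000)],
)

# HAVING filters whole groups by their total, not individual rows.
rows = conn.execute(
    "SELECT Ename, SUM(Salary) FROM Employee "
    "GROUP BY Ename HAVING SUM(Salary) < 10000 ORDER BY Ename"
).fetchall()

print(rows)  # [('Asha', 9000), ('Meena', 9000)]
```

Ravi is dropped even though each of his individual salaries is below 10,000, because the group total is 13,000.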
13. Given two relations R1 and R2 with n and 0 records respectively, what is the maximum
possible number of records in the result of R1/R2?
(Assume the set of attributes of R2 is a subset of the set of attributes of R1.)
(A) n (B) 0 (C) more than n (D) undefined
Sol. (A)
Because R2 contains zero records, it imposes no condition on R1. So all the records of
relation R1, projected on the attributes not in R2, appear in the output of R1/R2, which gives at most n records.

EXERCISE QUESTIONS
1. Consider the following relations S1(P,Q,R) and S2(T,U,V) and the given instances. What is the
result of the algebra expression
P,Q((R=3)V(R=5) (S1))  T,U ((v2)V(V3)(S2))

S1                  S2
P    Q    R         T    U    V
1    2    3         1    2    3
5    10   4         2    4    2
2    4    3         1    2    4
3    6    5         3    6    2
1    2    5

A. Empty relation
B. {(2,4), (3,6)}
C. A relation with schema (P,Q) and tuples {(1,2)}
D. A relation with schema (P,Q) and tuples {(1,2), (2,4), (3,6)}
Ans. A
2. Using the relation instances given in the Question 24, find out how many tuples will be there
in the result of the following relational algebra expression
S1 S1.Q S2.V S2

A. 20 B. 9 C. 8 D. 16
Ans. D
3. Consider the following relations S1(P,Q,R) and S2(P,R) and the given instances. What is the
result of the relational algebra expression?
S1 ÷ S2
S1                  S2
P    Q    R         P    R
1    2    2         1    2
1    4    2         2    4
2    4    4
1    6    2
2    2    4

A. Empty relation

B. A relation with scheme (Q) and tuples {(2), (4)}
C. A relation with scheme (Q) and tuples {(2), (4), (6)}
D. {2, 4}
Ans. B
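Answer B can be checked mechanically, assuming the operator in Question 3 is division (the only reading that yields a result with schema (Q)). The sketch below is an illustrative helper, not a library function; the name divide and its argument layout are assumptions.

```python
def divide(r, r_attrs, s, s_attrs):
    """Relational division r / s: tuples over the attributes of r not in s
    that are paired in r with EVERY tuple of s."""
    keep = [a for a in r_attrs if a not in s_attrs]
    keep_idx = [r_attrs.index(a) for a in keep]
    s_idx = [r_attrs.index(a) for a in s_attrs]
    divisor = set(s)
    result = set()
    for candidate in {tuple(t[i] for i in keep_idx) for t in r}:
        # All divisor-attribute values that occur together with this candidate.
        paired = {tuple(t[i] for i in s_idx) for t in r
                  if tuple(t[i] for i in keep_idx) == candidate}
        if divisor <= paired:
            result.add(candidate)
    return keep, result


S1 = [(1, 2, 2), (1, 4, 2), (2, 4, 4), (1, 6, 2), (2, 2, 4)]  # schema (P, Q, R)
S2 = [(1, 2), (2, 4)]                                          # schema (P, R)

print(divide(S1, ["P", "Q", "R"], S2, ["P", "R"]))  # (['Q'], {(2,), (4,)}), set order may vary
```

Calling the same helper with an empty divisor returns every Q value of S1, which matches the reasoning in solved Question 13 that an empty R2 imposes no condition.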
4. Consider the following relation S1 and its given instance. What is the size of the result of the
following relational algebra expression on the relation S1?
π P,Q (S1) * π Q,R (S1)

P    Q    R
1    2    3
3    4    5
2    4    6
3    3    5
4    2    3

A. 5 B. 6 C. 7 D. 8

Ans. C

5. On the same relation given in Question 28, what is the size of the result of the following
relational algebra expression?

P,Q1 ( P,Q (S1)) (Q1 Q2) Q2,R (Q,R (S1))

A. 9 B. 10 C. 11 D. 12

Ans. C

6. Consider two relations R1(A,B) and R2(B,C) as given below

R1             R2
A    B         B    C
4    8         6    3
3    6         3    6
2    4         2    4

Assume that R(A,B,C) is the full natural outer join of R1 and R2. Consider the following tuples:
t1=(4,8,null)
t2=(null,3,6)
t3=(2,4,null)
t4=(3,6,3)
t5=(null,6,null)
t6=(null,2,4)
Choose correct option:
A. R contains only t1, t3 and t4 B. R contains only t2, t4 and t6
C. R contains all tuples D. R contains t1, t2, t3, t4, t6 but not t5
Ans. D
7. Consider the statements given below
P: Any retrieval request that is specified in the basic relational algebra can also be specified in
relational calculus
Q: Any retrieval request that is specified in the relational calculus can also be specified in basic
relational algebra
Choose correct option:
A. P is true Q is false B. P is false Q is true
C. Both P and Q are true D. Both P and Q are false
Ans. C
8. Consider the following relational scheme where rollNo and courseId are foreign keys in
Enrollment referring to rollNo in Student and courseId in Courses, respectively:
Student(rollNo, name, degree, year, sex, deptNo, advisor)
Courses(courseId, cname, credits, deptNo)
Enrollment(rollNo, courseId, sem, year, grade)
Consider the TRC query given below
{s.rollNo, s.name | Student(s) ∧ (∃e1)(Enrollment(e1) ∧ e1.rollNo = s.rollNo)}
Which one of the following is the correct interpretation of the above query?
A. Retrieve rollNo and name of students who have enrolled for exactly one course
B. Retrieve rollNo and name of students who have enrolled for more than one course
C. Retrieve rollNo and name of students who have enrolled for some course
D. Retrieve rollNo and name of students who have all enrolled for same course
Ans. C

9. Consider the relational schema given in Question 34 and the TRC query given below
{s.rollNo|Student(s)  e1 (Enrollment(e1)  (e1. rollNo =)) ((e2)(e3) (Enrollment)(e2)
Enrollment(e3)  e2. rollNo  e2. coursed  e3. course Id  s. rollNo = e2. rollNo))}
Which one of the following is the correct interpretation of the above query?
A. Retrieve rollNo of students who have enrolled for some course
B. Retrieve rollNo of students who have enrolled for at-least one course
C. Retrieve rollNo of students who have enrolled for at-least two course
D. Retrieve rollNo of students who have enrolled for exactly one course
Ans. D
10. Consider the following relational scheme where cid in Enrollment is a foreign key referring to
cid of Courses.
Enrollment(sid. cid, grade)
Courses(cid. cname. instructor)
Consider the following division operation of relation algebra
sid,cid(Emrollment)/cid (Courses) and the equivalent TRC query given below:
{e.sid\Enrollment(e)  c (Coures(s) op1 e1 (Enrollment (e1)  c.cid =e1.cid opp2 e1.sid =
e.sid))}
Choose the correct operators (op 1. op2):
A. ,  B. ,  C. ,  D. , 
Ans. D
11. In SQL, the operator NOT IN is equivalent to which of the following?
A. =ANY B. =ALL C. <>ANY D. <>ALL
Ans. D
Use the following schema of the academic institution relational database for questions: 38,
and 41.
student(rollNo, name, degree, year, sex, deptNo, advisor)
department(deptId, name, hod, phone)
professor(empId, name, sex, startYear, deptNo, phone)
course(courseId, cname, credits, deptNo)
enrollment(rollNo, courseId, sem, year, grade)
teaching(empId, courseId, sem, year, classRoom)
preRequisite(preCourseId, courseId)
deptNo is a foreign key in the student, professor and course relations referring to deptId of the
department relation; advisor is a foreign key in the student relation referring to empId of the
professor relation; hod is a foreign key in the department relation referring to empId of the
professor relation; rollNo is a foreign key in the enrollment relation referring to rollNo of the student
relation; courseId is a foreign key in the enrollment and teaching relations referring to courseId of the
course relation; empId is a foreign key of the teaching relation referring to empId of the professor
relation; preCourseId and courseId are foreign keys in the preRequisite relation referring to
courseId of the course relation.
12. Which of the following queries would retrieve the roll number and names of students who
have enrolled for all the prerequisite courses of the number 324?
A. select s.rollNo, s.name
from students s, enrollment e
where e.rollNo = s.rollNo and
e.courseld = ANY (select preCourseld
from preRequisite p
where p.courseld = “324”)
B. select s.rollNo. s.name
from student s, enrollment e
where e.rollNo=s.rollNo and
e.courseld = All (select preCourseld
from prerequisite p
where p.courseld = “324”)
C. select s.rollNo, s.name
from student s
where NOT EXISTS (select * from prerequisite p
where p.courseld=“324” and
NOT EXISTS (select *
from enrollment e
where e.courseld=p.preCourseld and
e.rollNo=s.rollNo)
D. select s.rollno, s.name
from students s
where EXISTS (select *
from prerequisite p
where p.courseld= “324” and
EXISTS (select *
from enrollment e
where e. coursed=p.pre Courseld and
e.rollNo=s.rollNo))
Ans. C

13. Which of the following commands is used for schema modification in SQL?
A. Alter B. Update C. Insert D. Delete
Ans. A
14. In order to perform a UNION operation as in an SQL query, the operands of UNION operator
(i) need to be union compatible
(ii) need to have the same attribute names in the same order
(iii) need to have the same attribute names, but they can be in any order
Choose the correct option:
A. Only (i) is true B. Only (ii) is true
C. (i) and (ii) are true D. (i) and (iii) are true
Ans. A
15. Which of the following queries would retrieve the roll number and name of students from
Dept number 3 who have a lady professor from another department as their advisor?
A. Select s.rollNo, s.name
From student s
Where s.deptNo=3 and
p.deptNo<> 3 and
s.advisor IN (Select p. empld from professor p where p.sex = ‘F’);
B. Select s.rollNo, s.name
From student s
Where s.deptNo=3 and
s.advisor IN (select p.empId from professor p where p.sex = 'F' and p.deptNo <>
s.deptNo);
C. Select s.rollNo, s.name
From student s, professor p
Where s.deptNo=3 and
p.deptNo <> 3 and
s.advisor IN (select p.empId from professor p where p.sex = 'F');
D. Select s.rollNo, s.name
From student s, professor p1
Where s.deptNo=3 and
p1.deptNo <> 3 and
s.advisor IN (select p.empId from professor p where p.sex = 'F');

Ans. B
Use the following schema of GATE exam details of a particular year for questions: 42 and 46.
gateMarks(regNo, name, sex, branch, city, state, marks)
Here, regNo uniquely identifies a student who has written the GATE examination.
The values of the attribute branch indicate the branch of engineering, such as CS, EC, etc.,
in which the examination was taken. Other attributes are obvious.
16. Which of the following queries would retrieve the regNo, name and marks of students who
obtained the minimum marks in the branch of CS? [MCQ]
A. select name, min(marks)
from gateMarks
where branch="CS"
B. select regNo, name, marks
from gateMarks
where branch="CS“ and marks = ANY (select min(marks)
from gateMarks
where branch="CS")
C. select regNo, name, marks
from gateMarks
where branch="CS“ and marks <=ALL (select marks
from gateMarks
where branch="CS")
D. select regNo, name, marks
from gateMarks
where branch="CS“ and marks <=ANY (select marks
from gateMarks
where branch= “CS")
Ans. B, C
Use the following schema of the academic institution relational database for questions: 43, 44
and 45.
student(rollNo, name, degree, year, sex, deptNo, advisor)
department(deptId, name, hod, phone)
professor(empId, name, sex, startYear, deptNo, phone)
course(courseId, cname, credits, deptNo)
enrollment(rollNo, courseId, sem, year, grade)
teaching(empId, courseId, sem, year, classRoom)
preRequisite(preCourseId, courseId)

17. Which of the following queries would compute, for each department, the total credits of all
the courses offered by the department?
A. Select deptld, name, sum(credits) as totalCredits
From department, course
Where deptld = deptNo
Group by deptld, name;
B. Select deptld, name, count(credits) as totalCredits
From department, course
Where deptld = deptNo
Group by deptld, name;
C. Select deptld, name, sum(credits) as totalCredits
From department, course
Where deptld = deptNo;
D. Select deptld,name, sum(credits) as totalCredits
From department, course
Group by deptld, name
Having deptld = deptNo;
Ans. A
18. Which of the following queries would retrieve the deptld and name of all the departments that
are such that the total of the credits of all the offered courses by the department is strictly
greater than 40?
A. Select deptld,name, sum(credits) as totalCredits
From department,course
Where totalCredits > 40
Group by deptld,name
Having deptld = deptNo;
B. Select deptld, name, sum(credits) as totalCredits
From department, course
Where deptld = deptNo and totalCredits > 40
Group by deptld, name;
C. Select deptld,name, sum(credits) as totalCredits
From department,course
Group by deptld,name
Having deptld = deptNo and totalCredits > 40;

D. Select deptld, name, sum(credits) as totalCredits
From department, course
Where deptld = deptNo
Group by deptld, name
Having totalCredits > 40;
Ans. D
19. Which of the following queries would retrieve the deptNo and total credits of department(s)
which has(have) the maximum total of credits for courses offered by the department across all
departments? [MSQ].
A. select deptNo, sum(credits) as totalCredits1
from course
group by deptNo
having totalCredits1 = ANY (select max(x.totalCredits)
from (select sum(credits) as totalCredits
from course
group by deptNo)as x
);
B. select deptNo, max(sum(credits)) as totalCredits1
from course
group by deptNo;
C. select deptNo, sum(credits) as totalCredits1
from course
group by deptNo
having totalCredits1 = max(totalCredits1);
D. select deptNo, sum(credits) as totalCredits1
from course
group by deptNo
having totalCredits1 >= ALL(select sum(credits)
from course
group by deptNo);
Ans. A, D
20. The following relation is used to store the detail of the students in an engineering college:
Student(rollNo, name, sex).
Consider the following predicates.
P1: rollNo = 'CS17B038'
P2: name = 'Suresh' (Assume that there are at least two people in the college with the name
'Suresh')
P3: sex = 'Male'
Let c1, c2 and c3 denote the selectivity of P1, P2 and P3, respectively. Under normal conditions,
the relation among c1, c2 and c3 is
A. c1 < c2 < c3 B. c1< c2 > c3
C. c1 < c2 = c3 D. c3 < c2 < c1
Ans. A


CHAPTER-4
Transaction and Concurrency Control

• Transaction
• Classification of Schedule
• Classification of schedule based on serializability
• Two types of schedule based on serializability
• Conflict Serializable Schedule
• View Serializable Schedule
• Concurrency Control Protocol
• Locking Protocol
• Basic Time-stamp Ordering
• Thomas's Write Time Stamp Protocol
• Deadlock Prevention Protocol

Transaction
A transaction is a group of operations that together make up a single logical unit of work. One or
more database access operations, such as insertion, deletion, update, or retrieval activities, are
included in a transaction.

Transfer of money from one account to another is a transaction.


A transaction is called a read-only transaction if its database operations only
retrieve data without updating the database; otherwise, it is called a read-write
transaction.
The actions that a transaction can execute include reads and writes of database
objects. Each transaction must end with either a commit (i.e., successful completion) or
an abort (i.e., stop and undo all the actions carried out so far) as its final action.
Read(A): Read data item A from the database (DB) into a program variable in main memory (MM)
Write(A): Write the updated value of data item A to the database
Commit: The transaction has executed successfully

Transfer Rs.500 from A to B


Initially values of data item A = 1000
Initially values of data item B = 2000
Begin Transaction
R(A)
A = A – 500
W(A)
R(B)
B = B + 500
W(B)
Commit
End Transaction
To preserve consistency, a transaction should satisfy the following 'ACID' properties:
Atomicity: A transaction is an atomic unit of processing; it should either be
performed in its entirety or not performed at all.
Consistency preservation: A transaction should preserve consistency, meaning that if
it is executed completely from beginning to end without interference from other transactions,
it takes the database from one consistent state to another.
Isolation: Even though many transactions run concurrently, each transaction
should appear to run independently of the others. In other words, the execution of a
transaction should not be interfered with by other transactions running concurrently.
Durability: The changes made by a committed transaction must persist in the database.
No failure may cause these changes to be lost.
(1) A: Atomicity: Execute all operations of a transaction or none of them.

Let transaction T transfer $500 from account A to account B.

R(A)
A = A - 500
W(A)
R(B)
* B = B + 500
W(B)

where * denotes the point at which the transaction fails.

If a transaction fails to finish for any reason, such as a system crash in the
middle of its execution, the recovery mechanism must undo (roll back) any effects of the
transaction on the database.

Reasons for transaction failure:

(1) Power failure (2) Software crash
(3) Operating system crash (4) Hardware crash (disk)
(5) The operating system may kill the transaction (deadlock)
If atomicity is not satisfied, the database becomes inconsistent.

Recovery Management Component (RMC): Whenever a transaction fails before commit, the
recovery management component rolls the transaction back.
Rollback means undoing all modifications.
To roll back, every transaction relies on the transaction log.
Transaction log: To be able to recover from failures that affect transactions, the system
keeps a log that records all transaction operations that change the values of database items,
along with other transaction information that may be needed for recovery. The
log is stored on disk as a sequential, append-only file, so it is unaffected by any failure
other than a disk failure or a catastrophic failure.
Log records are the entries written to the log file, one for each logged action.
In these entries, id refers to a unique transaction-id that the system generates
automatically for each transaction and uses to identify it.
1. {start_transaction, id}: Indicates that transaction Tid has started execution.
2. {write_item, id, X, old_value, new_value}: Indicates that transaction Tid has changed the value
of database item X from old_value to new_value.
3. {read_item, id, X}: Indicates that transaction Tid has read the value of database item X.
4. {commit, id}: Indicates that transaction Tid has completed successfully, and affirms that
its effect can be committed (recorded permanently) to the database.
5. {abort, id}: Indicates that transaction Tid has been aborted.
Because the log keeps track of every WRITE operation that modifies the value of some database
item, it is possible to undo the effects of the WRITE operations of a transaction T by tracing
back through the log and setting all items changed by a WRITE operation of T to their old_values.
A redo of an operation may also be required if a problem occurs before the system is sure that
all of the new_values in a transaction have been written from the main memory buffers to the
actual database on disk.
* Once the transaction has completed successfully, its log file is deleted.
* The log file is maintained by the system until the transaction commits or its rollback completes.

T.log
A old = 1000
A new = 500
B old = 2000
B new = 2500
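The undo procedure described above can be sketched in a few lines. This is an illustrative model only: the database is a Python dictionary, log records are tuples shaped like the entries listed above, and the function name undo is an assumption. The transfer fails after W(A), as in the atomicity example.

```python
def undo(db, log, txn_id):
    """Roll back txn_id by scanning the log backwards and restoring old values."""
    for record in reversed(log):
        if record[0] == "write_item" and record[1] == txn_id:
            _, _, item, old_value, _new_value = record
            db[item] = old_value
    log.append(("abort", txn_id))


db = {"A": 1000, "B": 2000}
log = [("start_transaction", "T1")]

# W(A): the log record is appended before the database item is changed.
log.append(("write_item", "T1", "A", db["A"], db["A"] - 500))
db["A"] -= 500

# The transaction fails here, before B is credited and before commit.
before_undo = dict(db)
undo(db, log, "T1")

print(before_undo)  # {'A': 500, 'B': 2000}
print(db)           # {'A': 1000, 'B': 2000}
```

Scanning in reverse matters when one item is written several times: the oldest old_value is applied last, so the item ends at its value from before the transaction started.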
(2) Durability: Recovery must be possible after any kind of failure. The durability
property is the responsibility of the recovery subsystem of the DBMS.
(3) Consistency: The database should be consistent before and after the execution of a transaction.
This is taken care of by the user.
* If the RMC and the CCMC (Concurrency Control Management Component) both succeed, consistency is always satisfied.
(4) Isolation: The database system needs to take extra precautions to guarantee that
transactions run properly without interference from concurrently running database statements.
So, isolation means that the concurrent execution of two or more transactions should be equivalent to some
serial schedule (a serializable schedule).

Let transaction T1 and T2 .


T1: - Transfer $500 from account A to account B
R(A)
W(A)
R(B)
W(B)
T2: Display total balance of account A and account B
R(A)
R(B)
Schedule: A time-ordered sequence of the operations of two or more transactions.
Categories of Schedule:
(1) Serial Schedule: The transactions in the schedule execute one after another, i.e., another
transaction begins only after the previous transaction commits.

Let serial schedule S1 be:

T1                T2
R(A)  1000
W(A)  500
R(B)  2000
W(B)  2500
                  R(A)  500
                  R(B)  2500

Let serial schedule S2 be:

T1                T2
                  R(A)  1000
                  R(B)  2000
R(A)  1000
W(A)  500
R(B)  2000
W(B)  2500

Serial schedules are always consistent.

Note: n! serial schedules are possible with n transactions.
Advantage: Every serial schedule produces a consistent result.
Disadvantage: While a transaction waits for an I/O operation (R(A), W(A)), the CPU sits idle. So
throughput is poor, resource utilization is poor, and response time is high.
Throughput: Number of transactions executed per unit time.
(2) Concurrent schedule: Interleaved (or simultaneous) execution of two or more
transactions.

Let the initial value of A = 1000
Let the initial value of B = 2000
Let T1's task be to transfer $500 from A to B
Let T2's task be to display the balances of A and B
Schedule S1:
T1                T2
R(A)  1000
W(A)  500
                  R(A)  500
R(B)  2000
W(B)  2500
                  R(B)  2500
Schedule S1 is consistent, and its final result is equivalent to the serial schedule T1->T2.
Schedule S2:
T1                T2
R(A)  1000
                  R(A)  1000
W(A)  500
                  R(B)  2000
R(B)  2000
W(B)  2500
Schedule S2 is consistent, and its final result is equivalent to the serial schedule T2->T1.
Schedule S3:
T1                T2
R(A)  1000
W(A)  500
                  R(A)  500
                  R(B)  2000
R(B)  2000
W(B)  2500
Schedule S3 is not consistent: T2 displays A = 500 and B = 2000, a total of 2500 instead of 3000.
Advantages of concurrent schedules: Higher throughput, better resource utilization, lower
response time.
Disadvantage of concurrent schedules: May result in inconsistency.
The Concurrency Control Management Component (CCMC) allows only consistent schedules to execute.
Isolation: The concurrent execution of two or more transactions should be equivalent to some serial
schedule (a serializable schedule).
Isolation is taken care of by the CCMC.
Note: If T1 has p operations and T2 has q operations, then the number of possible concurrent schedules =
C(p+q, p) = C(p+q, q) = (p+q)! / (p! q!)
* Every serial schedule is a concurrent schedule (but not vice versa).
* Number of concurrent schedules >>> number of serial schedules.
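The counting formula can be verified by brute force. The sketch below enumerates every interleaving of the transfer transaction T1 (4 operations) and the display transaction T2 (2 operations) while keeping each transaction's own operation order; the function name interleavings is an assumption made for illustration.

```python
from itertools import combinations
from math import comb


def interleavings(t1, t2):
    """Yield every schedule that preserves the internal order of t1 and t2."""
    n = len(t1) + len(t2)
    # Choosing which positions T1 occupies fixes the whole schedule.
    for positions in combinations(range(n), len(t1)):
        slots = set(positions)
        it1, it2 = iter(t1), iter(t2)
        yield [next(it1) if k in slots else next(it2) for k in range(n)]


t1 = ["R1(A)", "W1(A)", "R1(B)", "W1(B)"]   # p = 4 operations
t2 = ["R2(A)", "R2(B)"]                      # q = 2 operations

schedules = list(interleavings(t1, t2))
print(len(schedules), comb(len(t1) + len(t2), len(t1)))  # 15 15
```

Only two of the 15 schedules are serial (T1 followed by T2, and T2 followed by T1), which illustrates why concurrent schedules far outnumber serial ones.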

Problems caused by concurrent schedules
(1) RW Problem (Write After Read): Unrepeatable Read
T1                T2
R(A)
                  W(A)
Transaction T2 updates data item A, which has been read by transaction T1, while T1 is still uncommitted,
i.e., T2 changes the value of a data item A that has been read by transaction
T1 while T1 is still in progress.
Problem: If T1 tries to read the value of A again, it will get a different value. So, it is called the
unrepeatable read problem.

Issue DBMS book from library.


Let A = number of copies of the DBMS textbook.
To issue a DBMS textbook, the operations are:
R(A)
if (A > 0)
{
A = A - 1
W(A)
Commit
}
else
{
"No textbook available"
}

Let the initial value of A = 10.

T1                T2
R(A)
if (A > 0)
A = A - 1
                  R(A)
                  if (A > 0)
                  A = A - 1
W(A)
                  W(A)
After two books are issued, the final value of A must be 8. But with the
above interleaving, the final value of A is 9, which is inconsistent.
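The book-issue interleaving can be replayed step by step. In this sketch each transaction keeps its own local copy of A between its read and its write; the operation names "R", "DEC" and "W" and the function run are illustrative assumptions, not part of any DBMS API.

```python
def run(schedule, copies):
    """Replay (transaction, operation) steps against a single data item A."""
    db = {"A": copies}
    local = {}                      # each transaction's private copy of A
    for txn, op in schedule:
        if op == "R":               # R(A): copy the database value
            local[txn] = db["A"]
        elif op == "DEC":           # if (A > 0) A = A - 1, on the local copy
            if local[txn] > 0:
                local[txn] -= 1
        elif op == "W":             # W(A): write the local copy back
            db["A"] = local[txn]
    return db["A"]


serial = [("T1", "R"), ("T1", "DEC"), ("T1", "W"),
          ("T2", "R"), ("T2", "DEC"), ("T2", "W")]
interleaved = [("T1", "R"), ("T1", "DEC"),
               ("T2", "R"), ("T2", "DEC"),
               ("T1", "W"), ("T2", "W")]

print(run(serial, 10))       # 8
print(run(interleaved, 10))  # 9
```

In the interleaved run both transactions read 10, so both write 9 and one issued book is never recorded.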
(2) WR Problem (Read After Write): Reading Uncommitted Data or Temporary
Update Problem
T1                T2
W(A)
                  R(A)

Note: The R(A) operation above is an uncommitted read, or dirty read.

A dirty read occurs when transaction T2 reads a database item A that has been modified by
another transaction T1 that has not yet committed.
This problem arises when a transaction updates a database item and then fails for some
reason. Meanwhile, another transaction accesses (reads) the updated item before it is changed back
to its original value.

T1: Transfer $500 from account A to account B.
T2: Display the balances of account A and account B.

T1                T2
R(A)
W(A)
                  R(A)
                  R(B)
* fail
R(B)
W(B)

T1 fails at the point marked *, so its remaining operations R(B) and W(B) are never executed and its
W(A) is rolled back, but T2 has already read the uncommitted value of A.
(3) WW Problem (Write After Write): Overwriting Uncommitted Data
T1                T2
W(A)
                  W(A)

Let T1: Set the values of A and B to 1000
T2: Set the values of A and B to 2000
If the transactions execute serially as T1->T2, the final value of A and B is 2000.
If the transactions execute serially as T2->T1, the final value of A and B is 1000.
But if the transactions execute concurrently as below, then
T1                T2
W(A)  A = 1000
                  W(A)  A = 2000
                  W(B)  B = 2000
W(B)  B = 1000
The result A = 2000, B = 1000 is inconsistent, because the final values of A and B are not equal
to those of any serial schedule.

LOST UPDATE PROBLEM

T1                T2
R(A)
W(A)
                  W(A)   <- this update is lost due to the rollback
* Rollback
Blind Write: A write in which a transaction updates a data item without performing any read
operation on it is called a blind write.
Incorrect summary problem: If one transaction is calculating an aggregate summary
function on a number of records while other transactions are updating some of these records,
the aggregate function may calculate some values before they are updated and others after they are updated.
Classification of Schedule:
(1) Based on Recoverability.
(2) Based on Serializability.
(1) Based on Recoverability:
(a) Irrecoverable (non-recoverable) schedule: A schedule that can reach a situation from which
recovery is not possible, because recovery would require rolling back a committed transaction.
A transaction T should never need to be rolled back once it has been committed. This ensures
that the durability property of transactions is not compromised.

T1                T2
W(A)
                  R(A)
                  Commit
C (or) Rollback

Transaction T2 reads a data item updated by T1, and T2 commits before the commit or rollback (C/R) of T1.
(b) Recoverable Schedule:
If transaction T2 reads a data item modified by T1, then the commit or rollback of T1
must happen before the commit of T2.
T1                T2
W(A)
                  R(A)
C/R
                  C

Note: A recoverable schedule is not always free from inconsistency.
* It may be an inconsistent schedule.

T1                T2
R(A)
W(A)
                  R(A)
                  R(B)
R(B)
W(B)
C1
                  C2
This schedule is recoverable, but inconsistent.

So, a recoverable schedule may be a non-serializable schedule, i.e., the W-R, R-W, W-W and lost
update problems may still exist.

Cascading Rollback: The failure of one transaction causes the rollback of a set of dependent
transactions.
Disadvantage: Wasted CPU time reduces system throughput.

T1            T2            T3
R(A)
W(A)
              R(A)
              W(A)
                            R(A)
                            W(A)
* fail
Rollback      Rollback      Rollback
Cascadeless Rollback Schedule: If transaction T1 updates data item A, then another
transaction T2 is not allowed to read data item A until the commit or rollback of T1.
T1                T2
W(A)
C
                  R(A)

Disadvantage: May not be free from inconsistency.

T1                T2
R(A)
W(A)
                  W(A)   <- lost update problem
Rollback
So, it is an inconsistent schedule.

Strict Recoverable Schedule: If transaction T1 performs a write(A) operation, then other
transactions are not allowed to read or write data item A until the commit or rollback of T1.
T1                T2
W(A)
C
                  R(A) / W(A)

So, the W-R, W-W and lost update problems are not possible, but the R-W problem may exist.


Consider the following schedule:

R1(x) R2(z) R1(z) R3(x) R3(y) W1(x) W3(y) R2(y) W2(y) C1 C2 C3
Is the given schedule irrecoverable, recoverable, cascadeless or strict recoverable?
Solution: Irrecoverable, because T2 reads y after T3 writes it (W3(y) R2(y)) and T2 commits before T3 (C2 before C3).
Question: R1(x) R2(z) R3(x) R1(z) R2(y) R3(y) W1(x) C1 W2(z) W3(y) W2(y) C3 C2.
Solution: Recoverable and cascadeless, but not strict, because W2(y) overwrites the value written by W3(y) before T3 commits.
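Both solutions can be reproduced with a small checker. This is a simplified sketch, assuming the schedule contains only reads, writes and commits (no explicit aborts) and tracking only the most recent writer of each item; the function names parse and classify are illustrative.

```python
def parse(schedule):
    """Turn 'R1(x) W2(y) C1' into (kind, transaction, item) triples."""
    ops = []
    for token in schedule.split():
        kind = token[0].upper()
        if kind == "C":
            ops.append((kind, int(token[1:]), None))
        else:
            txn, item = token[1:].rstrip(")").split("(")
            ops.append((kind, int(txn), item))
    return ops


def classify(schedule):
    """Return the strongest class the schedule satisfies."""
    ops = parse(schedule)
    commit_at = {t: i for i, (k, t, _) in enumerate(ops) if k == "C"}
    never = len(ops)                      # position used for "never commits"
    recoverable = cascadeless = strict = True
    last_writer = {}                      # item -> transaction that wrote it last
    for i, (kind, txn, item) in enumerate(ops):
        if kind == "C":
            continue
        writer = last_writer.get(item)
        # Is the item's last writer another transaction that is still uncommitted?
        if writer is not None and writer != txn and commit_at.get(writer, never) > i:
            strict = False                # read or write of uncommitted data
            if kind == "R":
                cascadeless = False       # dirty read
                if commit_at.get(txn, never) < commit_at.get(writer, never):
                    recoverable = False   # reader commits before the writer
        if kind == "W":
            last_writer[item] = txn
    if not recoverable:
        return "irrecoverable"
    if not cascadeless:
        return "recoverable"
    return "strict" if strict else "cascadeless"


first = "R1(x) R2(z) R1(z) R3(x) R3(y) W1(x) W3(y) R2(y) W2(y) C1 C2 C3"
second = "R1(x) R2(z) R3(x) R1(z) R2(y) R3(y) W1(x) C1 W2(z) W3(y) W2(y) C3 C2"
print(classify(first))   # irrecoverable
print(classify(second))  # cascadeless
```

The result "cascadeless" for the second schedule means recoverable and cascadeless but not strict, as in the solution above.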
Classification of schedule based on serializability
Some schedules are always considered correct when concurrent transactions are executing. Such
schedules are known as serializable schedules, i.e., the result of the concurrent schedule is equivalent to that of some
serial schedule.
Two types of schedule based on serializability
1) Conflict Serializable Schedule
2) View Serializable Schedule
Conflict Serializable Schedule
Conflict Pair: A pair of operations is called a conflict pair if
(1) At least one of them is a write operation
(2) Both access the same data item
(3) They belong to different transactions

(a) T1: R(A), T2: R(A)
        S1                      S2
T1        T2            T1        T2
R(A)                              R(A)
          R(A)          R(A)
S1 = S2
i.e., execution sequences S1 and S2 give the same result. So, it is not a conflict pair.

(b) T1: R(A), T2: W(A)
        S1                      S2
T1        T2            T1        T2
R(A)                              W(A)
          W(A)          R(A)
S1 ≠ S2
Conflict Pair
Because T1 reads the initial value of A in S1, whereas in S2 it reads the updated value. So, it is a conflict
pair.

(c) T1: W(A), T2: R(A)
        S1                      S2
T1        T2            T1        T2
W(A)                              R(A)
          R(A)          W(A)
S1 ≠ S2
Conflict Pair
Because T2 reads the updated value of A in S1, whereas in S2 it reads the initial value. So, it is a
conflict pair.

(d) T1: W(A), T2: W(A)
        S1                      S2
T1        T2            T1        T2
W(A)                              W(A)
          W(A)          W(A)
S1 ≠ S2
Because the final updated value of A in S1 and S2 is not the same. So, it is a conflict pair.
Conflict Equivalent Schedule: Two operations in a schedule are said to conflict when they access
the same database item, are part of different transactions, and are either both write_item operations
or one write_item and one read_item operation. The effects on
the database or on the transactions in the schedule may change if two conflicting operations are
applied in two schedules in different orders, in which case the schedules are not conflict
equivalent.
So, if S2 results from S1 by swapping the order of consecutive non-conflicting operations, then S1 and S2
are conflict equivalent schedules.
S1: R1(A) W1(A) R2(A) R2(B) R1(B) W1(B)
S2: R1(A) W1(A) R2(A) R1(B) R2(B) W1(B)

In the above example, R1(B) and R2(B) are a non-conflict pair, so S1 and S2 are conflict equivalent
schedules.

----------------------------  DATABASE MANAGEMENT SYSTEM  ----------------------------


Transaction and Concurrency control | 99

Conflict Serializable Schedule


A schedule is a conflict serializable schedule (CSS) only if it is conflict equivalent to some serial schedule.

  T1        T2
  R(A)
  W(A)
            R(A)
            R(B)
  R(B)
  W(B)

is not CSS because the schedule is not conflict equivalent to any serial schedule.

        S1                      S2
  T1        T2            T1        T2
  R(A)                    R(A)
  W(A)                    W(A)
            R(A)          R(B)
  R(B)                              R(A)
  W(B)                    W(B)
            R(B)                    R(B)

S1 ≡ S2: they are conflict equivalent schedules, and both are conflict serializable because the final output of the schedule is equivalent to the serial schedule T1 → T2.

Note: If two schedules are conflict equivalent, it does not mean that they are conflict serializable schedules.

Example: Find the conflict equivalent schedules.


S1: R2(A) W2(A) R3(C) W2(B) W3(A) W3(C) R1(A) R1(B) W1(A) W1(B)
S2: R3(C) R2(A) W2(A) W2(B) W3(A) R1(A) R1(B) W1(A) W1(B) W3(C)
S3: R2(A) R3(C) W3(A) W2(A) W2(B) W3(C) R1(A) R1(B) W1(A) W1(B)

  Step    S1        S2        S3
  1       R2(A)     R3(C)     R2(A)
  2       W2(A)     R2(A)     R3(C)
  3       R3(C)     W2(A)     W3(A)
  4       W2(B)     W2(B)     W2(A)
  5       W3(A)     W3(A)     W2(B)
  6       W3(C)     R1(A)     W3(C)
  7       R1(A)     R1(B)     R1(A)
  8       R1(B)     W1(A)     R1(B)
  9       W1(A)     W1(B)     W1(A)
  10      W1(B)     W3(C)     W1(B)

S1 and S2 are conflict equivalent schedules. S3 is not conflict equivalent to them because in S3 the order of the conflict pair (W2(A), W3(A)) is changed: W3(A) executes before W2(A).
Precedence Graph
A precedence graph is used to check whether a schedule is a conflict serializable schedule or not.
Let graph G = (V, E), where V = transactions of the schedule and
E = precedence between conflict pairs.
There is an edge Ti → Tj only if there exists a conflict pair in which the operation of Ti precedes the operation of Tj:
Ti: R(A) ... Tj: W(A) or
Ti: W(A) ... Tj: R(A) or
Ti: W(A) ... Tj: W(A)

Example: R2(A) W2(A) R3(C) W2(B) W3(A) W3(C) R1(A) R1(B) W1(A) W1(B)
Edge T2 → T1 because of the W2(B), R1(B) operations.
Edge T2 → T3 because of the W2(A), W3(A) operations.
Edge T3 → T1 because of the W3(A), R1(A) operations.
Precedence graph: T2 → T3, T3 → T1, T2 → T1
No cycle. So, the schedule is a conflict serializable schedule.

Example: R2(A) R3(C) W3(A) W2(A) W2(B) W3(C) R1(A) R1(B) W1(A) W1(B)
Edge T2 → T3 because of the R2(A), W3(A) operations.
Edge T3 → T2 because of the W3(A), W2(A) operations.
There is a cycle in the precedence graph. So, the schedule is not a conflict serializable schedule.


Testing Condition for conflict serializable schedule
(1) If the precedence graph G is acyclic, then the schedule is a conflict serializable schedule.
The equivalent serial schedule is a topological sequence of the acyclic precedence graph G.
(2) If the precedence graph G is cyclic, then the schedule is not a conflict serializable schedule.
Topological Sequence: a technique used to order the vertices of a graph; it exists only if G is acyclic.
Step-1: Visit a vertex V with indegree 0 and delete V (along with its outgoing edges) from G.
Step-2: Repeat Step-1 until G becomes empty.
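The test above can be sketched in a few lines: build the precedence edges from the conflict pairs, then repeatedly remove a vertex of indegree 0. This is a minimal sketch, assuming schedules are written as strings such as "R2(A) W2(A)"; the function names parse, precedence_edges and serial_order are hypothetical helpers for this illustration.

```python
import re


def parse(schedule):
    """Turn 'R2(A) W2(A)' into [('R', 2, 'A'), ('W', 2, 'A')]."""
    return [(k, int(t), x)
            for k, t, x in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]


def precedence_edges(schedule):
    """Edge (i, j) for every conflict pair where Ti's operation comes first."""
    ops = parse(schedule)
    edges = set()
    for i, (k1, t1, x1) in enumerate(ops):
        for k2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "W" in (k1, k2):
                edges.add((t1, t2))
    return edges


def serial_order(schedule):
    """One topological sequence of the precedence graph, or None if cyclic."""
    nodes = {t for _, t, _ in parse(schedule)}
    edges = precedence_edges(schedule)
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    order, remaining = [], set(nodes)
    while remaining:
        ready = sorted(n for n in remaining if indegree[n] == 0)
        if not ready:              # every remaining vertex has indegree > 0
            return None            # so the graph contains a cycle
        n = ready[0]
        order.append(n)
        remaining.remove(n)
        for u, v in edges:         # delete the outgoing edges of n
            if u == n:
                indegree[v] -= 1
    return order


s1 = "R2(A) W2(A) R3(C) W2(B) W3(A) W3(C) R1(A) R1(B) W1(A) W1(B)"
s3 = "R2(A) R3(C) W3(A) W2(A) W2(B) W3(C) R1(A) R1(B) W1(A) W1(B)"
print(serial_order(s1))  # [2, 3, 1], i.e. T2 -> T3 -> T1
print(serial_order(s3))  # None: not conflict serializable
```

When several vertices have indegree 0 at the same time, this sketch picks the smallest transaction number, so it reports only one of the possible topological sequences.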

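For a precedence graph with edges T2 → T3, T3 → T1 and T2 → T1, the topological sequence is
T2 → T3 → T1 (the equivalent serial schedule).

A graph can have more than one topological sequence, for example
T1 → T2 → T3 → T4 → T5
(OR)
T1 → T2 → T4 → T3 → T5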

Check whether the schedule is a conflict serializable schedule.

  T1        T2        T3
  R(A)
            W(A)
  W(A)
                      W(A)

The schedule is not a conflict serializable schedule because of the cycle in its precedence graph (T1 → T2 from R1(A), W2(A) and T2 → T1 from W2(A), W1(A)).
But it is equivalent to the serial schedule T1 → T2 → T3, i.e.,

        Given schedule                    Serial schedule
  T1        T2        T3            T1        T2        T3
  R(A)                              R(A)
            W(A)                    W(A)
  W(A)                                        W(A)
                      W(A)                              W(A)

Note: If a schedule is not a conflict serializable schedule (CSS), it does not mean that the schedule is always non-serializable.
Testing Condition
If the precedence graph G is acyclic, then the schedule is CSS and therefore serializable. An acyclic precedence graph is only a sufficient condition for serializability.
Otherwise, the schedule is not CSS and may (or) may not be serializable. A cyclic precedence graph does not by itself imply non-serializability.
So, CSS is sufficient but not necessary for serializability.
Note: If a schedule is serializable but not CSS, then it must contain a blind write.
View Serializable Schedule
View Equivalent: Two schedules S1 and S2 are view equivalent if the following 3 conditions hold.
(1) For each data-item x, if transaction Ti reads the initial value of x in schedule S1, then transaction Ti must also read the initial value of x in schedule S2.
(2) For each data-item x, if Ti reads a value of x written by Tj in S1, then it must also read the value of x written by Tj in S2.
(3) For each data-item x, the transaction (if any) that performs the final write on x in S1 must also perform the final write on x in S2.
View serializable schedule (VSS): A schedule is view serializable if it is view equivalent to some serial schedule.
Note: (1) Every CSS is VSS, but not vice versa.
(2) Any view serializable schedule that is not CSS contains a blind write.
Note: An initial read is a read of a data-item before any write operation has been performed on that data-item.
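The three view-equivalence conditions can be compared by recording, for every read, which transaction wrote the value being read (0 standing for the initial value), together with the final writer of each item. The sketch below is a simplification under stated assumptions: both schedules contain the same operations, schedules are strings such as "R1(A) W2(A)", and the helper names view_profile and view_equivalent are hypothetical.

```python
import re


def parse(schedule):
    """Turn 'R1(A) W2(A)' into [('R', 1, 'A'), ('W', 2, 'A')]."""
    return [(k, int(t), x)
            for k, t, x in re.findall(r"([RW])(\d+)\((\w+)\)", schedule)]


def view_profile(schedule):
    """Return (reads_from, final_writer) for a schedule.

    reads_from holds (reader, item, writer) triples; writer 0 means the
    read saw the initial value (conditions 1 and 2).
    final_writer maps each item to the transaction of its last write
    (condition 3).
    """
    last_writer = {}
    reads_from = []
    for kind, txn, item in parse(schedule):
        if kind == "R":
            reads_from.append((txn, item, last_writer.get(item, 0)))
        else:
            last_writer[item] = txn
    return sorted(reads_from), last_writer


def view_equivalent(s1, s2):
    return view_profile(s1) == view_profile(s2)


s = "R1(A) W2(A) W1(A) W3(A)"
print(view_equivalent(s, "R1(A) W1(A) W2(A) W3(A)"))  # serial T1, T2, T3
print(view_equivalent(s, "W2(A) R1(A) W1(A) W3(A)"))  # serial T2, T1, T3
```

The first comparison is True (same initial read by T1, same final write by T3), the second is False because T1 would read the value written by T2.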

  T1        T2        T3
  R(A)
            W(A)
  W(A)
                      W(A)

is not CSS but is VSS (view equivalent to the serial schedule T1 → T2 → T3), i.e.,

  T1        T2        T3
  R(A)
  W(A)
            W(A)
                      W(A)
  C1
            C2
                      C3

The schedule below is correct but not serializable. Because addition and subtraction operations are commutative, they can be applied in any order, i.e., it is possible to produce correct schedules that are not serializable.

  T1            T2
  R(X)          R(Y)
  X = X - 10    Y = Y - 20
  W(X)          W(Y)
  R(Y)          R(X)
  Y = Y + 10    X = X + 20
  W(Y)          W(X)

Let the schedule be
R1(X) W1(X) R2(Y) W2(Y) R1(Y) W1(Y) R2(X) W2(X)
This is a non-serializable schedule, but it is considered correct because the operations between each Ri(I) and Wi(I) are commutative, and the order in which the (read, update, write) sequences execute is not important as long as each (read, update, write) sequence by a particular transaction Ti on a particular item I is not interrupted by conflicting operations.
Concurrency Control Protocol
(1) Locking protocol
(2) Timestamp Protocol
Locking Protocol:
Lock: a variable used to identify the status of a data-item.
Locking Protocol: A locking protocol is a set of rules to be followed by each transaction such that, if transactions are interleaved, the net effect is identical to executing all transactions in some serial order.
It is the responsibility of the DBMS to allow only serializable, recoverable schedules, and the DBMS uses a locking protocol to achieve this.
(1) Shared-Exclusive lock
(a) Shared Lock (S): If a transaction T wants to perform only a read operation, then a shared-mode lock (denoted by S) is used.

  Example:  T1            Example:  T1
            S(A)                    S(A)
            R(A)                    R(A)
                                    W(A)   <- Not allowed

(b) Exclusive lock (X): If a transaction T wants to perform read/write operations, then an exclusive-mode lock (denoted by X) is used.

  Example:  T1
            X(A)
            R(A)
            W(A)

Lock Compatibility Table: tells whether the request of Tj is granted or denied if Ti already holds a lock on that data item.

  Request (Tj) / Hold (Ti)    Shared         Exclusive
  Shared                      Allowed        Not Allowed
  Exclusive                   Not Allowed    Not Allowed
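The compatibility table translates directly into a lookup. The sketch below is illustrative only; COMPATIBLE and can_grant are names assumed for this example.

```python
# (requested mode, held mode) -> is the request allowed?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested, held_modes):
    """Grant a lock only if it is compatible with every lock already held
    on the same data item by other transactions."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)


print(can_grant("S", ["S", "S"]))  # shared with shared: allowed
print(can_grant("X", ["S"]))       # exclusive while shared is held: denied
print(can_grant("X", []))          # nobody holds a lock: allowed
```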


  T1        T2
  X(A)
  R(A)
  W(A)
  U(A)
            S(A)
            R(A)
            U(A)
            S(B)
            R(B)
            U(B)
  X(B)
  R(B)
  W(B)
  U(B)

The above schedule is allowed to execute under S/X locking.

Note: The given schedule is a non-serializable schedule but is still allowed to execute under shared-exclusive locking.
So, S/X locking alone is not sufficient for serializability.
Two-phase locking protocol (2PL):
A transaction T is allowed to request a lock on a data-item only if T has not yet unlocked any data-item. It has 2 phases: a) Locking phase/Growing phase b) Unlocking phase/Shrinking phase

  Growing (OR) Locking phase:      X(A), S(B), S(C), X(D)
  Lock point:                      position after the last lock (OR) position before the first unlock
  Shrinking (OR) Unlocking phase:  U(C), U(A), U(B), U(D)
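Whether one transaction obeys the two-phase rule can be checked from its own sequence of actions: no lock request may follow an unlock. This is a minimal sketch, assuming actions are written as strings such as "X(A)", "S(B)", "U(A)"; is_two_phase and lock_point are hypothetical helper names.

```python
def is_two_phase(actions):
    """True if no S/X lock request appears after the first unlock."""
    unlocked = False
    for action in actions:
        if action.startswith("U"):
            unlocked = True                  # shrinking phase has started
        elif action[0] in "SX" and unlocked:
            return False                     # lock requested after an unlock
    return True


def lock_point(actions):
    """Index of the last lock request (the lock point), or None."""
    positions = [i for i, a in enumerate(actions) if a[0] in "SX"]
    return positions[-1] if positions else None


t = ["X(A)", "S(B)", "S(C)", "X(D)", "U(C)", "U(A)", "U(B)", "U(D)"]
print(is_two_phase(t), lock_point(t))

# Locks A, unlocks A, then locks B: allowed by plain S/X locking, not by 2PL
t1 = ["X(A)", "R(A)", "W(A)", "U(A)", "X(B)", "R(B)", "W(B)", "U(B)"]
print(is_two_phase(t1))
```

Read and write actions such as "R(A)" are ignored by the check, since only lock and unlock requests decide the phase.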


  T1        T2
  X(A)
  R(A)
  W(A)
  X(B)
  U(A)
            S(A)
            R(A)
            S(B)   <- request not allowed
            R(B)
  R(B)
  W(B)

In the above example, under the 2PL protocol, the request S(B) of T2 is denied because T1 already holds X(B) on data item B.
Note: 2PL does not allow a non-serializable schedule to execute (the above schedule is non-serializable).

  T1                      T2
  X(A)
  R(A)
  W(A)
  X(B)   <- T1 lock point
  U(A)
                          S(A)
                          R(A)
  R(B)
  W(B)
  U(B)
                          S(B)   <- T2 lock point
                          R(B)
                          U(A)
                          U(B)

Note: 1) If a schedule is allowed by 2PL, then it is always a conflict serializable schedule (so 2PL always ensures serializability).
2) No non-serializable schedule is allowed to execute under 2PL.
3) A CSS may (OR) may not be allowed by 2PL.
4) 2PL is always free from non-serializability.
5) If a schedule is allowed by 2PL, then the equivalent serial schedule is based on the order of lock points.

Example: if the lock points of T1, T2 and T3 occur in the order T2, T3, T1, then the equivalent serial schedule is T2 → T3 → T1.

Limitation of 2PL
(1) Not free from irrecoverable schedules.

  T1        T2
  X(A)
  W(A)
  X(B)
  U(A)
            S(A)
            R(A)
  W(B)
  U(B)
            S(B)
            R(B)
            Commit
  Commit

The above schedule is allowed by 2PL, which ensures serializability (T1 → T2).
But it is not a recoverable schedule.

(2) Not free from deadlock:

  T1                  T2
  X(A)
  W(A)
                      S(B)
                      R(B)
  X(B)   <- denied
  W(B)
                      S(A)   <- denied
                      R(A)

(3) Not free from starvation:

  T1                          T2        T3        T4
                              S(A)
  X(A)   <- denied, so wait
                                        S(A)
                              U(A)
  X(A)   <- denied, so wait
                                                  S(A)
                                        U(A)
  X(A)   <- denied, so wait

Strict 2PL: 2PL + a transaction T does not release any of its exclusive (write) locks until after it commits or aborts. Hence, no other transaction can read or write an item that is written by T until T has committed, leading to a strict schedule for recoverability.
Advantage: Always ensures serializability and strict recoverability.
Disadvantage: Not free from deadlock and starvation.

  T1        T2
  X(A)
  R(A)
  W(A)
  C/R
  U(A)
            S(A) / X(A)

Rigorous 2PL: Basic 2PL + a transaction T does not release any of its locks (exclusive or shared) until after it commits or aborts.
Advantage: Results in strict recoverable and serializable schedules.
* The equivalent serial schedule is based on the order of commits. For example, if T3 commits first, then T1, then T2, the equivalent serial schedule is T3 → T1 → T2.
Disadvantage: Not free from deadlock and starvation, but easier to implement than strict 2PL.
Conservative 2PL (or) Static 2PL:
* 2PL + all locks must be acquired before the transaction begins, by predeclaring its read-set and write-set.
Disadvantage: Less concurrency; it may be difficult to identify the required locks in advance.
May not be free from starvation.
Advantage: Deadlock free.
(2) Timestamp Ordering Protocol
Note: The timestamp ordering protocol is always free from deadlock.
Timestamp Value: A unique value assigned by the DBMS to every transaction, in ascending order.

  Transaction    T1    T2    T3    T4
  Timestamp      10    20    30    40

T1: oldest transaction
T4: youngest transaction
Read-Timestamp RTS(A): The highest timestamp among the transactions that have performed a read(A) operation successfully.
Example: with the timestamps above, as successive R(A) operations succeed, RTS(A) takes the values 0, 10, 30.

Write-Timestamp WTS(A): The highest timestamp among the transactions that have performed a write(A) operation successfully.
Example: with the timestamps above, as successive W(A) operations succeed, WTS(A) takes the values 0, 10, 30, 40.

Note: In the timestamp ordering protocol, concurrent execution of a schedule S is equivalent to the serial schedule based on timestamp order.
Basic Time-stamp Ordering
Possible Rollback (TS(T1) = 10, TS(T2) = 20):

  (1)  T1        T2        (2)  T1        T2        (3)  T1        T2
                 W(A)                     R(A)                     W(A)
       R(A)                     W(A)                     W(A)

(1) If the younger transaction updates data-item A, then the older transaction is not allowed to read it. So, roll back the older transaction.
(2) If the younger transaction reads data-item A, then the older transaction is not allowed to write it. So, roll back the older transaction.
(3) If the younger transaction updates data-item A, then the older transaction is not allowed to write it. So, roll back the older transaction.

Case-1: T1 issues R(A):
If WTS(A) > TS(T1), then roll back T1; otherwise, allow T1 to execute R(A) and set RTS(A) = Max(TS(T1), RTS(A)).
Case-2: T1 issues W(A):
If RTS(A) > TS(T1), then roll back T1.
If WTS(A) > TS(T1), then roll back T1.
Otherwise, allow T1 to execute W(A) and set WTS(A) = TS(T1).
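Case-1 and Case-2 can be sketched as a small class that keeps RTS and WTS per data item. This is only an illustration of the two rules; the class name TimestampOrdering and the Rollback exception are assumptions made for this example, and timestamps start from 0 as in the tables above.

```python
class Rollback(Exception):
    """Raised when the requesting transaction must be rolled back."""


class TimestampOrdering:
    def __init__(self):
        self.rts = {}  # data item -> read timestamp, default 0
        self.wts = {}  # data item -> write timestamp, default 0

    def read(self, ts, item):
        # Case-1: a younger transaction has already written the item
        if self.wts.get(item, 0) > ts:
            raise Rollback(f"R({item}) by TS {ts}")
        self.rts[item] = max(self.rts.get(item, 0), ts)

    def write(self, ts, item):
        # Case-2: a younger transaction has already read or written the item
        if self.rts.get(item, 0) > ts or self.wts.get(item, 0) > ts:
            raise Rollback(f"W({item}) by TS {ts}")
        self.wts[item] = ts


# Rollback situation (1): younger T2 (TS 20) writes A, then older T1 (TS 10) reads A
to = TimestampOrdering()
to.write(20, "A")
try:
    to.read(10, "A")
except Rollback as exc:
    print("rollback:", exc)
```

Operations issued in timestamp order, for example read(10, "A") followed by write(20, "A"), pass both checks.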
Thomas’s Write Time Stamp Protocol:
(TS(T1) = 10, TS(T2) = 20)

  (1)  T1        T2        (2)  T1        T2        (3)  T1        T2
                 W(A)                     R(A)                     W(A)
       R(A)                     W(A)                     W(A)

(1) If the younger transaction updates data-item A, then the older transaction is not allowed to read it. So, roll back the older transaction.
(2) If the younger transaction reads data-item A, then the older transaction is not allowed to write it. So, roll back the older transaction.
(3) If the younger transaction updates data-item A, then the older transaction is not allowed to write it. So, ignore the write operation of the older transaction and continue the transaction.

1. It always ensures serializability.
2. Not free from irrecoverability.
3. Not free from starvation.
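The only difference from the basic protocol is the handling of an outdated write. A minimal sketch follows, assuming RTS and WTS are kept in plain dictionaries with a default of 0; the function name thomas_write and the returned strings are hypothetical.

```python
def thomas_write(ts, item, rts, wts):
    """Decide what happens to W(item) issued by a transaction with timestamp ts."""
    if rts.get(item, 0) > ts:
        return "rollback"   # a younger transaction has already read the item
    if wts.get(item, 0) > ts:
        return "ignore"     # outdated write: skip it and continue the transaction
    wts[item] = ts
    return "write"


rts, wts = {}, {}
print(thomas_write(20, "A", rts, wts))  # younger T2 writes A
print(thomas_write(10, "A", rts, wts))  # older T1 writes A afterwards: ignored
```

The second call returns "ignore" and leaves WTS(A) at 20, whereas the basic protocol would roll T1 back.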

  T1 (10)   T2 (20)
  W(A)
            R(A)
            C
  C

The above schedule is allowed by
* the basic timestamp ordering protocol, and
* Thomas's write timestamp ordering protocol,
but it is not a recoverable schedule.
To avoid the recoverability problem, we use the strict timestamp ordering protocol.
Strict Timestamp Ordering Protocol: To achieve recoverability, follow the rules of strict recoverability, i.e.,
Basic TS protocol (OR) Thomas's write TS protocol + strict recoverability
* It always ensures serializability and recoverability and is deadlock free, but it is not free from starvation.
Deadlock Prevention Protocol
Timestamp values are used to prevent deadlock in locking protocols.
Dependency Graph:

  T1        T2
  X(A)
            S(A)   <- request

i.e., T2 depends on T1 (edge T2 → T1).

(1) WAIT-DIE Protocol:
(a) Let Ti and Tj be two transactions in a schedule with TS(Ti) < TS(Tj), i.e., Ti is older and Tj is younger, and
(b) (i) If Ti requires a resource that is held by Tj (Ti → Tj), then Ti is allowed to wait.
(ii) If Tj requires a resource that is held by Ti (Tj → Ti), then roll back Tj and restart it with the same TS value.

(2) WOUND-WAIT Protocol:
(a) Let Ti and Tj be two transactions in a schedule with TS(Ti) < TS(Tj), i.e., Ti is older and Tj is younger, and
(b) (i) If transaction Ti depends on Tj (Ti → Tj), then roll back Tj (restart it with the same timestamp value).
(ii) If transaction Tj depends on Ti (Tj → Ti), then Tj is allowed to wait.
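Both protocols are decided purely by comparing the timestamp of the requesting transaction with that of the lock holder. The sketch below illustrates that comparison; the function names and the returned strings are assumptions made for this example.

```python
def wait_die(ts_requester, ts_holder):
    """Older requester waits; younger requester is rolled back (dies)."""
    return "wait" if ts_requester < ts_holder else "rollback requester"


def wound_wait(ts_requester, ts_holder):
    """Older requester rolls back (wounds) the holder; younger requester waits."""
    return "rollback holder" if ts_requester < ts_holder else "wait"


# TS(Ti) = 10 (older), TS(Tj) = 20 (younger)
print(wait_die(10, 20), "|", wait_die(20, 10))
print(wound_wait(10, 20), "|", wound_wait(20, 10))
```

In both protocols it is always the younger transaction that is rolled back, and it restarts with its original timestamp.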

PRACTICE QUESTIONS
1. For the below transaction, which of the following is correct?

T1 T2
R(A)
R(A)
W(A)
W(B) W(A)
W(A)

(A) Schedule is view serializable
(B) Schedule is not view serializable but conflict serializable
(C) Schedule is view serializable but not conflict serializable
(D) none
Sol. (C)
The precedence graph of the schedule contains a cycle. So, the schedule is not a conflict serializable schedule. But the initial reads and final updates are the same as in the serial schedule T2 → T1 → T3. So, the schedule is view serializable.
2. Which of the following scenario may lead to an irrecoverable error in database system?
(A) A transaction writes a data-item after it is read by an uncommitted transaction
(B) A transaction reads a data item after it is written by an uncommitted transaction
(C) A transaction reads a data item after it is read by an uncommitted transaction
(D) A transaction reads a data item after it is written by a committed transaction
Sol. (B)
An irrecoverable schedule looks like

  T1          T2
  W(A)
              R(A)
              C
  Rollback
3. Consider the following schedule
S: R1 (A) R2(A) W2 (A) W1 (A) C2 C1
(A) Schedule is irrecoverable
(B) Schedule is recoverable but not cascadeless
(C) Schedule is cascadeless but not strict
(D) Schedule is strict
Sol. (C)
Given schedule is

  T1        T2
  R(A)
            R(A)
            W(A)
  W(A)
            C
  C

The given schedule is cascadeless but not a strict recoverable schedule.


4. Consider the following schedule:
S: R1(A) R2 (B) R3(B) W1(A) W3(A) R2(C) W1(A)
Assume that the timestamps of the three transactions T1, T2, T3 are {30, 10, 20} respectively. Which
of the following is TRUE?
(A) The schedule is allowed under the basic timestamp protocol
(B) The schedule is allowed under Thomas's write timestamp protocol
(C) The schedule is allowed under neither the basic timestamp protocol nor Thomas's write protocol
(D) none

Sol. (B)
In Thomas's write timestamp protocol, if a younger transaction writes the data item before an older transaction updates it, then the write operation of the older transaction is ignored.

  T1 (30)   T2 (10)   T3 (20)
  R(A)
            R(B)
                      R(B)
  W(A)
                      W(A)
            R(C)
  W(A)
5. Consider the following schedule
S: R1(X) R2(X) R1(X) C1 W3(X) C2 R3 (X) C3
(A) Schedule is not view serializable but recoverable schedule
(B) Schedule is not conflict serializable but strict recoverable schedule
(C) Schedule is conflict serializable and strict recoverable schedule
(D) Schedule is conflict serializable but not strict recoverable schedule
Sol. (C)
  T1        T2        T3
  R(X)
            R(X)
  R(X)
  C
                      W(X)
            C
                      R(X)
                      C

So, the precedence graph has the edges T1 → T3 and T2 → T3 and no cycle.
So, the schedule is conflict serializable and strict recoverable, because no transaction reads or writes a data item written by an uncommitted transaction.
6. Consider the following schedule

  T1        T2
            R(A)
            W(A)
  R(A)
  W(A)
  C
            Rollback

Schedule is
(A) Recoverable (B) Cascadeless
(C) Irrecoverable (D) Strict recoverable
Sol. (C)
As T1 reads a data item which is updated by T2 and T1 commits first, recovery is not possible.
5. Consider the following schedule
S: R1(x) R2(z) R1(z) R3(y) R4(x), W1(y) W3(y) R2 (y) W2(z) W2(y) then
(A) S is conflict serializable with T1  T2  T3
(B) S is conflict serializable with T2  T3  T1
(C) S is conflict serializable with T2  T1  T3
(D) Not conflict serializable
Sol. (D)
6. Match List-I with List-II
  List – I              List – II
  (a) W-R conflict      1) Unrepeatable read
  (b) R-W conflict      2) Phantom problem
  (c) W-W conflict      3) Lost update
                        4) Reading uncommitted data

         a    b    c
  (A)    4    1    3
  (B)    4    2    3
  (C)    2    4    3
  (D)    2    1    4
Sol. (A)
W-R conflict - Reading uncommitted data
R-W conflict - Unrepeatable read
W-W conflict - Overwriting uncommitted data
7. Which of the following is correct?
(A) A schedule will be irrecoverable if a transaction reads a data item and commit after it
is written by an uncommitted transaction
(B) A schedule will be irrecoverable if a transaction writes a data item and commit after
it is read by an uncommitted transaction
(C) Both of these
(D) none of these
Sol. (A)
Since the data item is written by an uncommitted transaction, there is a chance that this transaction might abort. Hence, if another transaction reads the data item and commits, the schedule will be irrecoverable.
8. Total number of serial schedule possible with 5 transaction is_________
Min : 120.0
Max : 120.0
Answer 120
Sol. n! serial schedules are possible with n transactions. So, 5! = 120
9. Consider the following statement
(1) All serial schedules are always consistent schedules
(2) All concurrent schedules are always consistent schedules
Which of the following is FALSE?
(A) 1 only (B) 2 only (C) Both (D) none
Sol. (B)
A concurrent schedule may be an inconsistent schedule


  T1        T2
  R(A)
  W(A)
            R(A)
            R(B)
  R(B)
  W(B)

is an inconsistent schedule
10. Match the following
Table – I
(1) Atomicity
(2) Isolation
(3) Consistency
Table – II
(a) Concurrent execution of two or more transactions should be equal to any serial schedule
(b) Execute all operations or none
(c) Before and after execution of a transaction, the DB should be in a consistent state
(A) 1 – a, 2 – b, 3 – c (B) 1 – c, 2 – a, 3 – b
(C) 1 – b, 2 – a, 3 – c (D) 1 – b, 2 – c, 3 – a
Sol. (C)
Atomicity means execute all operations or none. Isolation means concurrent execution of two or more transactions should be equal to any serial schedule. Consistency means that before and after execution of a transaction, the DB should be in a consistent state.
11. Consider the following schedule
R3(A) R1(A) W3(A) R2(A) W1(B) R2(B) W2(A) C3 C1C2
Which of the following is True?
(A) Schedule is not recoverable
(B) Schedule is recoverable but not cascadeless
(C) Schedule is recoverable and cascadeless
(D) Schedule is strict recoverable

Sol. (B)

  T1        T2        T3
                      R(A)
  R(A)
                      W(A)
            R(A)
  W(B)
            R(B)
            W(A)
                      C3
  C1
            C2

The schedule is recoverable but not cascadeless because of the W3(A), R2(A) operations: T2 reads A written by T3 before T3 commits.


12. Consider the following schedule S1 and S2
S1: R1(A) W1(A) R2(A) R2(B) R1(B) W1(B)
S2: R1(A) W1(A) R2(A) R1(B) R2(B) W1(B)
I) Both S1 and S2 are conflict equivalent schedules
II) S1 and S2 are not conflict serializable schedules.
Which of the following statements is True?
(A) I only (B) II only
(C) Both (D) none
Sol. (C)
S2 results after swapping the order of a consecutive non-conflict pair of S1. So, S1 and S2 are conflict equivalent schedules. But neither schedule is conflict equivalent to any serial schedule.
13. Consider the following schedule
S1 : R2(x) W2(x) R3(z) W2(y) W3(x) W3(z) R1(x) R1(y) W1(x) W1(y)
S2: R2(x) R3(z) W3(x) W2(x) W2(y) W3(z) R1(x) R1(y) W1(x) W1(y)
(1) S1 is a conflict serializable schedule but not S2
(2) S2 is a conflict serializable schedule but not S1
(3) S1 and S2 both are conflict serializable schedules
(4) None of the schedules is a conflict serializable schedule

Sol. (A)
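The precedence graph of S1 has the edges T2 → T3, T3 → T1 and T2 → T1 (no cycle).

The precedence graph of S2 has the edges T2 → T3 and T3 → T2 (cycle).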
14. Consider the following statement


(1) Every conflict serializable schedule is a view serializable schedule
(2) Any view serializable schedule that is not a conflict serializable schedule always contains a blind write operation
Which of the following statements are true?
(A) 1 only (B) 2 only
(C) both (D) None
Sol. (C)
Every conflict serializable schedule is view serializable, but a view serializable schedule may or may not be a conflict serializable schedule. Any view serializable schedule which is not a conflict serializable schedule always contains a blind write operation.

  T1        T2        T3
  R(A)
            W(A)
  W(A)
                      W(A)

15. Let T1 and T2 be two transactions with timestamp values TS(T1) and TS(T2). Let TS(T1) < TS(T2), and let T1 require a resource which is held by T2. Then, according to the wait-die protocol, which of the following statements is true?
(A) T2 is allowed to wait (B) T2 is rolled back
(C) T1 is allowed to wait (D) T1 is rolled back
Sol. (C)
In the wait-die protocol, when an older transaction requires a resource which is held by a younger transaction, the older transaction is allowed to wait.

16. Consider the following precedence graph of a schedule

Total number of topological sequences possible is
Min : 2.0
Max : 2.0
Sol. 2
1) T1 → T2 → T3 → T4 → T5
2) T1 → T2 → T4 → T3 → T5

17. Consider the following schedule is

  T1        T2        T3
  R(A)
            W(A)
  W(A)
                      W(A)

Which of the following statements is true regarding the schedule?
(A) S is a conflict serializable schedule
(B) S is a view serializable schedule
(C) S is a conflict serializable and view serializable schedule
(D) S is neither a conflict serializable nor a view serializable schedule
Sol. (B)
The given S is not a conflict serializable schedule because of the R1(A) - W2(A) conflict and the W2(A) - W1(A) conflict.
But it is a view serializable schedule, equivalent to the serial schedule
T1 → T2 → T3
18. Consider the following schedule
S: R1(x) W2(x) W1(x) R3(x) C1 C2 C3
Given Schedule is
(A) Recoverable, cascadeless
(B) Recoverable but not cascadless

(C) Not recoverable
(D) Strict Recoverable schedule
Sol. (B)

  T1        T2        T3
  R(x)
            W(x)
  W(x)
                      R(x)
  C
            C
                      C

T3 reads the data item x, written by T1, before the commit of T1.


19. Consider the following schedule S1 and S2
S1: R1(A) R2(B) R3(C) W1(B) C1 W2(C) C2 W3 (D) C3
S2: R1(A) R2(B) R3 (C) W1(B) W2(C) W3(A) C1C2C3
Which of the following schedule is allowed by strict 2PL protocol?
(A) Only S1 (B) Only S2
(C) Both (D) none
Sol. (D)

T1 T2 T3 T1
S1: R(A) S2 : R(A)

R(B) R(B)

R(C) R(C)

W(B)
W(C) W(B)

W(A) C1

C1 W(C)

C2 C2

C3 W(D)
C3

Both schedule S1 and S2 are not allowed in strict 2PL. Because in strict 2PL, all exclusive lock hold
until commit.

20. Consider the following statement
(a) Every strict schedule is recoverable
(b) Every recoverable schedule is cascadeless
(c) Every cascadeless schedule is recoverable
(d) Every strict schedule is not cascadeless
Which of the following statements are incorrect?
(A) a, b (B) b, d
(C) b, c (D) c, d
Sol. (B)
Every recoverable schedule need not be cascadeless, but every cascadeless schedule is recoverable. Every strict schedule is also cascadeless.
21. Let transaction T1 have 4 operations and transaction T2 have 2 operations. Then the total number of non-serial concurrent schedules possible is________
Min : 13.0
Max : 13.0
Sol. 13
6C4 * 2C2 = 15 total schedules are possible, out of which 2 are serial schedules. So, 15 - 2 = 13.
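The counting in this solution (and the n! count of serial schedules in question 8) can be reproduced with the standard library. This is a small illustrative sketch; the function names are assumptions made for this example.

```python
import math


def serial_schedules(n):
    """Number of serial schedules of n transactions: n!"""
    return math.factorial(n)


def interleavings(n1, n2):
    """Schedules of two transactions with n1 and n2 operations that keep
    each transaction's own operation order: choose the n1 slots for T1."""
    return math.comb(n1 + n2, n1)


def non_serial(n1, n2):
    """All interleavings minus the 2 serial schedules (T1 T2 and T2 T1)."""
    return interleavings(n1, n2) - serial_schedules(2)


print(serial_schedules(5))   # 120
print(interleavings(4, 2))   # 15
print(non_serial(4, 2))      # 13
```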
22. Consider the following schedule S:
  T1        T2
  R(A)
  W(A)
            R(A)
            R(B)
  R(B)
  W(B)
  C1
            C2

Which of the following is TRUE?


(A) S is recoverable and consistent
(B) S is not recoverable but consistent
(C) S is recoverable but inconsistent
(D) S is neither recoverable nor consistent
Sol. (C)
S is a recoverable schedule but not a consistent schedule.
S is recoverable because T1 commits before T2.
S is not consistent because T2 displays the output before the update.
23. Consider the following schedule S

  T1        T2
  R(A)
  W(A)
            R(A)
            R(B)
  R(B)
  W(B)

Which of the following statements is True?
(A) S is serializable
(B) S is non-serializable but allowed to execute by the shared-exclusive lock protocol
Sol. (B)
S is non-serializable but allowed to execute by shared-exclusive locking, i.e.

  T1        T2
  X(A)
  R(A)
  W(A)
  U(A)
            S(A)
            R(A)
            U(A)
            S(B)
            R(B)
            U(B)
  X(B)
  R(B)
  W(B)
  U(B)

24. Consider the following Statement
1) If a schedule is allowed by 2PL, then it is always a conflict serializable schedule.
2) A non-serializable schedule is not allowed to execute by 2PL.
Number of statements that are true is___
Sol. (B)
2PL always guarantees a conflict serializable schedule, so no non-serializable schedule is allowed to execute by 2PL.
25. Consider the following statement
(1) In the basic timestamp ordering protocol, if a younger transaction updates a data item, then an older transaction is not allowed to read the same data item
(2) In the basic timestamp ordering protocol, if a younger transaction reads a data item, then an older transaction is not allowed to write the same data item
Number of statements that are incorrect is_______
Min : 0.0
Max : 0.0
Sol. 0
Both statements are correct w.r.t. the basic timestamp ordering protocol

26. Consider the following statement


(a) The timestamp ordering protocol is free from deadlock.
(b) There exist schedules that are possible under 2PL but not possible under the timestamp protocol, and vice versa.
Number of statements that are correct is__________
Sol. (B)
Both statements are correct.
The timestamp ordering protocol is always free from deadlock.
27. Let T1 and T2 be two transactions with timestamp values TS(T1) < TS(T2). If transaction T1 depends on T2, which of the following is true in the wound-wait protocol?
(A) T1 is rolled back (B) T2 is rolled back
(C) T1 waits (D) T2 waits
Sol. (B)
If T1 → T2, then T2 should be rolled back in the wound-wait protocol.
EXERCISE QUESTIONS
1. Consider the following schedule S.
S: R1 (X); R2 (Z); R1 (Z); R3 (X); R3 (Y); W1 (X); W3 (Y); R2 (Y); W2 (Z);
In the precedence graph of S, the number of edges (both incoming and outgoing) of nodes T1
and T2, respectively are
A. 1, 1 B. 1, 2 C. 2, 1 D. 2, 2
Ans. D
2. Which of the following serializations of transactions results in a schedule that is conflict
equivalent to the schedule given in question 1?
A. T1, T2, T3 B. T3, T2, T1
C. T1, T3, T2 D. T3, T1, T2
Ans. D
3. Consider the following schedules.
S1: R1 (X) R1(Y) R2 (X) R2 (Y) W2 (Y) W1 (X)
S2: R1 (X) R2(X) R2 (Y) W2 (Y) R1 (Y) W1 (X)
Which of the above schedules are conflict-serializable?
A. Only S1 B. Only S2 C. Both D. None
Ans. B
4. There exists a schedule that is view serializable but not conflict serializable
A. True
B. False
Ans. A
State whether the statements given in questions 5-6 are True or False about the Two Phase Locking (2PL) protocol.
5. If all transactions follow 2PL protocol, the resulting schedules will always be serial schedules

A. True
B. False
Ans. B
6. If all transactions follow 2PL protocol, deadlocks can be avoided
A. True
B. False
Ans. B

7. Consider the following schedule S.
S: r1 (x); r2 (z); r1 (z) r3 (x); r3 (y); w1 (x); c1 ; w3 (y); c3 ; r2 (y); w2 (z); w2(y);c2
Here ri (a) denotes transaction i reads item a wi (a) denotes transaction i writes data item a, ci
denotes that transaction i is committed. The schedule S is
A. cascadeless and strict
B. cascadeless and not strict
C. not strict; cascadeless is irrelevant
D. not cascadeless and not strict
Ans. A
8. Consider the following schedule S.
S: r1 (x); r2 (z); r1 (z); r3 (x); r3 (y); w1 (x); w3 (y); r2 (y); w2 (z); w2 (y);
Here ri (a) denotes transaction i reads item a and wi (a) denotes transaction i writes data item a.
The commit operations of the transactions can be added at the end of S in an appropriate order
such that S is recoverable. The number of such orders is
A. 1 B. 2 C. 3 D. 4
Ans. C

CHAPTER-5
INDEXING

5. Indexing ……………………………………................................................127-139

• Categories of Indexing……………………………………………..………………. 129


• B+-Tree ………………..……………………………………………………….……133
5 INDEXING

INDEXING
An index is a data structure that organizes data records on disk to optimize certain kinds of retrieval
operations. An index allows us to efficiently retrieve all records that satisfy search conditions on the
search key fields of the index.

A DB file is divided into blocks. Blocks contain the records.

I/O Cost: Number of block transfers from secondary memory to main memory in order to access the DB.
Records can be stored in blocks in 2 ways:
(1) Spanned (2) Unspanned
(1) Spanned Organization
A single record may span multiple blocks. In the figure below, record R3 spans 2 blocks, B1 and B2.

Let Block size = 100B
Record size = 40B
Block factor = number of records per block = Block size / Record size = 100 / 40 = 2.5

* No memory wastage
* I/O cost is more, because to access record R3, 2 blocks need to be transferred from secondary memory to main memory.
(2) Unspanned organization:
Records are not allowed to span blocks.
* An entire record belongs to a single block.

Block size = 100B
Record size = 40B
Block factor = ⌊100 / 40⌋ = 2
* Wastage of memory in each block
* I/O cost is less, as each record is stored in a single block.
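The two block-factor calculations differ only in whether the division is rounded down. A minimal sketch follows; the function names and the 5-record file used in the usage lines are illustrative assumptions.

```python
import math


def block_factor(block_size, record_size, spanned):
    """Records per block: exact ratio if spanned, floor of it if unspanned."""
    if spanned:
        return block_size / record_size
    return block_size // record_size


def blocks_needed(num_records, block_size, record_size, spanned):
    """Number of blocks required to store num_records records."""
    if spanned:
        return math.ceil(num_records * record_size / block_size)
    return math.ceil(num_records / (block_size // record_size))


print(block_factor(100, 40, spanned=True))    # 2.5
print(block_factor(100, 40, spanned=False))   # 2
print(blocks_needed(5, 100, 40, spanned=True))
print(blocks_needed(5, 100, 40, spanned=False))
```

For 5 records of 40B, the spanned layout needs 2 blocks (200B exactly), while the unspanned layout needs 3 blocks because 20B is wasted in every block.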

Let the relational schema be Emp(Eid, Ename, Esalary). Retrieve the record from the Emp table whose Eid is 'y'.

Let the data be physically stored in order of Eid (ordered file), so the search key is Eid.

Select *
From Emp
Where Eid = 'y'

* If the file is not ordered, the I/O cost is O(n) using linear search, where n is the number of DB file blocks.
* If the file is ordered, the I/O cost is ⌈log2 n⌉ using binary search.

To reduce the I/O cost, we use indexing.

• In indexing, we create an index file. The index file is also divided into blocks. Each index
entry has 2 attributes:
a) Search key b) Pointer
Some important points relating the index file and the DB file:
(1) Size of index file block = Size of DB file block
(2) Entry size of index file = Size of search key + Pointer size
(3) Entry size of index file (size of each record in the index) << Size of a DB record, as an
index entry consists of 2 attributes only.
(4) Block factor of index file (number of entries in each block) >> Block factor of DB file
(5) Number of index file blocks (M) << Number of DB file blocks (N)

Worst-case I/O cost with an index file = ⌈log2 M⌉ + 1
where ⌈log2 M⌉ is the cost of a binary search over the ordered index blocks and the 1 is the
access to the DB block.

So, ⌈log2 M⌉ + 1 < ⌈log2 N⌉ (when the DB file is ordered)
or
⌈log2 M⌉ + 1 << N (when the DB file is unordered)
Categories of Indexing:
(1) Dense Index: A dense index has an index entry for every search key value (and hence every
record) in the data file, i.e., a 1-1 mapping between index entries and DB records.
• For every DB record, there exists a corresponding entry in the index file.

So, Number of index entries = Number of DB records

(2) Sparse Index: A sparse (or nondense) index, on the other hand, has index entries for only some
of the search key values. A sparse index has fewer entries than the number of records in the file.
For every DB block, there exists one entry in the index file.

So, Number of index entries = Number of DB blocks
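To make the difference concrete, the sketch below builds both kinds of index over a small ordered file held in memory. It is an illustration under assumptions, not a disk-level implementation: blocks are Python lists, a "pointer" is just a block number, the sparse index uses the first key of each block as its anchor, and all function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_blocks(keys, records_per_block):
    """Store the keys in sorted order, records_per_block keys to a block."""
    keys = sorted(keys)
    return [keys[i:i + records_per_block]
            for i in range(0, len(keys), records_per_block)]


def dense_index(blocks):
    """One (key, block number) entry for every record."""
    return [(key, b) for b, block in enumerate(blocks) for key in block]


def sparse_index(blocks):
    """One (first key of block, block number) entry for every block."""
    return [(block[0], b) for b, block in enumerate(blocks)]


def lookup_dense(index, key):
    """Binary search the index; the entry itself says whether the key exists."""
    keys = [k for k, _ in index]
    pos = bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return index[pos][1]
    return None


def lookup_sparse(index, blocks, key):
    """Find the last anchor <= key, then search inside that block."""
    anchors = [k for k, _ in index]
    pos = bisect_right(anchors, key) - 1
    if pos < 0:
        return None
    block_no = index[pos][1]
    return block_no if key in blocks[block_no] else None


blocks = build_blocks(range(10, 130, 10), records_per_block=3)
dense, sparse = dense_index(blocks), sparse_index(blocks)
print(len(dense), len(sparse))            # entries in each index
print(lookup_dense(dense, 80), lookup_sparse(sparse, blocks, 80))
```

The dense index can answer "does this key exist?" from the index alone, whereas the sparse index must always read the candidate DB block.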

Consider the following data:

Block size = 1000B, Record size = 100B, Search key size = 12B, Pointer size = 8B, and 10,000
records in the DB. How many index blocks are required using a dense index and a sparse index?
What is the I/O cost in each case?

Sol. DB block factor = 1000 / 100 = 10 records/block

Number of DB blocks = Number of records / Block factor = 10000 / 10 = 1000 blocks

Index file entry size = 12 + 8 = 20B

Index file block factor = Block size / Entry size = 1000 / 20 = 50 entries/block

Dense index: Number of index entries = Number of DB records = 10,000

Number of dense index blocks = Number of entries / Block factor = 10000 / 50 = 200 blocks

Sparse index: Number of index entries = Number of DB blocks = 1000

Number of sparse index blocks = 1000 / 50 = 20 blocks

I/O cost without indexing = ⌈log2 N⌉ = ⌈log2 1000⌉ = 10 blocks (ordered file)
OR
N = 1000 blocks (unordered file)

I/O cost (dense index) = ⌈log2 M⌉ + 1 = ⌈log2 200⌉ + 1 = 8 + 1 = 9 blocks

I/O cost (sparse index) = ⌈log2 20⌉ + 1 = 5 + 1 = 6 blocks
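The arithmetic of this solution can be checked mechanically. The sketch below recomputes every quantity from the given sizes; the function name `index_costs` and the dictionary keys are hypothetical, and unspanned organization is assumed for both the DB file and the index file.

```python
def ceil_div(a: int, b: int) -> int:
    """Ceiling of a / b using integer arithmetic."""
    return -(-a // b)


def ceil_log2(n: int) -> int:
    """Ceiling of log2(n) for n >= 1."""
    return (n - 1).bit_length()


def index_costs(block_size, record_size, key_size, pointer_size, num_records):
    db_block_factor = block_size // record_size
    db_blocks = ceil_div(num_records, db_block_factor)
    index_block_factor = block_size // (key_size + pointer_size)
    dense_blocks = ceil_div(num_records, index_block_factor)  # one entry per record
    sparse_blocks = ceil_div(db_blocks, index_block_factor)   # one entry per DB block
    return {
        "db_blocks": db_blocks,
        "dense_blocks": dense_blocks,
        "sparse_blocks": sparse_blocks,
        "cost_ordered_no_index": ceil_log2(db_blocks),
        "cost_unordered_no_index": db_blocks,
        "cost_dense": ceil_log2(dense_blocks) + 1,
        "cost_sparse": ceil_log2(sparse_blocks) + 1,
    }


result = index_costs(block_size=1000, record_size=100, key_size=12,
                     pointer_size=8, num_records=10000)
for name, value in result.items():
    print(name, value)
```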

Types of Index:

(1) Primary Index:

Conditions: (1) The search key of the index must be the field on which the DB records are
physically ordered (i.e., an ordered file).

(2) The search key must be a candidate key.

Note: A primary index is sparse. At most one primary index is possible.

(2) Clustering Index:

Conditions: (1) The search key of the index must be the field on which the DB records are
physically ordered (i.e., an ordered file).

(2) The search key of the index is a non-key field.

Note: A clustering index is sparse because it has one entry for every distinct value of the
search key, not one entry per record.

• At most one clustering index is possible.

Note: A file cannot have both a primary index and a clustering index, because its records can
be physically ordered on only one field.


(3) Secondary Index: A secondary means of accessing the data, which can exist even when a
primary (or clustering) index already exists.

• A secondary index may be on a key field or a non-key field.

• Its search key is a field that is not used to order the DB file.

Note: A secondary index on a key field is always dense. More than one secondary index is
possible.

Consider a secondary index on a non-key field of a file. We can implement this type of index in
one of the following ways.

(a) Include several index entries with the same key value, one for each record. This is a dense
index.

(b) Use variable-length records for the index entries, with a repeating field for the pointer
(one pointer for each block).

(c) Create an extra level of indirection to handle the multiple pointers. This makes the index
sparse.

In option (c), the pointer in an index entry points to a block of record pointers, and each
record pointer in that block points to one of the database records.

If the record pointers cannot fit in a single disk block, a cluster (or linked list) of blocks
is used.

• Option (c) requires one or more additional block accesses because of the extra level.


Multilevel Index:
The index file is itself indexed, level by level, until the last (top) level fits in a single
block.

• Block factor of 1st level = Block factor of 2nd level
• Block size of 1st level = Block size of 2nd level = Block size of DB file
• From the 2nd level onwards, the index is always sparse; the 1st level may be sparse or dense.
• I/O cost = (Number of levels) + 1 block accesses
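The number of levels follows from repeatedly dividing by the index block factor. The sketch below is one way to compute it, reusing the sizes from the dense/sparse problem earlier in this chapter (50 index entries per block, 10,000 records, 1000 DB blocks); the function name `index_levels` is hypothetical.

```python
def index_levels(first_level_entries: int, entries_per_block: int) -> list:
    """Blocks at each index level, from the 1st level up to the single top block."""
    levels = []
    entries = first_level_entries
    while True:
        blocks = -(-entries // entries_per_block)  # ceiling division
        levels.append(blocks)
        if blocks == 1:
            break
        # Every higher level is sparse: one entry per block of the level below.
        entries = blocks
    return levels


dense_levels = index_levels(first_level_entries=10000, entries_per_block=50)
sparse_levels = index_levels(first_level_entries=1000, entries_per_block=50)
print(dense_levels, "I/O cost =", len(dense_levels) + 1)
print(sparse_levels, "I/O cost =", len(sparse_levels) + 1)
```

Each level costs one block access, which is why the total is the number of levels plus one access for the DB block.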

Problems of a static multilevel index:
(1) Insertion: May require shifting DB records, which forces the entire index file to change.
To avoid this problem, we maintain overflow pages.
However, overflow pages can form a long chain, which increases the I/O cost.
So, the worst-case cost goes to N.
(2) Deletion: (1) May require shifting DB records up, which forces the entire index to be rebuilt.
(2) Alternatively, the space is left unused, i.e., the minimum occupancy of an index block can
drop to 0%.
Because of these problems, we use a dynamic multilevel index.
B+-Tree
(1) Structure of an internal node:

P block pointers
P-1 keys
(2) Structure of an external node (leaf node):

K1 R1 | K2 R2 | ... | K(P-1) R(P-1) | BP

where BP is a block pointer that points to the next leaf node.
P-1 keys
P-1 record pointers (data pointers)
1 block pointer
(3) Every internal node except the root must have at least ⌈P/2⌉ block pointers and at most P
block pointers.
(4) The root has at least 2 block pointers and at most P block pointers.
(5) All leaf nodes are at the same level.

If there are N records in the file, a path from the root to a leaf is no longer than
⌈log_⌈n/2⌉ (N)⌉, where n is the order of the tree (P above).

The number of pointers in a node is called its fan-out.
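The path-length bound can be evaluated without floating-point logarithms. The following is a minimal sketch, assuming the bound is read as the smallest h with ⌈n/2⌉^h ≥ N; the function name `max_path_length` and the sample values are illustrative only.

```python
def max_path_length(num_keys: int, order: int) -> int:
    """Smallest h such that ceil(order / 2) ** h >= num_keys.

    This equals the ceiling of log base ceil(order / 2) of num_keys,
    computed with integers so that exact powers are handled correctly.
    """
    base = -(-order // 2)  # ceil(order / 2), the minimum fan-out
    if base < 2:
        raise ValueError("order must be at least 3 for the bound to be meaningful")
    height, reach = 0, 1
    while reach < num_keys:
        reach *= base
        height += 1
    return height


print(max_path_length(1_000_000, 100))  # minimum fan-out 50
print(max_path_length(5000, 10))        # minimum fan-out 5
```

Because the base of the logarithm is the minimum fan-out, this is a worst-case bound; a tree whose nodes are fuller is shallower.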

Note: (1) Every key must be present in a leaf node.
(2) Every leaf node maintains one block pointer, which points to the next leaf node.
(3) Once we reach a leaf node, sequential access is possible.
(4) An internal node contains only keys and block pointers (no record pointers).
(5) Sequential access to all records is supported.
(6) Random access to all records is supported.
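Notes (2) and (3) are what make range queries cheap: after one descent to the first qualifying leaf, the scan only follows next-leaf pointers. The sketch below models just the leaf chain, not the internal nodes. The `Leaf` class, the function names, and the particular leaf layout are hypothetical, and the starting leaf is assumed to have been found by a normal root-to-leaf search.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Leaf:
    keys: List[int]
    next: Optional["Leaf"] = None  # block pointer to the next leaf node


def link(leaves: List[Leaf]) -> List[Leaf]:
    """Chain the leaves left to right through their next pointers."""
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(start: Leaf, low: int, high: int) -> Tuple[List[int], int]:
    """Collect keys in [low, high] and count how many leaf nodes were read."""
    found, leaves_read = [], 0
    leaf = start
    while leaf is not None:
        leaves_read += 1
        for key in leaf.keys:
            if key > high:
                return found, leaves_read  # keys are sorted, so we can stop
            if key >= low:
                found.append(key)
        leaf = leaf.next
    return found, leaves_read


# Assumed layout: sorted keys packed two per leaf, for illustration only.
leaves = link([Leaf([1, 3]), Leaf([5, 6]), Leaf([7, 8]), Leaf([9, 12])])
print(range_scan(leaves[1], low=5, high=8))
```

The scan reads one leaf past the range because it only learns the range has ended when it sees a key greater than the upper bound.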

Example: construct a B+-tree with P = 3 and Pleaf = 2 by inserting the keys

8, 5, 1, 7, 3, 12, 9, 6

• Each node must have at least ⌈(n-1)/2⌉ keys and at most (n-1) keys.


Keys: 1, 2, ..., 10
Note: (1) If the given keys are in ascending order, right biasing in a B+-tree gives more node
splits than left biasing.
(2) If the given keys are in descending order, left biasing gives more node splits.
B+-Tree Deletion

Let n = 3.
Initial tree:

1.
Delete 5: no problem.

Delete 12: underflow (redistribute).

Delete 9: underflow (merge with left sibling, redistribute).

Question: Delete from B+-tree.


n=4


Delete-20 Merge with sibling

Delete-18 - No Problem

Delete-22 Problem

Delete-7

PRACTICE QUESTIONS

1. Which of the following statements is correct?

(A) In ordered file organization, searching is difficult but insertion is easy.
(B) In ordered file organization, both insertion and searching are easy.
(C) In unordered file organization, searching is difficult but insertion is easy.
(D) In unordered file organization, insertion is difficult but searching is easy.
Sol. (C)
In unordered file organization, linear search is used, but insertion is easy because records are
inserted at the end of the file.
2. In a B+ tree, the order of an internal node is 30 and the order of a leaf node is 25. All the
nodes of the tree are 80% full, and the index is a 4-level B+ tree with the root at level 0 and
the leaves at level 3. The number of keys at level 1 is ____
Sol. 552
Internal node = 80% of 30 = (80 / 100) × 30 = 24 block pointers
External node = 80% of 25 = (80 / 100) × 25 = 20

Level   Number of nodes   Number of keys   Number of block pointers
0       1                 23               24
1       24                24 × 23          24 × 24

Number of keys at level 1 = 24 × 23 = 552
3. Suppose the block size is 90B and the record size is 20B. Then the block factor is _________
records per block (using unspanned organization).
Sol. 4
Block factor = ⌊Block size / Record size⌋ = ⌊90 / 20⌋ = 4

4. The order of an internal node in a B+ tree index is the maximum number of children it can have.
Suppose that a child pointer takes 3B, the search field value takes 17B, and the block size is
1024B. The order of the internal node is ____

Sol. 52
Size of child pointer = 3B
Size of search field = 17B
Block size = 1024B
Let the order of the internal node be P.
So, (P - 1) × 17 + P × 3 ≤ 1024
20P ≤ 1041
P = ⌊52.05⌋ = 52
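The inequality can be solved in closed form for any block, key, and pointer size. This is a short sketch under the same assumption as the solution, namely that an internal node holds P child pointers and P - 1 keys and nothing else; the function name `internal_node_order` is hypothetical.

```python
def internal_node_order(block_size: int, key_size: int, pointer_size: int) -> int:
    """Largest P with (P - 1) * key_size + P * pointer_size <= block_size.

    Rearranging gives P <= (block_size + key_size) / (key_size + pointer_size).
    """
    return (block_size + key_size) // (key_size + pointer_size)


order = internal_node_order(block_size=1024, key_size=17, pointer_size=3)
print(order)

# Sanity check: P fits in the block, P + 1 does not.
assert (order - 1) * 17 + order * 3 <= 1024
assert order * 17 + (order + 1) * 3 > 1024
```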
5. The minimum number of levels of a B+-tree index required for 5000 keys, when the order of a
B+-tree node is 10, is ________ (assume the order is the maximum number of pointers that can be
stored in a B+-tree node)
Sol. 4
Given, order of tree = 10
So, leaf node = 9 keys (with record pointers) + 1 block pointer
Internal node = 9 keys + 10 child pointers
Level 1 = ⌈5000 / 9⌉ = 556 nodes
Level 2 = ⌈556 / 10⌉ = 56 nodes
Level 3 = ⌈56 / 10⌉ = 6 nodes
Level 4 = ⌈6 / 10⌉ = 1 node

EXERCISE QUESTIONS
1. The blocking factors for the data file and the index file, respectively, are
A. 11, 292 B. 10, 100
C. 11, 101 D. 10, 102
Ans. D
2. The number of blocks in the first level index file is:
A. 131072 B. 1311
C. 1310 D. 1286
Ans. D
3. A data file with 1,00,000 records is stored on a disk. Consider a multi-level secondary index
built on a non-ordering key-field of this file. Which of the following is true about this index?
A. The first level index is an ordered file sorted on the primary key.
B. The number of entries in the last level index is independent of the block size.
C. The number of entries in the first level index is 1,00,000
D. The number of entries in the last level index is always 1
Ans. C
4. Consider a B+ tree in which the maximum number of keys in an internal node is 5. What is the
minimum number of keys in any non-root internal node?
A. 1 B. 2 C. 3 D. 4
Ans. B
5.

For the B+ tree given above, the minimum number of nodes of tree (including the root node)
that must be fetched in order to obtain result of the following query: “Get all records with a
search key greater than or equal to 14 and less than or equal to 20” is
A. 4 B. 5 C. 6 D. 7
Ans. A
6. Consider the B+ tree given in Question 69 with order =3 and order leaf = 2. If we insert the
element 15, the number of internal nodes and leaf nodes in the resulting tree, respectively, are
A. 4,6 B. 4, 7 C. 5, 6 D. 5, 7
Ans. B
