Dbms Unit 3 Part2
3. Insertion anomalies –
An insertion anomaly arises when certain attributes cannot be
inserted into the database without the presence of other
attributes, i.e. the user tries to insert data for a record whose
remaining part does not exist yet.
Ex: Adding the record of a new student who is not taking any course
would mean inserting a student record with a NULL value in the
course field. The NULL value is used to work around the insertion
anomaly; otherwise we would not be able to insert a record for a
student who has not yet opted for any course.
Decomposition
• To avoid redundancy and the problems caused by redundancy, we use
a refinement technique called decomposition.
• Decomposition in DBMS removes redundancy, anomalies and
inconsistencies from a database by dividing a table into multiple
tables.
• Decomposition:- Process of decomposing a larger relation into
smaller relations.
• Each of the smaller relations contains a subset of the attributes
of the original relation.
Decomposing Relations into a Number of Relations
R(ABCD….)
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Suppose this relation is decomposed into two sub relations R1( A , B )
and R2( B , C ).
The two sub relations are-
A B
1 2
2 5
3 3
R1( A , B )
B C
2 1
5 3
3 3
R2( B , C )
• Now, let us check whether this decomposition is
lossless or not.
• For lossless decomposition, we must have-
• R1 ⋈ R2 = R
• Now, if we perform the natural join ( ⋈ ) of the
sub relations R1 and R2 , we get-
A B C
1 2 1
2 5 3
3 3 3
• This relation is the same as the original relation R.
• Thus, we conclude that the above decomposition is a
lossless join decomposition.
NOTE-
• Lossless join decomposition is also known as non-additive
join decomposition.
• This is because the relation obtained by joining the
sub relations is the same as the original relation that
was decomposed.
• No extraneous tuples appear after joining of the
sub-relations.
2. Lossy Join Decomposition-
• Consider there is a relation R which is decomposed into sub
relations R1 , R2 , …. , Rn.
• This decomposition is called lossy join decomposition when the
join of the sub relations does not result in the same relation R that
was decomposed.
• The natural join of the sub relations is always found to have some
extraneous tuples.
• For lossy join decomposition, we always have-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R
where ⋈ is a natural join operator
Example-
• Consider the following relation R( A , B , C )-
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Suppose this relation is decomposed into two sub relations R1( A , C ) and R2( B , C )-
A C
1 1
2 3
3 3
R1( A , C )
B C
2 1
5 3
3 3
R2( B , C )
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and
R2 we get-
A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3
This relation is not the same as the original relation R
and contains some extraneous tuples.
Clearly, R1 ⋈ R2 ⊃ R.
Thus, we conclude that the above decomposition is a
lossy join decomposition.
NOTE-
• Lossy join decomposition is also known as careless
decomposition.
• This is because extraneous tuples get introduced in the natural
join of the sub-relations.
• Extraneous tuples make the identification of the original tuples
difficult.
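The two examples above can be checked mechanically. The following is a minimal Python sketch and not part of the original notes: the helper names project, natural_join, as_set and extraneous_tuples are hypothetical, and each tuple is modelled as a dict from attribute name to value. An empty result means the decomposition is lossless for this instance of R.

```python
def project(rows, attrs):
    """Project dict rows onto attrs, dropping duplicate tuples."""
    seen = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(t) for t in sorted(seen)]


def natural_join(rows1, rows2):
    """Pair up rows that agree on every attribute they share."""
    joined = []
    for r1 in rows1:
        for r2 in rows2:
            common = r1.keys() & r2.keys()
            if all(r1[a] == r2[a] for a in common):
                joined.append({**r1, **r2})
    return joined


def as_set(rows):
    """Order-independent view of a relation, so two relations can be compared."""
    return {tuple(sorted(row.items())) for row in rows}


def extraneous_tuples(rows, attrs1, attrs2):
    """Tuples of R1 join R2 that are not in R (empty set means lossless)."""
    joined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return as_set(joined) - as_set(rows)


# The relation R(A, B, C) used in both examples.
R = [{"A": 1, "B": 2, "C": 1},
     {"A": 2, "B": 5, "C": 3},
     {"A": 3, "B": 3, "C": 3}]

lossless = extraneous_tuples(R, ["A", "B"], ["B", "C"])  # R1(A, B), R2(B, C)
lossy = extraneous_tuples(R, ["A", "C"], ["B", "C"])     # R1(A, C), R2(B, C)
print(len(lossless), len(lossy))  # prints: 0 2
```

The two extraneous tuples reported for the second decomposition are (2, 3, 3) and (3, 5, 3), the same rows that appear in the joined table but not in R.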
NOTE: R ⋈ S (Natural Join):-
If (Attributes of Relation R) ∩ (Attributes of Relation S) = Ø, then
the natural join acts as a cross product. Otherwise the tuples are
matched on the common attributes, and each common attribute appears
only once in the result.
Ex:-
Relation (R)
A B
a1 b1
a2 b2
a3 b3
Relation (S)
C D
c1 d1
c2 d2
R ⋈ S (Natural Join) = R✕S when
(Attributes of Relation R) ∩ (Attributes of Relation S) = Ø
Here R has M = 3 tuples and S has N = 2 tuples, so the result has
M*N = 3*2 = 6 tuples in total.
A B C D
a1 b1 c1 d1
a1 b1 c2 d2
a2 b2 c1 d1
a2 b2 c2 d2
a3 b3 c1 d1
a3 b3 c2 d2
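The six tuples above can be reproduced with a short Python sketch. It is an illustration only, not part of the original notes; the variable names R, S and cross are assumptions, and itertools.product from the standard library is used to build the cross product.

```python
from itertools import product

# Tuples of R(A, B) and S(C, D) from the example above.
R = [("a1", "b1"), ("a2", "b2"), ("a3", "b3")]
S = [("c1", "d1"), ("c2", "d2")]

# R and S share no attribute, so every tuple of R pairs with every tuple of S.
cross = [r + s for r, s in product(R, S)]

for row in cross:
    print(*row)
print(len(cross), "tuples")  # 3 * 2 = 6
```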
Functional Dependency
• Functional Dependency (FD) describes the relationship of one
attribute to another attribute in a database management system
(DBMS).
• Functional dependency helps you to maintain the quality of data in
the database.
• Functional dependency plays a vital role in telling a good
database design from a bad one.
• FD is a generalization of the concept of key.
• A functional dependency is denoted by an arrow →.
• The functional dependency of Y on X is represented by X → Y i.e.
(X→Y means X determines Y) or (Y is functionally dependent on X).
• Here X is called the Determinant and Y is called the Dependent.
• X→Y says that if two tuples agree on the values of attribute X they
must also agree on the value of attribute Y.
• Given a relation R, a set of attributes X in R is said to functionally
determine another set of attributes Y also in R, i.e. X→Y, iff (if and
only if) each X value is associated with precisely one value of Y.
R is then said to satisfy the FD X → Y.
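The definition can be tested directly on the tuples of a relation. Below is a minimal Python sketch, not part of the original notes: the function name fd_holds is hypothetical, and the sample rows reuse the relation R(A, B, C) from the decomposition examples. Note that such a check only shows whether one particular instance satisfies an FD.

```python
def fd_holds(rows, lhs, rhs):
    """True if tuples that agree on lhs also agree on rhs in this instance."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y value seen for each X value; a different
        # Y value for the same X value violates X -> Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


rows = [{"A": 1, "B": 2, "C": 1},
        {"A": 2, "B": 5, "C": 3},
        {"A": 3, "B": 3, "C": 3}]

print(fd_holds(rows, ["A"], ["B"]))  # True: every A value has one B value
print(fd_holds(rows, ["C"], ["A"]))  # False: C = 3 appears with A = 2 and A = 3
```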
Find Valid and invalid FD’s
Armstrong Axioms:
Armstrong's axioms define a set of rules for
reasoning about functional dependencies and for
inferring all the functional dependencies that hold
on a relational database.
The inference rules are of two kinds:
1. Primary axioms 2. Secondary or derived axioms
Rule 4 : Composition
If X → Y and Z → W then XZ → YW
Attribute closure:
Attribute closure of an attribute set can be defined as the set of
attributes which can be functionally determined from it.
NOTE:
To find the attribute closure of an attribute set-
1) Add the elements of the attribute set to the result set.
2) Repeatedly add to the result set the elements which can be
functionally determined from the elements already in the result set,
until nothing new can be added.
Ex1: R(A, B, C) FD’s: {A → B, B → C}
Attribute closure(A) i.e. A+ = {A, B, C}
Attribute closure(B) i.e. B+ = {B, C}
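The two steps in the NOTE translate into a short loop. This is a minimal Python sketch, not part of the original notes: the function name closure is hypothetical, and each FD is written as a pair of strings such as ("A", "B") for A → B, with one character per attribute.

```python
def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)  # step 1: start with the attribute set itself
    changed = True
    while changed:       # step 2: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [("A", "B"), ("B", "C")]      # Ex1: A -> B, B -> C
print(sorted(closure("A", fds)))    # ['A', 'B', 'C']
print(sorted(closure("B", fds)))    # ['B', 'C']
```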
Ex2: R(A, B, C, D, E, F) FD’s: {AB → C, BC → AD, D →E, CE →B}
Find (AB)+, (BC)+, (D)+, (CE)+
Solution:
(AB)+ = {A, B, C, D, E}, so the following FD’s are generated
from (AB)+:
AB → A, AB → B, AB → C, AB → D, AB → E
(BC)+ ={B, C, A, D, E} = {A, B, C, D, E}
(D)+={D,E}
(CE)+={C, E, B, A, D}={A, B, C, D, E}
Ex3: R(A, B, C, D, E, F, G)
Ex4: R(A, B, C, D, E)
FD: {A → B, B → D, C →DE, CD → AB}
Find (A)+ , (B)+ , (C)+ , (D)+ , (F)+ , (ABD)+
Solution:
(A)+ = {A, B, D}
(B)+ = {B, D}
(C)+ = {C, D, E, A, B}
(D)+ = {D}
(F)+ = {F}
(ABD)+ = {A, B, D}
Types of functional dependencies:
1) Trivial functional dependency:- A FD X→Y is said to be a Trivial FD
if and only if (iff) Y⊆X. In other words, if the RHS of an FD is a
subset of its LHS, the FD is called a Trivial FD.
Ex: AB → A, AB → B, AB → AB
AB → C (Non-Trivial FD), AB → AC (Non-Trivial FD)
2) Completely Non-trivial functional dependency:- If X→Y and X∩Y=Ø
(no attribute of the RHS is part of the LHS), then it is called a
completely non-trivial functional dependency.
Ex: AB → C, AB → CD
3) Semi Non-Trivial FD:- If X→Y, X ∩ Y ≠ Ø and Y is not a subset of X,
then it is called a Semi Non-Trivial FD. In other words, the RHS shares
at least one attribute with the LHS and also has at least one attribute
that is not part of the LHS.
Ex: AB → BC (AB ∩ BC = B i.e. ≠ Ø), AB → AC. By contrast, AB → A is a
Trivial FD.
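The three cases depend only on how the LHS and RHS attribute sets overlap, which a small Python sketch can make explicit. It is not part of the original notes; the function name classify_fd is hypothetical and attributes are single characters.

```python
def classify_fd(lhs, rhs):
    """Classify the FD lhs -> rhs by comparing the two attribute sets."""
    x, y = set(lhs), set(rhs)
    if y <= x:
        return "trivial"               # RHS is a subset of LHS
    if x & y:
        return "semi non-trivial"      # some overlap, but RHS adds new attributes
    return "completely non-trivial"    # no overlap at all


for lhs, rhs in [("AB", "A"), ("AB", "BC"), ("AB", "C"), ("AB", "CD")]:
    print(lhs, "->", rhs, ":", classify_fd(lhs, rhs))
```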
Prime and non-prime attributes
Attributes which are part of any candidate key of a relation are called
prime attributes; the others are non-prime attributes.
Candidate Key:
A Candidate Key is a minimal set of attributes (a minimal super key) of
a relation which can be used to identify a tuple uniquely.
Consider student table: student(sno, sname, sphone, age)
We can take sno as a candidate key. We can have more than one candidate
key in a table.
Types of candidate keys:
1. simple (having only one attribute)
2. composite (having multiple attributes as candidate key)
Super Key:
A Super Key is a set of attributes of a relation which can
be used to identify a tuple uniquely.
• Adding zero or more attributes to a candidate key
generates a super key.
• Every candidate key is a super key, but the converse is
not true.
Consider student table: student(sno, sname,
sphone, age)
We can take sno, or (sno, sname), as a super key.
EX 1: R(A, B, C, D) FD’s are {A → BC, B → CD, D → AB}
Find Candidate keys, Super keys, Prime attributes and Non-Prime
attributes?
Solution: 1. A Candidate Key is a minimal super key: a minimal set of
attributes that determines all attributes of the relation and so
identifies a tuple uniquely.
2. A Super Key can determine all the attributes of the relation.
3. Attributes which are part of any candidate key of the relation are
called prime attributes; the others are non-prime attributes.
Note: 1. If X+ contains all the attributes of the relation, then X is
called a super key of relation (R).
2. If X is also minimal (no proper subset of X is a super key), then X
is called a candidate key of relation (R).
A+ = ABCD    C+ = C
B+ = BCDA    D+ = DABC
Here A, B and D are Candidate keys and Prime Attributes.
C is a Non-Prime attribute.
EX 2: R(A, B, C) FD’s are {A → B, B → C}
Solution: A+ = ABC - A is CK and SK
B+ = BC - B is not SK and not CK because it does not generate all
attributes of relation (R).
C+ = C - C is not SK and not CK because it does not generate all
attributes of relation (R).
(AB)+ = ABC - AB is SK but not CK
(AC)+ = ABC - AC is SK but not CK
(BC)+ = BC - BC is not SK and not CK
(ABC)+ = ABC - ABC is SK but not CK
Prime attribute is A
Non-Prime attributes are B, C
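The procedure used in EX 1 and EX 2 (compute closures, keep the minimal sets that determine every attribute) can be sketched in Python. This is an illustrative brute-force search, not part of the original notes and only practical for small relations; the names closure and candidate_keys are hypothetical, and FDs are pairs of strings such as ("A", "BC") for A → BC.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Try attribute sets from smallest to largest; keep the minimal super keys."""
    all_attrs = set(attributes)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            cand = set(combo)
            if any(key <= cand for key in keys):
                continue  # contains a smaller key: a super key, but not minimal
            if closure(cand, fds) == all_attrs:
                keys.append(cand)
    return keys


keys = candidate_keys("ABCD", [("A", "BC"), ("B", "CD"), ("D", "AB")])  # EX 1
prime = set().union(*keys)
print(keys)                        # candidate keys A, B and D
print(sorted(set("ABCD") - prime)) # non-prime attributes: ['C']
```

Every set that contains one of the returned keys is a super key, so the super keys do not need a separate search.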
Ex 3: R(ABCDE) FD’s are {AB→C, C→D, B→E}. Find Candidate keys,
Super keys, Prime attributes and Non-Prime attributes?
Solution:
Ex 4: R(ABCDE) FD’s are {AB→C, C→D, B→EA}. Find Candidate
keys, Super keys, Prime attributes and Non-Prime attributes?
Solution:
Ex 5: R(ABCD) FD’s are {AB→CD, A→B}. Find Candidate keys,
Super keys, Prime attributes and Non-Prime attributes?
Solution:
Ex 6: R(ABCDE) FD’s are {A→B, BC→D, D→AE}. Find Candidate
keys, Super keys, Prime attributes and Non-Prime attributes?
Solution:
Ex 7: R(ABCDE) FD’s are {AB→C, CD→E, DE→B}. Find Candidate
keys, Super keys, Prime attributes and Non-Prime attributes?
Solution:
Normalization of Database
Normalization helps in designing a good database. It involves a set of
normal forms, as follows -
First normal form
A relation is said to be in first normal form if every attribute
contains only atomic (single) values.
Example:
Course Content
Programming Java, C++
Web HTML, Php, ASP
Here the Content column holds several values in one field (e.g. Java,
C++), so this table is not in first normal form.
Here (student id, project id) are key attributes and (student name,
project name) are non-prime attributes. It is decomposed as
Project ID Project Name
Student ID Student Name Project ID
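Ex: R(A, B, C, D) FD’s are {A→B, B→C}. Find whether it is in 2nd Normal
form or not?
Solution: (AD)+ = {A, D, B, C}
Note: A+ = {A, B, C}, D+ = {D}
So CK = AD
Prime attributes are A and D.
Non-prime attributes are B and C.
A→B is a partial dependency, because A is only a part of the candidate
key AD and B is a non-prime attribute. So R itself is not in 2nd Normal
form, and it is decomposed using A+ = {A, B, C}:
R1 = {A, B, C}
FDs are {A → BC, B → C}
A+ = ABC
B+ = BC
C+ = C
CK = A for R1
R2 = {A, D}
FDs are {Nil}
A+ = ABC, D+ = D (closures taken over the FDs of R)
So R1 and R2 are in 2nd Normal form.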
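The check used in this example can be sketched in Python: for every proper subset of a candidate key, see whether its closure reaches a non-prime attribute. This is an illustration under assumptions, not part of the original notes; the names closure and partial_dependencies are hypothetical, and the candidate keys are passed in rather than computed.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(attributes, keys, fds):
    """List (part of a key, non-prime attributes it determines) pairs."""
    prime = set().union(*keys)
    non_prime = set(attributes) - prime
    found = []
    for key in keys:
        for size in range(1, len(key)):  # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                dependent = closure(part, fds) & non_prime
                if dependent:
                    found.append((set(part), dependent))
    return found


fds = [("A", "B"), ("B", "C")]
print(partial_dependencies("ABCD", [{"A", "D"}], fds))        # A determines B and C
print(partial_dependencies("ABC", [{"A"}], fds))              # [] : R1 is in 2NF
print(partial_dependencies("AD", [{"A", "D"}], []))           # [] : R2 is in 2NF
```

An empty list means no non-prime attribute depends on part of a candidate key, which is the 2NF condition (assuming the relation is already in 1NF).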
Third Normal Form (3NF)
A table is said to be in the Third Normal Form when:
1. It is in the Second Normal Form.
2. It does not have any transitive dependency.
Transitive dependency – If A → B and B → C are two FDs, then A → C is
called a transitive dependency.
So Total Sub relations are R12(B, C), R2(C, D), R3(A, B) , so it is in 3rd NF.
Boyce and Codd Normal Form (BCNF)
Boyce and Codd Normal Form is a stricter version of the Third Normal Form. It
deals with a certain type of anomaly that is not handled by 3NF. A 3NF table which
does not have multiple overlapping candidate keys is said to be in BCNF. For a table
to be in BCNF, the following conditions must be satisfied:
• R must be in 3rd Normal Form, and
• for each non-trivial functional dependency ( X → Y ), X should be a super key
(every candidate key is also a super key).
St Name Course Teacher
Course Teacher
Ex: R(A, B, C, D, E, F, G, H) FD’s are {A→BD, B→C, E→FG, AE→H}. Find
whether it is in BCNF or not?
Solution: (AE)+ = {A, E, B, D, C, F, G, H}
So CK = AE
Prime attributes are A and E.
Non-prime attributes are B, C, D, F, G and H.
R is not in BCNF, because the determinants of A→BD, B→C and E→FG are not
super keys. So R is decomposed:
From A→BD: A+ = {A, B, D, C}, so R1 = (ABCD)
A+ = {A, B, D, C}
B+ = {B, C}
C+ = {C}
D+ = {D}
FDs are {A→BCD, B→C}
In R1, B is not a super key, so R1 is split using B→C:
R11 = {B, C}: B+ = {B, C}, C+ = {C}, FDs are {B→C}
R12 = {A, B, D}: A+ = {A, B, D, C}, D+ = {D}, FDs are {A→BD}
So these are in BCNF.
From B→C: B+ = {B, C}, so R2 = {B, C}
C+ = {C}
FDs are {B→C}
It is in BCNF.
From E→FG: E+ = {E, F, G}, so R3 = {E, F, G}
F+ = {F}
G+ = {G}
FDs are {E→FG}
It is in BCNF.
R4 = {H, A, E}
H+ = {H}
A+ = {A, B, D, C}
E+ = {E, F, G}
(AE)+ = AEBDCFGH
(AH)+ = AHBDC
(HE)+ = HEFG
(HAE)+ = HAEBDCFG
FDs are {AE→H}
All of the above (R2, R3 and R4) are in BCNF.
So {R11 ∪ R12 ∪ R2 ∪ R3 ∪ R4} = R
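The BCNF condition (every non-trivial FD must have a super key on its left side) can be checked with a short Python sketch. It is not part of the original notes: the names closure and bcnf_violations are hypothetical, and for each sub relation the FDs that hold on it are assumed to be the ones listed in the solution above.

```python
def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    all_attrs = set(attributes)
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        if not trivial and not all_attrs <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations


fds = [("A", "BD"), ("B", "C"), ("E", "FG"), ("AE", "H")]
print(bcnf_violations("ABCDEFGH", fds))        # A->BD, B->C and E->FG violate BCNF
print(bcnf_violations("ABD", [("A", "BD")]))   # [] : R12 is in BCNF
print(bcnf_violations("HAE", [("AE", "H")]))   # [] : R4 is in BCNF
```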
Fourth Normal Form (4NF)
A table is said to be in the Fourth Normal Form when:
1. It is in the Boyce-Codd Normal Form.
2. It does not have any multi-valued dependency.
Note: In some cases multi value dependencies may exist not more
than one time in a given relation.
S_ID Hobby
1 Cricket
1 Hockey
2 Cricket
2 Hockey
S_ID →→ Hobby
Fifth Normal Form / Project-Join Normal Form (5NF)
R1 R2