normalisation
Second Normal Form (2NF): 2NF eliminates redundant data by requiring that
each non-key attribute be fully functionally dependent on the primary key. This
means that there should not be any partial functional dependency: no non-key
column may depend on only part of a composite key.
Third Normal Form (3NF): 3NF builds on 2NF by requiring that no non-key
attribute depends on another non-key attribute. This means that there should not
be any transitive dependency: each non-key column should depend directly on the
primary key, and not on the key by way of other columns in the same table.
Normal Forms
Boyce-Codd Normal Form (BCNF): BCNF is a stricter form of 3NF that
requires the determinant of every non-trivial functional dependency in a table
to be a super key. In other words, no attribute, key or non-key, is determined
by anything that is not a super key.
Normalization Process:
Find the candidate keys of the given relation by computing the attribute closure of each
set of attributes; a minimal set whose closure contains every attribute is a candidate key.
Check the relation for 2NF. If it does not satisfy 2NF, decompose the table so that
the resulting relations follow 2NF.
If the relation satisfies 2NF, check for 3NF. If it does not satisfy 3NF,
decompose the table so that the resulting relations follow 3NF.
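The first step of the process can be sketched in a few lines of Python. This is a minimal, brute-force illustration rather than a production algorithm: the helper names closure and candidate_keys are hypothetical, and the functional dependencies are the roll_no dependencies used elsewhere in these notes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know lhs, we also learn rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(all_attrs, fds):
    """Try attribute sets from smallest to largest; keep the minimal ones that determine everything."""
    all_attrs = frozenset(all_attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            combo = frozenset(combo)
            # A superset of a known key is a super key, not a candidate key.
            if any(key <= combo for key in keys):
                continue
            if closure(combo, fds) == all_attrs:
                keys.append(combo)
    return keys

attrs = {"roll_no", "name", "dept_name", "dept_building"}
fds = [
    (frozenset({"roll_no"}), frozenset({"name", "dept_name"})),
    (frozenset({"dept_name"}), frozenset({"dept_building"})),
]
keys = candidate_keys(attrs, fds)
print(keys)
```

The search is exponential in the number of attributes, which is acceptable for classroom-sized relations only.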
First Normal Form
Table 1
STUD_NO  COURSE_NO
1        C1
2        C2
1        C4
4        C3
4        C1
Table 2
COURSE_NO  COURSE_FEE
C1         1000
C2         1500
C3         1000
C4         2000
C5         2000
NOTE: 2NF tries to reduce the redundant data getting stored in memory. For instance, if
there are 100 students taking the C1 course, we don't need to store its fee as 1000 for
all 100 records; instead, we can store it once in the second table, as the course fee
for C1 is 1000.
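As a small illustration of why nothing is lost by storing each fee once, the two tables can be joined back on COURSE_NO. The sketch below uses the rows from Table 1 and Table 2; the function name fees_for_enrolments is a hypothetical helper.

```python
# Table 1: which student takes which course.
enrolments = [(1, "C1"), (2, "C2"), (1, "C4"), (4, "C3"), (4, "C1")]

# Table 2: each course fee is stored exactly once.
course_fee = {"C1": 1000, "C2": 1500, "C3": 1000, "C4": 2000, "C5": 2000}

def fees_for_enrolments(enrolments, course_fee):
    """Join the two tables on COURSE_NO to rebuild (STUD_NO, COURSE_NO, COURSE_FEE)."""
    return [(stud, course, course_fee[course]) for stud, course in enrolments]

joined = fees_for_enrolments(enrolments, course_fee)
for row in joined:
    print(row)
```

Changing the fee for C1 now means updating one row in Table 2 instead of every enrolment row that mentions C1.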
Types of Functional dependencies in DBMS
roll_no  name  dept_name  dept_building
42       abc   CO         A4
43       pqr   IT         A3
44       xyz   CO         A4
45       xyz   IT         A3
46       mno   EC         B2
47       jkl   ME         B2
roll_no → dept_name,
Since roll_no can determine the whole set {name, dept_name, dept_building}, it can
also determine its subset dept_name.
dept_name → dept_building,
dept_name identifies the dept_building accurately, since every row with the
same dept_name also has the same dept_building.
• More valid functional dependencies:
roll_no → name,
{roll_no, name} → {dept_name, dept_building}, etc.
Here are some invalid functional dependencies:
name → dept_name
Students with the same name can have different dept_name, hence this
is not a valid functional dependency.
dept_building → dept_name
There can be multiple departments in the same building. For example, in
the above table departments ME and EC are in the same building B2,
hence dept_building → dept_name is an invalid functional dependency.
More invalid functional dependencies:
• name → roll_no,
• {name, dept_name} → roll_no, dept_building → roll_no, etc.
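The valid and invalid dependencies above can be checked mechanically against the sample rows. The following is a minimal sketch; fd_holds is a hypothetical helper, and the column names are assumed from the dependencies discussed in the text. Note that a sample table can only disprove a dependency: passing the check on six rows does not prove the dependency holds for every possible table.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True

columns = ("roll_no", "name", "dept_name", "dept_building")
data = [
    (42, "abc", "CO", "A4"),
    (43, "pqr", "IT", "A3"),
    (44, "xyz", "CO", "A4"),
    (45, "xyz", "IT", "A3"),
    (46, "mno", "EC", "B2"),
    (47, "jkl", "ME", "B2"),
]
rows = [dict(zip(columns, r)) for r in data]

print(fd_holds(rows, ["roll_no"], ["dept_name"]))        # valid
print(fd_holds(rows, ["dept_name"], ["dept_building"]))  # valid
print(fd_holds(rows, ["name"], ["dept_name"]))           # refuted by the two xyz rows
print(fd_holds(rows, ["dept_building"], ["dept_name"]))  # refuted by building B2
```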
Types of Functional Dependencies in DBMS
roll_no  name  age
42       abc   17
43       pqr   18
44       xyz   18
Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not
a subset of the determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial
functional dependency, since age is not a subset of {roll_no, name}.
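The subset test that separates trivial from non-trivial dependencies is a one-liner. The sketch below is illustrative only; is_trivial is a hypothetical helper name.

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)

print(is_trivial({"roll_no", "name"}, {"name"}))  # dependent is part of the determinant
print(is_trivial({"roll_no"}, {"name"}))          # non-trivial
print(is_trivial({"roll_no", "name"}, {"age"}))   # non-trivial
```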
Multivalued Functional Dependency
• In a multivalued functional dependency, the attributes of the dependent set
do not depend on each other, i.e.
• If a → {b, c} and there exists no functional dependency between b
and c, then it is called a multivalued functional dependency.
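• For example,
roll_no  name  age
42       abc   17
43       pqr   18
44       xyz   18
45       abc   19
Here, roll_no → {name, age} is a multivalued functional dependency, since name and age
do not depend on each other (neither name → age nor age → name holds).
Transitive Functional Dependency
enrol_no  name  dept  building_no
42        abc   CO    4
43        pqr   EC    2
44        xyz   IT    1
45        abc   EC    2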
Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of
transitivity, enrol_no → building_no is a valid functional dependency. This is an
indirect functional dependency, hence called Transitive functional dependency.
Armstrong’s axioms/properties of functional dependencies:
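• Reflexivity: If Y is a subset of X, then X → Y holds by the reflexivity rule.
Example: {roll_no, name} → name is valid.
• Augmentation: If X → Y is a valid dependency, then XZ → YZ is also valid by the
augmentation rule.
Example: if roll_no → name holds, then {roll_no, dept_name} → {name, dept_name} is also valid.
• Transitivity: If X → Y and Y → Z are both valid dependencies, then X → Z is also valid by the
transitivity rule.
Example: if roll_no → dept_name and dept_name → dept_building, then roll_no → dept_building
is also valid.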
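Fully Functional Dependency
• X -> Y is fully functional if Y depends on the whole of X, so that removing any
attribute from X breaks the dependency. For example, if a relation R has attributes
X, Y, Z with the dependencies X -> Y and X -> Z, and X is a single attribute, those
dependencies are fully functional.
Partial Functional Dependency
• X -> Y is partial if Y also depends on a proper subset of X.
• A relation is in 3NF if, for every non-trivial dependency X -> Y, either X is a
super key or Y is a prime attribute (an attribute of some candidate key).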
Note: If A->B and B->C are two FDs, then A->C is called a transitive
dependency.
• The normalization of 2NF relations to 3NF involves the removal of
transitive dependencies.
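Candidate Key: {STUD_NO}. For this relation in table 4, STUD_NO -> STUD_STATE and
STUD_STATE -> STUD_COUNTRY are true, so STUD_COUNTRY is transitively dependent on
STUD_NO, which violates 3NF.
To convert it to third normal form, we decompose the relation STUDENT (STUD_NO,
STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE) so that STUD_COUNTRY
moves to a separate relation keyed by STUD_STATE.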
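The 3NF test for the STUDENT relation can be sketched as follows. This is an illustration under stated assumptions: the helper names closure and third_nf_violations are hypothetical, and the first dependency (STUD_NO determines every other attribute) is inferred from STUD_NO being the candidate key.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def third_nf_violations(all_attrs, fds, candidate_keys):
    """List (determinant, attribute) pairs where the determinant is not a
    super key and the dependent attribute is not prime."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= all_attrs:
            continue  # determinant is a super key, so this FD is allowed
        for attr in sorted(rhs - lhs):
            if attr not in prime:
                violations.append((sorted(lhs), attr))
    return violations

all_attrs = {"STUD_NO", "STUD_NAME", "STUD_PHONE",
             "STUD_STATE", "STUD_COUNTRY", "STUD_AGE"}
fds = [
    ({"STUD_NO"}, all_attrs - {"STUD_NO"}),   # assumed: follows from the candidate key
    ({"STUD_STATE"}, {"STUD_COUNTRY"}),       # stated in the text
]
violations = third_nf_violations(all_attrs, fds, [{"STUD_NO"}])
print(violations)
```

The single reported violation is the dependency of STUD_COUNTRY on STUD_STATE, which is the one the decomposition removes.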
Stu_ID  Stu_Branch                               Stu_Course            Branch_Number  Stu_Course_No
102     Electronics & Communication Engineering  VLSI Technology       B_003          401
102     Electronics & Communication Engineering  Mobile Communication  B_003          402
The functional dependencies of the above table are:
• Stu_ID −> Stu_Branch
• Stu_Course −> {Branch_Number, Stu_Course_No}
• The candidate key of the above table is {Stu_ID, Stu_Course}.
Why this Table is Not in BCNF?
• The table above is not in BCNF because neither determinant, Stu_ID nor
Stu_Course, is a super key on its own; only the combination {Stu_ID,
Stu_Course} is a candidate key.
Stu_Course            Branch_Number  Stu_Course_No
Mobile Communication  B_003          402
Stu_ID to Stu_Course_No Table
Stu_ID Stu_Course_No
101 201
101 202
102 401
102 402
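The BCNF test applied to the student table can be sketched in Python: a dependency violates BCNF when it is non-trivial and its determinant is not a super key. The helper names closure and bcnf_violations are hypothetical; the attributes and dependencies are the ones listed for the table.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose determinant is not a super key."""
    return [
        (sorted(lhs), sorted(rhs))
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= all_attrs
    ]

all_attrs = {"Stu_ID", "Stu_Branch", "Stu_Course", "Branch_Number", "Stu_Course_No"}
fds = [
    ({"Stu_ID"}, {"Stu_Branch"}),
    ({"Stu_Course"}, {"Branch_Number", "Stu_Course_No"}),
]
violations = bcnf_violations(all_attrs, fds)
for lhs, rhs in violations:
    print(lhs, "->", rhs)
```

Both dependencies are reported, which is why the table is split so that each determinant becomes the key of its own relation.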
Person ->-> mobile,
Person ->-> food_likes
This is read as “person multi determines mobile” and “person multi
determines food_likes.”
Note that a functional dependency is a special case of multivalued
dependency. In a functional dependency X -> Y, every x determines
exactly one y, never more than one.
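A multivalued dependency X ->-> Y can be tested on a table instance by checking that, for any two rows agreeing on X, the row that combines the Y values of one with the remaining values of the other is also present. The sketch below is illustrative: mvd_holds is a hypothetical helper, and the rows are made-up placeholder values (P1, M1, F1, ...), not data from these notes.

```python
def mvd_holds(rows, columns, x, y):
    """Check X ->-> Y on a list of tuples whose fields are named by `columns`."""
    idx = {c: i for i, c in enumerate(columns)}
    z = [c for c in columns if c not in x and c not in y]  # the remaining attributes
    present = set(rows)
    for t1 in rows:
        for t2 in rows:
            if all(t1[idx[c]] == t2[idx[c]] for c in x):
                # Take X and Y from t1 and the remaining attributes from t2.
                mixed = tuple(t2[idx[c]] if c in z else t1[idx[c]] for c in columns)
                if mixed not in present:
                    return False
    return True

columns = ("person", "mobile", "food_likes")
# Hypothetical rows: every mobile of P1 is paired with every food P1 likes.
full = [("P1", "M1", "F1"), ("P1", "M1", "F2"),
        ("P1", "M2", "F1"), ("P1", "M2", "F2")]
partial = full[:3]  # one combination removed

print(mvd_holds(full, columns, ["person"], ["mobile"]))
print(mvd_holds(partial, columns, ["person"], ["mobile"]))
```

With the full set of combinations the dependency holds; removing one combination breaks it, because mobile and food_likes are no longer independent for that person.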
Fourth Normal Form (4NF)
• It builds on the first three normal forms (1NF, 2NF, and 3NF) and
the Boyce-Codd Normal Form (BCNF).
SID  SNAME  CID  CNAME
S1   A      C2   D
S2   B      C1   C
S2   B      C2   D
Multivalued dependencies (MVD) are:
• SID->->CID;
• SID->->CNAME;
• SNAME->->CNAME
Join Dependency
• A join dependency is a further generalization of multivalued
dependencies. If the join of R1 and R2 over C is equal to relation R,
then we can say that a join dependency (JD) exists,
• where R1(A, B, C) and R2(C, D) are the decomposition of a
given relation R(A, B, C, D).
• Alternatively, R1 and R2 are a lossless decomposition of R. A JD ⋈ {R1,
R2, …, Rn} is said to hold over a relation R if R1, R2, …, Rn is a
lossless-join decomposition.
• *((A, B, C), (C, D)) is a JD of R if the join of R1 and R2 over their common
attribute C is equal to the relation R. Here, *(R1, R2, R3, …) indicates that
the relations R1, R2, R3 and so on are a JD of R. Let R be a relation schema
and R1, R2, R3, …, Rn be a decomposition of R. An instance r(R) is said to
satisfy the join dependency if and only if
r = π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r)
Join Dependency
Example:
Table R1 (Company, Product), Table R2 (Agent, Company), Table R3 (Agent, Product)
Table R1
Company  Product
C1       Pendrive
C1       mic
C2       speaker
C1       speaker
Agent->->Product
Fifth Normal Form/Project-Join Normal Form (5NF)
Table ACP
Agent  Company  Product
A1     PQR      Nut
A1     PQR      Bolt
A1     XYZ      Nut
A1     XYZ      Bolt
A2     PQR      Nut
The relation ACP is again decomposed into 3 relations. Now, the natural join of all three
relations will be shown as:
Decomposition
Table R1
Agent  Company
A1     PQR
A1     XYZ
A2     PQR
Table R2
Agent  Product
A1     Nut
A1     Bolt
A2     Nut
Table R3
Company  Product
PQR      Nut
PQR      Bolt
XYZ      Nut
XYZ      Bolt
The result of the natural join of R1 and R3 over ‘Company’, followed by the natural join of
the intermediate result R13 and R2 over ‘Agent’ and ‘Product’, is Table ACP.
Hence, in this example, the redundancies are eliminated, and the decomposition of ACP
is a lossless-join decomposition. Therefore, the decomposed relations R1, R2 and R3 are
in 5NF, as joining them does not violate the lossless-join property.
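The lossless-join claim can be verified by projecting ACP onto the three pairs of columns and joining the projections back together. The sketch below uses the ACP rows from this example; project and natural_join are hypothetical helpers that represent a relation as a (columns, set of rows) pair.

```python
def project(relation, cols):
    """Keep only `cols`; duplicates disappear because rows are stored in a set."""
    columns, rows = relation
    idx = [columns.index(c) for c in cols]
    return (tuple(cols), {tuple(row[i] for i in idx) for row in rows})

def natural_join(left, right):
    """Join two relations on all the column names they share."""
    lcols, lrows = left
    rcols, rrows = right
    shared = [c for c in lcols if c in rcols]
    extra = [c for c in rcols if c not in lcols]
    out = set()
    for l in lrows:
        lrow = dict(zip(lcols, l))
        for r in rrows:
            rrow = dict(zip(rcols, r))
            if all(lrow[c] == rrow[c] for c in shared):
                out.add(tuple(l) + tuple(rrow[c] for c in extra))
    return (tuple(lcols) + tuple(extra), out)

acp = (
    ("Agent", "Company", "Product"),
    {
        ("A1", "PQR", "Nut"), ("A1", "PQR", "Bolt"),
        ("A1", "XYZ", "Nut"), ("A1", "XYZ", "Bolt"),
        ("A2", "PQR", "Nut"),
    },
)
r1 = project(acp, ("Agent", "Company"))
r2 = project(acp, ("Agent", "Product"))
r3 = project(acp, ("Company", "Product"))

r13 = natural_join(r1, r3)        # joined over Company
rejoined = natural_join(r13, r2)  # joined over Agent and Product
print(len(r13[1]), len(rejoined[1]))
print(rejoined[1] == acp[1])
```

The intermediate join R13 contains one extra row, (A2, PQR, Bolt), which the final join with R2 filters out, leaving exactly the rows of ACP.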
Thank you!!