Week 5 Normalization Complete Aa
Lecture 07
Schema Refinement and Normalization
LECTURE PLAN
Purpose of Normalisation
Redundancy and Data Anomalies
Repeating Groups
Functional Dependency
Transitive Dependency
Stages of Normalisation
Database Normalization
Database normalization is the process of removing
redundant data from your tables in order to improve storage
efficiency, data integrity, and scalability.
In the relational model, methods exist for quantifying
how efficient a database is. These classifications are
called normal forms (or NF), and there are
algorithms for converting a given database between
them.
Normalization generally involves splitting existing
tables into multiple ones, which must be re-joined or
linked each time a query is issued.
History
Normal Form
Normalization
“To create relations where every dependency is on the key, the whole key, and
nothing but the key”
Purpose of Normalisation
To avoid redundancy by storing each ‘fact’ within the
database only once.
To put data into a form that conforms to relational
principles (e.g., single valued attributes, each relation
represents one entity) - no repeating groups.
To put the data into a form that is more able to
accurately accommodate change.
To avoid certain updating ‘anomalies’.
To facilitate the enforcement of data constraints.
Redundancy and Data Anomalies
Redundant data is where we have stored the same ‘information’
more than once, i.e., the redundant data could be removed without
the loss of information.
Example: We have the following relation that contains staff and department details
and a list of telephone contact numbers for each member of staff.

staffNo  job       dept  dname       city       contact number
SL10     Salesman  10    Sales       Stratford  018111777, 018111888, 079311122
SA51     Manager   20    Accounts    Barking    017111777
DS40     Clerk     20    Accounts    Barking
OS45     Clerk     30    Operations  Barking    079311555
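The redundancy in this relation can be sketched in a few lines of Python. This is only an illustration: the rows come from the table above with the contact numbers left out, and the names `staff_dept`, `departments` and `staff` are assumptions made for the example. It stores each department 'fact' once and then checks that re-joining loses nothing.

```python
# Rows from the staff/department relation (contact numbers omitted).
staff_dept = [
    ("SL10", "Salesman", 10, "Sales", "Stratford"),
    ("SA51", "Manager", 20, "Accounts", "Barking"),
    ("DS40", "Clerk", 20, "Accounts", "Barking"),
    ("OS45", "Clerk", 30, "Operations", "Barking"),
]

# Store each department fact once, keyed by dept.
departments = {dept: (dname, city) for _, _, dept, dname, city in staff_dept}

# Staff rows keep only the dept number as a link to the department.
staff = [(staff_no, job, dept) for staff_no, job, dept, _, _ in staff_dept]

# Re-joining on dept recovers the original rows, so no information was lost.
rejoined = [(s, j, d, *departments[d]) for s, j, d in staff]
lossless = rejoined == staff_dept
```

Department 20 appears twice in the original rows but only once in `departments`, which is the sense in which the repeated copy was redundant.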
Functional Dependency
Formal Definition: Attribute B is functionally dependent upon
attribute A (or a collection of attributes) if a value of A determines a
single value of attribute B at any one time.
Formal Notation: A → B. This should be read as ‘A determines B’
or ‘B is functionally dependent on A’. A is called the determinant
and B is called the object of the determinant.
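A minimal sketch of how this definition can be tested against sample rows, assuming the staff data from the redundancy example. The helper name `fd_holds` is hypothetical. Note that a functional dependency is a rule about all valid data, so sample rows can show that a dependency fails but can never prove that it always holds.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if each determinant value maps to a single dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales", "city": "Stratford"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts", "city": "Barking"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts", "city": "Barking"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations", "city": "Barking"},
]

dept_determines_dname = fd_holds(staff, ["dept"], ["dname"])  # holds on these rows
city_determines_dept = fd_holds(staff, ["city"], ["dept"])    # fails: Barking has depts 20 and 30
```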
Functional Dependency
Compound Determinants: If more than one attribute is necessary to
determine another attribute in an entity, then such a determinant is
termed a composite determinant.
Example: Full Functional Dependencies

order#  line#  qty  price
A001    001    10   200
A002    001    20   400
A002    002    20   800
A004    001    15   300

(order#, line#) → qty
(order#, line#) → price
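The idea of a full functional dependency can be sketched as a check that the composite determinant works while none of its proper subsets does. The function names `fd_holds` and `is_full_fd` are assumptions for this illustration, and the rows are the four order lines from the example.

```python
from itertools import combinations

def fd_holds(rows, determinant, dependent):
    """True if each determinant value maps to one value of the dependent attribute."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True

def is_full_fd(rows, determinant, dependent):
    """True if determinant -> dependent holds and no proper subset of it does."""
    if not fd_holds(rows, determinant, dependent):
        return False
    for size in range(1, len(determinant)):
        for subset in combinations(determinant, size):
            if fd_holds(rows, subset, dependent):
                return False
    return True

order_lines = [
    {"order#": "A001", "line#": "001", "qty": 10, "price": 200},
    {"order#": "A002", "line#": "001", "qty": 20, "price": 400},
    {"order#": "A002", "line#": "002", "qty": 20, "price": 800},
    {"order#": "A004", "line#": "001", "qty": 15, "price": 300},
]

# Neither order# nor line# alone determines price, so the dependency is full.
price_is_full = is_full_fd(order_lines, ("order#", "line#"), "price")

# On these four rows order# alone happens to determine qty (both A002 lines
# have qty 20). That is a coincidence of the sample, not a business rule.
qty_is_full_on_sample = is_full_fd(order_lines, ("order#", "line#"), "qty")
```

The second result is a reminder that dependencies should be decided from the meaning of the data, with sample rows used only as a sanity check.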
Functional Dependency
Partial Functional Dependency: This is the situation that exists if it
is necessary to only use a subset of the attributes of the composite
determinant to identify its object uniquely.
Repetition of data!
Transitive Dependency
Definition: A transitive dependency exists when there is an
intermediate functional dependency: if A → B and B → C, then C is
transitively dependent on A via B.
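One way to make this concrete is to compute the closure of an attribute set, that is, everything it determines directly or through intermediate dependencies. The sketch below uses the standard attribute-closure procedure. The function name `closure` and the choice of the dependencies {Module} → {Lecturer} and {Lecturer} → {Dept} as input are assumptions for the illustration.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [({"Module"}, {"Lecturer"}), ({"Lecturer"}, {"Dept"})]

# Dept is reached only through Lecturer, so Module -> Dept is transitive.
module_closure = closure({"Module"}, fds)
```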
Normalisation - Relational Model
In order to comply with the relational model it is necessary to 1) remove
repeating groups and 2) avoid redundancy and data anomalies by removing
partial and transitive functional dependencies.
THE KEY, THE WHOLE KEY, AND NOTHING BUT THE KEY!
Stages of Normalisation
Unnormalised form (UNF)
    Remove repeating groups
First normal form (1NF)
    Remove partial dependencies
Second normal form (2NF)
    Remove transitive dependencies
Third normal form (3NF)
    Remove remaining functional dependency anomalies
Boyce-Codd normal form (BCNF)
    Remove multivalued dependencies
Fourth normal form (4NF)
    Remove remaining anomalies
Fifth normal form (5NF)
Unnormalised Normal Form (UNF)
Definition: A relation is unnormalised when it has not had any
normalisation rules applied to it, and it suffers from various anomalies.
This only tends to occur where the relation has been designed using a
‘bottom-up approach’, i.e., the capturing of attributes to a ‘Universal
Relation’ from a screen layout, manual report, manual document, etc.
Unnormalised Normal Form (UNF)
ORDER
Customer No: 001964        Order Number: 00012345
Name: Mark Campbell        Order Date: 14-Feb-2002
Address: 1 The House
         Leytonstone
         E11 9ZZ
1. Remove the outermost repeating group (and any nested repeating groups it may
contain) and create a new relation to contain it. (Rename the original to indicate 1NF.)
ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
(prod-no, prod-desc, unit-price, ord-qty, line-total)
2. Add to this relation a copy of the PK of the relation immediately enclosing it.
ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
(order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
3. Name the new entity (appending the number 1 to indicate 1NF).
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
4. Determine the PK of the new entity.
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total), PK: (order-no, prod-no)
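The four steps can be sketched in Python as pulling the repeating group out of a nested order record. The header values come from the ORDER form. The two product lines are hypothetical, since the form's product rows are not reproduced in the text, and cust-add, order-total and line-total are left out for brevity. The function name `to_1nf` is also an assumption.

```python
unnormalised_order = {
    "order-no": "00012345",
    "order-date": "14-Feb-2002",
    "cust-no": "001964",
    "cust-name": "Mark Campbell",
    "lines": [  # hypothetical line items forming the repeating group
        {"prod-no": "P1", "prod-desc": "Widget", "unit-price": 2.50, "ord-qty": 4},
        {"prod-no": "P2", "prod-desc": "Gadget", "unit-price": 10.00, "ord-qty": 1},
    ],
}

def to_1nf(order):
    """Split one nested order into an ORDER-1 row and its ORDER-LINE-1 rows."""
    # Step 1: the order header without the repeating group.
    order_1 = {k: v for k, v in order.items() if k != "lines"}
    # Steps 2 and 3: each line gets a copy of the enclosing PK (order-no).
    order_line_1 = [{"order-no": order["order-no"], **line} for line in order["lines"]]
    return order_1, order_line_1

order_1, order_line_1 = to_1nf(unnormalised_order)

# Step 4: (order-no, prod-no) identifies each ORDER-LINE-1 row.
line_keys = [(row["order-no"], row["prod-no"]) for row in order_line_1]
```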
Normalisation to 1NF
Example 2
To convert to a 1NF relation, split up any non-atomic values.

Unnormalised
Module  Dept  Lecturer  Texts
M1      D1    L1        T1, T2
M2      D1    L1        T1, T3
M3      D1    L2        T4
M4      D2    L3        T1, T5
M5      D2    L4        T6

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6
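Splitting the non-atomic values can be sketched as a simple expansion of each row. The variable names are assumptions, and the comma-separated strings stand in for the multi-valued Texts column.

```python
# Unnormalised rows: Module, Dept, Lecturer, Texts (non-atomic).
unnormalised = [
    ("M1", "D1", "L1", "T1, T2"),
    ("M2", "D1", "L1", "T1, T3"),
    ("M3", "D1", "L2", "T4"),
    ("M4", "D2", "L3", "T1, T5"),
    ("M5", "D2", "L4", "T6"),
]

# One output row per (module, text) pair, so every value is atomic.
first_nf = [
    (module, dept, lecturer, text.strip())
    for module, dept, lecturer, texts in unnormalised
    for text in texts.split(",")
]
```

The five unnormalised rows become eight 1NF rows, with Dept and Lecturer repeated for every text of a module.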
Problems in 1NF
1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6

• INSERT anomalies
  – Can't add a module with no texts
• UPDATE anomalies
  – To change lecturer for M1, we have to change two rows
• DELETE anomalies
  – If we remove M3, we remove L2 as well
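The update and delete anomalies can be observed directly on the 1NF rows. This is a small sketch with assumed variable names. The insert anomaly is not shown, since it depends on the key (presumably Module plus Text) not allowing a missing Text.

```python
# 1NF rows: Module, Dept, Lecturer, Text.
first_nf = [
    ("M1", "D1", "L1", "T1"),
    ("M1", "D1", "L1", "T2"),
    ("M2", "D1", "L1", "T1"),
    ("M2", "D1", "L1", "T3"),
    ("M3", "D1", "L2", "T4"),
    ("M4", "D2", "L3", "T1"),
    ("M4", "D2", "L3", "T5"),
    ("M5", "D2", "L4", "T6"),
]

# UPDATE anomaly: M1's lecturer is stored once per text, so two rows must change.
rows_to_update = [row for row in first_nf if row[0] == "M1"]

# DELETE anomaly: removing module M3 also removes the only mention of lecturer L2.
after_delete = [row for row in first_nf if row[0] != "M3"]
lecturers_left = {row[2] for row in after_delete}
```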
Second Normal Form (2NF)
Definition: A relation is in 2NF if, and only if, it is in 1NF and every
non-key attribute is fully dependent on the primary key.
Remove partial functional dependencies into a new relation.
Example - 1NF to 2NF (Example 1)
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
1. Remove the offending attributes that are only partially functionally dependent on
the composite key, and place them in a new relation.
ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
(prod-desc, unit-price)
2. Add to this relation a copy of the attribute(s) which determines these offending
attributes. These will automatically become the primary key of this new relation.
ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
(prod-no, prod-desc, unit-price)
3. Name the new entity (appending the number 2 to indicate 2NF)
PRODUCT-2 (prod-no, prod-desc, unit-price)
4. Rename the original entity (ending with a 2 to indicate 2NF)
ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total)
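The decomposition above amounts to two projections of ORDER-LINE-1. The sketch below assumes hypothetical order lines (the source gives no sample rows), leaves out line-total for brevity, and uses an assumed helper named `project`.

```python
def project(rows, attributes):
    """Project rows onto the given attributes, dropping duplicate results."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result

# Hypothetical ORDER-LINE-1 rows: product P1 appears on two orders.
order_line_1 = [
    {"order-no": "O1", "prod-no": "P1", "prod-desc": "Widget", "unit-price": 2.5, "ord-qty": 4},
    {"order-no": "O1", "prod-no": "P2", "prod-desc": "Gadget", "unit-price": 10.0, "ord-qty": 1},
    {"order-no": "O2", "prod-no": "P1", "prod-desc": "Widget", "unit-price": 2.5, "ord-qty": 7},
]

# prod-desc and unit-price depend on prod-no alone, so they move to PRODUCT-2.
product_2 = project(order_line_1, ["prod-no", "prod-desc", "unit-price"])

# ORDER-LINE-2 keeps only what depends on the whole key (order-no, prod-no).
order_line_2 = project(order_line_1, ["order-no", "prod-no", "ord-qty"])
```

The description and price of P1 are now stored once in `product_2` instead of once per order line.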
Bringing a Relation to 2NF
[Figure: STUDENT relation with a composite primary key and its 2NF decomposition (relations labelled STUDENT and STUDENT COURSE)]
Example - 2NF to 3NF
ORDER-2 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
1. Remove the offending attributes that are transitively dependent on non-key
attributes, and place them in a new relation.
ORDER-2 (order-no, order-date, cust-no, order-total)
(cust-name, cust-add)
2. Add to this relation a copy of the attribute(s) which determines these offending
attributes. These will automatically become the primary key of this new relation.
ORDER-2 (order-no, order-date, cust-no, order-total)
(cust-no, cust-name, cust-add)
Bringing a Relation to 3NF
(Example 2)
[Figure: EMPLOYEE relation with a transitive dependency, decomposed into EMPLOYEE and DEPARTMENT]

DEPARTMENT
Dept_ID  Dept_Name
1        Acct
2        Mktg
Third Normal Form
Example 3
• 2NFa is not in 3NF
  – We have the FDs
      {Module} → {Lecturer}
      {Lecturer} → {Dept}
  – So there is a transitive FD from the primary key {Module} to {Dept}

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4
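Removing the transitive dependency from 2NFa can be sketched as two projections, one per functional dependency, followed by a check that joining them back on Lecturer reproduces the original rows. The relation names `module_lecturer` and `lecturer_dept` are assumptions for the illustration.

```python
# 2NFa rows: Module, Dept, Lecturer.
two_nf_a = [
    ("M1", "D1", "L1"),
    ("M2", "D1", "L1"),
    ("M3", "D1", "L2"),
    ("M4", "D2", "L3"),
    ("M5", "D2", "L4"),
]

# One relation per functional dependency; sets remove duplicate rows.
module_lecturer = sorted({(m, l) for m, _, l in two_nf_a})  # Module -> Lecturer
lecturer_dept = sorted({(l, d) for _, d, l in two_nf_a})    # Lecturer -> Dept

# Join the two relations on Lecturer and compare with the original.
dept_of = dict(lecturer_dept)
rejoined = sorted((m, dept_of[l], l) for m, l in module_lecturer)
lossless = rejoined == sorted(two_nf_a)
```

Each lecturer's department is now recorded once, so L1's department no longer appears on both the M1 and M2 rows.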
2NF to 3NF (Example 3)