Week 5 Normalization Complete Aa

The document discusses database normalization and schema refinement. It covers the purpose of normalization, which is to remove redundancy and avoid data anomalies. Various normal forms are introduced, including 1NF, 2NF, 3NF and BCNF. The stages of normalization are outlined, which involve removing repeating groups and dependencies to comply with the relational model. Functional dependencies and transitive dependencies are defined.

Uploaded by

Rehman Aziz
Copyright
© All Rights Reserved

Database Systems

Lecture 07
Schema Refinement and Normalization

Chapter 7, Database System Concepts, Silberschatz, Korth, Sudarshan


Chapter 10, Database Systems; concepts, designs and applications, S.K.Singh

LECTURE PLAN
• Purpose of Normalisation
• Redundancy and Data Anomalies
• Repeating Groups
• Functional Dependency
• Transitive Dependency
• Stages of Normalisation

Database Normalization
• Database normalization is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity, and scalability.
• In the relational model, methods exist for quantifying how efficient a database is. These classifications are called normal forms (or NF), and there are algorithms for converting a given database between them.
• Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued.

History

• Edgar F. Codd first proposed the process of normalization and what came to be known as the 1st normal form in his paper "A Relational Model of Data for Large Shared Data Banks". Codd stated:

“There is, in fact, a very simple elimination procedure which we shall call normalization. Through decomposition, non-simple domains are replaced by ‘domains whose elements are atomic (nondecomposable) values.’”

Normal Form

• Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF.
• There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications.

Normalization

• We discuss four normal forms: first, second, third, and Boyce-Codd normal forms
– 1NF, 2NF, 3NF, and BCNF
• Normalization is a process that “improves” a database design by generating relations that are of higher normal forms.
• The objective of normalization:

“To create relations where every dependency is on the key, the whole key, and nothing but the key”

Normalization

• There is a sequence to normal forms:
– 1NF is considered the weakest,
– 2NF is stronger than 1NF,
– 3NF is stronger than 2NF, and
– BCNF is considered the strongest

Normalization

[Diagram: 1NF, 2NF, 3NF and BCNF shown as nested sets]

• A relation in BCNF is also in 3NF
• A relation in 3NF is also in 2NF
• A relation in 2NF is also in 1NF

Purpose of Normalisation
• To avoid redundancy by storing each ‘fact’ within the database only once.
• To put data into a form that conforms to relational principles (e.g., single valued attributes, each relation represents one entity) - no repeating groups.
• To put the data into a form that is more able to accurately accommodate change.
• To avoid certain updating ‘anomalies’.
• To facilitate the enforcement of data constraints.

Redundancy and Data Anomalies
Redundant data is where we have stored the same ‘information’ more than once, i.e., the redundant data could be removed without the loss of information.
Example: We have the following relation that contains staff and department details:

staffNo  job       dept  dname       city
SL10     Salesman  10    Sales       Stratford
SA51     Manager   20    Accounts    Barking
DS40     Clerk     20    Accounts    Barking
OS45     Clerk     30    Operations  Barking

Such ‘redundancy’ could lead to the following ‘anomalies’:

Insert Anomaly: We can’t insert a dept without inserting a member of staff that works in that department.
Update Anomaly: We could change the name of the dept that SA51 works in without simultaneously changing the dept that DS40 works in.
Deletion Anomaly: By removing employee SL10 we have removed all information pertaining to the Sales dept.
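The deletion and update anomalies can be reproduced on the four sample rows. The following is only a minimal sketch: the helper names (`departments`, `delete_staff`, `rename_dept_for`) and the new department name "Finance" are assumptions made for illustration and do not come from the lecture.

```python
# Rows copied from the staff/department relation above:
# (staffNo, job, dept, dname, city)
STAFF_DEPT = [
    ("SL10", "Salesman", 10, "Sales", "Stratford"),
    ("SA51", "Manager", 20, "Accounts", "Barking"),
    ("DS40", "Clerk", 20, "Accounts", "Barking"),
    ("OS45", "Clerk", 30, "Operations", "Barking"),
]


def departments(rows):
    """Distinct (dept, dname, city) facts still recoverable from the rows."""
    return {(dept, dname, city) for _, _, dept, dname, city in rows}


def delete_staff(rows, staff_no):
    """Delete one member of staff, as a DELETE on staffNo would."""
    return [row for row in rows if row[0] != staff_no]


def rename_dept_for(rows, staff_no, new_name):
    """Change dname on a single staff row only (a careless UPDATE)."""
    return [
        (s, job, dept, new_name if s == staff_no else dname, city)
        for s, job, dept, dname, city in rows
    ]


# Deletion anomaly: removing SL10 also removes every fact about dept 10.
lost = departments(STAFF_DEPT) - departments(delete_staff(STAFF_DEPT, "SL10"))
print(lost)

# Update anomaly: "Finance" is a made-up name; dept 20 ends up with two names.
updated = rename_dept_for(STAFF_DEPT, "SA51", "Finance")
names_for_20 = {dname for _, _, dept, dname, _ in updated if dept == 20}
print(names_for_20)
```

Both effects disappear once department facts are stored once, in a relation of their own.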
Repeating Groups
A repeating group is an attribute (or set of attributes) that can have more than one value for a primary key value.

Example: We have the following relation that contains staff and department details and a list of telephone contact numbers for each member of staff.

staffNo  job       dept  dname       city       contact number
SL10     Salesman  10    Sales       Stratford  018111777, 018111888, 079311122
SA51     Manager   20    Accounts    Barking    017111777
DS40     Clerk     20    Accounts    Barking
OS45     Clerk     30    Operations  Barking    079311555

Repeating Groups are not allowed in a relational design, since all attributes have to be ‘atomic’ - i.e., there can only be one value per cell in a table!
Functional Dependency
Formal Definition: Attribute B is functionally dependent upon attribute A (or a collection of attributes) if a value of A determines a single value of attribute B at any one time.
Formal Notation: A → B. This should be read as ‘A determines B’ or ‘B is functionally dependent on A’. A is called the determinant and B is called the object of the determinant.

Example:

staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations

Functional Dependencies:
staffNo → job
staffNo → dept
staffNo → dname
dept → dname
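The definition can be tested mechanically against sample rows: A → B fails as soon as two rows agree on A but differ on B. The sketch below is illustrative only; the function name `fd_holds` is an assumption. A sample can refute a dependency but cannot prove that it holds for all future data.

```python
# Rows copied from the example table above.
STAFF = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]


def fd_holds(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


print(fd_holds(STAFF, ["staffNo"], "job"))  # True
print(fd_holds(STAFF, ["dept"], "dname"))   # True
print(fd_holds(STAFF, ["job"], "dept"))     # False: Clerks work in depts 20 and 30
```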

Functional Dependency
Composite Determinants: If more than one attribute is necessary to determine another attribute in an entity, then such a determinant is termed a composite determinant.

Full Functional Dependency: Only of relevance with composite determinants. This is the situation when it is necessary to use all the attributes of the composite determinant to identify its object uniquely.

Example:

order#  line#  qty  price
A001    001    10   200
A002    001    20   400
A002    002    20   800
A004    001    15   300

Full Functional Dependencies:
(order#, line#) → qty
(order#, line#) → price

Functional Dependency
Partial Functional Dependency: This is the situation that exists if it is necessary to only use a subset of the attributes of the composite determinant to identify its object uniquely.

Example:

student#  unit#  room   grade
9900100   A01    TH224  2
9900010   A01    TH224  14
9901011   A02    JS075  3
9900001   A01    TH224  16

Full Functional Dependencies:
(student#, unit#) → grade

Partial Functional Dependencies:
unit# → room

Repetition of data!
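One way to tell a full from a partial dependency is to test every proper subset of the composite determinant. The sketch below assumes the helper names `fd_holds` and `partial_determinants`. It also adds one hypothetical fifth row (student 9900100 taking unit A02), because in the four slide rows every student appears only once, so the sample alone cannot rule out student# → grade.

```python
from itertools import combinations

ENROLMENT = [
    {"student#": "9900100", "unit#": "A01", "room": "TH224", "grade": 2},
    {"student#": "9900010", "unit#": "A01", "room": "TH224", "grade": 14},
    {"student#": "9901011", "unit#": "A02", "room": "JS075", "grade": 3},
    {"student#": "9900001", "unit#": "A01", "room": "TH224", "grade": 16},
    # Hypothetical row, not in the slides: a student enrolled on a second unit.
    {"student#": "9900100", "unit#": "A02", "room": "JS075", "grade": 5},
]

KEY = ("student#", "unit#")


def fd_holds(rows, determinant, dependent):
    """True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True


def partial_determinants(rows, key, dependent):
    """Proper, non-empty subsets of `key` that already determine `dependent`."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            if fd_holds(rows, subset, dependent):
                found.append(subset)
    return found


partial_room = partial_determinants(ENROLMENT, KEY, "room")
partial_grade = partial_determinants(ENROLMENT, KEY, "grade")
print(partial_room)   # [('unit#',)]  room depends on part of the key
print(partial_grade)  # []            grade needs the whole key
```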

Transitive Dependency
Definition: A transitive dependency exists when there is an intermediate functional dependency.

Formal Notation: If A → B and B → C, then it can be stated that the following transitive dependency exists: A → B → C

Example:

staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations

Transitive Dependencies:
staffNo → dept
dept → dname
staffNo → dept → dname

Repetition of data!
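Transitive dependencies can be derived rather than listed by hand. The sketch below uses the standard attribute-closure procedure, which the slides do not cover; the function name `closure` and the encoding of each dependency as a (left side, right side) pair are assumptions. staffNo → dname is deliberately left out of the input to show that it follows from the other dependencies.

```python
# Direct dependencies from the example, as (determinant, dependents) pairs.
FDS = [
    (("staffNo",), ("job",)),
    (("staffNo",), ("dept",)),
    (("dept",), ("dname",)),
]


def closure(attributes, fds):
    """All attributes determined by `attributes`, applying the FDs repeatedly."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs once its whole left side is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


staff_closure = closure({"staffNo"}, FDS)
dept_closure = closure({"dept"}, FDS)
print(sorted(staff_closure))  # ['dept', 'dname', 'job', 'staffNo']
print(sorted(dept_closure))   # ['dept', 'dname']
```

Because dname is reached from staffNo only through dept, staffNo → dname is a transitive dependency.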

Normalisation - Relational Model
In order to comply with the relational model it is necessary to 1) remove repeating groups and 2) avoid redundancy and data anomalies by removing partial and transitive functional dependencies.

Relational Database Design: All attributes in a table must be atomic, and solely dependent upon the full primary key of that table.

THE KEY, THE WHOLE KEY, AND NOTHING BUT THE KEY!
Stages of Normalisation
Unnormalised form (UNF)
  ↓ Remove repeating groups
First normal form (1NF)
  ↓ Remove partial dependencies
Second normal form (2NF)
  ↓ Remove transitive dependencies
Third normal form (3NF)
  ↓ Remove remaining functional dependency anomalies
Boyce-Codd normal form (BCNF)
  ↓ Remove multivalued dependencies
Fourth normal form (4NF)
  ↓ Remove remaining anomalies
Fifth normal form (5NF)
Unnormalised Normal Form (UNF)
Definition: A relation is unnormalised when it has not had any
normalisation rules applied to it, and it suffers from various anomalies.

This only tends to occur where the relation has been designed using a
‘bottom-up approach’. i.e., the capturing of attributes to a ‘Universal
Relation’ from a screen layout, manual report, manual document, etc...

Unnormalised Normal Form (UNF)
ORDER
Customer No: 001964          Order Number: 00012345
Name: Mark Campbell          Order Date: 14-Feb-2002
Address: 1 The House
         Leytonstone
         E11 9ZZ

Product Number  Product Description  Unit Price  Order Quantity  Line Total
T5060           Hook                 5.00        5               25.00
PT42            Bolt                 2.50        10              20.50
QZE48           Spanner              20.00       1               20.00

Order Total: 65.50

ORDER (order-no, order-date, cust-no, cust-name, cust-add, (prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)
First Normal Form (1NF)
Definition: A relation is in 1NF if, and only if, all its underlying attributes contain atomic values only.
Remove repeating groups into a new relation

A repeating group is shown by a pair of brackets within the relational schema.

ORDER (order-no, order-date, cust-no, cust-name, cust-add, (prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)

Steps from UNF to 1NF:

• Remove the outermost repeating group (and any nested repeated groups it may contain) and create a new relation to contain it.
• Add to this relation a copy of the PK of the relation immediately enclosing it.
• Name the new entity (appending the number 1 to indicate 1NF)
• Determine the PK of the new entity
• Repeat steps until no more repeating groups.
Example - UNF to 1NF
ORDER (order-no, order-date, cust-no, cust-name, cust-add, (prod-no, prod-desc, unit-price, ord-qty, line-total)*, order-total)

1. Remove the outermost repeating group (and any nested repeated groups it may contain) and create a new relation to contain it. (Rename the original to indicate 1NF.)
ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
(prod-no, prod-desc, unit-price, ord-qty, line-total)
2. Add to this relation a copy of the PK of the relation immediately enclosing it.
ORDER-1 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
(order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
3. Name the new entity (appending the number 1 to indicate 1NF)
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
4. Determine the PK of the new entity
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total), with PK (order-no, prod-no)
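The four steps can be sketched on the sample order form. This is only an illustration: the nested-dictionary layout, the key name "lines" for the repeating group, and the function name `to_1nf` are assumptions. Figures are kept as strings exactly as they appear on the form.

```python
# The unnormalised ORDER record, with the repeating group held under "lines".
UNF_ORDER = {
    "order-no": "00012345",
    "order-date": "14-Feb-2002",
    "cust-no": "001964",
    "cust-name": "Mark Campbell",
    "cust-add": "1 The House, Leytonstone, E11 9ZZ",
    "order-total": "65.50",
    "lines": [
        {"prod-no": "T5060", "prod-desc": "Hook", "unit-price": "5.00",
         "ord-qty": 5, "line-total": "25.00"},
        {"prod-no": "PT42", "prod-desc": "Bolt", "unit-price": "2.50",
         "ord-qty": 10, "line-total": "20.50"},
        {"prod-no": "QZE48", "prod-desc": "Spanner", "unit-price": "20.00",
         "ord-qty": 1, "line-total": "20.00"},
    ],
}


def to_1nf(order, group="lines", pk="order-no"):
    """Split one repeating group out of `order` into its own relation."""
    # Step 1: the original relation keeps everything except the repeating group.
    order_1 = {attr: value for attr, value in order.items() if attr != group}
    # Step 2: each row of the new relation gets a copy of the enclosing PK.
    order_line_1 = [{pk: order[pk], **line} for line in order[group]]
    return order_1, order_line_1


order_1, order_line_1 = to_1nf(UNF_ORDER)

# Step 4: (order-no, prod-no) identifies each ORDER-LINE-1 row uniquely.
line_keys = {(line["order-no"], line["prod-no"]) for line in order_line_1}
print(len(order_line_1), len(line_keys))  # 3 3
```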
Normalisation to 1NF
Example 2
To convert to a 1NF relation, split up any non-atomic values

Unnormalised
Module  Dept  Lecturer  Texts
M1      D1    L1        T1, T2
M2      D1    L1        T1, T3
M3      D1    L2        T4
M4      D2    L3        T1, T5
M5      D2    L4        T6

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6
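Splitting the non-atomic Texts column can be sketched in a few lines. The tuple layout and the function name `split_texts` are assumptions for illustration.

```python
# Unnormalised rows: (Module, Dept, Lecturer, Texts), Texts is a comma-separated list.
UNNORMALISED = [
    ("M1", "D1", "L1", "T1, T2"),
    ("M2", "D1", "L1", "T1, T3"),
    ("M3", "D1", "L2", "T4"),
    ("M4", "D2", "L3", "T1, T5"),
    ("M5", "D2", "L4", "T6"),
]


def split_texts(rows):
    """Emit one row per (module, text) pair so that every cell holds one value."""
    return [
        (module, dept, lecturer, text.strip())
        for module, dept, lecturer, texts in rows
        for text in texts.split(",")
    ]


FIRST_NF = split_texts(UNNORMALISED)
for row in FIRST_NF:
    print(row)
```

The result has eight rows, and Module alone no longer identifies a row; the key becomes (Module, Text).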
Problems in 1NF
• INSERT anomalies
– Can't add a module with no texts
• UPDATE anomalies
– To change lecturer for M1, we have to change two rows
• DELETE anomalies
– If we remove M3, we remove L2 as well

1NF
Module  Dept  Lecturer  Text
M1      D1    L1        T1
M1      D1    L1        T2
M2      D1    L1        T1
M2      D1    L1        T3
M3      D1    L2        T4
M4      D2    L3        T1
M4      D2    L3        T5
M5      D2    L4        T6
Second Normal Form (2NF)
Definition: A relation is in 2NF if, and only if, it is in 1NF and every non-key attribute is fully dependent on the primary key.
Remove partial functional dependencies into a new relation

Steps from 1NF to 2NF:

• Remove the offending attributes that are only partially functionally dependent on the composite key, and place them in a new relation.
• Add to this relation a copy of the attribute(s) which are the determinants of these offending attributes. These will automatically become the primary key of this new relation.
• Name the new entity (appending the number 2 to indicate 2NF)
• Rename the original entity (ending with a 2 to indicate 2NF)
Example - 1NF to 2NF (Example 1)
ORDER-LINE-1 (order-no, prod-no, prod-desc, unit-price, ord-qty, line-total)
1. Remove the offending attributes that are only partially functionally dependent on the composite key, and place them in a new relation.
ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
(prod-desc, unit-price)
2. Add to this relation a copy of the attribute(s) which determines these offending attributes. These will automatically become the primary key of this new relation.
ORDER-LINE-1 (order-no, prod-no, ord-qty, line-total)
(prod-no, prod-desc, unit-price)
3. Name the new entity (appending the number 2 to indicate 2NF)
PRODUCT-2 (prod-no, prod-desc, unit-price)
4. Rename the original entity (ending with a 2 to indicate 2NF)
ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total)
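The split into PRODUCT-2 and ORDER-LINE-2 amounts to two relational projections. In the sketch below the function name `project` is an assumption, and the fourth row (order 00012346) is hypothetical: it is added only so that a product appears on two orders and the removal of repeated product data becomes visible.

```python
ORDER_LINE_1 = [
    {"order-no": "00012345", "prod-no": "T5060", "prod-desc": "Hook",
     "unit-price": "5.00", "ord-qty": 5, "line-total": "25.00"},
    {"order-no": "00012345", "prod-no": "PT42", "prod-desc": "Bolt",
     "unit-price": "2.50", "ord-qty": 10, "line-total": "20.50"},
    {"order-no": "00012345", "prod-no": "QZE48", "prod-desc": "Spanner",
     "unit-price": "20.00", "ord-qty": 1, "line-total": "20.00"},
    # Hypothetical second order, not on the slides.
    {"order-no": "00012346", "prod-no": "T5060", "prod-desc": "Hook",
     "unit-price": "5.00", "ord-qty": 2, "line-total": "10.00"},
]


def project(rows, attributes):
    """Relational projection: keep `attributes` and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {attr: row[attr] for attr in attributes}
        if projected not in result:
            result.append(projected)
    return result


# Attributes that depend on prod-no alone move to PRODUCT-2.
PRODUCT_2 = project(ORDER_LINE_1, ["prod-no", "prod-desc", "unit-price"])
# Attributes that need the whole key stay in ORDER-LINE-2.
ORDER_LINE_2 = project(ORDER_LINE_1, ["order-no", "prod-no", "ord-qty", "line-total"])

print(len(PRODUCT_2))     # 3: "Hook" is now stored once
print(len(ORDER_LINE_2))  # 4: one row per order line
```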
Bringing a Relation to 2NF

STUDENT (composite primary key: Stud_ID, Course_ID)

Stud_ID  Name     Course_ID  Units
101      Lennon   MSI 250    3.00
101      Lennon   MSI 415    3.00
125      Johnson  MSI 331    3.00
Bringing a Relation to 2NF
• Goal: Remove Partial Dependencies
Partial dependencies on the composite primary key: Stud_ID → Name and Course_ID → Units
Bringing a Relation to 2NF
• Remove attributes that are dependent on part but not the whole of the primary key from the original relation. For each partial dependency, create a new relation, with the corresponding part of the primary key from the original as the primary key.
Bringing a Relation to 2NF

STUDENT_COURSE
Stud_ID  Course_ID
101      MSI 250
101      MSI 415
125      MSI 331

STUDENT
Stud_ID  Name
101      Lennon
125      Johnson

COURSE
Course_ID  Units
MSI 250    3.00
MSI 415    3.00
MSI 331    3.00
Problems Resolved in 2NF
• Problems in 1NF
– INSERT – Can't add a module with no texts
– UPDATE – To change lecturer for M1, we have to change two rows
– DELETE – If we remove M3, we remove L2 as well
• In 2NF the first two are resolved, but not the third one

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4
Problems Remaining in 2NF
• INSERT anomalies
– Can't add lecturers who teach no modules
• UPDATE anomalies
– To change the department for L1 we must alter two rows
• DELETE anomalies
– If we delete M3 we delete L2 as well

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4
Third Normal Form (3NF)
Definition: A relation is in 3NF if, and only if, it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.
Remove transitive dependencies into a new relation

Steps from 2NF to 3NF:

• Remove the offending attributes that are transitively dependent on non-key attribute(s), and place them in a new relation.
• Add to this relation a copy of the attribute(s) which are the determinants of these offending attributes. These will automatically become the primary key of this new relation.
• Name the new entity (appending the number 3 to indicate 3NF)
• Rename the original entity (ending with a 3 to indicate 3NF)
Example - 2NF to 3NF
ORDER-2 (order-no, order-date, cust-no, cust-name, cust-add, order-total)
1. Remove the offending attributes that are transitively dependent on non-key attributes, and place them in a new relation.
ORDER-2 (order-no, order-date, cust-no, order-total)
(cust-name, cust-add)
2. Add to this relation a copy of the attribute(s) which determines these offending attributes. These will automatically become the primary key of this new relation.
ORDER-2 (order-no, order-date, cust-no, order-total)
(cust-no, cust-name, cust-add)
3. Name the new entity (appending the number 3 to indicate 3NF)
CUSTOMER-3 (cust-no, cust-name, cust-add)
4. Rename the original entity (ending with a 3 to indicate 3NF)
ORDER-3 (order-no, order-date, cust-no, order-total)
Example - Relations in 3NF
ORDER-3 (order-no, order-date, cust-no, order-total)

CUSTOMER-3 (cust-no, cust-name, cust-add)

PRODUCT-2 (prod-no, prod-desc, unit-price)

ORDER-LINE-2 (order-no, prod-no, ord-qty, line-total)

[Diagram: entity-relationship diagram linking CUSTOMER (cust-no), ORDER (order-no), ORDER-LINE (order-no, prod-no) and PRODUCT (prod-no); relationship labels: places, placed by, contains, part of, shows, belongs to]
Bringing a Relation to 3NF
(Example 2)

• Goal: Get rid of transitive dependencies.

EMPLOYEE (transitive dependency: Emp_ID → Dept_ID → Dept_Name)

Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sarah   Smith   2        Mktg
Bringing a Relation to 3NF
• Remove the attributes, which are dependent on a non-
key attribute, from the original relation. For each
transitive dependency, create a new relation with the
non-key attribute which is a determinant in the transitive
dependency as a primary key, and the dependent non-key
attribute as a dependent.

Bringing a Relation to 3NF

EMPLOYEE

Emp_ID  F_Name  L_Name  Dept_ID
111     Mary    Jones   1
122     Sarah   Smith   2

DEPARTMENT

Dept_ID  Dept_Name
1        Acct
2        Mktg
Third Normal Form
Example 3
• 2NFa is not in 3NF
– We have the FDs
  {Module} → {Lecturer}
  {Lecturer} → {Dept}
– So there is a transitive FD from the primary key {Module} to {Dept}

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4
2NF to 3NF (Example 3)

2NFa
Module  Dept  Lecturer
M1      D1    L1
M2      D1    L1
M3      D1    L2
M4      D2    L3
M5      D2    L4

3NFa
Lecturer  Dept
L1        D1
L2        D1
L3        D2
L4        D2

3NFb
Module  Lecturer
M1      L1
M2      L1
M3      L2
M4      L3
M5      L4
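A decomposition is only useful if the original relation can be rebuilt from its parts. As a minimal sketch (the set-of-tuples representation and the variable names are assumptions), the two 3NF relations can be produced by projection and then joined back on Lecturer to confirm that nothing was lost or invented.

```python
# 2NFa from the slides: (Module, Dept, Lecturer)
TWO_NF_A = {
    ("M1", "D1", "L1"),
    ("M2", "D1", "L1"),
    ("M3", "D1", "L2"),
    ("M4", "D2", "L3"),
    ("M5", "D2", "L4"),
}

# Project onto the two 3NF schemas; sets remove duplicate rows automatically.
THREE_NF_A = {(lecturer, dept) for _, dept, lecturer in TWO_NF_A}      # Lecturer, Dept
THREE_NF_B = {(module, lecturer) for module, _, lecturer in TWO_NF_A}  # Module, Lecturer

# Natural join of 3NFb and 3NFa on the shared attribute Lecturer.
rejoined = {
    (module, dept, lecturer)
    for module, lecturer in THREE_NF_B
    for other_lecturer, dept in THREE_NF_A
    if lecturer == other_lecturer
}

print(len(THREE_NF_A))       # 4: each lecturer's department is stored once
print(rejoined == TWO_NF_A)  # True: the join reproduces 2NFa exactly
```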
Problems Resolved in 3NF
• Problems in 2NF
– INSERT – Can't add lecturers who teach no modules
– UPDATE – To change the department for L1 we must alter two rows
– DELETE – If we delete M3 we delete L2 as well
• In 3NF all of these are resolved (for this relation – but 3NF can still have anomalies!)

3NFa
Lecturer  Dept
L1        D1
L2        D1
L3        D2
L4        D2

3NFb
Module  Lecturer
M1      L1
M2      L1
M3      L2
M4      L3
M5      L4
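The three fixes can be checked directly on 3NFa and 3NFb. In this sketch each relation is held as a dictionary keyed by its primary key, which is an assumption about representation; lecturer "L5" and department "D3" are hypothetical values used only to exercise the operations.

```python
# 3NFa (Lecturer -> Dept) and 3NFb (Module -> Lecturer), keyed by primary key.
lecturer_dept = {"L1": "D1", "L2": "D1", "L3": "D2", "L4": "D2"}
module_lecturer = {"M1": "L1", "M2": "L1", "M3": "L2", "M4": "L3", "M5": "L4"}


def dept_of_module(module):
    """Look up a module's department through its lecturer (a join on Lecturer)."""
    return lecturer_dept[module_lecturer[module]]


# INSERT: a lecturer who teaches no modules can now be recorded ("L5" is made up).
lecturer_dept["L5"] = "D1"

# UPDATE: moving L1 to another department touches one row ("D3" is made up).
lecturer_dept["L1"] = "D3"

# DELETE: removing module M3 no longer removes lecturer L2.
del module_lecturer["M3"]

print(dept_of_module("M1"), dept_of_module("M2"))  # D3 D3
print("L2" in lecturer_dept)                       # True
```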
Normalisation and Design
• Normalisation is related to DB design
– A database should normally be in 3NF at least
– If your design leads to a non-3NF DB, then you might want to revise it
• When you find you have a non-3NF DB
– Identify the FDs that are causing a problem
– Think if they will lead to any insert, update, or delete anomalies
– Try to remove them