IS202 Data Management: Part 2: Normalization
Chapter 4
Part 2: Normalization
Well-Structured Relations
A relation that contains minimal data redundancy
and allows users to insert, delete, and update
rows without causing data inconsistencies
Goal is to avoid anomalies
Insertion Anomaly: adding new rows forces the user to
create duplicate data
Deletion Anomaly: deleting rows may cause a loss of
data that would be needed for other future rows
Modification Anomaly: changing data in a row forces
changes to other rows because of duplication
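The modification and deletion anomalies can be made concrete with a small sketch. The rows and attribute values below are hypothetical and only illustrate what goes wrong when a course title is repeated on every enrollment row.

```python
# Hypothetical un-normalized rows: the course title is repeated per enrollment.
rows = [
    {"EmpID": 100, "CourseID": "C1", "CourseTitle": "SPSS"},
    {"EmpID": 140, "CourseID": "C1", "CourseTitle": "SPSS"},
    {"EmpID": 110, "CourseID": "C2", "CourseTitle": "Java"},
]


def titles_for(course_id, table):
    """Return the set of distinct titles stored for one course."""
    return {r["CourseTitle"] for r in table if r["CourseID"] == course_id}


# Modification anomaly: updating only one of the duplicated rows
# leaves two different titles on record for the same course.
modified = [dict(r) for r in rows]
modified[0]["CourseTitle"] = "SPSS Basics"
inconsistent_titles = titles_for("C1", modified)

# Deletion anomaly: removing employee 110's only row also erases
# the fact that course C2 exists at all.
after_delete = [r for r in rows if r["EmpID"] != 110]
remaining_courses = {r["CourseID"] for r in after_delete}
```

An insertion anomaly is the mirror image: a new course cannot be recorded in this table until some employee enrolls in it.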
Data Normalization
A tool to validate and improve a logical design
so that it satisfies certain constraints that
avoid unnecessary duplication of data
Data Normalization
Functional Dependency
1st Normal Form (1NF)
2nd Normal Form (2NF)
3rd Normal Form (3NF)
Representing Functional Dependencies
Student
Text representation:
ID -> Name, AveGPA, Nationality
Email -> Name, AveGPA, Nationality
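A functional dependency such as ID -> Name, AveGPA, Nationality can be tested against sample rows. The sketch below is a minimal check; the `holds` helper and the student rows are hypothetical and are not taken from the slides. Sample data can only refute a dependency, never prove that it holds for every possible row.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same left-hand side must always map to the same right-hand side.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows for the Student relation.
students = [
    {"ID": 1, "Email": "a@example.edu", "Name": "Ann", "AveGPA": 3.5, "Nationality": "SG"},
    {"ID": 2, "Email": "b@example.edu", "Name": "Ben", "AveGPA": 3.5, "Nationality": "MY"},
    {"ID": 3, "Email": "c@example.edu", "Name": "Ann", "AveGPA": 3.9, "Nationality": "SG"},
]

dependents = ["Name", "AveGPA", "Nationality"]
id_determines = holds(students, ["ID"], dependents)
email_determines = holds(students, ["Email"], dependents)
# Two students named Ann have different GPAs, so Name -> AveGPA fails.
name_determines_gpa = holds(students, ["Name"], ["AveGPA"])
```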
Emp_Course
EmpID, CourseID -> DateCompleted
Course
CourseID -> CourseTitle
No partial functional dependencies
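The split into Emp_Course and Course can be sketched as two projections of one wider table. The `project` helper and the sample rows (including the dates) are hypothetical; the attribute names follow the relations above.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (order preserved)."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


# Hypothetical rows keyed by (EmpID, CourseID). CourseTitle depends on
# CourseID alone, which is a partial dependency on the composite key.
wide = [
    {"EmpID": 100, "CourseID": "C1", "CourseTitle": "SPSS", "DateCompleted": "2024-06-19"},
    {"EmpID": 100, "CourseID": "C2", "CourseTitle": "Surveys", "DateCompleted": "2024-10-07"},
    {"EmpID": 140, "CourseID": "C1", "CourseTitle": "SPSS", "DateCompleted": "2024-12-08"},
]

# 2NF: every non-key attribute now depends on the whole key of its relation.
emp_course = project(wide, ["EmpID", "CourseID", "DateCompleted"])
course = project(wide, ["CourseID", "CourseTitle"])
```

After the split, each course title is stored once in `course`, however many employees completed the course.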
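Order_ID -> Order_Date
Order_ID -> Customer_ID
Order_ID -> Customer_Name
Order_ID -> Customer_Address
All this is OK
(2nd NF)
BUT
Customer_ID -> Customer_Name
Customer_ID -> Customer_Address
so Customer_Name and Customer_Address depend on Order_ID
only transitively, through Customer_ID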
Order_ID -> Customer_ID
Customer_ID -> Customer_Name, Customer_Address
Now, there are no transitive dependencies
Both relations are in 3rd NF
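The effect of removing the transitive dependency can be sketched with Python's built-in sqlite3 module. The table names `Orders` and `Customer` and all sample values are hypothetical; the columns follow the dependencies above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    Customer_ID      INTEGER PRIMARY KEY,
    Customer_Name    TEXT,
    Customer_Address TEXT
);
CREATE TABLE Orders (
    Order_ID    INTEGER PRIMARY KEY,
    Order_Date  TEXT,
    Customer_ID INTEGER REFERENCES Customer(Customer_ID)
);
INSERT INTO Customer VALUES (1, 'Value Furniture', 'Plano, TX');
INSERT INTO Orders VALUES (1001, '2024-10-21', 1);
INSERT INTO Orders VALUES (1002, '2024-10-22', 1);
""")

# The address is stored once, so a single UPDATE corrects it for every order.
conn.execute(
    "UPDATE Customer SET Customer_Address = ? WHERE Customer_ID = ?",
    ("Dallas, TX", 1),
)

# Joining on Customer_ID rebuilds the original wide rows without loss.
joined = conn.execute("""
    SELECT o.Order_ID, c.Customer_Name, c.Customer_Address
    FROM Orders AS o
    JOIN Customer AS c ON o.Customer_ID = c.Customer_ID
    ORDER BY o.Order_ID
""").fetchall()
```

In the 2NF version the same address would sit on every order row, and the update would have to touch all of them.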
Another Example
Fig 4-26 Invoice relation (1NF)
(Pine Valley Furniture Company)
   Product_Description   Product_Finish  Standard_Price  Product_Line_Id
1  End Table             Cherry          $175.00
2  Coffee Table          Natural Ash     $200.00
3  Computer Desk         Natural Ash     $175.00
4  Entertainment Center  Walnut          $650.00
5  Writers Desk          Cherry          $325.00
6  8-Drawer Desk         White Ash       $750.00
7  Dining Table          Natural Ash     $800.00
   Computer Desk         Walnut          $500.00
4th NF
No multivalued dependencies
5th NF
No join dependencies beyond those implied by the candidate keys
Domain-key NF
The "ultimate" NF: perfect elimination of all
possible anomalies
The structure and sample data are provided for the following table. Break it into
relations in 3NF (assumption: Dept_Manager must be an Emp who has a unique
Emp_Code, and Emp_Educ is a multi-valued attribute).
PROJECT
ProjectID  Employee Name  Employee Salary
100A       Jones          64K
100A       Smith          51K
100B       Smith          51K
200A       Jones          64K
200B       Jones          64K
200C       Parks          28K
200C       Smith          51K
200D       Parks          28K
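The PROJECT sample data can be checked for dependencies directly. The sketch below assumes that employee names are unique identifiers, which the slides do not state, and the names `assignment` and `employee` for the resulting relations are hypothetical.

```python
# Rows copied from the PROJECT table: (ProjectID, Employee Name, Employee Salary).
project_rows = [
    ("100A", "Jones", "64K"),
    ("100A", "Smith", "51K"),
    ("100B", "Smith", "51K"),
    ("200A", "Jones", "64K"),
    ("200B", "Jones", "64K"),
    ("200C", "Parks", "28K"),
    ("200C", "Smith", "51K"),
    ("200D", "Parks", "28K"),
]

# ProjectID alone cannot be the key: some projects list several employees.
employees_per_project = {}
for project_id, name, _ in project_rows:
    employees_per_project.setdefault(project_id, set()).add(name)
project_id_is_key = all(len(n) == 1 for n in employees_per_project.values())

# Employee Name -> Employee Salary holds in the sample: one salary per name.
salaries_per_employee = {}
for _, name, salary in project_rows:
    salaries_per_employee.setdefault(name, set()).add(salary)
name_determines_salary = all(len(s) == 1 for s in salaries_per_employee.values())

# With key (ProjectID, Employee Name), salary depends on only part of the key,
# so the table is split to remove that partial dependency.
assignment = sorted({(project_id, name) for project_id, name, _ in project_rows})
employee = sorted({(name, salary) for _, name, salary in project_rows})
```

Each salary now appears once in `employee` instead of once per project assignment.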
Data normalization
1NF: must be a relation
2NF: 1NF + no partial functional dependency
3NF: 2NF + no transitive dependency
Recommendation
Introduction to Normalization