
Normalization

Chapter 7 discusses normalization in database design, focusing on the importance of good relational design, functional dependencies, and decomposition methods to achieve lossless and dependency-preserving forms. It introduces various normal forms, including Boyce-Codd Normal Form (BCNF) and Third Normal Form (3NF), and highlights the trade-offs between achieving these forms and maintaining data integrity. The chapter emphasizes the need for effective database design processes to minimize redundancy and ensure data consistency.

Uploaded by

frogarpit05
Chapter 7: Normalization

Database System Concepts - 7th Edition 7.1 ©


Outline

▪ Features of Good Relational Design
▪ Functional Dependencies
▪ Decomposition Using Functional Dependencies
▪ Normal Forms
▪ Functional Dependency Theory
▪ Algorithms for Decomposition Using Functional Dependencies
▪ Decomposition Using Multivalued Dependencies
▪ More Normal Forms
▪ Atomic Domains and First Normal Form
▪ Database-Design Process
▪ Modeling Temporal Data



Features of Good Relational Design
▪ Suppose we combine instructor and department into in_dep, which represents the natural join on the relations instructor and department
▪ There is repetition of information
▪ Need to use null values (if we add a new department with no instructors)



Decomposition
▪ The only way to avoid the repetition-of-information problem in the in_dep schema is to decompose it into two schemas: the instructor and department schemas.
▪ Not all decompositions are good. Suppose we decompose
    employee(ID, name, street, city, salary)
  into
    employee1 (ID, name)
    employee2 (name, street, city, salary)
  The problem arises when we have two employees with the same name.
▪ The next slide shows how we lose information: we cannot reconstruct the original employee relation, and so this is a
lossy decomposition.
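
The loss of information can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the two sample tuples (including the name "Kim") are hypothetical values chosen so that two employees share a name, and relations are modeled as plain sets of tuples.

```python
# Hypothetical sample data: two employees who share the name "Kim".
employee = {
    (57766, "Kim", "Main", "Perryridge", 75000),
    (98776, "Kim", "North", "Hampton", 67000),
}

# Project onto employee1(ID, name) and employee2(name, street, city, salary).
employee1 = {(emp_id, name) for emp_id, name, _, _, _ in employee}
employee2 = {(name, street, city, salary)
             for _, name, street, city, salary in employee}

# Natural join of the two projections on the shared attribute `name`.
rejoined = {
    (emp_id, name, street, city, salary)
    for emp_id, name in employee1
    for name2, street, city, salary in employee2
    if name == name2
}

# Tuples that appear after the join but were never in the original relation.
spurious = rejoined - employee
print(len(employee), len(rejoined))  # 2 4
print(sorted(spurious))
```

The join returns more tuples than the original relation, yet it carries less information: we can no longer tell which address and salary belong to which ID.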



A Lossy Decomposition



Lossless Decomposition
▪ Let R be a relation schema and let R1 and R2 form a decomposition of R. That is, R = R1 ∪ R2.
▪ We say that the decomposition is a lossless decomposition if there is no loss of information by replacing R with the two relation schemas R1 and R2.
▪ Formally, for every legal instance r of R,
    ∏R1(r) ⋈ ∏R2(r) = r
▪ Conversely, a decomposition is lossy if, for some legal instance r,
    r ⊂ ∏R1(r) ⋈ ∏R2(r)



Example of Lossless Decomposition

▪ Decomposition of R = (A, B, C) into
    R1 = (A, B)    R2 = (B, C)



Normalization Theory
▪ Decide whether a particular relation R is in “good” form.
▪ In the case that a relation R is not in “good” form, decompose it into a set of relations {R1, R2, ..., Rn} such that
  • Each relation is in good form
  • The decomposition is a lossless decomposition
▪ Our theory is based on:
  • functional dependencies
  • multivalued dependencies



Functional Dependencies
▪ There are usually a variety of constraints (rules) on the data in the real world.
▪ For example, some of the constraints that are expected to hold in a university database are:
  • Students and instructors are uniquely identified by their ID.
  • Each student and instructor has only one name.
  • Each instructor and student is (primarily) associated with only one department.
  • Each department has only one value for its budget, and only one associated building.



Functional Dependencies (Cont.)
▪ An instance of a relation that satisfies all such real-world constraints is called a legal instance of the relation.
▪ A legal instance of a database is one where all the relation instances are legal instances.
▪ Functional dependencies are constraints on the set of legal relations.
▪ They require that the value for a certain set of attributes determines uniquely the value for another set of attributes.
▪ A functional dependency is a generalization of the notion of a key.



Functional Dependencies Definition
▪ Let R be a relation schema, with
    α ⊆ R and β ⊆ R
▪ The functional dependency
    α → β
  holds on R if and only if, for any legal relation r(R), whenever any two tuples t1 and t2 of r agree on the attributes α, they also agree on the attributes β. That is,
    t1[α] = t2[α] ⇒ t1[β] = t2[β]
▪ Example: Consider r(A, B) with the following instance of r:
    A  B
    1  4
    1  5
    3  7
▪ On this instance, B → A holds; A → B does NOT hold, because the two tuples with A = 1 have different B values.
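
The definition translates directly into a check over one relation instance. The sketch below is illustrative only: the helper name `satisfies` and the encoding of attributes as single characters are assumptions, not something the slides define.

```python
def satisfies(rows, attrs, lhs, rhs):
    """Return True if the instance `rows` satisfies the dependency lhs -> rhs.

    `attrs` names the columns of each row in order; `lhs` and `rhs` are
    iterables of attribute names drawn from `attrs`.
    """
    idx = {a: i for i, a in enumerate(attrs)}
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        key = tuple(row[idx[a]] for a in lhs)
        val = tuple(row[idx[a]] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # two tuples agree on lhs but differ on rhs
    return True


r = [(1, 4), (1, 5), (3, 7)]          # the instance of r(A, B) shown above
print(satisfies(r, "AB", "B", "A"))   # True
print(satisfies(r, "AB", "A", "B"))   # False
```

Note that this tests a single instance only. Whether a dependency holds on the schema is a statement about all legal instances and cannot be established from one table.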



Closure of a Set of Functional Dependencies
▪ Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.
  • If A → B and B → C, then we can infer that A → C
  • etc.
▪ The set of all functional dependencies logically implied by F is the closure of F.
▪ We denote the closure of F by F+.



Keys and Functional Dependencies
▪ K is a superkey for relation schema R if and only if K → R
▪ K is a candidate key for R if and only if
  • K → R, and
  • for no α ⊂ K, α → R
▪ Functional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:
    in_dep (ID, name, salary, dept_name, building, budget)
  We expect these functional dependencies to hold:
    dept_name → building
    ID → building
  but would not expect the following to hold:
    dept_name → salary



Use of Functional Dependencies
▪ We use functional dependencies to:
  • Test relations to see if they are legal under a given set of functional dependencies.
    ▪ If a relation r is legal under a set F of functional dependencies, we say that r satisfies F.
  • Specify constraints on the set of legal relations.
    ▪ We say that F holds on R if all legal relations on R satisfy the set of functional dependencies F.
▪ Note: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances.
  • For example, a specific instance of instructor may, by chance, satisfy
      name → ID.



Trivial Functional Dependencies
▪ A functional dependency is trivial if it is satisfied by all instances of a relation
  • Example:
    ▪ ID, name → ID
    ▪ name → name
  • In general, α → β is trivial if β ⊆ α



Lossless Decomposition
▪ We can use functional dependencies to show when certain decompositions are lossless.
▪ For the case of R = (R1, R2), we require that for all possible relations r on schema R
    r = ∏R1(r) ⋈ ∏R2(r)
▪ A decomposition of R into R1 and R2 is a lossless decomposition if at least one of the following dependencies is in F+:
  • R1 ∩ R2 → R1
  • R1 ∩ R2 → R2
▪ The above functional dependencies are a sufficient condition for lossless-join decomposition; the dependencies are a necessary condition only if all constraints are functional
dependencies
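
The test above can be sketched in code by computing the attribute closure of R1 ∩ R2 (attribute closure is defined later in the chapter) and checking whether it covers R1 or R2. The function names and the single-character attribute encoding are illustrative assumptions; the sample F = {A → B, B → C} over R = (A, B, C) is chosen only as input data.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def passes_lossless_test(r1, r2, fds):
    """True if R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 is implied by `fds`."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


F = [("A", "B"), ("B", "C")]                # sample dependencies (assumed input)
print(passes_lossless_test("AB", "BC", F))  # True:  B -> BC
print(passes_lossless_test("AB", "AC", F))  # True:  A -> AB
print(passes_lossless_test("AC", "BC", F))  # False: C determines neither side
```

A `False` result means only that the sufficient condition failed; it proves the decomposition lossy only when every constraint on the schema is a functional dependency.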



Example
▪ R = (A, B, C)
  F = {A → B, B → C}
▪ R1 = (A, B), R2 = (B, C)
  • Lossless decomposition:
      R1 ∩ R2 = {B} and B → BC
▪ R1 = (A, B), R2 = (A, C)
  • Lossless decomposition:
      R1 ∩ R2 = {A} and A → AB
▪ Note:
  • B → BC
    is a shorthand notation for
  • B → {B, C}



Dependency Preservation
▪ Testing functional dependency constraints each time the database is updated can be costly.
▪ It is useful to design the database in a way that constraints can be tested efficiently.
▪ If testing a functional dependency can be done by considering just one relation, then the cost of testing this constraint is low.
▪ When decomposing a relation, it may no longer be possible to do the testing without having to perform a Cartesian product.
▪ A decomposition that makes it computationally hard to enforce a functional dependency is said to be NOT dependency preserving.



Dependency Preservation Example
▪ Consider a schema:
    dept_advisor(s_ID, i_ID, dept_name)
▪ With functional dependencies:
    i_ID → dept_name
    s_ID, dept_name → i_ID
▪ In the above design we are forced to repeat the department name once for each time an instructor participates in a dept_advisor relationship.
▪ To fix this, we need to decompose dept_advisor
▪ Any decomposition will not include all the attributes in
    s_ID, dept_name → i_ID
▪ Thus, the decomposition will NOT be dependency preserving



Boyce-Codd Normal Form
▪ A relation schema R is in BCNF with respect to a set F of functional dependencies if for all functional dependencies in F+ of the form
    α → β
  where α ⊆ R and β ⊆ R, at least one of the following holds:
  • α → β is trivial (i.e., β ⊆ α)
  • α is a superkey for R



Boyce-Codd Normal Form (Cont.)
▪ Example schema that is not in BCNF:
    in_dep (ID, name, salary, dept_name, building, budget)
  because:
  • dept_name → building, budget holds on in_dep, but
  • dept_name is not a superkey
▪ When we decompose in_dep into instructor and department:
  • instructor is in BCNF
  • department is in BCNF



Decomposing a Schema into BCNF
▪ Let R be a schema that is not in BCNF. Let α → β be the FD that causes a violation of BCNF.
▪ We decompose R into:
  • (α ∪ β)
  • (R - (β - α))
▪ In our example of in_dep,
  • α = dept_name
  • β = building, budget
  and in_dep is replaced by
  • (α ∪ β) = (dept_name, building, budget)
  • (R - (β - α)) = (ID, name, dept_name, salary)
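
The single decomposition step is plain set arithmetic, which the following sketch makes explicit. The function name `bcnf_split` is hypothetical; the schema and the violating dependency are the ones from the in_dep example.

```python
def bcnf_split(schema, alpha, beta):
    """Split `schema` on a BCNF-violating FD alpha -> beta.

    Returns the pair (alpha ∪ beta, schema - (beta - alpha)).
    """
    schema, alpha, beta = set(schema), set(alpha), set(beta)
    return alpha | beta, schema - (beta - alpha)


in_dep = {"ID", "name", "salary", "dept_name", "building", "budget"}
r1, r2 = bcnf_split(in_dep, {"dept_name"}, {"building", "budget"})
print(sorted(r1))  # ['budget', 'building', 'dept_name']
print(sorted(r2))  # ['ID', 'dept_name', 'name', 'salary']
```

The two results share exactly α, and α determines all of the first schema, so the split satisfies the lossless condition R1 ∩ R2 → R1.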



Example
▪ R = (A, B, C)
  F = {A → B, B → C}
▪ R1 = (A, B), R2 = (B, C)
  • Lossless-join decomposition:
      R1 ∩ R2 = {B} and B → BC
  • Dependency preserving
▪ R1 = (A, B), R2 = (A, C)
  • Lossless-join decomposition:
      R1 ∩ R2 = {A} and A → AB
  • Not dependency preserving
    (cannot check B → C without computing R1 ⋈ R2)



BCNF and Dependency Preservation
▪ It is not always possible to achieve both BCNF and dependency preservation
▪ Consider a schema:
    dept_advisor(s_ID, i_ID, dept_name)
▪ With functional dependencies:
    i_ID → dept_name
    s_ID, dept_name → i_ID
▪ dept_advisor is not in BCNF
  • i_ID is not a superkey.
▪ Any decomposition of dept_advisor will not include all the attributes in
    s_ID, dept_name → i_ID
▪ Thus, the decomposition is NOT dependency preserving



Third Normal Form
▪ A relation schema R is in third normal form (3NF) if for all
    α → β in F+
  at least one of the following holds:
  • α → β is trivial (i.e., β ⊆ α)
  • α is a superkey for R
  • Each attribute A in β - α is contained in a candidate key for R.
    (NOTE: each attribute may be in a different candidate key)
▪ If a relation is in BCNF it is in 3NF (since in BCNF one of the first two conditions above must hold).
▪ The third condition is a minimal relaxation of BCNF to ensure dependency preservation (will see why later).



3NF Example
▪ Consider a schema:
    dept_advisor(s_ID, i_ID, dept_name)
▪ With functional dependencies:
    i_ID → dept_name
    s_ID, dept_name → i_ID
▪ Two candidate keys = {s_ID, dept_name}, {s_ID, i_ID}
▪ We have seen before that dept_advisor is not in BCNF
▪ R, however, is in 3NF
  • s_ID, dept_name is a superkey
  • i_ID → dept_name and i_ID is NOT a superkey, but:
    ▪ {dept_name} - {i_ID} = {dept_name} and
    ▪ dept_name is contained in a candidate key
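
The reasoning on this slide can be checked mechanically. The sketch below is an illustration under stated assumptions: all function names are hypothetical, candidate keys are found by brute force over attribute subsets (fine for three attributes, exponential in general), and only the dependencies listed in F are tested rather than all of F+.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            # Skip supersets of keys already found: they are not minimal.
            if closure(attrs, fds) >= schema and not any(k <= attrs for k in keys):
                keys.append(attrs)
    return keys


def violations(schema, fds):
    """Return (BCNF violations, 3NF violations) among the FDs in `fds`."""
    prime = set().union(*candidate_keys(schema, fds))
    bcnf, third = [], []
    for lhs, rhs in fds:
        if rhs <= lhs or closure(lhs, fds) >= schema:
            continue  # trivial, or the left side is a superkey
        bcnf.append((lhs, rhs))
        if not (rhs - lhs) <= prime:
            third.append((lhs, rhs))
    return bcnf, third


dept_advisor = {"s_ID", "i_ID", "dept_name"}
F = [({"i_ID"}, {"dept_name"}), ({"s_ID", "dept_name"}, {"i_ID"})]

keys = candidate_keys(dept_advisor, F)
bcnf_bad, third_bad = violations(dept_advisor, F)
print([sorted(k) for k in keys])  # [['dept_name', 's_ID'], ['i_ID', 's_ID']]
print(len(bcnf_bad), len(third_bad))  # 1 0
```

One dependency (i_ID → dept_name) violates BCNF and none violates 3NF, which matches the conclusion above.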



Redundancy in 3NF
▪ Consider the schema R below, which is in 3NF
  • R = (J, K, L)
  • F = {JK → L, L → K}
  • And an instance table:
      J     L    K
      j1    l1   k1
      j2    l1   k1
      j3    l1   k1
      null  l2   k2
▪ What is wrong with the table?
  • Repetition of information
  • Need to use null values (e.g., to represent the relationship l2, k2 where there is no corresponding value for J)



Comparison of BCNF and 3NF
▪ Advantages to 3NF over BCNF. It is always possible to obtain a 3NF design without sacrificing losslessness or dependency preservation.
▪ Disadvantages to 3NF.
  • We may have to use null values to represent some of the possible meaningful relationships among data items.
  • There is the problem of repetition of information.



Goals of Normalization
▪ Let R be a relation scheme with a set F of functional dependencies.
▪ Decide whether a relation scheme R is in “good” form.
▪ In the case that a relation scheme R is not in “good” form, decompose it into a set of relation schemes {R1, R2, ..., Rn} such that
  • Each relation scheme is in good form
  • The decomposition is a lossless decomposition
  • Preferably, the decomposition should be dependency preserving.



How good is BCNF?
▪ There are database schemas in BCNF that do not seem to be sufficiently normalized
▪ Consider a relation
    inst_info (ID, child_name, phone)
  • where an instructor may have more than one phone and can have multiple children
  • Instance of inst_info



How good is BCNF? (Cont.)
▪ There are no non-trivial functional dependencies and therefore the relation is in BCNF
▪ Insertion anomalies: for example, if we add a phone 981-992-3443 to 99999, we need to add two tuples
    (99999, David, 981-992-3443)
    (99999, William, 981-992-3443)



Higher Normal Forms
▪ It is better to decompose inst_info into:
  • inst_child:

  • inst_phone:

▪ This suggests the need for higher normal forms, such as Fourth Normal Form (4NF), which we shall see later



Functional-Dependency Theory Roadmap
▪ We now consider the formal theory that tells us which functional dependencies are implied logically by a given set of functional dependencies.
▪ We then develop algorithms to generate lossless decompositions into BCNF and 3NF
▪ We then develop algorithms to test if a decomposition is dependency-preserving





Closure of a Set of Functional Dependencies
▪ We can compute F+, the closure of F, by repeatedly applying Armstrong’s Axioms:
  • Reflexive rule: if β ⊆ α, then α → β
  • Augmentation rule: if α → β, then γα → γβ
  • Transitivity rule: if α → β and β → γ, then α → γ
▪ These rules are
  • sound: they generate only functional dependencies that actually hold, and
  • complete: they generate all functional dependencies that hold.



Example of F+
▪ R = (A, B, C, G, H, I)
  F = {A → B,
       A → C,
       CG → H,
       CG → I,
       B → H}
▪ Some members of F+
  • A → H
    ▪ by transitivity from A → B and B → H
  • AG → I
    ▪ by augmenting A → C with G, to get AG → CG,
      and then transitivity with CG → I
  • CG → HI
    ▪ by augmenting CG → I with CG to infer CG → CGI,
      augmenting CG → H with I to infer CGI → HI,
      and then transitivity



Closure of Functional Dependencies
▪ Additional rules:
  • Union rule: If α → β holds and α → γ holds, then α → βγ holds.
  • Decomposition rule: If α → βγ holds, then α → β holds and α → γ holds.
  • Pseudotransitivity rule: If α → β holds and γβ → δ holds, then αγ → δ holds.
▪ The above rules can be inferred from Armstrong’s axioms.
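
As an illustration of that last claim, the union rule can be derived from Armstrong's axioms in a few steps. The layout and step labels below are ours, offered as a sketch rather than as the slides' own derivation:

```latex
\begin{align*}
\alpha &\to \beta
  && \text{(given)} \\
\alpha &\to \alpha\beta
  && \text{(augment the line above with } \alpha \text{; note } \alpha\alpha = \alpha\text{)} \\
\alpha &\to \gamma
  && \text{(given)} \\
\alpha\beta &\to \beta\gamma
  && \text{(augment the line above with } \beta\text{)} \\
\alpha &\to \beta\gamma
  && \text{(transitivity of lines 2 and 4)}
\end{align*}
```

The decomposition and pseudotransitivity rules follow by similar short arguments using reflexivity, augmentation, and transitivity.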



Procedure for Computing F+
▪ To compute the closure of a set of functional dependencies F:

    F+ = F
    repeat
        for each functional dependency f in F+
            apply reflexivity and augmentation rules on f
            add the resulting functional dependencies to F+
        for each pair of functional dependencies f1 and f2 in F+
            if f1 and f2 can be combined using transitivity
                then add the resulting functional dependency to F+
    until F+ does not change any further

▪ NOTE: We shall see an alternative procedure for this task later



Closure of Attribute Sets
▪ Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F

▪ Algorithm to compute α+, the closure of α under F

    result := α;
    while (changes to result) do
        for each β → γ in F do
        begin
            if β ⊆ result then result := result ∪ γ
end
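
The pseudocode above maps almost line for line onto Python. This is a minimal sketch: the function name `attribute_closure` and the encoding of each attribute as one character are assumptions made for illustration, and the sample F is the set used in the chapter's F+ example.

```python
def attribute_closure(alpha, fds):
    """Return alpha+, the set of attributes determined by `alpha` under `fds`.

    `fds` is a list of (beta, gamma) pairs, each an iterable of attributes.
    """
    result = set(alpha)
    changed = True
    while changed:                     # "while (changes to result) do"
        changed = False
        for beta, gamma in fds:        # "for each beta -> gamma in F do"
            if set(beta) <= result and not set(gamma) <= result:
                result |= set(gamma)   # "result := result ∪ gamma"
                changed = True
    return result


F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print("".join(sorted(attribute_closure("AG", F))))  # ABCGHI
```

The loop stops as soon as a full pass over F adds nothing, so it runs at most |R| passes.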



Example of Attribute Set Closure
▪ R = (A, B, C, G, H, I)
▪ F = {A → B,
       A → C,
       CG → H,
       CG → I,
       B → H}
▪ (AG)+
  1. result = AG
  2. result = ABCG (A → C and A → B)
  3. result = ABCGH (CG → H and CG ⊆ AGBC)
  4. result = ABCGHI (CG → I and CG ⊆ AGBCH)
▪ Is AG a candidate key?
  1. Is AG a superkey?
     a. Does AG → R? That is, is (AG)+ ⊇ R?
  2. Is any proper subset of AG a superkey?
     a. Does A → R? That is, is (A)+ ⊇ R?
     b. Does G → R? That is, is (G)+ ⊇ R?
     c. In general: check each subset of size n-1
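
The two questions above can be answered in code. In this sketch the function names are hypothetical and attributes are single characters. Checking only the subsets of size n-1 is enough because any subset of a non-superkey is also not a superkey.

```python
def attribute_closure(alpha, fds):
    """Return the closure of the attribute set `alpha` under `fds`."""
    result = set(alpha)
    changed = True
    while changed:
        changed = False
        for beta, gamma in fds:
            if set(beta) <= result and not set(gamma) <= result:
                result |= set(gamma)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """True if the closure of `attrs` contains every attribute of `schema`."""
    return attribute_closure(attrs, fds) >= set(schema)


def is_candidate_key(attrs, schema, fds):
    """True if `attrs` is a superkey and dropping any one attribute breaks it."""
    attrs = set(attrs)
    if not is_superkey(attrs, schema, fds):
        return False
    return not any(is_superkey(attrs - {a}, schema, fds) for a in attrs)


R = "ABCGHI"
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(is_candidate_key("AG", R, F))   # True
print(is_candidate_key("ABG", R, F))  # False: AG alone is already a superkey
```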



Uses of Attribute Closure
There are several uses of the attribute closure algorithm:
▪ Testing for superkey:
  • To test if α is a superkey, we compute α+, and check if α+ contains all attributes of R.
▪ Testing functional dependencies
  • To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+.
  • That is, we compute α+ by using attribute closure, and then check if it contains β.
  • This is a simple and cheap test, and very useful
▪ Computing closure of F
  • For each γ ⊆ R, we find the closure γ+, and for each S ⊆
γ+, we output a functional dependency γ → S.
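
The last use, computing F+ from attribute closures, can be sketched as follows. The names are hypothetical, the schema R = (A, B, C) with F = {A → B, B → C} is sample input, and empty attribute sets are skipped for readability (an assumption; the slide's "for each γ ⊆ R" would include them). The enumeration is exponential in the number of attributes, so this is practical only for small schemas.

```python
from itertools import combinations


def attribute_closure(alpha, fds):
    """Return the closure of the attribute set `alpha` under `fds`."""
    result = set(alpha)
    changed = True
    while changed:
        changed = False
        for beta, gamma in fds:
            if set(beta) <= result and not set(gamma) <= result:
                result |= set(gamma)
                changed = True
    return result


def nonempty_subsets(attrs):
    """Yield every non-empty subset of `attrs` as a frozenset."""
    items = sorted(attrs)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def fd_closure(schema, fds):
    """Return F+ as a set of (gamma, S) pairs meaning gamma -> S."""
    return {
        (gamma, s)
        for gamma in nonempty_subsets(schema)
        for s in nonempty_subsets(attribute_closure(gamma, fds))
    }


F = [("A", "B"), ("B", "C")]
f_plus = fd_closure("ABC", F)
print(len(f_plus))                                   # 35
print((frozenset("A"), frozenset("C")) in f_plus)    # True:  A -> C
print((frozenset("C"), frozenset("A")) in f_plus)    # False: C does not determine A
```

Even this three-attribute schema yields 35 dependencies, most of them trivial, which is why the attribute-closure test is usually preferred over materializing F+.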



Canonical Cover
▪ Suppose that we have a set of functional dependencies F on a relation schema. Whenever a user performs an update on the relation, the database system must ensure that the update does not violate any functional dependencies; that is, all the functional dependencies in F are satisfied in the new database state.
▪ If an update violates any functional dependencies in the set F, the system must roll back the update.
▪ We can reduce the effort spent in checking for violations by testing a simplified set of functional dependencies that has the same closure as the given set.
▪ This simplified set is termed the canonical cover
▪ To define canonical cover we must first define extraneous attributes.
  • An attribute of a functional dependency in F is extraneous if we can remove it without changing F+
