The document discusses normal forms in database design, focusing on functional dependencies (FDs) and properties of decompositions. It explains the concepts of lossless join decompositions and dependency preservation, providing examples with the Hourly-Emps relation. The lecture aims to clarify the implications of decomposing relations and the importance of maintaining data integrity through proper normalization.
Normal Forms
Lecture Objectives
Where are we now?
  Chapter 7 of Silberschatz, Korth, and Sudarshan
  Talked about functional dependencies (FDs)
What do we plan to cover today?
  Continue discussion on FDs
  Talk about properties of decompositions
  Talk about Normal Forms
Hourly-Emps Relation Example
Hourly-Emps(SSN, Name, Lot, Rating, HourlyWages, HoursWorked)
We will denote the attributes as S, N, L, R, W, H respectively
What are the FDs of this schema?
  S → S, N, L, R, W, H
  R → W
Hourly-Emps Example Contd.
Hourly-Emps Relation
  S          N     L   R  W   H
  123223666  Tim   48  8  10  40
  231315368  Jill  22  8  10  30
  131243650  John  35  5  7   30
  431243650  Jack  35  5  7   32
  612244134  Tom   35  8  10  40
Problems are caused by the FD R → W
Update anomaly: can we just change W in the first tuple?
Insertion anomaly: what if we want to insert an employee and don't know the hourly wage for this rating?
Delete anomaly: if we delete all employees with rating 5, we lose information about the wage for rating 5
Lossless Join Decompositions
Let R be a relation schema and F be a set of FDs on R
Suppose R is decomposed into relation schemas R1, R2, . . ., Rk; call this decomposition ρ
We say that ρ is a lossless join decomposition of R w.r.t. F if, for every relation r(R) satisfying F,
  r = ΠR1(r) ⋈ ΠR2(r) ⋈ . . . ⋈ ΠRk(r)
Note that Decomposition 2 does not have this property: the original relation had 5 tuples, but the relation formed by joining the decomposed relations had 10 tuples
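The 5-versus-10 tuple count can be reproduced with a small sketch. It assumes Decomposition 2 is the split (S, N, L, R) and (R, W, H) given on the next slide; the helper names (project, natural_join, rejoin) and the contrasting split (S, N, L, R, H) and (R, W) are illustrative choices, not part of the lecture.

```python
# Hourly-Emps rows from the example table, in attribute order (S, N, L, R, W, H).
ATTRS = ("S", "N", "L", "R", "W", "H")
ROWS = [
    ("123223666", "Tim", 48, 8, 10, 40),
    ("231315368", "Jill", 22, 8, 10, 30),
    ("131243650", "John", 35, 5, 7, 30),
    ("431243650", "Jack", 35, 5, 7, 32),
    ("612244134", "Tom", 35, 8, 10, 40),
]


def project(rows, attrs, keep):
    """Project rows onto the attributes in keep, removing duplicates."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(r1, attrs1, r2, attrs2):
    """Natural join of two relations stored as sets of tuples."""
    common = [a for a in attrs1 if a in attrs2]
    extra_attrs = [a for a in attrs2 if a not in attrs1]
    joined = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[attrs1.index(a)] == t2[attrs2.index(a)] for a in common):
                joined.add(t1 + tuple(t2[attrs2.index(a)] for a in extra_attrs))
    return joined


def rejoin(keep1, keep2):
    """Project Hourly-Emps onto two schemas and join the projections back."""
    return natural_join(project(ROWS, ATTRS, keep1), keep1,
                        project(ROWS, ATTRS, keep2), keep2)


lossy = rejoin(("S", "N", "L", "R"), ("R", "W", "H"))
print(len(lossy))                                          # 10
print(len(rejoin(("S", "N", "L", "R", "H"), ("R", "W"))))  # 5
```

The first join returns every original tuple plus five spurious ones, because Rating alone does not identify a row of (R, W, H). The second split keeps HoursWorked with the employee, so the join returns exactly the original five tuples.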
Lossless Join Decomposition
Let ρ = {R1, R2} be a decomposition of R and F be a set of functional dependencies on R
ρ is lossless w.r.t. F iff at least one of the following dependencies is in F+
1. R1 ∩ R2 → R1
2. R1 ∩ R2 → R2
Does Decomposition 2 have this property?
  R1 = (SSN, Name, Lot, Rating)
  R2 = (Rating, HourlyWages, HoursWorked)
  R1 ∩ R2 = Rating
  Rating → Rating, HourlyWages
Rating determines neither HoursWorked nor the other attributes of R1, so neither dependency is in F+ and Decomposition 2 is not lossless
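The two-schema test reduces to one attribute-closure computation. The sketch below assumes single-letter attribute names (S, N, L, R, W, H as above) and FDs written as (left-hand side, right-hand side) string pairs; closure and binary_lossless are illustrative names.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def binary_lossless(r1, r2, fds):
    """True iff (R1 ∩ R2) → R1 or (R1 ∩ R2) → R2 follows from fds."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


# FDs of Hourly-Emps: S → SNLRWH and R → W.
FDS = [("S", "SNLRWH"), ("R", "W")]

print(sorted(closure("R", FDS)))              # ['R', 'W']
print(binary_lossless("SNLR", "RWH", FDS))    # False
```

Since the closure of Rating is only {R, W}, it covers neither schema of Decomposition 2, and the test reports a lossy decomposition.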
Algorithm Chase
Input: A relation R containing n attributes A1, A2, . . ., An, a decomposition ρ = {R1, R2, . . ., Rk} of R, k < n, and a set F of FDs on R
Output: Whether the decomposition is lossless or not
Step 1:
1. Construct a tableau containing n columns (one column for each attribute in R) and k rows (one row for each element in ρ)
2. Row i of the tableau (corresponding to relation scheme Ri) has entry a_j in the jth column iff Ri contains attribute Aj; otherwise row i has entry b_ij in the jth column
Algorithm Chase Contd.
Step 2: Apply the following rule to the tableau until no more changes can be made
  Let X → Y be a functional dependency in F
  If the tableau has two rows that agree on all the X columns, then equate their symbols in the Y columns:
    if one of the equated symbols is a_j, make the other one a_j
    if they are b_ij and b_lj, make them both either b_ij or b_lj, arbitrarily
  Repeatedly consider all FDs until no further change is possible
Step 3: If, after modifying the tableau as above, some row consists entirely of a symbols, the decomposition is lossless; otherwise it is not
Algorithm Chase Examples
R = {ABCDEF}
ρ = {ABC, AD, DEF}, that is, R1 = {ABC}, R2 = {AD}, and R3 = {DEF}
F = {A → B, A → C, D → F, D → E}
Applying the chase algorithm, we find that the decomposition is lossless: the AD row gains a symbols for B and C (from A → B and A → C) and for E and F (from D → E and D → F)
Is Decomposition 1 of Hourly-Emps lossy or lossless?
Can you demonstrate using the chase algorithm that Decomposition 2 is lossy?
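The chase steps can be sketched in a few lines. This is a minimal illustration, assuming single-letter attribute names and FDs as (lhs, rhs) string pairs; the function name chase_is_lossless and the encoding of the b symbols as ("b", row) tuples are choices made for this sketch.

```python
def chase_is_lossless(attrs, decomposition, fds):
    """Run the chase on the tableau and report whether some row is all a's."""
    # Step 1: row i has "a" in column A if Ri contains A, else a symbol ("b", i).
    tableau = [
        {A: "a" if A in Ri else ("b", i) for A in attrs}
        for i, Ri in enumerate(decomposition)
    ]
    # Step 2: equate symbols until no FD changes the tableau.
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for i in range(len(tableau)):
                for j in range(i + 1, len(tableau)):
                    r1, r2 = tableau[i], tableau[j]
                    if not all(r1[A] == r2[A] for A in lhs):
                        continue
                    for A in rhs:
                        if r1[A] == r2[A]:
                            continue
                        # Prefer the distinguished symbol "a"; otherwise
                        # keep the symbol of the first row (arbitrary choice).
                        if r2[A] == "a":
                            keep, drop = "a", r1[A]
                        else:
                            keep, drop = r1[A], r2[A]
                        # Rename the dropped symbol everywhere in column A.
                        for row in tableau:
                            if row[A] == drop:
                                row[A] = keep
                        changed = True
    # Step 3: lossless iff some row consists only of "a".
    return any(all(row[A] == "a" for A in attrs) for row in tableau)


F = [("A", "B"), ("A", "C"), ("D", "F"), ("D", "E")]
print(chase_is_lossless("ABCDEF", ["ABC", "AD", "DEF"], F))       # True

HOURLY_FDS = [("S", "SNLRWH"), ("R", "W")]
print(chase_is_lossless("SNLRWH", ["SNLR", "RWH"], HOURLY_FDS))   # False
```

For the Hourly-Emps split (S, N, L, R) and (R, W, H), only R → W applies, so the first row still has a b symbol in the H column and no row becomes all a's.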
Dependencies in the Decomposed Table
Let G be the set of FDs on Emps(SSN, Name, Lot, Rating)
  G ⊆ FD1+, such that all FDs in G involve only attributes in Emps
Let H be the set of FDs on Hourly-Emps3(SSN, HourlyWages, HoursWorked)
  H ⊆ FD1+, such that all FDs in H involve only attributes in Hourly-Emps3
Dependencies in the Decomposed Table Contd.
G is the set of dependencies in Emps
1. SSN → Name
2. SSN → Lot
3. Lot → SSN
4. SSN → Rating
H is the set of dependencies in Hourly-Emps3
1. SSN → HoursWorked
2. SSN → HourlyWages
If G ∪ H ≡ FD1, the dependencies are preserved
Why Dependency Preservation? Whenever an update is made to the database, the system should be able to check that the update will not create an illegal relation (that is, one that does not satisfy all the FDs) We must make sure that such checks can be made by looking at one relation only
Dependency Preserving
Let R be a relation scheme and F a set of FDs on R
Let ρ = {R1, R2, . . ., Rn} be a decomposition of R, that is, R = R1 ∪ R2 ∪ . . . ∪ Rn
Let ΠRi(F) denote the set of all FDs X → Y in F+ such that XY ⊆ Ri (we are projecting FDs on Ri)
A decomposition ρ of R preserves the FDs in F if
  (ΠR1(F) ∪ ΠR2(F) ∪ . . . ∪ ΠRn(F))+ = F+
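Computing every projection ΠRi(F) explicitly is expensive, so the sketch below uses a standard equivalent check instead: for each FD X → Y in F, grow X using only closures restricted to one Ri at a time, and test whether Y is reached. It assumes single-letter attribute names; preserves_dependencies is an illustrative name, and the (J, K, L) schema is the one the lecture uses later for BCNF.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(decomposition, fds):
    """True iff every FD in fds follows from the FDs projected on the Ri."""
    for lhs, rhs in fds:
        reached = set(lhs)
        changed = True
        while changed:
            changed = False
            for Ri in decomposition:
                Ri = set(Ri)
                # What the projection of F on Ri lets us derive from reached.
                gained = (closure(reached & Ri, fds) & Ri) - reached
                if gained:
                    reached |= gained
                    changed = True
        if not set(rhs) <= reached:
            return False
    return True


print(preserves_dependencies(["AB", "BC"], [("A", "B"), ("B", "C")]))    # True
print(preserves_dependencies(["JL", "KL"], [("JK", "L"), ("L", "K")]))   # False
```

In the second call JK → L cannot be checked inside either JL or KL alone, which is why the decomposition is reported as not dependency preserving.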
Decomposition We first need to determine if any decomposition is needed at all What normal form is the relation in? Each normal form has certain characteristics that will tell us whether decomposition is needed or not
First Normal Form (1NF) First Normal Form (1NF): A relation is in 1NF iff each attribute value in the relation is atomic, that is, each cell contains one and only one value
Some Definitions
Prime attribute: An attribute of a relation schema R is said to be a prime attribute of R if it is a member of some candidate key of R
Non-prime attribute: An attribute is called non-prime if it is not a prime attribute, that is, if it is not a member of any candidate key
Consider the schema (ssn, projNum, hours), whose key is (ssn, projNum)
  prime attributes: ssn, projNum
  non-prime attributes: hours
Full Functional Dependency: If A and B are sets of attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A
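Prime and non-prime attributes can be computed by brute force for small schemas. The sketch below assumes that the only FD on (ssn, projNum, hours) is {ssn, projNum} → hours, which the slide implies but does not state; candidate_keys and prime_attributes are illustrative names.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs_set, rhs_set)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    attrs = set(attrs)
    keys = []
    # Smaller subsets are tried first, so any superset of a found key
    # can be skipped as non-minimal.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys


def prime_attributes(attrs, fds):
    """Attributes that belong to at least one candidate key."""
    return set().union(*candidate_keys(attrs, fds))


SCHEMA = {"ssn", "projNum", "hours"}
FDS = [({"ssn", "projNum"}, {"hours"})]  # assumed FD for this schema

prime = prime_attributes(SCHEMA, FDS)
print(sorted(prime))            # ['projNum', 'ssn']
print(sorted(SCHEMA - prime))   # ['hours']
```

The search is exponential in the number of attributes, which is acceptable for classroom-sized schemas but not for wide tables.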
Second Normal Form (2NF) Second Normal Form (2NF): A relation is in 2NF if it is in 1NF and every non-prime attribute is fully functionally dependent on every candidate key
Normal Forms (3NF)
Third Normal Form (3NF): A relation R is in 3NF if it is in 2NF and for every FD X → A where A ∉ X, either X is a superkey for R or A is a prime attribute
Is the following relation in 3NF? Yes, as shown below.
  Address(City, Street, Zip)
  FD1: City, Street → Zip
  FD2: Zip → City
For FD1, (City, Street)+ = (City, Street, Zip), so (City, Street) is a superkey
For FD2, (Zip)+ = (Zip, City), so Zip is not a superkey. But City is a prime attribute, as (City, Street) is a candidate key
Address Relation
  City          Street                   Zip
  Fort Collins  Howes Street             80523
  Fort Collins  Stillwater Creek Drive   80528
  Fort Collins  Stillwater Creek Court   80528
  Fort Collins  Raintree Drive           80526
Normal Forms (BCNF)
Boyce-Codd Normal Form (BCNF): A relation R is in BCNF if for every FD X → A where A ∉ X, X is a superkey for R
Our goal is to get BCNF schemas, but if we cannot, then we will aim for 3NF
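The 3NF and BCNF conditions differ only in the "A is a prime attribute" escape clause, which a short sketch makes concrete. It checks the FD condition on the given FDs only (the 1NF and 2NF prerequisites are not tested separately), and the function names is_bcnf and is_3nf are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs_set, rhs_set)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def prime_attributes(attrs, fds):
    """Union of all candidate keys, found by brute force."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so not minimal
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return set().union(*keys)


def is_bcnf(attrs, fds):
    """Every non-trivial FD must have a superkey on its left-hand side."""
    return all(rhs <= lhs or closure(lhs, fds) >= set(attrs)
               for lhs, rhs in fds)


def is_3nf(attrs, fds):
    """Like BCNF, but an FD may also pass if its new attributes are prime."""
    prime = prime_attributes(attrs, fds)
    return all(closure(lhs, fds) >= set(attrs) or (rhs - lhs) <= prime
               for lhs, rhs in fds)


ADDRESS = {"City", "Street", "Zip"}
ADDRESS_FDS = [({"City", "Street"}, {"Zip"}), ({"Zip"}, {"City"})]

print(is_3nf(ADDRESS, ADDRESS_FDS))    # True
print(is_bcnf(ADDRESS, ADDRESS_FDS))   # False
```

Address passes the 3NF test because City is prime, but Zip → City still violates BCNF since Zip is not a superkey.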
Normalization
Given a relation schema R and a set F of FDs on R, we want to have a decomposition ρ = {R1, R2, . . ., Rn} such that
  ρ should be lossless w.r.t. F
  ρ should be dependency preserving
  each Ri should be in BCNF/3NF
We have an algorithm that guarantees the lossless and dependency-preserving properties, and the Ri's will be in 3NF
We first try to have BCNF, lossless join, and dependency-preserving properties
If we cannot get the above, we accept 3NF, lossless join, and dependency-preserving properties
BCNF Decomposition + Lossless Join
Input: A universal relation R and a set F of FDs on R
1. D := {R}
2. while there is a relation schema Q in D that is not in BCNF do
   (a) choose a relation schema Q in D that is not in BCNF
   (b) find an FD X → Y in Q, with X and Y disjoint, that violates BCNF
   (c) replace Q in D by two relation schemas (Q − Y) and (X ∪ Y)
BCNF Decomposition Example
R = CTHRSG
F = {C → T, HR → C, HT → R, CS → G, HS → R}
C = Course, T = Teacher, H = Hour, R = Room, S = Student, G = Grade
Find a BCNF decomposition of the above relation
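The decomposition loop can be sketched as follows. To decide whether a sub-schema Q is in BCNF, this sketch searches the subsets X of Q and looks at the closure of X restricted to Q, which is a brute-force stand-in for projecting F onto Q. Single-letter attributes are assumed, and find_violation and bcnf_decompose are illustrative names. The result depends on which violating FD is picked first, so a different but equally valid BCNF decomposition may be obtained by hand.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(Q, fds):
    """Return (X, Y) with X → Y violating BCNF inside Q, or None."""
    Q = set(Q)
    for size in range(1, len(Q)):
        for combo in combinations(sorted(Q), size):
            X = set(combo)
            X_closure = closure(X, fds)
            Y = (X_closure & Q) - X          # non-trivial part, disjoint from X
            if Y and not Q <= X_closure:     # X is not a superkey of Q
                return X, Y
    return None


def bcnf_decompose(attrs, fds):
    """Split schemas on violating FDs until every schema is in BCNF."""
    pending = [frozenset(attrs)]
    done = []
    while pending:
        Q = pending.pop()
        violation = find_violation(Q, fds)
        if violation is None:
            done.append(Q)
        else:
            X, Y = violation
            pending.append(Q - Y)              # (Q − Y)
            pending.append(frozenset(X | Y))   # (X ∪ Y)
    return done


F = [("C", "T"), ("HR", "C"), ("HT", "R"), ("CS", "G"), ("HS", "R")]
for schema in bcnf_decompose("CTHRSG", F):
    print("".join(sorted(schema)))
```

Each split produces two strictly smaller schemas, so the loop always terminates, and each split satisfies the two-schema lossless test because X determines X ∪ Y.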
BCNF and Dependency Preservation
It is not always possible to get a BCNF decomposition that is dependency preserving
R = (J, K, L)
F = {JK → L, L → K}
Candidate keys = JK and JL
R is not in BCNF, because L → K holds and L is not a superkey
Any decomposition of R into smaller schemas will fail to preserve JK → L, since that FD involves all three attributes
Dependency-Preserving 3NF Decomposition
Let R be a relation
Let F be a set of FDs on R that is a minimal cover
Let R1, R2, . . ., Rn be a lossless-join decomposition of R where each Ri (1 ≤ i ≤ n) is in 3NF
The objective is to find a dependency-preserving decomposition
1. Identify the set N of dependencies in F that are not preserved, that is, not in the closure of the union of the Fi's (1 ≤ i ≤ n), where Fi is the projection of F onto Ri
2. For each FD X → A in N, create a relation schema XA and add it to the decomposition of R
3NF Synthesis Algorithm
Guarantees dependency preservation
Guarantees lossless join property
Algorithm details
1. Find a minimal cover G for F
2. For each left-hand side X of an FD that appears in G, create a relation schema in D with attributes {X ∪ {A1} ∪ {A2} ∪ . . . ∪ {Ak}}, where X → A1, X → A2, . . ., X → Ak are the only dependencies in G with X on the LHS
3. If none of the relation schemas in D contains a key of R, then create one more relation schema in D that contains attributes that form a key of R
Example 3NF Decomposition Contd.
Step 1: FDs are minimal
Step 2: We get the following schemas
  R1 = (banker-name, branch-name, office-number)
  R2 = (customer-name, branch-name, banker-name)
Example 3NF Synthesis
F = {C → CSJDPQV, JP → C, SD → P, J → S}
Step 1: Minimal cover = {C → J, C → D, C → Q, C → V, JP → C, SD → P, J → S}
Step 2: We obtain the schemas CJ, CD, CQ, CV, CJP, SDP, and JS
Step 3: Combine all those which have the same left-hand side: CDJQV, CJP, SDP, and JS
Step 4: Since CDJQV contains C, which is a key of the original relation, we are done
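The synthesis steps, including the minimal cover, can be sketched as below. The sketch assumes single-letter attribute names; minimal_cover and synthesize_3nf are illustrative names, and a minimal cover is in general not unique, so the order in which FDs are examined can change the result for other inputs.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """One minimal cover of fds, as a list of (frozenset lhs, attribute)."""
    # Split right-hand sides into single attributes and drop trivial FDs.
    g = list(dict.fromkeys(
        (frozenset(lhs), a) for lhs, rhs in fds for a in rhs if a not in lhs))
    # Remove extraneous attributes from each left-hand side.
    reduced = []
    for lhs, a in g:
        lhs = set(lhs)
        for b in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {b}, g):
                lhs.discard(b)
        reduced.append((frozenset(lhs), a))
    g = list(dict.fromkeys(reduced))
    # Remove FDs that follow from the remaining ones.
    for fd in list(g):
        rest = [other for other in g if other != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest
    return g


def synthesize_3nf(attrs, fds):
    """3NF synthesis: one schema per left-hand side, plus a key if needed."""
    g = minimal_cover(fds)
    schemas = {}
    for lhs, a in g:
        schemas.setdefault(lhs, set(lhs)).add(a)
    d = [frozenset(s) for s in schemas.values()]
    if not any(closure(s, g) >= set(attrs) for s in d):
        # No schema contains a key: shrink the full attribute set to a key.
        key = set(attrs)
        for a in sorted(attrs):
            if closure(key - {a}, g) >= set(attrs):
                key.discard(a)
        d.append(frozenset(key))
    return d


F = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P"), ("J", "S")]
for schema in synthesize_3nf("CSJDPQV", F):
    print("".join(sorted(schema)))
```

For the example above, this yields the same four schemas as Step 3 (CDJQV, CJP, SDP, and JS) and adds no extra key schema, because CDJQV already contains the key C.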
Comparison of BCNF and 3NF
It is always possible to decompose a relation into relations in 3NF such that
  the decomposition is lossless
  dependencies are preserved
It is always possible to decompose a relation into relations in BCNF such that
  the decomposition is lossless
  but it may not be possible to preserve dependencies
Lecture Objectives
What did we cover today?
  Completed schema refinement
What do we plan to do next?
  We plan to discuss relational algebra (Chapter 2)