
Lecture 10: BCSE302L - DBMS: Functional Dependencies

The following functional dependencies hold for the given Student table: Regno -> Name, DOB, Phone, Gender; CID -> CName; Ins_ID -> Ins_Name, Ins_Office; {Regno, CID} -> Ins_ID, Ins_Name, Ins_Office

Uploaded by

Priyanshu Singh

Lecture 10 : BCSE302L– DBMS

FUNCTIONAL DEPENDENCIES
Outline
 Informal Guidelines
 Functional Dependencies
 Normal Forms Based on Primary Keys
 First Normal Form
 Second Normal Form
 Third Normal Form
 BCNF (Boyce-Codd Normal Form)
 Fourth Normal Form
 Fifth Normal Form
Objectives of database normalization
 To correct duplicate data and database anomalies.

 To avoid creating and updating any unwanted data connections and dependencies.

 To prevent unwanted deletions of data.

 To optimize storage space.

 To reduce the delay and complexity of checking databases when new types of data need to be introduced.

 To facilitate the access and interpretation of data to users and applications that make use of the databases.
Normalization of Relations

 Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations

 Normal form: Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form
Normalization of Relations
 A superkey of a relation schema R = {A1, A2, ...., An} is a set of attributes S subset-of R with the property that no two distinct tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]

 A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey any more.

 If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily
designated to be the primary key, and the others are called secondary keys.

 A Prime attribute must be a member of some candidate key

 A Nonprime attribute is not a prime attribute—that is, it is not a member of any candidate key
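The superkey and key definitions above can be sketched as a check against one relation state. This is only a minimal illustration: the rows and attribute names (Emp, Proj, Hours) are hypothetical, and a single state can only show that a set is not a superkey, because the definition quantifies over every legal state.

```python
def is_superkey(rows, attrs):
    """True if no two distinct tuples in `rows` agree on all of `attrs`."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def is_key(rows, attrs):
    """A key is a superkey that stops being one when any attribute is removed."""
    if not is_superkey(rows, attrs):
        return False
    return all(
        not is_superkey(rows, [a for a in attrs if a != dropped])
        for dropped in attrs
    )


# Hypothetical relation state (distinct tuples, as a relation is a set).
rows = [
    {"Emp": "E1", "Proj": "P1", "Hours": 10},
    {"Emp": "E1", "Proj": "P2", "Hours": 5},
    {"Emp": "E2", "Proj": "P1", "Hours": 10},
]

print(is_superkey(rows, ["Emp"]))                  # E1 appears twice
print(is_key(rows, ["Emp", "Proj"]))               # minimal in this state
print(is_key(rows, ["Emp", "Proj", "Hours"]))      # superkey, but not minimal
```

Removing one attribute at a time is enough for the minimality test, since any superset of a superkey is again a superkey.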
1 Informal Design Guidelines for Relational Databases
 1.1 Semantics of the Relation Attributes
 1.2 Redundant Information in Tuples and Update Anomalies
 1.3 Null Values in Tuples
 1.4 Spurious Tuples

2 Functional Dependencies (FDs)


 2.1 Definition of FD
 2.2 Inference Rules for FDs
 2.3 Minimal Sets of FDs
1.1 Semantics of the Relation Attributes

 GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to individual relations and their attributes).
 Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be
mixed in the same relation

 Only foreign keys should be used to refer to other entities

 Entity and relationship attributes should be kept apart as much as possible.

 Bottom Line: Design a schema that can be explained easily relation by relation. The
semantics of attributes should be easy to interpret.
A simplified COMPANY relational database schema
1.2 Redundant Information in Tuples and Update Anomalies

 Information is stored redundantly

 Wastes storage

 Causes problems with update anomalies

 Insertion anomalies

 Deletion anomalies

 Modification anomalies
EXAMPLE OF AN UPDATE ANOMALY

 Consider the relation:


 EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

 Update Anomaly:
 Changing the name of project number P1 from “Billing” to “Customer-Accounting”
may cause this update to be made for all 100 employees working on project P1.
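A small sketch of why this is a problem, using made-up rows for EMP_PROJ (the employee names, hours, and the second project are assumptions for illustration). If the update reaches only some of the rows for P1, the relation ends up storing two names for the same project; keeping project names in a separate PROJECT relation needs a single write.

```python
def project_names(rows, proj):
    """Distinct project names stored for one project number."""
    return {row["Pname"] for row in rows if row["Proj"] == proj}


# Single-relation design: Pname is repeated for every employee on the project.
emp_proj = [
    {"Emp": "E1", "Proj": "P1", "Ename": "Asha", "Pname": "Billing", "No_hours": 10},
    {"Emp": "E2", "Proj": "P1", "Ename": "Ravi", "Pname": "Billing", "No_hours": 20},
    {"Emp": "E3", "Proj": "P1", "Ename": "Mina", "Pname": "Billing", "No_hours": 15},
    {"Emp": "E3", "Proj": "P2", "Ename": "Mina", "Pname": "Payroll", "No_hours": 5},
]

# A faulty update that only reaches the first two rows.
for row in emp_proj[:2]:
    if row["Proj"] == "P1":
        row["Pname"] = "Customer-Accounting"

inconsistent = project_names(emp_proj, "P1")
print(inconsistent)  # two different names for project P1

# Decomposed design: the name is stored once, so one write updates it.
project = {"P1": "Billing", "P2": "Payroll"}
project["P1"] = "Customer-Accounting"
```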
Two relation schemas suffering from update anomalies
Example States for EMP_DEPT and EMP_PROJ
EXAMPLE OF AN INSERT ANOMALY

 Consider the relation:


 EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

 Insert Anomaly:
 Cannot insert a project unless an employee is assigned to it.

 Conversely
 Cannot insert an employee unless he/she is assigned to a project.
EXAMPLE OF A DELETE ANOMALY

 Consider the relation:


 EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

 Delete Anomaly:
 When a project is deleted, it will result in deleting all the employees who work on that project.

 Alternately, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.
Guideline to Redundant Information in Tuples and Update Anomalies

 GUIDELINE 2:
 Design a schema that does not suffer from the insertion, deletion and update
anomalies.

 If there are any anomalies present, then note them so that applications can be made to
take them into account.
1.3 Null Values in Tuples
 GUIDELINE 3:

 Relations should be designed such that their tuples will have as few NULL values as possible

 Attributes that are NULL frequently could be placed in separate relations (with the primary key)

 Reasons for nulls:

 Attribute not applicable or invalid

 Attribute value unknown (may exist)

 Value known to exist, but unavailable


1.4 Spurious Tuples
 Bad designs for a relational database may result in erroneous results for certain JOIN operations

 The "lossless join" property is used to guarantee meaningful results for join operations

 GUIDELINE 4:

 The relations should be designed to satisfy the lossless join condition.

 No spurious tuples should be generated by doing a natural-join of any relations.


Spurious Tuples
 There are two important properties of decompositions:
a) Non-additive (lossless) property of the corresponding join
b) Preservation of the functional dependencies.

 Note that:
 Property (a) is extremely important and cannot be sacrificed.
 Property (b) is less stringent and may be sacrificed.
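To make spurious tuples concrete, here is a minimal sketch that decomposes a tiny relation in two ways and joins the pieces back. The relation R(Emp, Proj, Loc), its two rows, and the helper names `project` and `natural_join` are assumptions for illustration only.

```python
def project(rows, attrs):
    """Projection onto attrs, with duplicates eliminated (a relation is a set)."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of tuples of (attribute, value) pairs."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            # Tuples combine only if they agree on every shared attribute.
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined


rows = [
    {"Emp": "E1", "Proj": "P1", "Loc": "Delhi"},
    {"Emp": "E2", "Proj": "P2", "Loc": "Delhi"},
]
original = {tuple(sorted(row.items())) for row in rows}

# Decomposition joined on Loc, which determines neither side: lossy.
bad = natural_join(project(rows, ["Emp", "Loc"]), project(rows, ["Proj", "Loc"]))
spurious = bad - original

# Decomposition joined on Proj, where Proj -> Loc holds: lossless here.
good = natural_join(project(rows, ["Emp", "Proj"]), project(rows, ["Proj", "Loc"]))

print(len(spurious))     # number of tuples that were never in R
print(good == original)
```

The lossy join still contains every original tuple; the damage is the extra tuples, which is why the property is also called non-additive.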
Functional Dependencies
2. Functional Dependencies
 Functional dependencies (FDs)
 Are used to specify formal measures of the "goodness" of relational designs

 And keys are used to define normal forms for relations

 Are constraints that are derived from the meaning and interrelationships of the data attributes

 A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y
Functional Dependencies
 X -> Y holds if whenever two tuples have the same value for X, they must have the same value for Y
 For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X], then t1[Y]=t2[Y]

 X -> Y in R specifies a constraint on all relation instances r(R)

 Written as X -> Y; can be displayed graphically on a relation schema as in the figures (denoted by an arrow).

 FDs are derived from the real-world constraints on the attributes


Examples of FD constraints
 Social security number determines employee name
 SSN -> ENAME

 Project number determines project name and location


 PNUMBER -> {PNAME, PLOCATION}

 Employee SSN and project number together determine the hours per week that the employee works on the project
 {SSN, PNUMBER} -> HOURS
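The definition "if t1[X] = t2[X] then t1[Y] = t2[Y]" translates directly into a check over one relation instance. The sketch below assumes rows stored as dictionaries and uses invented sample values; the function name `fd_holds` is hypothetical. Passing the check on one instance does not prove the FD, since an FD must hold on every legal instance, but a failure does refute it.

```python
def fd_holds(rows, lhs, rhs):
    """Check X -> Y on one instance: equal X values must give equal Y values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, else returns the old y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Invented sample data for illustration.
rows = [
    {"SSN": "111", "ENAME": "Smith", "PNUMBER": 1, "HOURS": 32.5},
    {"SSN": "111", "ENAME": "Smith", "PNUMBER": 2, "HOURS": 7.5},
    {"SSN": "222", "ENAME": "Wong", "PNUMBER": 2, "HOURS": 10.0},
]

print(fd_holds(rows, ["SSN"], ["ENAME"]))             # SSN -> ENAME
print(fd_holds(rows, ["SSN", "PNUMBER"], ["HOURS"]))  # {SSN, PNUMBER} -> HOURS
print(fd_holds(rows, ["SSN"], ["HOURS"]))             # refuted by SSN 111
```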
How to find functional dependencies for a relation?
 The functional dependencies of a relation depend on the meaning of its attributes in the application domain.

 Consider the STUDENT relation given in Table 1.

 We know that STUD_NO is unique for each student.

 So STUD_NO->STUD_NAME, STUD_NO->STUD_PHONE, STUD_NO->STUD_STATE, STUD_NO->STUD_COUNTRY and STUD_NO->STUD_AGE will all be true.

 Similarly, STUD_STATE->STUD_COUNTRY will be true, because if two records have the same STUD_STATE, they will have the same STUD_COUNTRY as well.
Examples of FD constraints
 An FD is a property of the attributes in the schema R

 The constraint must hold on every relation instance r(R)

 If K is a key of R, then K functionally determines all attributes in R


 (since we never have two distinct tuples with t1[K]=t2[K])
Attribute Closure
Attribute Closure: The attribute closure of an attribute set X, written X+, is the set of all attributes that can be functionally determined from X.

To find the attribute closure of an attribute set:

1. Add the elements of the attribute set to the result set.
2. Repeatedly add to the result set every attribute that is functionally determined by attributes already in the result set, until nothing more can be added.

Using FD set of table 1, attribute closure can be determined as:

(STUD_NO)+ = {STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE}


(STUD_STATE)+ = {STUD_STATE, STUD_COUNTRY}
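The closure procedure can be sketched as a fixed-point loop. The FD list below encodes the STUDENT dependencies stated earlier; representing each FD as a pair of Python sets and the function name `closure` are choices made for this sketch.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    ({"STUD_NO"}, {"STUD_NAME", "STUD_PHONE", "STUD_STATE",
                   "STUD_COUNTRY", "STUD_AGE"}),
    ({"STUD_STATE"}, {"STUD_COUNTRY"}),
]

print(sorted(closure({"STUD_NO"}, fds)))
print(sorted(closure({"STUD_STATE"}, fds)))
```

The loop stops because each pass either adds at least one attribute or changes nothing, and the number of attributes is finite.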
Example Problem
Consider the relation scheme R = {E, F, G, H, I, J, K, L, M, N} and the set of functional dependencies {{E, F} -> {G}, {F} -> {I, J}, {E, H} -> {K, L}, {K} -> {M}, {L} -> {N}} on R. What is the key for R? (GATE-CS-2014)
A. {E, F}
B. {E, F, H}
C. {E, F, H, K, L}
D. {E}
Answer: Computing the attribute closure of each option, we get:
{E, F}+ = {E, F, G, I, J}
{E, F, H}+ = {E, F, G, H, I, J, K, L, M, N}
{E, F, H, K, L}+ = {E, F, G, H, I, J, K, L, M, N}
{E}+ = {E}
Both {E, F, H}+ and {E, F, H, K, L}+ contain all attributes of R, but only {E, F, H} is minimal, so it is the candidate key. The correct option is (B).
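The answer can be cross-checked by brute force: try attribute subsets in increasing size and keep those whose closure is all of R and that contain no smaller key. This is only a sketch that is practical for small schemas (it enumerates up to 2^n subsets); the helper names are hypothetical, and single-letter attributes let `set("EF")` stand for {E, F}.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys


schema = set("EFGHIJKLMN")
fds = [
    (set("EF"), set("G")),
    (set("F"), set("IJ")),
    (set("EH"), set("KL")),
    (set("K"), set("M")),
    (set("L"), set("N")),
]

print(candidate_keys(schema, fds))  # the only candidate key is {E, F, H}
```

E, F and H never appear on the right-hand side of any dependency, so every key must contain all three, which is why the search finds exactly one key.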
For the following Student table instance, find all the functional dependencies that hold.
[ Schema - Student (Regno, Name, DOB, Phone, Gender, Course_ID, Course_Name, Instructor_ID, Instructor_Name, Instructor_Office) ]

Regno  Name     DOB          Phone  Gender  CID  CName  Ins_ID  Ins_Name  Ins_Office
14M01  Kumar    12-Jan-1996  12345  M       C1   DBMS   I1      Kesav     G123
14M05  Mary     10-Jun-1995  12367  F       C1   DBMS   I1      Kesav     G123
14M07  Ram      10-May-1996  12898  M       C1   DBMS   I2      Ragav     G127
14M01  Kumar    12-Jan-1996  12345  M       C3   DS     I5      Mani      G125
14B01  Revathi  10-Dec-1995  23456  F       C3   DS     I5      Mani      G125
14M09  Steve    23-Oct-1995  34567  M       C4   OS     I5      Mani      G125
14B03  Ramya    20-Jul-1996  23456  F       C4   OS     I5      Mani      G125

Regno → Name: Names are uniquely identified by a Regno. In other words, for a given register number, there is exactly one name.
Regno → DOB: For any given register number in Student, there is exactly one DOB value.
Regno → Phone
Regno → Gender
These FDs can be written collectively as follows:
Regno → Name, DOB, Phone, Gender
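The dependencies claimed for the Student table can be tested against the instance shown. The sketch below re-enters the seven rows and checks the four FDs listed for this table, plus two that the instance refutes; `fd_holds` is a hypothetical helper name. As before, an instance can only refute an FD, not establish it for all legal states.

```python
COLUMNS = "Regno Name DOB Phone Gender CID CName Ins_ID Ins_Name Ins_Office".split()
DATA = """
14M01 Kumar 12-Jan-1996 12345 M C1 DBMS I1 Kesav G123
14M05 Mary 10-Jun-1995 12367 F C1 DBMS I1 Kesav G123
14M07 Ram 10-May-1996 12898 M C1 DBMS I2 Ragav G127
14M01 Kumar 12-Jan-1996 12345 M C3 DS I5 Mani G125
14B01 Revathi 10-Dec-1995 23456 F C3 DS I5 Mani G125
14M09 Steve 23-Oct-1995 34567 M C4 OS I5 Mani G125
14B03 Ramya 20-Jul-1996 23456 F C4 OS I5 Mani G125
"""
rows = [dict(zip(COLUMNS, line.split())) for line in DATA.strip().splitlines()]


def fd_holds(rows, lhs, rhs):
    """Check lhs -> rhs on this instance: equal lhs values imply equal rhs values."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


checks = {
    "Regno -> Name, DOB, Phone, Gender":
        fd_holds(rows, ["Regno"], ["Name", "DOB", "Phone", "Gender"]),
    "CID -> CName": fd_holds(rows, ["CID"], ["CName"]),
    "Ins_ID -> Ins_Name, Ins_Office":
        fd_holds(rows, ["Ins_ID"], ["Ins_Name", "Ins_Office"]),
    "Regno, CID -> Ins_ID, Ins_Name, Ins_Office":
        fd_holds(rows, ["Regno", "CID"], ["Ins_ID", "Ins_Name", "Ins_Office"]),
    "CID -> Ins_ID": fd_holds(rows, ["CID"], ["Ins_ID"]),   # C1 has I1 and I2
    "Phone -> Regno": fd_holds(rows, ["Phone"], ["Regno"]),  # 23456 is shared
}
for fd, ok in checks.items():
    print(fd, "holds" if ok else "is violated")
```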
2.2 Inference Rules for FDs (1)
 Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold

 Armstrong's inference rules:

 IR1. (Reflexive) If Y subset-of X, then X -> Y

 IR2. (Augmentation) If X -> Y, then XZ -> YZ

 (Notation: XZ stands for X U Z)

 IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z

 IR1, IR2, IR3 form a sound and complete set of inference rules

 That is, these rules hold (soundness), and all other rules that hold can be deduced from them (completeness)
Inference Rules for FDs (2)
 Some additional inference rules that are useful:

 Decomposition: If X -> YZ, then X -> Y and X -> Z

 Union: If X -> Y and X -> Z, then X -> YZ

 Pseudotransitivity: If X -> Y and WY -> Z, then WX -> Z

 The last three inference rules, as well as any other inference rules, can be deduced from IR1, IR2, and IR3
(completeness property)
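As a worked illustration of that claim, the three additional rules can each be derived in a few steps from IR1 (reflexive), IR2 (augmentation) and IR3 (transitive). The step layout below is one possible presentation.

```latex
\begin{align*}
&\textbf{Union: } X \to Y,\ X \to Z \ \Rightarrow\ X \to YZ \\
&\quad 1.\ X \to Y            && \text{given} \\
&\quad 2.\ X \to Z            && \text{given} \\
&\quad 3.\ X \to XY           && \text{IR2 on 1, augmenting with } X \ (XX = X) \\
&\quad 4.\ XY \to YZ          && \text{IR2 on 2, augmenting with } Y \\
&\quad 5.\ X \to YZ           && \text{IR3 on 3 and 4} \\[6pt]
&\textbf{Decomposition: } X \to YZ \ \Rightarrow\ X \to Y \\
&\quad 1.\ X \to YZ           && \text{given} \\
&\quad 2.\ YZ \to Y           && \text{IR1, since } Y \subseteq YZ \\
&\quad 3.\ X \to Y            && \text{IR3 on 1 and 2 (likewise for } Z\text{)} \\[6pt]
&\textbf{Pseudotransitivity: } X \to Y,\ WY \to Z \ \Rightarrow\ WX \to Z \\
&\quad 1.\ X \to Y            && \text{given} \\
&\quad 2.\ WY \to Z           && \text{given} \\
&\quad 3.\ WX \to WY          && \text{IR2 on 1, augmenting with } W \\
&\quad 4.\ WX \to Z           && \text{IR3 on 3 and 2}
\end{align*}
```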
2.3 Minimal Sets of FDs (1)
 A set of FDs F is minimal if it satisfies the following conditions:

1. Every dependency in F has a single attribute for its RHS.

2. We cannot remove any dependency from F and have a set of dependencies that is
equivalent to F.

3. We cannot replace any dependency X -> A in F with a dependency Y -> A, where Y is a proper subset of X, and still have a set of dependencies that is equivalent to F.
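The three conditions suggest a procedure for computing a minimal cover: split right-hand sides, drop extraneous left-hand attributes, then drop redundant dependencies. The sketch below follows that outline and applies it to the four Student FDs listed for this lecture; the function names and data representation are assumptions, and a set of FDs can have more than one minimal cover, so this returns just one of them.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs frozenset, single attribute)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, attr in fds:
            if lhs <= result and attr not in result:
                result.add(attr)
                changed = True
    return result


def minimal_cover(fds):
    # Condition 1: give every dependency a single attribute on its RHS.
    cover = [(frozenset(lhs), a) for lhs, rhs in fds for a in sorted(rhs)]
    # Condition 3: remove LHS attributes that are not needed to derive the RHS.
    for i, (lhs, attr) in enumerate(cover):
        for b in sorted(lhs):
            reduced = cover[i][0] - {b}
            if reduced and attr in closure(reduced, cover):
                cover[i] = (reduced, attr)
    cover = list(dict.fromkeys(cover))  # drop duplicates, keep order
    # Condition 2: remove dependencies implied by the remaining ones.
    for fd in list(cover):
        rest = [f for f in cover if f != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return cover


student_fds = [
    ({"Regno"}, {"Name", "DOB", "Phone", "Gender"}),
    ({"CID"}, {"CName"}),
    ({"Ins_ID"}, {"Ins_Name", "Ins_Office"}),
    ({"Regno", "CID"}, {"Ins_ID", "Ins_Name", "Ins_Office"}),
]

cover = minimal_cover(student_fds)
for lhs, attr in cover:
    print(", ".join(sorted(lhs)), "->", attr)
```

On this input, {Regno, CID} -> Ins_Name and {Regno, CID} -> Ins_Office are dropped because they follow from {Regno, CID} -> Ins_ID together with Ins_ID -> Ins_Name, Ins_Office.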
Thank you
