Lecture 10: BCSE302L - DBMS: Functional Dependencies
FUNCTIONAL DEPENDENCIES
Outline
Informal Guidelines
Functional Dependencies
Normal Forms Based on Primary Keys
First Normal Form
Second Normal Form
Third Normal Form
BCNF (Boyce-Codd Normal Form)
Fourth Normal Form
Fifth Normal Form
Objectives of database normalization
To eliminate duplicate data and the anomalies it causes.
To avoid creating and updating any unwanted data connections and dependencies.
To reduce the delay and complexity of checking databases when new types of data need to be introduced.
To facilitate the access and interpretation of data to users and applications that make use of the databases.
Normalization of Relations
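A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S ⊆ R with the property that no two distinct
tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]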
A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey
any more.
If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily
designated to be the primary key, and the others are called secondary keys.
A prime attribute is a member of some candidate key. A nonprime attribute is not a prime attribute; that is, it is not a member of any candidate key.
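The superkey and key definitions above can be tried out on a single relation state. The sketch below is illustrative only: the function names and the sample WORKS_ON-style rows are invented here, and because a key is a property of the schema (every legal state), one state can refute a key but never prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows in this state agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal superkeys of this state, found by trying smaller sets first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # A set that contains an already-found key is not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical relation state, invented for illustration.
works_on = [
    {"ssn": "111", "pnumber": 1, "hours": 10},
    {"ssn": "111", "pnumber": 2, "hours": 10},
    {"ssn": "222", "pnumber": 1, "hours": 10},
    {"ssn": "222", "pnumber": 2, "hours": 20},
]

keys = candidate_keys(works_on, ["ssn", "pnumber", "hours"])
```

In this state only the pair (ssn, pnumber) is a minimal superkey, so ssn and pnumber are the prime attributes and hours is nonprime.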
1 Informal Design Guidelines for Relational Databases
1.1 Semantics of the Relation Attributes
1.2 Redundant Information in Tuples and Update Anomalies
1.3 Null Values in Tuples
1.4 Spurious Tuples
Bottom Line: Design a schema that can be explained easily relation by relation. The
semantics of attributes should be easy to interpret.
A simplified COMPANY relational database schema
1.2 Redundant Information in Tuples and Update Anomalies
Storing information redundantly wastes storage and causes update anomalies:
Insertion anomalies
Deletion anomalies
Modification anomalies
EXAMPLE OF AN UPDATE ANOMALY
Update Anomaly:
Changing the name of project number P1 from “Billing” to “Customer-Accounting”
may cause this update to be made for all 100 employees working on project P1.
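The cost of this redundancy can be made concrete with a small sketch. The rows, attribute names, and the `rename_project` helper below are hypothetical, chosen only to mirror the 100-employee example; they are not taken from the COMPANY schema itself.

```python
# Hypothetical EMP_PROJ state: 100 employees, all working on project P1,
# each row repeating the project name.
emp_proj = [
    {"ssn": f"E{i:03d}", "pnumber": "P1", "pname": "Billing"}
    for i in range(100)
]

# Decomposed alternative: the project name is stored once, in PROJECT.
project = [{"pnumber": "P1", "pname": "Billing"}]

def rename_project(rows, pnumber, new_name):
    """Rename a project in every row that mentions it; return rows touched."""
    touched = 0
    for row in rows:
        if row["pnumber"] == pnumber:
            row["pname"] = new_name
            touched += 1
    return touched

redundant_updates = rename_project(emp_proj, "P1", "Customer-Accounting")
normalized_updates = rename_project(project, "P1", "Customer-Accounting")
```

The redundant design needs 100 row updates where the decomposed design needs one. If any of the 100 were missed, the relation would hold two different names for the same project.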
Two relation schemas suffering from update anomalies
Example States for EMP_DEPT and EMP_PROJ
EXAMPLE OF AN INSERT ANOMALY
Insert Anomaly:
Cannot insert a project unless an employee is assigned to it.
Conversely
Cannot insert an employee unless he/she is assigned to a project.
EXAMPLE OF A DELETE ANOMALY
Delete Anomaly:
When a project is deleted, it will result in deleting all the employees who work on that
project.
GUIDELINE 2:
Design a schema that does not suffer from the insertion, deletion and update
anomalies.
If there are any anomalies present, then note them so that applications can be made to
take them into account.
1.3 Null Values in Tuples
GUIDELINE 3:
Relations should be designed such that their tuples will have as few NULL values as possible
Attributes that are NULL frequently could be placed in separate relations (with the primary key)
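1.4 Spurious Tuples
The "lossless join" property is used to guarantee meaningful results for join operations.
GUIDELINE 4:
Relations should be designed to satisfy the lossless join condition, so that a natural join of the relations generates no spurious tuples.
Note that a decomposition has two important properties:
(a) Non-additive (lossless) join.
(b) Preservation of the functional dependencies.
Property (a) is extremely important and cannot be sacrificed.
Property (b) is less stringent and may be sacrificed.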
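A minimal sketch of how spurious tuples arise, assuming a tiny hypothetical EMP_PROJ-like state (the rows, the attribute names, and the `project` and `natural_join` helpers are all invented for illustration). Decomposing on an attribute that is not a key of either part loses information, while decomposing on the project number does not.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in result:
            result.append(t)
    return result

def natural_join(left, right):
    """Natural join on every attribute name the two relations share."""
    result = []
    for lt in left:
        for rt in right:
            common = set(lt) & set(rt)
            if all(lt[a] == rt[a] for a in common):
                merged = {**lt, **rt}
                if merged not in result:
                    result.append(merged)
    return result

# Hypothetical state: two projects that happen to share a location.
emp_proj = [
    {"ssn": "111", "pnumber": 1, "plocation": "Chennai"},
    {"ssn": "222", "pnumber": 2, "plocation": "Chennai"},
]

# Lossy decomposition: the join attribute plocation is not a key of either part.
emp_locs = project(emp_proj, ["ssn", "plocation"])
proj_locs = project(emp_proj, ["pnumber", "plocation"])
rejoined = natural_join(emp_locs, proj_locs)
spurious = [t for t in rejoined if t not in emp_proj]

# Lossless decomposition: pnumber determines plocation in this state.
works_on = project(emp_proj, ["ssn", "pnumber"])
proj = project(emp_proj, ["pnumber", "plocation"])
lossless = natural_join(works_on, proj)
```

Here `rejoined` has four tuples, two of which (employee 111 on project 2, employee 222 on project 1) never existed in `emp_proj`. The second decomposition rejoins to exactly the original two tuples.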
2 Functional Dependencies
Functional dependencies (FDs)
Are used to specify formal measures of the "goodness" of relational designs
Are constraints that are derived from the meaning and interrelationships of the data attributes
For example, an employee's SSN and a project number together determine the hours per week that the employee works on
the project:
{SSN, PNUMBER} -> HOURS
How to find functional dependencies for a relation?
The functional dependencies of a relation depend on the semantics of the domain it models, not on any one set of stored records.
For example, STUD_STATE -> STUD_COUNTRY holds because any two records with the same STUD_STATE also have the
same STUD_COUNTRY.
Examples of FD constraints
An FD is a property of the attributes in the schema R
Regno  Name     DOB          Phone  Gender  CID  CName  Ins_ID  Ins_Name  Ins_Office
14M01  Kumar    12-Jan-1996  12345  M       C1   DBMS   I1      Kesav     G123
14M05  Mary     10-Jun-1995  12367  F       C1   DBMS   I1      Kesav     G123
14M07  Ram      10-May-1996  12898  M       C1   DBMS   I2      Ragav     G127
14M01  Kumar    12-Jan-1996  12345  M       C3   DS     I5      Mani      G125
14B01  Revathi  10-Dec-1995  23456  F       C3   DS     I5      Mani      G125
14M09  Steve    23-Oct-1995  34567  M       C4   OS     I5      Mani      G125
14B03  Ramya    20-Jul-1996  23456  F       C4   OS     I5      Mani      G125
Regno → Name: for a given register number, there is exactly one name.
Regno → DOB: for any given register number in Student, there is exactly one DOB value.
Regno → Phone
Regno → Gender
These FDs can be written collectively as:
Regno → Name, DOB, Phone, Gender
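The FDs above can be checked against the sample rows. The sketch below uses a hypothetical `fd_holds` helper. Since an FD is a property of the schema rather than of one state, a passing check only means this state does not violate the FD, while a failing check does refute it.

```python
def fd_holds(rows, lhs, rhs):
    """True if rows that agree on lhs also agree on rhs (in this state)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for each lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

COLUMNS = ["Regno", "Name", "DOB", "Phone", "Gender",
           "CID", "CName", "Ins_ID", "Ins_Name", "Ins_Office"]

# The sample rows from the table above.
DATA = """\
14M01 Kumar 12-Jan-1996 12345 M C1 DBMS I1 Kesav G123
14M05 Mary 10-Jun-1995 12367 F C1 DBMS I1 Kesav G123
14M07 Ram 10-May-1996 12898 M C1 DBMS I2 Ragav G127
14M01 Kumar 12-Jan-1996 12345 M C3 DS I5 Mani G125
14B01 Revathi 10-Dec-1995 23456 F C3 DS I5 Mani G125
14M09 Steve 23-Oct-1995 34567 M C4 OS I5 Mani G125
14B03 Ramya 20-Jul-1996 23456 F C4 OS I5 Mani G125
"""
rows = [dict(zip(COLUMNS, line.split())) for line in DATA.splitlines()]

student_fd = fd_holds(rows, ["Regno"], ["Name", "DOB", "Phone", "Gender"])
regno_to_cid = fd_holds(rows, ["Regno"], ["CID"])
phone_to_regno = fd_holds(rows, ["Phone"], ["Regno"])
```

In this state Regno → Name, DOB, Phone, Gender is not violated. Regno → CID fails because 14M01 appears with both C1 and C3, and Phone → Regno fails because two students share the phone value 23456.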
2.2 Inference Rules for FDs (1)
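Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold.
Armstrong's inference rules:
IR1 (Reflexive): If Y is a subset of X, then X → Y
IR2 (Augmentation): If X → Y, then XZ → YZ
IR3 (Transitive): If X → Y and Y → Z, then X → Z
IR1, IR2, IR3 form a sound and complete set of inference rules.
These rules hold, and all other rules that hold can be deduced from them.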
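A common way to apply the inference rules mechanically is to compute the closure of an attribute set: the set of all attributes it determines under F. The sketch below is one minimal formulation, not the only one. The FD STUD_NO → STUD_STATE is an assumption added for illustration; STUD_STATE → STUD_COUNTRY is the FD discussed earlier.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything in lhs is already determined, so is rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be inferred from fds."""
    return set(rhs) <= closure(lhs, fds)

fds = [
    ({"STUD_NO"}, {"STUD_STATE"}),       # assumed for illustration
    ({"STUD_STATE"}, {"STUD_COUNTRY"}),  # from the lecture's example
]

stud_no_closure = closure({"STUD_NO"}, fds)
```

The closure of {STUD_NO} is {STUD_NO, STUD_STATE, STUD_COUNTRY}, so STUD_NO → STUD_COUNTRY is inferred (by transitivity), while STUD_STATE → STUD_NO is not.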
Inference Rules for FDs (2)
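Some additional inference rules that are useful:
Decomposition: If X → YZ, then X → Y and X → Z
Union: If X → Y and X → Z, then X → YZ
Pseudotransitivity: If X → Y and WY → Z, then WX → Z
The last three inference rules, as well as any other inference rules, can be deduced from IR1, IR2, and IR3
(completeness property).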
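As a worked example of such a deduction, the union rule (if X → Y and X → Z, then X → YZ) can be derived as follows. This assumes the usual reading of IR2 as augmentation and IR3 as transitivity.

```latex
\begin{align*}
&1.\; X \to Y   && \text{given} \\
&2.\; X \to Z   && \text{given} \\
&3.\; X \to XY  && \text{IR2 on 1, augmenting with } X \text{ (note } XX = X\text{)} \\
&4.\; XY \to YZ && \text{IR2 on 2, augmenting with } Y \\
&5.\; X \to YZ  && \text{IR3 on 3 and 4}
\end{align*}
```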
2.4 Minimal Sets of FDs (1)
A set of FDs F is minimal if it satisfies the following conditions:
1. Every dependency in F has a single attribute for its right-hand side.
2. We cannot remove any dependency from F and have a set of dependencies that is
equivalent to F.
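Condition 2 can be tested with attribute closure: a dependency X → A is removable exactly when A is still in the closure of X under the remaining dependencies. The sketch below uses hypothetical helper names and an invented three-FD set over attributes A, B, C.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def redundant_fds(fds):
    """Return each FD that is implied by the other FDs in the set."""
    redundant = []
    for i, (lhs, rhs) in enumerate(fds):
        others = fds[:i] + fds[i + 1:]
        if rhs <= closure(lhs, others):
            redundant.append((lhs, rhs))
    return redundant

# Hypothetical FD set, invented for illustration.
fds = [
    ({"A"}, {"B"}),
    ({"B"}, {"C"}),
    ({"A"}, {"C"}),  # follows from the two above by transitivity
]

removable = redundant_fds(fds)
```

Only A → C is reported, so this set violates condition 2 until A → C is dropped. Each reported FD is removable on its own; when several are reported, remove one at a time and recheck, since removing one can make another necessary.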