DBMS Unit 2
Class - T.Y.PLD
(Division-)
AY 2023-2024
SEM-I
Unit – II
Database Design
MIT School of Computing
Department of Computer Science & Engineering
Syllabus
Functional Dependency
• A functional dependency (FD) is a relationship between
attributes of a table in which the value of one attribute (or
set of attributes) determines the value of another.
• Identifying functional dependencies helps you to maintain the
quality of data in the database.
• A functional dependency is denoted by an arrow →.
• X → Y is read as "X determines Y": any two rows that agree on
X must also agree on Y.
• X is the determinant attribute and Y is the dependent
attribute.
• E.g. sid → sname
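The definition can be checked mechanically on sample data. The sketch below is a minimal illustration, not a DBMS feature: `fd_holds` is a hypothetical helper name, and the student rows are made up for the example. It tests whether every pair of rows that agrees on the determinant also agrees on the dependent attribute.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical student rows, made up for illustration.
students = [
    {"sid": 1, "sname": "Asha"},
    {"sid": 2, "sname": "Ravi"},
    {"sid": 3, "sname": "Asha"},
]

print(fd_holds(students, ["sid"], ["sname"]))  # True: sid -> sname holds
print(fd_holds(students, ["sname"], ["sid"]))  # False: two students share a name
```

Note that sample data can only refute a dependency. Whether sid → sname holds in general is a design decision about the meaning of the data.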
Types of Functional Dependency
1. Multivalued dependency
2. Trivial functional dependency
3. Non-trivial functional dependency
4. Transitive dependency
1. Multivalued dependency - A multivalued dependency occurs
when a single table contains two or more independent
multivalued attributes.
2. Trivial functional dependency - A dependency X → Y is
trivial if Y is a subset of X.
E.g. {sid, sname} → sid
3. Non-trivial functional dependency - A dependency A → B is
non-trivial if B is not a subset of A.
E.g. sid → sname
4. Transitive dependency - A transitive dependency is a
functional dependency that is formed indirectly by two other
functional dependencies: if X → Y and Y → Z, then X → Z.
Properties of functional dependencies
1. Reflexivity: If Y is a subset of X, then X → Y always
holds.
e.g. sid → sid
2. Augmentation: If X → Y, then XZ → YZ.
e.g. {sid, phoneno} → {sname, phoneno}
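Applying rules like these repeatedly, together with transitivity (if X → Y and Y → Z, then X → Z), gives the attribute closure of X: the set of all attributes that X determines. The sketch below is one common way to compute it. The function name `closure` and the attributes `deptid` and `dname` are assumptions added for illustration; only sid → sname comes from the slides.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs.

    fds is a list of (lhs, rhs) pairs; each side is a set of attribute names.
    """
    result = set(attrs)          # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    ({"sid"}, {"sname"}),
    ({"sid"}, {"deptid"}),     # hypothetical FD
    ({"deptid"}, {"dname"}),   # hypothetical FD
]

print(sorted(closure({"sid"}, fds)))     # ['deptid', 'dname', 'sid', 'sname']
print(sorted(closure({"deptid"}, fds)))  # ['deptid', 'dname']
```

Here sid → dname is never listed explicitly; it appears in the closure because sid → deptid and deptid → dname combine transitively.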
Rule 5: Comprehensive Data Sub-Language Rule
• A database can only be accessed using a language
having linear syntax that supports data definition, data
manipulation, and transaction management operations.
This language can be used directly or by means of
some application. If the database allows access to data
without the help of this language, it is considered
a violation.
Rule 6: View Updating Rule
• All the views of a database, which can theoretically be
updated, must also be updatable by the system.
Rule 7: High-Level Insert, Update, and Delete Rule
• A database must support high-level insert, update,
and delete operations. These must not be limited to a
single row: the system must also support union,
intersection and minus operations to yield sets of data
records.
Rule 8: Physical Data Independence
• The data stored in a database must be independent of
the applications that access the database. Any change in
the physical structure of a database must not have any
impact on how the data is being accessed by external
applications.
Rule 9: Logical Data Independence
• The logical data in a database must be independent of the user's
view (application). Any change in logical data must not affect the
applications using it. For example, if two tables are merged or
one is split into two different tables, there should be no impact or
change on the user application. This is one of the most difficult
rules to apply.
Rule 10: Integrity Independence
• A database must be independent of the application that uses it.
All its integrity constraints can be independently modified
without the need of any change in the application. This rule
makes a database independent of the front-end application and
its interface.
Rule 11: Distribution Independence
• The end-user must not be able to see that the data is
distributed over various locations. Users should always
get the impression that the data is located at one site
only. This rule has been regarded as the foundation of
distributed database systems.
Rule 12: Non-Subversion Rule
• If a system has an interface that provides access to low-
level records, then the interface must not be able to
subvert the system and bypass security and integrity
constraints.
Normalization
• Normalization is a database design technique that reduces data
redundancy and eliminates undesirable characteristics like
Insertion, Update and Deletion Anomalies.
Types of Normal forms
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF (Boyce Codd Normal Form)
5. Fifth Normal Form
1st Normal Form (1NF)
Let’s understand the First Normal Form with the help of an example.
Below is a students’ record table that has information about
student roll number, student name, student course, and age of
the student.
In the students record table, the course column holds two values
in a single cell. First Normal Form requires every column to hold
only atomic (single) values, so the table is not in 1NF.
Applying the First Normal Form to the previous table, with one
course per row, gives the table below as a result.
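The conversion to 1NF can be sketched as splitting each multi-valued cell into separate rows. The rows below are hypothetical, since the original table is not reproduced here. The helper name `to_1nf` and the comma separator are likewise assumptions made for the example.

```python
def to_1nf(rows, column, sep=","):
    """Split a multi-valued column so each output row holds one atomic value."""
    flat = []
    for row in rows:
        for value in row[column].split(sep):
            # Copy the row, replacing the multi-valued cell with one value.
            flat.append({**row, column: value.strip()})
    return flat


# Hypothetical student records; the first row stores two courses in one cell.
students = [
    {"rollno": 101, "name": "Asha", "course": "C, C++", "age": 20},
    {"rollno": 102, "name": "Ravi", "course": "Java", "age": 21},
]

for row in to_1nf(students, "course"):
    print(row)
```

After the split, rollno alone no longer identifies a row, so the key of the 1NF table becomes the combination of rollno and course.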
Second Normal Form (2NF)
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
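The split into the two tables above can be reproduced by projecting the combined rows onto each group of columns and dropping duplicates. This sketch assumes the combined table has the column order TEACHER_ID, SUBJECT, TEACHER_AGE, and that TEACHER_AGE depends on TEACHER_ID alone (a partial dependency on the key {TEACHER_ID, SUBJECT}). The helper name `project` is hypothetical.

```python
# Combined rows: (TEACHER_ID, SUBJECT, TEACHER_AGE)
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]


def project(rows, indexes):
    """Keep only the given column positions and drop duplicate rows."""
    result = []
    for row in rows:
        projected = tuple(row[i] for i in indexes)
        if projected not in result:
            result.append(projected)
    return result


teacher_age = project(teacher, [0, 2])      # TEACHER_ID, TEACHER_AGE
teacher_subject = project(teacher, [0, 1])  # TEACHER_ID, SUBJECT

print(teacher_age)      # [(25, 30), (47, 35), (83, 38)]
print(teacher_subject)  # same five (id, subject) pairs as the original
```

Each teacher's age is now stored once instead of once per subject, which removes the redundancy caused by the partial dependency.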
Third Normal Form (3NF)
EMPLOYEE_ZIP table:
Boyce Codd Normal Form (BCNF)
In the example table (columns stuid, subject, and professor), stuid
and subject together form the primary key, because together they
determine all the table columns.
Another important point is that one professor teaches only one
subject, but one subject may be taught by two professors.
So there is a dependency between professor and subject:
the subject depends on the professor.
This table satisfies all the normal forms up to 3NF, but not the
Boyce Codd Normal Form.
stuid and subject form the primary key, which means the subject
attribute is a prime attribute.
However, there exists another dependency: professor → subject.
The table is not in BCNF because the determinant, professor, is a
non-prime attribute and not a super key, yet it determines the
prime attribute subject.
To transform the table into BCNF, divide it into two tables.
The first table holds stuid, which already exists, together with
a newly created column profid.
The second table holds the columns profid, subject, and professor,
which satisfies BCNF.
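The BCNF condition (every non-trivial dependency must have a super key as its determinant) can be tested with attribute closures. The sketch below checks only the original table, using the two dependencies named in the text. The function names `closure` and `bcnf_violations` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a super key."""
    violations = []
    for lhs, rhs in fds:
        is_trivial = rhs <= lhs
        is_superkey = closure(lhs, fds) >= set(attributes)
        if not is_trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


attributes = {"stuid", "subject", "professor"}
fds = [
    ({"stuid", "subject"}, {"professor"}),  # the primary key determines professor
    ({"professor"}, {"subject"}),           # one professor teaches one subject
]

print(bcnf_violations(attributes, fds))  # [({'professor'}, {'subject'})]
```

Only professor → subject is reported, because the closure of {professor} does not include stuid.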
Multivalued Dependency
•A multivalued dependency occurs when two attributes in a table are
independent of each other but both depend on a third attribute.
•A multivalued dependency involves at least two attributes that
depend on a third attribute, so it always requires at least three
attributes.
Example: Suppose a bike manufacturer produces two colors (white
and black) of each model every year.
BIKE_MODEL MANUF_YEAR COLOR
M2011 2008 White
M2001 2008 Black
M3001 2013 White
M3001 2013 Black
M4006 2017 White
M4006 2017 Black
Here the columns COLOR and MANUF_YEAR depend on BIKE_MODEL and
are independent of each other. This is written as
BIKE_MODEL →→ MANUF_YEAR and BIKE_MODEL →→ COLOR.
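One way to test this independence on data: for each BIKE_MODEL, the (MANUF_YEAR, COLOR) pairs stored must be exactly every combination of that model's years with that model's colors. The sketch below uses the rows of the table as given. The helper name `mvd_holds` and the extra 2014 row are assumptions added for illustration.

```python
from itertools import product

# Rows as given in the table: (BIKE_MODEL, MANUF_YEAR, COLOR)
bikes = [
    ("M2011", 2008, "White"),
    ("M2001", 2008, "Black"),
    ("M3001", 2013, "White"),
    ("M3001", 2013, "Black"),
    ("M4006", 2017, "White"),
    ("M4006", 2017, "Black"),
]


def mvd_holds(rows):
    """Check X ->> Y (and X ->> Z) for rows of the form (x, y, z)."""
    groups = {}
    for x, y, z in rows:
        ys, zs, pairs = groups.setdefault(x, (set(), set(), set()))
        ys.add(y)
        zs.add(z)
        pairs.add((y, z))
    # Independence: each x must pair every one of its y values with every z value.
    return all(pairs == set(product(ys, zs)) for ys, zs, pairs in groups.values())


print(mvd_holds(bikes))  # True

# Hypothetical extra row: M3001 in 2014 stored in White only, with no Black row.
print(mvd_holds(bikes + [("M3001", 2014, "White")]))  # False
```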
Types of Decomposition
1. Lossless Decomposition
2. Lossy decomposition
Lossless Decomposition
•If no information is lost when a relation is decomposed, the
decomposition is lossless.
•A lossless decomposition guarantees that joining the decomposed
relations gives back exactly the relation that was decomposed.
•A decomposition is lossless if the natural join of all the
decomposed relations gives the original relation.
Example:
EMPLOYEE_DEPARTMENT table:
EMP_ID EMP_NAME EMP_AGE EMP_CITY DEPT_ID DEPT_NAME
22 Denim 28 Mumbai 827 Sales
33 Alina 25 Delhi 438 Marketing
46 Stephan 30 Bangalore 869 Finance
52 Katherine 36 Mumbai 575 Production
60 Jack 40 Noida 678 Testing
The above relation is decomposed into two relations,
EMPLOYEE and DEPARTMENT.
EMPLOYEE table:
EMP_ID EMP_NAME EMP_AGE EMP_CITY
22 Denim 28 Mumbai
33 Alina 25 Delhi
46 Stephan 30 Bangalore
52 Katherine 36 Mumbai
60 Jack 40 Noida
DEPARTMENT table:
DEPT_ID EMP_ID DEPT_NAME
827 22 Sales
438 33 Marketing
869 46 Finance
575 52 Production
678 60 Testing
Now, when these two relations are joined on the common column
"EMP_ID", then the resultant relation will look like:
Employee ⋈ Department
EMP_ID EMP_NAME EMP_AGE EMP_CITY DEPT_ID DEPT_NAME
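For a lossless decomposition the join must reproduce the original relation exactly, with no missing and no extra rows. The sketch below rebuilds the two tables from the example data and joins them on EMP_ID. The helper name `join_on_emp_id` is hypothetical, and the column order of DEPARTMENT is assumed to be DEPT_ID, EMP_ID, DEPT_NAME.

```python
# EMPLOYEE_DEPARTMENT rows:
# (EMP_ID, EMP_NAME, EMP_AGE, EMP_CITY, DEPT_ID, DEPT_NAME)
original = [
    (22, "Denim", 28, "Mumbai", 827, "Sales"),
    (33, "Alina", 25, "Delhi", 438, "Marketing"),
    (46, "Stephan", 30, "Bangalore", 869, "Finance"),
    (52, "Katherine", 36, "Mumbai", 575, "Production"),
    (60, "Jack", 40, "Noida", 678, "Testing"),
]

# The decomposition: EMPLOYEE keeps the first four columns,
# DEPARTMENT keeps (DEPT_ID, EMP_ID, DEPT_NAME).
employee = [row[:4] for row in original]
department = [(row[4], row[0], row[5]) for row in original]


def join_on_emp_id(employee, department):
    """Natural join of EMPLOYEE and DEPARTMENT on the shared EMP_ID column."""
    dept_by_emp = {}
    for dept_id, emp_id, dept_name in department:
        dept_by_emp.setdefault(emp_id, []).append((dept_id, dept_name))
    return [
        emp + dept
        for emp in employee
        for dept in dept_by_emp.get(emp[0], [])
    ]


joined = join_on_emp_id(employee, department)
print(sorted(joined) == sorted(original))  # True: the decomposition is lossless
```

The join is lossless here because the shared column EMP_ID is a key of EMPLOYEE, so each DEPARTMENT row matches exactly one employee.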
Question
• What is lossless decomposition? Suppose that we decompose the schema
R = (A, B, C, D, E) into (A, B, C) and (A, D, E). Show that this
decomposition is a lossless decomposition if the following set F of
functional dependencies holds: A → BC, CD → E, B → D, E → A.
• Given relation R(A, B, C, D, E) with dependencies AB → C, CD → E,
DE → B, is AB a candidate key? Is ABD a candidate key?
• List the desirable properties of decomposition. Explain lossless join with
an example.
• Define Boyce Codd Normal Form. How is it different from 3NF? Why is it
considered a stronger form of 3NF?