Dbms Mod 4 Notes
Module – 4
Database Programming
Introduction to Database Normalization:
• Normalization is an important process in database design that helps to improve the
database’s efficiency, consistency, and accuracy. It makes it easier to manage and
maintain the data and ensures that the database is adaptable to changing business
needs.
• Database normalization is the process of organizing the attributes of the database
to reduce or eliminate data redundancy (the same data stored in more than one
place).
Nonloss/Lossless Decomposition:
• Lossless join decomposition is a decomposition of a relation R into relations R1 and
R2 such that the natural join of R1 and R2 returns exactly the original relation R.
This removes redundancy from the database while preserving the original data.
• Example of Lossless Decomposition
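The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation R(STUD_NO, STUD_NAME, BRANCH), its rows, and the helper names make_relation, project and natural_join are all hypothetical and are not taken from these notes. A relation is modelled as a set of rows, and each row as a set of (attribute, value) pairs.

```python
def make_relation(attrs, rows):
    """Build a relation: a set of rows, each row a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}


def project(relation, attrs):
    """Projection: keep only the given attributes (duplicates vanish because it is a set)."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in relation}


def natural_join(r1, r2):
    """Natural join: combine rows that agree on all common attributes."""
    joined = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


# Hypothetical relation R(STUD_NO, STUD_NAME, BRANCH); STUD_NO is assumed to be the key.
R = make_relation(
    ("STUD_NO", "STUD_NAME", "BRANCH"),
    [(1, "Asha", "CSE"), (2, "Ravi", "ECE"), (3, "Asha", "ECE")],
)

# Decomposition 1: both parts contain the key STUD_NO, so the join is lossless.
lossless = natural_join(
    project(R, {"STUD_NO", "STUD_NAME"}), project(R, {"STUD_NO", "BRANCH"})
)

# Decomposition 2: the parts share only STUD_NAME, which is not a key, so the
# join produces spurious rows and R cannot be recovered.
lossy = natural_join(
    project(R, {"STUD_NO", "STUD_NAME"}), project(R, {"STUD_NAME", "BRANCH"})
)

print(lossless == R)           # True
print(lossy == R, len(lossy))  # False 5
```

In the second decomposition the join returns 5 rows instead of the original 3, because the two students named "Asha" are each paired with both branches. Extra (spurious) rows of this kind are what make a decomposition lossy.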
normalization. With the help of functional dependencies we are able to identify the
primary key and candidate keys of a table, which in turn helps in normalization.
2. Query Optimization
With the help of functional dependencies we are able to decide how the tables
are connected and which attributes need to be projected to retrieve the required
data from the tables. This helps in query optimization and improves performance.
3. Consistency of Data
Functional dependencies ensure the consistency of the data by removing
any redundancies or inconsistencies that may exist in it. A functional
dependency ensures that a change made to one attribute does not introduce an
inconsistency in another set of attributes, and thus maintains the consistency of
the data in the database.
4. Data Quality Improvement
• (Note that many courses have the same course fee.) Here, COURSE_FEE alone cannot
determine the value of COURSE_NO or STUD_NO; COURSE_FEE together with STUD_NO
cannot determine the value of COURSE_NO; and COURSE_FEE together with COURSE_NO
cannot determine the value of STUD_NO. Hence, COURSE_FEE is a non-prime attribute,
as it does not belong to the only candidate key {STUD_NO, COURSE_NO}. But
COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO, which is a
proper subset of the candidate key. A non-prime attribute that depends on a proper
subset of a candidate key is a partial dependency, so this relation is not in 2NF.
To convert the above relation to 2NF, we need to split the table into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE
• NOTE: 2NF reduces the amount of redundant data that gets stored. For instance, if
100 students take course C1, we do not need to store its fee of 1000 in all 100
records; instead, we store it once in the second table as the course fee for C1.
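The partial dependency and the 2NF split described above can be checked on sample data. The following is a minimal sketch: the rows are hypothetical values chosen for illustration (the original table is not reproduced here), and fd_holds is an assumed helper name, not part of any library.

```python
# Hypothetical rows of the relation (STUD_NO, COURSE_NO, COURSE_FEE).
rows = [
    (1, "C1", 1000),
    (2, "C2", 1500),
    (1, "C4", 2000),
    (4, "C3", 1000),
    (4, "C1", 1000),
    (2, "C5", 2000),
]
STUD_NO, COURSE_NO, COURSE_FEE = 0, 1, 2  # column positions


def fd_holds(rows, lhs, rhs):
    """Return True if the columns in lhs functionally determine the columns in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        value = tuple(row[i] for i in rhs)
        if seen.setdefault(key, value) != value:
            return False  # same lhs value, different rhs value
    return True


# Partial dependency: COURSE_FEE depends on COURSE_NO alone,
# which is only a part of the candidate key {STUD_NO, COURSE_NO}.
partial = fd_holds(rows, [COURSE_NO], [COURSE_FEE])

# The reverse does not hold, since several courses share the same fee.
reverse = fd_holds(rows, [COURSE_FEE], [COURSE_NO])

# 2NF decomposition: project the rows onto the two smaller schemas.
table1 = sorted({(r[STUD_NO], r[COURSE_NO]) for r in rows})
table2 = sorted({(r[COURSE_NO], r[COURSE_FEE]) for r in rows})

print(partial, reverse)  # True False
print(table2)            # each course fee is now stored exactly once
```

With this data, Table 1 keeps all six enrolments, while Table 2 holds one row per course, so the fee of C1 is stored once no matter how many students take it.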
• X is a super key.
• Y is a prime attribute (each element of Y is part of some candidate key).
• In other words, a relation is in Third Normal Form (3NF) if it is in First and Second
Normal Form and no non-primary-key attribute is transitively dependent on the
primary key.
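The two conditions above (X is a super key, or Y is a prime attribute) can be tested mechanically once the functional dependencies are known. The sketch below is one possible way to do it, using attribute closure and a brute-force search for candidate keys. The schema (STU_ID, STU_NAME, BRANCH_ID, BRANCH_NAME), the dependencies, and all function names are assumptions made for illustration, and the brute-force search is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Find all minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) >= set(schema):
                keys.append(attrs)
    return keys


def violations_3nf(schema, fds):
    """List the given dependencies X -> Y where X is not a super key and Y is not prime."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)  # ignore the trivial part of the dependency
        is_super_key = closure(lhs, fds) >= set(schema)
        if extra and not is_super_key and not extra <= prime:
            bad.append((lhs, rhs))
    return bad


# Hypothetical schema: BRANCH_NAME depends on STU_ID only through BRANCH_ID.
schema = {"STU_ID", "STU_NAME", "BRANCH_ID", "BRANCH_NAME"}
fds = [
    (("STU_ID",), ("STU_NAME", "BRANCH_ID")),
    (("BRANCH_ID",), ("BRANCH_NAME",)),
]

print(candidate_keys(schema, fds))  # [{'STU_ID'}]
print(violations_3nf(schema, fds))  # [(('BRANCH_ID',), ('BRANCH_NAME',))]

# After splitting off (BRANCH_ID, BRANCH_NAME), neither table has a violation.
student = violations_3nf({"STU_ID", "STU_NAME", "BRANCH_ID"}, fds[:1])
branch = violations_3nf({"BRANCH_ID", "BRANCH_NAME"}, fds[1:])
print(student, branch)              # [] []
```

The reported violation, BRANCH_ID -> BRANCH_NAME, is exactly a transitive dependency: STU_ID determines BRANCH_ID, which in turn determines BRANCH_NAME.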
• Example 1: Let us consider a student database, in which the data of the students
are recorded.
• Stu_Branch Table:
1
V. SHARVANI, ASST. PROF.,MCA,BITM
0
INTRODUCTION TO SQL
three attributes because it consists of at least two attributes that are dependent on
a third.
• For a dependency A -> B, if multiple values of B exist for a single value of A, then
the table may have a multivalued dependency. For a multivalued dependency A ->> B,
the table should have at least 3 attributes (A, B, C), and B and C should be
independent of each other.
Example: Consider a class database with two relations: R1 contains student ID (SID)
and student name (SNAME), and R2 contains course ID (CID) and course name
(CNAME).
Table R1 Table R2
• Table R1 × R2
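The condition for a multivalued dependency A ->> B stated earlier (B and C independent of each other for a given A) can be expressed as a small check. The sketch below uses a hypothetical three-attribute relation with made-up values, and mvd_holds is an assumed helper name. It tests the usual formal definition: whenever two rows agree on A, the row that takes B from the first and C from the second must also be present.

```python
from itertools import product


def mvd_holds(rows, x, y):
    """Check whether the multivalued dependency x ->> y holds in rows (a list of dicts)."""
    attrs = set(rows[0])
    z = attrs - set(x) - set(y)  # the remaining attributes
    present = {frozenset(r.items()) for r in rows}
    for t1, t2 in product(rows, repeat=2):
        if all(t1[a] == t2[a] for a in x):
            # Take x and y from t1, and the remaining attributes from t2.
            t3 = {a: (t2[a] if a in z else t1[a]) for a in attrs}
            if frozenset(t3.items()) not in present:
                return False
    return True


# Hypothetical relation R(A, B, C): for a1, every B value appears with every C value.
r = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a1", "B": "b1", "C": "c2"},
    {"A": "a1", "B": "b2", "C": "c1"},
    {"A": "a1", "B": "b2", "C": "c2"},
    {"A": "a2", "B": "b3", "C": "c1"},
]

# Removing one combination makes B and C depend on each other for a1.
broken = [row for row in r if not (row["B"] == "b2" and row["C"] == "c2")]

print(mvd_holds(r, ["A"], ["B"]))       # True
print(mvd_holds(broken, ["A"], ["B"]))  # False
```

In the first relation, the B values and C values for a1 form a complete cross product, which is what "independent" means here. Once a single combination is missing, the dependency A ->> B no longer holds.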
Join Dependency:
• A join dependency is a further generalization of multivalued dependencies. If the
join of R1 and R2 over C is equal to relation R, then we can say that a join dependency
(JD) exists, where R1(A, B, C) and R2(C, D) are the decomposition of a given
relation R(A, B, C, D). Equivalently, R1 and R2 are a lossless decomposition of R.
In general, a JD ⋈ {R1, R2, …, Rn} is said to hold over a relation R if R1, R2, …, Rn
is a lossless-join decomposition of R. Thus *((A, B, C), (C, D)) is a JD of R if the
join of the two projections over their common attribute is equal to the relation R.
Here, *(R1, R2, R3, …) is used to indicate that the relations R1, R2, R3 and so on
form a JD of R, where R is a relation schema and R1, R2, …, Rn is a decomposition of R.
• Example:
• Table R1: Table R2 Table R3
• Table R1⋈R2⋈R3
Agent ->> Product
• Properties
• A relation R is in 5NF if and only if it satisfies the following conditions:
1. R should already be in 4NF.
2. It cannot be decomposed any further without loss (join dependency).
• Example – Consider the above schema, with the rule “if a company makes a
product and an agent is an agent for that company, then the agent always sells that
product for the company”. Under these circumstances, the ACP table is shown as:
• Table ACP
The relation ACP is again decomposed into 3 relations. The natural join of all three
relations is then shown as:
• Table R1 Table R2 Table R3
• The result of the natural join of R1 and R3 over ‘Company’, followed by the natural
join of R13 and R2 over ‘Agent’ and ‘Product’, is Table ACP.
• Hence, in this example, all the redundancies are eliminated, and the decomposition
of ACP is a lossless join decomposition. Therefore, the relation is in 5NF as it does
not violate the property of lossless join.
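The join order described above can be traced step by step in Python. This is a sketch under assumptions: the rows of ACP are sample data invented for illustration, the assignment R1(Agent, Company), R2(Agent, Product), R3(Company, Product) is inferred from the join attributes named in the text, and the helper functions are hypothetical.

```python
def make_relation(attrs, rows):
    """Build a relation: a set of rows, each row a frozenset of (attribute, value) pairs."""
    return {frozenset(zip(attrs, row)) for row in rows}


def project(relation, attrs):
    """Projection: keep only the given attributes, dropping duplicate rows."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in relation}


def natural_join(r1, r2):
    """Natural join: combine rows that agree on all common attributes."""
    joined = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined


# Assumed sample rows for ACP(Agent, Company, Product).
acp = make_relation(
    ("Agent", "Company", "Product"),
    [
        ("A1", "PQR", "Nut"),
        ("A1", "PQR", "Bolt"),
        ("A1", "XYZ", "Nut"),
        ("A1", "XYZ", "Bolt"),
        ("A2", "PQR", "Nut"),
    ],
)

r1 = project(acp, {"Agent", "Company"})
r2 = project(acp, {"Agent", "Product"})
r3 = project(acp, {"Company", "Product"})

# Step 1: join R1 and R3 over Company. This intermediate result R13 may
# contain rows that are not in ACP.
r13 = natural_join(r1, r3)
spurious = r13 - acp

# Step 2: join R13 and R2 over Agent and Product, which filters those rows out.
result = natural_join(r13, r2)

print(len(r13), len(spurious))  # 6 1
print(result == acp)            # True
```

With this data, the intermediate join R13 contains one extra row (agent A2 selling Bolt for PQR), and the final join with R2 removes it, so the three projections together reproduce ACP exactly.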
• First Normal Form (1NF): This is the most basic level of normalization. In 1NF, each
table cell should contain only a single (atomic) value, and each column should have a
unique name. The first normal form helps to eliminate repeating groups and simplify
queries.
• Second Normal Form (2NF): 2NF eliminates redundant data by requiring that each
non-key attribute be fully dependent on the whole primary key. This means that no
non-key column may depend on only a part of a composite key.
• Third Normal Form (3NF): 3NF builds on 2NF by requiring that no non-key attribute
be transitively dependent on the primary key. This means that each non-key column
should depend directly on the key, and not on any other non-key column in the same
table.
• Boyce-Codd Normal Form (BCNF): BCNF is a stricter form of 3NF that ensures that
each determinant in a table is a candidate key. In other words, for every non-trivial
functional dependency X -> Y, X must be a super key.
• Fourth Normal Form (4NF): 4NF is a further refinement of BCNF that ensures that a
table does not contain any non-trivial multivalued dependencies, other than those
whose determinant is a super key.
• Fifth Normal Form (5NF): 5NF is the highest of the normal forms covered here. It
involves decomposing a table into smaller tables, to remove data redundancy and
improve data integrity, until no further lossless decomposition is possible.