
Part 2

Uploaded by

xehad13913
Copyright
© © All Rights Reserved

PART-2
Schema Refinement:

Problems caused by redundancy
Decompositions
problems related to decomposition
reasoning about functional dependencies
FIRST, SECOND, THIRD normal forms, BCNF
lossless join decomposition
multi-valued dependencies
FOURTH normal form
FIFTH normal form.
Schema Refinement
 Schema refinement is the process of improving a database schema so that it meets the desired
criteria of correctness, efficiency, and maintainability.
 The basic goal of schema refinement is to eliminate redundancy, ensure data
integrity, and optimize performance.
 It is a systematic approach of decomposing tables to eliminate data redundancy
and undesirable characteristics such as insertion, update, and deletion anomalies.
 Schema refinement improves an existing schema by applying techniques such as normalization and decomposition.
 The main technique of schema refinement is decomposition.
 Redundancy refers to repetition of the same data, or duplicate copies of the same data
stored in different locations.
Importance of Schema Refinement
1. Eliminate Redundancy: Reduces duplicated data, which
saves storage and avoids anomalies.
2. Ensure Data Integrity: Maintains accuracy and
consistency of data through well-defined relationships
and constraints.
3. Optimize Performance: Enhances query performance by
organizing data in an efficient manner.
4. Simplify Maintenance: Eases database management
tasks like updates, insertions, and deletions by ensuring
a clean, logical structure.
Process of Schema Refinement
Analysis: Assess the current schema for inefficiencies and
anomalies.
Normalization: Apply normalization principles to structure the
schema into appropriate normal forms.
Decomposition: Break down larger tables into smaller, related
tables to eliminate redundancy.
Validation: Ensure that decompositions are lossless and preserve
dependencies.
Optimization: Fine-tune the schema for performance by indexing,
partitioning, and other techniques.
Anomalies or Problems Faced without Normalisation
Anomalies are the problems that occur in poorly
planned, unnormalised databases where all the data is
stored in one table, which is sometimes called a flat file
database. Let us consider a schema of this type:

Here all the data is stored in a single table, which causes redundancy of data
and hence anomalies, as SID and Sname are
repeated once for the same CID.
Problems caused by Redundancy
Storing the same information redundantly, that is, in more than one
place within a database, can lead to several problems:
Redundant storage: Some information is stored repeatedly.
Update anomalies: If one copy of such repeated data is updated, an
inconsistency is created unless all copies are similarly updated.
Insertion anomalies: It may not be possible to store some
information unless some other information is stored as well.
Deletion anomalies: It may not be possible to delete some
information without losing some other information as well.
Updation/Modification Anomaly
If the fee is updated from 5000 to 7000,
then the FEE column must be updated in all the
rows that store it; otherwise the data will become inconsistent.
Insertion Anomaly and Deletion Anomaly
These anomalies exist only due to redundancy;
otherwise they do not exist.
Insertion Anomaly :
- A new course C4 is introduced, but no student has taken C4 yet,
so the course cannot be stored without a student.
Deletion Anomaly :
- Deleting student S3 also deletes the course information
stored with that student.
- Deleting some data forces the deletion of some
other useful data.
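The three anomalies can be reproduced on a small flat table. The sketch below is only illustrative: the rows, names, fees, and the helper `fees_by_course` are assumptions, not the table shown on the slide.

```python
# Hypothetical flat table: (SID, Sname, CID, Cname, FEE). All values are assumed.
rows = [
    ("S1", "A", "C1", "C", 5000),
    ("S2", "B", "C1", "C", 5000),
    ("S3", "C", "C2", "Java", 8000),
]

def fees_by_course(table):
    """Map each CID to the set of distinct fees stored for it."""
    fees = {}
    for _sid, _sname, cid, _cname, fee in table:
        fees.setdefault(cid, set()).add(fee)
    return fees

# Update anomaly: changing the C1 fee in only one row leaves two fees for C1.
partially_updated = [rows[0][:4] + (7000,)] + rows[1:]
inconsistent = {cid for cid, f in fees_by_course(partially_updated).items() if len(f) > 1}

# Deletion anomaly: removing student S3 also removes the only record of C2.
after_delete = [r for r in rows if r[0] != "S3"]
courses_left = {r[2] for r in after_delete}

# Insertion anomaly: course C4 has no student, so its row has no SID to store.
c4_row = (None, None, "C4", "Python", 6000)

print(inconsistent)   # courses with conflicting fees
print(courses_left)   # courses still recorded after the delete
```

Splitting the table into separate student, course, and enrolment tables would store each fee once, which is what removes these anomalies.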
DECOMPOSITION
Decomposition is the process of breaking something down into parts or
elements.
It replaces a relation with a collection of smaller relations.
It breaks a table into multiple tables in a database.
It should always be lossless, so that the
information in the original relation can be accurately
reconstructed from the decomposed relations.
If a relation is not decomposed properly, it may
lead to problems such as loss of information.
USES OF DECOMPOSITION
 The essential idea is that many problems arising from redundancy can be addressed by replacing
a relation with a collection of smaller relations.
 Each of the smaller relations contains a subset of the attributes of the original relation.
 We refer to this process as decomposition of the larger relation into the smaller relations.

A good decomposition does not:
 lose information
 complicate checking of constraints
 contain anomalies (or at least contains fewer anomalies)


Types of Decomposition

There are two types of decomposition :


1). Lossy Decomposition
2). Lossless Join Decomposition
Lossy Decomposition
Some information might be lost when decomposing and rejoining
the tables.
"The decomposition of relation R into R1 and R2 is lossy when the
join of R1 and R2 does not yield the same relation as in R."

 One of the disadvantages of decomposition into two or more
relational schemes (or tables) is that some information may be lost when
the original relation or table is retrieved.
EXAMPLE: Consider a table STUDENT with three attributes: roll_no, sname and
department.
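A lossy split of STUDENT can be sketched as follows. The rows are assumed sample data, and `project`, `natural_join`, and `same_rows` are illustrative helpers, not part of any library. Splitting on sname, which is not unique here, produces extra (spurious) rows when the parts are rejoined, so the original table can no longer be recovered.

```python
def project(relation, attrs):
    """Keep only the named attributes and drop duplicate rows."""
    seen = []
    for row in relation:
        small = {a: row[a] for a in attrs}
        if small not in seen:
            seen.append(small)
    return seen

def natural_join(left, right):
    """Join rows that agree on every attribute the two relations share."""
    out = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                merged = {**l, **r}
                if merged not in out:
                    out.append(merged)
    return out

def same_rows(a, b):
    """Compare two relations as sets of rows, ignoring order."""
    key = lambda row: tuple(sorted(row.items()))
    return sorted(map(key, a)) == sorted(map(key, b))

# Hypothetical STUDENT instance; two students share the name "Ravi".
student = [
    {"roll_no": 1, "sname": "Ravi", "department": "CSE"},
    {"roll_no": 2, "sname": "Ravi", "department": "ECE"},
]

r1 = project(student, ["roll_no", "sname"])
r2 = project(student, ["sname", "department"])
rejoined = natural_join(r1, r2)

print(len(student), len(rejoined))   # row counts before and after rejoining
print(same_rows(student, rejoined))  # whether the original was recovered
```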
Lossless Join Decomposition

"The decomposition of relation R into R1 and R2 is lossless when
the join of R1 and R2 yields the same relation as in R."
A relational table is decomposed (or factored) into two or more
smaller tables in such a way that the designer can recover the
precise content of the original table by joining the decomposed
parts.
This is called lossless-join (or non-additive join) decomposition.
The lossless-join decomposition is always defined with respect to a
specific set F of dependencies.
EXAMPLE: Consider a table STUDENT with three attributes: roll_no, sname and department.
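For a decomposition into exactly two relations there is a standard test, given here as background rather than taken from the slide: the decomposition is lossless with respect to F if the attributes common to R1 and R2 functionally determine all of R1 or all of R2. A minimal sketch, assuming the dependency roll_no → sname, department; the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if the common attributes determine every attribute of r1 or of r2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure

# Assumed dependency for STUDENT(roll_no, sname, department).
fds = [({"roll_no"}, {"sname", "department"})]

good = is_lossless_binary({"roll_no", "sname"}, {"roll_no", "department"}, fds)
bad = is_lossless_binary({"roll_no", "sname"}, {"sname", "department"}, fds)
print(good, bad)
```

Under the assumed dependency, splitting on roll_no passes the test, while splitting on sname does not, because sname alone determines nothing else.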
Problems Related to Decomposition
 Lossy Decomposition: Can lead to loss of information and data integrity issues.
 Increased Complexity: Managing multiple tables and their relationships can become complex.
 Performance Overhead: More joins required in queries can impact performance.
 Dependency Preservation: Ensuring all functional dependencies are preserved in decomposed tables can be challenging.
PROPERTIES OF DECOMPOSITION
The following are the properties of decomposition:

1. Lossless Decomposition

2. Dependency Preservation

3. Lack of Data Redundancy


Lossless Decomposition
 Decomposition must be lossless.
 It means that information should not get lost from the relation that is
decomposed.
 It guarantees that joining the decomposed relations results in the same relation that was
decomposed.
Example:
Let E be a relational schema with instance e, which
is decomposed into E1, E2, E3, . . . . En
with instances e1, e2, e3, . . . . en. If e1 ⋈ e2 ⋈ e3 . . . . ⋈ en = e, then it is called a
'Lossless Join Decomposition'.


Lack of Data Redundancy
Data redundancy is also known as repetition of
information.
A proper decomposition should not suffer from any data
redundancy.
A careless decomposition may cause problems with the data.
The lack of data redundancy property may be achieved through the
normalization process.
FUNCTIONAL DEPENDENCIES (FDs)
 A functional dependency (FD) is a kind of integrity constraint that
generalizes the concept of a key.
 Let R be a relation schema and let X and Y be nonempty sets of attributes
in R. We say that an instance r of R satisfies the FD X → Y if the following
holds for every pair of tuples t1 and t2 in r:
If t1.X = t2.X, then t1.Y = t2.Y
 An FD X → Y essentially says that if two tuples agree on the values in attributes X, they must also
agree on the values in attributes Y.
 Example: Relation: Student(StudentID, StudentName, CourseID, CourseName)
 FD Example: StudentID → StudentName
 Explanation: If two tuples have the same StudentID, they must have the same StudentName.
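The pairwise definition can be checked directly on a relation instance. The sketch below assumes rows are stored as dictionaries; the sample rows and the function name `satisfies_fd` are illustrative. Note that an instance can only show that an FD is violated or not violated by the current data; whether the FD holds for the schema is a design decision.

```python
from itertools import combinations

def satisfies_fd(rows, lhs, rhs):
    """True if every pair of rows that agrees on lhs also agrees on rhs."""
    for t1, t2 in combinations(rows, 2):
        if all(t1[a] == t2[a] for a in lhs) and any(t1[b] != t2[b] for b in rhs):
            return False
    return True

# Hypothetical instance of Student(StudentID, StudentName, CourseID, CourseName).
students = [
    {"StudentID": 1, "StudentName": "Asha", "CourseID": "C1", "CourseName": "DBMS"},
    {"StudentID": 1, "StudentName": "Asha", "CourseID": "C2", "CourseName": "OS"},
    {"StudentID": 2, "StudentName": "Ravi", "CourseID": "C1", "CourseName": "DBMS"},
]

print(satisfies_fd(students, ["StudentID"], ["StudentName"]))  # same ID, same name
print(satisfies_fd(students, ["StudentID"], ["CourseID"]))     # same ID, different courses
```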
FUNCTIONAL DEPENDENCIES (FDs)
 A functional dependency (FD) is a relationship between two attributes,
typically between the PK and other non-key attributes within a table.
 For any relation R, attribute Y is functionally dependent on attribute X
(usually the PK) if every valid value of X uniquely
determines the value of Y.
 This relationship is indicated by the representation below:
X → Y
 The left side of the above FD is called the determinant, and the
right side is the dependent.
Importance of Functional Dependencies
Normalization: It helps in the normalization
process to reduce redundancy and improve data
integrity.
Database Design: Assists in designing a database
schema that ensures data consistency.
Query Optimization: Aids in optimizing queries
by understanding the relationships between
data.
Reasoning About FDs
There are two ways:
1. CLOSURE OF A SET OF FDs
2. ATTRIBUTE CLOSURE

1. CLOSURE OF A SET OF FDs (FDC):

The closure of a set of functional dependencies, denoted as F+, is the set of all functional dependencies that
can be inferred from the original set F using Armstrong's axioms (reflexivity, augmentation, and
transitivity).
Purpose:
 To understand all possible dependencies that can be derived from the given set of FDs.
 Usage: F+ is used to determine if a particular FD is implied by a set of FDs. It's also used in normalization
processes to ensure that certain properties hold.
Example: If F = {A → B, B → C}, then
F+ includes A → B, B → C and A → C (by transitivity).
1. CLOSURE OF A SET OF FDs
Given some FDs, we can usually infer or compute
additional FDs: ssn → did, did → lot implies ssn → lot
An FD f is implied by a set of FDs F if f holds whenever all
FDs in F hold.
F+ = closure of F is the set of all FDs that are implied by F.
 Rules of inference of FDs (or) Armstrong's Axioms (AA):
 Reflexivity: If X ⊆ Y, then Y → X
 Augmentation: If X → Y, then XZ → YZ for any Z
 Transitivity: If X → Y and Y → Z, then X → Z
These are sound and complete inference rules for FDs!
 Couple of additional rules (that follow from AA):
 Union: If X → Y and X → Z, then X → YZ
 Decomposition: If X → YZ, then X → Y and X → Z

Example:
 Contracts(cid,sid,jid,did,pid,qty,value), and:
 C is the key: C → CSJDPQV
 Project purchases each part using single contract: JP → C
 Dept purchases at most one part from a supplier: SD → P
 JP → C, C → CSJDPQV imply JP → CSJDPQV
 SD → P implies SDJ → JP
 SDJ → JP, JP → CSJDPQV imply SDJ → CSJDPQV
Example of deriving an FD using AA
Given a relation R with attributes W, U, V, X, Y, Z and
functional dependencies:
W → UV, U → Y, VX → YZ. Prove that WX → Z holds.
Solution:
1. (with decomposition) From W → UV we take W → V.
2. (with augmentation) WX → VX.
3. (with transitivity) WX → YZ.
4. (with decomposition) WX → Z.
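The four steps of this derivation can be replayed mechanically. In the sketch below each inference rule is a small function over FDs represented as pairs of frozensets of single-letter attributes; this encoding and the function names (`fd`, `augment`, `transit`, `decompose`) are assumptions made for illustration.

```python
def fd(lhs, rhs):
    """Represent X -> Y as a pair of frozensets of single-letter attributes."""
    return (frozenset(lhs), frozenset(rhs))

def augment(dep, extra):
    """Augmentation: from X -> Y infer XZ -> YZ."""
    lhs, rhs = dep
    return (lhs | frozenset(extra), rhs | frozenset(extra))

def transit(first, second):
    """Transitivity: from X -> Y and Y -> Z infer X -> Z."""
    if first[1] != second[0]:
        raise ValueError("right side of first must equal left side of second")
    return (first[0], second[1])

def decompose(dep, part):
    """Decomposition: from X -> YZ infer X -> Y for any Y inside YZ."""
    if not frozenset(part) <= dep[1]:
        raise ValueError("part must be a subset of the right side")
    return (dep[0], frozenset(part))

# Given: W -> UV and VX -> YZ (U -> Y is not needed). Goal: WX -> Z.
w_uv, vx_yz = fd("W", "UV"), fd("VX", "YZ")
step1 = decompose(w_uv, "V")    # W -> V
step2 = augment(step1, "X")     # WX -> VX
step3 = transit(step2, vx_yz)   # WX -> YZ
step4 = decompose(step3, "Z")   # WX -> Z
print(step4 == fd("WX", "Z"))
```

Each helper raises an error when a rule is applied to FDs that do not fit its premise, so an invalid derivation step fails instead of silently producing an FD.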
2. ATTRIBUTE CLOSURE (AC)
 The closure of a set of attributes X, denoted as X+, is the set of all
attributes that can be functionally determined by X.
 It is used to find all attributes that are dependent on a given set of
attributes, which helps in identifying candidate keys and ensuring
the integrity of database design.
Example:
Given FDs: {A → B, B → C, A → D}
Compute A+:
Start: A+ = {A}
Apply A → B: A+ = {A, B}
Apply B → C: A+ = {A, B, C}
Apply A → D: A+ = {A, B, C, D}
Result: A+ = {A, B, C, D}
Algorithm to find the Closure of a Set of attributes X
Given a relation R and its functional dependencies
F, find the closure of attribute A.

1. Let X = A.
2. Among the functional dependencies of F, we
search for dependencies C → D, where C ⊆ X. If
we find such a dependency, then we add D to X.
3. We repeat Step 2 until we cannot add any more
attributes to X.
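The three steps translate almost line for line into code. This is a minimal sketch: encoding each FD as a pair of attribute strings and the name `attribute_closure` are assumptions made for illustration.

```python
def attribute_closure(start, fds):
    """Return the closure of the attribute set `start` under `fds`.

    `fds` is a list of (C, D) pairs of attribute strings, read as C -> D.
    """
    x = set(start)                      # Step 1: X = A
    added = True
    while added:                        # Step 3: repeat until nothing is added
        added = False
        for c, d in fds:                # Step 2: look for C -> D with C inside X
            if set(c) <= x and not set(d) <= x:
                x |= set(d)
                added = True
    return x

# The FDs A -> B, B -> C, A -> D used for the A+ computation.
fds = [("A", "B"), ("B", "C"), ("A", "D")]
print(sorted(attribute_closure("A", fds)))
```

The same function also answers whether an FD X → Y is implied by F: it is implied exactly when Y is contained in the closure of X.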
Example 1
Let R = (V, Y, Z, W) and F = {V → Z, VZ → W, W → Y, VY → W}
Find the closure of attribute V.

Solution:

Step 1: X = V.
Step 2: X = VZ because of V → Z.
Step 2 (repeated): X = VZW because of VZ → W.
Step 2 (repeated): X = VZWY because of W → Y.
Step 3: No more attributes can be added, so V+ = VZWY.
Example 2
Let R = (V, Y, Z, W) and F = {V → Y, W → Y, V → W}
Find the closure of attribute V.

Solution:

Step 1: X = V.
Step 2: X = VY because of V → Y.
Step 2 (repeated): X = VYW because of V → W.
Step 3: No more attributes can be added, so V+ = VYW.
Other Kinds of Dependencies (or) Types of Functional
Dependencies:
Full functional dependency
Partial dependency
Transitive dependency
Trivial functional dependency
Non-trivial functional dependency
Multivalued dependency
Full Functional Dependency
 A functional dependency X → Y is said to be a full functional dependency if,
after removal of any attribute A from X, the dependency no longer holds, i.e. Y
is fully functionally dependent on X if it is functionally dependent on X and not on
any proper subset of X.
 For example, (Emp_num, Proj_num) → Hour
is a full functional dependency. Here, Hour is the time worked by an employee on
a project.
Partial Dependency
 A functional dependency X → Y is said to be a partial functional dependency if,
after removal of some attribute A from X, the dependency still holds, i.e. Y is
dependent on a proper subset of X. So Y is partially dependent on X.
 For example, if (Emp_num, Proj_num) → Emp_name but also Emp_num →
Emp_name, then Emp_name is partially functionally dependent on
(Emp_num, Proj_num).
 In other words, a non-key attribute is determined by a part, but not the whole, of a COMPOSITE
primary key.
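Full and partial dependencies can be told apart by testing whether any proper subset of the left side already determines the right side. A minimal sketch, assuming the two FDs from the examples above; `closure` and `is_full_dependency` are illustrative helper names.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_full_dependency(lhs, rhs, fds):
    """True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    lhs, rhs = set(lhs), set(rhs)
    if not rhs <= closure(lhs, fds):
        return False                      # the dependency does not hold at all
    for size in range(1, len(lhs)):
        for subset in combinations(sorted(lhs), size):
            if rhs <= closure(subset, fds):
                return False              # a proper subset already determines rhs
    return True

fds = [
    ({"Emp_num", "Proj_num"}, {"Hour"}),
    ({"Emp_num"}, {"Emp_name"}),
]
key = {"Emp_num", "Proj_num"}
print(is_full_dependency(key, {"Hour"}, fds))      # needs both attributes
print(is_full_dependency(key, {"Emp_name"}, fds))  # Emp_num alone is enough
```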
Transitive Dependency
A functional dependency X → Z is said to be a transitive functional dependency if there
exist functional dependencies X → Y and Y → Z, i.e. it is an indirect relationship.
For example, EMP_NUM → CHG_HOUR is a transitive dependency which comes from
EMP_NUM → JOB_CLASS and JOB_CLASS → CHG_HOUR.
Trivial functional dependency
A functional dependency X → Y is said to be a trivial functional dependency if Y is a subset
of X.
For example, (Emp_num, Emp_name) → Emp_num is a trivial functional dependency since
Emp_num is a subset of (Emp_num, Emp_name).
Transitive Dependency

EMPLOYEE

Emp_ID  F_Name  L_Name  Dept_ID  Dept_Name
111     Mary    Jones   1        Acct
122     Sarah   Smith   2        Mktg
Multivalued Dependency
 A multivalued dependency occurs when there are multiple independent
multivalued attributes in a single table. A multivalued dependency is a full constraint
between two sets of attributes in a relation. It requires that certain tuples be present in a relation.
 Example:
Consider the following table

The multivalued dependencies are

car_model →→ manufr_year
car_model →→ colour
These are multivalued dependencies since manufr_year and colour are both multivalued attributes that depend on car_model but are independent of each other.
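The requirement that "certain tuples be present" can be tested on an instance: whenever two rows agree on X, the row that takes its X and Y values from the first and its remaining values from the second must also be in the table. A minimal sketch under that standard definition; the car rows are assumed sample data and `satisfies_mvd` is an illustrative name.

```python
def satisfies_mvd(rows, x, y):
    """Check the multivalued dependency X ->> Y on a list of dict rows."""
    if not rows:
        return True
    xy = set(x) | set(y)
    rest = set(rows[0]) - xy          # attributes outside X and Y
    present = {tuple(sorted(r.items())) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Required tuple: X and Y values from t1, the rest from t2.
                t3 = {a: t1[a] for a in xy}
                t3.update({a: t2[a] for a in rest})
                if tuple(sorted(t3.items())) not in present:
                    return False
    return True

# Hypothetical table: every year of model M1 is paired with every colour.
cars = [
    {"car_model": "M1", "manufr_year": 2017, "colour": "Red"},
    {"car_model": "M1", "manufr_year": 2017, "colour": "Blue"},
    {"car_model": "M1", "manufr_year": 2018, "colour": "Red"},
    {"car_model": "M1", "manufr_year": 2018, "colour": "Blue"},
]

print(satisfies_mvd(cars, ["car_model"], ["colour"]))
print(satisfies_mvd(cars[:-1], ["car_model"], ["colour"]))  # one combination missing
```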
Trivial Functional dependency
 The dependency of an attribute on a set of attributes is known as a trivial
functional dependency if the set of attributes includes that attribute.
 Symbolically: A → B is a trivial functional dependency if B is a subset of A.
 The following dependencies are also trivial: A → A and B → B
Example:
Consider a table with two columns: Student_Id and Student_Name
 (Student_Id, Student_Name) → Student_Id is a trivial functional
dependency as Student_Id is a subset of (Student_Id, Student_Name).
 That makes sense because if we know the values of Student_Id and
Student_Name then the value of Student_Id can be uniquely determined.
 Also, Student_Id → Student_Id and Student_Name → Student_Name are
trivial dependencies.
Non-Trivial Functional dependency
 If a functional dependency X → Y holds where Y is not a subset of X, then this
dependency is called a non-trivial functional dependency.
 Example:
An employee table with three attributes: emp_id, emp_name, emp_address.
 The following functional dependencies are non-trivial:
emp_id → emp_name (emp_name is not a subset of emp_id)
emp_id → emp_address (emp_address is not a subset of emp_id)
 On the other hand, the following dependency is trivial:
(emp_id, emp_name) → emp_name [emp_name is a subset of (emp_id, emp_name)]
 Completely non-trivial FD
If an FD X → Y holds where the intersection of X and Y is empty, then this dependency is said to be a
completely non-trivial functional dependency.
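These three cases depend only on how the two sides overlap, so they reduce to set comparisons. A minimal sketch; the function name `classify_fd` and the label strings are illustrative choices.

```python
def classify_fd(lhs, rhs):
    """Classify X -> Y by comparing the attribute sets X and Y."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                  # Y is a subset of X
    if lhs & rhs:
        return "non-trivial"              # Y is not inside X, but they overlap
    return "completely non-trivial"       # X and Y share no attribute

print(classify_fd({"emp_id", "emp_name"}, {"emp_name"}))
print(classify_fd({"emp_id"}, {"emp_name"}))
print(classify_fd({"emp_id", "emp_name"}, {"emp_name", "emp_address"}))
```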
Normalization
 Normalization is a systematic process of organizing data in a database to minimize
redundancy and dependency.
 It involves dividing large tables into smaller tables and defining relationships between them
to increase the clarity of the data structure.
 Objectives of Normalization:
 Eliminate Redundant Data: Reduces the duplication of data to save space and ensure consistency.
 Ensure Data Dependencies Make Sense: Organizes data so that dependencies are logical, ensuring
data integrity.
 Simplify Data Management: Makes databases easier to maintain by structuring data efficiently
Normalization
Advantages of Normalization:
Greater overall database organization is gained.
The amount of unnecessary redundant data is reduced.
Data integrity is easily maintained within the database.
The database and application design processes are much more flexible.
Security is easier to maintain or manage.
Disadvantages of Normalization:
The disadvantage of normalization is that it produces a lot of tables with
a relatively small number of columns. These tables then have to be joined
using their primary/foreign key relationships.
1NF table:
Breaks the values
in a table
Advantages of removing transitive dependencies:
Amount of data duplication is reduced
Achievement of data integrity
Boyce Codd Normal Form (BCNF)
• Boyce Codd Normal Form is a stricter version of 3NF and was
developed by Raymond F. Boyce and Edgar F. Codd to tackle
certain types of anomalies which were not resolved by 3NF.
• A table is said to be in Boyce Codd Normal Form if:
• The table is in the third normal form.
• Every Right-Hand Side (RHS) attribute of the functional
dependencies depends on a super key of that particular
table.
• A relation is in BCNF if it is in 3NF and, for every non-trivial functional
dependency X → Y, X is a super key of the table (i.e. for every
FD, the LHS is a super key).
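The super key condition can be checked with attribute closure: a left side is a super key when its closure covers every attribute of the table. The sketch below uses a hypothetical relation (student, course, teacher) with assumed FDs, a common textbook case that satisfies 3NF but not BCNF; the helper names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs in fds whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not set(attributes) <= closure(lhs, fds)
    ]

# Assumed schema and dependencies, for illustration only.
attributes = {"student", "course", "teacher"}
fds = [
    ({"student", "course"}, {"teacher"}),   # left side is a key
    ({"teacher"}, {"course"}),              # left side is not a super key
]
violations = bcnf_violations(attributes, fds)
print(violations)
```

An empty result means no listed FD breaks the BCNF condition for this table; here teacher → course is reported because teacher alone does not determine student.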
4NF
5NF
