Chapter 4
Mapping Entity
An entity is a real-world object with some attributes.
Mapping Process (Algorithm)
Create a table for each entity.
The entity's attributes become fields of the table, with their respective data types.
Declare the primary key.
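The three mapping steps above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name, the column types, and the sample rows are assumptions, and the attribute list is borrowed from the Student example used later in this chapter.

```python
import sqlite3

# Hypothetical mapping of a Student entity; names, types and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (               -- step 1: one table per entity
        sid            INTEGER PRIMARY KEY,  -- step 3: declare the primary key
        name           TEXT NOT NULL,        -- step 2: attributes become typed fields
        supervisor_id  INTEGER,
        specialization TEXT
    )
    """
)
conn.execute("INSERT INTO student VALUES (1, 'Ramesh', 10, 'Databases')")

def insert_duplicate_key():
    """Return True if the primary key rejects a second row with sid = 1."""
    try:
        conn.execute("INSERT INTO student VALUES (1, 'Sayali', 11, 'Networks')")
    except sqlite3.IntegrityError:
        return True
    return False

# Column names as recorded by SQLite for the new table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
```

Declaring the key in the table definition lets the database itself refuse a second row with the same sid.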
What is a relation
A relation is a table that holds the data we are interested in. It is two-dimensional and has
rows and columns.
Mapping Relationship
A relationship is an association among entities.
Mapping Process
Functional dependency: if A and B are attributes of relation R, B is functionally dependent on A
(written A → B) if each value of A is associated with exactly one value of B. (A and B may each
consist of one or more attributes.)
Example on Student (sid, name, supervisor_id, specialization):
{supervisor_id} → {specialization} means:
If two student records have the same supervisor, then their specialization (e.g.,
Databases) must be the same.
On the other hand, if the supervisors of two students are different, we do not care about
their specializations (they may be the same or different).
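A minimal sketch of how this definition could be checked against a set of rows. The function name fd_holds and the sample student rows are assumptions made for illustration. Such a check can only show that a dependency is violated by the data at hand; it cannot prove that the dependency holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs is satisfied by rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical Student rows (sid, name, supervisor_id, specialization).
students = [
    {"sid": 1, "name": "Ann", "supervisor_id": 10, "specialization": "Databases"},
    {"sid": 2, "name": "Bob", "supervisor_id": 10, "specialization": "Databases"},
    {"sid": 3, "name": "Cy", "supervisor_id": 11, "specialization": "Databases"},
    {"sid": 4, "name": "Di", "supervisor_id": 12, "specialization": "Networks"},
]
```

Here students 1 and 2 share a supervisor and a specialization, so {supervisor_id} → {specialization} is satisfied. The reverse direction fails, because the specialization Databases appears with two different supervisors.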
Product table
One FD: {S#} → {City}
because every tuple of that relation with a given S# value also has the same City value.
TRIVIAL DEPENDENCIES
One way to reduce the size of the set of FDs we need to deal with is to eliminate the
trivial dependencies.
A trivial functional dependency is one whose right-hand side is a subset (not
necessarily a proper subset) of its left-hand side.
A functional dependency X → Y is trivial if Y ⊆ X.
e.g. {S#, P#} → {S#} (trivial)
{name, supervisor_id} → {name}
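The subset test translates directly into code. The helper name is_trivial is an assumption; attribute sets are represented as Python sets.

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)

# The two examples from the text, plus a non-trivial dependency for contrast.
example_1 = is_trivial({"S#", "P#"}, {"S#"})
example_2 = is_trivial({"name", "supervisor_id"}, {"name"})
example_3 = is_trivial({"supervisor_id"}, {"specialization"})
```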
Armstrong’s Axioms
Let X, Y, Z be subsets of the attributes of the relation scheme of a relation R.
Reflexivity:
If Y ⊆ X, then X → Y (trivial FDs)
{name, supervisor_id} → {name}
Augmentation:
If X → Y, then XZ → YZ
if {supervisor_id} → {specialization},
then {supervisor_id, name} → {specialization, name}
Transitivity:
If X → Y and Y → Z, then X → Z
◦ if {supervisor_id} → {specialization} and {specialization} → {lab}, then
{supervisor_id} → {lab}
Self-determination:
A → A
Decomposition:
If A → B,C then A → B and A → C
Union:
If A → B and A → C, then A → B,C
Composition:
If A → B and C → D, then A,C → B,D
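One common way to apply these rules mechanically is to compute the closure of an attribute set: every attribute that can be derived from it under a given set of FDs. The sketch below is an illustration under assumptions; the function name closure and the representation of an FD as a (left, right) pair of sets are not from the text.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)  # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If lhs is already derivable, transitivity and union add rhs as well.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# The two dependencies from the transitivity example.
fds = [
    ({"supervisor_id"}, {"specialization"}),
    ({"specialization"}, {"lab"}),
]
derived = closure({"supervisor_id"}, fds)
```

Since lab is in the closure of {supervisor_id}, the dependency {supervisor_id} → {lab} follows, which is the transitivity example above.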
Anomalies
An anomaly is an inconsistent, incomplete, or contradictory state of the database.
Insertion anomaly: the user is unable to insert a new record when it should be
possible to do so, because not all of the other information is available.
Deletion anomaly: when a record is deleted, other information that is tied to it is also
deleted.
Update anomaly: a record is updated, but other appearances of the same item are not
updated.
Redundancy leads to the following anomalies:
Update anomaly: A change in Address must be made in several places. Updating one
fact may require updating multiple tuples.
Deletion anomaly: Deleting one fact may delete other information. Suppose a person
gives up all hobbies. Do we:
Set Hobby attribute to null? No, since Hobby is part of key
Delete the entire row? No, since we lose other information in the row
Insertion anomaly: To record one fact may require more information than is available.
Hobby value must be supplied for any inserted row since Hobby is part of key
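A small sketch of the update and deletion anomalies described above. The Person rows, the ssn and name attributes, and the helper name are hypothetical; only Address and Hobby (as part of the key) come from the text.

```python
# Hypothetical Person rows; the key is assumed to be (ssn, hobby),
# so the address is repeated once per hobby.
person = [
    {"ssn": 1, "name": "Ann", "address": "1 Oak St", "hobby": "chess"},
    {"ssn": 1, "name": "Ann", "address": "1 Oak St", "hobby": "hiking"},
    {"ssn": 2, "name": "Bob", "address": "9 Elm St", "hobby": "chess"},
]

def addresses_by_person(rows):
    """Map each ssn to the set of distinct addresses stored for it."""
    found = {}
    for row in rows:
        found.setdefault(row["ssn"], set()).add(row["address"])
    return found

# Update anomaly: only one of Ann's two rows is changed.
person[0]["address"] = "5 Pine Rd"
inconsistent = {
    ssn for ssn, addrs in addresses_by_person(person).items() if len(addrs) > 1
}

# Deletion anomaly: removing Bob's only hobby row removes Bob entirely.
after_delete = [r for r in person if not (r["ssn"] == 2 and r["hobby"] == "chess")]
remaining_people = {r["ssn"] for r in after_delete}
```

After the partial update, person 1 has two different addresses on record, and after the deletion, person 2 no longer appears in the table at all.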
Normalization
Database Normalization is a technique of organizing the data in the database.
Normalization is a systematic approach of decomposing tables to eliminate data redundancy
and undesirable characteristics such as insertion, update, and deletion anomalies.
It is a multi-step process that puts data into tabular form by removing duplicated data
from the relation tables.
Normalization is used mainly for two purposes:
Eliminating redundant (useless) data.
Ensuring data dependencies make sense, i.e. data is logically stored.
What is Normalization
Normalization is a database design technique which organizes tables in a manner that
reduces redundancy and dependency of data.
OR
Normalization is the process of removing redundant data from your tables.
It improves storage efficiency, data integrity, and scalability.
Normalization generally involves splitting existing tables into multiple ones, which
must be re-joined or linked each time a query is issued.
It divides larger tables into smaller tables and links them using relationships.
Decomposition
Database normalization: The process of removing redundant data from your tables.
It improves storage efficiency, data integrity, and scalability.
In the relational model, methods exist for quantifying how efficient a database is.
These classifications are called normal forms (or NF).
Normalization generally involves splitting existing tables into multiple ones, which
must be re-joined or linked each time a query is issued.
Normal Form
E.F. Codd originally established three normal forms: 1NF, 2NF and 3NF.
There are now others that are generally accepted, but 3NF is widely considered to be
sufficient for most applications.
Most tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form).
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table: Transform data from information source (e.g. form)
into table format with columns and rows.
First Normal Form (1NF)
• A relation in which the intersection of each row and column contains one and only one
value.
• Remove horizontal redundancies
• No two columns hold the same information
• No single column holds more than a single item
• (No composite values, e.g. full name (fname, mname, lname))
• Each table cell should contain a single (atomic) value.
• Each column must have a unique name
102 Ramesh CO
102 Ramesh DS
103 Sayali CO
103 Sayali DS
103 Sayali CN
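The rows above have one value per cell. A sketch of how such rows could be produced from a table with a repeating group follows. The unnormalized input, the function name to_1nf, and the column meanings (an identifier, a name, and a subject code) are assumptions, since the column headings are not given.

```python
# Assumed unnormalized input: one row per student with a list of subject codes.
unnormalized = [
    (102, "Ramesh", ["CO", "DS"]),
    (103, "Sayali", ["CO", "DS", "CN"]),
]

def to_1nf(rows):
    """Emit one row per (student, subject) pair so that every cell holds a single value."""
    return [(sid, name, subject) for sid, name, subjects in rows for subject in subjects]

flat = to_1nf(unnormalized)
```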
◦ If partial dependencies exist on the primary key, remove them by placing them in a new
relation along with a copy of their determinant.
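A sketch of this step on the rows shown earlier, assuming the primary key is (id, subject) and that the name depends on the id alone, which would be a partial dependency. The column meanings and the function name are assumptions.

```python
# Rows in 1NF; assumed columns are (id, name, subject) with key (id, subject).
rows = [
    (102, "Ramesh", "CO"),
    (102, "Ramesh", "DS"),
    (103, "Sayali", "CO"),
    (103, "Sayali", "DS"),
    (103, "Sayali", "CN"),
]

def split_partial_dependency(rows):
    """Move name into a new relation with a copy of its determinant (id)."""
    students = sorted({(sid, name) for sid, name, _ in rows})
    enrollments = sorted({(sid, subject) for sid, _, subject in rows})
    return students, enrollments

students, enrollments = split_partial_dependency(rows)
```

Each name is now stored once, and the id remains in both relations so they can be joined again.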
BCNF (Boyce-Codd Normal Form): A relation is in BCNF if and only if every determinant
is a candidate key.
The difference between 3NF and BCNF:
For a functional dependency A → B:
• 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a
candidate key, whereas BCNF insists that for this dependency to remain in a relation, A
must be a candidate key. Every relation in BCNF is also in 3NF. However, a relation in 3NF
may not be in BCNF.
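A sketch of a BCNF check under stated assumptions: it tests each non-trivial FD by asking whether the closure of its determinant covers all attributes (the usual superkey formulation). The relation (student, course, instructor), its FDs, and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant does not determine every attribute."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]

# Hypothetical relation: a (student, course) pair has one instructor,
# and each instructor teaches only one course.
attributes = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]
violations = bcnf_violations(attributes, fds)
```

In this example course is part of a candidate key, so instructor → course is tolerated by 3NF, but instructor is not a candidate key, so the relation is not in BCNF.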
Non-additive (lossless) join: a decomposition has this property if joining the decomposed
tables gives back exactly the original relation, with no extra rows and without
losing any information or data.
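A minimal sketch that tests this property on sample data by projecting, re-joining, and comparing with the original rows. The function names, the attribute names, and the rows are assumptions. The test only speaks about the given rows, not about every possible state of the relation.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, removing duplicate rows."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(left, right):
    """Join two projections on the attributes they have in common."""
    joined = set()
    for l in left:
        for r in right:
            ld, rd = dict(l), dict(r)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined

def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections reproduces exactly the original rows."""
    original = {tuple(sorted(row.items())) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original

# Hypothetical rows with assumed attribute names.
rows = [
    {"sid": 102, "name": "Ramesh", "subject": "CO"},
    {"sid": 102, "name": "Ramesh", "subject": "DS"},
    {"sid": 103, "name": "Sayali", "subject": "CO"},
    {"sid": 103, "name": "Sayali", "subject": "DS"},
    {"sid": 103, "name": "Sayali", "subject": "CN"},
]
good_split = is_lossless(rows, ["sid", "name"], ["sid", "subject"])
bad_split = is_lossless(rows, ["name", "subject"], ["sid", "subject"])
```

Splitting on sid reproduces the original rows. Splitting on subject does not: the join pairs every name with every sid that shares a subject, which produces extra rows.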
Summary
1NF :
A relation in which the intersection of each row and column contains one and only
one value.
2NF :
Based on the concept of full functional dependency
Every non-primary-key attribute is fully functionally dependent on the primary key.
3NF :
Based on the concept of transitive dependency
No non-primary-key attribute is transitively dependent on the primary key.
BCNF :
Every determinant is a candidate key
4th Normal Form
Relationships for Example
Normalization Example 2