Theory of Database Systems: Lecture 10. The Process of Normalization I

The document discusses the process of normalization in database design. Normalization is done in steps to reach higher normal forms like 1NF, 2NF and 3NF. This removes anomalies like insertion, deletion and modification anomalies. The document explains reaching 1NF by removing repeating groups and 2NF by removing partial dependencies on primary keys through decomposition into multiple tables.

Uploaded by

qayswedx123
Copyright
© Attribution Non-Commercial (BY-NC)
Theory of Database Systems
Lecture 10. The process of normalization I.
Normalization

• Normalization is a technique for producing a set of suitable relations that support the data requirements of an enterprise.
Suitable set of relations

• Characteristics of a suitable set of relations include:

– the minimal number of attributes necessary to support the data requirements of the enterprise;

– attributes with a close logical relationship are found in the same relation;

– minimal redundancy, with each attribute represented only once, with the important exception of attributes that form all or part of foreign keys.
Benefits of suitable set of relations

• The benefits of using a database that has a suitable set of relations are that the database will:

– be easier for the user to access and maintain;

– take up minimal storage space on the computer.


How Normalization Supports Database Design

• Normalization is a bottom-up approach to database design that begins by examining the relationships between attributes.
• However, a top-down approach can also be used: it begins by identifying the main entities and relationships and uses normalization as a validation technique.
The Process of Normalization

• Normalization is a formal technique for analyzing a relation based on its primary key and the functional dependencies between the attributes of that relation.

• It is often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.
Normalization

• The four most commonly used normal forms are first (1NF), second (2NF) and third (3NF) normal forms, and Boyce–Codd normal form (BCNF).

• Normalization is based on functional dependencies among the attributes of a relation.

• A relation can be normalized to a specific form to prevent the possible occurrence of update anomalies.
The Process of Normalization

The relationship between the normal forms.

It shows that some 1NF relations are also in 2NF, some 2NF relations are also in 3NF, and so on.
The Process of Normalization
Unnormalized Form (UNF)

• Before discussing first normal form, we initially give a definition of the state prior to first normal form.

• Unnormalized form is a table that contains one or more repeating groups.

• To create an unnormalized table:

– Transform the data from the information source (e.g. a form) into table format with columns and rows.

• In this format, the table is in unnormalized form (UNF).
Repeating group

• A repeating group is an attribute, or group of attributes, within a table that occurs with multiple values for a single occurrence of the nominated key attribute(s) of that table.

• Nominated key: refers to the attribute(s) that uniquely identify each row within the unnormalized table.
Example: Form

Collection of DreamHome leases.


In the example it is assumed that a client rents a given
property only once and cannot rent more than one
property at any one time.
UNF example

• Sample data is taken from two leases for two different clients and is transferred into table format with rows and columns.
• This is an unnormalized table.

ClientRental unnormalized table.


UNF example

• We identify the key attribute for the ClientRental unnormalized table as clientNo.

• Next we identify the repeating group in the unnormalized table:

Repeating Group = (propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

• As a consequence, there are multiple values at the intersection of certain rows and columns.
First Normal Form (1NF)

• A relation in which the intersection of each row and column contains one and only one value.
UNF to 1NF

• To transform the unnormalized table to first normal form, we identify and remove repeating groups within the table.

– Nominate an attribute or group of attributes to act as the key for the unnormalized table.

– Identify the repeating group(s) in the unnormalized table which repeats for the key attribute(s).

• There are two common approaches to removing repeating groups from unnormalized tables.
Method 1

• We remove the repeating group by entering appropriate data into the empty columns of rows containing the repeating data (‘flattening’ the table). We fill in the blanks by duplicating the nonrepeating data.

• The resulting relation contains atomic values at the intersection of each row and column, and is therefore in 1NF.

• With this approach redundancy is introduced into the resulting relation.
Method 1 example

• Remove the repeating group by entering the appropriate client data into each row.

• The resulting relation ClientRental is in 1NF as there is a single value at the intersection of each row and column.
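The flattening step of Method 1 can be sketched in Python. This is only an illustration: the nested-list representation, the `flatten` helper, and all sample values are assumptions made for the example and are not the lecture's ClientRental data.

```python
# Hypothetical UNF data: each client row carries a nested list of rentals
# (the repeating group). All values are invented for illustration.
unf_table = [
    {"clientNo": "C1", "cName": "Ann",
     "rentals": [
         {"propertyNo": "P1", "rentStart": "2020-01-01"},
         {"propertyNo": "P2", "rentStart": "2021-01-01"},
     ]},
    {"clientNo": "C2", "cName": "Bob",
     "rentals": [
         {"propertyNo": "P1", "rentStart": "2021-01-01"},
     ]},
]


def flatten(table, group_field):
    """Return 1NF rows: one row per repeating-group entry, with the
    nonrepeating attributes duplicated into every row."""
    flat = []
    for row in table:
        fixed = {k: v for k, v in row.items() if k != group_field}
        for entry in row[group_field]:
            flat.append({**fixed, **entry})
    return flat


client_rental = flatten(unf_table, "rentals")
for r in client_rental:
    print(r)
```

Each output row holds a single value per column, at the cost of repeating clientNo and cName once per rental.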
Method 1 example

• We identify the candidate keys for the ClientRental relation as being composite keys:
– (clientNo, propertyNo)
– (clientNo, rentStart)
– (propertyNo, rentStart)

• We select (clientNo, propertyNo) as the primary key.

• The relation contains data describing clients, properties rented, and property owners, which is repeated several times. As a result, the relation contains significant data redundancy.
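As a rough sanity check, one can test which attribute combinations are unique across a sample of rows. The sketch below uses invented rows and hypothetical helper names; uniqueness in a sample is only a necessary condition for a candidate key, since real keys follow from the meaning of the data, and the search is deliberately restricted to three attributes.

```python
from itertools import combinations

# Invented 1NF rows (not the lecture's data), reduced to a few attributes.
rows = [
    {"clientNo": "C1", "propertyNo": "P1", "rentStart": "2020-01-01", "cName": "Ann"},
    {"clientNo": "C1", "propertyNo": "P2", "rentStart": "2021-01-01", "cName": "Ann"},
    {"clientNo": "C2", "propertyNo": "P1", "rentStart": "2021-01-01", "cName": "Bob"},
    {"clientNo": "C2", "propertyNo": "P3", "rentStart": "2021-07-01", "cName": "Bob"},
]


def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs."""
    return len({tuple(r[a] for a in attrs) for r in rows}) == len(rows)


def minimal_unique_sets(rows, attrs):
    """Smallest attribute combinations that are unique in this sample."""
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if any(set(f) <= set(combo) for f in found):
                continue  # a smaller unique set is already contained in combo
            if is_unique(rows, combo):
                found.append(combo)
    return found


candidates = minimal_unique_sets(rows, ["clientNo", "propertyNo", "rentStart"])
print(candidates)
```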
Method 2

• We remove the repeating group by placing the repeating data along with a copy of the original key attribute(s) into a separate relation.

• A primary key is identified for the new relation.

• This approach produces relations in at least 1NF with less redundancy.
Method 2 example

• Using the second approach, we remove the repeating group by placing the repeating data along with a copy of the original key attribute (clientNo) into a separate table, called PropertyRentalOwner.
Method 2 example

• Then we identify a primary key for the new table: (clientNo, propertyNo).

• The format of the resulting 1NF relations is as follows:

Client (clientNo, cName)

PropertyRentalOwner (clientNo, propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

• Both the Client and PropertyRentalOwner tables are in 1NF, but the PropertyRentalOwner table contains significant redundancy.
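Method 2 can be sketched in the same spirit: the repeating group moves to its own table and takes a copy of the original key with it. The `split_repeating_group` helper, the nested representation, and the sample values below are assumptions for illustration only.

```python
# Hypothetical UNF data; values are invented for illustration.
unf_table = [
    {"clientNo": "C1", "cName": "Ann",
     "rentals": [
         {"propertyNo": "P1", "ownerNo": "O1"},
         {"propertyNo": "P2", "ownerNo": "O2"},
     ]},
    {"clientNo": "C2", "cName": "Bob",
     "rentals": [
         {"propertyNo": "P1", "ownerNo": "O1"},
     ]},
]


def split_repeating_group(table, key, group_field):
    """Return (parent, child): the parent keeps the nonrepeating attributes,
    the child holds each repeating-group entry plus a copy of the key."""
    parent, child = [], []
    for row in table:
        parent.append({k: v for k, v in row.items() if k != group_field})
        for entry in row[group_field]:
            child.append({key: row[key], **entry})
    return parent, child


client, property_rental_owner = split_repeating_group(
    unf_table, "clientNo", "rentals")
print(client)
print(property_rental_owner)
```

Here cName is stored once per client instead of once per rental, which is the reduction in redundancy that Method 2 offers over flattening.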
Second Normal Form (2NF)

• Second normal form is based on the concept of full functional dependency.

• Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A.

• A functional dependency A → B is a full functional dependency if removal of any attribute from A results in the dependency no longer holding.
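The definition can be checked mechanically against sample rows. The sketch below is an assumption-laden illustration: `fd_holds` and `is_full_fd` are hypothetical helpers, the rows are invented, and a check on sample data can only show that the data is consistent with a dependency, not that the dependency holds in general.

```python
from itertools import combinations

# Invented sample rows (not the lecture's data).
rows = [
    {"clientNo": "C1", "propertyNo": "P1", "cName": "Ann", "rentStart": "2020-01-01"},
    {"clientNo": "C1", "propertyNo": "P2", "cName": "Ann", "rentStart": "2021-01-01"},
    {"clientNo": "C2", "propertyNo": "P1", "cName": "Bob", "rentStart": "2021-01-01"},
]


def fd_holds(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in lhs)
        val = tuple(r[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


def is_full_fd(rows, lhs, rhs):
    """True if lhs -> rhs holds and no non-empty proper subset of lhs
    also determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return False
    return True


key = ["clientNo", "propertyNo"]
print(is_full_fd(rows, key, ["rentStart"]))  # needs the whole key
print(is_full_fd(rows, key, ["cName"]))      # clientNo alone is enough
```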
Second Normal Form (2NF)

• A relation is in 2NF if it is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key.

– Second normal form applies to relations with composite keys (that is, a primary key composed of two or more attributes).

– A relation with a single-attribute primary key is automatically in at least 2NF.
1NF to 2NF

• Identify the primary key for the 1NF relation.

• Identify the functional dependencies in the relation.

• If partial dependencies exist on the primary key, remove them by placing the partially dependent attributes in a new relation along with a copy of their determinant.
Partial dependency

• A functional dependency A → B is a partial dependency if there is some attribute that can be removed from A and the dependency still holds.
2NF example

Consider the ClientRental relation.

• This ClientRental table is in 1NF. The primary key of the table is (clientNo, propertyNo).

• In order to move this table to a 2NF solution, we must identify and remove the partial dependencies from the table.
Functional dependencies in ClientRental relation
• The functional dependencies (fd) for the ClientRental relation are as follows:

• The presence of partial dependencies shows that the table is not in 2NF.
– cName is partially dependent on the primary key, in other words, on only the clientNo attribute.
– Property attributes are also partially dependent on the primary key.
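A declared set of dependencies can be scanned for partial dependencies by looking for determinants that are proper subsets of the primary key. The list below is an assumption: it encodes only the dependencies the text describes (cName on clientNo, the property attributes on propertyNo) plus an assumed dependency of the rental dates on the full key, and may not match the lecture's complete list.

```python
# Assumed dependencies, written as (determinant, dependents).
PRIMARY_KEY = {"clientNo", "propertyNo"}
FDS = [
    ({"clientNo", "propertyNo"}, {"rentStart", "rentFinish"}),
    ({"clientNo"}, {"cName"}),
    ({"propertyNo"}, {"pAddress", "rent", "ownerNo", "oName"}),
]


def partial_dependencies(fds, primary_key):
    """Return the dependencies whose determinant is a non-empty proper
    subset of the primary key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if lhs and set(lhs) < set(primary_key)]


partial = partial_dependencies(FDS, PRIMARY_KEY)
for lhs, rhs in partial:
    print(sorted(lhs), "->", sorted(rhs))

# With partial dependencies present, the relation is not in 2NF.
in_2nf = not partial
print(in_2nf)
```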
Transform the ClientRental relation into 2NF
• To remove the partial dependencies, we create new tables: the partially dependent non-primary-key columns are moved out of ClientRental, along with a copy of the part of the primary key on which they are fully functionally dependent.

• This results in the creation of three new relations called Client, Rental, and PropertyOwner.
2NF relations derived from ClientRental relation
• The three tables, Client, Rental and PropertyOwner, are in 2NF because every non-primary-key column is fully functionally dependent on the primary key of the table.
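The decomposition amounts to three projections of the 1NF relation with duplicate rows removed. In this sketch the attribute lists assigned to Client, Rental, and PropertyOwner are assumptions inferred from the dependencies described earlier, the `project` helper is hypothetical, and the rows are invented apart from the owner name mentioned in the remarks.

```python
# Invented 1NF ClientRental rows for illustration.
client_rental = [
    {"clientNo": "C1", "cName": "Ann", "propertyNo": "P1", "pAddress": "1 High St",
     "rentStart": "2020-01-01", "rentFinish": "2020-12-31", "rent": 400,
     "ownerNo": "O1", "oName": "Tony Diamond"},
    {"clientNo": "C1", "cName": "Ann", "propertyNo": "P2", "pAddress": "2 Low Rd",
     "rentStart": "2021-01-01", "rentFinish": "2021-12-31", "rent": 500,
     "ownerNo": "O2", "oName": "Sam Lee"},
    {"clientNo": "C2", "cName": "Bob", "propertyNo": "P1", "pAddress": "1 High St",
     "rentStart": "2021-01-01", "rentFinish": "2021-06-30", "rent": 400,
     "ownerNo": "O1", "oName": "Tony Diamond"},
    {"clientNo": "C2", "cName": "Bob", "propertyNo": "P3", "pAddress": "3 Mid Ave",
     "rentStart": "2021-07-01", "rentFinish": "2021-12-31", "rent": 450,
     "ownerNo": "O1", "oName": "Tony Diamond"},
]


def project(rows, attrs):
    """Relational projection: keep attrs only and drop duplicate rows,
    preserving first-seen order."""
    seen, out = set(), []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


client = project(client_rental, ["clientNo", "cName"])
rental = project(client_rental,
                 ["clientNo", "propertyNo", "rentStart", "rentFinish"])
property_owner = project(client_rental,
                         ["propertyNo", "pAddress", "rent", "ownerNo", "oName"])

print(len(client), len(rental), len(property_owner))
```

In this sample the owner name still appears in more than one PropertyOwner row, because oName depends on ownerNo rather than directly on propertyNo.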
Remarks

• Although 2NF relations have less redundancy than those in 1NF, they may still suffer from update anomalies.

• For example, if we want to update the name of an owner, such as Tony Diamond, we have to update two tuples in the PropertyOwner relation.

• If we update only one tuple and not the other, the database would be in an inconsistent state.

• This update anomaly is caused by a transitive dependency.

• We need to remove such dependencies by progressing to third normal form.
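The update anomaly can be reproduced on a small PropertyOwner table. Apart from the owner name taken from the remark above, the rows, the replacement name, and the `owner_name_conflicts` helper are invented for illustration.

```python
# Hypothetical PropertyOwner rows: owner O1 appears in two tuples.
property_owner = [
    {"propertyNo": "P1", "ownerNo": "O1", "oName": "Tony Diamond"},
    {"propertyNo": "P2", "ownerNo": "O2", "oName": "Sam Lee"},
    {"propertyNo": "P3", "ownerNo": "O1", "oName": "Tony Diamond"},
]


def owner_name_conflicts(rows):
    """Return owners recorded under more than one name, i.e. cases where
    the data violates ownerNo -> oName."""
    names = {}
    for r in rows:
        names.setdefault(r["ownerNo"], set()).add(r["oName"])
    return {owner: n for owner, n in names.items() if len(n) > 1}


conflicts_before = owner_name_conflicts(property_owner)

# Update the owner's name in one tuple only, leaving the other unchanged.
property_owner[0]["oName"] = "Anthony Diamond"

conflicts = owner_name_conflicts(property_owner)
print(conflicts_before)
print(conflicts)
```

Storing each owner's name once, in a relation keyed on ownerNo, would make this kind of partial update impossible.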
