VII. Normalization
Databases
Objectives of DB table normalization
The consequences of the lack of database normalization
are:
• Inaccuracy of database systems.
• Slowing down of processes.
• Inefficiency in operations.
Database normalization helps to avoid these negative
effects when designing new databases. It also makes it
possible to check whether existing databases guarantee the
necessary data and referential integrity.
When normalizing a database, four goals should be set:
• Organize the data into logical groups, so that each
group describes a small part of the whole.
• Minimize the amount of duplicate data stored in the
database.
• Refine the organization of the data so that when a
change is needed, it only has to be applied in one place.
• Build a database that can be accessed quickly and
where data can be manipulated with maximum
efficiency and without compromising its integrity.
Normalization (pros)
One of the key terms of relational data modeling is
normalization.
Normalization is a database design concept that is applied
to relational databases to avoid redundancies.
The objective of normalization is to reduce duplicate
values. If a database is normalized to one of the normal
forms described below, the resulting tables have less
redundancy than the original. Normalization therefore
simplifies the maintenance of databases.
Normalization (cons)
Normalizing a database involves separating attributes
into independent tables. This usually requires
introducing foreign keys and can lead to redundant key
values. Its biggest drawback, however, is that in a
normalized database the data that form a logical whole
are no longer stored together. To bring together data
that is stored in separate tables, it is necessary to
execute a Join.
Database queries with Joins make it possible to filter
complex data, but they take more effort to write than a
simple query, and they execute more slowly when the
Joins involve a large number of tables.
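The cost described above can be seen in a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and rows here are hypothetical and chosen only for illustration: the advisor and the classes of a student live in different tables, so listing them together needs a Join.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate a Join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, advisor TEXT);
    CREATE TABLE registrations (
        student_id INTEGER REFERENCES students(student_id),
        class_id   TEXT,
        PRIMARY KEY (student_id, class_id)
    );
    INSERT INTO students VALUES (1, 'Jones'), (2, 'Smith');
    INSERT INTO registrations VALUES (1, 'A-101'), (1, 'B-202'), (2, 'A-101');
""")

# The advisor and the classes are stored separately, so the query
# has to join the two tables on the shared student_id key.
joined_rows = conn.execute("""
    SELECT s.student_id, s.advisor, r.class_id
    FROM students AS s
    JOIN registrations AS r ON r.student_id = s.student_id
    ORDER BY s.student_id, r.class_id
""").fetchall()

for row in joined_rows:
    print(row)
```

Each additional table in the query adds another JOIN clause of this kind, which is where the extra writing effort and execution time come from.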
NORMALIZATION OF RELATIONAL DB
Database normalization is a process that
consists of defining and applying a series
of rules to the relations obtained after
moving from the entity-relationship model to
the relational model, in order to minimize data
redundancy and make the data easier to
manage. The normal forms are:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Fourth Normal Form (4NF)
• Fifth Normal Form (5NF)
First Normal Form (1NF)
A table is in first normal form if:
• All attributes are atomic. An attribute is atomic if the elements of
its domain are simple and indivisible.
• The number of columns does not vary from row to row.
• Non-key fields are identified by the key (functional
dependency).
• The order of the rows and of the columns carries no meaning;
that is, if the data changes order, its meaning does not
change.
• The table has a primary key.
This normal form eliminates repeating groups within a table.
Second Normal Form (2NF)
Functional dependency.
A relation is in 2NF if it is in 1NF and every attribute that
is not part of any key depends on the whole primary
key. This means that there are no partial dependencies: no
non-key attribute may depend on only part of a composite
primary key.
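A partial dependency can be spotted in sample data with a small check. The sketch below is illustrative only: the helper name `fd_holds`, the column names, and the rows are hypothetical. Note also that a check against sample rows can refute a functional dependency but cannot prove that it holds for the schema in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and
        # returns the stored one on every later occurrence.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical table whose primary key is (student_id, class_id).
rows = [
    {"student_id": 1, "class_id": "A-101", "advisor": "Jones"},
    {"student_id": 1, "class_id": "B-202", "advisor": "Jones"},
    {"student_id": 2, "class_id": "A-101", "advisor": "Smith"},
]

# advisor is determined by student_id alone, which is only part of
# the key: a partial dependency, so this table is not in 2NF.
partial_dependency = fd_holds(rows, ["student_id"], ["advisor"])
print(partial_dependency)
```

Moving `advisor` into a table keyed by `student_id` alone would remove the partial dependency.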
Third Normal Form (3NF)
The data is in second normal form and the columns do not
depend on other non-key columns.
A table is in Third Normal Form (3NF) if it is in 2NF and
no attribute outside the primary key can be determined
from another attribute that is also outside the primary
key; that is, there are no transitive functional
dependencies.
Values in a record that are not part of that record's key do
not belong in the table. In general, whenever the contents
of a group of fields may apply to more than a single record
in the table, consider placing those fields in a separate
table.
Normalization (Example)
Non-normalized table:
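As a stand-in for the table, the sketch below builds a non-normalized student table in Python. The column names follow the fields named in this example (Student#, Advisor, Adv-Room, Class1, Class2, Class3); every value is hypothetical and is not taken from the original table.

```python
# Column names follow the fields named in the text; all values are hypothetical.
unnormalized_students = [
    {"Student#": 1001, "Advisor": "Jones", "Adv-Room": 412,
     "Class1": "101-07", "Class2": "143-01", "Class3": "159-02"},
    {"Student#": 1002, "Advisor": "Smith", "Adv-Room": 216,
     "Class1": "101-07", "Class2": "143-01", "Class3": "179-04"},
]

# Print the rows as a simple aligned table.
columns = list(unnormalized_students[0])
print("  ".join(f"{c:<8}" for c in columns))
for row in unnormalized_students:
    print("  ".join(f"{str(row[c]):<8}" for c in columns))
```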
First normal form: no repeating groups
Tables should only have two dimensions. Since a student
has multiple classes, these classes must appear in a
separate table. The Class1, Class2, and Class3 fields in the
records above are indications of design issues.
Another way to look at this problem: in a one-to-many
relationship, do not put the one side and the many side
in the same table. Instead, create another table in first
normal form by removing the repeating group (Class#), as
shown below:
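The removal of the repeating group can be sketched in Python. This is a minimal illustration with hypothetical values: each Class1, Class2, Class3 column becomes its own row with a single Class# field.

```python
# Hypothetical non-normalized rows with a repeating group (Class1..Class3).
unnormalized = [
    {"Student#": 1001, "Advisor": "Jones", "Adv-Room": 412,
     "Class1": "101-07", "Class2": "143-01", "Class3": "159-02"},
    {"Student#": 1002, "Advisor": "Smith", "Adv-Room": 216,
     "Class1": "101-07", "Class2": "143-01", "Class3": "179-04"},
]

# One row per (student, class) pair replaces the Class1..Class3 columns.
first_normal_form = [
    {"Student#": row["Student#"], "Advisor": row["Advisor"],
     "Adv-Room": row["Adv-Room"], "Class#": row[col]}
    for row in unnormalized
    for col in ("Class1", "Class2", "Class3")
    if row.get(col)  # skip empty class slots
]

for row in first_normal_form:
    print(row)
```

A student with a fourth class now needs one more row rather than a new column, so the number of columns no longer varies.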
Second normal form: eliminate redundant data
Note the multiple Class# values for each Student# value in the
table above. Class# is not functionally dependent on Student#
(the primary key), so this relation is not in second normal form.
The following tables show the second normal form:
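The split can be sketched as follows. The Students table name comes from this example; the Registration table name and all values are assumptions made for illustration.

```python
# Hypothetical 1NF rows: (Student#, Advisor, Adv-Room, Class#).
first_normal_form = [
    (1001, "Jones", 412, "101-07"),
    (1001, "Jones", 412, "143-01"),
    (1001, "Jones", 412, "159-02"),
    (1002, "Smith", 216, "101-07"),
    (1002, "Smith", 216, "143-01"),
    (1002, "Smith", 216, "179-04"),
]

# Students: one row per student, keyed by Student#.
students = sorted({(sid, advisor, room)
                   for sid, advisor, room, _ in first_normal_form})

# Registration: one row per (Student#, Class#) pair.
registration = [(sid, class_no) for sid, _, _, class_no in first_normal_form]

print(students)
print(registration)
```

The advisor and room of each student are now stored once instead of once per class.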
Third normal form: eliminate data that does not depend on the key
In the last example, Adv-Room (the advisor's office number) is functionally
dependent on the Advisor attribute, not on the key. The solution is to move
that attribute from the Students table to the Faculty table, as shown below:
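A minimal sqlite3 sketch of the resulting design follows. The lowercase table and column names and all rows are hypothetical. With the office number stored in the Faculty table, changing an advisor's office is a single-row update, however many students that advisor has.

```python
import sqlite3

# Hypothetical 3NF schema: the office number lives with the advisor.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE faculty (name TEXT PRIMARY KEY, room INTEGER);
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        advisor    TEXT REFERENCES faculty(name)
    );
    INSERT INTO faculty VALUES ('Jones', 412), ('Smith', 216);
    INSERT INTO students VALUES (1001, 'Jones'), (1002, 'Smith'), (1003, 'Jones');
""")

# Moving an advisor to a new office touches exactly one row.
cursor = conn.execute("UPDATE faculty SET room = 420 WHERE name = 'Jones'")
rows_changed = cursor.rowcount

# Every student of that advisor sees the new room through the Join.
rooms = conn.execute("""
    SELECT s.student_id, f.room
    FROM students AS s
    JOIN faculty AS f ON f.name = s.advisor
    ORDER BY s.student_id
""").fetchall()

print(rows_changed, rooms)
```

In the earlier design, the same change would have to be repeated in every student row that names the advisor.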