Unit-1 RDBMS

About relational database

Uploaded by

yashwantverma42
Copyright
© © All Rights Reserved

Unit-1

What is a Relational Database (RDBMS)?


A relational database is a type of database that stores and provides access to data points that are
related to one another. Relational databases are based on the relational model, an intuitive,
straightforward way of representing data in tables. In a relational database, each row in the table is a
record with a unique ID called the key. The columns of the table hold attributes of the data, and each
record usually has a value for each attribute, making it easy to establish the relationships among data
points.
A relational database management system (RDBMS) is a collection of programs and capabilities that
enable IT teams and others to create, update, administer and otherwise interact with a relational
database.
RDBMSes store data in the form of tables, with most commercial relational database management
systems using Structured Query Language (SQL) to access the database. However, since SQL was
invented after the initial development of the relational model, it isn't necessary for RDBMS use.
The RDBMS is the most popular database system among organizations. It provides a dependable
method of storing and retrieving large amounts of data while offering a combination of system
performance and ease of implementation. It's also the basis for modern database systems
like MySQL.
An RDBMS is a type of database management system (DBMS) that stores data in a row-based table
structure that connects related data elements. An RDBMS includes functions that maintain the
security, accuracy, integrity and consistency of the data. This differs from the file-based storage
used in a basic DBMS.
Other differences between database management systems and relational database management
systems include the following:
• The relational model. RDBMSes use a relational model to map relationships between data
elements, while a DBMS can support different data models.
• SQL. RDBMSes use SQL as a standard language for managing and querying data, while
DBMSes don't have a standard language and can support different programming languages.
• Number of allowed users. A basic DBMS typically serves one user at a time, while an RDBMS
can operate with multiple concurrent users.
• Hardware and software requirements. A DBMS needs less software and hardware than an
RDBMS.
• Amount of data. RDBMSes can handle any amount of data, from small to large, while a
DBMS can only manage small amounts.
• Database structure. In a DBMS, data is kept in a hierarchical form, whereas an RDBMS
utilizes a table where the headers are used as column names and the rows contain the
corresponding values.
• ACID implementation. A basic DBMS doesn't necessarily follow the atomicity, consistency,
isolation and durability (ACID) model. RDBMSes, on the other hand, typically process
transactions according to the ACID properties to keep the data consistent.
• Distributed databases. While an RDBMS offers complete support for distributed databases,
a DBMS won't provide support.
• Types of programs managed. While an RDBMS helps manage the relationships between its
incorporated tables of data, a DBMS focuses on maintaining databases that are present within
the computer network and system hard disks.
• Support of database normalization. An RDBMS can be normalized, but a DBMS can't.
• Schemas. RDBMSes have a rigid schema, which limits the types of data they can store and
manage, while DBMSes can accommodate flexible schemas.
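The ACID point in the list above can be made concrete with a minimal sketch using Python's built-in sqlite3 module. The `account` table, its columns and the `transfer` helper are hypothetical and exist only for illustration. The sketch shows atomicity: when the second statement of a transfer violates a constraint, the first statement is rolled back too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a balance may never become negative.
conn.execute(
    "CREATE TABLE account ("
    " acc_no INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100.0), (2, 50.0)")
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = transfer(1, 2, 30.0)    # both updates succeed and are committed
second = transfer(1, 2, 500.0)  # the debit violates the CHECK, so the credit is undone
balances = dict(conn.execute("SELECT acc_no, balance FROM account"))
```

After the failed second transfer, `balances` still reflects only the first transfer, which is the all-or-nothing behavior that atomicity describes.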
How an RDBMS works
As mentioned previously, an RDBMS stores data in the form of tables. Each system will have
varying numbers of tables, with each table usually having its own primary key. The primary key is
then used to identify each row in the table.
Within the table are rows and columns. The rows are known as records or horizontal entities; they
contain the information for the individual entry. The columns are known as vertical entities and
possess information about the specific field.
When tables are created and when data is written to them, the RDBMS enforces the following
constraints:
• Primary keys identify each row in the table. One table can only contain one primary key,
although that key may span several columns. The key must be unique and without null values.
• Foreign keys are used to link two tables. The foreign key is stored in one table and refers to
the primary key associated with another table.
• Not null ensures that a column declared with this constraint never holds a null value, such as
an empty cell.
• Check confirms that each entry in a column satisfies a specified condition. Uniqueness of a
column's values is enforced separately, by a unique constraint.
• Data integrity ensures that data is validated against these rules before it is stored.
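The constraints listed above can be tried out with a small sketch using Python's built-in sqlite3 module. The `department` and `employee` tables and all their columns are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other RDBMSes enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default

# Hypothetical schema showing primary key, not null, check and foreign key.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    salary  REAL CHECK (salary >= 0),
    dept_id INTEGER NOT NULL REFERENCES department(dept_id)
);
""")
conn.execute("INSERT INTO department VALUES (1, 'Accounts')")
conn.execute("INSERT INTO employee VALUES (10, 'Asha', 52000, 1)")


def is_rejected(sql):
    """Return True if the statement is refused because it violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


violations = {
    # emp_id 10 already exists
    "primary key": is_rejected("INSERT INTO employee VALUES (10, 'Ravi', 100, 1)"),
    # name may not be NULL
    "not null": is_rejected("INSERT INTO employee VALUES (11, NULL, 100, 1)"),
    # salary must be >= 0
    "check": is_rejected("INSERT INTO employee VALUES (12, 'Ravi', -5, 1)"),
    # department 99 does not exist
    "foreign key": is_rejected("INSERT INTO employee VALUES (13, 'Ravi', 100, 99)"),
}
```

Each entry in `violations` records whether the database refused the offending row; only the one valid employee row remains in the table.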
RDBMSes also involve the following concepts:
• SQL. This is the domain-specific language used for storing and retrieving data.
• SQL query. This is a data request from an RDBMS system.
• Index. This is a data structure used to accelerate database retrieval.
• View. This is a virtual table whose contents are computed from underlying tables by a
stored query.
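As an illustration of the index and view concepts in the list above, here is a minimal sketch with Python's sqlite3 module. The `orders` table, the index name `idx_orders_customer` and the view name `customer_totals` are all made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
INSERT INTO orders VALUES (1, 'Asha', 250.0), (2, 'Ravi', 90.0), (3, 'Asha', 40.0);

-- An index is a separate structure that speeds up lookups by customer.
CREATE INDEX idx_orders_customer ON orders(customer);

-- A view stores a query, not data; its rows are computed from orders on demand.
CREATE VIEW customer_totals AS
    SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer;
""")

# Querying the view looks exactly like querying a table.
totals = dict(conn.execute("SELECT customer, total FROM customer_totals"))

# The index is recorded in the schema catalog alongside tables and views.
index_names = [row[0] for row in
               conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")]
```

Because the view is recomputed from `orders` each time it is queried, new orders show up in `customer_totals` without any extra maintenance.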
Ensuring the integrity of data involves several specific checks, including entity, domain, referential
and user-defined integrity. Entity integrity confirms that the rows aren't duplicated in the table.
Domain integrity ensures that data is entered into the table based on specific conditions, such as file
format or range of values. Referential integrity ensures that a row that's referenced from another table
can't be deleted while those references exist. Finally, user-defined integrity confirms that the table
will satisfy all user-defined conditions.
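Referential integrity can be seen in action with a short sqlite3 sketch. The `customer` and `account` tables are hypothetical, and the sketch relies on SQLite's default foreign key action, which refuses to delete a row that is still referenced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for FK enforcement in SQLite
conn.executescript("""
CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE account (
    acc_no  INTEGER PRIMARY KEY,
    cust_id INTEGER NOT NULL REFERENCES customer(cust_id)
);
INSERT INTO customer VALUES (1, 'Asha');
INSERT INTO account VALUES (500, 1);
""")

# Customer 1 is still referenced by account 500, so the delete must fail.
try:
    conn.execute("DELETE FROM customer WHERE cust_id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

remaining = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Deleting the referencing `account` row first would allow the customer to be removed afterwards.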
Advantages of a relational database management system
The use of an RDBMS can be beneficial to most organizations; the systematic view of raw data helps
companies better understand and act on the information while enhancing the decision-making
process. Using tables to store data also improves the security of information stored in the databases.
Users can customize access and set barriers to limit the content that's made available. This feature
makes the RDBMS particularly useful to organizations in which the manager decides what data is
provided to employees and customers.
Furthermore, RDBMSes make it easy to add new data to the system or alter existing tables while
ensuring consistency with the previously available content.
Other advantages of the RDBMS include the following:
• Flexibility. Updating data is more efficient, as the changes only need to be made in one place.
• Maintenance. DBAs can easily maintain, control and update data in the database. Backups
also become easier, as automation tools included in the RDBMS automate these tasks.
• Data structure. The table format used in RDBMSes is easy to understand and provides an
organized and structural manner through which entries are matched by firing queries.
• ACID properties. Atomicity, consistency, isolation and durability keep data reliable even
when operations fail or run concurrently.
• Security. RDBMS systems can include security features such as encryption, access controls
and user authentication.
• Scalability. RDBMS systems can horizontally distribute data across different servers.
Disadvantages of a relational database management system
On the other hand, relational database management systems also have some disadvantages. For
example, implementing an RDBMS may require purchasing or licensing special software, which
introduces an additional cost. Once the software is obtained, the setup process can be tedious, as it
can require large volumes of existing content, sometimes millions of lines, to be transferred into
the RDBMS tables. This process might require the help of a programmer or a team of data entry
specialists. Special attention must be paid to the data during entry to ensure sensitive information
doesn't fall into the wrong hands.
Some other drawbacks of the RDBMS include the character limit placed on certain fields in the tables
and the inability to fully understand new forms of data -- such as complex numbers, designs and
images.
Furthermore, while isolated databases can be created using an RDBMS, the process requires large
chunks of information be separated from each other. Connecting these large amounts of data to form
the isolated database can be complicated.
Uses of RDBMS
Relational database management systems are frequently used in disciplines such as manufacturing,
human resources and banking. The system is also useful for airlines that need to store ticket service
and passenger documentation information, as well as universities that maintain student databases.
Other examples of RDBMS uses include the following:
• Business systems. Business applications can use RDBMSes to store, manage and process
transaction data.
• E-commerce. An RDBMS can be used to manage data related to inventory management,
orders, transactions and customer data.
• Healthcare. RDBMSes are used to manage data related to healthcare, medical records, lab
results and electronic health record systems.
• Education systems. RDBMSes can be used to manage student data and academic records.
Examples of RDBMS types
There are many different types of DBMSes, including a varying set of options for RDBMSes.
Examples of different RDBMSes include the following:
• Oracle Database. This RDBMS system produced and marketed by Oracle is known for its
varied feature set, scalability and security.
• MySQL. This widely used open source RDBMS system excels in speed, reliability and
usability.
• Azure SQL. This Microsoft-provided cloud-based RDBMS system is used for small database
applications.
• SQL Server. This Microsoft-provided RDBMS system is more complex than Azure SQL and
offers full control.
• IBM Db2. This IBM-offered RDBMS system was also extended to support object-relational
and non-relational structures such as JavaScript Object Notation and Extensible Markup
Language.
What is a Database Model?
A Database Model is a type of data model that defines a Database’s logical structure. It determines
how data can be stored, organized, and manipulated in the first place. The Relational Model, which
uses a table-based format, is the most common Database Model. It demonstrates how data is
organized and the various types of relationships that exist between them.
Types of Database Models
The different types of database models are:
• Relational Database Model
• Hierarchical Database Model
• Network Database Model
• Object-Oriented Database Model
• Object-Relational Database Model
• Entity Relationship Database Model
• Other Database Models
Relational Database Model
A relational database management system (RDBMS) is the software used to maintain Relational
Databases. The data in this type of Database Model is organized in two-dimensional tables with
rows and columns, and relationships between tables are maintained by storing a common field.
Three key terms, relations, attributes, and domains, are frequently used in Relational Models. A
Relation is a table with rows and columns. Attributes are the named columns of a relation; each
one describes a property shared by all items in the table and applies to every cell in its column.
The Domain is the set of values that an attribute can take. The relational Database Model is
depicted in the following diagram.

Types of database models: Relational Database Model


Parameters in Relational Model
• Tuple: A tuple is a single row in a table.
• Cardinality Of a Relation: The cardinality of a relation is the number of tuples in it. The
relation in the figure has a cardinality of 4.
• Degree Of a Relation: Each column of a relation is referred to as an attribute. The degree of a
relation is the number of attributes in it. The degree of the relation in the figure is 3.
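These parameters are easy to compute once a relation is written down. The sketch below uses a made-up `student` relation (it is not the relation from the figure) that happens to have the same cardinality and degree as the figure's relation.

```python
# A hypothetical relation: a tuple of attribute names plus a list of tuples (rows).
attributes = ("roll_no", "name", "city")
student = [
    (1, "Asha", "Pune"),
    (2, "Ravi", "Delhi"),
    (3, "Meera", "Pune"),
    (4, "John", "Mumbai"),
]

cardinality = len(student)   # number of tuples (rows)
degree = len(attributes)     # number of attributes (columns)

# Every tuple must supply exactly one value per attribute.
well_formed = all(len(row) == degree for row in student)
```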
Keys Of a Relation
• Primary Key: It’s the attribute (or set of attributes) that uniquely identifies each row in a
table. There are no null values in it.
• Foreign Key: It refers to another table’s primary key. Only values that appear in the primary
key of the table to which it refers are allowed.
Examples
• Oracle: The Oracle Database is also known as Oracle RDBMS or simply Oracle. Oracle
Corporation produces and markets a multi-Model Database management system. An Oracle
database is a logical collection of data. A database is used to save and retrieve data. It’s the
first database built specifically for enterprise grid computing, the most flexible and cost-
effective way to manage data and applications.
• MySQL: MySQL is a Relational Database management system (RDBMS) based on
Structured Query Language (SQL) that is free to use. MySQL is available on almost every
platform, including Linux, UNIX, and Windows.
• Microsoft SQL Server: In corporate IT environments, Microsoft SQL Server is an RDBMS
that supports a wide range of transaction processing, business intelligence, and analytics
applications.
• PostgreSQL: PostgreSQL, or simply Postgres, is an object-Relational Database management
system (ORDBMS) that focuses on extensibility and compliance with industry standards.
• DB2: DB2 is an IBM relational database management system (RDBMS) that is optimized
for data storage, analysis, and retrieval. With XML support, the DB2 product now also
supports Object-Oriented features and non-relational structures.
The tables below show a sample Relational Database Model for a bank environment, where data is
stored in two-dimensional tables.
Types of Database Models: Relational Database Model Example
Advantages
Here are a few key advantages of Relational Database Models:
• Changes in the Database structure have no impact on data access in the Relational Model.
• Representing information as tables with rows and columns makes it much easier to
comprehend.
• Unlike other Models, the Relational Database Model supports both data independence and
structure independence, making Database design, maintenance, administration, and usage
much easier.
• You can use this to write complex queries to access or modify Database data.
• In comparison to other models, it is easier to maintain security.
Drawbacks
• It’s difficult to map objects in a Relational Database.
• The Relational Model lacks an object-oriented paradigm.
• With Relational Databases, maintaining data integrity is difficult.
• The Relational Model is suitable for small Databases but not for large Databases because they
are not designed for change. Each row represents a unique entry, and each column describes
unique attributes, in relational databases. Data Modeling requires planning ahead of time and,
depending on the system, can take months or even years. After-the-fact changes take time and
resources, and Database Modeling projects can take years and cost millions of dollars.
Because big data is always changing, a flexible and forgiving database platform is required.
• Hardware costs are incurred, making it expensive.
• The relational data model is not appropriate for all domains. Schema evolution is difficult due
to an inflexible data model. Poor horizontal scalability results in low distributed availability.
Due to joins, ACID transactions, and strict consistency constraints, performance has suffered
(especially in distributed environments).
• The implementation complexities and physical data storage details of a Relational Database
system are hidden from users.

Hierarchical Database Model

• It’s one of IBM’s first types of Database Models for information management. The data is
organized in a tree-like structure in a Hierarchical Database Model.
• Nowadays, these types of Database models are uncommon. It has nodes for records and
branches for fields. A hierarchical Database is exemplified by the Windows registry in
Windows XP whose configuration options are saved as node-based tree structures.
• The diagram below depicts a generalized Hierarchical Database Model (data represented or
stored in the root node, parent node, and child node).

Types of Database Models: Hierarchical Database Model


The diagram above illustrates a hierarchical Database Model for a university management system.
The “parent-child” relationship is used to store data in this type of Database.
Advantages
• The Model facilitates the addition and deletion of new data.
• Data at the top of the Hierarchy can be accessed quickly.
• It was compatible with linear data storage media like tapes. The hierarchical database was
well-suited to the tape storage systems used by mainframes in the 1970s, and it was widely
used in organizations with databases based on those systems.
• It applies to anything that relies on one-to-many relationships. For example, a president may
have many managers reporting to them, and those managers may have many employees
reporting to them, but each employee has only one manager.
Drawbacks
• It often requires the same data to be stored repeatedly in multiple entities.
• Linear data storage mediums, such as tapes, are rarely used for primary database storage
today.
• When looking for data, the DBMS must go through the entire Model from top to bottom until
the required information is found, which makes queries extremely slow.
• Only one-to-many relationships are supported by this Model; many-to-many relationships are
not.
Network Database Model
The Database Task Group formalized this model in the 1960s. It generalizes the hierarchical
model. A record in this model can have multiple parent segments, which are grouped into levels,
with a logical relationship between the segments that belong to each level. Typically, any two
segments can have a many-to-many logical relationship.
Because it resembles a Hierarchical Database Model, it is frequently referred to as a modified version
of a Hierarchical database. The Network Database Model organizes data in a graph-like fashion and
allows for multiple parent nodes.
The Network models are the types of Database models that are designed to represent objects and their
relationships flexibly. The network model extends the hierarchical model by allowing many-to-many
relationships between linked records, which implies multiple parent records.
Network databases are built using sets of related records and are based on mathematical set
theory. Each set contains one owner or parent record as well as one or more child or member records.
This model can convey complex relationships because a record can be a member or child in multiple
sets.
After being formally defined by the Conference on Data Systems Languages (CODASYL) in the
1970s, it became extremely popular.

Advantages
• The network model is conceptually simple to implement.
• The network model can better represent data redundancy than the hierarchical Model.
• The network model can handle one-to-many and many-to-many relationships, which is
extremely useful in simulating real-world scenarios such as the Network model for a Finance
Department, Restaurant Chain workflow, etc.
• The network model is better than the hierarchical Model at isolating programs from complex
physical storage details. The network model allows each record to have multiple parent and
child records, forming a generalized graph structure, whereas the hierarchical database model
structures data as a tree of records with each record having one parent record and many
children.
Drawbacks
• Because all records are maintained using pointers, the Database structure becomes extremely
complicated.
• Any record’s insertion, deletion, and updating operations necessitate numerous pointer
adjustments.
• Changing the Database’s structure is extremely difficult.
Object-Oriented Database Model
In object-oriented programming, an Object Database is a system in which data is represented as
objects. Relational Databases, which are table-oriented, are not the same as object-oriented
Databases. The Object-Oriented Data Model is one of the types of database models that is based on
the widely used concept of object-oriented programming languages.
Polymorphism, inheritance, and overloading are all terms that come to mind when thinking about
object orientation. Some of the key concepts of object-oriented programming that have been applied to Data
Modeling include object identity, encapsulation, and information hiding with methods to provide an
interface to objects. In addition to structured and collection types, the object-oriented data model
supports a data-rich type system. Object-Oriented models, including general object databases with no
additional spatial functionality, are the best databases for spatial data, especially vector data.
The difference between Relational and object-oriented types of Database Models is illustrated in the
diagram below.

Types of Database Models: Relational vs Object oriented
An Object-Oriented Model is illustrated in the diagram below.
Types of Database Models: Object oriented Database Model Example
Advantages
• Object Databases can store a variety of data types, whereas traditional Relational Databases
are oriented toward simple, tabular data. Object-oriented Databases, unlike traditional
Databases such as hierarchical, network, and Relational Databases, can handle a variety of
data types, including pictures, voice, video, text, and numbers.
• You can reuse code, Model real-world scenarios, and improve reliability and flexibility with
object-oriented Databases.
• Because most of the tasks within the system are encapsulated, they can be reused and
incorporated into new tasks, so object-oriented Databases have lower maintenance costs than
other Models.
Drawbacks
• An OODBMS lacks a theoretical foundation because there is no universally defined data
Model.
• OODBMS usage is still limited when compared to RDBMS usage.
• There is a lack of security support in OODBMSs that do not include adequate security
mechanisms.
• The system is more complex than conventional Database management systems.
Object-Relational Database Model
This hybrid Database Model is one of the types of Database Models that combines the Relational
Model’s simplicity with some of the Object-Oriented Database models’ advanced functionality. It
allows designers to incorporate objects into the common table structure.
SQL3, vendor languages, ODBC, JDBC, and proprietary call interfaces are all extensions of the
Relational Model’s languages and interfaces.
Entity Relationship Database Models
Entity Relationship Database Model is one of the types of Database models that is similar to the
network model: it captures relationships between real-world entities, but it isn’t as closely linked to
the Database’s physical structure. It’s more commonly used to conceptually design a Database.
The people, places, and things about which data points are stored are referred to as entities, and each
of them has specific attributes that make up their domain. The cardinality of entities, or the
relationships between them, is also mapped.
Types of Database Models: Entity Relationship Database Model
The star schema is a common ER diagram that connects multiple dimensional tables through a central
fact table.
What is a Database Schema?

A database schema is a blueprint that represents the tables and relations of a data set.

It is important to have a good database schema design. The reasons are:

• To avoid data redundancy, which wastes memory and leads to data inconsistency.
• To have correctness and completeness of data.
• To maintain data accuracy and integrity.
• To write simple and easy queries.

Problems that arise with a bad database schema are:

• Anomalies occur whenever data is inserted, modified or deleted, especially in a large
database.
• This makes data integrity harder to maintain.
• Data inconsistency can occur.
• It becomes difficult to scale the database when future application functionality is added.
• Performance degrades.
• Maintenance also becomes difficult.
To prevent all these problems, one has to normalize the database by organizing the data
efficiently.
Normalization
Normalization is the process of specifying and defining keys, columns and
relationships in order to create an efficient database.

Objectives of Normalization
Normalization reduces data redundancy, thereby reducing the amount of space used
by the database and ensuring that data is stored efficiently.

It divides large tables into smaller tables and defines relationships between
them.

It reduces the causes of anomalies when data is manipulated.


Normalization defines rules for relational tables in the form of normal forms.

A normal form is a set of rules and criteria against which each relation is evaluated.
Bringing a relation into a given normal form removes problems such as multi-valued attributes
and partial or transitive functional dependencies, improving the integrity and efficiency of
the relational table.

Functional Dependency (FD):

A functional dependency is a relationship that exists between two attributes (or sets of attributes).

It is a constraint in which one attribute determines the value of another.

It plays a vital role in telling good database design from bad.

It typically exists between the primary key and a non-key attribute within a table.

For any relation R, attribute Y is functionally dependent on attribute X (usually the PK) if, in
every valid instance of R, each value of X uniquely determines the value of Y. This relationship
is indicated by the representation below:

X → Y

The left side of an FD is known as the determinant; the right side is known as the dependent.

For example:

Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address.

Here the Emp_Id attribute can uniquely identify the Emp_Name attribute of the employee table
because if we know the Emp_Id, we can tell the employee name associated with it.
Functional dependency can be written as:
Emp_Id → Emp_Name

We can say that Emp_Name is functionally dependent on Emp_Id.
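To see what "X uniquely determines Y" means on actual rows, here is a minimal sketch in Python. The helper `holds` and the sample employee rows are hypothetical. Note that a sample can only refute a dependency: an FD is a rule about every valid instance, so passing the check on one sample does not prove it.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs is not violated by rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Made-up sample data: two different employees share the name "Asha".
employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Mumbai"},
]

id_determines_name = holds(employees, ["Emp_Id"], ["Emp_Name"])
name_determines_id = holds(employees, ["Emp_Name"], ["Emp_Id"])
```

In this sample Emp_Id → Emp_Name is consistent with the data, while Emp_Name → Emp_Id is violated because the name "Asha" maps to two different ids.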

Types of Functional Dependencies:

There are mainly four types of Functional Dependency in DBMS:
• Multivalued Dependency
• Trivial Functional Dependency
• Non-Trivial Functional Dependency
• Transitive Dependency
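Multivalued Functional Dependency

Multivalued dependency occurs in the situation where there are multiple independent
multivalued attributes in a single table.

In a multivalued FD, the attributes of the dependent set do not depend on each other.

In an Emp table, the Emp_Name and Sal attributes both depend on Emp_Id for identification,
but they are independent of each other.

Emp_Id → {Emp_Name, Sal} is given as an example of a multivalued FD. Strictly speaking,
because each employee has a single name and a single salary, this is an ordinary FD with two
dependents; a true multivalued dependency is written X →→ Y and arises when one value of X
is associated with a set of Y values independently of the table's other attributes.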


Trivial Functional Dependency

In a trivial functional dependency, the dependent is always a subset of the determinant.

i.e. if X → Y and Y is a subset of X, then it is called a trivial functional dependency.

In the Emp table, {Emp_Id, Emp_Name} → Emp_Name is a trivial FD, as Emp_Name is a
subset of {Emp_Id, Emp_Name}.

{Emp_Id, Emp_Name} → Emp_Id is also a trivial FD.

Non-Trivial Functional Dependency

In this FD, the dependent is not a subset of the determinant.

X → Y is non-trivial if Y is not a subset of X.

The set {Emp_Id, Emp_Name} can determine the value of Emp_Address or Sal, but neither
Emp_Address nor Sal is a subset of {Emp_Id, Emp_Name}.

Hence, {Emp_Id, Emp_Name} → Sal is a non-trivial FD.

{Emp_Id, Emp_Name} → Emp_Address is also a non-trivial FD.
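The trivial versus non-trivial distinction is purely a subset test on the attribute sets, as this small Python sketch shows. The function name `is_trivial` is made up for illustration.

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial exactly when rhs is a subset of lhs."""
    return set(rhs) <= set(lhs)


determinant = {"Emp_Id", "Emp_Name"}

trivial_name = is_trivial(determinant, {"Emp_Name"})        # dependent inside determinant
trivial_id = is_trivial(determinant, {"Emp_Id"})            # dependent inside determinant
trivial_address = is_trivial(determinant, {"Emp_Address"})  # dependent outside determinant
trivial_sal = is_trivial(determinant, {"Sal"})              # dependent outside determinant
```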

Transitive Functional Dependency

In a transitive functional dependency, the dependent is indirectly dependent on the determinant. It
is formed by two functional dependencies: if X → Y and Y → Z hold, then X → Z is a transitive
dependency.
NORMAL FORMS

Given a relation schema, we need to decide whether it is a good design or whether we need to
decompose it into smaller relations. Such a decision must be guided by an understanding of
what problems, if any, arise from the current schema. To provide such guidance, several normal
forms have been proposed. If a relation schema is in one of these normal forms, we know that
certain kinds of problems cannot arise.

The normal forms based on FDs:


First Normal Form (1NF):

First Normal Form is part of the definition of a relation (table) itself. This rule
requires that all the attributes in a relation have atomic domains.

In the first normal form, only single values are permitted at the intersection of
each row and column; hence, there are no repeating groups.

To normalize a relation that contains a repeating group, remove the repeating
group and form two new relations.

We re-arrange the relation (table) as below, to convert it to First Normal Form.
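As a hedged illustration of the step just described, the sketch below takes a made-up table in which each student row carries a repeating group of phone numbers and splits it into two relations with atomic values. The table, its attribute names and the data are assumptions for the example, not the table from the original notes.

```python
# Hypothetical unnormalized data: "Phones" holds a repeating group (a list).
unnormalized = [
    {"Stu_ID": 1, "Stu_Name": "Asha", "Phones": ["98100", "98200"]},
    {"Stu_ID": 2, "Stu_Name": "Ravi", "Phones": ["97300"]},
]

# Relation 1: the non-repeating attributes, one row per student.
student = [(row["Stu_ID"], row["Stu_Name"]) for row in unnormalized]

# Relation 2: one row per (student, phone) pair, so every cell is atomic.
student_phone = [
    (row["Stu_ID"], phone)
    for row in unnormalized
    for phone in row["Phones"]
]
```

`student_phone` uses Stu_ID as a foreign key back to `student`, so no information is lost by the split.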


Second Normal Form (2NF):

Before we learn about the second normal form, we need to understand the following −

Prime attribute: An attribute that is part of a candidate key is known as a prime attribute.

Non-prime attribute: An attribute that is not part of any candidate key is said to be
a non-prime attribute.
For the second normal form, the relation must first be in 1NF.
A relation in 1NF is automatically in 2NF if its Prime Key comprises a single attribute.

If the relation has a composite Prime Key, then each non-key attribute must be fully
dependent on the entire PK and not on a subset of the PK.
A relation is in 2NF if it has no partial dependency.

Partial Dependency: If a proper subset of a candidate key determines a non-prime
attribute, it is called a partial dependency.

We see here in Student_Project relation that the prime key attributes are Stu_ID and Proj_ID.
According to the rule, non-key attributes, i.e. Stu_Name and Proj_Name must be dependent
upon both and not on any of the prime key attribute individually. But we find that Stu_Name
can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This
is called partial dependency, which is not allowed in Second Normal Form.
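The Student_Project case can be sketched in Python as follows. The sample rows are invented, and the helper assumes the relation has a single candidate key, so "prime" simply means "part of that key". The names `student`, `project` and `enrolment` for the resulting relations are illustrative.

```python
key = {"Stu_ID", "Proj_ID"}
fds = [
    ({"Stu_ID"}, {"Stu_Name"}),
    ({"Proj_ID"}, {"Proj_Name"}),
]


def partial_dependencies(key, fds):
    """FDs whose determinant is a proper subset of the key and whose
    dependent includes a non-key attribute (single candidate key assumed)."""
    return [(lhs, rhs) for lhs, rhs in fds if lhs < key and not rhs <= key]


partial = partial_dependencies(key, fds)

# Made-up rows of Student_Project: (Stu_ID, Proj_ID, Stu_Name, Proj_Name).
rows = [
    (1, 101, "Asha", "Compiler"),
    (1, 102, "Asha", "Database"),
    (2, 101, "Ravi", "Compiler"),
]

# Removing the partial dependencies gives one relation per determinant,
# plus a relation that keeps the original composite key.
student = sorted({(s, sn) for s, p, sn, pn in rows})
project = sorted({(p, pn) for s, p, sn, pn in rows})
enrolment = sorted({(s, p) for s, p, sn, pn in rows})
```

After the split, each student name and each project name is stored once instead of once per enrolment.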
Third Normal Form (3NF):

To be in third normal form, the relation must be in second normal form. Also:

• All transitive dependencies must be removed; a non-key attribute may not be functionally
dependent on another non-key attribute.
• For any non-trivial functional dependency X → A, either X is a superkey or A is a prime
attribute.

Transitive dependency: If A → B and B → C are two FDs, then A → C is called a
transitive dependency.

We find that in the above Student_detail relation, Stu_ID is the key and only prime key
attribute. We find that City can be identified by Stu_ID as well as Zip itself. Neither Zip is a
superkey nor is City a prime attribute. Additionally, Stu_ID → Zip → City, so there exists
transitive dependency.

To bring this relation into third normal form, we break it into two relations: Student_Detail (Stu_ID, Stu_Name, Zip) and ZipCodes (Zip, City).
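The transitive dependency Stu_ID → Zip → City and the resulting split can be sketched in Python. The `closure` helper is a standard attribute-closure computation; the sample rows are invented, and the relation names follow the Student_Detail and ZipCodes relations referred to in the BCNF discussion.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    ({"Stu_ID"}, {"Stu_Name", "Zip"}),
    ({"Zip"}, {"City"}),
]

stu_closure = closure({"Stu_ID"}, fds)  # Stu_ID reaches City through Zip
zip_closure = closure({"Zip"}, fds)     # Zip determines City but is not a superkey

# Made-up rows: (Stu_ID, Stu_Name, Zip, City).
rows = [
    (1, "Asha", "411001", "Pune"),
    (2, "Ravi", "110001", "Delhi"),
    (3, "Meera", "411001", "Pune"),
]

# Decompose so that City lives only in the relation keyed by Zip.
student_detail = sorted({(sid, name, z) for sid, name, z, city in rows})
zip_codes = sorted({(z, city) for sid, name, z, city in rows})
```

In the decomposed form the fact that zip 411001 belongs to Pune is stored once, so it cannot become inconsistent between students.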
Boyce-Codd Normal Form (BCNF):

Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal Form.

A relation is in BCNF iff in every non-trivial functional dependency X → Y, X is a super
key.

In the above example, Stu_ID is the super-key in the relation Student_Detail and Zip is
the super-key in the relation ZipCodes. So,

Stu_ID → Stu_Name, Zip

and

Zip → City

which confirms that both the relations are in BCNF.

Example

Consider a relation R with attributes (student, teacher, subject).

Student    Teacher     Subject
Jhansi     P.Naresh    Database
Jhansi     K.Das       C
Subbu      P.Naresh    Database
Subbu      R.Prasad    C

F: { (student, teacher) -> subject,
     (student, subject) -> teacher,
     teacher -> subject }

Candidate keys are (student, teacher) and (student, subject).


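The above relation is in 3NF, since in every non-trivial FD either the determinant is a candidate
key or the dependent (subject, in teacher -> subject) is a prime attribute. A relation R is in
BCNF if, for every non-trivial FD X->Y, X is a super key.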

The above relation is not in BCNF, because in the FD (teacher->subject), teacher is not a
key.

So R is divided into two relations R1(Teacher, subject) and R2(student, Teacher).

R1

Teacher    Subject
P.Naresh   Database
K.Das      C
R.Prasad   C

R2

Student   Teacher
Jhansi    P.Naresh
Jhansi    K.Das
Subbu     P.Naresh
Subbu     R.Prasad

The anomalies that were present in R are now removed in the two relations above.
DECOMPOSITIONS

A decomposition of a relation schema R consists of replacing the relation schema by two (or
more) relation schemas that each contain a subset of the attributes of R and together include
all attributes in R.

When a relation in the relational model is not in an appropriate normal form, it must be
decomposed. In a database, breaking a table down into multiple tables is termed
decomposition.

The properties of a relational decomposition are listed below:

1. Attribute Preservation: Using functional dependencies, the algorithms decompose the
universal relation schema R into a set of relation schemas D = {R1, R2, ..., Rn},
which becomes the relational database schema. D is called the decomposition of R.

Each attribute in R must appear in at least one relation schema Ri in the decomposition, i.e., no
attribute is lost. This is called the attribute preservation condition of decomposition.

2. Dependency Preservation: Each functional dependency X->Y specified in F either appears
directly in one of the relation schemas Ri in the decomposition D or can be inferred from
the dependencies that appear in some Ri. This is dependency preservation.
If a relation R is decomposed into relations R1 and R2, then each dependency of R must
either be a part of R1 or R2 or be derivable from the combination of the functional
dependencies of R1 and R2.

For example, suppose there is a relation R(A, B, C, D) with the functional dependency
set {A->BC}. The relation R is decomposed into R1(ABC) and R2(AD), which is
dependency preserving because the FD A->BC is a part of relation R1(ABC).
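
A dependency preservation check can be sketched as follows. This uses the standard closure-based test (grow the left side using only what each Ri can see) rather than any algorithm named in these notes; the helper names `preserves` and `is_dependency_preserving` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(fd, schemas, fds):
    """Check whether one FD is enforceable using only the decomposed schemas."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            # Attributes derivable from what this schema can see, kept to
            # the attributes that actually belong to this schema.
            gained = closure(result & schema, fds) & schema
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result


def is_dependency_preserving(schemas, fds):
    """A decomposition preserves F if every FD in F is preserved."""
    return all(preserves(fd, schemas, fds) for fd in fds)


# Example from the text: R(A, B, C, D), F = {A -> BC}, R1(ABC), R2(AD).
fds = [(frozenset("A"), frozenset("BC"))]
schemas = [frozenset("ABC"), frozenset("AD")]
preserved = is_dependency_preserving(schemas, fds)
print(preserved)  # True

# The (student, teacher, subject) relation split into R1(teacher, subject)
# and R2(student, teacher), as in the BCNF example.
fds2 = [
    (frozenset({"student", "teacher"}), frozenset({"subject"})),
    (frozenset({"student", "subject"}), frozenset({"teacher"})),
    (frozenset({"teacher"}), frozenset({"subject"})),
]
schemas2 = [frozenset({"teacher", "subject"}), frozenset({"student", "teacher"})]
preserved2 = is_dependency_preserving(schemas2, fds2)
print(preserved2)  # False
```

The second call shows that the FD (student, subject) -> teacher cannot be checked within either R1 or R2 alone, so that BCNF decomposition is not dependency preserving.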

3. Lossless Join Decomposition: The lossless join property is a feature of decomposition
supported by normalization. It is the ability to ensure that any instance of the original
relation can be reconstructed from the corresponding instances in the smaller relations.

Let R be a relation and F a set of functional dependencies on R. A decomposition
{R1, R2, …, Rn} of R is called a lossless decomposition for R if the natural join of
R1, R2, …, Rn produces exactly the relation R.

In other words, a decomposition is lossless if the natural join of all the decomposed
relations gives back the original relation.

A decomposition of R into R1 and R2 is lossless if:

1. The union of the attributes of the sub-relations R1 and R2 contains all
the attributes of the original relation R.

R1 ∪ R2 = R

2. The intersection of the attributes of R1 and R2 is not empty, i.e., there is
at least one attribute that is present in both R1 and R2.

R1 ∩ R2 ≠ ∅

3. The intersection of the attributes of R1 and R2 is a superkey of R1 or of R2,
or of both.

R1 ∩ R2 = superkey of R1 or R2
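
The three conditions can be turned into a small test for a two-way decomposition. This is a sketch under stated assumptions: the function name `is_lossless_binary` is hypothetical, the schema R(A, B, C, D) with A -> BC is reused from the dependency preservation example, and the second split is made up purely to show a failing case.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r, r1, r2, fds):
    """Apply the three lossless-join conditions to a split of r into r1, r2."""
    if r1 | r2 != r:            # condition 1: no attribute is lost
        return False
    common = r1 & r2
    if not common:              # condition 2: the intersection is not empty
        return False
    determined = closure(common, fds)
    # condition 3: the common attributes are a superkey of r1 or of r2
    return r1 <= determined or r2 <= determined


r = frozenset("ABCD")
fds = [(frozenset("A"), frozenset("BC"))]

good = is_lossless_binary(r, frozenset("ABC"), frozenset("AD"), fds)
bad = is_lossless_binary(r, frozenset("AB"), frozenset("BCD"), fds)
print(good, bad)  # True False
```

In the first split the common attribute A determines all of R1(ABC), so the join is lossless. In the second, B determines nothing beyond itself, so condition 3 fails.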
Let’s see an example of a lossless join decomposition. Suppose we have the
following relation, EmployeeProjectDetail:
<EmployeeProjectDetail>

Employee_Code   Employee_Name   Employee_Email       Project_Name   Project_ID
101             John            john@demo.com        Project103     P03
101             John            john@demo.com        Project101     P01
102             Ryan            ryan@example.com     Project104     P04
103             Stephanie       stephanie@abc.com    Project102     P02

Now, we decompose this relation into the EmployeeProject and ProjectDetail
relations:

<EmployeeProject>

Employee_Code   Project_ID   Employee_Name   Employee_Email
101             P03          John            john@demo.com
101             P01          John            john@demo.com
102             P04          Ryan            ryan@example.com
103             P02          Stephanie       stephanie@abc.com

The primary key of the above relation is {Employee_Code, Project_ID}.

<ProjectDetail>

Project_ID   Project_Name
P03          Project103
P01          Project101
P04          Project104
P02          Project102

The primary key of the above relation is {Project_ID}.

Let’s first check EmployeeProject ∪ ProjectDetail:

<EmployeeProject ∪ ProjectDetail>

Employee_Code   Project_ID   Employee_Name   Employee_Email       Project_Name
101             P03          John            john@demo.com        Project103
101             P01          John            john@demo.com        Project101
102             P04          Ryan            ryan@example.com     Project104
103             P02          Stephanie       stephanie@abc.com    Project102

As we can see, all the attributes of EmployeeProject and ProjectDetail appear in
EmployeeProject ∪ ProjectDetail, and this is the same set of attributes as in the
original relation. So the first condition holds.

Now let’s check EmployeeProject ∩ ProjectDetail:

<EmployeeProject ∩ ProjectDetail>

Project_ID
P03
P01
P04
P02

As we can see, this is not empty, so the second condition holds as well. Also,
EmployeeProject ∩ ProjectDetail = Project_ID. This is the superkey of the
ProjectDetail relation, so the third condition holds as well.

Now, since all three conditions hold for our decomposition, this is a lossless
join decomposition.
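
The same result can be confirmed on the data itself. The sketch below starts from the four joined rows shown above, projects them onto the two schemas, and joins them back. The helpers `project`, `natural_join` and `as_set` are hypothetical names written for this illustration, not part of any library.

```python
def project(rows, columns):
    """Keep only the given columns and drop duplicate rows."""
    projected = []
    for row in rows:
        reduced = {col: row[col] for col in columns}
        if reduced not in projected:
            projected.append(reduced)
    return projected


def natural_join(left, right):
    """Combine rows that agree on every column name they share."""
    joined = []
    for lrow in left:
        for rrow in right:
            shared = set(lrow) & set(rrow)
            if all(lrow[col] == rrow[col] for col in shared):
                joined.append({**lrow, **rrow})
    return joined


def as_set(rows):
    """Order-independent form of a table, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


columns = ("Employee_Code", "Project_ID", "Employee_Name",
           "Employee_Email", "Project_Name")
data = [
    (101, "P03", "John", "john@demo.com", "Project103"),
    (101, "P01", "John", "john@demo.com", "Project101"),
    (102, "P04", "Ryan", "ryan@example.com", "Project104"),
    (103, "P02", "Stephanie", "stephanie@abc.com", "Project102"),
]
rows = [dict(zip(columns, values)) for values in data]

# The decomposition used in the text.
employee_project = project(rows, columns[:4])
project_detail = project(rows, ("Project_ID", "Project_Name"))
rejoined = natural_join(employee_project, project_detail)
lossless = as_set(rejoined) == as_set(rows)

# A split with no common attribute, for contrast (made up for illustration).
employees = project(rows, ("Employee_Code", "Employee_Name", "Employee_Email"))
lossy_join = natural_join(employees, project_detail)

print(lossless)         # True
print(len(lossy_join))  # 12
```

With Project_ID shared between the two tables, the join returns exactly the four starting rows. Without a shared attribute the join degenerates into a cross product of 3 employees and 4 projects, which produces spurious rows.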

4. Lack of Data Redundancy

Data redundancy is also known as repetition of information.
A proper decomposition should not suffer from any data redundancy.

The lack of data redundancy property may be achieved through the normalization
process.
