0% found this document useful (0 votes)
59 views16 pages

Relational Algebra in DBMS

Relational algebra is a procedural query language used in Database Management Systems (DBMS) to perform operations on data stored in databases, forming the foundation for SQL. It includes basic operations such as selection, projection, union, set difference, Cartesian product, and renaming, as well as derived operations like join, intersection, and division. Understanding relational algebra is essential for working with modern databases like Oracle and Microsoft SQL Server.

Uploaded by

AMAN RAJ
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
59 views16 pages

Relational Algebra in DBMS

Relational algebra is a procedural query language used in Database Management Systems (DBMS) to perform operations on data stored in databases, forming the foundation for SQL. It includes basic operations such as selection, projection, union, set difference, Cartesian product, and renaming, as well as derived operations like join, intersection, and division. Understanding relational algebra is essential for working with modern databases like Oracle and Microsoft SQL Server.

Uploaded by

AMAN RAJ
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 16

Relational algebra

Relational algebra in DBMS is a procedural query language. Queries in relational algebra are
performed using operators. Relational Algebra is the fundamental block for modern language
SQL and modern Database Management Systems such as Oracle Database, Mircosoft SQL
Server, IBM Db2, etc.

Let's know what relational algebra in DBMS is and also we will learn about relational algebra
operations in DBMS.

Before reading this article, you should have some understanding of the following DBMS topics:

 Relational Calculus in DBMS


 Relational and Non-Relational Databases
 Decomposition in DBMS

What is Relational Algebra in DBMS?


Relational Algebra came in 1970 and was given by Edgar F. Codd (Father of DBMS). It is also
known as Procedural Query Language(PQL) as in PQL, a programmer/user has to mention two
things, "What to Do" and "How to Do".

Suppose our data is stored in a database, then relational algebra is used to access the data from
the database.

The First thing is we have to access the data, this needs to be specified in the query as "What to
Do", but we have to also specify the method/procedure in the query that is "How to Do" or how
to access the data from the database.

Types of Relational Operations in DBMS


In Relational Algebra, we have two types of Operations.

 Basic Operations
 Derived Operations

Applying these operations over relations/tables will give us new relations as output.
Basic Operations
Six fundamental operations are mentioned below. The majority of data retrieval operations are
carried out by these. Let's know them one by one.

But, before moving into detail, let's have two tables or we can say relations STUDENT(ROLL,
NAME, AGE) and EMPLOYEE(EMPLOYEE_NO, NAME, AGE) which will be used in the
below examples.

STUDENT

ROLL NAME AGE


1 Aman 20
2 Atul 18
3 Baljeet 19
4 Harsh 20
5 Prateek 21
6 Prateek 23
EMPLOYEE

EMPLOYEE_NO NAME AGE


E-1 Anant 20
E-2 Ashish 23
E-3 Baljeet 25
E-4 Harsh 20
E-5 Pranav 22

Select (σ)

Select operation is done by Selection Operator which is represented by "sigma"(σ). It is used to


retrieve tuples(rows) from the table where the given condition is satisfied. It is a unary
operator means it requires only one operand.
Notation : σ p(R)
Where σ is used to represent SELECTION
R is used to represent RELATION
p is the logic formula
Let's understand this with an example:
Suppose we want the row(s) from STUDENT Relation where "AGE" is 20

σ AGE=20 (STUDENT)

σ age=20 (student)

This will return the following output:

ROLL NAME AGE


1 Aman 20
4 Harsh 20

Project (∏)

Project operation is done by Projection Operator which is represented by "pi"(∏). It is used to


retrieve certain attributes(columns) from the table. It is also known as vertical partitioning as it
separates the table vertically. It is also a unary operator.
Notation : ∏ a(r)
Where ∏ is used to represent PROJECTION
r is used to represent RELATION
a is the attribute list
Let's understand this with an example:
Suppose we want the names of all students from STUDENT Relation.

∏ NAME(STUDENT)
This will return the following output:

NAME
Aman
Atul
Baljeet
Harsh
Prateek

As you can see from the above output it eliminates duplicates.


For multiple attributes, we can separate them using a ",".

∏ ROLL,NAME(STUDENT)

Above code will return two columns, ROLL and NAME.

ROLL NAME
1 Aman
2 Atul
3 Baljeet
4 Harsh
5 Prateek
6 Prateek

Union (∪)

Union operation is done by Union Operator which is represented by "union"(∪). It is the same as
the union operator from set theory, i.e., it selects all tuples from both relations but with the
exception that for the union of two relations/tables both relations must have the same set of

Notation: R ∪ S
Attributes. It is a binary operator as it requires two operands.

Where R is the first relation


S is the second relation

If relations don't have the same set of attributes, then the union of such relations will result
in NULL.

Let's have an example to clarify the concept:


Suppose we want all the names from STUDENT and EMPLOYEE relation.

∏ NAME(STUDENT) ∪ ∏ NAME(EMPLOYEE)
NAME
Aman
Anant
NAME
Ashish
Atul
Baljeet
Harsh
Pranav
Prateek

As we can see from the above output it also eliminates duplicates.

Set Difference (-)

Set Difference as its name indicates is the difference between two relations (R-S). It is denoted
by a "Hyphen"(-) and it returns all the tuples(rows) which are in relation R but not in relation S.
It is also a binary operator.
Notation : R - S
Where R is the first relation
S is the second relation

Just like union, the set difference also comes with the exception of the same set of attributes in
both relations.

Let's take an example where we would like to know the names of students who are in STUDENT
Relation but not in EMPLOYEE Relation.

∏ NAME(STUDENT) - ∏ NAME(EMPLOYEE)

This will give us the following output:

NAME
Aman
Atul
Prateek

Cartesian product (X)

Cartesian product is denoted by the "X" symbol. Let's say we have two relations R and S.
Cartesian product will combine every tuple(row) from R with all the tuples from S. I know it
sounds complicated, but once we look at an example, you'll see what I mean.
Notation: R X S
Where R is the first relation
S is the second relation

As we can see from the notation it is also a binary operator.


Let's combine the two relations STUDENT and EMPLOYEE.

STUDENT X EMPLOYEE
ROLL NAME AGE EMPLOYEE_NO NAME AGE
1 Aman 20 E-1 Anant 20
1 Aman 20 E-2 Ashish 23
1 Aman 20 E-3 Baljeet 25
1 Aman 20 E-4 Harsh 20
1 Aman 20 E-5 Pranav 22
2 Atul 18 E-1 Anant 20
2 Atul 18 E-2 Ashish 23
2 Atul 18 E-3 Baljeet 25
2 Atul 18 E-4 Harsh 20
2 Atul 18 E-5 Pranav 22

. . . And so on.

Rename (ρ)

Rename operation is denoted by "Rho"(ρ). As its name suggests it is used to rename the output
relation. Rename operator too is a binary operator.
Notation: ρ(R,S)
Where R is the new relation name
S is the old relation name

Let's have an example to clarify this


Suppose we are fetching the names of students from STUDENT relation. We would like to
rename this relation as STUDENT_NAME.

ρ(STUDENT_NAME,∏ NAME(STUDENT))

STUDENT_NAME

NAME
Aman
Atul
Baljeet
Harsh
Prateek

As you can see, this output relation is named "STUDENT_NAME".

Takeaway

 Select (σ) is used to retrieve tuples(rows) based on certain conditions.


 Project (∏) is used to retrieve attributes(columns) from the relation.
 Union (∪) is used to retrieve all the tuples from two relations.
 Set Difference (-) is used to retrieve the tuples which are present in R but not in S(R-S).
 Cartesian product (X) is used to combine each tuple from the first relation with each tuple
from the second relation.
 Rename (ρ) is used to rename the output relation.

Derived Operations
Also known as extended operations, these operations can be derived from basic operations and
hence named Derived Operations. These include three operations: Join Operations, Intersection
operations, and Division operations.

Let's study them one by one.

Join Operations

Join Operation in DBMS are binary operations that allow us to combine two or more relations.
They are further classified into two types: Inner Join, and Outer Join.

First, let's have two relations EMPLOYEE consisting


of E_NO, E_NAME, CITY and EXPERIENCE. EMPLOYEE table contains employee's
information such as id, name, city, and experience of employee(In Years). The other relation
is DEPARTMENT consisting of D_NO, D_NAME, E_NO and MIN_EXPERIENCE.

DEPARTMENT table defines the mapping of an employee to their department. It contains


Department Number, Department Name, Employee Id of the employee working in that
department, and the minimum experience required(In Years) to be in that department.

EMPLOYEE

E_NO E_NAME CITY EXPERIENCE


E-1 Ram Delhi 04
E-2 Varun Chandigarh 09
E-3 Ravi Noida 03
E-4 Amit Bangalore 07

DEPARTMENT

D_NO D_NAME E_NO MIN_EXPERIENCE


D-1 HR E-1 03
D-2 IT E-2 05
D-3 Marketing E-3 02
Also, let's have the Cartesian Product of the above two relations. It will be much easier to
understand Join Operations when we have the Cartesian Product.

E_NO E_NAME CITY EXPERIENCE D_NO D_NAME E_NO MIN_


E-1 Ram Delhi 04 D-1 HR E-1
E-1 Ram Delhi 04 D-2 IT E-2
E-1 Ram Delhi 04 D-3 Marketing E-3
E-2 Varun Chandigarh 09 D-1 HR E-1
E-2 Varun Chandigarh 09 D-2 IT E-2
E-2 Varun Chandigarh 09 D-3 Marketing E-3
E-3 Ravi Noida 03 D-1 HR E-1
E-3 Ravi Noida 03 D-2 IT E-2
E-3 Ravi Noida 03 D-3 Marketing E-3
E-4 Amit Bangalore 07 D-1 HR E-1
E-4 Amit Bangalore 07 D-2 IT E-2
E-4 Amit Bangalore 07 D-3 Marketing E-3

Inner Join

When we perform Inner Join, only those tuples returned that satisfy the certain condition. It is
also classified into three types: Theta Join, Equi Join and Natural Join.

Theta Join (θ)

Theta Join combines two relations using a condition. This condition is represented by the

Notation : R ⋈θ S
symbol "theta"(θ). Here conditions can be inequality conditions such as >,<,>=,<=, etc.

Where R is the first relation


S is the second relation

Let's have a simple example to understand this.

Suppose we want a relation where EXPERIENCE from EMPLOYEE >= MIN_EXPERIENCE


from DEPARTMENT.

E_NO E_NAME CITY EXPERIENCE D_NO D_NAME E_NO MIN_EXPERIENCE


E-1 Ram Delhi 04 D-1 HR E-1 03
E-1 Ram Delhi 04 D-3 Marketing E-3 02
E-2 Varun Chandigarh 09 D-1 HR E-1 03
E-2 Varun Chandigarh 09 D-2 IT E-2 05
E-2 Varun Chandigarh 09 D-3 Marketing E-3 02
E-3 Ravi Noida 03 D-1 HR E-1 03
E-3 Ravi Noida 03 D-3 Marketing E-3 02
E-4 Amit Bangalore 07 D-1 HR E-1 03
E_NO E_NAME CITY EXPERIENCE D_NO D_NAME E_NO MIN_EXPERIENCE
E-4 Amit Bangalore 07 D-2 IT E-2 05
E-4 Amit Bangalore 07 D-3 Marketing E-3 02
EMPLOYEE⋈θ EMPLOYEE.EXPERIENCE>=DEPARTMENT.MIN_EXPERIENCE DEPARTMENT

Check the Cartesian Product, if in any tuple/row EXPERIENCE >= MIN_EXPERIENCE then
insert this tuple/row in output relation.

Equi Join

Equi Join is a special case of theta join where the condition can only
contain **equality(=)** comparisons.
A non-equijoin is the inverse of an equi join, which occurs when you join on a condition other
than "=".

Let's have an example where we would like to join EMPLOYEE and DEPARTMENT relation
where E_NO from EMPLOYEE = E_NO from DEPARTMENT.

EMPLOYEE ⋈EMPLOYEE.E_NO = DEPARTMENT.E_NO DEPARTMENT


E_NO E_NAME CITY EXPERIENCE D_NO D_NAME E_NO MIN_
E-1 Ram Delhi 04 D-1 HR E-1
E-2 Varun Chandigarh 09 D-2 IT E-2
E-3 Ravi Noida 03 D-3 Marketing E-3

Check Cartesian Product, if the tuple contains same E_NO, insert that tuple in the output relation

Natural Join (⋈)

A comparison operator is not used in a natural join. It does not concatenate like a Cartesian
product. A Natural Join can be performed only if two relations share at least one common
attribute. Furthermore, the attributes must share the same name and domain.
Natural join operates on matching attributes where the values of the attributes in both relations
are the same and remove the duplicate ones.

Notation : R ⋈ S
Preferably Natural Join is performed on the foreign key.

Where R is the first relation


S is the second relation

Let's say we want to join EMPLOYEE and DEPARTMENT relation with E_NO as a common
attribute.

Notice, here E_NO has the same name in both the relations and also consists of the same
domain, i.e., in both relations E_NO is a string.

EMPLOYEE ⋈ DEPARTMENT
E_NO E_NAME CITY EXPERIENCE D_NO D_NAME MIN_EXPERIEN
E-1 Ram Delhi 04 D-1 HR 03
E-2 Varun Chandigarh 09 D-2 IT 05
E-3 Ravi Noida 03 D-3 Marketing 02

But unlike the above operation, where we have two columns of E_NO, here we are having only
one column of E_NO. This is because Natural Join automatically keeps a single copy of a
common attribute.

Outer Join

Unlike Inner Join which includes the tuple that satisfies the given condition, Outer Join also
includes some/all the tuples which don't satisfy the given condition. It is also of three types: Left
Outer Join, Right Outer Join, and Full Outer Join.

Let's say we have two relations R and S, then


Below is the representation of Left, Right, and Full Outer Joins.

Left Outer Join

As we can see from the diagram, Left Outer Join returns the matching tuples(tuples present in
both relations) and the tuples which are only present in Left Relation, here R.

However, if the matching tuples are NULL, then attributes/columns of Right Relation, here S are
made NULL in the output relation.

Let's understand this a bit more using an example:

EMPLOYEE ⟕EMPLOYEE.E_NO = DEPARTMENT.E_NO DEPARTMENT

Here we are combining EMPLOYEE and DEPARTMENT relation with the constraint that
EMPLOYEE's E_NO must be equal to DEPARTMENT's E_NO.
E_NO E_NAME CITY EXPERIENCE D_NO D_NAME MIN_EXPERIEN
E-1 Ram Delhi 04 D-1 HR 03
E-2 Varun Chandigarh 09 D-2 IT 05
E-3 Ravi Noida 03 D-3 Marketing 02
E-4 Amit Bangalore 07 - - -

As you can see here, all the tuples from left, i.e., EMPLOYEE relation are present. But E-4 is not
satisfying the given condition, i.e., E_NO from EMPLOYEE must be equal to E_NO from
DEPARTMENT, still it is included in the output relation. This is because Outer Join also
includes some/all the tuples which don't satisfy the condition. That's why Outer Join marked E-
4's corresponding tuple/row from DEPARTMENT as NULL.

Right Outer Join

Right Outer Join returns the matching tuples and the tuples which are only present in Right
Relation here S.

The same happens with the Right Outer Join, if the matching tuples are NULL, then the
attributes of Left Relation, here R are made NULL in the output relation.

We will combine EMPLOYEE and DEPARTMENT relations with the same constraint as above.

EMPLOYEE ⟖EMPLOYEE.E_NO = DEPARTMENT.E_NO DEPARTMENT


E_NO E_NAME CITY EXPERIENCE D_NO D_NAME MIN_EXPERIEN
E-1 Ram Delhi 04 D-1 HR 03
E-2 Varun Chandigarh 09 D-2 IT 05
E-3 Ravi Noida 03 D-3 Marketing 02

As all the tuples from DEPARTMENT relation have a corresponding E_NO in EMPLOYEE
relation, therefore no tuple from EMPLOYEE relation contains a NULL.

Full Outer Join

Full Outer Join returns all the tuples from both relations. However, if there are no matching
tuples then, their respective attributes are made NULL in output relation.

Again, combine the EMPLOYEE and DEPARTMENT relation with the same constraint.

EMPLOYEE ⟗EMPLOYEE.E_NO = DEPARTMENT.E_NO DEPARTMENT


E_NO E_NAME CITY EXPERIENCE D_NO D_NAME MIN_EXPERIEN
E-1 Ram Delhi 04 D-1 HR 03
E-2 Varun Chandigarh 09 D-2 IT 05
E-3 Ravi Noida 03 D-3 Marketing 02
E-4 Amit Bangalore 07 - - -
Intersection (∩)

Intersection operation is done by Intersection Operator which is represented


by "intersection"(∩).It is the same as the intersection operator from set theory, i.e., it selects all
the tuples which are present in both relations. It is a binary operator as it requires two operands.
Also, it eliminates duplicates.
Notation : R ∩ S
Where R is the first relation
S is the second relation

Let's have an example to clarify the concept:


Suppose we want the names which are present in STUDENT as well as in EMPLOYEE relation,
Relations we used in Basic Operations.

∏ NAME(STUDENT) ∩ ∏ NAME(EMPLOYEE)
NAME
Baljeet
Harsh

Division (÷)

Division Operation is represented by "division"(÷ or /) operator and is used in queries that


involve keywords "every", "all", etc.
Notation : R(X,Y)/S(Y)
Here,
R is the first relation from which data is retrieved.
S is the second relation that will help to retrieve the data.
X and Y are the attributes/columns present in relation. We can have multiple attributes in
relation, but keep in mind that attributes of S must be a proper subset of attributes of R.
For each corresponding value of Y, the above notation will return us the value of X from
tuple<X,Y> which exists everywhere.

It's a bit difficult to understand this in a theoretical way, but you will understand this with an
example.

Let's have two relations, ENROLLED and COURSE. ENROLLED consist of two attributes
STUDENT_ID and COURSE_ID. It denotes the map of students who are enrolled in given
courses.

COURSE contains the list of courses available.


See, here attributes/columns of COURSE relation are a proper subset of attributes/columns of
ENROLLED relation. Hence Division operation can be used here.

ENROLLED
STUDENT_ID COURSE_ID
Student_1 DBMS
Student_2 DBMS
Student_1 OS
Student_3 OS

COURSE

COURSE_ID
DBMS
OS

Now the query is to return the STUDENT_ID of students who are enrolled in every course.

ENROLLED(STUDENT_ID, COURSE_ID)/COURSE(COURSE_ID)

This will return the following relation as output.

STUDENT_ID
Student_1

Relational Calculus

Introduction
Relational Calculus in database management system (DBMS) is all about "What you want ?".
Relational calculus does not tell us how to get the results from the Database, but it just cares
about what we want.

The theory of Relational calculus was introduced by computer scientist and


mathematician Edgar Codd. Let's deep dive and try to understand Relational calculus.

What is Relational Calculus?


Before understanding Relational calculus in DBMS, we need to understand Procedural
Language and Declarative Langauge.

1. Procedural Language - Those Languages which clearly define how to get the required
results from the Database are called Procedural Language. Relational algebra is a
Procedural Language.
2. Declarative Language - Those Language that only cares about What to get from the
database without getting into how to get the results are called Declarative
Language. Relational Calculus is a Declarative Language.
So Relational Calculus is a Declarative Language that uses Predicate Logic or First-Order Logic
to determine the results from Database.

Types of Relational Calculus in DBMS


Relational Calculus is of Two Types:

1. Tuple Relational Calculus (TRC)


2. Domain Relational Calculus (DRC)

Tuple Relational Calculus (TRC)

Tuple Relational Calculus in DBMS uses a tuple variable (t) that goes to each row of the table
and checks if the predicate is true or false for the given row. Depending on the given predicate
condition, it returns the row or part of the row.

The Tuple Relational Calculus expression Syntax

{t \| P(t)}

Where t is the tuple variable that runs over every Row, and P(t) is the predicate logic expression
or condition.

Let's take an example of a Customer Database and try to see how TRC expressions work.

Customer Table
Customer_id Name Zip code
1 Rohit 12345
2 Rahul 13245
3 Rohit 56789
4 Amit 12345.

Example 1: Write a TRC query to get all the data of customers whose zip code is 12345.
TRC Query: {t \| t ∈ Customer ∧ t.Zipcode = 12345} or TRC Query: {t \| Customer(t) ∧
t[Zipcode] = 12345 }

Workflow of query - The tuple variable "t" will go through every tuple of the Customer table.
Each row will check whether the Cust_Zipcode is 12345 or not and only return those rows that
satisfies the Predicate expression condition.

The TRC expression above can be read as "Return all the tuple which belongs to the
Customer Table and whose Zipcode is equal to 12345."

Result of the TRC expression above:

Customer_id Name Zip code


1 Rohit 12345
4. Amit 12345

Example 2: Write a TRC query to get the customer id of all the Customers.

TRC query: { t \| ∃s (s ∈ Customer ∧ s.Customer_id = t.customer_id) }

Result of the TRC Query:

Customer_id
1
2
3
4

Domain Relational Calculus (DRC)

Domain Relational Calculus uses domain Variables to get the column values required from the
database based on the predicate expression or condition.

The Domain realtional calculus expression syntax:

{<x1,x2,x3,x4...> \| P(x1,x2,x3,x4...)}

where,

<x1,x2,x3,x4...> are domain variables used to get the column values required,
and P(x1,x2,x3...) is predicate expression or condition.

Let's take the example of Customer Database and try to understand DRC queries with some
examples.
Customer Table
Customer_id Name Zip code
1 Rohit 12345
2 Rahul 13245
3 Rohit 56789
4 Amit 12345

Example 1: Write a DRC query to get the data of all customers with Zip code 12345.

DRC query: {<x1,x2,x3> \| <x1,x2> ∈ Customer ∧ x3 = 12345 }

Workflow of Query: In the above query x1,x2,x3 (ordered) refers to the attribute or column
which we need in the result, and the predicate condition is that the first two domain variables x1
and x2 should be present while matching the condition for each row and the third domain
variable x3 should be equal to 12345.

Result of the DRC query will be:

Customer_id Name Zip code


1 Rohit 12345
4 Amit 12345

Example 2: Write a DRC query to get the customer id of all the customer.

DRC Query: { <x1> \| ∃ x2,x3(<x1,x2,x3> ∈ Customer ) }

Result of the above Query will be:

Customer_id
1
2
3
4
NOTE: The above Queries are written for both TRC and DRC so that they can
be compared and can be corelated.

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy