
Chapter 4

The document discusses relational algebra operations including select, project, union, intersection, and set difference. Select and project are unary operations that can retrieve rows or columns from a relation. Union combines rows from relations while intersection and set difference identify rows that are in common or different between relations.

Uploaded by

Samrawit Dawit
Copyright
© © All Rights Reserved
Chapter 4

RELATIONAL ALGEBRA AND CALCULUS


Introduction
• A data model must include a set of operations to manipulate the
database, in addition to the data model's concepts for defining
database structure and constraints.
• The basic set of operations for the relational model is the relational
algebra.
– These operations enable a user to specify basic retrieval requests.
• The result of a retrieval is a new relation, which may have been
formed from one or more relations
• A sequence of relational algebra operations forms a relational algebra
expression,
– whose result will also be a relation that represents the result of a database
query (or retrieval request)
• The relational algebra is important for several reasons:
– it provides a formal foundation for relational model operations.
– some of its concepts are incorporated into the SQL standard query language for
RDBMSs.
• The operations of the relational algebra can be divided into two groups:
1. Set operations from mathematical set theory; these are applicable
because each relation is defined to be a set of tuples in the formal
relational model.
– Set operations include UNION, INTERSECTION, SET DIFFERENCE, and CARTESIAN
PRODUCT
2. The other group consists of operations developed specifically for relational
databases; these include SELECT, PROJECT, and JOIN.
• SELECT and PROJECT are unary operations that operate on a
single relation.
• JOIN and other complex binary operations operate on two relations.
• Some common database requests cannot be performed with the original
relational algebra operations, so additional operations were created to
express these requests.
– These include aggregate functions, which are operations that can summarize data
from the tables, as well as additional types of JOIN and UNION operations.
UNARY RELATIONAL OPERATIONS: SELECT AND PROJECT
The select operation:
• The SELECT operation is used to select a subset of the
tuples from a relation that satisfy a selection condition
• It keeps only those tuples that satisfy a qualifying
condition
• The SELECT operation can also be visualized as a horizontal
partition of the relation into two sets of tuples: those tuples
that satisfy the condition and are selected, and those
tuples that do not satisfy the condition and are discarded.
Example 1: to select the EMPLOYEE tuples whose department
is 4, or those whose salary is greater than $30,000, we
specify each condition individually with a SELECT operation:
σDNO=4(EMPLOYEE)
σSALARY>30000(EMPLOYEE)

Example 2: to select the tuples for all employees who either
work in department 4 and make over $25,000 per year, or
work in department 5 and make over $30,000, we can
specify the following SELECT operation:
σ(DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)
• In general, the SELECT operation is denoted by
σ<selection condition>(R),
• where the symbol σ (sigma) is used to denote the SELECT operator,
and the selection condition is a Boolean expression specified on the
attributes of relation R.
• The relation resulting from the SELECT operation has the same
attributes as R
• The Boolean expression specified in <selection condition> is made up
of a number of clauses of the form :
<attribute name> <comparison op> <constant value>,
• where <attribute name> is the name of an attribute of R,
<comparison op> is normally one of the operators {=, <, ≤, >, ≥, ≠}, and
<constant value> is a constant value from the attribute domain.
• Clauses can be arbitrarily connected by the Boolean operators AND,
OR, and NOT to form a general selection condition
For example
• To select the tuples for all employees who either work in department
4 and make over $25,000 per year, or work in department 5 and
make over $30,000, we can specify the following SELECT operation:
σ (DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)(EMPLOYEE)
• In general, the result of a SELECT operation can be determined as
follows.
• The <selection condition> is applied independently
to each tuple t in R.
• This is done by substituting each occurrence of an
attribute Ai in the selection condition with its value
in the tuple t[Ai].
• If the condition evaluates to TRUE, then tuple t is
selected.
• All the selected tuples appear in the result of the
SELECT operation.
• The Boolean conditions AND, OR, and NOT have their normal
interpretation, as follows:
– (cond1 AND cond2) is TRUE if both (cond1) and (cond2) are TRUE; otherwise, it
is FALSE.
– (cond1 OR cond2) is TRUE if either (cond1) or (cond2) or both are TRUE;
otherwise, it is FALSE.
– (NOT cond) is TRUE if cond is FALSE; otherwise, it is FALSE.
• The SELECT operator is unary; that is, it is applied to a single relation
• The degree of the relation resulting from a SELECT operation (its
number of attributes) is the same as the degree of R.
• The number of tuples in the resulting relation is always less than or
equal to the number of tuples in R
• Notice that the SELECT operation is commutative; that is,
σ <cond1>(σ <cond2>(R)) = σ<cond2>(σ <cond1>(R))
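The evaluation rule for SELECT described above can be sketched in Python. This is only an illustration, not part of the formal algebra: a relation is modeled as a list of dictionaries, the helper name `select` and the sample EMPLOYEE tuples are hypothetical, and only the attribute names DNO and SALARY come from the examples.

```python
# Hypothetical sample data; only the attribute names DNO and SALARY
# are taken from the slides.
EMPLOYEE = [
    {"FNAME": "A", "DNO": 4, "SALARY": 26000},
    {"FNAME": "B", "DNO": 5, "SALARY": 31000},
    {"FNAME": "C", "DNO": 5, "SALARY": 28000},
    {"FNAME": "D", "DNO": 4, "SALARY": 20000},
]


def select(relation, condition):
    """Sketch of sigma_<condition>(relation).

    The condition is applied independently to each tuple t;
    t is kept only if the condition evaluates to True.
    """
    return [t for t in relation if condition(t)]


def example_condition(t):
    # (DNO=4 AND SALARY>25000) OR (DNO=5 AND SALARY>30000)
    return (t["DNO"] == 4 and t["SALARY"] > 25000) or (
        t["DNO"] == 5 and t["SALARY"] > 30000
    )


selected = select(EMPLOYEE, example_condition)
selected_names = [t["FNAME"] for t in selected]


def in_dept5(t):
    return t["DNO"] == 5


def high_salary(t):
    return t["SALARY"] > 30000


# Commutativity: the order of two cascaded SELECTs does not matter.
order_1 = select(select(EMPLOYEE, in_dept5), high_salary)
order_2 = select(select(EMPLOYEE, high_salary), in_dept5)
```

In this sketch the result keeps every attribute of the input tuples, so its degree equals the degree of EMPLOYEE, and it can never contain more tuples than EMPLOYEE.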
The project operation
• The SELECT operation selects some of the rows from the table while
discarding other rows.
• The PROJECT operation, on the other hand, selects certain columns
from the table and discards the other columns
• If we are interested in only certain attributes of a relation, we use the
PROJECT operation to project the relation over these attributes only
• The result of the PROJECT operation can hence be visualized as a
vertical partition of the relation into two relations: one has the
needed columns (attributes) and contains the result of the operation,
and the other contains the discarded columns.
• For example, to list each employee's first and last name and salary,
we can use the PROJECT operation as follows:
∏LNAME, FNAME, SALARY( EMPLOYEE)
• The general form of the PROJECT operation is
∏<attribute list> (R)
• where ∏ (pi) is the symbol used to represent the PROJECT
operation, and <attribute list> is the desired list of attributes from the
attributes of relation R
• The result of the PROJECT operation has only the attributes specified
in <attribute list> in the same order as they appear in the list.
– Hence, its degree is equal to the number of attributes in <attribute list>.
• The number of tuples in a relation resulting from a PROJECT
operation is always less than or equal to the number of tuples in R,
because duplicate tuples are eliminated from the result.
• For nested projections, ∏<list1>(∏<list2>(R)) = ∏<list1>(R) as long as
<list2> contains the attributes in <list1>; otherwise, the left-hand side
is an incorrect expression
– Commutativity does not hold for PROJECT
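A minimal Python sketch of PROJECT, assuming a relation is modeled as a list of dictionaries. The helper name `project` and the sample tuples are hypothetical; the point is to show duplicate elimination and the nested-projection rule.

```python
# Hypothetical sample data for illustration only.
EMPLOYEE = [
    {"FNAME": "A", "LNAME": "X", "SEX": "F", "SALARY": 25000},
    {"FNAME": "B", "LNAME": "Y", "SEX": "F", "SALARY": 25000},
    {"FNAME": "C", "LNAME": "Z", "SEX": "M", "SALARY": 30000},
]


def project(relation, attributes):
    """Sketch of pi_<attributes>(relation).

    Keeps only the listed attributes, in the listed order, and
    removes duplicate tuples so the result is still a set of tuples.
    """
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result


# Two employees share (SEX, SALARY) = ("F", 25000), so one tuple is dropped.
sex_salary = project(EMPLOYEE, ["SEX", "SALARY"])

# Nested projection: valid because the inner list contains the outer list.
inner = project(EMPLOYEE, ["FNAME", "LNAME", "SALARY"])
nested = project(inner, ["LNAME", "SALARY"])
direct = project(EMPLOYEE, ["LNAME", "SALARY"])
```

Swapping the two attribute lists in the nested call would fail in this sketch with a `KeyError`, because the inner result no longer has FNAME. This mirrors the statement that the left-hand side is an incorrect expression when <list2> does not contain <list1>.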
Sequences of Operations and the RENAME Operation
• We may want to apply several relational algebra operations one after
the other.
• Either we can write the operations as a single relational algebra
expression by nesting the operations, or we can apply one operation
at a time and create intermediate result relations.
• In the case of intermediate results, we must give names to the
relations that hold the intermediate results.
• For example: to retrieve the first name, last name, and salary of all
employees who work in department number 5, we must apply a
SELECT and a PROJECT operation.
• We can write a single relational algebra expression as follows:
∏ FNAME, LNAME, SALARY(σDNO=5 (EMPLOYEE))
• Alternatively, we can explicitly show the sequence of operations,
giving a name to each intermediate relation
DEP5_EMPS σDNO=5 (EMPLOYEE)
RESULT∏ FNAME, LNAME,SALARY (DEP5_EMPS)
Note:
• It is often simpler to break down a complex sequence of operations
by specifying intermediate result relations than to write a single
relational algebra expression
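The two styles (one nested expression versus named intermediate relations) can be compared in a short Python sketch. The helpers `select` and `project` and the sample tuples are hypothetical stand-ins for σ and ∏; only the attribute and relation names come from the example.

```python
# Hypothetical sample data for illustration only.
EMPLOYEE = [
    {"FNAME": "Ann", "LNAME": "Lee", "SALARY": 40000, "DNO": 5},
    {"FNAME": "Bob", "LNAME": "Kim", "SALARY": 25000, "DNO": 4},
    {"FNAME": "Cara", "LNAME": "Diaz", "SALARY": 38000, "DNO": 5},
]


def select(relation, condition):
    """Sketch of SELECT: keep tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


def project(relation, attributes):
    """Sketch of PROJECT: keep listed attributes, drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attributes}
        if row not in result:
            result.append(row)
    return result


def works_in_dept5(t):
    return t["DNO"] == 5


# Style 1: a single nested expression.
nested_result = project(
    select(EMPLOYEE, works_in_dept5), ["FNAME", "LNAME", "SALARY"]
)

# Style 2: one operation at a time, naming the intermediate relation.
dep5_emps = select(EMPLOYEE, works_in_dept5)
result = project(dep5_emps, ["FNAME", "LNAME", "SALARY"])
```

Both styles produce the same relation; the second simply gives the intermediate result a name that can be inspected or reused.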
RELATIONAL ALGEBRA OPERATIONS FROM SET THEORY
The UNION, INTERSECTION, and MINUS Operations
• These are standard mathematical operations on sets
• When these operations are adapted to relational databases, the two
relations on which any of these three operations are applied must
have the same type of tuples;
• This condition has been called union compatibility.
• Two relations R(A1, A2, ..., An) and S(B1, B2, ..., Bn) are said to be
union compatible if they have the same degree n and if dom(Ai) =
dom(Bi) for 1 ≤ i ≤ n
• This means that the two relations have the same number of
attributes, and each corresponding pair of attributes has the same
domain.
UNION
• The result of this operation, denoted by R ∪ S, is a relation
that includes all tuples that are either in R or in S or in
both R and S. Duplicate tuples are eliminated.
For example:
• to retrieve the social security numbers of all employees
who either work in department 5 or directly supervise an
employee who works in department 5, we can use the
UNION operation as follows:
DEP5_EMPSσDNO=5 (EMPLOYEE)
RESULT1∏SSN (DEP5_EMPS)
RESULT2 (SSN) ∏SUPERSSN (DEP5_EMPS)
RESULT RESULT1 U RESULT2
• The relation RESULT1 has the social security numbers
of all employees who work in department 5,
• whereas RESULT2 has the social security numbers of
all employees who directly supervise an employee
who works in department 5.
• The UNION operation produces the tuples that are in
either RESULT1 or RESULT2 or both.
• Note that the SSN value 333445555 appears only
once in the result.
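The UNION example can be sketched with Python sets, which eliminate duplicates automatically. The SSN values below are short made-up strings, not data from the slides; single-attribute relations are modeled as sets of values for brevity.

```python
# Hypothetical sample data; the SSN values are invented for illustration.
EMPLOYEE = [
    {"SSN": "111", "SUPERSSN": "333", "DNO": 5},
    {"SSN": "222", "SUPERSSN": "333", "DNO": 5},
    {"SSN": "333", "SUPERSSN": "888", "DNO": 5},
    {"SSN": "444", "SUPERSSN": "888", "DNO": 4},
]

# DEP5_EMPS: employees who work in department 5.
dep5_emps = [t for t in EMPLOYEE if t["DNO"] == 5]

# RESULT1: SSNs of employees who work in department 5.
result1 = {t["SSN"] for t in dep5_emps}

# RESULT2(SSN): SSNs of their direct supervisors (SUPERSSN renamed to SSN).
result2 = {t["SUPERSSN"] for t in dep5_emps}

# RESULT: the union; "333" is in both inputs but appears only once.
result = result1 | result2
```

Here the employee with SSN "333" both works in department 5 and supervises department 5 employees, so the value occurs in both operands yet only once in the union.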
Intersection:
• The result of this operation, denoted by R ∩ S, is a relation that
includes all tuples that are in both R and S.
Set difference (or MINUS)
• The result of this operation, denoted by R - S, is a relation that
includes all tuples that are in R but not in S.
• Figure 6.4 illustrates the three operations.
• The relations STUDENT and INSTRUCTOR in Figure 6.4a are union
compatible, and their tuples represent the names of students and
instructors, respectively.
• The result of the UNION operation in Figure 6.4b shows the names of
all students and instructors.
• Note that duplicate tuples appear only once in the result. The result
of the INTERSECTION operation (Figure 6.4c) includes only those who
are both students and instructors.
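The three set operations can be tried out with Python sets of tuples. The names below are invented and are not the contents of the figure; the degree check is a simplified stand-in for union compatibility, since it compares only the number of attributes and not their domains.

```python
# Hypothetical (first name, last name) tuples; not the data in the figure.
STUDENT = {("Ann", "Lee"), ("Bob", "Kim"), ("Cara", "Diaz")}
INSTRUCTOR = {("Bob", "Kim"), ("Dan", "Roy")}


def same_degree(r, s):
    """Simplified compatibility check: every tuple has the same degree.

    Real union compatibility also requires matching attribute domains,
    which this sketch does not verify.
    """
    degrees = {len(t) for t in r} | {len(t) for t in s}
    return len(degrees) <= 1


compatible = same_degree(STUDENT, INSTRUCTOR)

everyone = STUDENT | INSTRUCTOR          # UNION
both_roles = STUDENT & INSTRUCTOR        # INTERSECTION
students_only = STUDENT - INSTRUCTOR     # STUDENT MINUS INSTRUCTOR
instructors_only = INSTRUCTOR - STUDENT  # INSTRUCTOR MINUS STUDENT
```

The last two lines give different results, which shows that set difference, unlike union and intersection, is not commutative.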
The Cartesian Product

• The CARTESIAN PRODUCT operation, also known as CROSS PRODUCT
or CROSS JOIN, is denoted by ×
• This operation is used to combine tuples from two relations in a
combinatorial fashion.
• In general, the result of R(A1, A2, ..., An) × S(B1, B2, ..., Bm) is a
relation Q with degree n + m attributes Q(A1, A2, ..., An, B1, B2, ...,
Bm), in that order
• The resulting relation Q has one tuple for each combination of tuples,
one from R and one from S.
• Hence, if R has nR tuples (denoted as |R| = nR), and S has nS tuples,
then R × S will have nR * nS tuples.
– For example, suppose that we want to retrieve a list of names of each female employee's
dependents:
FEMALE_EMPS ← σSEX='F'(EMPLOYEE)
EMPNAMES ← ∏FNAME, LNAME, SSN(FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
ACTUAL_DEPENDENTS ← σSSN=ESSN(EMP_DEPENDENTS)
RESULT ← ∏FNAME, LNAME, DEPENDENT_NAME(ACTUAL_DEPENDENTS)


• In EMP_DEPENDENTS, every tuple from EMPNAMES is combined
with every tuple from DEPENDENT, giving a result that is not very
meaningful
• We want to combine a female employee tuple only with her
particular dependents, namely the DEPENDENT tuples whose ESSN
values match the SSN value of the EMPLOYEE tuple
• The ACTUAL_DEPENDENTS relation accomplishes this.
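The dependents example can be followed step by step in a small Python sketch. The helper `cartesian_product`, the sample tuples, and the attribute name DEPENDENT_NAME are assumptions made for illustration; the sketch also assumes the two relations have no attribute names in common.

```python
from itertools import product

# Hypothetical sample data. EMPNAMES stands for the already selected
# and projected female employees.
EMPNAMES = [
    {"FNAME": "Ann", "LNAME": "Lee", "SSN": "111"},
    {"FNAME": "Eve", "LNAME": "Roy", "SSN": "222"},
]
DEPENDENT = [
    {"ESSN": "111", "DEPENDENT_NAME": "Tom"},
    {"ESSN": "111", "DEPENDENT_NAME": "Mia"},
    {"ESSN": "333", "DEPENDENT_NAME": "Sam"},
]


def cartesian_product(r, s):
    """Sketch of R x S: one combined tuple per pair (one from R, one from S).

    Assumes R and S share no attribute names, so merging the two
    dictionaries does not overwrite any value.
    """
    return [{**t_r, **t_s} for t_r, t_s in product(r, s)]


# EMP_DEPENDENTS: every employee paired with every dependent.
emp_dependents = cartesian_product(EMPNAMES, DEPENDENT)

# ACTUAL_DEPENDENTS: keep only pairs where SSN = ESSN.
actual_dependents = [t for t in emp_dependents if t["SSN"] == t["ESSN"]]

# RESULT: project onto FNAME, LNAME, DEPENDENT_NAME.
result = [
    (t["FNAME"], t["LNAME"], t["DEPENDENT_NAME"]) for t in actual_dependents
]
```

With 2 employee tuples and 3 dependent tuples the product has 2 * 3 = 6 tuples, most of which pair an employee with someone else's dependent; the selection on SSN = ESSN discards those.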
BINARY RELATIONAL OPERATIONS
The JOIN Operation
• The JOIN operation, denoted by ⋈, is used to combine related tuples
from two relations into single tuples
• It allows us to process relationships among relations.
• Example: suppose that we want to retrieve the name of the manager of
each department
• To get the manager's name, we need to combine each department tuple
with the employee tuple whose SSN value matches the MGRSSN value in
the department tuple
• We do this by using the JOIN operation, and then projecting the result
over the necessary attributes, as follows
DEPT_MGR DEPARTMENT Ɣ MGRSSN=SSN EMPLOYEE
RESULT∏DNAME, LNAME, FNAME (DEPT_MGR)
• Note: that MGRSSN is a foreign key and that the
referential integrity constraint plays a role in having
matching tuples in the referenced relation Employee .
• In general, for relations R(A1, A2, ..., An) and S(B1, B2, ..., Bm), the
result of R ⋈<join condition> S is a relation Q with n + m attributes Q(A1,
A2, ..., An, B1, B2, ..., Bm), in that order; Q has one tuple for each
combination of tuples (one from R and one from S) whenever the
combination satisfies the join condition
• In JOIN, only combinations of tuples satisfying the join condition
appear in the result, whereas in the CARTESIAN PRODUCT all
combinations of tuples are included in the result
• This is the main difference between CARTESIAN PRODUCT and JOIN.
• Each tuple combination for which the join condition evaluates to
TRUE is included in the resulting relation Q as a single combined tuple
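The difference between JOIN and CARTESIAN PRODUCT can be seen in a short Python sketch. The helper `join` and the sample tuples are hypothetical; the sketch uses a plain nested loop and assumes the two relations share no attribute names.

```python
# Hypothetical sample data for illustration only.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "333"},
    {"DNAME": "Admin", "DNUMBER": 4, "MGRSSN": "444"},
]
EMPLOYEE = [
    {"FNAME": "Ann", "LNAME": "Lee", "SSN": "333"},
    {"FNAME": "Bob", "LNAME": "Kim", "SSN": "444"},
    {"FNAME": "Cara", "LNAME": "Diaz", "SSN": "555"},
]


def join(r, s, condition):
    """Sketch of a JOIN: combine a tuple from R with a tuple from S
    only when condition(t_r, t_s) evaluates to True.
    """
    return [{**t_r, **t_s} for t_r in r for t_s in s if condition(t_r, t_s)]


def manager_matches(dept, emp):
    # Join condition MGRSSN = SSN
    return dept["MGRSSN"] == emp["SSN"]


# DEPT_MGR: each department combined with its manager's employee tuple.
dept_mgr = join(DEPARTMENT, EMPLOYEE, manager_matches)

# RESULT: project onto DNAME, LNAME, FNAME.
result = [(t["DNAME"], t["LNAME"], t["FNAME"]) for t in dept_mgr]

# The same relation obtained as CARTESIAN PRODUCT followed by SELECT.
all_pairs = [{**d, **e} for d in DEPARTMENT for e in EMPLOYEE]
product_then_select = [t for t in all_pairs if t["MGRSSN"] == t["SSN"]]
```

The product `all_pairs` holds every department and employee combination, while `dept_mgr` holds only the combinations that satisfy the join condition, which is the distinction drawn above.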
