Chapter 3 - Relational Data Model _ 1

Uploaded by

silkytanu
Copyright
© © All Rights Reserved
Chapter 3

Relational Data Model


Contents
• Concept of relations and its characteristics
• Schema-instance distinction
• Integrity Constraints
• Converting the database specification in E/R and extended E/R notation to
the relational schema
• Relational algebra operators: Selection, Projection, Cross product, Types of
Joins, Division
• Tuple relation calculus
• Domain relational calculus
• Introduction to SQL
• Data definition in SQL, table
• Key & Foreign key definition
• Data manipulation in SQL
• Nested queries
• Notion of aggregation
Relational Model
• Relational data model is the primary data model
for commercial data- processing applications.
• A relational database consists of a collection of
tables, each of which is assigned a unique name
• A row in a table represents a relationship among
a set of values.
• Thus, a table is an entity set and a row is an
entity.
• The columns or properties are called attributes.
Relational Model
• For each attribute, there is a set of permitted
values, called the domain of that attribute.
• Same domain can be shared by more than one
attribute.
• Degree is the number of attributes in the
relation/ table.
• Cardinality is the number of tuples or rows in the
relation/table.
• The attribute values are required to be atomic,
i.e. indivisible.
Relational Model
• Let D1, D2, and D3 be domains.
• Any row of the table consists of three data values
v1, v2, v3 where v1 ∈ D1, v2 ∈ D2 and v3 ∈ D3.
• Thus, the table will contain only a subset of the
set of all possible rows.
• Therefore, the table is a subset of D1 × D2 × D3.
• Each attribute of a relation has a unique name.
• NULL Value is a domain value which is a member
of any possible domain.
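The idea that a table is a subset of D1 × D2 × D3 can be sketched in a few lines of Python. The domain values and the `loan` relation below are made up for illustration; they are not taken from the slides.

```python
from itertools import product

# Hypothetical domains D1, D2, D3 (illustrative values only).
D1 = {"L-11", "L-14"}                  # loan numbers
D2 = {"Bhubaneswar Main", "Cuttack"}   # branch names
D3 = {900, 1500}                       # amounts

# Every possible row: the Cartesian product D1 x D2 x D3.
all_possible_rows = set(product(D1, D2, D3))

# A relation instance is any subset of that product.
loan = {("L-11", "Bhubaneswar Main", 900), ("L-14", "Cuttack", 1500)}

degree = 3                      # number of attributes
cardinality = len(loan)         # number of tuples
is_valid_relation = loan <= all_possible_rows
```

Here `loan` has degree 3 and cardinality 2, while the full product contains 2 × 2 × 2 = 8 possible rows.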
Relational Model
• Database Schema is the logical design of the
database.
• If a1, a2 ... an are the attributes, then the
relation schema is R = (a1, a2 ... an).
• Database Instance is the snapshot of the data in
the database at a given instant of time.
• Relation is denoted by lower case names and
Relation Schema is the name beginning with an
uppercase letter.
Relational Database
• Relational database is a database consisting of
multiple relations or tables.
• The information about an enterprise is broken up
into parts, with each relation storing one part of
the information.
• The normalization process deals with how to
design relational schemas.
Relational Data Integrity
• Candidate key is an attribute or set of attributes that
can uniquely identify a row or tuple in a table.
• Let R be a relation with attributes a1, a2 ... an.
• A set of attributes K = (ai, aj ... ak) of R is said to be a candidate key
of R iff the following two properties hold:
1. Uniqueness:
• At any given time, no two distinct tuples or rows of R
have the same combination of values for ai, aj ... ak.
2. Minimality:
• No proper subset of the set (ai, aj ... ak) has the
uniqueness property.
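The two properties can be checked mechanically against one table instance. The sketch below is only illustrative: the helper names `is_unique` and `is_candidate_key` and the sample rows are assumptions, and a real candidate key must hold for every legal instance of the schema, not just for the rows tested here.

```python
from itertools import combinations

def is_unique(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """Uniqueness holds for attrs, and for no proper non-empty subset of attrs."""
    if not is_unique(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False  # a smaller key exists, so attrs is not minimal
    return True

# Made-up sample rows for a Payment-like table.
payment = [
    {"loan_no": "L-11", "payment_no": 1, "payment_amt": 100},
    {"loan_no": "L-11", "payment_no": 2, "payment_amt": 100},
    {"loan_no": "L-14", "payment_no": 1, "payment_amt": 250},
]

minimal = is_candidate_key(payment, ("loan_no", "payment_no"))
not_minimal = is_candidate_key(payment, ("loan_no", "payment_no", "payment_amt"))
```

On this sample, (loan_no, payment_no) passes both tests, while the three-attribute set is unique but not minimal, so it is only a super key.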
Relational Data Integrity
The major types of integrity constraints are:
1. Domain Constraints
• All the values that appear in a column of a
relation must be taken from the same domain.
• This constraint can be applied by specifying a
particular data type to a column.
Relational Data Integrity
2. Entity Integrity Constraints
• The entity integrity rule is designed to assure
that every relation has a primary key, and that
the data values for that primary key are all valid.
• By convention, the primary key of a relation is often
listed as its first column.
• Entity integrity guarantees that every primary
key attribute is NOT NULL.
• Primary key performs the unique identification
function in a relational model.
Relational Data Integrity
3. Referential Integrity Constraints
• In relational data model, associations between
tables are defined by using foreign keys
• A referential integrity constraint is a rule that
maintains consistency among the rows of two
relations
• The rule states that if there is a foreign key in one
relation, either each foreign key value must
match a primary key value in the other table or
else the foreign key value must be NULL.
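The rule above can be sketched as a simple check over two tables held as lists of dictionaries. The function name `fk_violations` and the sample data are hypothetical; a real DBMS enforces this through declared foreign key constraints.

```python
def fk_violations(child_rows, fk, parent_rows, pk):
    """Return the child rows whose foreign key is neither NULL (None)
    nor equal to some primary key value in the parent table."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows
            if row[fk] is not None and row[fk] not in parent_keys]

# Made-up sample data: students and the books they have borrowed.
stud = [{"roll": 1, "name": "Asha"}, {"roll": 2, "name": "Ravi"}]
library = [
    {"bid": "B1", "borrowing_roll": 1},     # matches roll 1
    {"bid": "B2", "borrowing_roll": None},  # NULL is allowed
    {"bid": "B3", "borrowing_roll": 9},     # no such student
]

bad_rows = fk_violations(library, "borrowing_roll", stud, "roll")
```

Only the row for book B3 is reported, because 9 matches no primary key value and is not NULL.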
Relational Data Integrity
3. Referential Integrity
• The linking between the foreign key and primary
key allows a set of relations to form an integrated
database.
4. Operational Constraints
• These are the constraints enforced in the
database by the business rules or real world
limitations
Database Languages
DDL (Data Definition Language)
• DDL is used to define the conceptual schema.
• The definition includes the information of all the
entity sets and their associated attributes as well
as the relationships between the entity sets.
• The data values stored in the database must
satisfy certain consistency constraints.
• The database systems check these constraints
every time the database is updated.
Database Languages
• The output of the DDL is placed in the Data
Dictionary which contains the metadata (data
about data).
• The data dictionary is considered to be a special
type of table, which can only be accessed and
updated by the database system itself
• The database system consults the data
dictionary, before querying or modifying the
actual data, for the validation purpose
• DDL commands: CREATE, ALTER, DROP, RENAME & TRUNCATE
Database Languages
DML (Data Manipulation Language)
• DML is used to manipulate data in the database
• A query is a statement in the DML that requests the
retrieval of data from the database
• DML commands: SELECT, INSERT, UPDATE & DELETE
DCL (Data Control Languages)
• DCL is used to change the permissions on database
structures
• DCL commands: GRANT & REVOKE
TCL (Transaction Control Language)
• TCL allows permanently recording the changes made to
the rows stored in a table or undoing such changes
CODD’s Rules
• Codd’s rules are a set of 12 rules proposed by E.
F. Codd designed to define what is required from
a database management system in order for it to
be considered relational, i.e. RDBMS.
• Any database that satisfies even six rules may be
categorized as RDBMS.
Rule0
• A relational system should be able to manage
databases, entirely through its relational
capabilities
CODD’s Rules
Rule1: Information representation
• The entire information is explicitly and logically
represented by the data values of the tables in
the relational data model.
Rule2: Guaranteed access
• In the relational model, each cell, i.e. the
intersection of a row and a column, holds
one and only one data value (or a NULL value).
• Each value of data must be addressable via the
combination of a table name, primary key value
and the column name.
CODD’s Rules
Rule3: Systematic treatment of NULL values
• NULL values are supported in fully relational DBMS for
representing missing information and inapplicable
information in a systematic way independent of data
type
Rule4: Database description rule
• The database description is represented at the logical
level in the same way as ordinary data, so that
authorized users can apply the same relational
language to its interrogation as they apply to the
regular data.
• This means, the RDBMS should have a data dictionary.
CODD’s Rules
Rule5: Comprehensive data sub-language
• The RDBMS must support at least one relational
language with a well-defined syntax, such as SQL.
• This language should support Data Definition, View
Definition, Data Manipulation, Integrity
Constraints, and Authorization.
Rule6: View updating
• All views that are theoretically updatable are also
updatable by the system.
• Similarly, the views which are theoretically non-
updatable are also non-updatable by the
database system.
CODD’s Rules
Rule7: High-level insert, update and delete
• An RDBMS should not only support retrieval of data as
relational sets, but should also support insertion,
update and deletion of data as a relational set
Rule8: Physical data independence
• Application programs and terminal activities are not
disturbed if any changes are made either in storage
representations or access methods
Rule9: Logical data independence
• User programs and users should not be affected by
changes to the structure of the tables, such as the
addition of extra columns
CODD’s Rules
Rule10: Distribution independence
• A relational DBMS has distribution independence.
• The RDBMS may spread across more than one
system and across several networks.
• However to the end-user, the tables should
appear no different to those that are local
Rule11: Integrity rule
• Integrity rules must be supported by the
relational data sub-language.
• Entity integrity: no component of a primary key
may have a NULL value.
CODD’s Rules
• Referential integrity: for each distinct non-null
foreign key value in the database, there should
be a matching primary key value from the same
domain
Rule12: Data integrity cannot be subverted
• If low level access is allowed to a system it should
not be able to subvert or bypass integrity rules to
change the data.
Conversion of ER to Relational Model
• A database that conforms to an ER diagram schema can be
represented by a collection of relational schemas.
• Both the ER model and Relational data model are abstract, logical
representations of real-world enterprises.
1. Representation of Strong Entity sets
• A strong entity set reduces to a schema with the same attributes.
• The primary key of the entity set serves as the primary key of the
resulting schema.
Loan = (loan_no, amount)
Conversion of ER to Relational Model
2. Representation of Weak Entity sets
• A weak entity set becomes a table that includes a
column for the primary key of the identifying strong
entity set.
• The primary key is formed by combining the
foreign key and the partial key.
Loan = (loan_no, amount)
Payment = (loan_no, payment_no, payment_date,
payment_amt)
Conversion of ER to Relational Model
3. Representation of Relationship sets
3.a. Binary M:N
• Union of the primary key attributes from the participating
entity sets becomes the primary key of the relationship
• Customer = (cid, cname, address)
• Loan = (loan_no, amount)
• Borrow = (cid, loan_no)
• If borrow_date is mentioned as descriptive attribute, then
Borrow = (cid, loan_no, borrow_date)
Conversion of ER to Relational Model
3.b. Binary M:1/1:M
• Construct two tables, one for the entity set at 1 side and another
for entity set at M side, add the descriptive attributes and a
reference of the primary key of 1 side to the entity set at M side.
• Stud = (roll, name, branch)
• Library = (bid, bname, price, roll)
• The foreign key can be represented by specifying the name as:
Library = (bid, bname, price, borrowing_roll)
• If borrow_date is the descriptive attribute, then
Library = (bid, bname, price, borrowing_roll, borrow_date)
Conversion of ER to Relational Model
3.c. Binary 1:1
• Construct two tables.
• In this case, either side can be chosen to act as the many
side.
• That is, extra attributes can be added to either of the
tables corresponding to the two entity sets, but not at the
same time.
• Employee = (eid, ename, address)
• Department = (did, dname, location)
Conversion of ER to Relational Model
4. Representation of Recursive Relationship sets
• Two tables will be constructed; one for entity set and one for
relationship set.
• Employee = (eid, ename, address)
• Works_for = (mgrid, workerid)
• This ER diagram can also be represented by using a single
relation schema which contains a foreign key for each tuple in
the original entity set
• Employee = (eid, ename, address, manager_id)
Conversion of ER to Relational Model
5. Representation of Composite attributes
• The composite attributes are flattened out by
creating a separate attribute for each of its parts.
• Customer = (cid, name, address_street,
address_city, address_pin)
Conversion of ER to Relational Model
6. Representation of Multi-valued attributes
• A multi-valued attribute M of an entity set E is
represented by a separate schema E_M as
E_M(primary key of E,M)
• Employee = (eid, name, address)
• Employee_phone_no = (eid, phone_no)
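The split into Employee and Employee_phone_no can be illustrated with a short sketch. The nested input records and their values are made up; only the two target schemas come from the slide.

```python
# Hypothetical source records in which phone_nos is a multi-valued attribute.
employees = [
    {"eid": 1, "name": "Asha", "address": "Pune",
     "phone_nos": ["555-0100", "555-0101"]},
    {"eid": 2, "name": "Ravi", "address": "Cuttack", "phone_nos": []},
]

# Employee = (eid, name, address): one row per entity, atomic values only.
employee = [{k: e[k] for k in ("eid", "name", "address")} for e in employees]

# Employee_phone_no = (eid, phone_no): one row per (entity, value) pair.
employee_phone_no = [
    {"eid": e["eid"], "phone_no": p}
    for e in employees
    for p in e["phone_nos"]
]
```

An employee with no phone number simply contributes no rows to Employee_phone_no, so no NULLs are needed.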
Conversion of ER to Relational Model
7. Representation of Generalization/Specialization
• one schema will be constructed for the generalized entity
set and the schemas for each of the specialized entity
sets.
• Person = (person_id, name, address)
• Employee = (person_id, salary)
• Customer = (person_id, credit_rating)
Conversion of ER to Relational Model
• When the generalization/specialization is
disjoint and total, schemas can be constructed only
for the specialized entity sets
• Employee = (employee_id, name, address,
salary)
• Customer = (customer_id, name, address,
credit_rating)
Conversion of ER to Relational Model
8. Representation of Aggregation
• Create a schema containing the primary key of
the aggregated relationship, primary key of the
associated entity set and descriptive attributes.
• Employee = (eid, name, address)
• Branch = (bid, bname, asset)
• Job = (jobid, position, responsibility)
• Works_on = (eid, bid, jobid)
• Manager = (mid, mgrname)
• Manages = (eid, bid, jobid, mid)
Conversion of ER to Relational Model
ER Diagram
• Consider a CONFERENCE_REVIEW database in which researchers submit their research
papers for consideration. Reviews by reviewers are recorded for use in the paper selection
process. The database system caters primarily to reviewers who record answers to
evaluation questions for each paper they review and make recommendations regarding
whether to accept or reject the paper. The data requirements are summarized as follows:
• ■ Authors of papers are uniquely identified by e-mail id. First and last names are also
recorded.
• ■ Each paper is assigned a unique identifier by the system and is described by a title,
abstract, and the name of the electronic file containing the paper.
• ■ A paper may have multiple authors, but one of the authors is designated as the contact
author.
• ■ Reviewers of papers are uniquely identified by e-mail address. Each reviewer’s first
name, last name, phone number, affiliation, and topics of interest are also recorded.
• ■ Each paper is assigned between two and four reviewers. A reviewer rates each paper
assigned to him or her on a scale of 1 to 10 in four categories: technical merit, readability,
originality, and relevance to the conference. Finally, each reviewer provides an overall
recommendation regarding each paper.
• ■ Each review contains two types of written comments: one to be seen by the review
committee only and the other as feedback to the author(s).
• Design an entity–relationship diagram for the CONFERENCE_REVIEW database.
ER Diagram
Conversion to Relation
• Paper (id, title, abstract, name);
• Author (email_id, first_name, last_name);
• Reviewer (email, first_name, last_name, phone,
interests);
• Review (id, email, technical_merit, readability,
originality, relevance, to_author, to_committee,
recommendation);
• Submit (id, email_id);
Question
• A company is organized into a number of departments. Each department has a unique
name and location. An employee can manage only one department at a time, and a
department can be managed by only one employee. The start date of the manager is
recorded. A department may have several locations. A department controls a number of
projects. Each project has a unique name, a number and a single location. Each
employee’s name, employee number (which is unique), address, salary, sex and birth
date are stored. An employee is assigned to only one department, but may work on
several projects. The number of hours per week an employee works on each project is recorded.
An employee may have dependent(s). Employees’ dependents are tracked for health
insurance purposes. A dependent’s record consists of the dependent’s name (not unique),
birth date and relationship to the employee. Every employee can be classified as an officer or a secretary,
but not both. (Note: All relationships mentioned above are with employees, not
explicitly with officers or secretaries.)
• (a) Construct an ER Diagram for the above problem description. Clearly mention all
assumptions made by you in imposing constraints.
• (b) Map the ER diagram into relations and specify the primary keys and foreign keys of
each relation.
Question
• The Flight database stores detail about an airline’s fleet, flights
and seat bookings as per the following: The airline has one or
more airplanes. An airplane is associated with a model number,
a unique registration number and the capacity to take one or
more passengers. An airplane flight has a unique flight number,
a departure airport, a destination airport, a departure date and
time and an arrival date and time. Each flight is carried out by a
single airplane. A passenger has given names, a surname and a
unique email address. A passenger can book a seat on a flight. It
should be noted that a pilot (who is an employee) can fly
various airplanes, but is allotted to only one flight.
• (a) Draw the ER diagram of the above problem description.
Make necessary assumptions.
• (b) Map the ER diagram into relations and specify the primary
keys and foreign keys of each relation.
MCQ
• Which one of the following is used to represent
the supporting many-one relationships of a weak
entity set in an entity-relationship diagram?
MCQ
• Which of the following is NOT a super key in a
relational schema with attributes V, W, X, Y, Z and
primary key VY?
MCQ
• Consider a relation table with a single record for each registered student with
the following attributes.
1. Registration Num: Unique registration number of each registered student.
2. UID: Unique identity number, unique at the national level for each citizen.
3. Bank Account Num: Unique account number at the bank. A student can
have multiple accounts or joint accounts. This attribute stores the primary
account number.
4. Name: Name of the student
5. Hostel Room: Room number of the hostel.
• Which of the following options is INCORRECT?
Query Language
• Languages in which a user requests information
from the database are of two kinds:
-Procedural language
-Nonprocedural language
• The categories of different languages are:
1. SQL
2. Relational Algebra
3. Relational Calculus
-Tuple Relational Calculus
-Domain Relational Calculus
Relational Algebra
• Relational algebra is a procedural language for
manipulating relations.
• That is, these operations use one or two existing
relations to create a new relation.
• Fundamental operators:
- Unary: SELECT, PROJECT, RENAME
- Binary: UNION, SET DIFFERENCE, CARTESIAN
PRODUCT
• Secondary operators:
-INTERSECTION, NATURAL JOIN, DIVISION, and ASSIGNMENT
SELECT Operator(σ)
• SELECT operation is used to create a relation from
another relation by selecting only those tuples or
rows from the original relation that satisfy a
specified condition.
• It is denoted by sigma (σ) symbol.
• The predicate appears as a subscript to σ.
• The argument relation is in parenthesis after the σ.
• The result is a relation that has the same attributes
as the relation specified in <relation-name>.
• The general syntax of select operator is:
• σ <selection-condition> (<relation name>)
SELECT Operator(σ)
• The operators used in selection predicate may
be: =, <, <=, >, >=.
• Different predicates can be combined into a
larger predicate by using the connectors like:
AND, OR, NOT.
• Query: Find the details of the loans taken from
’Bhubaneswar Main’ branch.
σ branch_name=‘Bhubaneswar Main’ (Loan)
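A minimal sketch of the SELECT operator, modelling a relation as a list of dictionaries. The function name `select` and the sample `loan` rows (including the `branch_name` column) are assumptions made for illustration.

```python
def select(relation, predicate):
    """sigma: keep only the rows that satisfy the predicate.
    The result has the same attributes as the input relation."""
    return [row for row in relation if predicate(row)]

# Made-up sample rows for a Loan relation with a branch_name attribute.
loan = [
    {"loan_no": "L-11", "branch_name": "Bhubaneswar Main", "amount": 900},
    {"loan_no": "L-14", "branch_name": "Cuttack", "amount": 1500},
    {"loan_no": "L-15", "branch_name": "Bhubaneswar Main", "amount": 2000},
]

# sigma branch_name='Bhubaneswar Main' (Loan)
main_branch_loans = select(
    loan, lambda r: r["branch_name"] == "Bhubaneswar Main"
)

# Predicates can be combined with AND / OR / NOT.
large_main_loans = select(
    loan,
    lambda r: r["branch_name"] == "Bhubaneswar Main" and r["amount"] > 1000,
)
```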
Project Operator (π)
• PROJECT operation can be thought of as
eliminating unwanted columns.
• It is denoted by the pi (π) symbol.
• The attributes that should appear in the
resultant relation are listed as a subscript to π.
• The argument relation follows in parenthesis.
• The general syntax of project operator is:
π <attribute-list> (<relation name>)
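A minimal sketch of the PROJECT operator on a relation held as a list of dictionaries. Because the result is a relation, duplicate rows are removed. The function name `project` and the sample data are illustrative assumptions.

```python
def project(relation, attrs):
    """pi: keep only the listed attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:          # relations contain no duplicate tuples
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

# Made-up sample rows.
loan = [
    {"loan_no": "L-11", "branch_name": "Bhubaneswar Main", "amount": 900},
    {"loan_no": "L-14", "branch_name": "Cuttack", "amount": 1500},
    {"loan_no": "L-15", "branch_name": "Bhubaneswar Main", "amount": 2000},
]

# pi branch_name (Loan)
branches = project(loan, ["branch_name"])
```

Three input rows give only two output rows, since 'Bhubaneswar Main' appears twice in the projected column.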
Composition of Operators
Relational algebra operators can be composed together
into a relational algebra expression to answer the complex
queries
Rename Operator (ρ)
• The results of relational algebra expressions do
not have a name that can be used to refer to them.
• It is useful to be able to give them names
• The rename operator is used for this purpose.
• It is denoted by rho (ρ) symbol.
• The general syntax of rename operator is:
ρ x (E), where E is a relational-algebra expression
and x is the name given to its result.
Rename Operator (ρ)
• The form of the rename operation for
renaming the attributes is:
ρ(b1,b2,b3) (E)
• For example, the attributes of Customer
(cust_name, cust_street, cust_city) can be
renamed as:
ρ(name, street, city) (Customer)
Union Compatibility
• To perform the set operations such as UNION,
DIFFERENCE and INTERSECTION, the relations
need to be union compatible for the result to be
a valid relation.
• Two relations R1(a1,a2,... an) and R2(b1,b2,...
bm) are union compatible iff:
* n = m, i.e. both relations have same arity
* dom(ai ) = dom(bi ) for 1 <= i <= n
UNION Operator(∪)
• The union operation is used to combine data
from two relations.
• It is denoted by the union (∪) symbol.
• The union of two relations R1(a1,a2,... an) and
R2(b1,b2,... bn) is a relation R3(c1,c2,... cn) such
that:
dom(ci ) = dom(ai ) ∪ dom(bi ), 1 <= i <= n
• R1 ∪ R2 is a relation that includes all tuples that
are either present in R1 or R2 or in both, without
duplicate tuples
SET DIFFERENCE Operator(-)
• The difference operation is used to identify the
rows that are in one relation and not in another.
• It is denoted by the (-) symbol.
• The difference of two relations R1(a1,a2,... an)
and R2(b1,b2,... bn) is a relation R3 (c1,c2,... cn)
such that:
dom(ci ) = dom(ai ), 1 <= i <= n
• R1 - R2 is a relation that includes all tuples that
are in R1, but not in R2.
Cartesian Product Operator(X)
• The Cartesian product of two relations R1(a1,a2,... an)
with cardinality i and R2(b1,b2,... bm) with cardinality j
is a relation R3 with
- degree k = n + m,
- cardinality i*j
- attributes (a1,a2,... an, b1,b2,... bm)
• R1 X R2 is a relation that includes all the possible
combinations of tuples from R1 and R2.
• The Cartesian product is used to combine information
from any two relations.
• It is not a useful operation by itself, but is used in
combination with other operations.
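The degree and cardinality rules for the Cartesian product can be checked with a short sketch. The two relations below are made-up sets of tuples.

```python
from itertools import product

# R1 has degree n = 2 and cardinality i = 2.
r1 = [(1, "a"), (2, "b")]
# R2 has degree m = 1 and cardinality j = 3.
r2 = [("x",), ("y",), ("z",)]

# R1 x R2: every tuple of R1 concatenated with every tuple of R2.
r3 = [t1 + t2 for t1, t2 in product(r1, r2)]

degree = len(r3[0])      # n + m
cardinality = len(r3)    # i * j
```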
Intersection Operator(∩)
• The intersection operation is used to identify the
rows that are common to two relations.
• It is denoted by the (∩) symbol.
• The intersection of two relations R1(a1,a2,... an)
and R2(b1,b2,... bn) is a relation R3 (c1,c2,... cn)
such that:
dom(ci ) = dom(ai ) ∩ dom(bi ), 1 <= i <= n
• R1 ∩ R2 is a relation that includes all tuples that are
present in both R1 and R2.
• The intersection operation can be rewritten by a
pair of set difference operations as R ∩ S = R - (R - S)
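The three set operations, and the identity R ∩ S = R - (R - S), can be tried out with Python sets of tuples. The relation names and rows are made up; both relations are union compatible (same arity, matching domains).

```python
# Two union-compatible relations, modelled as sets of tuples
# so that duplicate tuples vanish automatically.
depositor = {("Asha", "Pune"), ("Ravi", "Cuttack")}
borrower = {("Ravi", "Cuttack"), ("Meena", "Puri")}

union = depositor | borrower          # tuples in either relation
difference = depositor - borrower     # tuples in depositor only
intersection = depositor & borrower   # tuples in both relations

# Intersection expressed with two set differences: R - (R - S).
via_difference = depositor - (depositor - borrower)
```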
JOIN Operator
• The join is a binary operation that is used to
combine certain selections and a Cartesian
product into one operation.
• It is denoted by the join (⋈) symbol.
• The join operation forms a Cartesian product of
its two arguments, performs a selection forcing
equality on the attributes that appear in both
relations, and removes the duplicate attributes
Division Operator(÷)
• The division operation creates a new relation by selecting the rows
in one relation that match every row in another relation.
• The division operation requires that we look at an entire relation
at once.
• It is denoted by division (÷) symbol.
• Let A, B and C be three relations such that B ÷ C gives A as the
result.
• This operation is possible iff:
-The columns of C are a subset of the columns of B.
- The columns of A are all and only those columns of B that are
not columns of C
- A row is placed in A if and only if it is paired in B
with every row of C
• The division operation is the reverse of the Cartesian product: (A × C) ÷ C = A when C is not empty
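A minimal sketch of division for the common two-column case, where B holds (x, y) pairs and C holds y values. The function name `divide` and the sample data are assumptions for illustration.

```python
def divide(b, c):
    """B / C: the x values that appear in B together with every y in C.
    b is a set of (x, y) pairs, c is a set of y values."""
    xs = {x for x, _ in b}
    return {x for x in xs if all((x, y) in b for y in c)}

# Made-up data: (sid, bid) reservations and the set of all boat ids.
reserves = {(22, 101), (22, 102), (31, 101)}
boats = {101, 102}

# Sailors who have reserved every boat.
reserved_all = divide(reserves, boats)

# Division undoes a Cartesian product when the divisor is not empty.
a = {1, 2}
a_times_c = {(x, y) for x in a for y in boats}
round_trip = divide(a_times_c, boats)
```

Sailor 22 is returned because 22 is paired with both 101 and 102, while sailor 31 is missing boat 102.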
Assignment Operator( ←)
• In relational algebra, the assignment operator
gives a name to a relation.
• It is denoted by (←) symbol.
• Assignment must always be made to a temporary
relation variable.
• The result of the expression to the right of the ← symbol is
assigned to the relation variable on the left of the
← symbol.
Generalized Projection
• The generalized-projection operation extends the
projection operation by allowing arithmetic
functions to be used in the projection list.
• The general form of generalized-projection is:
π F1,F2,…,Fn (R)
where each Fi is an arithmetic expression over constants and attributes of R
Aggregate Functions(g)
• Aggregate functions take a collection of values
and return a single value as a result. NULL value
will not participate in the aggregate functions.
• The general form of aggregate function is:
grouping_attribute g aggregate_functions (R)
Works = (emp_id, ename, salary, branch_name)
Query: Find the total sum of salaries of all the
employees
g SUM(salary)(Works)
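The aggregate query can be sketched in Python, with and without a grouping attribute. The sample rows of Works are made up, and None stands in for NULL, which is skipped as described above.

```python
from collections import defaultdict

# Made-up sample rows for Works = (emp_id, ename, salary, branch_name).
works = [
    {"emp_id": 1, "ename": "Asha", "salary": 1000, "branch_name": "A"},
    {"emp_id": 2, "ename": "Ravi", "salary": 2000, "branch_name": "A"},
    {"emp_id": 3, "ename": "Meena", "salary": None, "branch_name": "B"},
    {"emp_id": 4, "ename": "John", "salary": 1500, "branch_name": "B"},
]

# g SUM(salary) (Works): NULL salaries do not participate.
total_salary = sum(r["salary"] for r in works if r["salary"] is not None)

# branch_name g SUM(salary) (Works): one result row per group.
salary_by_branch = defaultdict(int)
for r in works:
    if r["salary"] is not None:
        salary_by_branch[r["branch_name"]] += r["salary"]
```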
Join
• The join operation is used to connect data across
relations.
• Tables are joined on columns that have the same data
type in the tables.
• Join operation joins two relations by merging those
tuples from two relations that satisfy a given condition.
• The condition is defined on attributes belonging to
relations to be joined.
• Different categories of join are:
- Inner Join
- Outer Join
Inner Join
• In the inner join, tuples that have no matching tuple in the
other relation do not appear in the result.
• Tuples with NULL values in the join attributes are
also eliminated.
• The different types of inner join are:
- Theta Join
- Equi Join
- Natural Join
Theta Join
• The theta join is a join with a specified condition
involving a column from each relation.
• This condition specifies that the two columns should
be compared in some way.
• The comparison operator can be any of the six: <, <=,
>, >=, = and ≠
• Theta join is denoted by the (⋈θ) symbol.
• The general form of theta join is:
R ⋈θ S = π all (σ θ (R × S))
- Degree (Result) = Degree (R) + Degree (S)
- Cardinality (Result) <= Cardinality(R) × Cardinality(S)
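A theta join is a selection over a Cartesian product, which the sketch below follows literally. The function name `theta_join` and the sample relations are illustrative, and the two relations are assumed to have no attribute names in common.

```python
def theta_join(r, s, theta):
    """Pair every row of r with every row of s and keep the pairs
    for which the condition theta(x, y) holds."""
    return [{**x, **y} for x in r for y in s if theta(x, y)]

# Made-up relations with disjoint attribute names.
r = [{"a": 1}, {"a": 5}]
s = [{"b": 3}, {"b": 4}]

# R join(a < b) S
result = theta_join(r, s, lambda x, y: x["a"] < y["b"])

degree = len(result[0])     # Degree(R) + Degree(S)
cardinality = len(result)   # at most Cardinality(R) * Cardinality(S)
```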
Equi Join
• The equi join is the theta join based on equality
of specified columns.
• That means the equi join is the special type of
theta join where the comparison operator is =.
• The general form of equi join is:
R ⋈= S = π all (σ = (R × S))
- Degree (Result) = Degree (R) + Degree (S)
- Cardinality (Result) <= Cardinality(R) × Cardinality(S)
Natural Join
• To perform natural join on two relations, they should
contain at least one common attribute.
• It is just like the equi join with the elimination of the
duplicate copies of the common attributes.
• The natural join is denoted by the (⋈) symbol
• The general form of natural join is:
R ⋈ S = π all attributes - common attributes (σ = (R × S))
- Degree (Result) = Degree (R) + Degree (S) - Degree (R ∩ S),
where Degree (R ∩ S) is the number of common attributes
- Cardinality (Result) <= Cardinality(R) × Cardinality(S)
• The general form of the natural join can also be
represented as:
R ⋈ S = π all (R ⋈= S), where the projection keeps one copy of each common attribute
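A minimal sketch of the natural join on relations held as lists of dictionaries. The function name `natural_join` and the sample data are assumptions; the sketch also assumes both inputs are non-empty so that the attribute names can be read from the first row.

```python
def natural_join(r, s):
    """Join rows that agree on all common attributes,
    keeping a single copy of each common attribute."""
    common = [a for a in r[0] if a in s[0]]
    return [{**x, **y} for x in r for y in s
            if all(x[a] == y[a] for a in common)]

# Made-up sample rows sharing the attribute cid.
customer = [{"cid": 1, "cname": "Asha"}, {"cid": 2, "cname": "Ravi"}]
borrow = [{"cid": 1, "loan_no": "L-11"}, {"cid": 1, "loan_no": "L-14"}]

joined = natural_join(customer, borrow)

# Degree(R) + Degree(S) - number of common attributes = 2 + 2 - 1
degree = len(joined[0])
```

Customer 2 has no matching row in `borrow`, so it does not appear in the result.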
Outer Join
• It is an extension of the natural join operation to deal
with the missing information.
• The outer join consists of two steps:
- First, a natural join is executed
- Then if any record in one relation does not match a
record from the other relation in the natural join, that
unmatched record is added to the join relation, and the
additional columns are filled with NULLs
• The different types of outer join are:
- Left Outer Join
- Right Outer Join
- Full Outer Join
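The two steps of an outer join can be sketched for the left outer case. The function name `left_outer_join`, its parameters and the sample data are hypothetical, and None plays the role of NULL.

```python
def left_outer_join(r, s, on, s_attrs):
    """Join r and s on the attributes in `on`; rows of r with no match
    are kept and padded with None for the remaining attributes of s."""
    result = []
    for x in r:
        matches = [y for y in s if all(x[a] == y[a] for a in on)]
        if matches:
            # Step 1: the ordinary join rows.
            result.extend({**x, **y} for y in matches)
        else:
            # Step 2: keep the unmatched row, filling s's columns with NULLs.
            padding = {a: None for a in s_attrs if a not in on}
            result.append({**x, **padding})
    return result

# Made-up sample rows.
employee = [{"eid": 1, "ename": "Asha"}, {"eid": 2, "ename": "Ravi"}]
phone = [{"eid": 1, "phone_no": "555-0100"}]

rows = left_outer_join(employee, phone, on=["eid"], s_attrs=["eid", "phone_no"])
```

A right outer join is the same operation with the roles of the two relations swapped, and a full outer join keeps the unmatched rows of both sides.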
Left Outer Join
• The left outer join preserves all tuples in the left
relation.
• The left outer join is denoted by the symbol ⟕.
• All information from the left relation is present in
the result of the left outer join.
Right Outer Join
• The right outer join preserves all tuples in right
relation.
• The right outer join is denoted by the symbol ⟖.
• All information from the right relation is present
in the result of the right outer join.
Full Outer Join
• The full outer join preserves all tuples in both
relations.
• The full outer join is denoted by the symbol ⟗.
• All information from both the relations is present
in the result of the full outer join.
Self Join
• The self join is similar to the theta join.
• It joins a relation to itself by a condition.
• The self join can be viewed as a join of two
copies of the same relation.
• The general form of self join is:
R ⋈θ R = π all (σ θ (R × R))
• Thus, the self join creates two aliases or copies of
the same relation; then performs the theta join
by a condition based on the attributes of these
two copies.
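A self join can be sketched by iterating over the same relation twice under two aliases. The rows below are made up and follow the Employee = (eid, ename, address, manager_id) schema used earlier for recursive relationships, with the address column left out for brevity.

```python
# Made-up sample rows; manager_id refers back to eid in the same relation.
employee = [
    {"eid": 1, "ename": "Asha", "manager_id": None},
    {"eid": 2, "ename": "Ravi", "manager_id": 1},
    {"eid": 3, "ename": "Meena", "manager_id": 1},
]

# Two aliases of the same relation: w (worker) and m (manager).
# Join condition: w.manager_id = m.eid
worker_manager = [
    (w["ename"], m["ename"])
    for w in employee
    for m in employee
    if w["manager_id"] == m["eid"]
]
```

Asha has no manager (NULL), so she appears only on the manager side of the pairs.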
Relational Algebra Query
Emp(empNo,name)
Project(projectNo,pName,manager)
Assigned_To(projectNo,empNo)
Query: Find empNo of employees working on
project ‘comp01’
π empNo (σ projectNo=‘comp01’ (Assigned_To))
Query: Find details of employees working on
project ‘comp01’
π empNo,name (σ projectNo=‘comp01’ (Emp ⋈ Assigned_To))
Relational Algebra Query
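• Query: Obtain the details of employees working
on the ‘database’ project
π empNo,name (σ pName=‘database’ (Emp ⋈ Assigned_To ⋈ Project))
• Query: Find the details of employees working
on both the ‘comp01’ and ‘comp02’ projects
π empNo,name (σ projectNo=‘comp01’ (Emp ⋈ Assigned_To)) ∩
π empNo,name (σ projectNo=‘comp02’ (Emp ⋈ Assigned_To))
• A single selection with projectNo=‘comp01’ AND projectNo=‘comp02’
returns nothing, because no one tuple can carry both project numbers.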
Relational Algebra Query

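• Query: Find the empNo who don’t work on
project ‘comp01’
π empNo (Assigned_To) - π empNo (σ projectNo=‘comp01’ (Assigned_To))
• Using π empNo (Emp) as the first operand also returns employees
who are assigned to no project at all.
• The expression π empNo (σ projectNo≠‘comp01’ (Assigned_To)) is not
equivalent: it also returns employees who work on ‘comp01’ and on
some other project.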
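A quick check on made-up data shows how the set-difference form and the ≠ selection behave for this query. The sample assignments are hypothetical; employee 1 works on both 'comp01' and 'comp02'.

```python
# Assigned_To(projectNo, empNo) as a set of tuples (made-up rows).
assigned_to = {("comp01", 1), ("comp02", 1), ("comp02", 2)}

# pi empNo (Assigned_To) - pi empNo (sigma projectNo='comp01' (Assigned_To))
all_assigned = {emp for _, emp in assigned_to}
on_comp01 = {emp for proj, emp in assigned_to if proj == "comp01"}
by_difference = all_assigned - on_comp01

# pi empNo (sigma projectNo != 'comp01' (Assigned_To))
by_not_equal = {emp for proj, emp in assigned_to if proj != "comp01"}
```

The difference form returns only employee 2. The ≠ selection also returns employee 1, who does work on 'comp01', so the two expressions answer different questions.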


Relational Algebra Query
• Sailors(sid, sname, rating, age)
• Boats(bid, bname, color)
• Reserves(sid, bid, day)
Query: Find the names of sailors who’ve reserved
boat 105
π sname (σ bid=105 (Reserves ⋈ Sailors))
OR
π sname (σ bid=105 (Reserves) ⋈ Sailors)
Relational Algebra Query
• Query: Find the names of sailors who’ve
reserved a green boat
π sname (σ color=‘green’ (Boats ⋈ Reserves ⋈ Sailors))
OR
π sname ((σ color=‘green’ (Boats)) ⋈ Reserves ⋈ Sailors)
Query: Find the sailor ids of the sailors who’ve
reserved all boats
π sid,bid (Reserves) ÷ π bid (Boats)
Relational Algebra Query
•Customer(cust_id, cust_name, annual_revenue)
•Truck(truckno, driver_name)
•City(city_name, population)
•Shipment(shipment_no, cust_id, weight, truckno,
destination_city)
Query: Find the list of shipment numbers for
shipments weighing over 20 pounds
π shipment_no (σ weight>20pounds (Shipment))
Relational Algebra Query
Query: Find the names of customers with more
than $10 million in annual revenue
π cust_name (σ annual_revenue>$10million (Customer))
Query: Find the driver of truck 45
π driver_name (σ truckno=45 (Truck))
Query: Find the names of cities which have
received shipments weighing over 100 pounds
π destination_city (σ weight>100pounds (Shipment))
Relational Algebra Query
Query: Find the name and annual revenue of
customers who have sent shipments weighing
over 100 pounds
π cust_name,annual_revenue (σ weight>100pounds (Customer ⋈ Shipment))
Query: Find the truck numbers of trucks which
have carried shipments weighing over 100
pounds
π truckno (σ weight>100pounds (Shipment))
Relational Algebra Query
Query: Find the names of drivers who have
delivered shipments weighing over 100 pounds
π driver_name (σ weight>100pounds (Shipment ⋈ Truck))
Query: List the cities which have received
shipments from customers having over $15
million in annual revenue
π destination_city (σ annual_revenue>$15million (Customer ⋈ Shipment))
Relational Algebra Query
Query: List the customers having over $5 million
in annual revenue who have sent shipments
weighing greater than 1 pound
π cust_name (σ annual_revenue>$5million (Customer) ⋈
σ weight>1pound (Shipment))
Query: List the customers whose shipments have
been delivered by truck driver Ramesh
π cust_name (σ driver_name=‘Ramesh’ (Customer ⋈ Shipment ⋈ Truck))
Relational Algebra Query
Query: Find the customers having over $5 million in annual revenue who have sent shipments weighing less than 1 pound or have sent a shipment to Bhubaneswar
𝞹 cust_name (𝞼 annual_revenue>$5million (Customer) ⋈ 𝞼 weight<1pound OR destination_city=‘Bhubaneswar’ (Shipment))


Query: Find the customers who have sent shipments to every city with population over 500000
big_city(destination_city) <- 𝞹 city_name (𝞼 population>500000 (City))
result <- 𝞹 cust_name,destination_city (Customer ⋈ Shipment) ÷ big_city
Relational Algebra Query
Query: List the drivers who have delivered shipments for customers with annual revenue over $20 million to cities with population over 1 million
𝞹 driver_name (𝞼 annual_revenue>$20million AND population>1million (Customer ⋈ Shipment ⋈ Truck ⋈ destination_city=city_name City))
Query: Find the drivers who have delivered shipments to every city
all_city(destination_city) <- 𝞹 city_name (City)
result <- 𝞹 driver_name,destination_city (Truck ⋈ Shipment) ÷ all_city
Question
Consider the COMPANY database schema:
Employee (ssn, fname, lname, bdate, address,
gender, salary, super_ssn, dno)
Department (dno, dname, mgr_ssn,
mgr_start_date)
Dept_Location (dno, dloc)
Project (pno, pname, plocation, dno)
Works_On (essn, pno, hours)
Dependent (essn, dependent_name, gender, bdate,
relationship)
Question
a. List the names of all employees who have a dependent with the same first name as themselves.
𝞹 lname, fname (Employee ⋈ fname=dependent_name AND ssn=essn (Dependent))

b. Find the names of employees who are supervised by ‘Rakesh’.
rakesh(r_ssn) <- 𝞹 ssn (𝞼 fname=‘Rakesh’ (Employee))
result <- 𝞹 lname, fname (Employee ⋈ super_ssn=r_ssn rakesh)
Question
c. Retrieve the names of all employees who work on every project.
proj_emp(pno, ssn) <- 𝞹 pno, essn (Works_On)
all_proj <- 𝞹 pno (Project)
emp_all_proj <- proj_emp ÷ all_proj
result <- 𝞹 lname, fname (Employee ⋈ emp_all_proj)
d. Retrieve the average salary of all female employees.
g avg(salary) (𝞼 gender=‘F’ (Employee))
(assuming female employees are recorded as gender=‘F’)
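The aggregate in question d can be mimicked with a short sketch: select first, then apply the aggregate function to the selected column. The sample rows and the encoding of female employees as "F" are assumptions.

```python
# Hypothetical Employee rows; only the attributes the query needs are shown.
employee = [
    {"ssn": "111", "gender": "F", "salary": 40000},
    {"ssn": "222", "gender": "M", "salary": 70000},
    {"ssn": "333", "gender": "F", "salary": 50000},
]

# Selection: keep the salaries of female employees ("F" is an assumed encoding).
selected = [e["salary"] for e in employee if e["gender"] == "F"]

# Aggregation: the average is undefined on an empty input, so return None there.
avg_salary = sum(selected) / len(selected) if selected else None
```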
Question
e. List the last names of department managers who have no dependents.
dept_mngr(ssn) <- 𝞹 mgr_ssn (Department)
emp_depnt(ssn) <- 𝞹 essn (Dependent)
result_emp <- dept_mngr - emp_depnt
result <- 𝞹 lname (Employee ⋈ result_emp)
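Question e relies on set difference, which maps directly onto Python sets. The sketch below follows the same four steps; the sample rows are hypothetical.

```python
# Hypothetical instances with only the attributes the query touches.
department = [{"dno": 1, "mgr_ssn": "111"}, {"dno": 2, "mgr_ssn": "222"}]
dependent = [{"essn": "222", "dependent_name": "Anu"}]
employee = [
    {"ssn": "111", "lname": "Rao"},
    {"ssn": "222", "lname": "Das"},
    {"ssn": "333", "lname": "Sen"},
]

dept_mngr = {d["mgr_ssn"] for d in department}   # managers' ssn values
emp_depnt = {d["essn"] for d in dependent}       # employees with a dependent
result_emp = dept_mngr - emp_depnt               # set difference
# Join back to Employee and project the last name.
result = {e["lname"] for e in employee if e["ssn"] in result_emp}
```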
Relational Calculus
• Relational calculus is non-procedural: a query states what is to be retrieved, not how to retrieve it.
• Relational calculus is based on first-order predicate calculus, a formal system for reasoning with statements that contain variables and quantifiers.
• The two types of relational calculus are:
-Tuple Relational Calculus (TRC)
-Domain Relational Calculus (DRC)
Tuple Relational Calculus (TRC)
• A tuple variable is a variable that takes on tuples
of a particular relation schema as values.
• A tuple relational calculus query has the form:
{T/ P(T)}
• The result of this query is the set of all tuples t
for which the formula P(T) evaluates to TRUE
with T = t
Example - Sailors (sid, sname, rating, age)
Query: Find all the sailors with a rating above 4
{S/ S ϵ Sailors ᴧ S.rating > 4}
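A TRC query reads much like a set comprehension: a tuple variable ranges over a relation and a condition filters it. The sketch below shows the rating query in that style; the namedtuple Sailor and the sample rows are assumptions for illustration.

```python
from collections import namedtuple

# Hypothetical Sailors instance.
Sailor = namedtuple("Sailor", ["sid", "sname", "rating", "age"])
sailors = {
    Sailor(22, "Dustin", 7, 45.0),
    Sailor(29, "Brutus", 1, 33.0),
    Sailor(58, "Rusty", 10, 35.0),
}

# s plays the role of the tuple variable; the condition is the formula P.
result = {s for s in sailors if s.rating > 4}
```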
Tuple Relational Calculus (TRC)
• Let Rel be a relation name, R and S be tuple variables, a an attribute of R, b an attribute of S, and op an operator in the set {<, <=, >, >=, =, ≠}.
• An Atomic formula is one of the following:
- R ϵ Rel
- R.a op S.b
- R.a op Constant or Constant op R.a
Tuple Relational Calculus (TRC)
• To represent the join and division of relational algebra in relational calculus, we need quantifiers: the existential quantifier for join and the universal quantifier for division.
• A quantifier states for how many tuples a condition must hold.
• The existential quantifier (Ǝ) states that at least one tuple of a particular type exists that satisfies the condition.
• Similarly, the universal quantifier (Ɐ) states that the condition applies to every tuple of some type.
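The two quantifiers correspond closely to Python's built-in any() and all(). The sketch below uses any() for a join-style query and all() for a division-style query; relations are reduced to hypothetical sets of tuples.

```python
# Hypothetical instances: Sailors as (sid, sname), Boats as bid, Reserves as (sid, bid).
sailors = {(22, "Dustin"), (31, "Lubber")}
boats = {101, 102}
reserves = {(22, 101), (22, 102), (31, 101)}

# Existential: there is some reservation of boat 101 by this sailor.
some = {
    name
    for (sid, name) in sailors
    if any(r_sid == sid and r_bid == 101 for (r_sid, r_bid) in reserves)
}

# Universal: for every boat there is some reservation of it by this sailor.
every = {
    name
    for (sid, name) in sailors
    if all(
        any(r_sid == sid and r_bid == b for (r_sid, r_bid) in reserves)
        for b in boats
    )
}
```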
Tuple Relational Calculus (TRC)
• A formula is recursively defined by using the following rules:
- Any atomic formula is a formula
- If p and q are formulae, then ¬p, p ᴧ q, p v q, p ⇒ q are also formulae
- If p is a formula that contains T as a variable, then Ǝ T(p) and Ɐ T(p) are also formulae
• The quantifiers Ǝ and Ɐ are said to bind the tuple variable T.
• A variable is said to be free in a formula if the formula does not contain an occurrence of a quantifier that binds it.
• In a query {T/ P(T)}, T is the only free variable of P, and the output is described through this free variable.
Tuple Relational Calculus (TRC)
Safe Expressions
• Whenever we use universal quantifiers, existential quantifiers, or negation in a calculus expression, we must make sure that the resulting expression makes sense.
• A safe expression in relational calculus is one that is guaranteed to yield a finite number of tuples as its result; otherwise, the expression is called unsafe.
• For example, {T/ ¬(T ϵ Sailors)} is unsafe, because infinitely many tuples are not in Sailors.
• An expression is said to be safe if all values in its result are from the domain of the expression, i.e., the values that appear in the relations it references or as constants in the expression itself.
TRC Query
Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)
Query: Find the names & ages of sailors with a
rating above 4
{T/ ƎS ϵ Sailors (S.rating >4 ᴧ T.sname=S.sname ᴧ
T.age=S.age)}
Query: Find the sailor name, boat id & reservation
day for each reservation
{T/ƎR ϵ Reserves ƎS ϵ Sailors (R.sid = S.sid ᴧ
T.sname=S.sname ᴧ T.bid=R.bid ᴧ T.day=R.day)}
TRC Query
Query: Find the names of sailors who have
reserved boat 111
Query: Find the names of sailors who have
reserved a green boat
Query: Find the names of sailors who have
reserved at least 2 boats.
Query: Find the names of sailors who have
reserved all boats.
Query: Find sailors who have reserved all green
boats
TRC Query
Query: Find the names of sailors who have
reserved boat 111
{T/ƎR ϵ Reserves ƎS ϵ Sailors (R.sid = S.sid ᴧ
R.bid=111 ᴧ T.sname=S.sname)}
Query: Find the names of sailors who have
reserved a green boat
{T/ƎS ϵ Sailors ƎB ϵ Boats ƎR ϵ Reserves(R.sid = S.sid
ᴧ R.bid=B.bid ᴧ B.color=’green’ ᴧ
T.sname=S.sname )}
TRC Query
Query: Find the names of sailors who have
reserved at least 2 boats.
{T/ƎS ϵ Sailors ƎR1 ϵ Reserves ƎR2 ϵ Reserves
(R1.sid= R2.sid ᴧ R1.sid= S.sid ᴧ R1.bid ≠R2.bid ᴧ
T.sname=S.sname)}
Query: Find the names of sailors who have
reserved all boats.
{T/ƎS ϵ Sailors ⱯB ϵ Boats ƎR ϵ Reserves(S.sid=R.sid
ᴧ R.bid =B.bid ᴧ T.sname=S.sname)}
TRC Query
Query: Find sailors who have reserved all green
boats
{S/ S ϵ Sailors ᴧ ⱯB ϵ Boats (B.color=‘green’ ⇒
ƎR ϵ Reserves (S.sid=R.sid ᴧ R.bid=B.bid))}
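The "all green boats" condition is an implication: if a boat is green, then the sailor has reserved it. Writing it as a conjunction instead would require every boat to be green. The sketch below compares the two readings on hypothetical data; implies is a helper introduced only for this example.

```python
# Hypothetical instances: Boats as (bid, color), Reserves as (sid, bid), Sailors as sid.
boats = {(101, "green"), (102, "red"), (103, "green")}
reserves = {(22, 101), (22, 103), (31, 101), (31, 102)}
sailors = {22, 31, 58}

def implies(p, q):
    """Logical implication p => q."""
    return (not p) or q

# Correct reading: every boat that is green has been reserved by s.
result = {
    s
    for s in sailors
    if all(implies(color == "green", (s, bid) in reserves) for (bid, color) in boats)
}

# Conjunction instead of implication: fails as soon as any boat is not green.
wrong = {
    s
    for s in sailors
    if all(color == "green" and (s, bid) in reserves for (bid, color) in boats)
}
```

With a red boat in the instance, the conjunction version returns no sailors at all, while the implication version returns sailor 22.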
Domain Relational Calculus (DRC)
• In tuple relational calculus, the variables range over
the tuples whereas in domain relational calculus, the
variables range over the domains.
• The domain variables are the ones which range over
the underlying domains instead of over the relations.
• The domain relational calculus query has the form:
{<x1, x2, ... xn> | P(x1, x2, ... xn)}
where xi is a domain variable and P(x1, x2,... xn) is
the domain relational calculus formula
• The result of this query is the set of all tuples for
which the formula evaluates to TRUE
Domain Relational Calculus (DRC)
• Let Rel be a relation name, R and S be domain variables, and op an operator in the set {<, <=, >, >=, =, ≠}.
• An Atomic formula is one of the following:
- <x1, x2, ... xn> ϵ Rel
- R op S
- R op Constant or Constant op R
Domain Relational Calculus (DRC)
• A formula is recursively defined by using the following rules:
- Any atomic formula is a formula
- If p and q are formulae, then ¬p, p ᴧ q, p v q, p ⇒ q are also formulae
- If p is a formula that contains T as a domain variable, then Ǝ T(p) and Ɐ T(p) are also formulae
• The quantifiers Ǝ and Ɐ are said to bind the domain variable T.
• A variable is said to be free in a formula if the formula does not contain an occurrence of a quantifier that binds it.
DRC Query
Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)
Query: Find all sailors with a rating above 7
{<Is, N, R, A> / <Is, N, R, A> ϵ Sailors ᴧ R > 7}
Query: Find the names of sailors who reserved
boat 111
{<N> / Ǝ Is, R, A (<Is, N, R, A> ϵ Sailors ᴧ Ǝ <Ir, Br, D> ϵ
Reserves (Ir=Is ᴧ Br=111))}
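In DRC each variable stands for a single attribute value, which corresponds to unpacking a row into its components. The sketch below expresses the boat 111 query in that style; the variable names mirror the slide (Is, N, R, A, Ir, Br, D) in lowercase, and the rows are hypothetical.

```python
# Hypothetical instances as plain tuples, one component per attribute.
sailors = {(22, "Dustin", 7, 45.0), (31, "Lubber", 8, 55.5)}
reserves = {(22, 111, "2024-10-10"), (31, 102, "2024-11-10")}

# n is the only free variable; every other domain variable is bound
# by the loop or by any(), which acts as the existential quantifier.
names = {
    n
    for (i_s, n, r, a) in sailors
    if any(i_r == i_s and b_r == 111 for (i_r, b_r, d) in reserves)
}
```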
DRC Query
• Query: Find the names of sailors who have
reserved a green boat
{<N>/ Ǝ Is, R, A (<Is, N, R, A> ϵ Sailors ᴧ Ǝ <Ir, Br, D> ϵ
Reserves (Ǝ <Bb, Bn, C> ϵ Boats (Ir=Is ᴧ Bb=Br ᴧ
C=‘green’)))}
• Query: Find the names of sailors who have
reserved at least 2 boats
{<N>/ Ǝ Is, R, A (<Is, N, R, A> ϵ Sailors ᴧ Ǝ <Ir1, Br1, D1> ϵ
Reserves (Ǝ <Ir2, Br2, D2> ϵ Reserves (Ir1=Is ᴧ Ir2=Is ᴧ
Br1 ≠ Br2)))}
DRC Query
• Query: Find the names of sailors who have
reserved all boats
{<N>/ Ǝ Is, R, A (<Is, N, R, A> ϵ Sailors ᴧ Ɐ <Bb, Bn, C> ϵ
Boats (Ǝ <Ir, Br, D> ϵ Reserves (Is=Ir ᴧ Bb=Br)))}
• Query: Find sailors who have reserved all green
boats
{<Is,N,R,A>/ <Is,N,R,A> ϵ Sailors ᴧ Ɐ <Bb, Bn, C>
ϵ Boats (C=‘green’ ⇒ Ǝ <Ir, Br, D> ϵ Reserves (Is=Ir
ᴧ Br=Bb))}
