
DBMS Mca Ii Sem Notes

The document provides an overview of the Relational Data Model, including concepts such as relations, tuples, and schemas, as well as constraints like key, entity integrity, and referential integrity. It also covers update operations and their implications for constraint violations, along with an introduction to relational algebra and calculus for querying databases. Key operations such as SELECT and PROJECT are discussed, emphasizing their properties and usage in relational queries.

Uploaded by

dhamoder

UNIT I

Contents

The Relational Data Model and Relational Database Constraints
1. Relational Model Concepts
2. Relational Model Constraints and Relational Database Schemas
3. Update Operations and Dealing with Constraint Violation

The Relational Algebra and Relational Calculus
4. Unary Relational Operations: SELECT and PROJECT
5. Relational Algebra Operations from Set Theory
6. Binary Relational Operations: JOIN and DIVISION
7. Additional Relational Operations

Relational Calculus
8. Tuple Relational Calculus
9. Domain Relational Calculus

1. RELATIONAL MODEL CONCEPTS

 The relational Model of Data is based on the concept of a Relation


 The strength of the relational approach to data management comes from the formal
foundation provided by the theory of relations
 A Relation is a mathematical concept based on the ideas of sets
 The model was first proposed by Dr. E.F. Codd of IBM Research in 1970 in the following paper:
 "A Relational Model for Large Shared Data Banks," Communications of the ACM, June
1970
Informal Definition of Relation
 Informally, a relation looks like a table of values.
 A relation typically contains a set of rows.
 The data elements in each row represent certain facts that correspond to a real-world entity or
relationship
 In the formal model, rows are called tuples
 Each column has a column header that gives an indication of the meaning of the data
items in that column
 In the formal model, the column header is called an attribute name (or just attribute)

 Key of a Relation:
 Each row has a value of a data item (or set of items) that uniquely identifies that row in
the table
 Called the key
 In the STUDENT table, SSN is the key

 Sometimes row-ids or sequential numbers are assigned as keys to identify the rows in a
table
 Called artificial key or surrogate key
Formal Definition - Relation
Schema
 The Schema (or description) of a Relation:
 Denoted by R(A1, A2, ..., An)
 R is the name of the relation
 The attributes of the relation are A1, A2, ..., An
 Example:
CUSTOMER (Cust-id, Cust-name, Address, Phone#)
 CUSTOMER is the relation name
 Defined over the four attributes: Cust-id, Cust-name, Address, Phone#
 Each attribute has a domain or a set of valid values.
 For example, the domain of Cust-id is the set of six-digit numbers.
Tuple
 A tuple is an ordered set of values (enclosed in angled brackets ‘< … >’)
 Each value is derived from an appropriate domain.
 A row in the CUSTOMER relation is a 4-tuple, because it consists of four values, for example:
 <632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000">
 A relation is a set of such tuples (rows)
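The schema, tuple, and relation definitions above can be sketched in Python. This is only an illustration: the attribute names are adapted (cust_id instead of Cust-id, phone instead of Phone#) because Python identifiers cannot contain '-' or '#', and the use of namedtuple and set is an assumption of this sketch, not part of the relational model.

```python
from collections import namedtuple

# Schema CUSTOMER(Cust-id, Cust-name, Address, Phone#), with adapted attribute names.
Customer = namedtuple("Customer", ["cust_id", "cust_name", "address", "phone"])

# One 4-tuple, using the values from the example above.
t = Customer(632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000")

# A relation state is a set of tuples, so adding an identical tuple changes nothing.
customer = {t}
customer.add(Customer(632895, "John Smith", "101 Main St. Atlanta, GA 30332", "(404) 894-2000"))

print(len(t), len(customer))
```

The set keeps a single copy of the tuple, which mirrors the rule that a relation never contains duplicate rows.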

Domain
 A domain has a logical definition:
 Example: “USA_phone_numbers” is the set of 10-digit phone numbers valid in the U.S.
 A domain also has a data-type or a format defined for it.
 The USA_phone_numbers may have a format: (ddd)ddd-dddd where each d is a decimal
digit.
 Dates have various formats, such as yyyy-mm-dd (year, month, day) or
dd mm, yyyy.
 The attribute name designates the role played by a domain in a relation:
 Used to interpret the meaning of the data elements corresponding to that attribute
 Example: The domain Date may be used to define two attributes named “Invoice-date”
and “Payment-date” with different meanings
State
 The relation state is a subset of the Cartesian product of the domains of its attributes
 Each domain contains the set of all possible values the attribute can take.
 Example: attribute Cust-name is defined over the domain of character strings of maximum
length 25
 dom(Cust-name) is varchar(25)
 The role these strings play in the CUSTOMER relation is that of the name of a customer.

Informal Terms              Formal Terms
Table                       Relation
Column Header               Attribute
All possible Column Values  Domain
Row                         Tuple
Table Definition            Schema of a Relation
Populated Table             State of the Relation

Characteristics of Relation

 Ordering of tuples in a relation r(R):
 The tuples are not considered to be ordered, even though they appear to be in the
tabular form.
 Ordering of attributes in a relation schema R (and of values within each tuple):
 We will consider the attributes in R(A1, A2, ..., An) and the values in t=<v1, v2, ..., vn> to
be ordered.
 (However, a more general alternative definition of relation does not require this
ordering.)
 Values in a tuple:
 All values are considered atomic (indivisible).
 Each value in a tuple must be from the domain of the attribute for that column
 If t = <v1, v2, …, vn> is a tuple (row) in the relation state r of R(A1, A2, …,
An), then each vi must be a value from dom(Ai)
 A special null value is used to represent values that are unknown or inapplicable to
certain tuples.
 Notation:
 We refer to component values of a tuple t by:
 t[Ai] or t.Ai
 This is the value vi of attribute Ai for tuple t
 Similarly, t[Au, Av, ..., Aw] refers to the subtuple of t containing the values of attributes
Au, Av, ..., Aw, respectively in t
2. RELATIONAL MODEL CONSTRAINTS AND RELATIONAL DATABASE SCHEMAS
 Constraints are conditions that must hold on all valid relation states.
 There are three main types of constraints in the relational model:
 Key constraints
 Entity integrity constraints
 Referential integrity constraints
 Another implicit constraint is the domain constraint
 Every value in a tuple must be from the domain of its attribute (or it could be null, if allowed for
that attribute)
Key Constraints

 Superkey of R:
 Is a set of attributes SK of R with the following condition:
 No two tuples in any valid relation state r(R) will have the same value for SK
 That is, for any distinct tuples t1 and t2 in r(R), t1[SK] ≠ t2[SK]
 This condition must hold in any valid state r(R)
 Key of R:
 A "minimal" superkey
 That is, a key is a superkey K such that removal of any attribute from K results in a set of
attributes that is not a superkey (does not possess the superkey uniqueness property)
 Example: Consider the CAR relation schema:
 CAR(State, Reg#, SerialNo, Make, Model, Year)
 CAR has two keys:
 Key1 = {State, Reg#}
 Key2 = {SerialNo}
 Both are also superkeys of CAR
 {SerialNo, Make} is a superkey but not a key.
 In general:
 Any key is a superkey (but not vice versa)
 Any set of attributes that includes a key is a superkey
 A minimal superkey is also a key
 If a relation has several candidate keys, one is chosen arbitrarily to be the primary key.
 The primary key attributes are underlined.
 Example: Consider the CAR relation schema:
 CAR(State, Reg#, SerialNo, Make, Model, Year)
 We chose SerialNo as the primary key
 The primary key value is used to uniquely identify each tuple in a relation
 Provides the tuple identity
 Also used to reference the tuple from another tuple
 General rule: Choose as primary key the smallest of the candidate keys (in terms of size)
 Not always applicable – choice is sometimes subjective
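The superkey and key definitions can be checked against a concrete relation state with a short Python sketch. The helper names is_superkey and is_key and the three CAR rows are hypothetical. Note that a key is a property of the schema (it must hold in every valid state), so a check on one state can only refute a candidate, never prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this state agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def is_key(rows, attrs):
    """A key is a superkey that stops being one when any attribute is removed."""
    if not is_superkey(rows, attrs):
        return False
    return all(not is_superkey(rows, subset)
               for subset in combinations(attrs, len(attrs) - 1))

# Hypothetical state of CAR(State, Reg#, SerialNo, Make, Model, Year)
cars = [
    {"State": "TX", "Reg#": "ABC-123", "SerialNo": "A69352", "Make": "Ford", "Model": "Mustang", "Year": 2002},
    {"State": "FL", "Reg#": "ABC-123", "SerialNo": "B43696", "Make": "Ford", "Model": "Mustang", "Year": 2002},
    {"State": "TX", "Reg#": "XYZ-999", "SerialNo": "X83554", "Make": "Toyota", "Model": "Camry", "Year": 2004},
]

print(is_key(cars, ("State", "Reg#")))           # both attributes are needed
print(is_key(cars, ("SerialNo",)))               # a single-attribute key
print(is_superkey(cars, ("SerialNo", "Make")))   # unique, but not minimal
print(is_key(cars, ("SerialNo", "Make")))
```

Checking only the subsets with one attribute removed is enough, because every superset of a superkey is itself a superkey.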

Entity Integrity:
 The primary key attributes PK of each relation schema R in S cannot have null values in
any tuple of r(R).
 This is because primary key values are used to identify the individual tuples.
 t[PK] ≠ null for any tuple t in r(R)
 If PK has several attributes, null is not allowed in any of these attributes
 Note: Other attributes of R may be constrained to disallow null values, even though
they are not members of the primary key.
Referential Integrity
 A constraint involving two relations
 The previous constraints involve a single relation.
 Used to specify a relationship among tuples in two relations:
 The referencing relation and the referenced relation.
 Tuples in the referencing relation R1 have attributes FK (called foreign key attributes) that
reference the primary key attributes PK of the referenced relation R2.
 A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK].
 A referential integrity constraint can be displayed in a relational database schema as a directed
arc from R1.FK to R2.
 Statement of the constraint
 The value in the foreign key column (or columns) FK of the referencing relation R1
can be either:
 (1) the value of an existing primary key PK in
the referenced relation R2, or
 (2) a null.
 In case (2), the FK in R1 should not be a part of its own primary key.
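Entity integrity and referential integrity can both be expressed as simple checks over relation states. The following Python sketch is an illustration only: the function names, the use of None for null, and the small DEPARTMENT and EMPLOYEE states are assumptions made for this example.

```python
def violates_entity_integrity(rows, pk):
    """Return the rows in which some primary key attribute is null (None)."""
    return [r for r in rows if any(r[a] is None for a in pk)]

def violates_referential_integrity(referencing, fk, referenced, pk):
    """Return referencing rows whose non-null FK matches no PK value in the referenced relation."""
    pk_values = {tuple(r[a] for a in pk) for r in referenced}
    bad = []
    for r in referencing:
        value = tuple(r[a] for a in fk)
        if any(v is None for v in value):
            continue  # case (2): a null foreign key is allowed
        if value not in pk_values:
            bad.append(r)
    return bad

# Hypothetical states: EMPLOYEE.DNO is a foreign key referencing DEPARTMENT.DNUMBER
department = [{"DNUMBER": 5}, {"DNUMBER": 4}]
employee = [
    {"SSN": "111", "DNO": 5},     # valid reference
    {"SSN": "222", "DNO": None},  # null foreign key, allowed
    {"SSN": "333", "DNO": 9},     # department 9 does not exist
    {"SSN": None, "DNO": 4},      # null primary key
]

print(violates_entity_integrity(employee, ["SSN"]))
print(violates_referential_integrity(employee, ["DNO"], department, ["DNUMBER"]))
```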

Displaying Relational Database Schema and its constraints
 Each relation schema can be displayed as a row of attribute names
 The name of the relation is written above the attribute names
 The primary key attribute (or attributes) will be underlined
 A foreign key (referential integrity) constraint is displayed as a directed arc (arrow) from the
foreign key attributes to the referenced table
 The arrow can also point to the primary key of the referenced relation for clarity

Other Types of Constraints


 Semantic Integrity Constraints:
 based on application semantics and cannot be expressed by the model per se
 Example: “the max. no. of hours per employee for all projects he or she works on is 56
hrs per week”
 A constraint specification language may have to be used to express these
 SQL-99 allows triggers and ASSERTIONS to express some of these

3. UPDATE OPERATIONS AND DEALING WITH CONSTRAINT VIOLATION
 Each relation will have many tuples in its current relation state
 The relational database state is a union of all the individual relation states
 Whenever the database is changed, a new state arises
 Basic operations for changing the database:
 INSERT a new tuple in a relation
 DELETE an existing tuple from a relation
 MODIFY an attribute of an existing tuple

 Integrity constraints should not be violated by the update operations.

 Several update operations may have to be grouped together.
 Updates may propagate to cause other updates automatically. This may be necessary to
maintain integrity constraints.
 In case of integrity violation, several actions can be taken:
 Cancel the operation that causes the violation (RESTRICT or REJECT option)
 Perform the operation but inform the user of the violation
 Trigger additional updates so the violation is corrected (CASCADE option, SET NULL
option)
 Execute a user-specified error-correction routine
Possible violations of each operation
 INSERT may violate any of the constraints:
 Domain constraint:
 if one of the attribute values provided for the new tuple is not of the specified
attribute domain
 Key constraint:
 if the value of a key attribute in the new tuple already exists in another tuple in
the relation
 Referential integrity:
 if a foreign key value in the new tuple references a primary key value that does
not exist in the referenced relation
 Entity integrity:
 if the primary key value is null in the new tuple
 DELETE may violate only referential integrity:
 If the primary key value of the tuple being deleted is referenced from other tuples in the
database
 Can be remedied by several actions: RESTRICT, CASCADE, SET NULL
 RESTRICT option: reject the deletion
 CASCADE option: propagate the deletion by also deleting the
referencing tuples
 SET NULL option: set the foreign keys of the referencing tuples to NULL

 One of the above options must be specified during database design for each foreign key
constraint
 UPDATE (MODIFY) may violate the domain constraint and the NOT NULL constraint on the attribute being modified
 Any of the other constraints may also be violated, depending on the attribute being updated:
 Updating the primary key (PK):
 Similar to a DELETE followed by an INSERT
 Need to specify similar options to DELETE
 Updating a foreign key (FK):
 May violate referential integrity
 Updating an ordinary attribute (neither PK nor FK):
 Can only violate domain constraints
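The RESTRICT, CASCADE, and SET NULL options for a DELETE can be tried out with a small sketch. It uses SQLite through Python's standard sqlite3 module as a stand-in for any SQL DBMS, which is an assumption of this example; the tables, columns, and the helper referencing_rows_after_delete are hypothetical.

```python
import sqlite3

def referencing_rows_after_delete(action):
    """Delete department 5 under the given ON DELETE action and return the EMPLOYEE rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.execute("CREATE TABLE department (dnumber INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE employee ("
        " ssn TEXT PRIMARY KEY,"
        f" dno INTEGER REFERENCES department(dnumber) ON DELETE {action})"
    )
    conn.execute("INSERT INTO department VALUES (5)")
    conn.execute("INSERT INTO employee VALUES ('123456789', 5)")
    try:
        conn.execute("DELETE FROM department WHERE dnumber = 5")
    except sqlite3.IntegrityError:
        pass  # RESTRICT: the deletion is rejected and nothing changes
    rows = conn.execute("SELECT ssn, dno FROM employee").fetchall()
    conn.close()
    return rows

for action in ("RESTRICT", "CASCADE", "SET NULL"):
    print(action, referencing_rows_after_delete(action))
```

Under RESTRICT the employee row is left untouched, under CASCADE it is deleted together with the department, and under SET NULL it stays but its dno becomes null.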
THE RELATIONAL ALGEBRA AND RELATIONAL CALCULUS-
Overview
 Relational algebra is the basic set of operations for the relational model
 These operations enable a user to specify basic retrieval requests (or queries)
 The result of an operation is a new relation, which may have been formed from
one or more input relations
 This property makes the algebra “closed” (all objects in relational algebra
are relations)
 The algebra operations thus produce new relations

 These can be further manipulated using operations of the same algebra


 A sequence of relational algebra operations forms a relational algebra expression
 The result of a relational algebra expression is also a relation that
represents the result of a database query (or retrieval request)

4. UNARY RELATIONAL OPERATIONS: SELECT AND PROJECT

SELECT Operation Properties
The SELECT operation σ<selection condition>(R) produces a
relation S that has the same schema (same attributes)
as R
SELECT σ is commutative:
σ<condition1>(σ<condition2>(R)) = σ<condition2>(σ<condition1>(R))
Because of the commutativity property, a cascade
(sequence) of SELECT operations may be applied in
any order:
σ<cond1>(σ<cond2>(σ<cond3>(R))) = σ<cond2>(σ<cond3>(σ<cond1>(R)))
A cascade of SELECT operations may be replaced by a
single selection with a conjunction of all the
conditions:
σ<cond1>(σ<cond2>(σ<cond3>(R))) = σ<cond1> AND <cond2> AND <cond3>(R)
The number of tuples in the result of a SELECT is
less than (or equal to) the number of tuples in the
input relation R
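The SELECT properties can be observed on a small relation state. In this Python sketch the function select, the EMPLOYEE state, and the two conditions are hypothetical; a relation is modelled as a set of tuples and a condition as a function on one tuple.

```python
def select(relation, condition):
    """SELECT: keep the tuples of the relation that satisfy the condition."""
    return {t for t in relation if condition(t)}

# Hypothetical EMPLOYEE state with attributes (SSN, DNO, SALARY)
employee = {("111", 5, 30000), ("222", 5, 40000), ("333", 4, 25000)}

def in_dept5(t):
    return t[1] == 5

def high_paid(t):
    return t[2] > 28000

a = select(select(employee, in_dept5), high_paid)   # one order of the cascade
b = select(select(employee, high_paid), in_dept5)   # the other order
c = select(employee, lambda t: in_dept5(t) and high_paid(t))  # single conjunction

print(a == b == c, len(a) <= len(employee))
```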

The general form of the project
operation is:
π<attribute list>(R)
π (pi) is the symbol used to represent the
project operation
<attribute list> is the desired list of
attributes from relation R.
The project operation removes any
duplicate tuples
This is because the result of the project
operation must be a set of tuples
Mathematical sets do not allow duplicate
elements.
PROJECT Operation Properties
The number of tuples in the result of
projection π<list>(R) is always less than or equal to
the number of tuples in R
If the list of attributes includes a key of R, then
the number of tuples in the result of PROJECT
is equal to the number of tuples in R
PROJECT is not commutative
π<list1>(π<list2>(R)) = π<list1>(R) as long as <list2>
contains the attributes in <list1>
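A short Python sketch can make the PROJECT properties concrete. The function project, the schema, and the EMPLOYEE state are hypothetical; the schema is carried alongside the set of tuples so that projections can be nested.

```python
def project(schema, relation, attrs):
    """PROJECT: keep only the listed attributes; the set removes duplicate tuples."""
    idx = [schema.index(a) for a in attrs]
    return tuple(attrs), {tuple(t[i] for i in idx) for t in relation}

# Hypothetical EMPLOYEE state; SSN is assumed to be the key
schema = ("SSN", "DNO", "SEX")
employee = {("111", 5, "M"), ("222", 5, "M"), ("333", 4, "F")}

s2, dno_sex = project(schema, employee, ["DNO", "SEX"])   # duplicates collapse
_, with_key = project(schema, employee, ["SSN", "DNO"])   # list includes the key

_, nested = project(s2, dno_sex, ["DNO"])       # projection applied to a projection
_, direct = project(schema, employee, ["DNO"])  # projection applied to R directly

print(len(dno_sex), len(with_key), nested == direct)
```

The reverse nesting, projecting first on DNO and then on DNO and SEX, is not even defined, which is why PROJECT is not commutative.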

The general RENAME operation ρ can
be expressed by any of the following
forms:
ρ S(B1, B2, …, Bn)(R) changes both:
the relation name to S, and
the column (attribute) names to B1, B2, …, Bn
ρ S(R) changes:
the relation name only to S
ρ (B1, B2, …, Bn)(R) changes:
the column (attribute) names only to B1, B2, …, Bn


5. RELATIONAL ALGEBRA OPERATIONS FROM SET THEORY

Example:
To retrieve the social security numbers of all
employees who either work in department 5
(RESULT1 below) or directly supervise an employee
who works in department 5 (RESULT2 below)
We can use the UNION operation as follows:
DEP5_EMPS ← σ DNO=5 (EMPLOYEE)
RESULT1 ← π SSN (DEP5_EMPS)
RESULT2(SSN) ← π SUPERSSN (DEP5_EMPS)
RESULT ← RESULT1 ∪ RESULT2
The union operation produces the tuples that are in
either RESULT1 or RESULT2 or both
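The four steps of this UNION example map directly onto Python set operations. The EMPLOYEE rows below are hypothetical sample data, and the dictionary representation is an assumption of this sketch.

```python
# Hypothetical EMPLOYEE state with attributes SSN, SUPERSSN and DNO
employee = [
    {"SSN": "123456789", "SUPERSSN": "333445555", "DNO": 5},
    {"SSN": "333445555", "SUPERSSN": "888665555", "DNO": 5},
    {"SSN": "999887777", "SUPERSSN": "987654321", "DNO": 4},
    {"SSN": "888665555", "SUPERSSN": None, "DNO": 1},
]

dep5_emps = [e for e in employee if e["DNO"] == 5]   # SELECT with DNO=5
result1 = {e["SSN"] for e in dep5_emps}              # PROJECT on SSN
result2 = {e["SUPERSSN"] for e in dep5_emps}         # PROJECT on SUPERSSN, renamed to SSN
result = result1 | result2                           # UNION

print(sorted(result))
```

The SSN 333445555 appears in both intermediate results but only once in the union, because the result is a set.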

Type compatibility of operands is required for
the binary set operation UNION ∪ (also for
INTERSECTION ∩ and SET DIFFERENCE –)
R1(A1, A2, ..., An) and R2(B1, B2, ..., Bn) are
type compatible if:
they have the same number of attributes, and
the domains of corresponding attributes are type
compatible (i.e. dom(Ai)=dom(Bi) for i=1, 2, ..., n).
The resulting relation for R1 ∪ R2 (also for
R1 ∩ R2, or R1 – R2) has the
same attribute names as the first operand
relation R1 (by convention)

SET DIFFERENCE (also called MINUS or
EXCEPT) is denoted by –
The result of R – S, is a relation that
includes all tuples that are in R but not
in S
The attribute names in the result will
be the same as the attribute names
in R
The two operand relations R and S
must be “type compatible”

CARTESIAN (or CROSS) PRODUCT Operation
This operation is used to combine tuples from two
relations in a combinatorial fashion.
Denoted by R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm)
Result is a relation Q of degree n + m:
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order.
The resulting relation state has one tuple for each
combination of tuples, one from R and one from S.
Hence, if R has nR tuples (denoted as |R| = nR), and
S has nS tuples, then R x S will have nR * nS tuples.
The two operands do NOT have to be "type compatible".

6. BINARY RELATIONAL OPERATIONS: JOIN AND DIVISION
Example: Suppose that we want to retrieve the
name of the manager of each department.
To get the manager’s name, we need to combine
each DEPARTMENT tuple with the EMPLOYEE tuple
whose SSN value matches the MGRSSN value in the
department tuple.
We do this by using the join operation.

DEPT_MGR ← DEPARTMENT ⋈ MGRSSN=SSN EMPLOYEE
MGRSSN=SSN is the join condition
Combines each department record with the
employee who manages the department
The join condition can also be specified as (table
name.field name) DEPARTMENT.MGRSSN = EMPLOYEE.SSN
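A join can be read as a Cartesian product followed by a selection on the join condition. The Python sketch below illustrates this for the manager example; the DEPARTMENT and EMPLOYEE states and their reduced attribute lists are hypothetical.

```python
from itertools import product

# Hypothetical states: DEPARTMENT(DNAME, DNUMBER, MGRSSN) and EMPLOYEE(SSN, LNAME)
department = {("Research", 5, "333445555"), ("Headquarters", 1, "888665555")}
employee = {("333445555", "Wong"), ("888665555", "Borg"), ("123456789", "Smith")}

# Cartesian product: every DEPARTMENT tuple combined with every EMPLOYEE tuple.
cross = {d + e for d, e in product(department, employee)}

# Join: keep only the combinations that satisfy MGRSSN = SSN
# (position 2 is MGRSSN, position 3 is SSN in the combined tuple).
dept_mgr = {t for t in cross if t[2] == t[3]}

print(len(cross), sorted(dept_mgr))
```

The product has 2 * 3 = 6 tuples of degree 3 + 2 = 5, and the join condition keeps one tuple per department.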
The general case of JOIN operation is
called a Theta-join: R ⋈ theta S
The join condition is called theta
Theta can be any general boolean
expression on the attributes of R and S;
for example:
R.Ai<S.Bj AND (R.Ak=S.Bi OR R.Ap<S.Bq)
Most join conditions involve one or
more equality conditions “AND”ed
together; for example:
R.Ai=S.Bj AND R.Ak=S.Bi AND R.Ap=S.Bq


EQUIJOIN Operation
The most common use of join involves
join conditions with equality
comparisons only
Such a join, where the only comparison
operator used is =, is called an
EQUIJOIN.
In the result of an EQUIJOIN we always
have one or more pairs of attributes
(whose names need not be identical) that
have identical values in every tuple.
The JOIN seen in the previous example was
an EQUIJOIN.

NATURAL JOIN Operation
Another variation of JOIN called NATURAL
JOIN, denoted by *, was created to get rid of the
second (superfluous) attribute in an EQUIJOIN
condition,
because one of each pair of attributes with identical
values is superfluous
The standard definition of natural join requires that
the two join attributes, or each pair of
corresponding join attributes, have the same name
in both relations
If this is not the case, a renaming operation
is applied first.
Example: To apply a natural join on the DNUMBER
attributes of DEPARTMENT and DEPT_LOCATIONS, it is
sufficient to write:
DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
The only attribute with the same name is DNUMBER
An implicit join condition is created based on this
attribute:
DEPARTMENT.DNUMBER=DEPT_LOCATIONS.DNUMBER

Another example: Q ← R(A,B,C,D) * S(C,D,E)
The implicit join condition includes each pair of
attributes with the same name, “AND”ed together:
R.C=S.C AND R.D=S.D
Result keeps only one attribute of each such pair:
Q(A,B,C,D,E)
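The second example, R(A,B,C,D) * S(C,D,E), can be worked through in Python. The function natural_join and the tuple values are hypothetical; the sketch derives the implicit join condition from the attribute names shared by the two schemas.

```python
def natural_join(r_schema, r, s_schema, s):
    """Natural join: match on all same-named attributes and keep one copy of each."""
    common = [a for a in r_schema if a in s_schema]
    s_only = [a for a in s_schema if a not in common]
    result = set()
    for t in r:
        for u in s:
            # Implicit condition: equality on every common attribute, ANDed together.
            if all(t[r_schema.index(a)] == u[s_schema.index(a)] for a in common):
                result.add(t + tuple(u[s_schema.index(a)] for a in s_only))
    return tuple(r_schema) + tuple(s_only), result

r_schema, r = ("A", "B", "C", "D"), {(1, 2, 3, 4), (5, 6, 7, 8)}
s_schema, s = ("C", "D", "E"), {(3, 4, "x"), (3, 9, "y"), (7, 8, "z")}

q_schema, q = natural_join(r_schema, r, s_schema, s)
print(q_schema, sorted(q))
```

The S tuple (3, 9, "y") matches R on C but not on D, so it contributes nothing to the result.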

7. ADDITIONAL RELATIONAL OPERATIONS
GROUP FUNCTIONS (or) AGGREGATE FUNCTIONS
 Sum
 Avg
 Max
 Min
 Count
Group functions are applied to all the rows of a group (or of the whole table when no grouping is specified) and produce a single output value.
a) SUM
This will give the sum of the values of the specified column.
Syntax: sum (column)
Ex: SQL> select sum(sal) from emp;
SUM(SAL)
----------
38600

b) AVG
This will give the average of the values of the specified column.
Syntax: avg (column)
Ex: SQL> select avg(sal) from emp;
AVG(SAL)
---------------
2757.14286

c) MAX
This will give the maximum of the values of the specified column.
Syntax: max (column)
Ex: SQL> select max(sal) from emp;
MAX(SAL)
----------
5000

d) MIN
This will give the minimum of the values of the specified column.
Syntax: min (column)
Ex: SQL> select min(sal) from emp;
MIN(SAL)
----------
500

e) COUNT
count(column) gives the number of non-null values in the specified column; count(*) gives the number of rows.
Syntax: count (column)
Ex: SQL> select count(sal),count(*) from emp;
COUNT(SAL) COUNT(*)
-------------- ------------
14 14
GROUP BY AND HAVING
GROUP BY
Using group by, we can create groups of related information.
Every non-aggregated column in the select list must also appear in the group by clause;
otherwise the query is rejected as not a group by expression.
Ex: SQL> select deptno, sum(sal) from emp group by deptno;
DEPTNO SUM(SAL)
---------- ----------
10 8750
20 10875
30 9400

SQL> select deptno,job,sum(sal) from emp group by deptno,job;


DEPTNO JOB SUM(SAL)
------ -------------- ----------
10 CLERK 1300
10 MANAGER 2450
10 PRESIDENT 5000
20 ANALYST 6000
20 CLERK 1900
20 MANAGER 2975
30 CLERK 950
30 MANAGER 2850
30 SALESMAN 5600

HAVING
HAVING works like a where clause, but it filters groups rather than rows. It is used with group by
because the where clause is evaluated before the rows are grouped and cannot test a group function.
Ex: SQL> select deptno,job,sum(sal) tsal from emp group by deptno,job having
sum(sal) > 3000;
DEPTNO JOB TSAL
---------- --------- ----------
10 PRESIDENT 5000
20 ANALYST 6000
30 SALESMAN 5600

SQL> select deptno,job,sum(sal) tsal from emp group by deptno,job having sum(sal) >
3000 order by job;
DEPTNO JOB TSAL
---------- --------- ----------
20 ANALYST 6000
10 PRESIDENT 5000
30 SALESMAN 5600
ORDER OF EXECUTION
 Group the rows together based on group by clause.
 Calculate the group functions for each group.
 Choose and eliminate the groups based on the having clause.
 Order the groups based on the specified column.
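The four steps above can be reproduced with a small, self-contained sketch. It runs the same kind of query against SQLite through Python's sqlite3 module rather than Oracle, which is an assumption of this example, and the emp rows are hypothetical sample data, not the table behind the outputs shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (deptno INTEGER, job TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        (10, "CLERK", 1300), (10, "MANAGER", 2450), (10, "PRESIDENT", 5000),
        (20, "ANALYST", 3000), (20, "ANALYST", 3000), (20, "CLERK", 1100),
        (30, "SALESMAN", 1600), (30, "SALESMAN", 1250),
    ],
)

# Rows are grouped by (deptno, job), SUM(sal) is computed per group,
# HAVING drops the groups whose sum is not above 3000, ORDER BY sorts the rest.
rows = conn.execute(
    "SELECT deptno, job, SUM(sal) AS tsal FROM emp "
    "GROUP BY deptno, job HAVING SUM(sal) > 3000 ORDER BY job"
).fetchall()
conn.close()

print(rows)
```

With this data the SALESMAN group sums to 2850 and is eliminated by the having clause, leaving only the ANALYST and PRESIDENT groups.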

RELATIONAL CALCULUS

There is an alternate way of formulating queries known as relational calculus. Relational
calculus is a non-procedural query language. In a non-procedural query language, the user is
not concerned with the details of how to obtain the end results. Relational calculus states what to
retrieve but never explains how to retrieve it. Most commercial relational languages, including
SQL, QBE, and QUEL, are based on aspects of relational calculus.

Why it is called Relational Calculus?

It is based on predicate calculus, a branch of symbolic logic. A predicate
is a truth-valued function with arguments. When values are substituted for the arguments, the function
yields an expression called a proposition, which is either true or false. Relational calculus is a tailored
version of a subset of predicate calculus, used to communicate with the relational database.

Types of Relational calculus:

8. TUPLE RELATIONAL CALCULUS (TRC)

It is a non-procedural query language based on finding the tuple variables (also known
as range variables) for which a predicate holds true. It describes the desired information without giving a
specific procedure for obtaining that information. Tuple relational calculus is used to select
tuples from a relation: in TRC, the filtering variable ranges over the tuples of a relation. The result is a
relation that can contain any number of tuples.

Notation:

A query in tuple relational calculus is expressed in the following notation:

{T | P (T)} or {T | Condition (T)}

Where

T is a tuple variable ranging over the resulting tuples

P(T) is the condition used to fetch T.

For example:

{ T.name | Author(T) AND T.article = 'database' }

Output: This query selects tuples from the Author relation. It returns the 'name' of every
author who has written an article on 'database'.

TRC (tuple relation calculus) can be quantified. In TRC, we can use Existential (∃) and Universal
Quantifiers (∀).

For example:

{ R | ∃T ∈ Author (T.article = 'database' AND R.name = T.name) }

Output: This query will yield the same result as the previous one.
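Both TRC queries have a close analogue in Python set comprehensions, which also state what is wanted rather than how to loop for it. The Author rows below are hypothetical, and the comprehension is only an analogy for the calculus expression, not a definition of it.

```python
from collections import namedtuple

Author = namedtuple("Author", ["name", "article"])

# Hypothetical state of the Author relation
author = {
    Author("Alice", "database"),
    Author("Bob", "networks"),
    Author("Carol", "database"),
}

# { T.name | Author(T) AND T.article = 'database' }
names = {t.name for t in author if t.article == "database"}

# Existential form: a name qualifies if SOME tuple T in Author has that
# name and the article 'database'.
all_names = {a.name for a in author}
names_exists = {n for n in all_names
                if any(t.article == "database" and t.name == n for t in author)}

print(sorted(names), names == names_exists)
```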

9. DOMAIN RELATIONAL CALCULUS (DRC)

The second form of relational calculus is known as domain relational calculus. In domain relational calculus,
the filtering variables range over the domains of attributes. Domain relational calculus uses the same operators as
tuple calculus. It uses the logical connectives ∧ (and), ∨ (or) and ¬ (not). It uses the Existential (∃) and
Universal (∀) quantifiers to bind the variables. QBE (Query by Example) is a query language related
to domain relational calculus.

Notation:

{ a1, a2, a3, ..., an | P (a1, a2, a3, ..., an) }

Where

a1, a2, ..., an are attributes

P stands for a formula built from these attributes

For example:

{ < article, page, subject > | < article, page, subject > ∈ javatpoint ∧ subject = 'database' }

Output: This query will yield the article, page, and subject from the relation javatpoint, where the
subject is 'database'.
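In domain relational calculus the variables stand for individual attribute values rather than whole tuples. A rough Python analogue, with hypothetical rows for the javatpoint relation, binds one variable per attribute:

```python
# Hypothetical state of the relation javatpoint(article, page, subject)
javatpoint = {
    ("Relational Model", 12, "database"),
    ("Sorting", 40, "algorithms"),
    ("Normalization", 25, "database"),
}

# { <article, page, subject> | <article, page, subject> ∈ javatpoint ∧ subject = 'database' }
result = {
    (article, page, subject)
    for (article, page, subject) in javatpoint
    if subject == "database"
}

print(sorted(result))
```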
