Rdbms (Unit 2)

relational database management system notes for Thiruvalluvar university syllabus who comes from b.sc computer science ug this is for unit 2 check my page for more rdbms notes

Uploaded by

hari karan

UNIT – II

RELATIONAL DATA MODEL

Concepts of Relational Model

A relational database consists of a collection of tables, each of which is assigned a unique name. For
example, consider the instructor table, which stores information about instructors. The table has four column
headers: ID, name, dept_name, and salary. Each row of this table records information about an instructor.
In general, a row in a table represents a relationship among a set of values. Since a table is a collection
of such relationships, there is a close correspondence between the concept of table and the mathematical
concept of relation, from which the relational data model takes its name.
In the relational model the term relation is used to refer to a table, while the term tuple is used to refer
to a row. Similarly, the term attribute refers to a column of a table.
We use the term relation instance to refer to a specific instance of a relation, i.e., containing a specific
set of rows. For each attribute of a relation, there is a set of permitted values, called the domain of the attribute.
Thus, the domain of the salary attribute of the instructor relation is the set of all possible salary values.
We require that, for all relations r, the domains of all attributes of r be atomic. A domain is atomic if
elements of the domain are considered to be indivisible units. The null value is a special value that signifies that
the value is unknown or does not exist for an attribute.
Database Schema
When we talk about a database, we must differentiate between the database schema, which is the logical
design of the database, and the database instance, which is a snapshot of the data in the database at a given
instant in time. The concept of a relation corresponds to the programming-language notion of a variable, while
the concept of a relation schema corresponds to the programming-language notion of a type definition.
In general, a relation schema consists of a list of attributes and their corresponding domains. The concept
of a relation instance corresponds to the programming-language notion of the value of a variable. The
schema of the department relation is:
department (dept_name, building, budget)
Relational Constraints
No two tuples in a relation may have exactly the same values for all attributes, so the attribute values of a
tuple must be able to identify it uniquely. This requirement is expressed through keys and related constraints.
A super key is a set of one or more attributes that, taken collectively, allow us to identify uniquely a
tuple in the relation. For example, the ID attribute of the relation instructor is sufficient to distinguish one
instructor tuple from another. Thus, ID is a super key. The name attribute is not, because several instructors
might have the same name.
We are often interested in super keys for which no proper subset is a super key. Such minimal super
keys are called candidate keys. We use the term primary key to denote a candidate key that is chosen by the
database designer as the principal means of identifying tuples within a relation.
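The definitions of super key and candidate key can be checked mechanically against one relation instance. The sketch below is only an illustration: the sample rows and the helper names is_super_key and candidate_keys are assumptions, not part of these notes. A key is a property of the schema, so a single instance can show that an attribute set is not a key but can never prove that it is one in general.

```python
from itertools import combinations

# Hypothetical sample instance of instructor(ID, name, dept_name, salary).
ATTRS = ("ID", "name", "dept_name", "salary")
instructor = [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("22222", "Einstein", "Physics", 95000),
    ("33456", "Gold", "Physics", 87000),
    ("76543", "Singh", "Finance", 87000),
]

def is_super_key(attrs, rows, schema=ATTRS):
    """True if no two rows of this instance agree on every attribute in attrs."""
    idx = [schema.index(a) for a in attrs]
    seen = {tuple(row[i] for i in idx) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows, schema=ATTRS):
    """Super keys of this instance that have no proper subset that is a super key."""
    keys = []
    # Visit attribute sets from small to large so smaller keys are found first.
    for size in range(1, len(schema) + 1):
        for attrs in combinations(schema, size):
            minimal = not any(set(k) < set(attrs) for k in keys)
            if minimal and is_super_key(attrs, rows, schema):
                keys.append(attrs)
    return keys

print(is_super_key(("ID",), instructor))
print(is_super_key(("dept_name",), instructor))
print(candidate_keys(instructor))
```

On this small instance name and (dept_name, salary) also come out as minimal, which shows why the designer, not the data, decides which candidate key becomes the primary key.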
A relation, say r1, may include among its attributes the primary key of another relation, say r2. This
attribute is called a foreign key from r1, referencing r2. The relation r1 is also called the referencing relation
of the foreign key dependency, and r2 is called the referenced relation of the foreign key.
The constraint from section to teaches is an example of a referential integrity constraint; a referential
integrity constraint requires that the values appearing in specified attributes of any tuple in the referencing
relation also appear in specified attributes of at least one tuple in the referenced relation.
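A referential integrity constraint can be observed in a running system. The following is a minimal sketch using Python's built-in sqlite3 module; the foreign key from instructor.dept_name to department is used as the example, and the sample values are hypothetical. Note that SQLite enforces foreign keys only after the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# department is the referenced relation, instructor the referencing relation.
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT)")
conn.execute("""
    CREATE TABLE instructor (
        ID        TEXT PRIMARY KEY,
        name      TEXT,
        dept_name TEXT REFERENCES department(dept_name)
    )
""")

conn.execute("INSERT INTO department VALUES ('Physics', 'Watson')")
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics')")  # accepted

try:
    # 'Alchemy' does not appear in department, so the constraint is violated.
    conn.execute("INSERT INTO instructor VALUES ('99999', 'Nobody', 'Alchemy')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```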

RELATIONAL DATABASE MANAGEMENT SYSTEMS (UNIT II) Page 1


Relational Languages
A query language is a language in which a user requests information from the database. These
languages are usually on a level higher than that of a standard programming language. Query languages can be
categorized as either procedural or nonprocedural.
In a procedural language, the user instructs the system to perform a sequence of operations on the
database to compute the desired result. In a nonprocedural language, the user describes the desired
information without giving a specific procedure for obtaining that information.
There are a number of “pure” query languages: The relational algebra is procedural, whereas the tuple
relational calculus and domain relational calculus are nonprocedural languages.
Relational Algebra
The relational algebra is a procedural query language. It consists of a set of operations that take one or
two relations as input and produce a new relation as their result. The fundamental operations in the relational
algebra are select, project, union, set difference, Cartesian product, and rename. In addition to the fundamental
operations, there are several other operations—namely, set intersection, natural join, and assignment operations.
Fundamental Operations
The select, project, and rename operations are called unary operations, because they operate on one
relation. The other three operations operate on pairs of relations and are, therefore, called binary operations.
The Select Operation
The select operation selects tuples that satisfy a given predicate. We use the lowercase Greek letter
sigma (σ) to denote selection. The predicate appears as a subscript to σ.
Examples:
• To select those tuples of the instructor relation where the instructor is in the “Physics” department, we write:
σ dept_name = “Physics” (instructor)
• We can find all instructors with salary greater than 90,000 by writing:
σ salary > 90000 (instructor)
In general, we allow comparisons using =, ≠, <, ≤, >, and ≥ in the selection predicate. Furthermore, we
can combine several predicates into a larger predicate by using the connectives and (∧), or (∨), and not (¬).
• To find the instructors in Physics with a salary greater than 90,000, we write:
σ dept_name = “Physics” ∧ salary > 90000 (instructor)
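The selection examples above can be mimicked in a few lines of Python by treating a relation as a set of tuples and the predicate as a function. This is a sketch under assumptions: the sample rows and the helper name select are made up for illustration.

```python
from typing import NamedTuple

class Instructor(NamedTuple):
    ID: str
    name: str
    dept_name: str
    salary: int

# Hypothetical sample instance of the instructor relation.
instructor = {
    Instructor("10101", "Srinivasan", "Comp. Sci.", 65000),
    Instructor("22222", "Einstein", "Physics", 95000),
    Instructor("33456", "Gold", "Physics", 87000),
    Instructor("12121", "Wu", "Finance", 90000),
}

def select(pred, relation):
    """sigma_pred(relation): keep exactly the tuples for which pred holds."""
    return {t for t in relation if pred(t)}

# sigma dept_name = "Physics" (instructor)
physics = select(lambda t: t.dept_name == "Physics", instructor)

# sigma dept_name = "Physics" AND salary > 90000 (instructor)
well_paid_physics = select(
    lambda t: t.dept_name == "Physics" and t.salary > 90000, instructor
)

print(sorted(t.name for t in physics))
print(sorted(t.name for t in well_paid_physics))
```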
The Project Operation (Π)
The project operation is a unary operation that returns its argument relation with certain attributes left
out. Since a relation is a set, any duplicate rows that result are eliminated. Projection is denoted by the
uppercase Greek letter pi (Π). We list the attributes that we wish to appear in the result as a subscript to Π.
To list the ID and name of every instructor, we write:
Π ID, name (instructor)
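A short Python sketch of projection follows, again with a relation modelled as a set of tuples. The sample rows and the helper name project are assumptions made for illustration. Because the result is a set, rows that become identical after attributes are dropped collapse into one.

```python
# Hypothetical sample instance of instructor(ID, name, dept_name, salary).
ATTRS = ("ID", "name", "dept_name", "salary")
instructor = {
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("22222", "Einstein", "Physics", 95000),
    ("33456", "Gold", "Physics", 87000),
}

def project(attrs, relation, schema=ATTRS):
    """Pi_attrs(relation): keep only the listed attributes, in the listed order."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in relation}

id_and_name = project(("ID", "name"), instructor)
departments = project(("dept_name",), instructor)  # two Physics rows collapse into one

print(sorted(id_and_name))
print(sorted(departments))
```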
The Union Operation (∪)

The union operation performs a set union of two “similarly structured” relations. Consider a query to
find the set of all courses taught in the Fall 2009 semester, the Spring 2010 semester, or both.
To find the set of all courses taught in the Fall 2009 semester, we write:
Π course_id (σ semester = “Fall” ∧ year = 2009 (section))
To find the set of all courses taught in the Spring 2010 semester, we write:
Π course_id (σ semester = “Spring” ∧ year = 2010 (section))


To answer the query, we need the union of these two sets; that is, we need all course IDs that appear in
either or both of the two relations. We find these data by the binary operation union, denoted, as in set theory, by ∪.
Π course_id (σ semester = “Fall” ∧ year = 2009 (section)) ∪ Π course_id (σ semester = “Spring” ∧ year = 2010 (section))
Therefore, for a union operation r ∪ s to be valid, we require that two conditions hold:
1. The relations r and s must be of the same arity. That is, they must have the same number of attributes.
2. The domains of the i’th attribute of r and the i’th attribute of s must be the same, for all i.
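The union query and the arity condition can be sketched in Python as follows. The section rows, the helper names courses_in and union, and the reduced schema (course_id, semester, year) are assumptions for illustration; the check covers only condition 1 (same arity), since Python sets carry no declared attribute domains.

```python
# Hypothetical rows of section, reduced to (course_id, semester, year).
section = {
    ("CS-101", "Fall", 2009),
    ("CS-347", "Fall", 2009),
    ("PHY-101", "Fall", 2009),
    ("CS-101", "Spring", 2010),
    ("CS-315", "Spring", 2010),
}

def courses_in(semester, year):
    """Pi course_id (sigma semester = ... AND year = ... (section))."""
    return {(c,) for (c, s, y) in section if s == semester and y == year}

def union(r, s):
    """r UNION s, refusing relations whose tuples differ in arity."""
    if len({len(t) for t in r | s}) > 1:
        raise ValueError("union requires relations of the same arity")
    return r | s

fall_or_spring = union(courses_in("Fall", 2009), courses_in("Spring", 2010))
print(sorted(fall_or_spring))  # CS-101 appears once although it ran in both semesters
```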
The Set-Difference Operation (-)
The set-difference operation, denoted by −, allows us to find tuples that are in one relation but are not
in another. The expression r − s produces a relation containing those tuples in r but not in s.
We can find all the courses taught in the Fall 2009 semester but not in the Spring 2010 semester by writing:
Π course_id (σ semester = “Fall” ∧ year = 2009 (section)) − Π course_id (σ semester = “Spring” ∧ year = 2010 (section))

As with the union operation, we must ensure that set differences are taken between compatible relations: r and s
must have the same arity, and the domains of their corresponding attributes must be the same.
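Set difference maps directly onto Python's set subtraction. The sketch below uses hypothetical course IDs and also shows how set intersection, one of the additional operations, can be derived from difference alone through the identity r ∩ s = r − (r − s).

```python
# Hypothetical results of the two projections on course_id.
fall_2009 = {("CS-101",), ("CS-347",), ("PHY-101",)}
spring_2010 = {("CS-101",), ("CS-315",)}

# Courses taught in Fall 2009 but not in Spring 2010: r - s.
only_fall = fall_2009 - spring_2010

# Intersection expressed with difference only: r - (r - s).
both = fall_2009 - (fall_2009 - spring_2010)

print(sorted(only_fall))
print(sorted(both))
```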

The Cartesian-Product Operation (×)


The Cartesian-product operation, denoted by a cross (×), allows us to combine information from any
two relations. We write the Cartesian product of relations r1 and r2 as r1 × r2.
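Suppose that we want the names of all instructors in the Physics department together with the course_id of
all courses they taught. We need the information in both the instructor relation and the teaches relation to do
so. If we write:
σ dept_name = “Physics” (instructor × teaches)
the result pairs each Physics instructor with every tuple of teaches, including tuples for courses taught by
other instructors. Adding the condition instructor.ID = teaches.ID keeps only the matching pairs:
Π name, course_id (σ instructor.ID = teaches.ID (σ dept_name = “Physics” (instructor × teaches)))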
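The effect of a Cartesian product followed by selections can be traced step by step in Python. The rows and the reduced schemas instructor(ID, name, dept_name) and teaches(ID, course_id) are assumptions for illustration. The sketch shows that selecting on the department alone still leaves pairings with other instructors' courses, and that requiring instructor.ID = teaches.ID removes them.

```python
from itertools import product

# Hypothetical rows: instructor(ID, name, dept_name) and teaches(ID, course_id).
instructor = {
    ("22222", "Einstein", "Physics"),
    ("10101", "Srinivasan", "Comp. Sci."),
}
teaches = {
    ("22222", "PHY-101"),
    ("10101", "CS-101"),
    ("10101", "CS-315"),
}

# instructor x teaches: every instructor tuple concatenated with every teaches tuple.
# Positions: 0 = instructor.ID, 1 = name, 2 = dept_name, 3 = teaches.ID, 4 = course_id.
pairs = {i + t for i, t in product(instructor, teaches)}

# Selecting Physics alone still pairs Einstein with courses taught by others.
physics_all = {row for row in pairs if row[2] == "Physics"}

# Requiring instructor.ID = teaches.ID keeps only the meaningful pairs.
physics_matched = {row for row in physics_all if row[0] == row[3]}
result = {(row[1], row[4]) for row in physics_matched}

print(len(pairs), len(physics_all), sorted(result))
```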
The Rename Operation (ρ)
The rename operator, denoted by the lowercase Greek letter rho (ρ), is used to give a name to the result of a
relational-algebra expression or to change the existing name of a relation. Given a relational-algebra expression E, the expression
ρ x (E)
returns the result of expression E under the name x.
The Natural-Join Operation
The natural join is a binary operation that allows us to combine certain selections and a Cartesian
product into one operation. It is denoted by the join symbol ⋈. Let R and S be the schemas of r and s, and let
R ∩ S = {A1, A2, . . . , An} be the attributes they have in common. The natural join of r and s, denoted by r ⋈ s, is
a relation on schema R ∪ S formally defined as follows:
r ⋈ s = Π R ∪ S (σ r.A1 = s.A1 ∧ r.A2 = s.A2 ∧ . . . ∧ r.An = s.An (r × s))
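The definition of the natural join can be followed literally in code: pair the tuples, keep the pairs that agree on every common attribute, and keep each common attribute once. The helper name natural_join, the reduced schemas, and the sample rows below are assumptions for illustration.

```python
def natural_join(r, r_schema, s, s_schema):
    """Return (schema, tuples) of the natural join of r and s."""
    common = [a for a in r_schema if a in s_schema]       # attributes in both schemas
    extra = [a for a in s_schema if a not in common]      # attributes only in s
    out_schema = tuple(r_schema) + tuple(extra)           # common attributes appear once
    result = set()
    for tr in r:
        for ts in s:
            # Selection step: the tuples must agree on every common attribute.
            if all(tr[r_schema.index(a)] == ts[s_schema.index(a)] for a in common):
                # Projection step: append only the attributes that r does not have.
                result.add(tr + tuple(ts[s_schema.index(a)] for a in extra))
    return out_schema, result

# Hypothetical rows.
instructor = {("22222", "Einstein", "Physics"), ("10101", "Srinivasan", "Comp. Sci.")}
teaches = {("22222", "PHY-101"), ("10101", "CS-101"), ("10101", "CS-315")}

schema, joined = natural_join(
    instructor, ("ID", "name", "dept_name"), teaches, ("ID", "course_id")
)
print(schema)
print(sorted(joined))
```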
The Tuple Relational Calculus
The tuple relational calculus, by contrast, is a nonprocedural query language. It describes the desired
information without giving a specific procedure for obtaining that information.
A query in the tuple relational calculus is expressed as:
{t | P(t)}
That is, it is the set of all tuples t such that predicate P is true for t.
Example Queries
Find the ID, name, dept name, salary for instructors whose salary is greater than 80000:
{t | t ∈ instructor ∧ t[salary] > 80000}
Suppose that we want only the ID attribute, rather than all attributes of the instructor relation. To write
this query in the tuple relational calculus, we need to write an expression for a relation on the schema (ID).


To express this request, we need the construct “there exists” from mathematical logic. The notation:
∃ t ∈ r (Q(t))

means “there exists a tuple t in relation r such that predicate Q(t) is true”.
Using this notation, we can write the query “Find the instructor ID for each instructor with a salary greater than
80,000” as:
{t | ∃ s ∈ instructor (t[ID] = s[ID] ∧ s[salary] > 80000)}

We read the expression as “The set of all tuples t such that there exists a tuple s in relation instructor for
which the values of t and s for the ID attribute are equal, and the value of s for the salary attribute is greater
than 80,000.”
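Python set comprehensions read much like tuple-relational-calculus expressions, which makes them a convenient way to try the two salary queries above. This is only an informal analogy, and the sample rows are hypothetical.

```python
from typing import NamedTuple

class Instructor(NamedTuple):
    ID: str
    name: str
    dept_name: str
    salary: int

# Hypothetical sample instance.
instructor = {
    Instructor("10101", "Srinivasan", "Comp. Sci.", 65000),
    Instructor("22222", "Einstein", "Physics", 95000),
    Instructor("33456", "Gold", "Physics", 87000),
}

# {t | t in instructor AND t[salary] > 80000}: whole tuples.
high_paid = {t for t in instructor if t.salary > 80000}

# {t | EXISTS s in instructor (t[ID] = s[ID] AND s[salary] > 80000)}:
# the result is on schema (ID), so each result tuple is built from some witness s.
high_paid_ids = {(s.ID,) for s in instructor if s.salary > 80000}

print(sorted(t.name for t in high_paid))
print(sorted(high_paid_ids))
```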
The formula P ⇒ Q means “P implies Q”; that is, “if P is true, then Q must be true”. Consider the
query, “Find all students who have taken all courses offered in the Biology department.” To write this query in
the tuple relational calculus, we introduce the “for all” construct, denoted by ∀. The notation:
∀ t ∈ r (Q(t))
means “Q is true for all tuples t in relation r.”
We write the expression for our query as follows:
{t | ∃ r ∈ student (r[ID] = t[ID]) ∧ (∀ u ∈ course (u[dept_name] = “Biology” ⇒ ∃ s ∈ takes (t[ID] = s[ID]
∧ s[course_id] = u[course_id])))}
We interpret this expression as “The set of all students (that is, (ID) tuples t) such that, for all tuples u in
the course relation, if the value of u on attribute dept_name is ‘Biology’, then there exists a tuple in the takes
relation that includes the student ID and the course_id.”
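The “for all” query can be tried out with Python's all() for ∀, a membership test for ∃, and a small helper for ⇒. Everything here is a sketch under assumptions: the rows, the reduced schemas student(ID), course(course_id, dept_name), takes(ID, course_id), and the names implies and took_all are illustrative only.

```python
# Hypothetical rows.
student = {"00128", "12345", "98988"}
course = {("BIO-101", "Biology"), ("BIO-301", "Biology"), ("CS-101", "Comp. Sci.")}
takes = {
    ("98988", "BIO-101"),
    ("98988", "BIO-301"),
    ("12345", "BIO-101"),
    ("00128", "CS-101"),
}

def implies(p, q):
    """P => Q is false only when P is true and Q is false."""
    return (not p) or q

def took_all(dept_name):
    """IDs of students who took every course offered by dept_name."""
    return {
        sid
        for sid in student
        if all(implies(dept == dept_name, (sid, cid) in takes) for (cid, dept) in course)
    }

print(took_all("Biology"))
print(sorted(took_all("History")))  # no History courses exist, so the implication holds for everyone
```

The second call illustrates a consequence of the implication: when the department offers no courses, every student satisfies the condition.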
Formal Definition
We are now ready for a formal definition. A tuple-relational-calculus expression is of the form:
{t | P(t)}
where P is a formula. Several tuple variables may appear in a formula. A tuple variable is said to be a free
variable unless it is quantified by a ∃ or ∀. Thus, in:
t ∈ instructor ∧ ∃ s ∈ department (t[dept_name] = s[dept_name])
t is a free variable. Tuple variable s is said to be a bound variable.
A tuple-relational-calculus formula is built up out of atoms. An atom has one of the following forms:
• s ∈ r, where s is a tuple variable and r is a relation.
• s[x] Θ u[y], where s and u are tuple variables, x is an attribute on which s is defined, y is an attribute on which
u is defined, and Θ is a comparison operator (<, ≤, =, ≠, >, ≥); we require that attributes x and y have domains
whose members can be compared by Θ.
• s[x] Θ c, where s is a tuple variable, x is an attribute on which s is defined, Θ is a comparison operator, and c is
a constant in the domain of attribute x.
Safety of Expressions
There is one final issue to be addressed. A tuple-relational-calculus expression may generate an infinite
relation. Suppose that we write the expression:
{t |¬ (t ∈ instructor)}
There are infinitely many tuples that are not in instructor. To help us define a restriction of the tuple relational
calculus, we introduce the concept of the domain of a tuple relational formula, P. Intuitively, the domain of P,
denoted dom(P), is the set of all values referenced by P.
We say that an expression {t | P(t)} is safe if all values that appear in the result are values from dom(P).
The expression {t |¬ (t ∈ instructor)} is not safe.


The Domain Relational Calculus
The domain relational calculus uses domain variables that take on values from an attribute’s domain,
rather than values for an entire tuple. Domain relational calculus serves as the theoretical basis of the widely
used QBE language, just as relational algebra serves as the basis for the SQL language.
Formal Definition
An expression in the domain relational calculus is of the form
{< x1, x2, . . . , xn > | P(x1, x2, . . . , xn)}
where x1, x2, . . . , xn represent domain variables. P represents a formula composed of atoms, as was the
case in the tuple relational calculus. An atom in the domain relational calculus has one of the following forms:
• < x1, x2, . . . , xn > ∈ r, where r is a relation on n attributes and x1, x2, . . . , xn are domain variables or
domain constants.
• x Θ y, where x and y are domain variables and Θ is a comparison operator (<, ≤, =, ≠, >, ≥). We
require that attributes x and y have domains that can be compared by Θ.
• x Θ c, where x is a domain variable, Θ is a comparison operator, and c is a constant in the domain of
the attribute for which x is a domain variable.
We build up formulae from atoms by using the following rules:
• An atom is a formula.
• If P1 is a formula, then so are ¬P1 and (P1).
• If P1 and P2 are formulae, then so are P1 ∨ P2, P1 ∧ P2, and P1 ⇒ P2.
• If P1(x) is a formula in x, where x is a free domain variable, then
∃ x (P1(x)) and ∀ x (P1(x)) are also formulae.
Example Queries
Find the instructor ID, name, dept name, and salary for instructors whose salary is greater than 80,000:

{< i, n, d, s > | < i, n, d, s > ∈ instructor ∧ s > 80000}


Find the instructor ID for each instructor whose salary is greater than 80,000:
{< i > | ∃ n, d, s (< i, n, d, s > ∈ instructor ∧ s > 80000)}
Find the names of all instructors in the Physics department together with the course id of all courses they teach:
{< n, c > | ∃ i, a, se, y (< i, c, a, se, y > ∈ teaches ∧ ∃ d, s (< i, n, d, s > ∈ instructor ∧ d = “Physics”))}
Find the set of all courses taught in the Fall 2009 semester, the Spring 2010 semester, or both:
{< c > | ∃ a, s, y, b, r, t (< c, a, s, y, b, r, t > ∈ section ∧ s = “Fall” ∧ y = “2009”)
∨ ∃ a, s, y, b, r, t (< c, a, s, y, b, r, t > ∈ section ∧ s = “Spring” ∧ y = “2010”)}
Find all students who have taken all courses offered in the Biology department:
{< i > | ∃ n, d, t (< i, n, d, t > ∈ student) ∧ ∀ x, y, z, w (< x, y, z, w > ∈ course ∧ z = “Biology” ⇒
∃ a, b, r, p (< i, x, a, b, r, p > ∈ takes))}
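In the domain relational calculus each variable stands for a single attribute value. Tuple unpacking in a Python comprehension gives a rough analogue: every name in the unpacking pattern plays the role of one domain variable. The rows below are hypothetical, and the comprehension is an informal illustration rather than a formal semantics.

```python
# Hypothetical rows.
# instructor(ID, name, dept_name, salary)
instructor = {
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("22222", "Einstein", "Physics", 95000),
    ("33456", "Gold", "Physics", 87000),
}
# teaches(ID, course_id, sec_id, semester, year)
teaches = {
    ("10101", "CS-101", "1", "Fall", 2009),
    ("22222", "PHY-101", "1", "Fall", 2009),
}

# {<i> | EXISTS n, d, s (<i, n, d, s> in instructor AND s > 80000)}
high_paid_ids = {(i,) for (i, n, d, s) in instructor if s > 80000}

# {<n, c> | EXISTS i, a, se, y (<i, c, a, se, y> in teaches
#           AND EXISTS d, s (<i, n, d, s> in instructor AND d = "Physics"))}
physics_courses = {
    (n, c)
    for (i, c, a, se, y) in teaches
    for (i2, n, d, s) in instructor
    if i == i2 and d == "Physics"  # the shared variable i becomes an equality test
}

print(sorted(high_paid_ids))
print(sorted(physics_courses))
```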
SQL (Structured Query Language)
IBM developed the original version of SQL, originally called Sequel, as part of the System R project in
the early 1970s. The Sequel language has evolved since then, and its name has changed to SQL (Structured
Query Language). Many products now support the SQL language. SQL has clearly established itself as the
standard relational database language.
In 1986, the American National Standards Institute (ANSI) and the International Organization for
Standardization (ISO) published an SQL standard, called SQL-86.


The SQL language has several parts:
• Data-definition language (DDL). The SQL DDL provides commands for defining relation schemas, deleting
relations, and modifying relation schemas.
• Data-manipulation language (DML). The SQL DML provides the ability to query information from the
database and to insert tuples into, delete tuples from, and modify tuples in the database.
• Transaction control. SQL includes commands for specifying the beginning and ending of transactions.
Basic Schema Definition
We define an SQL relation by using the create table command. The following command creates a
relation department in the database.
create table department (dept_name varchar (20), building varchar (15), budget numeric (12,2),
primary key (dept_name));
The general form of the create table command is:
create table r (A1 D1, A2 D2, . . . , An Dn, integrity-constraint1, . . . , integrity-constraint k );
where r is the name of the relation, each Ai is the name of an attribute in the schema of relation r, and Di
is the domain of attribute Ai; that is, Di specifies the type of attribute Ai along with optional constraints that
restrict the set of allowed values for Ai .
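The create table command can be tried without installing a database server by using Python's built-in sqlite3 module, as in the sketch below. The inserted rows are hypothetical. SQLite is an assumption made for convenience here: it accepts the varchar and numeric type names but does not enforce the declared lengths, so other systems may behave more strictly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table department (
        dept_name varchar(20),
        building  varchar(15),
        budget    numeric(12,2),
        primary key (dept_name)
    )
""")

conn.execute("insert into department values ('Physics', 'Watson', 70000)")

try:
    # A second row with the same primary-key value violates the integrity constraint.
    conn.execute("insert into department values ('Physics', 'Taylor', 85000)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# table_info lists one row per attribute; column 1 holds the attribute name.
columns = [row[1] for row in conn.execute("pragma table_info(department)")]
print(columns, duplicate_rejected)
```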
Basic Structure of SQL Queries
The basic structure of an SQL query consists of three clauses: select, from, and where. The query takes
as its input the relations listed in the from clause, operates on them as specified in the where and select clauses,
and then produces a relation as the result.
Queries on a Single Relation
Find the names of all instructors. Since names are found in the instructor relation, we put that relation in
the from clause:
select name from instructor;
A query result may contain duplicates. For example, select dept_name from instructor lists a department name
once for every instructor in that department. In those cases where we want to force the elimination of
duplicates, we insert the keyword distinct after select:
select distinct dept_name from instructor;
SQL allows the use of the logical connectives and, or, and not in the where clause. The operands of the
logical connectives can be expressions involving the comparison operators <, <=, >, >=, =, and <>. SQL
allows us to use the comparison operators to compare strings and arithmetic expressions, as well as special
types, such as date types.
Find the names of all instructors in the Computer Science department who have salary greater than 70,000:
select name from instructor where dept_name = 'Comp. Sci.' and salary > 70000;
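The single-relation queries above can be run against a small in-memory table with Python's built-in sqlite3 module. The table contents are hypothetical sample rows, and the results are sorted in Python only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor "
    "(ID varchar(5), name varchar(20), dept_name varchar(20), salary numeric(8,2))"
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("10101", "Srinivasan", "Comp. Sci.", 65000),
        ("45565", "Katz", "Comp. Sci.", 75000),
        ("83821", "Brandt", "Comp. Sci.", 92000),
        ("22222", "Einstein", "Physics", 95000),
    ],
)

# Without distinct, a department is listed once per instructor.
all_depts = [r[0] for r in conn.execute("select dept_name from instructor")]

# With distinct, duplicates are eliminated.
distinct_depts = sorted(
    r[0] for r in conn.execute("select distinct dept_name from instructor")
)

# The where clause combines two comparisons with the connective and.
rich_cs = sorted(
    r[0]
    for r in conn.execute(
        "select name from instructor "
        "where dept_name = 'Comp. Sci.' and salary > 70000"
    )
)

print(len(all_depts), distinct_depts, rich_cs)
```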
Queries on Multiple Relations
Set Operations
Set operators combine the results of two queries into a single result. The operators covered under set
operations are:
1. UNION
2. UNION ALL
3. INTERSECT
4. MINUS (called EXCEPT in standard SQL; MINUS is the keyword used by Oracle)
The queries combined by a set operator must follow these rules:
1. The number and order of columns must be the same.
2. Data types must be compatible.
Table 1: First
Table 2: Second

UNION Operation
UNION is used to combine the results of two or more SELECT statements. However it will eliminate
duplicate rows from its result set.
select * from first UNION select * from second;

UNION ALL
This operation is similar to Union. But it also shows the duplicate rows.
select * from first UNION ALL select * from second;

INTERSECT
The INTERSECT operation combines two SELECT statements but returns only the records that are
common to both SELECT statements.
select * from First INTERSECT select * from Second;

MINUS
The MINUS operation combines the results of two SELECT statements and displays the rows that are
present in the first query but absent in the second query, with no duplicates. Standard SQL names this
operation EXCEPT.
select * from First MINUS select * from Second;
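The four set operations can be compared side by side with Python's built-in sqlite3 module. Several assumptions apply to this sketch: the contents of the two tables are made up because the original table figures are not available, the tables are called first_t and second_t to stay clear of SQL keywords, and the query uses except because SQLite does not accept minus.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("first_t", "second_t"):
    conn.execute(f"create table {table} (id integer, name varchar(10))")

# Hypothetical contents: the row (2, 'B') appears in both tables.
conn.executemany("insert into first_t values (?, ?)", [(1, "A"), (2, "B")])
conn.executemany("insert into second_t values (?, ?)", [(2, "B"), (3, "C")])

def run(op):
    """Combine the two tables with the given set operator; sort for stable output."""
    sql = f"select * from first_t {op} select * from second_t"
    return sorted(conn.execute(sql).fetchall())

union_rows = run("union")          # duplicates removed
union_all_rows = run("union all")  # duplicates kept
intersect_rows = run("intersect")  # rows common to both
except_rows = run("except")        # rows only in the first table

print(union_rows)
print(union_all_rows)
print(intersect_rows)
print(except_rows)
```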


Null Values
Null values present special problems in relational operations, including arithmetic operations,
comparison operations, and set operations. The result of an arithmetic expression is null if any of the input
values is null.
The definitions of the Boolean operations are extended to deal with the value unknown.
• and: The result of true and unknown is unknown, false and unknown is false, while unknown and unknown is
unknown.
• or: The result of true or unknown is true, false or unknown is unknown, while unknown or unknown is
unknown.
• not: The result of not unknown is unknown.
SQL uses the special keyword null in a predicate to test for a null value. To find all instructors who
appear in the instructor relation with a null value for salary, we write:
select name from instructor where salary is null;
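The extended truth tables and the is null test can be checked with a short Python sketch. In the first half, None stands in for the truth value unknown; the function names and3, or3, and not3 are assumptions for illustration. The second half uses the built-in sqlite3 module with hypothetical rows to show that arithmetic on null yields null and that salary = null, unlike salary is null, selects nothing.

```python
import sqlite3

# Three-valued logic, with None standing in for "unknown".
def and3(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    return None if a is None else (not a)

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (name varchar(20), salary numeric(8,2))")
conn.executemany(
    "insert into instructor values (?, ?)",
    [("Einstein", 95000), ("Newhire", None)],  # None is stored as SQL null
)

null_sum = conn.execute("select null + 5").fetchone()[0]
is_null = [r[0] for r in conn.execute("select name from instructor where salary is null")]
eq_null = [r[0] for r in conn.execute("select name from instructor where salary = null")]

print(and3(True, None), and3(False, None), or3(True, None), or3(False, None))
print(null_sum, is_null, eq_null)
```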
Aggregate Functions
Aggregate functions are functions that take a collection (a set or multiset) of values as input and return a
single value. SQL offers five built-in aggregate functions:
• Average: avg
• Minimum: min
• Maximum: max
• Total: sum
• Count: count
The input to sum and avg must be a collection of numbers, but the other operators can operate on
collections of nonnumeric data types, such as strings, as well.
Examples:
Find the average salary of instructors in the Computer Science department.
select avg (salary) from instructor where dept_name = 'Comp. Sci.';
We can give a meaningful name to the result attribute by using the as clause as follows:
select avg (salary) as avg_salary from instructor where dept_name = 'Comp. Sci.';
Find the total number of instructors who teach a course in the Spring 2010 semester. In this case, an instructor
counts only once, regardless of the number of course sections that the instructor teaches.
select count (distinct ID) from teaches where semester = ’Spring’ and year = 2010;
To find the number of tuples in the course relation, we write
select count (*) from course;
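The aggregate queries above can be run with Python's built-in sqlite3 module. The rows are hypothetical, teaches is reduced to four columns, and count (*) is applied to teaches rather than course simply to keep the sketch to two tables. The helper name scalar is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor "
    "(ID varchar(5), name varchar(20), dept_name varchar(20), salary numeric(8,2))"
)
conn.execute(
    "create table teaches "
    "(ID varchar(5), course_id varchar(8), semester varchar(6), year numeric(4,0))"
)
conn.executemany("insert into instructor values (?, ?, ?, ?)", [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("45565", "Katz", "Comp. Sci.", 75000),
    ("83821", "Brandt", "Comp. Sci.", 91000),
    ("22222", "Einstein", "Physics", 95000),
])
conn.executemany("insert into teaches values (?, ?, ?, ?)", [
    ("10101", "CS-315", "Spring", 2010),
    ("45565", "CS-101", "Spring", 2010),
    ("45565", "CS-319", "Spring", 2010),
    ("22222", "PHY-101", "Fall", 2009),
])

def scalar(sql):
    """Run a query that returns one row with one column and return that value."""
    return conn.execute(sql).fetchone()[0]

avg_cs = scalar(
    "select avg(salary) as avg_salary from instructor where dept_name = 'Comp. Sci.'"
)
# Instructor 45565 teaches two Spring 2010 sections but is counted once.
spring_instructors = scalar(
    "select count(distinct ID) from teaches where semester = 'Spring' and year = 2010"
)
spring_sections = scalar(
    "select count(*) from teaches where semester = 'Spring' and year = 2010"
)
# min and max also work on strings.
first_name = scalar("select min(name) from instructor")

print(avg_cs, spring_instructors, spring_sections, first_name)
```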
Aggregation with Grouping
The attribute or attributes given in the group by clause are used to form groups. Tuples with the same
value on all attributes in the group by clause are placed in one group.
“Find the average salary in each department.” We write this query as follows:
select dept_name, avg (salary) as avg_salary from instructor group by dept_name;
The Having Clause
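SQL applies predicates in the having clause after groups have been formed, so aggregate functions may
be used in them. For example, to find the average salary of instructors in those departments where the
average salary is more than 42,000, we write: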


select dept_name, avg (salary) as avg_salary from instructor group by dept_name
having avg (salary) > 42000;
The meaning of a query containing aggregation, group by, or having clauses is defined by the
following sequence of operations:
1. As was the case for queries without aggregation, the from clause is first evaluated to get a relation.
2. If a where clause is present, the predicate in the where clause is applied on the result relation of the
from clause.
3. Tuples satisfying the where predicate are then placed into groups by the group by clause if it is present.
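4. The having clause, if it is present, is applied to each group; the groups that do not satisfy the having
clause predicate are removed.
5. The select clause uses the remaining groups to generate the tuples of the result, applying the aggregate
functions to get a single result tuple for each group.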
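The evaluation order described above can be observed with Python's built-in sqlite3 module. The rows are hypothetical, and an order by clause is added only so that the printed order is predictable. The last query shows that where filters individual tuples before the groups are formed, while having filters whole groups afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (name varchar(20), dept_name varchar(20), salary numeric(8,2))"
)
conn.executemany("insert into instructor values (?, ?, ?)", [
    ("Srinivasan", "Comp. Sci.", 65000),
    ("Katz", "Comp. Sci.", 75000),
    ("Brandt", "Comp. Sci.", 91000),
    ("Einstein", "Physics", 95000),
    ("Gold", "Physics", 87000),
    ("Mozart", "Music", 40000),
])

# One result tuple per department.
by_dept = conn.execute(
    "select dept_name, avg(salary) as avg_salary from instructor "
    "group by dept_name order by dept_name"
).fetchall()

# having removes the groups whose average is not above 42000 (Music).
with_having = conn.execute(
    "select dept_name, avg(salary) as avg_salary from instructor "
    "group by dept_name having avg(salary) > 42000 order by dept_name"
).fetchall()

# where runs first: the 65000 and 40000 tuples are dropped before grouping,
# so the Comp. Sci. average changes and Music forms no group at all.
where_then_group = conn.execute(
    "select dept_name, avg(salary) as avg_salary from instructor "
    "where salary > 70000 "
    "group by dept_name having avg(salary) > 42000 order by dept_name"
).fetchall()

print(by_dept)
print(with_having)
print(where_then_group)
```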
