Rdbms (Unit 2)
A relational database consists of a collection of tables, each of which is assigned a unique name. For
example, consider the instructor table, which stores information about instructors. The table has four column
headers: ID, name, dept_name, and salary. Each row of this table records information about an instructor.
In general, a row in a table represents a relationship among a set of values. Since a table is a collection
of such relationships, there is a close correspondence between the concept of table and the mathematical
concept of relation, from which the relational data model takes its name.
In the relational model, the term relation is used to refer to a table, while the term tuple is used to refer
to a row. Similarly, the term attribute refers to a column of a table.
We use the term relation instance to refer to a specific instance of a relation, that is, a relation containing a
specific set of rows. For each attribute of a relation, there is a set of permitted values, called the domain of the attribute.
Thus, the domain of the salary attribute of the instructor relation is the set of all possible salary values.
We require that, for all relations r, the domains of all attributes of r be atomic. A domain is atomic if
elements of the domain are considered to be indivisible units. The null value is a special value that signifies that
the value is unknown or does not exist for an attribute.
Database Schema
When we talk about a database, we must distinguish between the database schema, which is the logical design
of the database, and the database instance, which is a snapshot of the data in the database at a given instant
in time. The concept of a relation corresponds to the
programming-language notion of a variable, while the concept of a relation schema corresponds to the
programming-language notion of type definition.
In general, a relation schema consists of a list of attributes and their corresponding domains. The concept
of a relation instance corresponds to the programming-language notion of a value of a variable. The relation
schema of the department relation is:
department = {dept_name, building, budget}
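The distinction between schema and instance can be sketched with Python's built-in sqlite3 module. This is only an illustration: the declared column types stand in for attribute domains, and the sample rows are hypothetical, not data from these notes.

```python
import sqlite3

# In-memory database; the table mirrors the department schema above.
conn = sqlite3.connect(":memory:")

# The relation schema: attribute names with declared types (assumed domains).
conn.execute(
    "CREATE TABLE department ("
    " dept_name TEXT,"
    " building TEXT,"
    " budget NUMERIC)"
)

# A first relation instance (hypothetical sample rows).
conn.executemany(
    "INSERT INTO department VALUES (?, ?, ?)",
    [("Biology", "Watson", 90000), ("Physics", "Watson", 70000)],
)
instance_before = conn.execute(
    "SELECT * FROM department ORDER BY dept_name"
).fetchall()

# Adding a tuple changes the instance but not the schema.
conn.execute("INSERT INTO department VALUES ('History', 'Painter', 50000)")
instance_after = conn.execute(
    "SELECT * FROM department ORDER BY dept_name"
).fetchall()

# PRAGMA table_info returns one row per column; index 1 is the column name.
schema = [row[1] for row in conn.execute("PRAGMA table_info(department)")]
print(schema)
print(len(instance_before), len(instance_after))
```

The schema plays the role of the type definition, while `instance_before` and `instance_after` are two different values of the same relation variable.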
Relational Constraints
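We must have a way to specify how tuples within a relation are distinguished: the attribute values of a tuple
must be such that they uniquely identify the tuple, so no two tuples in a relation may have the same values for
all attributes. This requirement is expressed through keys, which are one kind of constraint on a relation.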
A super key is a set of one or more attributes that, taken collectively, allow us to identify uniquely a
tuple in the relation. For example, the ID attribute of the relation instructor is sufficient to distinguish one
instructor tuple from another. Thus, ID is a super key.
We are often interested in super keys for which no proper subset is a super key. Such minimal super
keys are called candidate keys. We use the term primary key to denote a candidate key that is chosen by the database
designer as the principal means of identifying tuples within a relation.
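The definitions of super key and candidate key can be checked against a concrete relation instance. The sketch below is a minimal illustration under assumptions: the rows are hypothetical, and the helper names is_superkey and candidate_keys are introduced here. Note that a key is a property of the schema declared by the designer; a single instance can only show that a set of attributes is not a key, never prove that it is one.

```python
from itertools import combinations

# Hypothetical instance of instructor; each dict is one tuple.
instructor = [
    {"ID": "10101", "name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"ID": "15151", "name": "Mozart", "dept_name": "Music", "salary": 40000},
]


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows):
    """Super keys of this instance with no proper subset that is a super key."""
    attributes = list(rows[0].keys())
    keys = []
    # Try smaller attribute sets first, so supersets of a found key are skipped.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


print(is_superkey(instructor, ("ID",)))
print(is_superkey(instructor, ("ID", "name")))
print(is_superkey(instructor, ("name", "salary")))
print(candidate_keys(instructor))
```

Here {ID, name} is a super key but not a candidate key, because its proper subset {ID} is already a super key.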
A relation, say r1, may include among its attributes the primary key of another relation, say r2. This
attribute is called a foreign key from r1, referencing r2. The relation r1 is also called the referencing relation
of the foreign key dependency, and r2 is called the referenced relation of the foreign key.
The constraint from section to teaches, under which every section must be taught by at least one instructor,
is an example of a referential integrity constraint; a referential
integrity constraint requires that the values appearing in specified attributes of any tuple in the referencing
relation also appear in specified attributes of at least one tuple in the referenced relation.
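Primary-key and foreign-key constraints can be seen in action with a small sqlite3 sketch. It assumes, for illustration only, that dept_name in instructor is declared as a foreign key referencing department; the rows and the helper try_insert are hypothetical. SQLite checks foreign keys only when the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")

# department is the referenced relation, instructor the referencing relation.
conn.execute(
    "CREATE TABLE department ("
    " dept_name TEXT PRIMARY KEY, building TEXT, budget NUMERIC)"
)
conn.execute(
    "CREATE TABLE instructor ("
    " ID TEXT PRIMARY KEY,"
    " name TEXT,"
    " dept_name TEXT REFERENCES department(dept_name),"
    " salary NUMERIC)"
)
conn.execute("INSERT INTO department VALUES ('Physics', 'Watson', 70000)")
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics', 95000)")


def try_insert(sql):
    """Run an INSERT and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same ID as an existing tuple: violates the primary key.
duplicate_id = try_insert(
    "INSERT INTO instructor VALUES ('22222', 'Gold', 'Physics', 87000)"
)
# 'Music' does not appear in department: violates the foreign key.
dangling_ref = try_insert(
    "INSERT INTO instructor VALUES ('33333', 'Mozart', 'Music', 40000)"
)
row_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
print(duplicate_id)
print(dangling_ref)
print(row_count)
```

Both inserts are refused, so the instructor relation still holds only the original tuple.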
The union operation performs a set union of two “similarly structured” relations. Consider a query to
find the set of all courses taught in the Fall 2009 semester, the Spring 2010 semester, or both.
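To find the set of all courses taught in the Fall 2009 semester, we write:
Π course_id (σ semester = “Fall” ∧ year = 2009 (section))
To find the set of all courses taught in the Spring 2010 semester, we write:
Π course_id (σ semester = “Spring” ∧ year = 2010 (section))
To answer the query, we need the union of these two sets:
Π course_id (σ semester = “Fall” ∧ year = 2009 (section)) ∪ Π course_id (σ semester = “Spring” ∧ year = 2010 (section))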
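As with the union operation, we must ensure that set differences are taken between compatible relations, that is,
relations with the same number of attributes and compatible domains for corresponding attributes.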
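Because relations are sets of tuples, selection, projection, union, and set difference can be imitated with Python sets. The following is a rough sketch with hypothetical section data; the helpers select and project_course_id are names introduced here for illustration.

```python
# Hypothetical instance of section: (course_id, sec_id, semester, year).
section = {
    ("CS-101", "1", "Fall", 2009),
    ("CS-347", "1", "Fall", 2009),
    ("CS-101", "1", "Spring", 2010),
    ("FIN-201", "1", "Spring", 2010),
    ("BIO-101", "1", "Summer", 2009),
}


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def project_course_id(relation):
    """Projection on course_id; the set removes duplicates automatically."""
    return {(t[0],) for t in relation}


fall_2009 = project_course_id(
    select(section, lambda t: t[2] == "Fall" and t[3] == 2009)
)
spring_2010 = project_course_id(
    select(section, lambda t: t[2] == "Spring" and t[3] == 2010)
)

# Both operands have one attribute (course_id), so they are compatible.
either = fall_2009 | spring_2010      # union
only_fall = fall_2009 - spring_2010   # set difference

print(sorted(either))
print(sorted(only_fall))
```

CS-101 is offered in both semesters but appears only once in the union, and it is excluded from the difference.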
In the tuple relational calculus, the notation:
∃ t ∈ r (Q(t))
means “there exists a tuple t in relation r such that predicate Q(t) is true”.
Using this notation, we can write the query “Find the instructor ID for each instructor with a salary greater than
80,000” as:
{t | ∃ s ∈ instructor (t[ID] = s[ID] ∧ s[salary] > 80000)}
We read the expression as “The set of all tuples t such that there exists a tuple s in relation instructor for
which the values of t and s for the ID attribute are equal, and the value of s for the salary attribute is greater
than 80,000.”
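A set comprehension in Python reads much like this expression. The sketch below uses hypothetical instructor rows; the loop variable s plays the role of the bound tuple variable, and each emitted one-element tuple plays the role of t.

```python
# Hypothetical instance of instructor.
instructor = [
    {"ID": "10101", "name": "Srinivasan", "salary": 65000},
    {"ID": "12121", "name": "Wu", "salary": 90000},
    {"ID": "22222", "name": "Einstein", "salary": 95000},
]

# {t | there exists s in instructor (t[ID] = s[ID] and s[salary] > 80000)}
result = {(s["ID"],) for s in instructor if s["salary"] > 80000}

print(sorted(result))
```

Only the ID attribute is defined on t, so each result tuple has a single component.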
The formula P ⇒ Q means “P implies Q”; that is, “if P is true, then Q must be true”. Consider the
query “Find all students who have taken all courses offered in the Biology department.” To write this query in
the tuple relational calculus, we introduce the “for all” construct, denoted by ∀. The notation:
∀ t ∈ r (Q(t))
means “Q is true for all tuples t in relation r.”
We write the expression for our query as follows:
{t | ∃ r ∈ student (r[ID] = t[ID]) ∧ (∀ u ∈ course (u[dept_name] = “Biology” ⇒ ∃ s ∈ takes (t[ID] = s[ID]
∧ s[course_id] = u[course_id])))}
We interpret this expression as “The set of all students (that is, (ID) tuples t) such that, for all tuples u in
the course relation, if the value of u on attribute dept_name is “Biology”, then there exists a tuple in the takes
relation that includes the student ID and the course_id.”
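The same query can be sketched in Python, with all() standing in for ∀, any() for ∃, and a small helper for ⇒. The data and the function names implies and took_all_courses are hypothetical, introduced only for this illustration.

```python
# Hypothetical instances of student, course, and takes.
student = [{"ID": "S1"}, {"ID": "S2"}, {"ID": "S3"}]
course = [
    {"course_id": "BIO-101", "dept_name": "Biology"},
    {"course_id": "BIO-301", "dept_name": "Biology"},
    {"course_id": "CS-101", "dept_name": "Comp. Sci."},
]
takes = [
    {"ID": "S1", "course_id": "BIO-101"},
    {"ID": "S1", "course_id": "BIO-301"},
    {"ID": "S2", "course_id": "BIO-101"},
    {"ID": "S2", "course_id": "CS-101"},
]


def implies(p, q):
    """P implies Q is false only when P is true and Q is false."""
    return (not p) or q


def took_all_courses(dept):
    """IDs of students who have taken every course offered by dept."""
    return {
        (r["ID"],)
        for r in student
        if all(
            implies(
                u["dept_name"] == dept,
                any(
                    s["ID"] == r["ID"] and s["course_id"] == u["course_id"]
                    for s in takes
                ),
            )
            for u in course
        )
    }


print(sorted(took_all_courses("Biology")))
print(sorted(took_all_courses("Music")))
```

If a department offers no courses, as with "Music" here, the implication holds for every course tuple, so every student satisfies the condition.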
Formal Definition
We are now ready for a formal definition. A tuple-relational-calculus expression is of the form:
{t | P(t)}
where P is a formula. Several tuple variables may appear in a formula. A tuple variable is said to be a free
variable unless it is quantified by a ∃ or ∀. Thus, in:
t ∈ instructor ∧ ∃ s ∈ department (t[dept_name] = s[dept_name])
t is a free variable. Tuple variable s is said to be a bound variable.
A tuple-relational-calculus formula is built up out of atoms. An atom has one of the following forms:
• s ∈ r, where s is a tuple variable and r is a relation.
• s[x] Θ u[y], where s and u are tuple variables, x is an attribute on which s is defined, y is an attribute on which
u is defined, and Θ is a comparison operator (<, ≤, =, ≠, >, ≥); we require that attributes x and y have domains
whose members can be compared by Θ.
• s[x] Θ c, where s is a tuple variable, x is an attribute on which s is defined, Θ is a comparison operator, and c is
a constant in the domain of attribute x.
Safety of Expressions
There is one final issue to be addressed. A tuple-relational-calculus expression may generate an infinite
relation. Suppose that we write the expression:
{t |¬ (t ∈ instructor)}
There are infinitely many tuples that are not in instructor. To help us define a restriction of the tuple relational
calculus, we introduce the concept of the domain of a tuple relational formula, P. Intuitively, the domain of P,
denoted dom(P), is the set of all values referenced by P.
We say that an expression {t | P(t)} is safe if all values that appear in the result are values from dom(P).
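The expression {t | ¬ (t ∈ instructor)} is not safe: its domain is the set of all values appearing in instructor,
yet a tuple t that is not in instructor may contain values that do not appear in instructor.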
UNION Operation
UNION combines the results of two or more SELECT statements and eliminates duplicate rows from the
result set.
select * from first UNION select * from second;
UNION ALL
This operation is similar to UNION, but it keeps duplicate rows in the result.
select * from first UNION ALL select * from second;
INTERSECT
The INTERSECT operation combines two SELECT statements, but it returns only the rows that are
common to both SELECT statements.
select * from First INTERSECT select * from Second;
MINUS
The MINUS operation combines the results of two SELECT statements and returns the rows that are
present in the first query but absent from the second, with duplicates removed. MINUS is the Oracle keyword;
the SQL standard calls this operation EXCEPT.
select * from First MINUS select * from Second;
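The four set operations can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the tables first_t and second_t are hypothetical stand-ins for First and Second, the rows are invented, and SQLite spells the difference operation EXCEPT rather than MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the First and Second tables.
conn.execute("CREATE TABLE first_t (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE second_t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO first_t VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO second_t VALUES (?, ?)", [(2, "Ben"), (3, "Chen")])


def run(operator):
    """Combine the two tables with the given set operator, ordered by id."""
    sql = (
        f"SELECT * FROM first_t {operator} SELECT * FROM second_t ORDER BY id"
    )
    return conn.execute(sql).fetchall()


union_rows = run("UNION")          # duplicates removed
union_all_rows = run("UNION ALL")  # duplicates kept
intersect_rows = run("INTERSECT")  # rows common to both
except_rows = run("EXCEPT")        # SQLite's spelling of MINUS

print(union_rows)
print(union_all_rows)
print(intersect_rows)
print(except_rows)
```

The row (2, 'Ben') occurs in both tables, so it appears once under UNION, twice under UNION ALL, alone under INTERSECT, and not at all under EXCEPT.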