23IT204T - DBMS Unit 1

The document outlines the fundamentals of Database Management Systems (DBMS), covering topics such as the purpose of database systems, data models, and database languages. It discusses the advantages and disadvantages of various data models, including the relational model and entity-relationship model, as well as the architecture of database systems. Additionally, it highlights the importance of data integrity, security, and the roles of storage and query processors within a DBMS.

Uploaded by

janani1466selvam

VEC – IV Semester – II Year – 23IT204T –

CSE

23IT204T – DATABASE MANAGEMENT SYSTEMS

UNIT 1 RELATIONAL DATABASES 10


Purpose of Database System – Views of data – Data Models – Database System
Architecture – Introduction to relational databases – Keys – Integrity
constraints: Domain Constraints, Entity Integrity Constraints, Referential
Integrity Constraints – Relational Algebra – SQL fundamentals.

DBMS – Definition
 A database-management system (DBMS) is a collection of interrelated data and
a set of programs to access those data. The collection of data, usually referred
to as the database, contains information relevant to an enterprise.
 The primary goal of a DBMS is to provide a way to store and retrieve database
information that is both convenient and efficient.

Database Applications
Databases are widely used. Here are some representative applications:
• Enterprise Information
◦ Sales: For customer, product, and purchase information.
◦ Accounting: For payments, receipts, account balances, assets
and other accounting information.
◦ Human resources: For information about employees, salaries,
payroll taxes, and benefits, and for generation of paychecks.
◦ Manufacturing: For management of the supply chain.
◦ Online retailers: For sales data noted above plus online order
tracking, generation of recommendation lists, and maintenance of online
product evaluations.
• Banking and Finance
◦ Banking: For customer information, accounts, loans, and
banking transactions.
◦ Credit card transactions: For purchases on credit cards and
generation of monthly statements.
◦ Finance: For storing information about holdings, sales, and purchases
of financial instruments such as stocks and bonds.
• Universities: For student information, course registrations, and
grades (in addition to standard enterprise information such as human
resources and accounting).

• Airlines: For reservations and schedule information. Airlines were among
the first to use databases in a geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating
monthly bills, maintaining balances on prepaid calling cards, and storing
information about the communication networks.

Dr.S.SRIDEVI, AP/CSE

PURPOSE OF DATABASE SYSTEM


 Database systems arose in response to early methods of computerized
management of commercial data.
 A typical file-processing system is supported by a conventional
operating system.
 The system stores permanent records in various files, and it needs
different application programs to extract records from, and add records
to, the appropriate files.
 Before database management systems (DBMSs) were introduced,
organizations usually stored information in such systems.
Keeping organizational information in a file-processing system has a number of
major disadvantages:
 Data redundancy and inconsistency.
Redundancy leads to higher storage and access cost. In addition, it may lead to
data inconsistency; that is, the various copies of the same data may no longer
agree. For example, a changed student address may be reflected in the Music
department records but not elsewhere in the system.
 Difficulty in accessing data.
The point here is that conventional file-processing environments do not allow
needed data to be retrieved in a convenient and efficient manner. More
responsive data-retrieval systems are required for general use.
 Data isolation.
Because data are scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is
difficult.
 Integrity problems.
The data values stored in the database must satisfy certain types of consistency
constraints.
 Atomicity problems.
A computer system, like any other device, is subject to failure. In many
applications, it is crucial that, if a failure occurs, the data be restored to the
consistent state that existed prior to the failure.
 Concurrent-access anomalies.
For the sake of overall performance of the system and faster response, many
systems allow multiple users to update the data simultaneously.
 Security problems.
Not every user of the database system should be able to access all the data.
These difficulties, among others, prompted the development of database
systems. A DBMS provides the concepts and algorithms that enable database
systems to solve the problems with file-processing systems.
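
The atomicity problem above can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as the DBMS; the account table, its rows, and the transfer function are made up for illustration. A funds transfer consists of two updates, and if a failure occurs between them the database should return to the state that existed before the transfer began.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only for this illustration.
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
    except RuntimeError:
        pass  # the partial debit has already been rolled back

def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))

transfer(conn, "A", "B", 100, fail_midway=True)
after_failure = balances(conn)   # the failed transfer leaves no trace
transfer(conn, "A", "B", 100)
after_success = balances(conn)   # both updates applied together
```

In a plain file-processing system the application itself would have to detect and undo the half-finished transfer; here the DBMS does it.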

VIEWS OF DATA
A major purpose of a database system is to provide users with an abstract view
of the data. That is, the system hides certain details of how the data are stored
and maintained.

Data Abstraction
For the system to be usable, it must retrieve data efficiently. The need for
efficiency has led designers to use complex data structures to represent data in
the database. Since many database-system users are not computer trained,
developers hide the complexity from users through several levels of abstraction,
to simplify users’ interactions with the system:

 Physical level. The lowest level of abstraction describes how the data are
actually stored. The physical level describes complex low-level data
structures in detail.

 Logical level. The next-higher level of abstraction describes what data
are stored in the database, and what relationships exist among those data.
The logical level thus describes the entire database in terms of a small
number of relatively simple structures.

 View level. The highest level of abstraction describes only part of the
entire database. Even though the logical level uses simpler structures,
complexity remains because of the variety of information stored in a large
database.

type instructor = record
    ID : char (5);
    name : char (20);
    dept_name : char (20);
    salary : numeric (8,2);
end;
This code defines a new record type called instructor with four fields.
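
As a rough sketch of how the logical and view levels relate, the record type above can be declared as a table, and a view can expose only part of it. This assumes Python's sqlite3 module; the sample row and the view name instructor_public are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Logical level: the full instructor relation, mirroring the record type above.
conn.execute("""
    CREATE TABLE instructor (
        ID        CHAR(5) PRIMARY KEY,
        name      VARCHAR(20),
        dept_name VARCHAR(20),
        salary    NUMERIC(8,2)
    )
""")
# Made-up sample row.
conn.execute("INSERT INTO instructor VALUES ('10101', 'Srinivasan', 'Comp. Sci.', 65000)")

# View level: a hypothetical view that exposes everything except salary.
conn.execute("CREATE VIEW instructor_public AS "
             "SELECT ID, name, dept_name FROM instructor")

cur = conn.execute("SELECT * FROM instructor_public")
view_columns = [col[0] for col in cur.description]
view_rows = cur.fetchall()
```

A user who is given only instructor_public sees three of the four fields; how the rows are laid out on disk (the physical level) is hidden from both levels.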

Instances and Schemas


 Databases change over time as information is inserted and deleted.
 The collection of information stored in the database at a particular
moment is called an instance of the database.
 The overall design of the database is called the database schema.
Schemas are changed infrequently, if at all.
 Database systems have several schemas, partitioned according to the
levels of abstraction.
 The physical schema describes the database design at the physical level,
while the logical schema describes the database design at the logical
level.
 A database may also have several schemas at the view level, sometimes
called subschemas that describe different views of the database.
 Application programs are said to exhibit physical data independence if
they do not depend on the physical schema, and thus need not be
rewritten if the physical schema changes.
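
Physical data independence can be sketched as follows, assuming Python's sqlite3 module. Adding an index changes the physical schema, yet the application function, written against the logical schema only, is not touched. The index name idx_student_dept and the function names_in are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(111, "Ajay", "CSE"), (112, "Aravind", "CSE"),
                  (113, "Balakumaran", "CSE")])

def names_in(dept):
    """Application code: depends on the logical schema only."""
    rows = conn.execute("SELECT name FROM student WHERE dept = ? ORDER BY id", (dept,))
    return [name for (name,) in rows]

before = names_in("CSE")
# Change the physical schema: add an index on dept.
conn.execute("CREATE INDEX idx_student_dept ON student (dept)")
after = names_in("CSE")   # same function, same answer
```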
DATA MODELS
Underlying the structure of a database is the data model: a collection of
conceptual tools for describing data, data relationships, data semantics, and
consistency constraints. A data model provides a way to describe the design of a
database at the physical, logical, and view levels.

There are a number of different data models that we shall cover in the text. The
data models can be classified into four different categories:

Relational Model.
 The relational model uses a collection of tables to represent both data and
the relationships among those data. Each table has multiple columns, and
each column has a unique name. Tables are also known as relations.
 The relational model is an example of a record-based model.
Table: Student
Student ID   Student Name   Department   Date of Birth
111          Ajay           CSE          23-Jun-1999
112          Aravind        CSE          20-Jan-1998
113          Balakumaran    CSE          21-Jun-1999

Student(StudentID, StudentName, Department, DOB)
StudentID, underlined in the schema, is the primary key.


 Advantages
 Structural Independence
 Conceptual Simplicity
 Design, Implementation and Maintenance

 Disadvantages
 Significant hardware and software overhead
 Not as good as transaction process modeling
 May have slower processing times than the hierarchical and
network models
Entity-Relationship Model.
The entity-relationship (E-R) data model uses a collection of basic objects,
called entities, and relationships among these objects. An entity is a “thing” or
“object” in the real world that is distinguishable from other objects. The
entity-relationship model is widely used in database design. An E-R diagram
uses the following symbols:

 Rectangles: Entity sets
 Ellipses: Attributes
 Diamonds: Relationship sets
 Lines: Link attributes to entity sets and entity sets to relationship sets
 Double ellipses: Multivalued attributes
 Dashed ellipses: Derived attributes
 Double rectangles: Weak entity sets
 Double lines: Total participation of an entity in a relationship set

[Figure: Components of an ER diagram]

 Advantages
o Easy to develop a relational model from an ER model
o ER specifies mapping cardinalities
o Specifies keys such as primary keys and foreign keys

 Disadvantages
o Used for design purposes only, not for implementation

Object-Based Data Model.


 Object-oriented programming (especially in Java, C++, or C#) has
become the dominant software-development methodology.
 This led to the development of an object-oriented data model that can be
seen as extending the E-R model with notions of encapsulation, methods
(functions), and object identity.
 The object-relational data model combines features of the object-oriented
data model and relational data model.

Semistructured Data Model.


 The semistructured data model permits the specification of data where
individual data items of the same type may have different sets of
attributes.
 This is in contrast to the data models mentioned earlier, where every data
item of a particular type must have the same set of attributes.
 The Extensible Markup Language (XML) is widely used to represent
semistructured data.
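
A small sketch of semistructured data, assuming Python's standard xml.etree.ElementTree module. The XML document, including its element names and phone numbers, is invented for illustration. Both items are of the same type (student), but they carry different sets of attributes, and one attribute appears twice.

```python
import xml.etree.ElementTree as ET

# Made-up document: the two students do not share the same set of fields.
doc = """
<students>
  <student><id>111</id><name>Ajay</name><email>ajay@vec.in</email></student>
  <student><id>112</id><name>Aravind</name><phone>555-0100</phone><phone>555-0101</phone></student>
</students>
"""
root = ET.fromstring(doc)

# Collect the distinct field names carried by each student element.
attribute_sets = [sorted({child.tag for child in student})
                  for student in root.findall("student")]
```

In a relational table both rows would need the same columns; here each item simply carries whatever fields it has.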
Historically, the network data model and the hierarchical data model
preceded the relational data model. These models were tied closely to the
underlying implementation, and complicated the task of modeling data. As a
result they are used little now, except in old database code that is still in service
in some places.

Database Languages
A database system provides a data-definition language to specify the database
schema and a data-manipulation language to express database queries and
updates. In practice, the data-definition and data-manipulation languages are
not two separate languages; instead they simply form parts of a single database
language, such as the widely used SQL language.
Data-Manipulation Language
A data-manipulation language (DML) is a language that enables users to access
or manipulate data as organized by the appropriate data model. The types of
access are:
• Retrieval of information stored in the database
• Insertion of new information into the database


• Deletion of information from the database
• Modification of information stored in the database

There are basically two types of DML:


 Procedural DMLs require a user to specify what data are needed and
how to get those data.
 Declarative DMLs (also referred to as nonprocedural DMLs) require a
user to specify what data are needed without specifying how to get those
data.
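
The difference between the two styles can be sketched in a few lines, assuming Python's sqlite3 module and a made-up student table. The loop spells out how to find the data; the SQL query states only what is wanted and leaves the how to the DBMS.

```python
import sqlite3

# Made-up rows: (id, name, dept, credits).
rows = [(111, "Priya", "CSE", 9.2), (112, "Shan", "EEE", 8.7), (113, "Ajay", "CSE", 8.2)]

# Procedural style: scan every record, test it, collect and sort the matches.
procedural = []
for sid, name, dept, credits in rows:
    if dept == "CSE":
        procedural.append(name)
procedural.sort()

# Declarative style: describe the result; the DBMS chooses the access method.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, dept TEXT, credits REAL)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", rows)
declarative = [name for (name,) in conn.execute(
    "SELECT name FROM student WHERE dept = 'CSE' ORDER BY name")]
```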

A query is a statement requesting the retrieval of information. The portion of a
DML that involves information retrieval is called a query language. Although
technically incorrect, it is common practice to use the terms query language
and data-manipulation language synonymously.

Data-Definition Language
 A database schema is specified by a set of definitions expressed in a special
language called a data-definition language (DDL). The DDL is also used to
specify additional properties of the data.
 Storage structure and access methods used by the database system are
specified by a set of statements in a special type of DDL called a data
storage and definition language.
 These statements define the implementation details of the database
schemas, which are usually hidden from the users.
 The data values stored in the database must satisfy certain consistency
constraints. For example, suppose the university requires that the account
balance of a department must never be negative.

Domain Constraints. A domain of possible values must be associated with
every attribute (for example integer types, character types, date/time types).
Declaring an attribute to be of a particular domain acts as a constraint on the
values that it can take. Domain constraints are the most elementary form of
integrity constraint.

Referential Integrity. There are cases where we wish to ensure that a value
that appears in one relation for a given set of attributes also appears in a
certain set of attributes in another relation (referential integrity).
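
A minimal sketch of referential integrity, assuming Python's sqlite3 module (SQLite enforces foreign keys only after the PRAGMA shown). The department and instructor tables and their rows are made up for illustration: every dept_name stored in instructor must also appear in department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE instructor (
        ID        TEXT PRIMARY KEY,
        name      TEXT,
        dept_name TEXT REFERENCES department (dept_name)
    )
""")
conn.execute("INSERT INTO department VALUES ('CSE')")
conn.execute("INSERT INTO instructor VALUES ('1', 'Priya', 'CSE')")  # department exists

try:
    # 'EEE' does not appear in department, so this insert violates the constraint.
    conn.execute("INSERT INTO instructor VALUES ('2', 'Shan', 'EEE')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

instructor_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```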

Assertions. An assertion is any condition that the database must always satisfy.
Domain constraints and referential-integrity constraints are special forms of
assertions.

Authorization. We may want to differentiate among the users as far as the type
of access they are permitted on various data values in the database. These
differentiations are expressed in terms of authorization, the most common
being:
• read authorization, which allows reading, but not modification, of data;
• insert authorization, which allows insertion of new data, but not
modification of existing data;
• update authorization, which allows modification, but not deletion, of data;
• delete authorization, which allows deletion of data.

Data Storage and Querying


A database system is partitioned into modules that deal with each of the
responsibilities of the overall system. The functional components of a database
system can be broadly divided into the storage manager and the query
processor components.

DATABASE SYSTEM ARCHITECTURE

 The architecture of a database system is greatly influenced by the
underlying computer system on which the database system runs.
 Database systems can be centralized, or client-server, where one server
machine executes work on behalf of multiple client machines. Database
systems can also be designed to exploit parallel computer architectures.

Storage Manager
 The storage manager is the component of a database system that provides
the interface between the low-level data stored in the database and the
application programs and queries submitted to the system.
 The storage manager is responsible for the interaction with the file
manager. The raw data are stored on the disk using the file system
provided by the operating system. The storage manager translates the
various DML statements into low-level file-system commands.

The storage manager components include:


• Authorization and integrity manager, which tests for the
satisfaction of integrity constraints and checks the authority of users to
access data.
• Transaction manager, which ensures that the database remains in a
consistent (correct) state despite system failures, and that concurrent
transaction executions proceed without conflicting.
• File manager, which manages the allocation of space on disk storage
and the data structures used to represent information stored on disk.
• Buffer manager, which is responsible for fetching data from disk
storage into main memory, and deciding what data to cache in main
memory. The buffer manager is a critical part of the database system,
since it enables the database to handle data sizes that are much larger
than the size of main memory.
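
The buffer manager's job can be sketched with a toy buffer pool in Python. This is only an illustration under stated assumptions: the class and method names are invented, a dictionary stands in for disk storage, and least-recently-used (LRU) eviction is an assumed policy, since the text does not name one.

```python
from collections import OrderedDict

class BufferManager:
    """Toy buffer pool: caches up to `capacity` disk blocks in memory (LRU eviction)."""

    def __init__(self, disk, capacity):
        self.disk = disk              # stand-in for disk storage: block id -> data
        self.capacity = capacity
        self.pool = OrderedDict()     # cached blocks, least recently used first
        self.disk_reads = 0

    def get_block(self, block_id):
        if block_id in self.pool:
            self.pool.move_to_end(block_id)   # hit: mark as most recently used
            return self.pool[block_id]
        self.disk_reads += 1                  # miss: fetch the block from "disk"
        data = self.disk[block_id]
        if len(self.pool) >= self.capacity:
            self.pool.popitem(last=False)     # evict the least recently used block
        self.pool[block_id] = data
        return data

disk = {i: f"block-{i}" for i in range(10)}   # 10 blocks on disk, room for 2 in memory
buf = BufferManager(disk, capacity=2)
for block_id in [0, 1, 0, 2, 0, 1]:
    buf.get_block(block_id)
```

Six requests are served with four disk reads, even though the "database" is five times larger than the memory set aside for it.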

The storage manager implements several data structures as part of the
physical system implementation:
• Data files, which store the database itself.
• Data dictionary, which stores metadata about the structure of the
database, in particular the schema of the database.

• Indices, which can provide fast access to data items. Like the index in this
textbook, a database index provides pointers to those data items that hold a
particular value.

The Query Processor

The query processor components include:
• DDL interpreter, which interprets DDL statements and records the
definitions in the data dictionary.
• DML compiler, which translates DML statements in a query
language into an evaluation plan consisting of low-level instructions that
the query evaluation engine understands. The DML compiler also performs
query optimization; that is, it picks the lowest cost evaluation plan from
among the alternatives.
• Query evaluation engine, which executes low-level instructions generated by
the DML compiler.

In a two-tier architecture, the application resides at the client machine, where
it invokes database system functionality at the server machine through query
language statements. Application program interface standards like ODBC and
JDBC are used for interaction between the client and the server.

In contrast, in a three-tier architecture, the client machine acts as merely a
front end and does not contain any direct database calls. Instead, the client end
communicates with an application server, usually through a forms interface.
The application server in turn communicates with a database system to access
data. The business logic of the application, which says what actions to carry out
under what conditions, is embedded in the application server, instead of being
distributed across multiple clients. Three-tier applications are more appropriate
for large applications, and for applications that run on the World Wide Web.
Database Users and Administrators

A primary goal of a database system is to retrieve information from and
store new information into the database.
Database Users and User Interfaces
There are four different types of database-system users, differentiated by the
way they expect to interact with the system.
1. Naive users are unsophisticated users who interact with the system by
invoking one of the application programs that have been written
previously.
2. Application programmers are computer professionals who write
application programs.
3. Sophisticated users interact with the system without writing programs.
Instead, they form their requests either using a database query language
or by using tools such as data analysis software.
4. Specialized users are sophisticated users who write specialized
database applications that do not fit into the traditional data-processing
framework.

Database Administrator
One of the main reasons for using a DBMS is to have central control of both the
data and the programs that access those data. A person who has such
central control over the system is called a database administrator (DBA).
The functions of a DBA include:
 Schema definition. The DBA creates the original database schema by
executing a set of data definition statements in the DDL.
 Storage structure and access-method definition.
 Schema and physical organization modification. The DBA carries out
changes to the schema and physical organization to reflect the changing
needs of the organization, or to alter the physical organization to improve
performance.
 Granting of authorization for data access. By granting different types
of authorization, the database administrator can regulate which parts of
the database various users can access.
 Routine maintenance. Examples of the database administrator’s routine
maintenance activities are:
o Periodically backing up the database, either onto tapes or onto
remote servers, to prevent loss of data in case of disasters such as
flooding.
o Ensuring that enough free disk space is available for normal
operations, and upgrading disk space as required.
o Monitoring jobs running on the database and ensuring that
performance is not degraded by very expensive tasks submitted by
some users.

Introduction to Relational Databases

In the relational model:
 relation is used to refer to a table
 tuple is used to refer to a row
 attribute refers to a column of a table
 relation instance refers to a specific instance of a relation
 For each attribute of a relation, there is a set of permitted values, called
the domain of that attribute.
 A domain is atomic if elements of the domain are considered to be
indivisible units.
o For example, suppose the table instructor had an attribute phone
number, which can store a set of phone numbers corresponding to
the instructor. Then the domain of phone number would not be
atomic, since an element of the domain is a set of phone numbers,
and it has sub parts, namely the individual phone numbers in the
set.
 The null value is a special value that signifies that the value is unknown
or does not exist.
o For example, suppose as before that we include the attribute phone
number in the instructor relation. It may be that an instructor does
not have a phone number at all, or that the telephone number is
unlisted.
 Degree – Total number of attributes (columns) in a relation.
 Cardinality – Total number of tuples (rows) in a relation.

Database Schema
 The database schema is the logical design of the database.
 A database instance is a snapshot of the data in the database at a
given instant in time.
 The concept of a relation corresponds to the programming-language
notion of a variable, while the concept of a relation schema corresponds
to the programming-language notion of type definition.
 Example relation schemas:
o student (ID, name, dept_name, tot_cred)
o advisor (s_id, i_id)
o takes (ID, course_id, sec_id, semester, year, grade)
o classroom (building, room_number, capacity)
o time_slot (time_slot_id, day, start_time, end_time)

Relation: Student

ID    Name      Department   Email          Credits
111   Priya     CSE          Shan@vec.in    9.2
112   Shan      EEE          shanv@vec.in   8.7
113   Ajay      CSE          ajay@vec.in    8.2
114   Aravind   EEE          arav@vec.in    7.7
115   Pooja     CSE          Pooja@vec.in   9.1

Tuple (one row of the relation)
111   Priya     CSE          Shan@vec.in    9.2

Attributes (the column names)
ID    Name      Department   Email          Credits

Relation Instance
Select * from student where id = 111 or id = 112;
ID    Name      Department   Email          Credits
111   Priya     CSE          Shan@vec.in    9.2
112   Shan      EEE          shanv@vec.in   8.7

Degree = Total Number of Columns = 5
Cardinality = Total Number of Rows = 5
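
The numbers above can be checked with a short sketch, assuming Python's sqlite3 module; the column types are an assumption, since the notes give only the column names. The degree is the number of columns in the result description and the cardinality is the number of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, "
             "department TEXT, email TEXT, credits REAL)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", [
    (111, "Priya", "CSE", "Shan@vec.in", 9.2),
    (112, "Shan", "EEE", "shanv@vec.in", 8.7),
    (113, "Ajay", "CSE", "ajay@vec.in", 8.2),
    (114, "Aravind", "EEE", "arav@vec.in", 7.7),
    (115, "Pooja", "CSE", "Pooja@vec.in", 9.1),
])

cur = conn.execute("SELECT * FROM student")
degree = len(cur.description)       # number of attributes (columns)
cardinality = len(cur.fetchall())   # number of tuples (rows)

# The query from the notes returns a relation with the same degree but fewer tuples.
subset = conn.execute("SELECT * FROM student WHERE id = 111 OR id = 112").fetchall()
```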

KEYS
 We must have a way to specify how tuples within a given relation are
distinguished.
 This is expressed in terms of their attributes.
 That is, the attribute values of a tuple must be such that they
can uniquely identify the tuple.
 In other words, no two tuples in a relation are allowed to have exactly the
same value for all attributes.

A superkey is a set of one or more attributes that, taken collectively, allow us
to identify uniquely a tuple in the relation. For example, the ID attribute of the
relation instructor is sufficient to distinguish one instructor tuple from another.
Thus, ID is a superkey. The name attribute of instructor, on the other hand, is
not a superkey, because several instructors might have the same name.

A superkey may contain extraneous attributes. For example, the combination
of ID and name is a superkey for the relation instructor. If K is a superkey,
then so is any superset of K. We are often interested in superkeys for which no
proper subset is a superkey. Such minimal superkeys are called candidate
keys.

Every candidate key is a superkey, and the primary key is one of the candidate keys.

The term primary key is used to denote a candidate key that is chosen by the
database designer as the principal means of identifying tuples within a relation.

A key (whether primary, candidate, or super) is a property of the entire
relation, rather than of the individual tuples.

A superkey of a relation is a set of one or more attributes whose values are
guaranteed to identify tuples in the relation uniquely. A candidate key is a
minimal superkey, that is, a set of attributes that forms a superkey, but none
of whose proper subsets is a superkey. One of the candidate keys of a relation is
chosen as its primary key.
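
The definitions of superkey and candidate key can be sketched in Python. The instructor rows below are made up for illustration. Note the limitation: a key is a property of the whole relation, so uniqueness in one instance is only a necessary check, not a proof that the attributes form a key.

```python
from itertools import combinations

ATTRS = ("ID", "name", "dept")
# Made-up instructor tuples; several instructors share the name "Shan".
rows = [
    {"ID": 1, "name": "Shan", "dept": "CSE"},
    {"ID": 2, "name": "Shan", "dept": "EEE"},
    {"ID": 3, "name": "Ajay", "dept": "CSE"},
    {"ID": 4, "name": "Shan", "dept": "CSE"},
]

def is_superkey(attrs, rows):
    """True if no two rows agree on every attribute in attrs (for this instance only)."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(all_attrs, rows):
    """Minimal superkeys: superkeys with no proper subset that is also a superkey."""
    keys = []
    # Smaller attribute sets are tried first, so every proper subset that is a
    # superkey contains a key already recorded in `keys`.
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if is_superkey(attrs, rows) and not any(set(k) <= set(attrs) for k in keys):
                keys.append(attrs)
    return keys

id_is_superkey = is_superkey(("ID",), rows)
id_name_is_superkey = is_superkey(("ID", "name"), rows)   # superset of a superkey
name_is_superkey = is_superkey(("name",), rows)
keys = candidate_keys(ATTRS, rows)
```

Here (ID, name) is a superkey but not a candidate key, because its proper subset (ID) is already a superkey.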

A foreign key is a set of attributes in a referencing relation, such that for each
tuple in the referencing relation, the values of the foreign key attributes are
guaranteed to occur as the primary key value of a tuple in the referenced
relation.

A schema diagram is a pictorial depiction of the schema of a database that
shows the relations in the database, their attributes, and primary keys and
foreign keys.

The relational query languages define a set of operations that operate on
tables, and output tables as their results. These operations can be combined to
get expressions that express desired queries.

The relational algebra provides a set of operations that take one or more
relations as input and return a relation as an output. Practical query languages
such as SQL are based on the relational algebra, but add a number of useful
syntactic features.
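
Two relational-algebra operations, selection and projection, can be sketched in plain Python. This is an illustration only: a relation is modelled as a list of dictionaries, the function names are made up, and the student rows are sample data. Because each operation takes a relation and returns a relation, the operations can be combined into larger expressions.

```python
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attrs):
    """Projection: keep only the listed attributes, dropping duplicate tuples."""
    result = []
    for t in relation:
        reduced = {a: t[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result

student = [
    {"ID": 111, "name": "Priya", "dept": "CSE"},
    {"ID": 112, "name": "Shan", "dept": "EEE"},
    {"ID": 113, "name": "Ajay", "dept": "CSE"},
]

# Names of CSE students: a projection applied to the result of a selection.
cse_names = project(select(student, lambda t: t["dept"] == "CSE"), ["name"])
# Projection removes duplicates, so each department appears once.
depts = project(student, ["dept"])
```

The first expression corresponds to the SQL query select name from student where dept = 'CSE'.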

SQL Fundamentals
SQL is a database computer language designed for the retrieval and
management of data in a relational database. SQL stands for Structured Query
Language.

What is SQL?

SQL is Structured Query Language, which is a computer language for storing,
manipulating and retrieving data stored in a relational database.
SQL is the standard language for relational database systems. All relational
database management systems like MySQL, MS Access, Oracle, Sybase,
Informix, PostgreSQL and SQL Server use SQL as the standard database language.

However, they use different dialects, such as:
 MS SQL Server uses T-SQL,
 Oracle uses PL/SQL,
 the MS Access version of SQL is called JET SQL (native format), etc.

Why SQL?
 Allows users to access data in relational database management systems.
 Allows users to describe the data.
 Allows users to define the data in database and manipulate that data.
 Allows embedding within other languages using SQL modules, libraries &
pre-compilers.
 Allows users to create and drop databases and tables.
 Allows users to create views, stored procedures, and functions in a database.
 Allows users to set permissions on tables, procedures, and views.

History:
 1970 -- Dr. Edgar F. "Ted" Codd of IBM, known as the father of
relational databases, described a relational model for databases.
 1974 -- Structured Query Language appeared.
 1978 -- IBM worked to develop Codd's ideas and released a product
named System/R.
 1986 -- SQL was standardized by ANSI. IBM had developed the first prototype
of a relational database; the first relational database was released by
Relational Software, which later became Oracle.

SQL Process:
When you execute an SQL command for any RDBMS, the system
determines the best way to carry out your request, and the SQL engine figures out
how to interpret the task. There are various components included in the
process. These components are the Query Dispatcher, Optimization Engines,
Classic Query Engine, SQL Query Engine, etc. The classic query engine handles
all non-SQL queries, but the SQL query engine won't handle logical files.
[Figure: SQL architecture]

Overview of the SQL Query Language


The SQL language has several parts:
 Data-definition language (DDL). The SQL DDL provides commands for
defining relation schemas, deleting relations, and modifying relation
schemas.
 Data-manipulation language (DML). The SQL DML provides the
ability to query information from the database and to insert tuples into,
delete tuples from, and modify tuples in the database.
 Integrity. The SQL DDL includes commands for specifying integrity
constraints that the data stored in the database must satisfy. Updates that
violate integrity constraints are disallowed.
 View definition. The SQL DDL includes commands for defining views.
 Transaction control. SQL includes commands for specifying the
beginning and ending of transactions.
 Embedded SQL and dynamic SQL. Embedded and dynamic SQL define
how SQL statements can be embedded within general-purpose
programming languages, such as C, C++, and Java.
 Authorization. The SQL DDL includes commands for specifying access
rights to relations and views.
Basic Types
The SQL standard supports a variety of built-in types, including:
 char(n): A fixed-length character string with user-specified length n. The
full form, character, can be used instead.
 varchar(n): A variable-length character string with user-specified
maximum length n. The full form, character varying, is equivalent.
 int: An integer (a finite subset of the integers that is machine
dependent). The full form, integer, is equivalent.
 smallint: A small integer (a machine-dependent subset of the integer
type).
 numeric(p,d): A fixed-point number with user-specified precision. The
number consists of p digits (plus a sign), and d of the p digits are to the right
of the decimal point. Thus, numeric(3,1) allows 44.5 to be stored exactly,
but neither 444.5 nor 0.32 can be stored exactly in a field of this type.
 real, double precision: Floating-point and double-precision floating-
point numbers with machine-dependent precision.
 float(n): A floating-point number, with precision of at least n digits.
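
The numeric(p,d) rule can be checked with a short sketch using Python's standard decimal module. The function name fits_numeric is made up for illustration; it tests whether a value has at most d digits after the decimal point and at most p digits in total.

```python
from decimal import Decimal

def fits_numeric(value, p, d):
    """True if the decimal string `value` can be stored exactly in numeric(p, d)."""
    x = Decimal(value)
    scaled = x.scaleb(d)                     # shift the decimal point d places right
    if scaled != scaled.to_integral_value():
        return False                         # more than d digits after the point
    return abs(scaled) < Decimal(10) ** p    # at most p digits in total

fits_44_5 = fits_numeric("44.5", 3, 1)     # 3 digits, 1 after the point
fits_444_5 = fits_numeric("444.5", 3, 1)   # needs 4 digits
fits_0_32 = fits_numeric("0.32", 3, 1)     # needs 2 digits after the point
```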

Integrity Constraints
 Integrity constraints ensure that changes made to the database by
authorized users do not result in a loss of data consistency.
 Thus, integrity constraints guard against accidental damage to the
database.
Integrity constraints include
o not null
o unique
o check(<predicate>)
1. Not Null Constraint
name varchar(20) not null
budget numeric(12,2) not null

The not null specification prohibits the insertion of a null value for the attribute.
Any database modification that would cause a null to be inserted in an attribute
declared to be not null generates an error diagnostic.

2. Unique Constraint
SQL also supports an integrity constraint:
unique (Aj1, Aj2,...,Ajm)

The unique specification says that attributes Aj1, Aj2,...,Ajm form a candidate
key; that is, no two tuples in the relation can be equal on all the listed
attributes.


However, candidate key attributes are permitted to be null unless they have
explicitly been declared to be not null.

3. The check Clause


 The clause check(P) specifies a predicate P that must be satisfied by every
tuple in a relation.
 A common use of the check clause is to ensure that attribute values
satisfy specified conditions, in effect creating a powerful type system.
 For instance, a clause check (budget > 0) in the create table command
for relation department would ensure that the value of budget is
positive. As another example, consider the following:
create table section (
    course_id varchar (8),
    sec_id varchar (8),
    semester varchar (6),
    year numeric (4,0),
    building varchar (15),
    primary key (course_id, sec_id, semester, year),
    check (semester in ('Fall', 'Winter', 'Spring', 'Summer')));
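The three constraints above (not null, unique, check) can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the building column and the sample rows are illustrative assumptions, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table department (
        dept_name varchar(20) not null,
        building  varchar(15),
        budget    numeric(12,2) not null check (budget > 0),
        unique (dept_name)
    )""")
conn.execute("insert into department values ('Physics', 'Watson', 70000)")

def rejected(sql: str) -> bool:
    """Run one statement and report whether a constraint rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Each statement violates exactly one of the declared constraints.
null_rejected = rejected("insert into department values (null, 'Taylor', 100)")
dup_rejected = rejected("insert into department values ('Physics', 'Taylor', 100)")
check_rejected = rejected("insert into department values ('Music', 'Packard', 0)")
print(null_rejected, dup_rejected, check_rejected)
```

All three inserts are refused, which is the "error diagnostic" behaviour described for not null, extended here to unique and check.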
4. Referential Integrity
 To ensure that a value that appears in one relation for a given set of
attributes also appears for a certain set of attributes in another relation.
 This condition is called referential integrity.
More generally, let r1 and r2 be relations whose sets of attributes are R1 and R2,
respectively, with primary keys K1 and K2. We say that a subset α of R2 is a
foreign key referencing K1 in relation r1 if it is required that, for every tuple t2
in r2, there must be a tuple t1 in r1 such that t1.K1 = t2.α. Requirements of
this form are called referential-integrity constraints, or subset
dependencies.
create table course (
    course_id varchar (8),
    title varchar (50),
    dept_name varchar (20),
    credits numeric (2,0) check (credits > 0),
    primary key (course_id),
    foreign key (dept_name) references department);
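A minimal sketch of the foreign key in action, assuming SQLite through Python's sqlite3 module (SQLite only checks foreign keys after "pragma foreign_keys = on"). The department table is reduced to its key and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite enforces foreign keys only when enabled
conn.execute("create table department (dept_name varchar(20) primary key)")
conn.execute("""
    create table course (
        course_id varchar(8) primary key,
        title     varchar(50),
        dept_name varchar(20),
        credits   numeric(2,0) check (credits > 0),
        foreign key (dept_name) references department (dept_name)
    )""")
conn.execute("insert into department values ('Comp. Sci.')")
# Accepted: the referenced department exists.
conn.execute("insert into course values ('CS-101', 'Intro', 'Comp. Sci.', 4)")

# Rejected: no department tuple has dept_name = 'Biology'.
try:
    conn.execute("insert into course values ('BIO-101', 'Biology', 'Biology', 4)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```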

DDL (Data Definition Language) : DDL or Data Definition Language
consists of the SQL commands that can be used to define the database schema.
It deals with descriptions of the database schema and is used to create
and modify the structure of database objects in the database.

Examples of DDL commands:

 CREATE – is used to create the database or its objects (like tables, indexes,
functions, views, stored procedures and triggers).
 DROP – is used to delete objects from the database.
 ALTER – is used to alter the structure of the database.
 TRUNCATE – is used to remove all records from a table, including all
space allocated for the records.
 COMMENT – is used to add comments to the data dictionary.
 RENAME – is used to rename an object existing in the database.

DML (Data Manipulation Language) : The SQL commands that deal with
the manipulation of data present in the database belong to DML or Data
Manipulation Language, and this includes most of the SQL statements.
Examples of DML:

 SELECT – is used to retrieve data from a database.
 INSERT – is used to insert data into a table.
 UPDATE – is used to update existing data within a table.
 DELETE – is used to delete records from a database table.

DCL (Data Control Language) : DCL includes commands such as GRANT and
REVOKE, which mainly deal with the rights, permissions and other controls of
the database system.
Examples of DCL commands:

 GRANT – gives users access privileges to the database.
 REVOKE – withdraws users' access privileges given by using the GRANT
command.

TCL (Transaction Control Language) : TCL commands deal with the
transactions within the database.
Examples of TCL commands:

 COMMIT – commits a transaction.
 ROLLBACK – rolls back a transaction if an error occurs.
 SAVEPOINT – sets a save point within a transaction.
 SET TRANSACTION – specifies characteristics for the transaction.
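The effect of COMMIT, ROLLBACK and SAVEPOINT can be sketched as follows. The example assumes SQLite through Python's sqlite3 module, opened with isolation_level=None so that the transaction statements are issued explicitly; the account table is a made-up example.

```python
import sqlite3

# isolation_level=None: the module does not open transactions on its own,
# so BEGIN / COMMIT / ROLLBACK below are exactly what the database sees.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table account (id integer primary key, balance integer)")

conn.execute("begin")
conn.execute("insert into account values (1, 100)")
conn.execute("savepoint sp1")
conn.execute("insert into account values (2, 200)")
conn.execute("rollback to sp1")   # undoes only the work done after sp1
conn.execute("commit")            # makes the first insert permanent
rows = conn.execute("select id, balance from account").fetchall()
print(rows)  # [(1, 100)]

conn.execute("begin")
conn.execute("delete from account")
conn.execute("rollback")          # the delete is undone entirely
remaining = conn.execute("select count(*) from account").fetchone()[0]
print(remaining)  # 1
```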


Underlined Column names are Primary Key Attributes

DDL (Data Definition Language)

CREATE TABLE
 Specifies a new base relation by giving it a name, and specifying each of
its attributes and their data types (INTEGER, FLOAT, DECIMAL(i,j),
CHAR(n), VARCHAR(n))
 A constraint NOT NULL may be specified on an attribute
 In SQL2, the CREATE TABLE command can be used for specifying the primary
key attributes, secondary keys, and referential integrity constraints
(foreign keys).
 Key attributes can be specified via the PRIMARY KEY and UNIQUE
phrases

CREATE TABLE DEPARTMENT
( DNAME VARCHAR(10) NOT NULL,
  DNUMBER INTEGER NOT NULL,
  MGRSSN CHAR(9),
  MGRSTARTDATE CHAR(9) );

CREATE TABLE DEPT
( DNAME VARCHAR(10) NOT NULL,
  DNUMBER INTEGER NOT NULL,
  MGRSSN CHAR(9),
  MGRSTARTDATE CHAR(9),
  PRIMARY KEY (DNUMBER),
  UNIQUE (DNAME),
  FOREIGN KEY (MGRSSN) REFERENCES EMP );
DROP TABLE
 Used to remove a relation (base table) and its definition
 The relation can no longer be used in queries, updates, or any other
commands since its description no longer exists

DROP TABLE DEPENDENT;

TRUNCATE
TRUNCATE removes all rows from a table. In some systems (Oracle, for
example) the operation cannot be rolled back and no triggers are fired.
As such, TRUNCATE is faster and doesn't use as much undo space as a DELETE.
TRUNCATE TABLE emp;

ALTER TABLE

 Used to add an attribute to one of the base relations
 The new attribute will have NULLs in all the tuples of the relation right
after the command is executed; hence, the NOT NULL constraint is not
allowed for such an attribute

ALTER TABLE EMPLOYEE ADD JOB VARCHAR (12);

The database users must still enter a value for the new attribute JOB for
each EMPLOYEE tuple. This can be done using the UPDATE command.
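The ALTER TABLE behaviour described above (NULL in the new column until an UPDATE fills it) can be observed with a short sketch. It assumes SQLite via Python's sqlite3 module; the cut-down EMPLOYEE table and the sample row are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employee (ssn char(9) primary key, lname varchar(15))")
conn.execute("insert into employee values ('123456789', 'Smith')")

# Add the new attribute; existing tuples get NULL for it.
conn.execute("alter table employee add column job varchar(12)")
before = conn.execute("select job from employee").fetchall()
print(before)  # [(None,)]  (NULL is shown as None in Python)

# Supply the value afterwards with UPDATE.
conn.execute("update employee set job = 'Engineer' where ssn = '123456789'")
after = conn.execute("select job from employee").fetchall()
print(after)   # [('Engineer',)]
```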
REFERENTIAL INTEGRITY OPTIONS

We can specify RESTRICT, CASCADE, SET NULL or SET DEFAULT on
referential integrity constraints (foreign keys)

CREATE TABLE DEPT
( DNAME VARCHAR(10) NOT NULL,
  DNUMBER INTEGER NOT NULL,
  MGRSSN CHAR(9),
  MGRSTARTDATE CHAR(9),
  PRIMARY KEY (DNUMBER),
  UNIQUE (DNAME),
  FOREIGN KEY (MGRSSN) REFERENCES EMP
      ON DELETE SET DEFAULT ON UPDATE CASCADE );

CREATE TABLE EMP
( ENAME VARCHAR(30) NOT NULL,
  ESSN CHAR(9),
  BDATE DATE,
  DNO INTEGER DEFAULT 1,
  SUPERSSN CHAR(9),
  PRIMARY KEY (ESSN),
  FOREIGN KEY (DNO) REFERENCES DEPT
      ON DELETE SET DEFAULT ON UPDATE CASCADE,
  FOREIGN KEY (SUPERSSN) REFERENCES EMP
      ON DELETE SET NULL ON UPDATE CASCADE );
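A small sketch of ON UPDATE CASCADE and ON DELETE SET NULL on the self-referencing SUPERSSN key. It assumes SQLite via Python's sqlite3 module with foreign keys switched on; the table is cut down to three columns and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")
conn.execute("""
    create table emp (
        essn     char(9) primary key,
        ename    varchar(30) not null,
        superssn char(9),
        foreign key (superssn) references emp (essn)
            on delete set null on update cascade
    )""")
conn.execute("insert into emp values ('111', 'Boss', null)")
conn.execute("insert into emp values ('222', 'Worker', '111')")

# ON UPDATE CASCADE: changing the supervisor's key changes the reference too.
conn.execute("update emp set essn = '999' where essn = '111'")
after_update = conn.execute(
    "select superssn from emp where essn = '222'").fetchone()[0]
print(after_update)  # 999

# ON DELETE SET NULL: deleting the supervisor leaves the worker with NULL.
conn.execute("delete from emp where essn = '999'")
after_delete = conn.execute(
    "select superssn from emp where essn = '222'").fetchone()[0]
print(after_delete)  # None
```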
DML(Data Manipulation Language)

 SQL has one basic statement for retrieving information from a database; the
SELECT statement
 This is not the same as the SELECT operation of the relational algebra
 Important distinction between SQL and the formal relational model; SQL
allows a table (relation) to have two or more tuples that are identical in all
their attribute values
 Hence, an SQL relation (table) is a multi-set (sometimes called a bag) of
tuples; it is not a set of tuples
 SQL relations can be constrained to be sets by specifying PRIMARY KEY or
UNIQUE attributes, or by using the DISTINCT option in a query
 Basic form of the SQL SELECT statement is called a mapping or a SELECT-
FROM-WHERE block

SELECT <attribute list>
FROM <table list>
WHERE <condition>

 <attribute list> is a list of attribute names whose values are to be retrieved
by the query
 <table list> is a list of the relation names required to process the query
 <condition> is a conditional (Boolean) expression that identifies the tuples to
be retrieved by the query


Query 0: Retrieve the birthdate and address of the employee whose
name is 'John B. Smith'.

SELECT BDATE, ADDRESS
FROM EMPLOYEE
WHERE FNAME='John' AND MINIT='B' AND LNAME='Smith'

Similar to a SELECT-PROJECT pair of relational algebra operations; the
SELECT-clause specifies the projection attributes and the WHERE-clause
specifies the selection condition. However, the result of the query may
contain duplicate tuples.

Query 1: Retrieve the name and address of all employees who work
for the 'Research' department.

SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research' AND DNUMBER=DNO

Query 2: For every project located in 'Stafford', list the project
number, the controlling department number, and the department
manager's last name, address, and birthdate.

SELECT PNUMBER, DNUM, LNAME, BDATE, ADDRESS
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN AND PLOCATION='Stafford'

In Q2, there are two join conditions:
 The join condition DNUM=DNUMBER relates a project to its controlling
department
 The join condition MGRSSN=SSN relates the controlling department to the
employee who manages that department
Query 3: For each employee, retrieve the employee's name, and the
name of his or her immediate supervisor.

SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
FROM EMPLOYEE E, EMPLOYEE S
WHERE E.SUPERSSN=S.SSN

In Q3, the alternate relation names E and S are called aliases or tuple
variables for the EMPLOYEE relation

Query 4: Retrieve the SSN values for all employees.


SELECT SSN
FROM EMPLOYEE

Query 5:
SELECT SSN, DNAME
FROM EMPLOYEE, DEPARTMENT
If more than one relation is specified in the FROM-clause and there is no join
condition, then the CARTESIAN PRODUCT of tuples is selected

To retrieve all the attribute values of the selected tuples, a * is used,
which stands for all the attributes.
Examples:

SELECT *
FROM EMPLOYEE
WHERE DNO=5

SELECT *
FROM EMPLOYEE, DEPARTMENT
WHERE DNAME='Research' AND DNO=DNUMBER
USE OF DISTINCT
 SQL does not treat a relation as a set; duplicate tuples can appear
 To eliminate duplicate tuples in a query result, the keyword DISTINCT is
used
 For example, the result of Q6 may have duplicate SALARY values whereas
Q6A does not have any duplicate values
Q6: SELECT SALARY
FROM EMPLOYEE
Q6A: SELECT DISTINCT SALARY
FROM EMPLOYEE
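The difference between the two queries can be seen on a three-row table. This is a sketch assuming SQLite via Python's sqlite3 module; the salary figures are invented, and ORDER BY is added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employee (ssn char(9) primary key, salary integer)")
conn.executemany("insert into employee values (?, ?)",
                 [("1", 30000), ("2", 40000), ("3", 30000)])

# Without DISTINCT the result is a multiset: 30000 appears twice.
q6 = conn.execute("select salary from employee order by salary").fetchall()
# With DISTINCT duplicate tuples are removed from the result.
q6a = conn.execute(
    "select distinct salary from employee order by salary").fetchall()
print(q6)   # [(30000,), (30000,), (40000,)]
print(q6a)  # [(30000,), (40000,)]
```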

SET OPERATIONS
 SQL has directly incorporated some set operations
 There is a union operation (UNION), and in some versions of SQL
there are set difference (MINUS, written EXCEPT in the SQL standard)
and intersection (INTERSECT) operations
 The resulting relations of these set operations are sets of tuples;
duplicate tuples are eliminated from the result
 The set operations apply only to union compatible relations ; the two
relations must have the same attributes and the attributes must appear
in the same order
Query 7: Make a list of all project numbers for projects that involve an
employee whose last name is 'Smith' as a worker or as a manager of the
department that controls the project.

(SELECT PNUMBER
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNUM=DNUMBER AND MGRSSN=SSN AND LNAME='Smith')

UNION

(SELECT PNUMBER
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE PNUMBER=PNO AND ESSN=SSN AND LNAME='Smith')
NESTING OF QUERIES
 A complete SELECT query, called a nested query , can be specified
within the WHERE-clause of another query, called the outer query

Query 8: Retrieve the name and address of all employees who work for
the 'Research' department.

SELECT FNAME, LNAME, ADDRESS
FROM EMPLOYEE
WHERE DNO IN (SELECT DNUMBER FROM DEPARTMENT
WHERE DNAME='Research' )

 The nested query selects the number of the 'Research' department
 The outer query selects an EMPLOYEE tuple if its DNO value is in the
result of the nested query
 The comparison operator IN compares a value v with a set (or multi-
set) of values V, and evaluates to TRUE if v is one of the elements in V

CORRELATED NESTED QUERIES


 If a condition in the WHERE-clause of a nested query references an
attribute of a relation declared in the outer query, the two queries are
said to be correlated
 The result of a correlated nested query is different for each tuple (or
combination of tuples) of the relation(s) in the outer query

Query 9: Retrieve the name of each employee who has a dependent with
the same first name as the employee.

Q9: SELECT E.FNAME, E.LNAME
FROM EMPLOYEE AS E
WHERE E.SSN IN
(SELECT ESSN FROM DEPENDENT
WHERE ESSN=E.SSN AND
E.FNAME=DEPENDENT_NAME)
 In Q9, the nested query has a different result for each tuple in the
outer query
 A query written with nested SELECT... FROM... WHERE... blocks and
using the = or IN comparison operators can always be expressed as a
single block query. For example, Q9 may be written as in Q9A

Q9A:
SELECT E.FNAME, E.LNAME
FROM EMPLOYEE E, DEPENDENT D
WHERE E.SSN=D.ESSN AND E.FNAME=D.DEPENDENT_NAME
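The correlated form (Q9) and the single-block form (Q9A) can be compared on a toy instance. The sketch assumes SQLite via Python's sqlite3 module; the tables are reduced to the attributes the queries use and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table employee "
             "(ssn char(9) primary key, fname varchar(15), lname varchar(15))")
conn.execute("create table dependent (essn char(9), dependent_name varchar(15))")
conn.executemany("insert into employee values (?, ?, ?)",
                 [("1", "John", "Smith"), ("2", "Alicia", "Zelaya")])
conn.executemany("insert into dependent values (?, ?)",
                 [("1", "John"), ("1", "Alice"), ("2", "Bob")])

# Q9: correlated nested query, re-evaluated for each EMPLOYEE tuple e.
q9 = conn.execute("""
    select e.fname, e.lname from employee as e
    where e.ssn in (select essn from dependent
                    where essn = e.ssn and e.fname = dependent_name)
    """).fetchall()

# Q9A: the same request as a single SELECT-FROM-WHERE block.
q9a = conn.execute("""
    select e.fname, e.lname from employee e, dependent d
    where e.ssn = d.essn and e.fname = d.dependent_name
    """).fetchall()
print(q9, q9a)
```

On this instance both queries return the single employee John Smith. If an employee had several dependents with the matching name, the single-block form would list that employee once per match unless DISTINCT is added.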

Relational database systems are expected to be equipped with a query language
that can assist its users to query the database instances. There are two kinds of
query languages − relational algebra and relational calculus.

RELATIONAL ALGEBRA

Relational algebra is a procedural query language, which takes instances of
relations as input and yields instances of relations as output. It uses operators to
perform queries. An operator can be either unary or binary. They accept
relations as their input and yield relations as their output. Relational algebra is
performed recursively on a relation and intermediate results are also considered
relations.
The fundamental operations of relational algebra are as follows −

 Select
 Project
 Union
 Set difference
 Cartesian product
 Rename


We will discuss all these operations in the following sections.

Select Operation (σ)

It selects tuples that satisfy the given predicate from a relation.

Notation − σp(r)
Where σ denotes the selection operation, p is the selection predicate and r
stands for the relation. p is a propositional logic formula which may use
connectors like and, or, and not. The terms of p may use relational operators
like − =, ≠, ≥, <, >, ≤.
For example –
σmarks >= 80(DBMS_marks)
select * from DBMS_marks where marks >= 80;

σsubject = "database"(Books)
Output − Selects tuples from books where subject is 'database'.
σsubject = "database" and price = 450(Books)
Output − Selects tuples from books where subject is 'database' and 'price' is
450.
σsubject = "database" and price = 450 or year > 2010(Books)
Output − Selects tuples from books where subject is 'database' and 'price' is 450
or those books published after 2010.
For example, to select the EMPLOYEE tuples whose department is
4, or those whose salary is greater than $30,000, we can individually specify each
of these two conditions with a SELECT operation as follows:
σDno = 4(EMPLOYEE)
σSalary > 30000(EMPLOYEE)
Conditions can be combined with AND and OR:
σ(Dno = 4 AND Salary > 25000) (EMPLOYEE)
SQL Query:
SELECT * FROM EMPLOYEE WHERE Dno=4 AND Salary>25000;
σ(Dno = 4 AND Salary > 25000) OR (Dno = 5 AND Salary > 30000) (EMPLOYEE)

Project Operation (∏)

It keeps only the specified column(s) of a relation; the remaining columns are
discarded.

Notation − ∏A1, A2, ..., An (r)
Where A1, A2, ..., An are attribute names of relation r.
Duplicate rows are automatically eliminated, as a relation is a set.
For example −
∏subject, author (Books)
Selects and projects columns named as subject and author from the relation
Books.

∏Lname,Fname,Salary (EMPLOYEE)
Selects and projects columns named as Lname,Fname and Salary from the
relation EMPLOYEE.

If the attribute list includes only non-key attributes of R, duplicate tuples are
likely to occur. The PROJECT operation removes any duplicate tuples, so the
result of the PROJECT operation is a set of distinct tuples, and hence a valid
relation. This is known as duplicate elimination. For example, consider the
following PROJECT operation:

∏Sex, Salary (EMPLOYEE)


SQL Query:
SELECT DISTINCT Sex, Salary FROM EMPLOYEE;
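The SELECT and PROJECT operations can also be mimicked outside a DBMS. The Python sketch below represents a relation as a list of dictionaries; the function names select and project and the three sample tuples are assumptions made for illustration only.

```python
EMPLOYEE = [
    {"Fname": "John",   "Sex": "M", "Salary": 30000, "Dno": 5},
    {"Fname": "Ramesh", "Sex": "M", "Salary": 30000, "Dno": 5},
    {"Fname": "Alicia", "Sex": "F", "Salary": 25000, "Dno": 4},
]

def select(relation, predicate):
    """Sigma: keep the tuples for which predicate(t) is true."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Pi: keep only the listed attributes; the set drops duplicate tuples."""
    return {tuple(t[a] for a in attributes) for t in relation}

dept5 = select(EMPLOYEE, lambda t: t["Dno"] == 5)
sex_salary = project(EMPLOYEE, ["Sex", "Salary"])
print(len(dept5))          # 2
print(sorted(sex_salary))  # [('F', 25000), ('M', 30000)]
```

Two employees share the (Sex, Salary) combination ('M', 30000), so the projection has two tuples rather than three, which is the duplicate elimination described above.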

Rename Operation

The operation results do not have any names. In general, for most queries, we
need to apply several relational algebra operations one after the other. Either
we can write the operations as a single relational algebra expression by nesting
the operations, or we can apply one operation at a time and create intermediate
result relations. In the latter case, we must give names to the relations that
hold the intermediate results. For example, to retrieve the first name, last
name, and salary of all employees who work in department number 5, we must
apply a SELECT and a PROJECT operation. We can write a single relational
algebra expression, also known as an in-line expression, as follows:
∏Fname, Lname, Salary(σDno =5 (EMPLOYEE))
Alternatively:
DEP5_EMPS ← σDno =5 (EMPLOYEE)
RESULT ← ∏Fname, Lname, Salary( DEP5_EMPS )

Relational Algebra Operations from Set theory


Union, Intersection and Minus Operations

We can define the three operations UNION, INTERSECTION, and SET
DIFFERENCE on two union-compatible relations R and S as follows:

■ UNION: The result of this operation, denoted by R ∪ S, is a relation that
includes all tuples that are either in R or in S or in both R and S. Duplicate
tuples are eliminated.

■ INTERSECTION: The result of this operation, denoted by R ∩ S, is a relation
that includes all tuples that are in both R and S.

■ SET DIFFERENCE (or MINUS): The result of this operation, denoted by
R – S, is a relation that includes all tuples that are in R but not in S.
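Because relations are sets of tuples, the three operations map directly onto set operators. The sketch below uses Python sets of one-attribute tuples; the contents of R and S are invented for illustration.

```python
# Two union-compatible relations with a single attribute each.
R = {("Korth",), ("Navathe",), ("Date",)}
S = {("Navathe",), ("Codd",)}

union = R | S          # tuples in R, in S, or in both (no duplicates)
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

print(sorted(union))         # [('Codd',), ('Date',), ('Korth',), ('Navathe',)]
print(sorted(intersection))  # [('Navathe',)]
print(sorted(difference))    # [('Date',), ('Korth',)]
```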


The Cartesian Product (Cross Product or Cross Join) Operation (X)

In general, the result of R(A1, A2, ..., An) × S(B1, B2, ..., Bm) is a relation Q
of degree n + m with attributes Q(A1, A2, ..., An, B1, B2, ..., Bm), in that order.
The resulting relation Q has one tuple for each combination of tuples, one from R
and one from S. Hence, if R has nR tuples (denoted as |R| = nR), and S has nS
tuples, then R × S will have nR * nS tuples.

Union Operation (∪)

It performs binary union between two given relations and is defined as −
r ∪ s = { t | t ∈ r or t ∈ s}
Notation − r ∪ s
Where r and s are either database relations or relation result sets (temporary
relations).
For a union operation to be valid, the following conditions must hold −
 r and s must have the same number of attributes.
 Attribute domains must be compatible.
Duplicate tuples are automatically eliminated from the result.
∏ author (Books) ∪ ∏ author (Articles)
Output − Projects the names of the authors who have either written a book or an
article or both.

Set Difference (−)

The result of set difference operation is tuples, which are present in one relation
but are not in the second relation.
Notation − r − s
Finds all the tuples that are present in r but not in s.
∏ author (Books) − ∏ author (Articles)
Output − Provides the name of authors who have written books but not articles.

Cartesian Product (Χ)

Combines information of two different relations into one.


Notation − r Χ s
Where r and s are relations and their output will be defined as −
r Χ s = { q t | q ∈ r and t ∈ s}
σauthor = 'tutorialspoint'(Books Χ Articles)
Output − Yields a relation, which shows all the books and articles written by
tutorialspoint.

Rename Operation (ρ)

The results of relational algebra are also relations but without any name. The
rename operation allows us to rename the output relation. 'rename' operation is
denoted with small Greek letter rho ρ.
Notation − ρ x (E)
Where the result of expression E is saved with name of x.
Additional operations are −

 Set intersection
 Assignment
 Natural join

Example: Retrieve a list of names of each female employee’s dependents.

FEMALE_EMPS ← σSex ='F' (EMPLOYEE)
EMPNAMES ← ∏Fname, Lname, Ssn( FEMALE_EMPS )
EMP_DEPENDENTS ← EMPNAMES X DEPENDENT
ACTUAL_DEPENDENTS ← σSsn =Essn (EMP_DEPENDENTS)
RESULT ← ∏Fname, Lname, Dependent_name(ACTUAL_DEPENDENTS)


The Join Operation (⋈)

Used to combine related tuples from two relations into single “longer”
tuples.
The general form is:
R ⋈ <join condition> S
In JOIN, only combinations of tuples satisfying the join condition appear in the
result, whereas in the CARTESIAN PRODUCT all combinations of tuples are
included in the result.
Example: Retrieve the name of the manager of each department.
To get the manager’s name, we need to combine each department tuple with the
employee tuple whose Ssn value matches the Mgr_ssn value in the department
tuple. We do this by using the JOIN operation and then projecting the result over
the necessary attributes, as follows:

DEPT_MGR ← DEPARTMENT ⋈ Mgr_ssn=Ssn EMPLOYEE
RESULT ← ∏Dname, Lname, Fname( DEPT_MGR )

Note that Mgr_ssn is a foreign key of the DEPARTMENT relation that references
Ssn, the primary key of the EMPLOYEE relation. This referential integrity
constraint plays a role in having matching tuples in the referenced relation
EMPLOYEE.
The JOIN operation can be specified as a CARTESIAN PRODUCT operation
followed by a SELECT operation. However, JOIN is very important because it is
used very frequently when specifying database queries. Consider the earlier
example illustrating CARTESIAN PRODUCT, which included the following
sequence of operations:
EMP_DEPENDENTS ← EMPNAMES X DEPENDENT
ACTUAL_DEPENDENTS ← σSsn =Essn (EMP_DEPENDENTS)

These two operations can be replaced with a single JOIN operation as follows:

ACTUAL_DEPENDENTS ← EMPNAMES ⋈ Ssn =Essn DEPENDENT
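The equivalence "JOIN = CARTESIAN PRODUCT followed by SELECT" can be sketched in a few lines of Python. Relations are lists of plain tuples, the helper names cartesian_product and theta_join are hypothetical, and the sample rows are invented.

```python
from itertools import product

# EMPNAMES(Fname, Lname, Ssn) and DEPENDENT(Essn, Dependent_name)
EMPNAMES = [("Alicia", "Zelaya", "999"), ("Jennifer", "Wallace", "987")]
DEPENDENT = [("987", "Abner"), ("123", "Alice")]

def cartesian_product(r, s):
    """Every tuple of r concatenated with every tuple of s."""
    return [t1 + t2 for t1, t2 in product(r, s)]

def theta_join(r, s, condition):
    """Join: the product restricted to the combinations satisfying condition."""
    return [t for t in cartesian_product(r, s) if condition(t)]

emp_dependents = cartesian_product(EMPNAMES, DEPENDENT)
# Position 2 holds Ssn and position 3 holds Essn in the combined tuple.
actual_dependents = theta_join(EMPNAMES, DEPENDENT, lambda t: t[2] == t[3])
print(len(emp_dependents))  # 4
print(actual_dependents)    # [('Jennifer', 'Wallace', '987', '987', 'Abner')]
```

The product has 2 * 2 = 4 tuples, while the join keeps only the one combination whose Ssn and Essn values match.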

Example:
student(sid, sname,age,addr,deptno)
dept(dno,dname,head,noofemployees)
course(cid,title,sid)

1. Find the names of students who study in the CSE department

select sname from student, dept where dname = 'CSE' and student.deptno =
dept.dno

t1 ← ∏ sname, deptno (student)

t2 ← ∏ dno, dname (σ dname = 'CSE' (dept))

t3 ← t1 ⋈ (t1.deptno = t2.dno) t2

result ← ∏ sname (t3)

Query 1. Retrieve the name and address of all employees who work for the
‘Research’ department.

Query 2. For every project located in ‘Stafford’, list the project number, the
controlling department number, and the department manager’s last name,
address, and birth date.

Variations of Join: The EQUIJOIN and NATURAL JOIN


The most common use of JOIN involves join conditions with equality comparisons
only. Such a JOIN, where the only comparison operator used is =, is called an
EQUIJOIN.
Both previous examples were EQUIJOINs.
Notice that in the result of an EQUIJOIN we always have one or more pairs of
attributes that have identical values in every tuple. For example, in Figure 6.6,
the values of the attributes Mgr_ssn and Ssn are identical in every tuple of
DEPT_MGR (the EQUIJOIN result) because the equality join condition specified
on these two attributes requires the values to be identical in every tuple in the
result.

Because one of each pair of attributes with identical values is superfluous, a
new operation called NATURAL JOIN (denoted by *) was created to get rid of
the second (superfluous) attribute in an EQUIJOIN condition. The standard
definition of NATURAL JOIN requires that the two join attributes (or each pair of
join attributes) have the same name in both relations. If this is not the case, a
renaming operation is applied first.

Suppose we want to combine each PROJECT tuple with the DEPARTMENT
tuple that controls the project. In the following example, first we rename the
Dnumber attribute of DEPARTMENT to Dnum, so that it has the same name as
the Dnum attribute in PROJECT, and then we apply NATURAL JOIN


The general form is:

Q ← R * (<list1>),(<list2>) S

<list1> - specifies a list of i attributes from R
<list2> - specifies a list of i attributes from S
Division Operation

Division Operator (÷): Division operator A÷B can be applied if and only if:
 The attributes of B are a proper subset of the attributes of A.
 The relation returned by the division operator has the attributes (all
attributes of A – all attributes of B).
 The relation returned by the division operator contains those tuples from
relation A which are associated with every tuple of B.
Consider the relation STUDENT_SPORTS and ALL_SPORTS given in Table 2 and
Table 3 above.
To apply division operator as
STUDENT_SPORTS÷ ALL_SPORTS
 The operation is valid as the attributes of ALL_SPORTS are a proper subset of
the attributes of STUDENT_SPORTS.
 The attributes of the resulting relation are {ROLL_NO, SPORTS} –
{SPORTS} = ROLL_NO
 The tuples in the resulting relation are those ROLL_NO values which are
associated with all of B’s tuples {Badminton, Cricket}. ROLL_NO 1 and 4 are
associated with Badminton only. ROLL_NO 2 is associated with all tuples of B.
So the resulting relation will be:

ROLL_NO
2
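The division can be sketched in Python with sets of tuples. The tables referred to above are not reproduced in these notes, so the contents of STUDENT_SPORTS and ALL_SPORTS below are assumptions chosen to agree with the explanation (ROLL_NO 1 and 4 play Badminton only, ROLL_NO 2 plays both sports); the function name divide is hypothetical.

```python
# STUDENT_SPORTS(ROLL_NO, SPORTS) and ALL_SPORTS(SPORTS): assumed contents.
STUDENT_SPORTS = {(1, "Badminton"), (2, "Badminton"),
                  (2, "Cricket"), (4, "Badminton")}
ALL_SPORTS = {("Badminton",), ("Cricket",)}

def divide(a, b):
    """A / B for A(x, y) and B(y): the x values paired with every y in B."""
    required = {y for (y,) in b}
    candidates = {x for (x, _) in a}
    return {(x,) for x in candidates
            if required <= {y for (x2, y) in a if x2 == x}}

result = divide(STUDENT_SPORTS, ALL_SPORTS)
print(result)  # {(2,)}
```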

The DIVISION operation, denoted by ÷, is useful for a special kind of query that
sometimes occurs in database applications.
An example is Retrieve the names of employees who work on all the projects that
‘John Smith’ works on. To express this query using the DIVISION operation,
proceed as follows. First, retrieve the list of project numbers that ‘John Smith’
works on in the intermediate relation SMITH_PNOS:


Query 3. Find the names of employees who work on all the projects
controlled by department number 5.

[Figure: relations DEPT5_PROJS(Pno), RESULT_EMP_SSNS(Ssn) and
RESULT(Lname, Fname); RESULT_EMP_SSNS is obtained by dividing a relation
over (Ssn, Pno) by a relation over (Pno).]


In this query, we first create a table DEPT5_PROJS that contains the project
numbers of all projects controlled by department 5. Then we create a table
EMP_PROJ that holds (Ssn, Pno) tuples, and apply the division operation. Notice
that we renamed the attributes so that they will be correctly used in the division
operation. Finally, we join the result of the division, which holds only Ssn values,
with the EMPLOYEE table to retrieve the desired attributes from EMPLOYEE.

Query 4. Make a list of project numbers for projects that involve an
employee whose last name is ‘Smith’, either as a worker or as a manager
of the department that controls the project.

In this query, we retrieved the project numbers for projects that involve an
employee named Smith as a worker in SMITH_WORKER_PROJS.
SMITHS(Essn) ← ∏Ssn(σLname = ‘Smith’ (EMPLOYEE))
SMITH_WORKER_PROJS ← ∏Pno(WORKS_ON * SMITHS)
Then we retrieved the project numbers for projects that involve an employee
named Smith as manager of the department that controls the project in
SMITH_MGR_PROJS.
MGRS ← ∏Lname,Dnumber (EMPLOYEE ⋈ Ssn = Mgr_ssn DEPARTMENT)
SMITH_MANAGED_DEPTS(Dnum) ← ∏Dnumber(σLname =‘Smith’ (MGRS))
SMITH_MGR_PROJS(Pno) ← ∏Pnumber(SMITH_MANAGED_DEPTS * PROJECT)
Finally, we applied the UNION operation on SMITH_WORKER_PROJS and
SMITH_MGR_PROJS.
RESULT ← (SMITH_WORKER_PROJS ∪ SMITH_MGR_PROJS)
As a single in-line expression, this query becomes,

Aggregate Functions and Grouping


Common functions applied to collections of numeric values include SUM,
AVERAGE, MAXIMUM, and MINIMUM. The COUNT function is used for counting
tuples or values.

Another common type of request involves grouping the tuples in a relation by the
value of some of their attributes and then applying an aggregate function
independently to each group.
An example would be to group EMPLOYEE tuples by Dno, so that each group
includes the tuples for employees working in the same department. We can then
list each Dno value along with, say, the average salary of employees within the
department, or the number of employees who work in the department.

We can define an AGGREGATE FUNCTION operation, using the symbol ℑ
(pronounced script F), to specify these types of requests as follows:

<grouping attributes> ℑ <function list> (R)

For example, to retrieve each department number, the number of employees in
the department, and their average salary, while renaming the resulting attributes
as indicated below, we write:
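In SQL, this kind of grouped aggregate corresponds to GROUP BY. The following sketch assumes SQLite via Python's sqlite3 module; the column aliases no_of_employees and average_sal and the three sample rows are assumptions for illustration, not taken from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table employee (ssn char(9) primary key, salary integer, dno integer)")
conn.executemany("insert into employee values (?, ?, ?)",
                 [("1", 30000, 5), ("2", 40000, 5), ("3", 25000, 4)])

# Group the tuples by dno, then apply COUNT and AVG to each group separately.
rows = conn.execute("""
    select dno, count(*) as no_of_employees, avg(salary) as average_sal
    from employee
    group by dno
    order by dno
    """).fetchall()
print(rows)  # [(4, 1, 25000.0), (5, 2, 35000.0)]
```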

Query 5. List the names of all employees with two or more dependents.
We have to use the AGGREGATE FUNCTION operation with the COUNT
aggregate function. We assume that dependents of the same employee have
distinct Dependent_name values.

Query 6. Retrieve the names of employees who have no dependents.
This is an example of the type of query that uses the MINUS (SET DIFFERENCE)
operation.
We first retrieve a relation with all employee Ssns in ALL_EMPS. Then we create
a table with the Ssns of employees who have at least one dependent in
EMPS_WITH_DEPS. Then we apply the SET DIFFERENCE operation to retrieve
employees Ssns with no dependents in EMPS_WITHOUT_DEPS, and finally join
this with EMPLOYEE to retrieve the desired attributes.


As a single in-line expression, this query becomes,

Query 7. List the names of managers who have at least one dependent.
In this query, we retrieve the Ssns of managers in MGRS, and the Ssns of
employees with at least one dependent in EMPS_WITH_DEPS, then we apply the
SET INTERSECTION operation to get the Ssns of managers who have at least one
dependent.

Constraints on Relational database model


When designing a relational database, we can restrict which values may be
inserted into a relation and which modifications and deletions are allowed.
These restrictions are the constraints we impose on the relational database.

Constraints in the databases can be categorized into 3 main categories:


1. Constraints that are applied in the data model itself are called implicit
constraints.
2. Constraints that are directly applied in the schemas of the data model, by
specifying them in the DDL (Data Definition Language), are called
schema-based constraints or explicit constraints.
3. Constraints that cannot be directly applied in the schemas of the data model
are called application-based or semantic constraints.
So here we will deal with Implicit constraints.
Mainly Constraints on the relational database are of 4 types:
1. Domain constraints
2. Key constraints
3. Entity Integrity constraints
4. Referential integrity constraints
Let us discuss each of the above constraints in detail.
1. Domain constraints :
1. Every domain must contain atomic values (smallest indivisible units), which
means composite and multi-valued attributes are not allowed.
2. We perform a datatype check here, which means that when we assign a data
type to a column we limit the values that it can contain. E.g., if we assign the
datatype of attribute age as int, we can’t give it values other than int values.

Explanation:
In the above relation, Name is a composite attribute and Phone is a multi-valued
attribute, so the relation violates the domain constraint.

2. Key Constraints or Uniqueness Constraints:

1. These are called uniqueness constraints because they ensure that every tuple
in the relation is unique.
2. A relation can have multiple candidate keys (minimal superkeys), one of which
is chosen as the primary key. There is no restriction on which candidate key
is chosen, but the candidate key with the fewest attributes is usually
preferred.
3. Null values are not allowed in the primary key, hence the Not Null constraint
is also a part of the key constraint.

Explanation:
In the above table, EID is the primary key, and the first and the last tuples
have the same EID value, 01, so the table violates the key constraint.
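
A minimal sketch of the same violation, again using Python's sqlite3 module as a stand-in for the DBMS. The table employee and its rows are hypothetical; the point is only that a second tuple with an existing primary key value is refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (eid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha')")
conn.execute("INSERT INTO employee VALUES (2, 'Ravi')")

try:
    # Reuses eid = 1, so the key (uniqueness) constraint is violated.
    conn.execute("INSERT INTO employee VALUES (1, 'Mala')")
    duplicate_rejected = False
except sqlite3.IntegrityError as exc:
    duplicate_rejected = True
    print("rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The relation still holds two tuples afterwards, because the offending insert is discarded.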

3. Entity Integrity Constraints:

1. The entity integrity constraint says that no primary key value can be NULL,
because the primary key is used to identify each tuple in a relation uniquely.

Explanation:
In the above relation, EID is the primary key, and a primary key can't take
NULL values. In the third tuple the primary key is NULL, so the relation
violates the entity integrity constraint.
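
The rule can be sketched with Python's sqlite3 module. The table and data are hypothetical. One assumption needs stating: SQLite, unlike most systems, tolerates NULL in a primary key column that is not an INTEGER PRIMARY KEY unless NOT NULL is declared, so NOT NULL is written explicitly here. In Oracle, PRIMARY KEY alone implies it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NOT NULL is spelled out because of the SQLite behaviour noted above.
conn.execute("CREATE TABLE employee (eid TEXT NOT NULL PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee VALUES ('E01', 'Asha')")

try:
    # A tuple without a primary key value cannot be identified, so it is refused.
    conn.execute("INSERT INTO employee VALUES (NULL, 'Ravi')")
    null_key_rejected = False
except sqlite3.IntegrityError:
    null_key_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```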

4. Referential Integrity Constraints:

1. A referential integrity constraint is specified between two relations
(tables) and is used to maintain consistency among the tuples of the two
relations.
2. This constraint is enforced through a foreign key. When the attributes in the
foreign key of relation R1 have the same domain(s) as the primary key of
relation R2, the foreign key of R1 is said to reference (refer to) the
primary key of relation R2.
3. The value of the foreign key in a tuple of relation R1 must either equal the
primary key value of some tuple in relation R2 or be NULL; no other value is
allowed.
Example:

create table emp(eid integer, name varchar2(20), dno integer primary key);
create table dplace(dno integer, place varchar2(20),
    foreign key(dno) references emp(dno));
Explanation:
In the above, DNO of the first relation is the foreign key, and DNO of the second
relation is the primary key. DNO = 22 is not allowed in the foreign key of the
first table because DNO = 22 is not defined in the primary key of the second
relation. Therefore the referential integrity constraint is violated here.
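
The DNO = 22 case can be reproduced with a short sketch in Python's sqlite3 module. The tables dept and emp and their rows are assumptions chosen to mirror the explanation (the referencing table comes first in the explanation, the referenced table second). SQLite checks foreign keys only after PRAGMA foreign_keys = ON, which is a SQLite detail and not part of the relational model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# dept is the referenced relation (primary key dno); emp holds the foreign key.
conn.execute("CREATE TABLE dept (dno INTEGER PRIMARY KEY, place TEXT)")
conn.execute("""
    CREATE TABLE emp (
        eid  INTEGER PRIMARY KEY,
        name TEXT,
        dno  INTEGER REFERENCES dept(dno)
    )
""")
conn.executemany("INSERT INTO dept VALUES (?, ?)", [(11, "Chennai"), (14, "Mumbai")])

conn.execute("INSERT INTO emp VALUES (1, 'Asha', 11)")    # matches an existing dno
conn.execute("INSERT INTO emp VALUES (2, 'Ravi', NULL)")  # NULL is permitted

try:
    conn.execute("INSERT INTO emp VALUES (3, 'Mala', 22)")  # no dept has dno = 22
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

emp_count = conn.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
```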

File System vs DBMS – Difference between File System and DBMS

Super Key Vs Candidate Key

BASIS FOR COMPARISON | SUPER KEY | CANDIDATE KEY
Basic | A single attribute or a set of attributes that uniquely identifies every tuple in a relation is a super key. | A super key that has no proper subset which is also a super key (a minimal super key) is a candidate key.
One in other | Not every super key is a candidate key. | All candidate keys are super keys.
Selection | The set of super keys forms the base for selection of candidate keys. | The set of candidate keys forms the base for selection of a single primary key.
Count | There are comparatively more super keys in a relation. | There are comparatively fewer candidate keys in a relation.
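
The difference between the two kinds of key can be sketched in a few lines of Python. The relation instance below (attributes eid, email, name) is hypothetical. Strictly, a key is a property of the schema, which must hold for every legal instance, so a check against one sample instance like this can only show which attribute sets are unique for this particular data.

```python
from itertools import combinations

# Hypothetical relation instance used only for illustration.
attributes = ("eid", "email", "name")
rows = [
    (1, "asha@x.org", "Asha"),
    (2, "ravi@x.org", "Ravi"),
    (3, "asha2@x.org", "Asha"),
]

def is_unique(attr_subset):
    """True if no two rows agree on all attributes in attr_subset."""
    idx = [attributes.index(a) for a in attr_subset]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)

# Super keys: every non-empty attribute set that identifies each row uniquely.
super_keys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_unique(combo)
]

# Candidate keys: super keys with no proper subset that is also a super key.
candidate_keys = [k for k in super_keys if not any(other < k for other in super_keys)]

print(len(super_keys), candidate_keys)
```

For this data there are six super keys but only two candidate keys, eid and email, which matches the "Count" row of the comparison.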

SQL QUERIES
Query 1. Retrieve the name and address of all employees who work for the ‘Research’
department.
Q1: SELECT Fname, Lname, Address
    FROM EMPLOYEE, DEPARTMENT
    WHERE Dname='Research' AND Dnumber=Dno;

Q1A: SELECT Fname, Lname, Address
     FROM (EMPLOYEE JOIN DEPARTMENT ON Dno=Dnumber)
     WHERE Dname='Research';

Q1B: SELECT Fname, Lname, Address
     FROM (EMPLOYEE NATURAL JOIN
          (DEPARTMENT AS DEPT (Dname, Dno, Mssn, Msdate)))
     WHERE Dname='Research';
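
As a quick check that the comma-style join of Q1 and the explicit JOIN ... ON of Q1A return the same tuples, here is a sketch using Python's sqlite3 module. The table definitions are cut down to the columns the query needs and the rows are sample data, both assumptions for illustration. Q1B is left out because SQLite does not accept the column-renaming alias it uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INTEGER PRIMARY KEY);
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Address TEXT, Dno INTEGER);
INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
INSERT INTO EMPLOYEE VALUES
    ('John', 'Smith', 'Houston', 5),
    ('Alicia', 'Zelaya', 'Spring', 4),
    ('Joyce', 'English', 'Houston', 5);
""")

# Q1: join condition written in the WHERE clause.
q1 = """SELECT Fname, Lname, Address
        FROM EMPLOYEE, DEPARTMENT
        WHERE Dname = 'Research' AND Dnumber = Dno"""

# Q1A: the same join condition moved into JOIN ... ON.
q1a = """SELECT Fname, Lname, Address
         FROM EMPLOYEE JOIN DEPARTMENT ON Dno = Dnumber
         WHERE Dname = 'Research'"""

rows_q1 = sorted(conn.execute(q1).fetchall())
rows_q1a = sorted(conn.execute(q1a).fetchall())
print(rows_q1)
```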
Query 2. For every project located in ‘Stafford’, list the project number, the controlling
department number, and the department manager’s last name, address, and birth date.
Q2: SELECT Pnumber, Dnum, Lname, Address, Bdate
    FROM PROJECT, DEPARTMENT, EMPLOYEE
    WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND
          Plocation='Stafford';

Q2A: SELECT Pnumber, Dnum, Lname, Address, Bdate
     FROM ((PROJECT JOIN DEPARTMENT ON Dnum=Dnumber)
          JOIN EMPLOYEE ON Mgr_ssn=Ssn)
     WHERE Plocation='Stafford';
Query 3. Find the names of employees who work on all the projects controlled by
department number 5.
SELECT E.EmpName
FROM Emp E
WHERE NOT EXISTS (
SELECT P.ProjId
FROM Project P
WHERE P.ControlledByDeptNo = 5
AND NOT EXISTS (
SELECT W.EmpId
FROM WorksOn W
WHERE W.ProjId = P.ProjId AND W.EmpId = E.EmpId
)
);
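
The double NOT EXISTS in Query 3 reads as "there is no project of department 5 that this employee does not work on". A small sketch in Python's sqlite3 module makes this concrete. The table and column names follow the query above, while the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Emp (EmpId INTEGER PRIMARY KEY, EmpName TEXT);
CREATE TABLE Project (ProjId INTEGER PRIMARY KEY, ControlledByDeptNo INTEGER);
CREATE TABLE WorksOn (EmpId INTEGER, ProjId INTEGER);
INSERT INTO Emp VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Mala');
-- Department 5 controls projects 10 and 20; project 30 belongs to department 4.
INSERT INTO Project VALUES (10, 5), (20, 5), (30, 4);
INSERT INTO WorksOn VALUES (1, 10), (1, 20), (2, 10), (2, 30), (3, 10), (3, 20), (3, 30);
""")

query3 = """
SELECT E.EmpName
FROM Emp E
WHERE NOT EXISTS (
    SELECT P.ProjId
    FROM Project P
    WHERE P.ControlledByDeptNo = 5
      AND NOT EXISTS (
          SELECT W.EmpId
          FROM WorksOn W
          WHERE W.ProjId = P.ProjId AND W.EmpId = E.EmpId
      )
)
"""
names = sorted(row[0] for row in conn.execute(query3))
print(names)
```

Ravi is excluded because he misses project 20. Note that if department 5 controlled no projects at all, the outer NOT EXISTS would hold for every employee, so the query would return all names.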

Query 4. Make a list of all project numbers for projects that involve an employee whose
last name is ‘Smith’, either as a worker or as a manager of the department that controls
the project.
Q4A: (SELECT DISTINCT Pnumber
      FROM PROJECT, DEPARTMENT, EMPLOYEE
      WHERE Dnum=Dnumber AND Mgr_ssn=Ssn
      AND Lname='Smith')
     UNION
     (SELECT DISTINCT Pnumber
      FROM PROJECT, WORKS_ON, EMPLOYEE
      WHERE Pnumber=Pno AND Essn=Ssn
      AND Lname='Smith');
Query 5. List the names of all employees with two or more dependents.
SELECT E.EmpName
FROM Emp E
JOIN Dependent D ON E.EmpId = D.EmpId
GROUP BY E.EmpId, E.EmpName
HAVING COUNT(D.DependentId) >= 2;
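
A short sketch of Query 5 in Python's sqlite3 module, with invented rows, shows how GROUP BY forms one group per employee and HAVING then filters the groups. The table and column names are those used in the query; the data is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Emp (EmpId INTEGER PRIMARY KEY, EmpName TEXT);
CREATE TABLE Dependent (DependentId INTEGER PRIMARY KEY, EmpId INTEGER);
INSERT INTO Emp VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Mala');
-- Asha has two dependents, Ravi has one, Mala has none.
INSERT INTO Dependent VALUES (101, 1), (102, 1), (103, 2);
""")

query5 = """
SELECT E.EmpName
FROM Emp E
JOIN Dependent D ON E.EmpId = D.EmpId
GROUP BY E.EmpId, E.EmpName
HAVING COUNT(D.DependentId) >= 2
"""
names = [row[0] for row in conn.execute(query5)]
print(names)
```

Mala never reaches the HAVING step, because the inner join already drops employees with no matching dependent row.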
Query 6. Retrieve the names of employees who have no dependents.
Q6: SELECT Fname, Lname
FROM EMPLOYEE
WHERE NOT EXISTS ( SELECT *
FROM DEPENDENT
WHERE Ssn=Essn );
Query 7. List the names of managers who have at least one dependent.
Q7: SELECT Fname, Lname
FROM EMPLOYEE
WHERE EXISTS ( SELECT *
FROM DEPENDENT
WHERE Ssn=Essn )
AND
EXISTS ( SELECT *
FROM DEPARTMENT
WHERE Ssn=Mgr_ssn );
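
Queries 6 and 7 differ only in how the correlated subqueries are combined. The sketch below runs both in Python's sqlite3 module on a reduced COMPANY schema. Only the columns the queries touch are kept, and the rows are sample data, both assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn INTEGER PRIMARY KEY);
CREATE TABLE DEPENDENT (Essn INTEGER, Dependent_name TEXT);
CREATE TABLE DEPARTMENT (Dname TEXT, Mgr_ssn INTEGER);
INSERT INTO EMPLOYEE VALUES ('John', 'Smith', 1), ('Franklin', 'Wong', 2),
                            ('Jennifer', 'Wallace', 3);
-- Smith and Wong have a dependent; Wong and Wallace manage a department.
INSERT INTO DEPENDENT VALUES (1, 'Alice'), (2, 'Joy');
INSERT INTO DEPARTMENT VALUES ('Research', 2), ('Administration', 3);
""")

# Q6: employees for whom no DEPENDENT row exists.
q6 = """SELECT Fname, Lname FROM EMPLOYEE
        WHERE NOT EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)"""

# Q7: employees who have a dependent AND manage some department.
q7 = """SELECT Fname, Lname FROM EMPLOYEE
        WHERE EXISTS (SELECT * FROM DEPENDENT WHERE Ssn = Essn)
          AND EXISTS (SELECT * FROM DEPARTMENT WHERE Ssn = Mgr_ssn)"""

no_dependents = conn.execute(q6).fetchall()
managers_with_dependents = conn.execute(q7).fetchall()
print(no_dependents, managers_with_dependents)
```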
Query 8. For each employee, retrieve the employee’s first and last name and the first
and last name of his or her immediate supervisor.
Q8: SELECT E.Fname, E.Lname, S.Fname, S.Lname
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.Super_ssn=S.Ssn;
Query 18. Retrieve the names of all employees who do not have supervisors.
Q18: SELECT Fname, Lname
FROM EMPLOYEE
WHERE Super_ssn IS NULL;
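
Q8 and Q18 complement each other: the self-join in Q8 silently omits employees whose Super_ssn is NULL, and Q18 retrieves exactly those. A sketch in Python's sqlite3 module, with a reduced EMPLOYEE table and sample rows as assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Fname TEXT, Lname TEXT, Ssn INTEGER PRIMARY KEY, Super_ssn INTEGER);
INSERT INTO EMPLOYEE VALUES
    ('James', 'Borg', 1, NULL),
    ('Franklin', 'Wong', 2, 1),
    ('John', 'Smith', 3, 2);
""")

# Q8: E plays the role of the employee, S the role of the supervisor.
q8 = """SELECT E.Fname, E.Lname, S.Fname, S.Lname
        FROM EMPLOYEE AS E, EMPLOYEE AS S
        WHERE E.Super_ssn = S.Ssn
        ORDER BY E.Ssn"""

# Q18: NULL must be tested with IS NULL, never with '='.
q18 = "SELECT Fname, Lname FROM EMPLOYEE WHERE Super_ssn IS NULL"

pairs = conn.execute(q8).fetchall()
unsupervised = conn.execute(q18).fetchall()
print(pairs, unsupervised)
```

The ORDER BY is added here only to make the output order predictable. Borg does not appear as an employee in the Q8 result because a comparison with NULL is never true.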
Query 19. Find the sum of the salaries of all employees, the maximum salary, the
minimum salary, and the average salary.
Q19: SELECT SUM (Salary), MAX (Salary), MIN (Salary), AVG (Salary)
FROM EMPLOYEE;
Queries 21 and 22. Retrieve the total number of employees in the company (Q21) and
the number of employees in the ‘Research’ department (Q22).
Q21: SELECT COUNT (*)
     FROM EMPLOYEE;
Q22: SELECT COUNT (*)
     FROM EMPLOYEE, DEPARTMENT
     WHERE DNO=DNUMBER AND DNAME='Research';
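
The aggregate queries Q19, Q21 and Q22 can be tried on a few sample rows with Python's sqlite3 module. The tables are reduced to the columns used, and the salary figures are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (Dname TEXT, Dnumber INTEGER PRIMARY KEY);
CREATE TABLE EMPLOYEE (Ssn INTEGER PRIMARY KEY, Salary INTEGER, Dno INTEGER);
INSERT INTO DEPARTMENT VALUES ('Research', 5), ('Administration', 4);
INSERT INTO EMPLOYEE VALUES (1, 30000, 5), (2, 40000, 5), (3, 20000, 4);
""")

# Q19: four aggregates computed over the whole EMPLOYEE relation.
total, highest, lowest, average = conn.execute(
    "SELECT SUM(Salary), MAX(Salary), MIN(Salary), AVG(Salary) FROM EMPLOYEE"
).fetchone()

# Q21: COUNT(*) counts tuples.
employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]

# Q22: the join and selection are applied first, then the count.
research_count = conn.execute(
    """SELECT COUNT(*) FROM EMPLOYEE, DEPARTMENT
       WHERE Dno = Dnumber AND Dname = 'Research'"""
).fetchone()[0]

print(total, highest, lowest, average, employee_count, research_count)
```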

ASSIGNMENT 1:

1) Consider the following relational schemas for the employee database. CO1
C5
Employee(ename,city,street)
Works(ename,companyname,salary)
Company(companyname,city)
Manages(ename,mname)
Write both the SQL and relational algebra query for the following:
1. Find the names of all employees who work for the ‘fbc’.
2. Find the names, street addresses and cities of all employees who work for the ‘fbc’ and
earn more than 200000 per annum.
3. Find the names of all employees in this database who live in the same city as the
company for which they work.
4. Find the names of all employees who earn more than every employee of sbc.
5. Find the names of all employees who do not work for fbc.
6. Find the names of all employees who do not have a manager.
7. Find the names of all employees who do have a manager.
8. Find the companyname which has got the highest total salary.
9. Find the names of all employees whose first letter is ‘a’.
10. Delete the record of employee ‘john’.
11. Display the details of all employees in a sorted order.
12. Find the names of all employees who are living in Chennai.
13. Find the names of all employees who are not living in Bangalore and Hyderabad.
14. Find the names of all employees where fifth letter is ‘y’.
15. Change the name of the companyname column to cname.

Solutions for Assignment 1:


1. Find the names of all employees who work for the ‘fbc’.

SQL :

SQL> select ename from works where companyname='fbc';

ENAME
--------------------
ajith
siva

Relational Algebra:

∏ename(σcompanyname=’fbc’(works))
2. Find the names, street addresses and cities of all employees who work for the ‘fbc’ and
earn more than 200000 per annum.

SQL :

SQL> select ename,street,city from employee where ename in (select ename from works
where companyname='fbc' and salary>200000);

ENAME STREET CITY


-------------------- -------------------- --------------------
ajith kamarajar chennai
siva kamarajar chennai

or

SQL> select works.ename,street,city from employee inner join works on
employee.ename=works.ename where companyname='fbc' and salary>200000;

ENAME STREET CITY


-------------------- -------------------- --------------------
ajith kamarajar chennai
siva kamarajar chennai

Relational Algebra:

t1 ← ∏ename(σ(companyname=’fbc’) and (salary > 200000)(works))

t2 ← ∏ename,city,street(t1 * Employee)

3. Find the names of all employees in this database who live in the same city as the
company for which they work.

SQL :

SQL> select employee.ename from employee,works,company where
employee.ename=works.ename and employee.city=company.city and
works.companyname=company.companyname;

ENAME
--------------------
ajith
john
siva
vijay

or

SQL> select employee.ename from employee inner join works on
employee.ename=works.ename inner join company on employee.city=company.city and
works.companyname=company.companyname;

ENAME
--------------------
ajith
john
siva
vijay

Relational Algebra:

t1 ← ∏ename,city(Employee)

t2 ← ∏ename,companyname(Works)

t3 ← ∏companyname,city(Company)

t4 ← t1 ⋈ t1.ename = t2.ename t2

t5 ← t4 ⋈ (t2.companyname = t3.companyname) and (t1.city = t3.city) t3

Result ← ∏t1.ename(t5)
4. Find the names of all employees who earn more than every employee of sbc.

SQL :

SQL> select ename from works where salary>(select max(salary) from works where
companyname='sbc');

ENAME
--------------------
ajith
vijay
siva

Relational Algebra:

5. Find the names of all employees who do not work for fbc.

SQL :
SQL> select ename from works where companyname not in ('fbc');

ENAME
--------------------
john
vijay

or

SQL> select ename from works where companyname!='fbc';

ENAME
--------------------
john
vijay

Relational Algebra:

∏ename(σcompanyname != ’fbc’(works))
6. Find the names of all employees who do not have manager.

SQL :

SQL> select ename from manages where mname is null;

ENAME
--------------------
john
siva

Relational Algebra:

∏ename(σmname IS NULL(manages))

7. Find the names of all employees who do have manager.

SQL :
SQL> select ename from manages where mname is not null;

ENAME
--------------------
ajith
vijay

Relational Algebra:

∏ename(σmname IS NOT NULL(manages))
8. Identify all possible primary keys and foreign keys.
SQL :

Relational Algebra:

9. Create a view to store the companyname and the number of employees in each
company and do all possible manipulations on the view. If not, justify your answer.
SQL :
SQL> create view v1(cname,no_of_employees) as (select companyname,count(ename)
from works group by companyname);

View created.

SQL> select * from v1;

CNAME NO_OF_EMPLOYEES
-------------------- ---------------
bbc 1
fbc 2
sbc 1

10. Find the companyname which has got the highest total salary.
SQL :

SQL> select companyname from (select companyname,sum(salary) from works group by
companyname having sum(salary)=(select max(sum(salary)) from works group by
companyname));

COMPANYNAME
--------------------
fbc
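
The nested aggregate max(sum(salary)) in the answer above runs in Oracle but may not be accepted by other systems. The sketch below shows one portable rewording, run through Python's sqlite3 module: the per-company totals are computed in a derived table (given the hypothetical alias t) and the maximum is taken over that. The rows are the contents of the works table as printed elsewhere in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE works (ename TEXT, companyname TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO works VALUES (?, ?, ?)",
    [("ajith", "fbc", 300000), ("john", "sbc", 150000),
     ("vijay", "bbc", 201000), ("siva", "fbc", 400000)],
)

# Inner query: total salary per company. Middle query: the largest total.
# Outer query: the company (or companies, in case of a tie) with that total.
portable_query = """
SELECT companyname
FROM works
GROUP BY companyname
HAVING SUM(salary) = (
    SELECT MAX(total)
    FROM (SELECT SUM(salary) AS total FROM works GROUP BY companyname) t
)
"""
top_companies = [row[0] for row in conn.execute(portable_query)]
print(top_companies)
```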

Relational Algebra:

11. Find the names of all employees whose first letter is ‘a’.
SQL :
SQL> select ename from employee where ename like 'a%';

ENAME
--------------------
ajith

Relational Algebra:

∏ename(σename LIKE ’a%’(employee))
12. Delete the record of employee ‘john’.
SQL :
SQL> delete from employee where ename='john';

1 row deleted.

SQL> select * from employee;

ENAME CITY STREET


-------------------- -------------------- --------------------
ajith chennai kamarajar
vijay hyderabad rajiv
siva chennai kamarajar

13. Display the details of all employees in a sorted order.


SQL :
SQL> select * from employee order by ename;

ENAME CITY STREET


-------------------- -------------------- --------------------
ajith chennai kamarajar
siva chennai kamarajar
vijay hyderabad rajiv

Relational Algebra:

14. Find the names of all employees who are living in Chennai.
SQL :
SQL> select ename from employee where city = 'chennai';

ENAME
--------------------
ajith
siva

Relational Algebra:

∏ename(σcity = ’chennai’(employee))
15. Find the names of all employees who are not living in Bangalore and Hyderabad.
SQL :

SQL> select ename from employee where city not in ('bangalore','hyderabad');

ENAME
--------------------
ajith
siva

Relational Algebra:

∏ename(σ(city != ’bangalore’) and (city != ’hyderabad’)(employee))

16. Update the salary of ‘bbc’ employees by Rs.1000.


SQL :
SQL> update works set salary=salary+1000 where companyname='bbc';

1 row updated.

SQL> select * from works;

ENAME COMPANYNAME SALARY


-------------------- -------------------- ---------
ajith fbc 300000
john sbc 150000
vijay bbc 201000
siva fbc 400000

17. Add a new column manager_no onto the manages relation.


SQL :
SQL> alter table manages add(manager_no number);

Table altered.

SQL> select * from manages;

ENAME MNAME MANAGER_NO


-------------------- -------------------- ----------
ajith smith
john
vijay henry
siva

18. Set a NOT NULL constraint for the salary column.


SQL :
SQL> create table works (ename varchar2(20),companyname varchar2(20),salary
number constraint con6 not null);

19. Find the names of all employees where fifth letter is ‘y’.
SQL :
SQL> select ename from employee where ename like '____y%';

ENAME
--------------------
vijay

Relational Algebra:

∏ename(σename LIKE ’____y%’(employee))
20. Find the names of all employees whose salary is in the range 100000 to 200000.
SQL :
SQL> select ename from works where salary between 100000 and 200000;

ENAME
--------------------
john

Relational Algebra:

∏ename(σ(salary >= 100000) and (salary <= 200000)(works))


21. Create a view to store the employee name and the companyname and find out the
type of view.
SQL :
SQL> create view v2 as (select ename,companyname from works);

View created.

SQL> select * from v2;

ENAME COMPANYNAME
-------------------- --------------------
ajith fbc
john sbc
vijay bbc
siva fbc

22. Change the name of the companyname column to cname.

SQL :

SQL> alter table works rename column companyname to cname;

Table altered.

SQL> select * from works;

ENAME CNAME SALARY


-------------------- -------------------- ---------
ajith fbc 300000
john sbc 150000
vijay bbc 201000
siva fbc 400000
