
Cb3401-Unit 1

The document provides an overview of database management systems (DBMS) and their key components. It discusses the different types of database languages including data definition language (DDL) for defining schemas, data manipulation language (DML) for accessing and manipulating data, data control language (DCL) for managing user privileges, and transaction control language (TCL) for transaction management. It also describes relational databases and their data models, focusing on the relational model with tables, rows, and columns and how relationships can be established between tables.

IV- SEM/II B.E. CSE (CS) Prepared By: R. Reshma/AP/CSE

CB3401: DATABASE MANAGEMENT SYSTEMS AND SECURITY


LECTURE NOTES
UNIT I RELATIONAL DATABASES 9
Data Models – Relational Data Models – Relational Algebra – Structured Query Language – Entity-Relationship Model –
Mapping ER Models to Relations – Distributed Databases – Data Fragmentation – Replication

INTRODUCTION
A Database Management System (DBMS) is a software system that is designed to manage and organize data in a structured
manner. It allows users to create, modify, and query a database, as well as manage the security and access controls for that
database.

DBMS provides an environment to store and retrieve data conveniently and efficiently.
Key Features of DBMS
• Data modeling: A DBMS provides tools for creating and modifying data models, which define the structure and
relationships of the data in a database.
• Data storage and retrieval: A DBMS is responsible for storing and retrieving data from the database, and can
provide various methods for searching and querying the data.
• Concurrency control: A DBMS provides mechanisms for controlling concurrent access to the database, to ensure
that multiple users can access the data without conflicting with each other.
• Data integrity and security: A DBMS provides tools for enforcing data integrity and security constraints, such as
constraints on the values of data and access controls that restrict who can access the data.
• Backup and recovery: A DBMS provides mechanisms for backing up and recovering the data in the event of a
system failure.
DBMS can be classified into two types: Relational Database Management System (RDBMS) and Non-Relational
Database Management System (NoSQL or Non-SQL).
• RDBMS: Data is organized in the form of tables and each table has a set of rows and columns. The data are related
to each other through primary and foreign keys.
• NoSQL: Data is organized as key-value pairs, documents, graphs, or column-based stores. These systems are designed
to handle large-scale, high-performance scenarios.
A database is a collection of interrelated data that helps in the efficient retrieval, insertion, and deletion of data from the
database and organizes the data in the form of tables, views, schemas, reports, etc. For Example, a university database
organizes the data about students, faculty, admin staff, etc., which helps in the efficient retrieval, insertion, and deletion of
data from it.
Database Languages in DBMS
o A DBMS has appropriate languages and interfaces to express database queries and updates.
o Database languages can be used to read, store and update the data in the database.

Types of Database Languages

Department of Computer Science and Engineering Dhanalakshmi Srinivasan College of Engineering



1. Data Definition Language (DDL)

o DDL stands for Data Definition Language. It is used to define database structure or pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number of tables and schemas, their
names, indexes, columns in each table, constraints, etc.

Here are some tasks that come under DDL:

o Create: It is used to create objects in the database.


o Alter: It is used to alter the structure of the database.
o Drop: It is used to delete objects from the database.
o Truncate: It is used to remove all records from a table.
o Rename: It is used to rename an object.
o Comment: It is used to comment on the data dictionary.

These commands are used to define and update the database schema, which is why they come under Data definition language.
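The DDL tasks listed above can be tried out with a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical student table; neither comes from the notes. SQLite has no TRUNCATE statement, so that command is not shown.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# CREATE: define a new table (hypothetical schema).
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

# ALTER: change the structure by adding a column.
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")

# RENAME: give the table a new name.
conn.execute("ALTER TABLE student RENAME TO learner")

# The schema (metadata) can itself be inspected.
columns = [row[1] for row in conn.execute("PRAGMA table_info(learner)")]
print(columns)

# DROP: remove the table definition together with its data.
conn.execute("DROP TABLE learner")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(tables)
```

None of these statements touches individual rows; they only change the skeleton of the database.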

2. Data Manipulation Language (DML)

DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a database. It handles user
requests.

Here are some tasks that come under DML:

o Select: It is used to retrieve data from a database.


o Insert: It is used to insert data into a table.
o Update: It is used to update existing data within a table.
o Delete: It is used to delete records from a table (all records, or only those that match a condition).
o Merge: It performs UPSERT operation, i.e., insert or update operations.
o Call: It is used to call a PL/SQL or a Java subprogram.
o Explain Plan: It shows the execution plan that the database would use for a statement.
o Lock Table: It controls concurrency.
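The four most common DML tasks can be sketched as follows, again assuming Python's sqlite3 module and a hypothetical student table with made-up rows. Merge, Call, Explain Plan and Lock Table are left out because their syntax differs between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)

# INSERT: add rows to the table.
conn.executemany(
    "INSERT INTO student (roll_no, name, age) VALUES (?, ?, ?)",
    [(1, "RAM", 18), (2, "RAMESH", 18), (3, "SUJIT", 20)],
)

# UPDATE: change the existing rows that match the WHERE condition.
conn.execute("UPDATE student SET age = 19 WHERE roll_no = 2")

# DELETE: remove only the rows that match the WHERE condition.
conn.execute("DELETE FROM student WHERE roll_no = 3")

# SELECT: read the data back.
rows = conn.execute(
    "SELECT roll_no, name, age FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```

A DELETE without a WHERE clause would remove every row, which is why the condition matters.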

3. Data Control Language (DCL)

o DCL stands for Data Control Language. It is used to control access to the data stored in the database by granting
and revoking user privileges.
o Whether DCL statements are transactional (that is, whether they can be rolled back) depends on the database system.

(In Oracle database, the execution of data control language cannot be rolled back.)

Here are some tasks that come under DCL:

o Grant: It is used to give user access privileges to a database.


o Revoke: It is used to take back permissions from the user.

The following privileges can be granted to, or revoked from, a user:

CONNECT, INSERT, USAGE, EXECUTE, DELETE, UPDATE and SELECT.

4. Transaction Control Language (TCL)

TCL is used to manage the changes made by DML statements. DML statements can be grouped into a logical transaction,
which TCL then commits or rolls back.

Here are some tasks that come under TCL:

o Commit: It is used to save the transaction on the database.


o Rollback: It is used to restore the database to its state as of the last Commit.
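A minimal sketch of Commit and Rollback, assuming Python's sqlite3 module and a hypothetical account table. The connection is opened with isolation_level=None so that the BEGIN, COMMIT and ROLLBACK statements are issued explicitly rather than by the driver.

```python
import sqlite3

# isolation_level=None: the driver does not open transactions on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")

# First transaction: the insert is made permanent by COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("COMMIT")

# Second transaction: the update is undone by ROLLBACK.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK")

balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)
```

The balance read at the end is the committed value, because the update never became permanent.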

ACID Properties in DBMS

A DBMS must keep data consistent whenever changes are made to it, because a loss of integrity can corrupt the data as a
whole. To maintain the integrity of the data, a DBMS guarantees four properties for every transaction, which are known as
the ACID properties: Atomicity, Consistency, Isolation, and Durability. A transaction is a group of tasks treated as a single
unit of work, and the ACID properties describe how that unit must behave.

Relational Databases
A relational database is a collection of information that organizes data in predefined relationships where data is stored in
one or more tables (or "relations") of columns and rows, making it easy to see and understand how different data structures
relate to each other. Relationships are a logical connection between different tables, established based on interaction among
these tables.
A relational database (RDB) is a way of structuring information in tables, rows, and columns. An RDB can establish links—
or relationships–between information by joining tables, which makes it easy to understand and gain insights about the
relationship between various data points.
Data Models
Data Model is the modeling of the data description, data semantics, and consistency constraints of the data. It provides the
conceptual tools for describing the design of a database at each level of data abstraction. Therefore, there are following four
data models used for understanding the structure of the database:


1) Relational Data Model: This type of model designs the data in the form of rows and columns within a table. Thus, a
relational model uses tables for representing data and in-between relationships. Tables are also called relations. This model
was initially described by Edgar F. Codd, in 1969. The relational data model is the widely used model which is primarily
used by commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is the logical representation of data as objects and relationships among
them. These objects are known as entities, and relationship is an association among these entities. This model was designed
by Peter Chen and published in a 1976 paper. It is widely used in database design. A set of attributes describes the
entities. For example, student_name, student_id describes the 'student' entity. A set of the same type of entities is known as
an 'Entity set', and the set of the same type of relationships is known as 'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of functions, encapsulation, and object identity,
as well. This model supports a rich type system that includes structured and collection types. Thus, in 1980s, various
database systems following the object-oriented approach were developed. Here, the objects are nothing but the data carrying
its properties.

4) Semistructured Data Model: This type of data model is different from the other three data models (explained above).
The semistructured data model allows the data specifications at places where the individual data items of the same type may
have different attributes sets. The Extensible Markup Language, also known as XML, is widely used for representing the
semistructured data. Although XML was initially designed for including the markup information to the text document, it
gains importance because of its application in the exchange of data.

The relational database model


Developed by E.F. Codd from IBM in the 1970s, the relational database model allows any table to be related to another
table using a common attribute. Instead of using hierarchical structures to organize data, Codd proposed a shift to using a
data model where data is stored, accessed, and related in tables without reorganizing the tables that contain them. Think of
the relational database as a collection of spreadsheet files that help businesses organize, manage, and relate data. In the
relational database model, each “spreadsheet” is a table that stores information, represented as columns (attributes) and rows
(records or tuples).

Attributes (columns) specify a data type, and each record (or row) contains the value of that specific data type. All tables in
a relational database have an attribute known as the primary key, which is a unique identifier of a row, and each row can
be used to create a relationship between different tables using a foreign key—a reference to a primary key of another
existing table.

How the relational database model works in practice: Say you have a Customer table and an Order table.


The Customer table contains data about the customer:


• Customer ID (primary key)

• Customer name

• Billing address

• Shipping address

In the Customer table, the customer ID is a primary key that uniquely identifies who the customer is in the relational
database. No other customer would have the same Customer ID.
The Order table contains transactional information about an order:
• Order ID (primary key)

• Customer ID (foreign key)

• Order date

• Shipping date

• Order status

Here, the primary key to identify a specific order is the Order ID. You can connect a customer with an order by using a
foreign key to link the customer ID from the Customer table. The two tables are now related based on the shared customer
ID, which means you can query both tables to create formal reports or use the data for other applications. For instance, a
retail branch manager could generate a report about all customers who made a purchase on a specific date or figure out
which customers had orders that had a delayed delivery date in the last month.
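The Customer and Order example can be sketched in code. The sketch assumes Python's sqlite3 module; the column names, the sample rows and the table name orders (used because ORDER is a reserved word in SQL) are illustrative assumptions, not part of the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),  -- foreign key
        order_date  TEXT
    );
    INSERT INTO customer VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO orders VALUES (101, 1, '2024-01-05'),
                              (102, 2, '2024-01-05'),
                              (103, 1, '2024-01-09');
""")

# Customers who placed an order on a given date, found via the shared customer_id.
report = conn.execute(
    """
    SELECT c.name, o.order_id
    FROM customer AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.order_date = ?
    ORDER BY o.order_id
    """,
    ("2024-01-05",),
).fetchall()
print(report)
```

The query names the two tables and the matching condition only; how the rows are located is left to the database.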

The above explanation is meant to be simple. But relational databases also excel at showing very complex relationships
between data, allowing you to reference data in more tables as long as the data conforms to the predefined relational schema
of your database. As the data is organized as pre-defined relationships, you can query the data declaratively. A declarative
query is a way to define what you want to extract from the system without expressing how the system should compute the
result. This is at the heart of a relational system as opposed to other systems.


Examples of relational databases


Now that you understand how relational databases work, you can begin to learn about the many relational database
management systems that use the relational database model. A relational database management system (RDBMS) is a
program used to create, update, and manage relational databases. Some of the most well-known RDBMSs include MySQL,
PostgreSQL, MariaDB, Microsoft SQL Server, and Oracle Database.

Cloud-based relational databases like Cloud SQL, Cloud Spanner , and AlloyDB have become increasingly popular as they
offer managed services for database maintenance, patching, capacity management, provisioning and infrastructure support.

Benefits of relational databases


The main benefit of the relational database model is that it provides an intuitive way to represent data and allows easy
access to related data points. As a result, relational databases are most commonly used by organizations that need to
manage large amounts of structured data, from tracking inventory to processing transactional data to application logging.
There are many other advantages to using relational databases to manage and store your data, including:

• Flexibility
It’s easy to add, update, or delete tables, relationships, and make other changes to data whenever you need without
changing the overall database structure or impacting existing applications.
• ACID compliance
Relational databases support the ACID (Atomicity, Consistency, Isolation, Durability) properties to ensure data validity
regardless of errors, failures, or other potential mishaps.
• Ease of use
It’s easy to run complex queries using SQL, which enables even non-technical users to learn how to interact with the
database.
• Collaboration
Multiple people can operate and access data simultaneously. Built-in locking prevents simultaneous access to data when
it’s being updated.
• Built-in security
Role-based security ensures data access is limited to specific users.
• Database normalization
Relational databases employ a design technique known as normalization that reduces data redundancy and improves data
integrity.

Relational vs. non-relational databases


The main difference between relational and non-relational databases (NoSQL databases) is how data is stored and organized.
Non-relational databases do not store data in a rule-based, tabular way. Instead, they store data in other forms, such as
key-value pairs, documents, or graphs, and can be used for complex, unstructured data types, such as documents or rich media files.

Unlike relational databases, NoSQL databases follow a flexible data model, making them ideal for storing data that changes
frequently or for applications that handle diverse types of data.

Relational Model in DBMS


In DBMS, the relational model refers to an abstract model that we use to manage and organise the data that gets stored in a
database. Thus, it stores information in inter-related two-dimensional tables, also called relations, in which every row
represents some entity while every column represents the entity’s properties.


The relational model represents DB in the form of a collection of various relations. This relation refers to a table of various
values. And every row present in the table happens to denote some real-world entities or relationships. The names of tables
and columns help us interpret the meaning of the values present in every row of the table. This data gets represented in the
form of a set of various relations. Thus, in the relational model, basically, this data is stored in the form of tables. However,
this data’s physical storage is independent of its logical organization.
Popular Relational Database Management Systems:

• IBM – DB2 and Informix Dynamic Server


• Oracle – Oracle and RDB
• Microsoft – SQL Server and Access

Properties of a Relational Model


The relational databases consist of the following properties:

• Every row is unique


• All of the values present in a column hold the same data type
• Values are atomic
• The columns sequence is not significant
• The rows sequence is not significant
• The name of every column is unique

Illustration of the Relational Model


A relational model represents how we can store data in Relational Databases. Here, a relational database stores information
in the form of relations or tables.
Now, let us consider a relation EMPLOYEE with attributes ID_NO, NAME, ADDRESS, ROLL_NO, and AGE shown in
this table:
EMPLOYEE

ID_NO   NAME      ADDRESS   ROLL_NO   AGE
C1      RIYA      DELHI     15        20
C2      SUNITA    GURGAON   16        22
C3      ASHWANI   ROHTAK    12        18
C4      PREETI    DELHI               25

Important Terminologies
Here are some Relational Model concepts in DBMS:

• Attribute: It refers to every column present in a table. The attributes refer to the properties that help us define a
relation. E.g., Employee_ID, Student_Rollno, SECTION, NAME, etc.
• Tuple – It is a single row of a table that consists of a single record. The relation above consists of four tuples, one
of which is like:


C1 RIYA DELHI 15 20

• Tables – In the relational model, every relation is stored in table format. A table has two components: columns
and rows. Rows represent records, while columns represent attributes.
• Degree: It refers to the total number of attributes that are there in the relation. The EMPLOYEE relation defined
here has degree 5.
• Relation Schema: It represents the relation’s name along with its attributes. E.g., EMPLOYEE (ID_NO, NAME,
ADDRESS, ROLL_NO, AGE) is the relation schema for EMPLOYEE. If a schema has more than 1 relation, then
it is known as Relational Schema.
• Column: It represents the set of values for a certain attribute. The column ID_NO is extracted from the relation
EMPLOYEE.
• Cardinality: It refers to the total number of rows present in the given table. The EMPLOYEE relation defined here
has cardinality 4.
• Relation instance – It refers to a finite set of tuples present in the RDBMS system. A relation instance never has
duplicate tuples.
• Attribute domain – Every attribute has a predefined set of permitted values, which is known as the attribute domain.
• Relation key – A set of one or more attributes that uniquely identifies each row is known as a relation key.
• NULL Values: The value that is NOT known or the value that is unavailable is known as a NULL value. This null
value is represented by the blank spaces. E.g., the ROLL_NO of the EMPLOYEE having ID_NO C4 is NULL.
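Several of these terms can be checked mechanically. The sketch below models the EMPLOYEE relation as a Python set of tuples; that representation is an assumption made for illustration, and None stands in for the missing ROLL_NO of C4.

```python
# Relation schema: EMPLOYEE(ID_NO, NAME, ADDRESS, ROLL_NO, AGE)
attributes = ("ID_NO", "NAME", "ADDRESS", "ROLL_NO", "AGE")

# Relation instance: a set, so duplicate tuples cannot occur.
employee = {
    ("C1", "RIYA", "DELHI", 15, 20),
    ("C2", "SUNITA", "GURGAON", 16, 22),
    ("C3", "ASHWANI", "ROHTAK", 12, 18),
    ("C4", "PREETI", "DELHI", None, 25),  # None models the NULL value
}

degree = len(attributes)     # number of attributes
cardinality = len(employee)  # number of tuples

# Column: the set of values of one attribute, here ID_NO.
id_no_column = sorted(row[attributes.index("ID_NO")] for row in employee)

print(degree, cardinality, id_no_column)
```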

Constraints in Relational Model


While we design a Relational Model, we have to define some conditions that must hold for the data present in a database.
These are known as constraints. One has to check these constraints before performing any operation (like insertion, updating
and deletion) in the database. If there occurs any kind of a violation in any of the constraints, the operation will ultimately
fail.

Domain Constraints
The domain constraints are attribute-level constraints: an attribute can only take values that lie inside its domain.
For example, if a constraint AGE>0 is applied on the EMPLOYEE relation, inserting some negative
value of AGE will result in failure.
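In SQL, a domain constraint of this kind is commonly written as a CHECK clause. The sketch below assumes Python's sqlite3 module and a hypothetical rule that AGE must be greater than 0 on a cut-down EMPLOYEE table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " id_no TEXT PRIMARY KEY,"
    " name  TEXT,"
    " age   INTEGER CHECK (age > 0)"  # assumed domain rule
    ")"
)

conn.execute("INSERT INTO employee VALUES ('C1', 'RIYA', 20)")  # inside the domain

try:
    conn.execute("INSERT INTO employee VALUES ('C2', 'SUNITA', -5)")  # outside it
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected, row_count)
```

The failing insert leaves the table unchanged, so only the valid row is counted.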

Key Integrity
Every relation in the database should have at least one set of attributes that uniquely identifies a tuple. Such
sets of attributes are known as keys. For example, ID_NO in EMPLOYEE is a key, because no two employees
can have the same ID number. Thus, a key has these two properties:

• It has to be unique for all the available tuples.
• It cannot contain any NULL values.
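Both key properties can be observed with a primary key declaration. The sketch assumes Python's sqlite3 module and a cut-down EMPLOYEE table. NOT NULL is written out explicitly because SQLite, unlike most systems, otherwise accepts NULL in a non-integer primary key column. The helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id_no TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO employee VALUES ('C1', 'RIYA')")


def try_insert(id_no, name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (id_no, name))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert("C1", "SUNITA")  # same key as an existing row
null_accepted = try_insert(None, "PREETI")       # NULL key
new_accepted = try_insert("C2", "SUNITA")        # fresh, non-NULL key

print(duplicate_accepted, null_accepted, new_accepted)
```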

Referential Integrity
Whenever one of the attributes of a relation is capable of only taking values from another attribute of the same relation or
other relations, it is termed referential integrity.
Now, let us have the following two relations:
LEARNER


ID_NO   NAME      ADDRESS   ROLL_NO   AGE   CODE_OF_BRANCH
C1      RIYA      DELHI     15        20    CS
C2      SUNITA    GURGAON   16        22    CS
C3      ASHWANI   ROHTAK    12        18    ECE
C4      PREETI    DELHI     18        25    IT

SUBJECT

SUBJECT_NAME                                 SUBJECT_CODE
COMPUTER SCIENCE                             CS
INFORMATION TECHNOLOGY                       IT
ELECTRONICS AND COMMUNICATION ENGINEERING    ECE
CIVIL ENGINEERING                            CV

The CODE_OF_BRANCH of LEARNER can only take the values that are present in the SUBJECT_CODE of SUBJECT,
which is known as referential integrity constraint. Thus, the relation that refers to the other relation is known as the
REFERENCING RELATION (LEARNER in this case), while the relation to which other relations refer is known as the
REFERENCED RELATION (SUBJECT in this case).
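The LEARNER and SUBJECT example can be sketched with a foreign key. The sketch assumes Python's sqlite3 module, keeps only a few of the columns, and invents the rejected row (branch code 'ME'). SQLite checks foreign keys only after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

# Referenced relation.
conn.execute(
    "CREATE TABLE subject (subject_code TEXT PRIMARY KEY, subject_name TEXT)"
)
# Referencing relation: code_of_branch must match an existing subject_code.
conn.execute(
    "CREATE TABLE learner ("
    " id_no TEXT PRIMARY KEY,"
    " name TEXT,"
    " code_of_branch TEXT REFERENCES subject(subject_code)"
    ")"
)

conn.executemany(
    "INSERT INTO subject VALUES (?, ?)",
    [("CS", "COMPUTER SCIENCE"), ("IT", "INFORMATION TECHNOLOGY")],
)
conn.execute("INSERT INTO learner VALUES ('C1', 'RIYA', 'CS')")  # 'CS' exists

try:
    conn.execute("INSERT INTO learner VALUES ('C9', 'NOBODY', 'ME')")  # no such code
    violation_caught = False
except sqlite3.IntegrityError:
    violation_caught = True

learner_count = conn.execute("SELECT COUNT(*) FROM learner").fetchone()[0]
print(violation_caught, learner_count)
```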

Relational Algebra
Relational Algebra is a procedural query language. Relational algebra mainly provides a theoretical foundation for relational
databases and SQL. The main purpose of using Relational Algebra is to define operators that transform one or more input
relations into an output relation. Given that these operators accept relations as input and produce relations as output, they
can be combined and used to express potentially complex queries that transform potentially many input relations (whose
data are stored in the database) into a single output relation (the query results). As it is pure mathematics, there is no use of
English Keywords in Relational Algebra and operators are represented using symbols.

Fundamental Operators
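These are the basic operators used in Relational Algebra. (Strictly speaking, set intersection can be derived from set
difference, since A ∩ B = A - (A - B), but it is listed here together with the other set operators.)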
1. Selection(σ)
2. Projection(π)
3. Union(U)
4. Set Difference(-)
5. Set Intersection(∩)
6. Rename(ρ)
7. Cartesian Product(X)


1. Selection(σ): It is used to select required tuples of the relations.

Example:
A B C
1 2 4
2 2 3
3 2 3
4 3 4

For the above relation R, σ(C>3)R will select the tuples which have C more than 3.

A B C
1 2 4
4 3 4
Note: The selection operator returns whole tuples (all columns) that satisfy the condition. To restrict the result to
particular columns, the projection operator is used.

2. Projection(π): It is used to project required column data from a relation.

Example: Consider the relation R above. Suppose we want columns B and C from Relation R. π(B,C)R will show following columns.

B C
2 4
2 3
3 4

Note: By Default, projection removes duplicate data.
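Selection and projection can be mimicked with Python set comprehensions. Modelling the relation R as a set of (A, B, C) tuples is an assumption made for illustration; using a set also gives the duplicate removal mentioned in the note.

```python
# Relation R with attributes (A, B, C), as in the example above.
R = {
    (1, 2, 4),
    (2, 2, 3),
    (3, 2, 3),
    (4, 3, 4),
}

# Selection σ(C>3)R: keep whole tuples whose C value exceeds 3.
selected = {t for t in R if t[2] > 3}

# Projection π(B,C)R: keep only columns B and C; the set drops duplicates.
projected = {(b, c) for (_a, b, c) in R}

print(sorted(selected))
print(sorted(projected))
```

The rows (2, 2, 3) and (3, 2, 3) collapse into the single projected row (2, 3).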

3. Union(U): Union operation in relational algebra is the same as union operation in set theory.

Example: Consider the following tables of students having different optional subjects in their course.

FRENCH
Student_Name   Roll_Number
Ram            01
Mohan          02
Vivek          13
Geeta          17

GERMAN
Student_Name   Roll_Number
Vivek          13
Geeta          17
Shyam          21
Rohan          25

π(Student_Name)FRENCH U π(Student_Name)GERMAN
Student_Name
Ram
Mohan
Vivek
Geeta
Shyam
Rohan
Note: The only constraint in the union of two relations is that both relations must have the same set of Attributes.

4. Set Difference(-): Set Difference in relational algebra is the same set difference operation as in set theory.

Example: From the above table of FRENCH and GERMAN, Set Difference is used as follows

π(Student_Name)FRENCH - π(Student_Name)GERMAN
Student_Name
Ram
Mohan
Note: The only constraint in the Set Difference between two relations is that both relations must have the same set of
Attributes.

5. Set Intersection(∩): Set Intersection in relational algebra is the same set intersection operation in set theory.

Example: From the above table of FRENCH and GERMAN, the Set Intersection is used as follows

π(Student_Name)FRENCH ∩ π(Student_Name)GERMAN
Student_Name
Vivek
Geeta
Note: The only constraint in the Set Intersection between two relations is that both relations must have the same set of
Attributes.
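Because these three operators behave exactly like their set-theory counterparts, Python's built-in set type can reproduce the FRENCH and GERMAN results. Representing each relation as a set of (Student_Name, Roll_Number) tuples is an assumption made for illustration.

```python
french = {("Ram", "01"), ("Mohan", "02"), ("Vivek", "13"), ("Geeta", "17")}
german = {("Vivek", "13"), ("Geeta", "17"), ("Shyam", "21"), ("Rohan", "25")}

# π(Student_Name) on each relation.
french_names = {name for name, _roll in french}
german_names = {name for name, _roll in german}

union = french_names | german_names         # FRENCH U GERMAN
difference = french_names - german_names    # FRENCH - GERMAN
intersection = french_names & german_names  # FRENCH ∩ GERMAN

# Intersection can also be built from set difference alone.
derived_intersection = french_names - (french_names - german_names)

print(sorted(union), sorted(difference), sorted(intersection))
```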

6. Rename(ρ): Rename is a unary operation used for renaming attributes of a relation.

ρ(a/b)R will rename the attribute 'b' of the relation R to 'a'.

7. Cross Product(X): The cross product of two relations, say A and B, written A X B, results in a relation that has all
the attributes of A followed by all the attributes of B. Each record of A is paired with every record of B.

Example: A

Name Age Sex


Ram 14 M
Sona 15 F
Kim 20 M
B
ID Course
1 DS
2 DBMS
A X B

Name Age Sex ID Course


Ram 14 M 1 DS
Ram 14 M 2 DBMS
Sona 15 F 1 DS
Sona 15 F 2 DBMS
Kim 20 M 1 DS
Kim 20 M 2 DBMS


Note: If A has ‘n’ tuples and B has ‘m’ tuples then A X B will have ‘ n*m ‘ tuples.
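The pairing rule and the n*m count can be checked with itertools.product from the Python standard library. The lists of tuples below are an assumed representation of relations A and B.

```python
from itertools import product

A = [("Ram", 14, "M"), ("Sona", 15, "F"), ("Kim", 20, "M")]  # (Name, Age, Sex)
B = [(1, "DS"), (2, "DBMS")]                                   # (ID, Course)

# Every tuple of A is concatenated with every tuple of B.
cross = [a + b for a, b in product(A, B)]

for row in cross:
    print(row)
print(len(cross))
```

With n = 3 tuples in A and m = 2 tuples in B, the result has 3*2 = 6 tuples of five attributes each.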

Derived Operators
These are some of the derived operators, which are derived from the fundamental operators.
• Natural Join(⋈)
• Conditional Join
1. Natural Join(⋈): Natural join is a binary operator. Natural join between two or more relations will result in a set of all
combinations of tuples where they have an equal common attribute.

Example: EMP

Name ID Dept_Name
A 120 IT
B 125 HR
C 110 Sales
D 111 IT
DEPT

Dept_Name Manager
Sales Y
Production Z
IT A
Natural join between EMP and DEPT with the condition:

EMP.Dept_Name = DEPT.Dept_Name

EMP ⋈ DEPT

Name ID Dept_Name Manager


A 120 IT A
C 110 Sales Y
D 111 IT A
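A natural join can be sketched as a cross product followed by an equality test on the common attribute, with that attribute kept only once. The nested comprehension below is an illustrative sketch over lists of tuples, not how a database engine evaluates joins.

```python
emp = [("A", 120, "IT"), ("B", 125, "HR"), ("C", 110, "Sales"), ("D", 111, "IT")]
dept = [("Sales", "Y"), ("Production", "Z"), ("IT", "A")]

# Pair each EMP tuple with each DEPT tuple, keep the pairs whose Dept_Name
# values are equal, and emit Dept_Name a single time.
natural_join = [
    (name, emp_id, dept_name, manager)
    for (name, emp_id, dept_name) in emp
    for (d_name, manager) in dept
    if dept_name == d_name
]

for row in natural_join:
    print(row)
```

Employee B (HR) and the Production department disappear from the result because neither has a matching tuple on the other side.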

2. Conditional Join: Conditional join works similarly to natural join. In natural join, by default condition is equal between
common attributes while in conditional join we can specify any condition such as greater than, less than, or not equal.

Example:
R
ID Sex Marks
1 F 45
2 F 55
3 F 60

S
ID Sex Marks
10 M 20
11 M 22
12 M 59

Join between R and S with condition R.marks >= S.marks


R.ID R.Sex R.Marks S.ID S.Sex S.Marks


1 F 45 10 M 20
1 F 45 11 M 22
2 F 55 10 M 20
2 F 55 11 M 22
3 F 60 10 M 20
3 F 60 11 M 22
3 F 60 12 M 59
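The same pattern covers the conditional join: only the test changes, from equality to R.Marks >= S.Marks. As before, lists of tuples are an assumed representation of R and S.

```python
r = [(1, "F", 45), (2, "F", 55), (3, "F", 60)]     # (ID, Sex, Marks)
s = [(10, "M", 20), (11, "M", 22), (12, "M", 59)]  # (ID, Sex, Marks)

# Keep each pair of tuples for which R.Marks >= S.Marks holds.
conditional_join = [a + b for a in r for b in s if a[2] >= b[2]]

for row in conditional_join:
    print(row)
print(len(conditional_join))
```

Only the R tuple with Marks 60 pairs with the S tuple that has Marks 59, which is why the result has seven rows and not nine.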

Relational Calculus
Whereas Relational Algebra is a procedural query language, Relational Calculus is a non-procedural query language. It
deals only with the end result: it states what is to be retrieved but not how to retrieve it.
There are two types of Relational Calculus
• Tuple Relational Calculus(TRC)
• Domain Relational Calculus(DRC)

Structured Query Language (SQL)


One has to write application programs to access data in case of a file system. However, for database management systems
there are special kinds of programming languages called query language that can be used to access data from the database.
The Structured Query Language (SQL) is the most popular query language used by major relational database management
systems such as MySQL, ORACLE, SQL Server, etc. SQL is easy to learn as the statements comprise descriptive English
words, and SQL keywords are not case-sensitive.

We can create and interact with a database using SQL efficiently and easily. The benefit of SQL is that we don’t have to
specify how to get the data from the database. Rather, we simply specify what is to be retrieved, and SQL does the rest.
Although called a query language, SQL can do much more besides querying. SQL provides statements for defining the
structure of the data, manipulating data in the database, declaring constraints and retrieving data from the database in various
ways, depending on our requirements.

How Queries can be Categorized in Relational Database?


The queries to deal with relational database can be categorized as:
• Data Definition Language: It is used to define the structure of the database. e.g., CREATE TABLE, ADD
COLUMN and DROP COLUMN (both clauses of ALTER TABLE) and so on.
• Data Manipulation Language: It is used to manipulate data in the relations.
e.g., INSERT, DELETE, UPDATE and so on.
• Data Query Language: It is used to extract the data from the relations. e.g., SELECT

So first we will consider the Data Query Language. A generic query to retrieve data from a relational database is:
1. SELECT [DISTINCT] Attribute_List FROM R1,R2….RM
2. [WHERE condition]
3. [GROUP BY (Attributes)[HAVING condition]]
4. [ORDER BY(Attributes)[DESC]];
The part of the query represented by statement 1 is compulsory if you want to retrieve data from a relational database. The
clauses written inside [] are optional. We will look at the possible query combinations on the STUDENT relation (its full
contents appear in the output of Case 4 below).

Different Query Combinations


Case 1: If we want to retrieve attributes ROLL_NO and NAME of all students, the query will be:

SELECT ROLL_NO, NAME FROM STUDENT;

ROLL_NO NAME
1 RAM


2 RAMESH
3 SUJIT
4 SURESH
Case 2: If we want to retrieve ROLL_NO and NAME of the students whose ROLL_NO is greater than 2, the query will
be:

SELECT ROLL_NO, NAME FROM STUDENT WHERE ROLL_NO>2;

ROLL_NO NAME
3 SUJIT
4 SURESH
CASE 3: If we want to retrieve all attributes of students, we can write * in place of writing all attributes as:

SELECT * FROM STUDENT WHERE ROLL_NO>2;

ROLL_NO NAME ADDRESS PHONE AGE


3 SUJIT ROHTAK 9156253131 20
4 SURESH DELHI 9156768971 18
CASE 4: If we want to represent the relation in ascending order by AGE, we can use ORDER BY clause as:

SELECT * FROM STUDENT ORDER BY AGE;

ROLL_NO NAME ADDRESS PHONE AGE


1 RAM DELHI 9455123451 18
2 RAMESH GURGAON 9652431543 18
4 SURESH DELHI 9156768971 18
3 SUJIT ROHTAK 9156253131 20
Note:
ORDER BY AGE is equivalent to ORDER BY AGE ASC. If we want to retrieve the results in descending order of AGE,
we can use ORDER BY AGE DESC.

CASE 5: If we want to retrieve the distinct values of an attribute or group of attributes, DISTINCT is used as in:

SELECT DISTINCT ADDRESS FROM STUDENT;

ADDRESS
DELHI
GURGAON
ROHTAK

If DISTINCT is not used, DELHI will appear twice in the result set. Before understanding GROUP BY and HAVING, we
need to understand aggregate functions in SQL.
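Cases 2, 4 and 5 above can be replayed with the following sketch, which assumes Python's built-in sqlite3 module and assumed column types; the rows are those shown in the result tables above. An explicit ORDER BY is added wherever the order of the output matters, because SQL does not otherwise guarantee it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the rows come from the result tables above.
conn.execute(
    "CREATE TABLE STUDENT (ROLL_NO INTEGER PRIMARY KEY, NAME TEXT, "
    "ADDRESS TEXT, PHONE TEXT, AGE INTEGER)"
)
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?)", [
    (1, "RAM", "DELHI", "9455123451", 18),
    (2, "RAMESH", "GURGAON", "9652431543", 18),
    (3, "SUJIT", "ROHTAK", "9156253131", 20),
    (4, "SURESH", "DELHI", "9156768971", 18),
])

def run(sql):
    """Execute one query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Case 2: filter rows with WHERE
case2 = run("SELECT ROLL_NO, NAME FROM STUDENT WHERE ROLL_NO > 2 ORDER BY ROLL_NO")

# Case 4: sort by AGE; ROLL_NO breaks ties between students of equal age
case4_asc = run("SELECT ROLL_NO, AGE FROM STUDENT ORDER BY AGE ASC, ROLL_NO")
case4_desc = run("SELECT ROLL_NO, AGE FROM STUDENT ORDER BY AGE DESC, ROLL_NO")

# Case 5: with and without DISTINCT
with_distinct = run("SELECT DISTINCT ADDRESS FROM STUDENT ORDER BY ADDRESS")
without_distinct = run("SELECT ADDRESS FROM STUDENT ORDER BY ADDRESS")

for name in ("case2", "case4_asc", "case4_desc", "with_distinct", "without_distinct"):
    print(name, globals()[name])
```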

Aggregation Functions
Aggregation functions are used to perform mathematical operations on data values of a relation. Some of the common
aggregation functions used in SQL are:

• COUNT: The COUNT function is used to count rows. COUNT(*) counts all rows of a relation, while COUNT(attribute)
counts only the rows in which that attribute is not NULL. E.g.,

SELECT COUNT(PHONE) FROM STUDENT;


COUNT(PHONE)
4

• SUM: The SUM function is used to add the values of an attribute in a relation, e.g.,
SELECT SUM(AGE) FROM STUDENT;

SUM(AGE)
74
In the same way, MIN, MAX and AVG can be used. As we have seen above, an aggregate function applied without GROUP BY returns only 1 row.

• AVERAGE: It gives the average value of an attribute over the tuples. It is also defined as the sum divided by the count of values.
Syntax:
AVG(attributename)
OR
SUM(attributename)/COUNT(attributename)
The second form also computes the average, but on an integer attribute some DBMSs perform integer division and drop
the fractional part.

• MAXIMUM: It extracts the maximum value among the set of tuples.
Syntax:
MAX(attributename)

• MINIMUM: It extracts the minimum value among the set of tuples.
Syntax:
MIN(attributename)

• GROUP BY: GROUP BY is used to group the tuples of a relation based on an attribute or group of attributes. It is
normally combined with an aggregate function, which is then computed once per group, e.g.,

SELECT ADDRESS, SUM(AGE) FROM STUDENT GROUP BY (ADDRESS);

In this query, SUM(AGE) is computed not for the entire table but for each address, i.e., the sum of AGE for address
DELHI (18+18=36) and similarly for the other addresses. The output is:
ADDRESS SUM(AGE)
DELHI 36
GURGAON 18
ROHTAK 20
The query given below is rejected by standard SQL and by most DBMSs: although SUM(AGE) is computed once for each
address, a group can contain more than one ROLL_NO, so there is no single ROLL_NO to display for it. (A few systems,
SQLite for example, accept the query and return an arbitrary ROLL_NO from each group.) Whenever GROUP BY is used,
every column after SELECT should either appear in the GROUP BY clause or be wrapped in an aggregate function.

SELECT ROLL_NO, ADDRESS, SUM(AGE) FROM STUDENT GROUP BY (ADDRESS);

NOTE:
• An attribute that is not part of the GROUP BY clause can't be selected on its own.
• Any attribute that is part of the GROUP BY clause can be selected, but selecting it is not mandatory.
• An attribute that is not part of the GROUP BY clause can still be used inside an aggregate function.
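The aggregate functions and GROUP BY described above can be checked with a short sketch. It assumes Python's built-in sqlite3 module and assumed column types. The HAVING query is an added illustration of the optional HAVING clause from the generic query: it filters groups after aggregation, the way WHERE filters rows before it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the rows are those of the STUDENT relation above.
conn.execute(
    "CREATE TABLE STUDENT (ROLL_NO INTEGER PRIMARY KEY, NAME TEXT, "
    "ADDRESS TEXT, PHONE TEXT, AGE INTEGER)"
)
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?)", [
    (1, "RAM", "DELHI", "9455123451", 18),
    (2, "RAMESH", "GURGAON", "9652431543", 18),
    (3, "SUJIT", "ROHTAK", "9156253131", 20),
    (4, "SURESH", "DELHI", "9156768971", 18),
])

# Whole-table aggregates: exactly one result row.
# The last column uses SUM/COUNT, which SQLite evaluates with integer division.
whole_table = conn.execute(
    "SELECT COUNT(*), SUM(AGE), AVG(AGE), MIN(AGE), MAX(AGE), SUM(AGE) / COUNT(AGE) "
    "FROM STUDENT"
).fetchone()

# One result row per address
per_address = conn.execute(
    "SELECT ADDRESS, SUM(AGE) FROM STUDENT GROUP BY ADDRESS ORDER BY ADDRESS"
).fetchall()

# HAVING keeps only the groups that satisfy a condition on the aggregate
shared_addresses = conn.execute(
    "SELECT ADDRESS, COUNT(*) FROM STUDENT GROUP BY ADDRESS HAVING COUNT(*) > 1"
).fetchall()

print(whole_table)
print(per_address)
print(shared_addresses)
```

Here AVG(AGE) gives 18.5 while SUM(AGE) / COUNT(AGE) gives 18, which is why AVG is the safer way to compute an average on integer columns.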

Entity-Relationship Model


The Entity-Relationship (ER) Model is a model for identifying the entities to be represented in the database and for
representing how those entities are related. The ER data model specifies an enterprise schema that represents the overall
logical structure of a database graphically.
The Entity-Relationship Diagram shows the relationships among the entities present in the database. ER models are used
to model real-world objects like a person, a car, or a company and the relationships between these real-world objects. In short,
the ER Diagram is the structural format of the database.
Why Use ER Diagrams In DBMS?
• ER diagrams are used to represent the E-R model in a database, which makes them easy to convert into relations
(tables).
• ER diagrams model real-world objects directly, which makes them highly useful.
• ER diagrams require no technical knowledge and no hardware support.
• These diagrams are very easy to understand and easy to create even for a naive user.
• It gives a standard solution for visualizing the data logically.

Symbols Used in ER Model


ER Model is used to model the logical view of the system from a data perspective which consists of these symbols:
• Rectangles: Rectangles represent Entities in the ER Model.
• Ellipses: Ellipses represent Attributes in the ER Model.
• Diamond: Diamonds represent Relationships among Entities.
• Lines: Lines link attributes to entities, and entity sets to relationship types.
• Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
• Double Rectangle: Double Rectangle represents a Weak Entity.

Symbols used in ER Diagram


Components of ER Diagram
ER Model consists of Entities, Attributes, and Relationships among Entities in a Database System.

Components of ER Diagram
Entity
An Entity may be an object with a physical existence – a particular person, car, house, or employee – or it may be an object
with a conceptual existence – a company, a job, or a university course.
Entity Set: An Entity is an object of an Entity Type, and the set of all entities of that type is called an Entity Set. For example, E1 is an
entity having Entity Type Student, and the set of all students is called the Entity Set. In an ER diagram, an Entity Type is represented as:


Entity Set
1. Strong Entity
A Strong Entity is a type of entity that has a key attribute. A Strong Entity does not depend on any other entity in the schema. It
has a primary key that identifies it uniquely, and it is represented by a rectangle. Such entity types are called Strong Entity
Types.
2. Weak Entity
An entity type normally has a key attribute that uniquely identifies each entity in the entity set. But there are some entity types for which
a key attribute can’t be defined. These are called Weak Entity types.

For example, a company may store the information of the dependents (Parents, Children, Spouse) of an Employee. But the
dependents cannot exist without the employee. So Dependent will be a Weak Entity Type and Employee will be the
Identifying Entity type for Dependent, which means it is a Strong Entity Type.
A weak entity type is represented by a Double Rectangle. The participation of weak entity types is always total. The
relationship between the weak entity type and its identifying strong entity type is called identifying relationship and it is
represented by a double diamond.

Strong Entity and Weak Entity


Attributes
Attributes are the properties that define the entity type. For example, Roll_No, Name, DOB, Age, Address, and Mobile_No
are the attributes that define entity type Student. In ER diagram, the attribute is represented by an oval.

Attribute
1. Key Attribute
The attribute which uniquely identifies each entity in the entity set is called the key attribute. For example, Roll_No will be
unique for each student. In an ER diagram, the key attribute is represented by an oval with the attribute name underlined.

Key Attribute
2. Composite Attribute
An attribute composed of several other attributes is called a composite attribute. For example, the Address attribute of the
Student entity type consists of Street, City, State, and Country. In an ER diagram, the composite attribute is represented by an
oval connected to further ovals, one for each component attribute.


Composite Attribute
3. Multivalued Attribute
An attribute that can hold more than one value for a given entity is called a multivalued attribute. For example, Phone_No (there can be
more than one for a given student). In an ER diagram, a multivalued attribute is represented by a double oval.

Multivalued Attribute
4. Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a derived attribute. e.g.; Age (can be
derived from DOB). In ER diagram, the derived attribute is represented by a dashed oval.

Derived Attribute
The Complete Entity Type Student with its Attributes can be represented as:

Entity and Attributes
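The four kinds of attribute described above can also be expressed in code. The following is a minimal sketch in Python, not a standard notation: the class layout, the field names and all sample values (date of birth, street, second phone number) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass(frozen=True)
class Address:
    """Composite attribute: made up of several simpler attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int          # key attribute: unique for each student
    name: str             # simple attribute
    dob: date             # stored attribute from which Age is derived
    address: Address      # composite attribute
    phone_nos: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from DOB instead of being stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


# Hypothetical sample entity
ram = Student(
    roll_no=1,
    name="RAM",
    dob=date(2005, 6, 15),
    address=Address("1 Example Street", "DELHI", "DELHI", "INDIA"),
    phone_nos=["9455123451", "9000000000"],
)
print(ram.age(date(2024, 6, 14)), ram.age(date(2024, 6, 15)))
```

Keeping Age as a function of DOB rather than a stored field avoids the value going stale, which is the usual reason for treating it as a derived attribute.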


Relationship Type and Relationship Set
A Relationship Type represents the association between entity types. For example, ‘Enrolled in’ is a relationship type that
exists between the entity types Student and Course. In an ER diagram, the relationship type is represented by a diamond that is
connected to the participating entities with lines.

Entity-Relationship Set
A set of relationships of the same type is known as a relationship set. The following relationship set depicts S1 as enrolled
in C2, S2 as enrolled in C1, and S3 as enrolled in C3.


Relationship Set

Degree of a Relationship Set


The number of different entity sets participating in a relationship set is called the degree of a relationship set.
1. Unary Relationship: When there is only ONE entity set participating in a relationship, the relationship is called a unary
relationship. For example, one person is married to only one person.

Unary Relationship
2. Binary Relationship: When there are TWO entity sets participating in a relationship, the relationship is called a binary
relationship. For example, a Student is enrolled in a Course.

Binary Relationship
3. n-ary Relationship: When there are n entity sets participating in a relationship, the relationship is called an n-ary
relationship.

Cardinality
The number of times an entity of an entity set participates in a relationship set is known as cardinality. Cardinality can be
of different types:
1. One-to-One: When each entity in each entity set can take part only once in the relationship, the cardinality is one-to-one.
Let us assume that a male can marry one female and a female can marry one male. So the relationship will be one-to-one.
The total number of tables that can be used in this is 2.

one to one cardinality


Using Sets, it can be represented as:

Set Representation of One-to-One

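2. One-to-Many: When an entity in one entity set can be related to more than one entity in the other entity set, while each
entity in the other set is related to at most one entity in the first, the cardinality is one-to-many. Let us assume that one
surgeon department can accommodate many doctors. So the cardinality will be 1 to M. It means one department has many Doctors.
A one-to-many relationship can be mapped with 2 tables, by placing a foreign key in the table on the many side; using a
separate table for the relationship gives 3 tables in total.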

one to many cardinality


Using sets, one-to-many cardinality can be represented as:

Set Representation of One-to-Many

3. Many-to-One: When entities in one entity set can take part only once in the relationship set and entities in other entity
sets can take part more than once in the relationship set, cardinality is many to one. Let us assume that a student can take
only one course but one course can be taken by many students. So the cardinality will be n to 1. It means that for one course
there can be n students but for one student, there will be only one course.
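A many-to-one relationship can also be mapped with 2 tables, by placing a foreign key in the table on the many side; using a
separate table for the relationship gives 3 tables in total.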

many to one cardinality


Using Sets, it can be represented as:


Set Representation of Many-to-One


In this case, each student is taking only 1 course but 1 course has been taken by many students.

4. Many-to-Many: When entities in all entity sets can take part more than once in the relationship, the cardinality is many-to-
many. Let us assume that a student can take more than one course and one course can be taken by many students. So the
relationship will be many-to-many.

The total number of tables that can be used in this is 3.

many to many cardinality


Using Sets, it can be represented as:

Many-to-Many Set Representation


In this example, student S1 is enrolled in C1 and C3, and course C3 is taken by S1, S3, and S4. So it is a many-to-many
relationship.
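The four cardinalities above can be told apart by counting how often each entity occurs in a relationship set. The sketch below is illustrative Python; the function name `cardinality` and the department, doctor and marriage identifiers are hypothetical, while the student/course pairs are the ones named in the example above. Strictly speaking, cardinality is a constraint on the schema, so this function only reports what a given set of pairs exhibits.

```python
from collections import Counter


def cardinality(pairs):
    """Classify a binary relationship set given as (left, right) pairs."""
    pairs = set(pairs)  # a relationship set has no duplicate pairs
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    # Does some left entity relate to several right entities, and vice versa?
    left_has_many = any(n > 1 for n in left_counts.values())
    right_has_many = any(n > 1 for n in right_counts.values())
    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"
    if right_has_many:
        return "many-to-one"
    return "one-to-one"


# Pairs named in the many-to-many example above
enrolled_in = [("S1", "C1"), ("S1", "C3"), ("S3", "C3"), ("S4", "C3")]
# Hypothetical instances for the other three cases
married_to = [("M1", "F1"), ("M2", "F2")]
has_doctor = [("D1", "DR1"), ("D1", "DR2")]
takes_course = [("S1", "C1"), ("S2", "C1"), ("S3", "C2")]

for rel in (married_to, has_doctor, takes_course, enrolled_in):
    print(cardinality(rel))
```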

Participation Constraint
Participation Constraint is applied to the entity participating in the relationship set.
1. Total Participation – Each entity in the entity set must participate in the relationship. If each student must enroll in a
course, the participation of students will be total. Total participation is shown by a double line in the ER diagram.
2. Partial Participation – The entity in the entity set may or may NOT participate in the relationship. If some courses have
no students enrolled in them, the participation of the course will be partial.


The diagram depicts the ‘Enrolled in’ relationship set with Student Entity set having total participation and Course Entity
set having partial participation.

Total Participation and Partial Participation


Using Set, it can be represented as,

Set representation of Total Participation and Partial Participation


Every student in the Student Entity set participates in a relationship but there exists a course C4 that is not taking part in the
relationship.
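Total and partial participation can be checked mechanically by comparing an entity set with the entities that actually occur in the relationship set. This is a small illustrative sketch in Python; the function name `participation` is hypothetical, and the enrolment pairs are assumed (borrowed from the relationship-set example earlier), since the figure itself is not reproduced here.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity appears at `position` of some pair, else 'partial'."""
    participants = {pair[position] for pair in relationship_set}
    return "total" if set(entity_set) <= participants else "partial"


students = {"S1", "S2", "S3"}
courses = {"C1", "C2", "C3", "C4"}
# Assumed (student, course) pairs of the 'Enrolled in' relationship set
enrolled_in = {("S1", "C2"), ("S2", "C1"), ("S3", "C3")}

student_participation = participation(students, enrolled_in, 0)
course_participation = participation(courses, enrolled_in, 1)
courses_without_students = sorted(courses - {course for _, course in enrolled_in})

print(student_participation, course_participation, courses_without_students)
```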
How to Draw ER Diagram?
• The very first step is to identify all the entities, place each of them in a rectangle, and label them accordingly.
• The next step is to identify the relationships between them and place them accordingly using diamonds, making
sure that relationships are not connected to each other.
• Attach attributes to the entities properly.
• Remove redundant entities and relationships.
• Add proper colors to highlight the data present in the database.

Mapping ER Models to Relations


In Database Management Systems, ER stands for Entity-Relationship. ER modelling helps to figure out the set of entities,
the attributes of each entity, and the relationships that are shared between entities. In other words, it helps us to explain the logical
structure of databases.
This image shows the ER-Diagram for a Company Database.


Now, let’s dive in to convert this ER-Diagram to Relational Schema…


Step 1:
• Identify all the regular/strong entities in the diagram and, for each one, create a corresponding relation (table) that
includes all of its simple attributes.
• Choose one of the key attributes as the primary key. If the key is composite, its simple attributes together form the primary key.
• For the given ER-Diagram we have Employee, Department and Project as strong/regular entities, as they are
enclosed in single rectangles.
• So, we create the respective relations, as depicted in the figure below.

After step 1:
Step 2:
• Identify the weak entity types in the diagram and, for each one, create a corresponding relation (table) that includes all
its simple attributes.
• Add as a foreign key all of the primary key attributes of the relation that corresponds to the owner entity.
• The primary key is the combination of the primary key attributes taken from the owner and the partial key of the
weak entity.
• For the given ER-Diagram we have Dependent as a weak entity, as it is enclosed in a double rectangle, which
indicates that an entity is weak.
• The Dependent relation (table) is created as shown in the figure below.


After Step 2:
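Steps 1 and 2 can be written out as table definitions. The sketch below assumes Python's built-in sqlite3 module. Only the relation names Employee, Department, Project and Dependent and the key D_Number come from these notes; every other column name (Ssn, E_Name, D_Name, P_Number, P_Name, E_Ssn, Dep_Name, Relationship) and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
-- Step 1: one relation per strong entity, each with its own primary key
CREATE TABLE Employee   (Ssn TEXT PRIMARY KEY, E_Name TEXT);
CREATE TABLE Department (D_Number INTEGER PRIMARY KEY, D_Name TEXT);
CREATE TABLE Project    (P_Number INTEGER PRIMARY KEY, P_Name TEXT);

-- Step 2: the weak entity carries the owner's key as a foreign key;
-- owner key + partial key (Dep_Name, assumed) form the primary key
CREATE TABLE Dependent (
    E_Ssn        TEXT NOT NULL REFERENCES Employee(Ssn),
    Dep_Name     TEXT NOT NULL,
    Relationship TEXT,
    PRIMARY KEY (E_Ssn, Dep_Name)
);
""")

conn.execute("INSERT INTO Employee VALUES ('E1', 'Asha')")
conn.execute("INSERT INTO Employee VALUES ('E2', 'Ravi')")
# The same dependent name may occur under different employees
conn.execute("INSERT INTO Dependent VALUES ('E1', 'Kiran', 'Child')")
conn.execute("INSERT INTO Dependent VALUES ('E2', 'Kiran', 'Child')")
dependent_count = conn.execute("SELECT COUNT(*) FROM Dependent").fetchone()[0]

# A dependent cannot exist without its owning employee
try:
    conn.execute("INSERT INTO Dependent VALUES ('E9', 'Kiran', 'Child')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(dependent_count, orphan_rejected)
```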
Step 3:
• Now we need to identify the entities in the ER diagram between which there exists a 1-to-1 relationship.
• For such a pair of entities, choose one relation (table) as S and the other as T.
It is better if S has total participation (this reduces the number of NULL values).
• Then we add to S all the simple attributes of the relationship, if there are any.
• After that, we add as a foreign key in S the primary key attributes of T.
• For the given ER-Diagram there exists a 1-to-1 relationship between the Employee and Department entities.
• Here Department has total participation, so we take it as relation S and Employee as relation T.
• The 1-to-1 mapping between Employee and Department is depicted in the figure below.

Step 4:
• Now we need to identify the entities in the ER diagram between which there exists a 1-to-N relationship.
• For such a pair of entities, choose as S the relation for the entity type on the N-side of the
relationship and the other as T.
• Then we add as a foreign key to S all of the primary key attributes of T.
• In the given ER diagram there are two 1-to-N relationships, between the Employee and Department entities
and between the Employee and Dependent entities.
• The 1-to-N mapping between Employee-Department and Employee-Dependent is depicted in the figure below.


After Step 4.
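Steps 3 and 4 both add a foreign key rather than a new table; they differ only in which relation receives it. The sketch below assumes Python's built-in sqlite3 module. The column names Ssn, E_Name, D_Name and Mgr_Ssn, the UNIQUE and NOT NULL constraints, and the sample rows are assumptions chosen to illustrate the rules, not taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (
    Ssn      TEXT PRIMARY KEY,
    E_Name   TEXT,
    -- Step 4 (1-to-N): Employee is on the N-side, so it holds the foreign key
    D_Number INTEGER REFERENCES Department(D_Number)
);
CREATE TABLE Department (
    D_Number INTEGER PRIMARY KEY,
    D_Name   TEXT,
    -- Step 3 (1-to-1): Department is S (total participation), Employee is T.
    -- NOT NULL reflects total participation; UNIQUE keeps it one-to-one.
    Mgr_Ssn  TEXT NOT NULL UNIQUE REFERENCES Employee(Ssn)
);
""")

# The two tables reference each other, so the first employee starts without a department
conn.execute("INSERT INTO Employee VALUES ('E1', 'Asha', NULL)")
conn.execute("INSERT INTO Department VALUES (1, 'Research', 'E1')")
conn.execute("UPDATE Employee SET D_Number = 1 WHERE Ssn = 'E1'")
conn.execute("INSERT INTO Employee VALUES ('E2', 'Ravi', 1)")

# 1-to-N: one department, many employees
staff = conn.execute(
    "SELECT Ssn FROM Employee WHERE D_Number = 1 ORDER BY Ssn"
).fetchall()

# 1-to-1: the same employee cannot be recorded as manager of a second department
try:
    conn.execute("INSERT INTO Department VALUES (2, 'Sales', 'E1')")
    second_department_allowed = True
except sqlite3.IntegrityError:
    second_department_allowed = False

print(staff, second_department_allowed)
```

Placing the 1-to-1 foreign key in Department rather than Employee follows the NULL-reduction advice in Step 3: every department has a manager, but most employees manage nothing.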
Step 5:
• Now we need to identify the entities in the ER diagram between which there exists an M-to-N relationship.
• Create a new relation (table) S.
• The primary keys of the relations (tables) between which the M-to-N relationship exists are added to the new relation
S, where they act as foreign keys; together they form the primary key of S.
• Then we add any simple attributes of the M-to-N relationship to S.
• For the given ER-Diagram there exists an M-to-N relationship between the Employee and Project entities.
• The new table Works_On is created for mapping the relationship between the Employee and Project relations (tables).

After Step 5;
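Step 5 can be sketched as follows, assuming Python's built-in sqlite3 module. The relation name Works_On comes from the notes; the column names (Ssn, E_Name, P_Number, P_Name, E_Ssn), the relationship attribute Hours and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Employee (Ssn TEXT PRIMARY KEY, E_Name TEXT);
CREATE TABLE Project  (P_Number INTEGER PRIMARY KEY, P_Name TEXT);

-- Step 5: the M-to-N relationship becomes its own relation
CREATE TABLE Works_On (
    E_Ssn    TEXT    NOT NULL REFERENCES Employee(Ssn),
    P_Number INTEGER NOT NULL REFERENCES Project(P_Number),
    Hours    REAL,                      -- simple attribute of the relationship (assumed)
    PRIMARY KEY (E_Ssn, P_Number)       -- both foreign keys together
);

INSERT INTO Employee VALUES ('E1', 'Asha'), ('E2', 'Ravi');
INSERT INTO Project  VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO Works_On VALUES ('E1', 1, 10.0), ('E1', 2, 5.5), ('E2', 1, 20.0);
""")

# Joining through Works_On recovers who works on what
assignments = conn.execute("""
    SELECT e.E_Name, p.P_Name
    FROM Works_On AS w
    JOIN Employee AS e ON e.Ssn = w.E_Ssn
    JOIN Project  AS p ON p.P_Number = w.P_Number
    ORDER BY e.E_Name, p.P_Name
""").fetchall()

print(assignments)
```

One employee appears with several projects and one project with several employees, which a single foreign key column in either table could not represent.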
Step 6:
• Now identify the relations (tables) that contain multi-valued attributes.
• For each multi-valued attribute, create a new relation S.
• In the new relation S we add as a foreign key the primary key of the relation that owns the attribute.
• Then we add the multi-valued attribute to S; the combination of all attributes in S forms the primary key.
• For the given ER-Diagram there exists a multi-valued attribute (Locations) in the Department relation (table).
• So, we create a new relation called Dept_Locations. To this new relation we add the primary key
of the Department table, that is D_Number, and the multi-valued attribute Locations.

After step 6.
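Step 6 can be sketched in the same way, assuming Python's built-in sqlite3 module. Dept_Locations and D_Number come from the notes; the column is named Location here because each row holds one value of the Locations attribute. D_Name and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Department (D_Number INTEGER PRIMARY KEY, D_Name TEXT);

-- Step 6: one row per (department, location) value
CREATE TABLE Dept_Locations (
    D_Number INTEGER NOT NULL REFERENCES Department(D_Number),
    Location TEXT    NOT NULL,
    PRIMARY KEY (D_Number, Location)   -- all attributes together form the key
);

INSERT INTO Department VALUES (1, 'Research'), (2, 'Sales');
INSERT INTO Dept_Locations VALUES (1, 'Chennai'), (1, 'Perambalur'), (2, 'Chennai');
""")

research_sites = conn.execute(
    "SELECT Location FROM Dept_Locations WHERE D_Number = 1 ORDER BY Location"
).fetchall()

# The composite primary key stops the same location being listed twice for a department
try:
    conn.execute("INSERT INTO Dept_Locations VALUES (1, 'Chennai')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(research_sites, duplicate_rejected)
```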

Distributed Databases
A distributed database is a database that is not limited to one system; it is spread over different sites, i.e., over
multiple computers or over a network of computers. A distributed database system is located on various sites that don’t
share physical components. This may be required when a particular database needs to be accessed by various users
globally. It needs to be managed such that, to the users, it looks like one single database.

Types:
1. Homogeneous Database:
In a homogeneous database, all sites store the database in the same way. The operating system, the database management
system, and the data structures used are the same at all sites. Hence, such databases are easy to manage.

Homogeneous Databases

2. Heterogeneous Database:
In a heterogeneous distributed database, different sites can use different schemas and software, which can lead to problems in
query processing and transactions. Also, a particular site might be completely unaware of the other sites. Different computers
may use different operating systems and different database applications. They may even use different data models for the
database. Hence, translations are required for different sites to communicate.


Heterogeneous Database
Distributed Data Storage :
There are 2 ways in which data can be stored on different sites. These are:

1. Replication –
In this approach, the entire relation is stored redundantly at 2 or more sites. If the entire database is available at all
sites, it is a fully redundant database. Hence, in replication, systems maintain copies of data.

This is advantageous as it increases the availability of data at different sites. Also, now query requests can be processed in
parallel.
However, it has certain disadvantages as well. Data needs to be constantly updated. Any change made at one site needs to
be recorded at every site where that relation is stored, or else it may lead to inconsistency. This is a lot of overhead. Also,
concurrency control becomes considerably more complex, as concurrent access now needs to be checked over a number of sites.

2. Fragmentation –
In this approach, the relations are fragmented (i.e., they’re divided into smaller parts) and each of the fragments is stored
in different sites where they’re required. It must be made sure that the fragments are such that they can be used to
reconstruct the original relation (i.e, there isn’t any loss of data).
Fragmentation is advantageous because it doesn’t create copies of data, so consistency is not a problem.

Fragmentation of relations can be done in two ways:


• Horizontal fragmentation – Splitting by rows –
The relation is fragmented into groups of tuples so that each tuple is assigned to at least one fragment.
• Vertical fragmentation – Splitting by columns –
The schema of the relation is divided into smaller schemas. Each fragment must contain a common candidate key
so as to ensure a lossless join.
In certain cases, a hybrid of fragmentation and replication is used.
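Horizontal and vertical fragmentation, and the requirement that the original relation can be reconstructed, can be illustrated with a small sketch in plain Python. The STUDENT rows are the ones used in the SQL section of these notes; the choice of ADDRESS as the horizontal split predicate and of the two column groups is an assumption made for illustration.

```python
student = [
    {"ROLL_NO": 1, "NAME": "RAM", "ADDRESS": "DELHI", "PHONE": "9455123451", "AGE": 18},
    {"ROLL_NO": 2, "NAME": "RAMESH", "ADDRESS": "GURGAON", "PHONE": "9652431543", "AGE": 18},
    {"ROLL_NO": 3, "NAME": "SUJIT", "ADDRESS": "ROHTAK", "PHONE": "9156253131", "AGE": 20},
    {"ROLL_NO": 4, "NAME": "SURESH", "ADDRESS": "DELHI", "PHONE": "9156768971", "AGE": 18},
]

# Horizontal fragmentation: split by rows, every tuple lands in some fragment
delhi_site = [row for row in student if row["ADDRESS"] == "DELHI"]
other_site = [row for row in student if row["ADDRESS"] != "DELHI"]
# Reconstruction is the union of the fragments
rebuilt_from_rows = sorted(delhi_site + other_site, key=lambda row: row["ROLL_NO"])

# Vertical fragmentation: split by columns, both fragments keep the key ROLL_NO
personal = [{k: row[k] for k in ("ROLL_NO", "NAME", "AGE")} for row in student]
contact = [{k: row[k] for k in ("ROLL_NO", "ADDRESS", "PHONE")} for row in student]
# Reconstruction is a join on the shared key
contact_by_key = {row["ROLL_NO"]: row for row in contact}
rebuilt_from_columns = [{**row, **contact_by_key[row["ROLL_NO"]]} for row in personal]

print(rebuilt_from_rows == student, rebuilt_from_columns == student)
```

If the vertical fragments did not share ROLL_NO, there would be no way to tell which contact details belong to which student, which is why a common candidate key is required for a lossless join.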

Applications of Distributed Database:


• It is used in Corporate Management Information System.
• It is used in multimedia applications.
• It is used in military control systems, hotel chains, etc.
• It is also used in manufacturing control system.
A distributed database system is a type of database management system that stores data across multiple computers or sites
that are connected by a network. In a distributed database system, each site has its own database, and the databases are
connected to each other to form a single, integrated system.


The main advantage of a distributed database system is that it can provide higher availability and reliability than a
centralized database system. Because the data is stored across multiple sites, the system can continue to function even if
one or more sites fail. In addition, a distributed database system can provide better performance by distributing the data
and processing load across multiple sites.
There are several different architectures for distributed database systems, including:
• Client-server architecture: In this architecture, clients connect to a central server, which manages the distributed
database system. The server is responsible for coordinating transactions, managing data storage, and providing
access control.
• Peer-to-peer architecture: In this architecture, each site in the distributed database system is connected to all
other sites. Each site is responsible for managing its own data and coordinating transactions with other sites.
• Federated architecture: In this architecture, each site in the distributed database system maintains its own
independent database, but the databases are integrated through a middleware layer that provides a common
interface for accessing and querying the data.
Distributed database systems can be used in a variety of applications, including e-commerce, financial services, and
telecommunications. However, designing and managing a distributed database system can be complex and requires
careful consideration of factors such as data distribution, replication, and consistency.
Advantages of Distributed Database System :
1) There is fast data processing as several sites participate in request processing.
2) Reliability and availability of this system are high.
3) It has a reduced operating cost.
4) It is easier to expand the system by adding more sites.
5) It has improved sharing ability and local autonomy.

DBMS vs RDBMS vs FPS: The Comparison

***********************
