Question Bank


Syllabus

16 February 2023
11:03

Core Paper - III: Database Management Systems Year – I Semester-I Credits: 4

Objective of the course:


This course introduces the concepts of database systems design. It aims to give an understanding of the context, phases and
techniques for designing and building database information systems in business, and of the components
of a computerized database information system.

Course Outcomes:
After successful completion of this course, the students should be able to design a correct, new database
information system for a business functional area, implement the design in either SQL or NoSQL, and understand the
concepts of open source databases.

Unit-I: Introduction to Database Systems – Relational Model – Structure – Relational Algebra – Null
Values – SQL – Set Operation – Views – Advanced SQL – Embedded SQL – Recursive Queries – The Tuple Relational
Calculus – Domain Relational Calculus.

Unit-II: E-R Model – Constraints – E-R Diagrams – Weak Entity Sets – Reduction to Relational Schemes
– Relational Database Design – Features of Relational Design – Atomic Domains and First Normal Form
– Decomposition using Functional Dependencies – Multivalued Dependencies – More Normal Forms – Web
Interface – Object-Based Databases – Structured Types and Inheritance in SQL – Table Inheritance – Persistent.

Unit-III: Storage and File Structure – RAID – File Organisation – Indexing and Hashing – B Tree – B Tree Index
files - Static and Dynamic Hashing – Query Processing – Sorting & Join Operators – Query Optimization – Choice of
Evaluation Plans.

Unit-IV: Transaction Management – Implementation of Atomicity and Durability – Serializability – Recoverability –
Concurrency Control – Dead Lock Handling – Recovery System – Buffer Management.

Unit-V: Database System Architecture – Client-Server Architectures – Parallel Systems – Network Types – Distributed
Databases – Homogeneous and Heterogeneous Databases – Directory Systems – Case Study – Oracle – MSSQL Server.

DBMS - PCATC Page 1


Important Topics
19 February 2023
20:57

• Database Management System
• Relational Model
• Normalization
• Database Model Architecture
• Overview of Physical Storage Media
• File Organization
• B+ Tree
• B Tree
• SQL
• ER Diagram
• Relational Algebra
• Relational Calculus
• Deadlock
• Recovery Management
• RAID
• DBMS Architecture



2 Marks
18 February 2023
17:16

1. Define the term DBMS. (Page no 02)

A database is a collection of interrelated data that can be retrieved, inserted, and deleted efficiently. The data is organized
in the form of tables, schemas, views, reports, etc. A database management system (DBMS) is the software used to define, store, and manage such a database.

2. Give the levels of data abstraction.

i. Physical or Internal Level:

The physical or internal level is the lowest level of data abstraction in a database management system. It is the level that
describes how data is actually stored in the database and defines the methods used to access it. It describes complex low-level data
structures in detail, so it is hard to understand, which is why it is kept hidden from the end user.

Database Administrators (DBAs) decide how to arrange the data and where to store it. The DBA is the person whose
role is to manage the data in the database at the physical or internal level. At this level, the raw data is stored securely
on hard drives, for example in a data center.
ii. Logical or Conceptual Level:
The logical or conceptual level is the intermediate level of data abstraction. It describes what data is stored in the
database and what relationships exist among the data.

It describes the structure of the entire database in the form of tables. The logical or conceptual level is less complex than the
physical level. At the logical level, DBAs work with the data without dealing with the raw storage details of the physical level.
iii. View or External Level:
The view or external level is the highest level of data abstraction. Different views at this level each describe a part of the
overall database. This level is for end-user interaction; end users access the data through their queries.

3. What are stored and derived attributes?

i. Stored Attribute:
A stored attribute is an attribute whose value is physically stored in the database.
Assume a table called student with attributes such as student_id, name, roll_no and course_id. We cannot derive the values of
these attributes from other attributes, so they are called stored attributes.
ii. Derived Attribute:
A derived attribute is an attribute whose values are calculated from other attributes. Suppose a student table has the attributes
date_of_birth and age. We can derive the value of age from the date_of_birth attribute.
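
The derivation of age from date_of_birth can be sketched in Python. This is a minimal illustration; the function name age_on and the sample student record are assumptions made for the example, not part of the notes.

```python
from datetime import date

def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in whole years from a stored date_of_birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

# Hypothetical record: only date_of_birth is stored, age is computed on demand.
student = {"student_id": 1, "name": "Abhay", "date_of_birth": date(2004, 3, 15)}
print(age_on(student["date_of_birth"], date(2023, 2, 18)))
```

Because age changes over time, storing only date_of_birth and deriving age avoids keeping a value that would go stale.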

4. Define the term primary key.


A primary key is a column (or combination of columns) of a relational database table designated to uniquely identify each record. Its
values must be unique and not null, so it can be used to locate a record quickly. A table cannot have more than one primary key.
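
A short sketch of how a DBMS enforces a primary key, using Python's built-in sqlite3 module as a stand-in for any relational database. The table and values are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE student (student_id TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES ('BCA1001', 'Abhay')")

try:
    # Second row reuses the same primary key value, so it must be rejected.
    conn.execute("INSERT INTO student VALUES ('BCA1001', 'Anuj')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(duplicate_rejected, row_count)
```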

5. List the properties of decomposition.

i. Lossless Decomposition
• Decomposition must be lossless. It means that no information should be lost from the relation that is
decomposed.
• It guarantees that joining the decomposed relations yields the same relation that was decomposed.

ii. Dependency Preservation

• Dependency is an important constraint on the database.
• Every dependency must be satisfied by at least one decomposed table.
• If {A → B} holds, then B is functionally dependent on A. Checking the dependency is easier
if both attribute sets are in the same relation.
• This property can be achieved only by maintaining the functional dependencies in the decomposed relations.
• This property allows updates to be checked without computing the natural join of the decomposed relations.

iii. Lack of Data Redundancy


• Data redundancy is also known as repetition of information.
• A proper decomposition should not suffer from any data redundancy.
• A careless decomposition may cause problems with the data.
• Lack of data redundancy may be achieved through the normalization process.

6. What is inheritance?

Table inheritance: Only tables that are defined on named row types support table inheritance. Table inheritance is the property that
allows a table to inherit the behavior (constraints, storage options, triggers) from the supertable above it in the table hierarchy.

7. Define the term RAID.(Page no 206)

8. What is sparse index?

A sparse index contains index records for only some of the search-key values in the file. It addresses the space and maintenance
overhead of dense indexing. Each index entry stores the address of a data block that covers a range of key values; to retrieve a record,
the block address for the largest indexed key not exceeding the search key is fetched and that block is scanned.
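
The lookup through a sparse index can be sketched as follows. This is a minimal in-memory illustration: the blocks list stands in for disk blocks, and all keys and values are made up for the example.

```python
from bisect import bisect_right

# Data file: blocks of records sorted by key (hypothetical data).
blocks = [
    [(10, "A"), (20, "B"), (30, "C")],
    [(40, "D"), (50, "E"), (60, "F")],
    [(70, "G"), (80, "H"), (90, "I")],
]
# Sparse index: one entry per block, holding the first key of that block.
index_keys = [block[0][0] for block in blocks]

def lookup(key):
    """Find the block whose first key is the largest one <= key, then scan it."""
    pos = bisect_right(index_keys, key) - 1
    if pos < 0:
        return None  # key is smaller than every indexed key
    for k, value in blocks[pos]:
        if k == key:
            return value
    return None

print(lookup(50), lookup(55))
```

The index holds three entries for nine records; a dense index would need nine.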

9. What are the ACID properties?

A transaction is a single logical unit of work that accesses and possibly modifies the contents of a database. Transactions access data using
read and write operations.
In order to maintain consistency in a database before and after the transaction, certain properties are followed. These are called the
ACID properties: Atomicity, Consistency, Isolation and Durability.

10. Mention the types of serializability.

Two major types of serializability exist: view-serializability and conflict-serializability.

View-serializability matches the general definition of serializability: a schedule is view-serializable if it is view-equivalent to some serial schedule.

Conflict-serializability is a broad special case, i.e., any schedule that is conflict-serializable is also view-serializable, but not necessarily the
opposite.
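
Conflict-serializability is usually tested with a precedence graph: add an edge Ti → Tj whenever an operation of Ti conflicts with a later operation of Tj (same data item, different transactions, at least one write), and check that the graph has no cycle. The sketch below assumes a schedule is given as a list of (transaction, operation, item) triples; that representation is an assumption of this example.

```python
def is_conflict_serializable(schedule):
    """schedule: list of (transaction, 'R' or 'W', data_item) in execution order."""
    edges = {}
    txns = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        txns.add(ti)
        for tj, op_j, y in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and x == y and "W" in (op_i, op_j):
                edges.setdefault(ti, set()).add(tj)

    # Depth-first search for a cycle in the precedence graph.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {t: WHITE for t in txns}

    def has_cycle(t):
        colour[t] = GREY
        for nxt in edges.get(t, ()):
            if colour[nxt] == GREY or (colour[nxt] == WHITE and has_cycle(nxt)):
                return True
        colour[t] = BLACK
        return False

    return not any(colour[t] == WHITE and has_cycle(t) for t in txns)

serial_like = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
interleaved = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
print(is_conflict_serializable(serial_like), is_conflict_serializable(interleaved))
```

In the second schedule T1 must come both before and after T2, so the graph has a cycle and the schedule is not conflict-serializable.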

11. What are distributed databases?

A distributed database is a database that is not limited to one system; it is spread over different sites, i.e., on multiple
computers or over a network of computers. The sites of a distributed database system do not share physical
components. This may be required when a particular database needs to be accessed by various users globally. It needs to be
managed such that to the users it looks like one single database.
Types:
i. Homogeneous Database:
In a homogeneous database, all sites store the database identically. The operating system, the database management system
and the data structures used are the same at all sites. Hence, they are easy to manage.
ii. Heterogeneous Database:
In a heterogeneous distributed database, different sites can use different schemas and software, which can lead to problems in
query processing and transactions. Also, a particular site might be completely unaware of the other sites. Different sites
may use different operating systems and different database applications. They may even use different data models for the database.
Hence, translations are required for different sites to communicate.
12. Write any two DML commands with examples for their usage.

SELECT DML Command


SELECT is the most important data manipulation command in Structured Query Language. The SELECT command shows the records of
the specified table. It can also show only particular records by using the WHERE clause.



Syntax of SELECT DML command
SELECT column_Name_1, column_Name_2, ….., column_Name_N FROM Name_of_table;

Here, column_Name_1, column_Name_2, ….., column_Name_N are the names of those columns whose data we want to retrieve
from the table.

If we want to retrieve the data from all the columns of the table, we have to use the following SELECT command:
SELECT * FROM table_name;

Examples of SELECT Command


Example 1: This example shows all the values of every column from the table.
SELECT * FROM Student;
This SQL statement displays the following values of the student table:
Student_ID   Student_Name   Student_Marks
BCA1001      Abhay          85
BCA1002      Anuj           75
BCA1003      Bheem          60
BCA1004      Ram            79
BCA1005      Sumit          80

INSERT DML Command


INSERT is another important data manipulation command in Structured Query Language, which allows users to insert data into
database tables.

Syntax of INSERT Command

INSERT INTO TABLE_NAME ( column_Name1 , column_Name2 , column_Name3 , .... column_NameN ) VALUES (value_1, value_2, value_3, .... value_N ) ;

Examples of INSERT Command


Example 1: This example describes how to insert a record into a database table.
Let's take the following Student table, which consists of only 2 student records.
Stu_Id   Stu_Name   Stu_Marks   Stu_Age
101      Ramesh     92          20
201      Jatin      83          19

Suppose you want to insert a new record into the Student table. For this, you have to write the following DML INSERT command:
INSERT INTO Student (Stu_Id, Stu_Name, Stu_Marks, Stu_Age) VALUES (104, 'Anmol', 89, 19);
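
The two commands can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: the in-memory database and the column types are assumptions, while the table contents follow the example above. Note that the string value 'Anmol' must be quoted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "Stu_Id INTEGER PRIMARY KEY, Stu_Name TEXT, Stu_Marks INTEGER, Stu_Age INTEGER)"
)
# The two records the table starts with.
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(101, "Ramesh", 92, 20), (201, "Jatin", 83, 19)],
)

# INSERT: add the new record.
conn.execute(
    "INSERT INTO Student (Stu_Id, Stu_Name, Stu_Marks, Stu_Age) "
    "VALUES (104, 'Anmol', 89, 19)"
)

# SELECT: all rows, then only selected columns filtered with WHERE.
all_rows = conn.execute("SELECT * FROM Student ORDER BY Stu_Id").fetchall()
high_scorers = conn.execute(
    "SELECT Stu_Id, Stu_Name FROM Student WHERE Stu_Marks > 85 ORDER BY Stu_Id"
).fetchall()
print(all_rows)
print(high_scorers)
```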

13. List few popular applications of DBMS.

Applications of DBMS
A database management system is used in many fields.
Some of the applications where a database management system is used are −
• Railway Reservation System − The railway reservation system database plays a very important role by keeping records of
ticket bookings and of train departure times and arrival status, and it also gives people information about late trains through
the database.
• Library Management System − There are thousands of books in a library, and it is very difficult to keep a record
of all of them in a notebook or register. With a database it has become easy to track and maintain each book.
A DBMS is used to maintain all the information related to book issue dates, the name of
the book, the author and the availability of the book.
• Banking − Banking is one of the main applications of databases. Thousands of transactions go
through banks daily, and we carry them out without going to the bank. This is possible because a DBMS
manages all the bank transactions.
• Universities and colleges − Nowadays examinations are done online. So, universities and colleges maintain a
DBMS to store student registration details, results, courses and grades in the database.
• Telecommunications − Without a DBMS there would be no telecommunication company. A DBMS is most useful to these
companies for storing call details and monthly postpaid bills.


14. Define the term relational algebra.

Relational Algebra
Relational algebra is a procedural query language that takes instances of relations as input and yields instances of relations as
output. It uses operators to perform queries. An operator can be either unary or binary. Operators accept relations as their input and
yield relations as their output. Relational algebra operations are applied recursively, and intermediate results are also
considered relations.
The fundamental operations of relational algebra are as follows −
• Select (σ)
• Project (∏)
• Union (∪)
• Set difference (−)
• Cartesian product (×)
• Rename (ρ)
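
The six fundamental operations can be sketched in Python by treating a relation as a list of dictionaries, one per tuple. This representation, the function names and the sample data are assumptions of this illustration; union and difference assume both inputs have the same attributes, and the Cartesian product assumes the two relations have no attribute names in common.

```python
def select(relation, predicate):               # σ: keep tuples satisfying a condition
    return [row for row in relation if predicate(row)]

def project(relation, attributes):             # ∏: keep some columns, drop duplicates
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

def union(r, s):                               # ∪: tuples in r or s
    return r + [row for row in s if row not in r]

def difference(r, s):                          # −: tuples in r but not in s
    return [row for row in r if row not in s]

def cartesian_product(r, s):                   # ×: every pairing of tuples
    return [{**a, **b} for a in r for b in s]

def rename(relation, mapping):                 # ρ: rename attributes
    return [{mapping.get(k, k): v for k, v in row.items()} for row in relation]

student = [
    {"id": 1, "name": "Abhay", "marks": 85},
    {"id": 2, "name": "Anuj", "marks": 75},
    {"id": 3, "name": "Bheem", "marks": 60},
]
course = [{"course": "DBMS"}, {"course": "OS"}]

# ∏ name (σ marks >= 75 (student))
top_names = project(select(student, lambda row: row["marks"] >= 75), ["name"])
pairs = cartesian_product(student, course)
print(top_names, len(pairs))
```

Each function returns a relation, which is why the operations can be nested, as in the project-of-select example.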

15. What is data dictionary?

Data Dictionary consists of database metadata. It has records about objects in the database.

What Data Dictionary consists of


Data Dictionary consists of the following information −
• Name of the tables in the database
• Constraints of a table i.e. keys, relationships, etc.
• Columns of the tables that are related to each other
• Owner of the table
• Last accessed information of the object
• Last updated information of the object

16. Define strong and weak entity sets.

Strong Entity
A strong entity is independent of any other entity in the schema. A strong entity always has a primary key. In an ER diagram, a strong
entity is represented by a rectangle. A relationship between two strong entities is represented by a diamond. A set of strong entities is
known as a strong entity set.

Weak Entity
A weak entity is dependent on a strong entity and cannot exist without a corresponding strong entity. It has a foreign key which relates it to the
strong entity. A weak entity is represented by a double rectangle. The relationship between a strong entity and a weak entity is represented
by a double diamond. A weak entity has no primary key of its own; it is identified by a partial key (also called a discriminator) together with the key of the strong entity.

17. What is meant by lossless join decomposition?

Lossless-join decomposition is a decomposition of a relation into two or more relations such that no information is lost. This property guarantees
that neither extra (spurious) tuples nor missing tuples appear when the decomposed relations are joined again.
It is also known as non-additive join decomposition.
When the sub-relations are joined again, the resulting relation must be the same as the original relation before decomposition.
Consider a relation R decomposed into sub-relations R1 and R2.
The decomposition is lossless when it satisfies the following conditions −
• The union of the attributes of R1 and R2 must contain all the attributes of the original relation R before
decomposition.
• The intersection of the attributes of R1 and R2 cannot be empty; the sub-relations must share a common attribute. The common attribute must
be a key of at least one of R1 and R2 (that is, it must contain unique data there).
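
The idea can be checked on a concrete relation instance: project R onto the two attribute sets, natural-join the projections, and compare with R. The sketch below is only an instance-level check on made-up data (the formal property is about every legal instance of the schema), and all function names are assumptions of this example.

```python
def project(relation, attrs):
    unique = {tuple(row[a] for a in attrs) for row in relation}
    return [dict(zip(attrs, values)) for values in unique]

def natural_join(r, s):
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # attributes shared by both relations
    return [{**a, **b} for a in r for b in s if all(a[c] == b[c] for c in common)]

def same_relation(r, s):
    as_set = lambda rel: {frozenset(row.items()) for row in rel}
    return as_set(r) == as_set(s)

def is_lossless_on(relation, attrs1, attrs2):
    """True if joining the two projections gives back exactly this instance."""
    rejoined = natural_join(project(relation, attrs1), project(relation, attrs2))
    return same_relation(rejoined, relation)

employee = [
    {"emp_id": 1, "name": "Asha", "dept": "Sales"},
    {"emp_id": 2, "name": "Ravi", "dept": "Sales"},
]
# Common attribute emp_id is a key: the join restores the original tuples.
good = is_lossless_on(employee, ["emp_id", "name"], ["emp_id", "dept"])
# Common attribute dept is not unique: the join creates spurious tuples.
bad = is_lossless_on(employee, ["emp_id", "dept"], ["name", "dept"])
print(good, bad)
```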

18. Differentiate between super class and sub class.

Superclass vs Subclass

Definition
  Superclass: When implementing inheritance, the existing class from which the new classes are derived is the Superclass.
  Subclass: When implementing inheritance, the class that inherits the properties and methods from the Superclass is the Subclass.
Synonyms
  Superclass: Superclass is known as base class, parent class.
  Subclass: Subclass is known as derived class, child class.
Functionality
  Superclass: A superclass cannot use the properties and methods of the Subclass.
  Subclass: A subclass can use the properties and methods of the Superclass.
Single-Level Inheritance
  Superclass: There is one Superclass.
  Subclass: There is one Subclass.
Hierarchical Inheritance
  Superclass: There is one Superclass.
  Subclass: There are many Subclasses.
Multiple Inheritance
  Superclass: There are many Superclasses.
  Subclass: There is one Subclass.
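
The difference can be illustrated with two small Python classes. The class names Person and Student and their methods are assumptions made for this example.

```python
class Person:                       # superclass (base class, parent class)
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a person"


class Student(Person):              # subclass (derived class, child class)
    def __init__(self, name, roll_no):
        super().__init__(name)      # reuse the superclass initializer
        self.roll_no = roll_no

    def enrolment(self):
        return f"{self.name} has roll number {self.roll_no}"


s = Student("Abhay", 7)
print(s.describe())    # inherited from the superclass
print(s.enrolment())   # defined only in the subclass
```

A Person object has no enrolment method, which mirrors the Functionality row: the subclass can use what the superclass defines, but not the other way round.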

19. What is static hashing?

Static hashing in a Database Management System (DBMS) is a hashing technique in which the number of buckets is fixed. A hash function
maps each search-key value, whatever its size, to one of these bucket addresses, and the same key always maps to the same bucket.
The resulting hash values are also called hash codes, hashes, or digests.
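
A minimal sketch of static hashing, assuming four buckets and the hash function key mod 4; both choices and the sample keys are assumptions of this example. Keys that map to the same bucket are simply chained inside it.

```python
NUM_BUCKETS = 4  # fixed when the file is created and never changed

def bucket_of(key: int) -> int:
    """Hash function: map a search key to a bucket address."""
    return key % NUM_BUCKETS

buckets = [[] for _ in range(NUM_BUCKETS)]

def insert(key, record):
    buckets[bucket_of(key)].append((key, record))

def search(key):
    # Only one bucket has to be examined, whatever the size of the table.
    for k, record in buckets[bucket_of(key)]:
        if k == key:
            return record
    return None

for key, name in [(101, "Ramesh"), (201, "Jatin"), (104, "Anmol"), (105, "Sumit")]:
    insert(key, name)

print(search(201), [len(b) for b in buckets])
```

Three of the four keys land in bucket 1. Because the bucket count cannot grow, such chains get longer as data is added, which is the problem dynamic hashing addresses.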

20. What is sorting?

Sorting Method in DBMS


It is the technique of arranging the records in ascending or descending order of one or more columns. It is useful because
some queries ask for sorted records, and operations such as joins are more efficient on sorted
records. In many systems the records are kept in primary-key order by default. In addition, we can ask for
records to be sorted on other columns, as required.

21. When does a transaction roll back?

A rollback is mainly performed when one or more SQL exceptions occur in the statements of a transaction Ti; Ti is then aborted
and may be started over from the beginning. Rolling back makes it clear what has been committed and what has not been committed.

22. Define the term deadlock.

Deadlock in DBMS
A deadlock is a condition where two or more transactions are waiting indefinitely for one another to give up locks. Deadlock is said to
be one of the most feared complications in a DBMS, as no task ever gets finished and each remains in a waiting state forever.

For example: Transaction T1 holds a lock on some rows in the Student table and needs to update some rows in the Grade table.
Simultaneously, transaction T2 holds locks on some rows in the Grade table and needs to update the rows in the Student table
held by transaction T1.
Now the main problem arises. Transaction T1 is waiting for T2 to release its lock and, similarly, transaction T2 is waiting for T1 to
release its lock. All activity comes to a halt and remains at a standstill until the DBMS detects the
deadlock and aborts one of the transactions.
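
One common way to detect such a situation is a wait-for graph: a node per transaction and an edge Ti → Tj when Ti waits for a lock held by Tj. A cycle in the graph means deadlock. The sketch below assumes the graph is given as a dictionary; the function name and data are assumptions of this example.

```python
def find_deadlock(waits_for):
    """waits_for: dict mapping a transaction to the set of transactions it waits on.
    Returns the list of transactions on a cycle, or None if there is no deadlock."""
    visited = set()

    def dfs(t, path):
        if t in path:                    # came back to a transaction on the current path
            return path[path.index(t):]
        if t in visited:                 # already explored without finding a cycle
            return None
        visited.add(t)
        for nxt in waits_for.get(t, ()):
            cycle = dfs(nxt, path + [t])
            if cycle:
                return cycle
        return None

    for t in waits_for:
        cycle = dfs(t, [])
        if cycle:
            return cycle
    return None

# T1 waits for T2 (Grade rows) and T2 waits for T1 (Student rows).
deadlocked = find_deadlock({"T1": {"T2"}, "T2": {"T1"}})
no_deadlock = find_deadlock({"T1": {"T2"}, "T2": set()})
print(deadlocked, no_deadlock)
```

Once a cycle is found, the DBMS would pick one transaction on it as the victim and abort it so the others can proceed.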



23. What is meant by client server?(Page no 382)

24. What are the network types?

Computer Network Types


A computer network is a group of computers linked to each other that enables each computer to communicate with another computer
and share resources, data, and applications.
Computer networks can be categorized by their size. A computer network is mainly of four types:

• LAN (Local Area Network)
• PAN (Personal Area Network)
• MAN (Metropolitan Area Network)
• WAN (Wide Area Network)

25. What is meant by database?(Repeated Question)

26. Expand and write a note on the term SQL.

SQL is a language to operate databases; it includes database creation, database deletion, fetching data rows, modifying and deleting
data rows, etc.
SQL stands for Structured Query Language, which is a computer language for storing, manipulating and retrieving data stored in a
relational database. SQL was developed in the 1970s by IBM computer scientists and became a standard of the American
National Standards Institute (ANSI) in 1986, and of the International Organization for Standardization (ISO) in 1987.

Though SQL is an ANSI (American National Standards Institute) standard language, there are many different dialects of the SQL language;
for example, MS SQL Server uses T-SQL and Oracle uses PL/SQL.

SQL is the standard language to communicate with Relational Database Systems. All the Relational Database Management Systems
(RDBMS) like MySQL, MS Access, Oracle, Sybase, Informix, Postgres and SQL Server use SQL as their standard database language.

27. What is tuple relational calculus?(Page no 110)

Tuple Relational Calculus (TRC)


It is a non-procedural query language based on finding tuple variables (also known as range variables) for which a
predicate holds true. It describes the desired information without giving a specific procedure for obtaining that information. The
tuple relational calculus is used to select tuples from a relation. In TRC, the filtering variable ranges over the tuples of a relation. The resulting
relation can have one or more tuples.
Notation:
A Query in the tuple relational calculus is expressed as following notation
{T | P (T)} or {T | Condition (T)}
Where
T is the resulting tuples
P(T) is the condition used to fetch T.

For example:

{ T.name | Author(T) AND T.article = 'database' }
Output: This query selects tuples from the Author relation. It returns the 'name' of every author who has written an
article on 'database'.
Formulas in TRC (tuple relational calculus) can be quantified using the existential quantifier (∃) and the universal quantifier (∀).

For example:
{ R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
Output: This query yields the same result as the previous one.
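
Python set comprehensions read very much like TRC expressions, so the two queries above can be sketched as follows. The Author tuples are made-up sample data, and the mapping from TRC to comprehensions is only an informal analogy.

```python
from collections import namedtuple

Author = namedtuple("Author", ["name", "article"])
authors = {
    Author("Asha", "database"),
    Author("Ravi", "networks"),
    Author("Meena", "database"),
}

# { T.name | Author(T) AND T.article = 'database' }
names = {t.name for t in authors if t.article == "database"}

# { R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
candidate_names = {t.name for t in authors}
names_exists = {
    name
    for name in candidate_names
    if any(t.article == "database" and t.name == name for t in authors)  # ∃T
}
print(names, names == names_exists)
```

Both forms state which tuples are wanted, not how to fetch them; `any(...)` plays the role of ∃, and `all(...)` would play the role of ∀.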

28. What is called E-R Model?(Page no 122)

29. What are multi valued dependencies?

Multivalued Dependency
• A multivalued dependency occurs when two attributes in a table are independent of each other but both depend on a third
attribute.
• A multivalued dependency involves at least two attributes that depend on a third attribute, which is why it always requires
at least three attributes.

30. Define the term persistence.

Persistence ensures that data in a database will not be altered without authorization and will be accessible for as long as the
company requires it. The relational database management system, or RDBMS, is the forefather of permanent data storage.

31. What is meant by RAID?(Repeated Question)

32. What is meant by external sorting?

External sorting is a term for a class of sorting algorithms that can handle massive amounts of data. External sorting is required
when the data being sorted does not fit into the main memory of a computing device (usually RAM) and instead must reside in
the slower external memory (usually a hard drive).
External sorting typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main
memory are read, sorted, and written out to a temporary file. In the merge phase, the sorted sub-files are combined into a single
larger file.
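
The sort-merge strategy can be sketched in Python. This is a simplified illustration: chunk_size stands in for the amount of data that fits in main memory, the runs are plain text files with one integer per line, and the merged result is returned as a list only to keep the demonstration short (a real external sort would stream it to an output file).

```python
import heapq
import os
import tempfile

def read_run(f):
    """Yield the integers stored one per line in an open run file."""
    for line in f:
        yield int(line)

def external_sort(values, chunk_size):
    run_paths, chunk = [], []

    def flush():
        # Sorting phase: sort one in-memory chunk and write it out as a run.
        if chunk:
            fd, path = tempfile.mkstemp(text=True)
            with os.fdopen(fd, "w") as f:
                f.writelines(f"{v}\n" for v in sorted(chunk))
            run_paths.append(path)
            chunk.clear()

    for v in values:
        chunk.append(v)
        if len(chunk) == chunk_size:
            flush()
    flush()

    # Merge phase: k-way merge that reads the sorted runs sequentially.
    files = [open(p) for p in run_paths]
    try:
        return list(heapq.merge(*(read_run(f) for f in files)))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)

print(external_sort([5, 3, 9, 1, 7, 2, 8], chunk_size=3))
```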
33. List out the states in transaction management.

States of Transactions
A transaction in a database can be in one of the following states −

• Active − In this state, the transaction is being executed. This is the initial state of every transaction.
• Partially Committed − When a transaction executes its final operation, it is said to be in a partially committed state.
• Failed − A transaction is said to be in a failed state if any of the checks made by the database recovery system fails. A
failed transaction can no longer proceed further.
• Aborted − If any of the checks fails and the transaction has reached a failed state, then the recovery manager rolls
back all its write operations on the database to bring the database back to its original state where it was prior to the
execution of the transaction. Transactions in this state are called aborted. The database recovery module can select
one of the two operations after a transaction aborts −
○ Re-start the transaction
○ Kill the transaction
• Committed − If a transaction executes all its operations successfully, it is said to be committed. All its effects are now
permanently established on the database system.

34. What is concurrency control?

Concurrency control is part of transaction management in a database management system (DBMS). It is a procedure in the DBMS that
manages simultaneous operations so that they execute without conflicting with each other; such conflicts occur in
multi-user systems.
Concurrency can simply be described as executing multiple transactions at the same time. It is required to increase time efficiency. If many
transactions try to access the same data, inconsistency can arise. Concurrency control is required to maintain the consistency of the data.

35. Give the general structure of a client/server architecture.(Page no 383)

36. Expand LDAP.(Page no 408)

Lightweight Directory Access Protocol (LDAP) is an Internet protocol that runs over TCP/IP and is used to access information in
directories. LDAP is commonly used to access directory services such as Active Directory.
Features of LDAP:
1. The functional model of LDAP is simpler because it omits duplicate, rarely used, and esoteric features.
2. It is easier to understand and implement.
3. It uses strings to represent data.

37. Differentiate between schema and instance.

Schema vs Instance
1. Schema: It is the overall description of the database.
   Instance: It is the collection of information stored in a database at a particular moment.
2. Schema: Schema is the same for the whole database.
   Instance: Data in instances can be changed using addition, deletion and updation.
3. Schema: Does not change frequently.
   Instance: Changes frequently.
4. Schema: Defines the basic structure of the database, i.e., how the data will be stored in the database.
   Instance: It is the set of information stored at a particular time.


38. What is the importance of handling null values in a relation?

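Allowing NULL values in a relation means that comparisons no longer evaluate to just TRUE or FALSE; a comparison involving NULL evaluates to
UNKNOWN, giving a three-valued logic. If NULLs are not handled deliberately, queries can silently omit rows or return misleading results. At best this leads
to increased code complexity, because queries need null-handling functions, WHERE clauses split into separate cases, and inference about what a missing value means.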
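
The effect of the UNKNOWN truth value can be seen with Python's built-in sqlite3 module. The table and data are assumptions made for this sketch; the behaviour shown (a row with NULL marks matches neither a condition nor its negation) is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, marks INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Abhay", 85), ("Anuj", None), ("Bheem", 60)],  # None is stored as NULL
)

def names(where):
    sql = f"SELECT name FROM student WHERE {where} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

above = names("marks > 70")            # TRUE only for Abhay
not_above = names("NOT (marks > 70)")  # TRUE only for Bheem; Anuj is UNKNOWN
equals_null = names("marks = NULL")    # always UNKNOWN, so no rows
is_null = names("marks IS NULL")       # the correct test for missing values
print(above, not_above, equals_null, is_null)
```

Anuj appears in neither of the first two results, which is the kind of silent omission that makes explicit NULL handling important.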

39. Define the terms domain, attribute, tuple and relation.

(a) Relation :- Data arranged in rows and columns and having certain properties.
(b) Domain :- This is a pool of values from which the actual values appearing in a given column are drawn.
(c) Tuple :- A row of a relation is known as tuple.
(d) Attribute :- A column of a relation is known as attribute.

40. What is the objective of normalization?

Objective of Normalization
1. It is used to remove duplicate data and database anomalies from the relational table.
2. Normalization helps to reduce redundancy and complexity by examining new data types used in the table.
3. It is helpful to divide a large database table into smaller tables and link them using relationships.
4. It avoids duplicate data and repeating groups in a table.
5. It reduces the chances for anomalies to occur in a database.

41. Define multivalued dependency.(Repeated Question)

42. Mention the storage types.

Types of Data Storage


For storing the data, there are different types of storage options available. These storage types differ from one another as per the
speed and accessibility. There are the following types of storage devices used for storing the data:
• Primary Storage
• Secondary Storage
• Tertiary Storage

43. Define term index.

Indexing in DBMS
• Indexing is used to optimize the performance of a database by minimizing the number of disk accesses required when a query is
processed.
• The index is a type of data structure. It is used to locate and access the data in a database table quickly.
Index structure:
Indexes can be created using some database columns.

• The first column of the index is the search key, which contains a copy of the primary key or candidate key of the table. The
values of the primary key are stored in sorted order so that the corresponding data can be accessed easily.
• The second column of the index is the data reference. It contains a set of pointers holding the address of the disk block where
the value of the particular key can be found.

44. What is B tree?


A B-Tree is a specialized m-way search tree that is widely used for disk access. A B-Tree of order m can have at most m-1 keys and m
children per node. One of the main reasons for using a B-Tree is its ability to store a large number of keys in a single node, and large key
values, while keeping the height of the tree relatively small.
A B-Tree of order m has all the properties of an m-way tree. In addition, it has the following properties.
1. Every node in a B-Tree contains at most m children.
2. Every node in a B-Tree except the root node and the leaf nodes contains at least ⌈m/2⌉ children.
3. The root node must have at least 2 children, unless it is a leaf.
4. All leaf nodes must be at the same level.
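
Searching a B-Tree follows one path from the root to a leaf, choosing a child at each node by comparing with the keys stored there. The sketch below uses a small hand-built tree of order 3; the class name, the tree contents and the use of in-memory objects instead of disk blocks are assumptions of this example, and insertion with node splitting is not shown.

```python
from bisect import bisect_left

class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted keys, at most m-1 of them
        self.children = children or []   # empty for a leaf, else len(keys)+1 children

def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    i = bisect_left(node.keys, key)      # first position with keys[i] >= key
    if i < len(node.keys) and node.keys[i] == key:
        return True
    if not node.children:                # reached a leaf without finding the key
        return False
    return search(node.children[i], key) # descend into the i-th child

# Order m = 3: every node has at most 2 keys and 3 children; all leaves on one level.
root = BTreeNode(
    [20, 40],
    [BTreeNode([5, 10]), BTreeNode([25, 30]), BTreeNode([45, 50])],
)
print(search(root, 30), search(root, 35))
```

Each step down the tree corresponds to one disk access, so a small height means few accesses per lookup.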

45. List the properties of transaction.

Transaction property
A transaction has four properties. These are used to maintain consistency in a database before and after the transaction.
Property of Transaction
1. Atomicity
2. Consistency
3. Isolation
4. Durability

46. Give the reasons for allowing concurrency.

• Waiting Time: The time a process spends in the ready state without getting the system to execute on is called
waiting time. Concurrency leads to less waiting time.
• Response Time: The time taken to get the first response from the CPU is called response time.
Concurrency leads to less response time.
• Resource Utilization: The extent to which the resources of a system are kept in use is called resource utilization. Multiple
transactions can run in parallel in a system, so concurrency leads to more resource utilization.
• Efficiency: The amount of output produced in comparison to the given input is called efficiency. Concurrency leads to more
efficiency.



47. What is parallel system?

A parallel DBMS is a DBMS that runs across multiple processors or CPUs and is mainly designed to execute query operations in
parallel, wherever possible. A parallel DBMS links a number of smaller machines to achieve the same throughput as expected
from a single large machine.
There are mainly three architectural designs for a parallel DBMS. They are as follows:
(1) Shared Memory Architecture
(2) Shared Disk Architecture
(3) Shared Nothing Architecture

48. Write the syntax for create table command.

Creating a basic table involves naming the table and defining its columns and each column's data type.
The SQL CREATE TABLE statement is used to create a new table.
Syntax
The basic syntax of the CREATE TABLE statement is as follows −
CREATE TABLE table_name(
column1 datatype,
column2 datatype,
column3 datatype,
.....
columnN datatype,
PRIMARY KEY( one or more columns )
);



6 Marks
18 February 2023
17:18

1. Describe the disadvantages of file processing system.
2. Brief on set operations.(Page no 78)
3. Write the features of relational design.
4. Write short notes on normalization.
5. What is static hashing? Why do we need dynamic hashing?
6. Briefly explain Buffer management.(Page 370)
7. Compare Homogeneous and Heterogeneous databases.
8. Write short notes on set operations.(Repeated Question)
9. With examples, explain recursive queries.
10. What are various integrity constraints? Explain.(Page no 94)
11. Write about RAID.(Page Number 206)
12. Give a brief account on join operators with example.
13. Briefly explain concurrency control with locking methods.(Page no 340)
14. Elaborate on directory system.
15. Describe the relational model.
16. Discuss in detail about set operations.(Repeated Questions)
17. Write short notes on basic structure of E-R diagrams.(Page no 5.4)
18. Narrate the structure types and inheritance in SQL.(Page no 178)
19. Explain join operation in query processing.(Page no 287)
20. Give short notes on deadlock handling.(Page no 347)
21. Explain client/server architecture.(Page no 382)
22. What is a view? Explain it.
23. Briefly explain embedded SQL.
24. Brief on relational database design.
25. Write short notes on RAID.(Repeated Questions)
26. Give brief account on query optimization.
27. Describe briefly about deadlocks.
28. List and explain network types.



Relational database design
19 February 2023
21:18

Relational ...

Features of Good Relational Designs


There are two design alternatives to consider for a good relational design:
i. Design alternative: Larger Schemas
ii. Design alternative: Smaller Schemas



Network Types
19 February 2023
17:01

Local Area Network (LAN) :


LAN is the most frequently used network. A LAN is a computer network that connects computers together through a common
communication path, contained within a limited area, that is, locally. A LAN encompasses two or more computers connected over a
server. The two important technologies involved in this network are Ethernet and Wi-Fi.
Examples of LAN are networking in a home, school, library, laboratory, college, office, etc.

Advantages of LAN
• LAN can share data at speeds ranging from 10 Mbps to 1000 Mbps. The transmission speed of data is high in LAN networks because
the range of the LAN is limited to a certain space.
• Multiple computers and devices like printers and scanners can be connected using a LAN cable.
• LAN is considered a relatively secure network as it can be accessed only within a specific range and, where credentials are
configured, a device cannot connect without the ID and password.
• The ownership of the LAN network is private. It can be accessed only when the user has an authentic user ID and password.
• The user can download or upload any document over the LAN network and print any copy through the printer connected to the
same LAN.
• Any software and application can also be downloaded and uploaded using LAN.
• Usually, the range of the LAN network is 0-150 m, but the range can be extended up to 1 km if required.
• It becomes easy for users to keep their data secured because, on a LAN, all the data is stored in one place, which
is referred to as the host computer.
• Users do not need to purchase separate printers or scanners for each computer, as the LAN allows them to share one
printer with all the other computers connected to the same LAN, which reduces purchasing
costs.
• LAN enables the users to share one internet connection with all other computers or devices connected to it.
• LAN is also very cheap to use as the users can share data with other connected devices instantly and cheaply.
Disadvantages of LAN
• LAN does not cost much compared to the other options available. But initially, to set up the LAN, a high cost
has to be incurred by the user for its proper installation, as there are some software/hardware requirements.
• The tools which are required for installing the LAN and for its proper working are somewhat costly. These tools are Ethernet
cables, routers, switches, etc.
• All the connected users on a single LAN can access the files and data of other devices which are connected on the same LAN. They
can also access the internet history of each device that is connected through that LAN, which means that LAN does not provide
privacy to its users from inside accesses.
• The range of the LAN is limited, and therefore only those who are in the range of the LAN can use it.
• As all the data of the different devices connected through LAN is stored in the server/host computer, the entire data can be
accessed at once from that one machine, which means that there is always a risk to data privacy, including loss and misuse.



• There should be someone with proper knowledge of LAN and networking, as a LAN needs regular maintenance. Problems such as
hardware failure and system failure occur often. Therefore, those who are using a LAN in the office
or for some other official work should keep someone as a full-time employee to fix this kind of issue as soon as it arises.
• The presence of any kind of virus is very dangerous for all the computers connected on the same LAN. If any one of the
computers is affected by the virus, then there are chances that all of the other computers will also be affected.
• As all the data of the connected devices is stored on one central server, in case of server failure no files of the
connected devices can be accessed.

Wide Area Network (WAN) :


WAN is a type of computer network that connects computers over a large geographical distance through a shared communication
path. It is not restricted to a single location but extends over many locations. WAN can also be defined as a group of local area
networks that communicate with each other.
The most common example of WAN is the Internet.

Examples Of Wide Area Network:


• Mobile Broadband: A 4G network is widely used across a region or country.
• Last mile: A telecom company provides internet services to customers in hundreds of cities by connecting their
homes with fiber.
• Private network: A bank provides a private network that connects its 44 offices. This network is made by using a telephone
leased line provided by a telecom company.

Advantages Of Wide Area Network:


Following are the advantages of the Wide Area Network:
• Geographical area: A Wide Area Network covers a large geographical area. If a branch of our office is in a different
city, we can connect with it through a WAN. A telecom provider can supply a leased line through which we can connect with
another branch.
• Centralized data: In a WAN, data is centralized. Therefore, we do not need to buy separate email, file or backup
servers.
• Get updated files: Software companies work on a live server. Therefore, the programmers get the updated files within seconds.
• Exchange messages: In a WAN, messages are transmitted fast. Web applications like Facebook, WhatsApp and Skype
allow you to communicate with friends.
• Sharing of software and resources: In a WAN, we can share software and other resources like a hard drive or RAM.
• Global business: We can do business over the internet globally.
• High bandwidth: If we use leased lines for our company, this gives high bandwidth. The high bandwidth increases
the data transfer rate, which in turn increases the productivity of our company.

Disadvantages of Wide Area Network:


The following are the disadvantages of the Wide Area Network:
• Security issue: A WAN has more security issues than a LAN or MAN because it combines many different technologies,
which creates security problems.


• Needs firewall & antivirus software: The data is transferred over the internet, where it can be changed or hacked by hackers, so
a firewall needs to be used. Some people can inject a virus into our system, so antivirus software is needed to protect against such a virus.
• High setup cost: The installation cost of a WAN is high, as it involves purchasing routers and switches.
• Troubleshooting problems: A WAN covers a large area, so fixing problems is difficult.



Deadlock
19 February 2023
16:55

Deadlock in DBMS
A deadlock is a condition where two or more transactions are waiting indefinitely for one another to give up locks. Deadlock is said to be
one of the most feared complications in DBMS, as no task ever gets finished and each remains in a waiting state forever.

For example: In the student table, transaction T1 holds a lock on some rows and needs to update some rows in the grade table.
Simultaneously, transaction T2 holds locks on some rows in the grade table and needs to update the rows in the Student table held
by Transaction T1.
Now the main problem arises: transaction T1 is waiting for T2 to release its lock and, similarly, transaction T2 is waiting for T1 to
release its lock. All activities come to a halt and remain at a standstill until the DBMS detects the
deadlock and aborts one of the transactions.

Deadlock Avoidance
• It is better to avoid a deadlock than to abort or restart transactions after the database is stuck in a deadlock state,
since aborting and restarting is a waste of time and resources.
• A deadlock avoidance mechanism is used to detect any deadlock situation in advance. A method like the "wait-for graph" is used for
detecting the deadlock situation, but this method is suitable only for smaller databases. For larger databases, the deadlock
prevention method can be used.

Deadlock Detection
In a database, when a transaction waits indefinitely to obtain a lock, the DBMS should detect whether the transaction is involved in
a deadlock or not. The lock manager maintains a wait-for graph to detect the deadlock cycle in the database.

Wait-for Graph


• This is a suitable method for deadlock detection. In this method, a graph is created based on the transactions and their locks. If the
created graph has a cycle or closed loop, then there is a deadlock.
• The wait-for graph is maintained by the system for every transaction which is waiting for some data held by the others. The
system keeps checking the graph for cycles.
In the wait-for graph for the above scenario, T1 waits for T2 and T2 waits for T1, so the two edges form a cycle.
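The cycle check on a wait-for graph can be sketched in Python as a depth-first search. This is an illustrative sketch only: the dictionary representation (each transaction mapped to the transactions it waits for) and the function name has_deadlock are assumptions, not how any particular DBMS stores its lock table.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle.

    wait_for maps a transaction name to the list of transactions
    it is currently waiting for (an assumed representation).
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    nodes = set(wait_for) | {t for ts in wait_for.values() for t in ts}
    colour = {t: WHITE for t in nodes}

    def visit(txn):
        colour[txn] = GREY
        for blocker in wait_for.get(txn, ()):
            if colour[blocker] == GREY:
                return True  # back edge: the path loops onto itself
            if colour[blocker] == WHITE and visit(blocker):
                return True
        colour[txn] = BLACK
        return False

    return any(colour[t] == WHITE and visit(t) for t in nodes)


# The scenario from the notes: T1 waits for T2 and T2 waits for T1.
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))
# No cycle: T1 waits for T2, and T2 is not waiting for anyone.
print(has_deadlock({"T1": ["T2"]}))
```

Once a cycle is found, the DBMS aborts one of the transactions on it, as described earlier in this section.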



Deadlock Prevention
• Deadlock prevention method is suitable for a large database. If the resources are allocated in such a way that deadlock never
occurs, then the deadlock can be prevented.
• The database management system analyzes the operations of a transaction to determine whether they can create a deadlock situation or
not. If they can, the DBMS never allows that transaction to be executed.
Wait-Die scheme
In this scheme, if a transaction requests a resource which is already held with a conflicting lock by another transaction, then the
DBMS simply checks the timestamps of both transactions. It allows the older transaction to wait until the resource is available for
execution.
Let's assume there are two transactions Ti and Tj, and let TS(T) be the timestamp of any transaction T. If Tj holds a lock on a
resource and Ti requests that resource, then the following actions are performed by the DBMS:
1. If TS(Ti) < TS(Tj), Ti is the older transaction and Tj holds the resource, so Ti is allowed to wait until the data item is
available for execution. That is, if the older transaction is waiting for a resource which is locked by the younger
transaction, then the older transaction is allowed to wait until the resource is available.
2. If TS(Ti) > TS(Tj), Ti is the younger transaction and is requesting a resource held by the older Tj, so Ti is killed and
restarted later with a random delay but with the same timestamp.
Wound-Wait scheme
• In the wound-wait scheme, if the older transaction requests a resource which is held by the younger transaction, then the older
transaction forces the younger one to abort and release the resource. After a short delay, the younger transaction is
restarted but with the same timestamp.
• If the older transaction holds a resource which is requested by the younger transaction, then the younger transaction is asked to
wait until the older one releases it.
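The two schemes differ only in what happens for each ordering of the timestamps, which a short Python sketch can make concrete. The function names and the returned strings are assumptions for illustration; a smaller timestamp means an older transaction.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits, a younger requester dies."""
    if requester_ts < holder_ts:
        return "requester waits"
    return "requester aborts"  # restarted later with the same timestamp


def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds the holder, a younger one waits."""
    if requester_ts < holder_ts:
        return "holder aborts"  # restarted later with the same timestamp
    return "requester waits"


# Older transaction (timestamp 5) requests a lock held by a younger one (timestamp 9).
print(wait_die(5, 9), "|", wound_wait(5, 9))
# Younger transaction (timestamp 9) requests a lock held by an older one (timestamp 5).
print(wait_die(9, 5), "|", wound_wait(9, 5))
```

In both schemes it is always the younger transaction that is aborted, so the oldest transaction can never be rolled back and eventually completes.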

Differences between the Wait-Die and Wound-Wait prevention schemes:

Wait-Die | Wound-Wait
It is based on a non-preemptive technique. | It is based on a preemptive technique.
In this, older transactions must wait for the younger one to release its data items. | In this, older transactions never wait for younger transactions.
The number of aborts and rollbacks is higher in this technique. | In this, the number of aborts and rollbacks is lower.

Recovery from Deadlock (Page no 351)



Embedded SQL
19 February 2023
16:50

Embedded SQL combines a high-level language with a DB language like SQL. It allows application languages to communicate with the DB and
get the requested result. The high-level languages which support embedding SQL within them are known as host languages. Several host languages
support embedded SQL, such as C, C++, Ada, Pascal, FORTRAN, Java etc. In Oracle, SQL embedded within C or C++ is known as Pro*C/C++ or simply
Pro*C. Pro*C is a commonly used form of embedded SQL.

Structure of Embedded SQL


The structure of embedded SQL defines the step-by-step process of establishing a connection with the DB and executing code in the DB from
within the high-level language.

Connection to DB
This is the first step while writing a query in a high-level language. First, a connection to the DB that we are accessing needs to be
established. This can be done using the keyword CONNECT. But it must be preceded by ‘EXEC SQL’ to indicate that it is a SQL
statement.
EXEC SQL CONNECT db_name;
EXEC SQL CONNECT HR_USER; //connects to DB HR_USER

Declaration Section
Once a connection is established with the DB, we can perform DB transactions. These DB transactions depend on the values
and variables of the host language: depending on their values, a query will be written and executed. Similarly, results of a DB query
will be returned to the host language and captured by variables of the host language. Hence we need to declare the variables used to
pass values to the query and get values from the query. There are two types of variables used in the host language.
• Host variable : These are the variables of the host language used to pass values to the query as well as to capture the values
returned by the query. Since SQL is dependent on the host language, we have to use variables of the host language, and such variables are
known as host variables. But these host variables should be declared within the SQL area or within SQL code, so that the
precompiler is able to differentiate them from normal C variables. Hence we have to declare host variables within a BEGIN
DECLARE and END DECLARE section. Again, this declare block should be enclosed within EXEC SQL and ‘;’.
EXEC SQL BEGIN DECLARE SECTION;
int STD_ID;
char STD_NAME [15];
char ADDRESS[20];
EXEC SQL END DECLARE SECTION;
We can note here that the variables are written inside the begin and end block of the SQL, but they are declared using C code. SQL code is not used to declare the
variables. Why? This is because they are host variables – variables of the C language. Hence we cannot use SQL syntax to declare them. The host language supports
almost all the datatypes: int, char, long, float, double, pointer, array, string, structures etc.
When a host variable is used in a SQL query, it should be preceded by a colon – ‘:’ – to indicate that it is a host variable. The
precompiler then recognizes it as a host variable, and its value is supplied when the statement executes.
EXEC SQL SELECT * FROM STUDENT WHERE STUDENT_ID =:STD_ID;
In the above code, :STD_ID is replaced by its current value when the statement is executed.
Suppose we do not know what the datatype of a host variable should be, or what the Oracle datatype of some of the columns is. In
such a case we can allow the compiler to fetch the datatype of the column and assign it to the host variable. It is done using the ‘BASED
ON’ clause. But the format of the declaration will be in the host language.
EXEC SQL BEGIN DECLARE SECTION;
BASED ON STUDENT.STD_ID sid;
BASED ON STUDENT.STD_NAME sname;
BASED ON STUDENT.ADDRESS saddress;
EXEC SQL END DECLARE SECTION;
• Indicator Variable : These variables are also host variables but are of 2 byte short type always. These variables are used to
capture the NULL values that a query returns or to INSERT/ UPDATE any NULL values to the tables. When it is used in a SELECT
query, it captures any NULL value returned for any column. When used along with INSERT or UPDATE, it sets the column value as
NULL, even though the host variable has value. If we have to capture the NULL values for each host variable in the code, then we
have to declare indicator variables to each of the host variables. These indicator variables are placed immediately after the host
variable in a query or separated by INDICATOR between host and indicator variable.
EXEC SQL SELECT STD_NAME INTO :SNAME :IND_SNAME
FROM STUDENT WHERE STUDENT_ID =:STD_ID;
Or
EXEC SQL SELECT STD_NAME INTO :SNAME INDICATOR :IND_SNAME
FROM STUDENT WHERE STUDENT_ID =:STD_ID;
EXEC SQL INSERT INTO STUDENT (STD_ID, STD_NAME)
VALUES (:SID, :SNAME INDICATOR :IND_SNAME); --Sets NULL to STD_NAME
EXEC SQL UPDATE STUDENT
SET ADDRESS = :STD_ADDR :IND_SADDR; --Sets NULL to ADDRESS
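Though an indicator variable sets/gets NULL values for a column, it actually passes/gets integer values. When a SELECT query is
executed in Pro*C, the indicator variable gets one of the following four kinds of integer value:
• 0 – the column value is not NULL and was stored in the host variable intact.
• -1 – the column value is NULL; the contents of the host variable are undefined.
• -2 – the value was truncated and its original length could not be determined.
• > 0 – the value was truncated; the indicator holds the original length.
When an INSERT / UPDATE statement is executed along with an indicator variable, it can pass two kinds of value: -1 to assign NULL to the
column (ignoring the host variable), or a value >= 0 to assign the value of the host variable.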
Execution Section
This is the execution section, and it contains all the SQL queries and statements prefixed by ‘EXEC SQL’.
EXEC SQL SELECT * FROM STUDENT WHERE STUDENT_ID =:STD_ID;
EXEC SQL SELECT STD_NAME INTO :SNAME :IND_SNAME
FROM STUDENT WHERE STUDENT_ID =:STD_ID;
EXEC SQL INSERT INTO STUDENT (STD_ID, STD_NAME)
VALUES (:SID, :SNAME);
EXEC SQL UPDATE STUDENT
SET ADDRESS = :STD_ADDR
WHERE STD_ID = :SID;
Above examples show simple SQL queries/statements. But we can have complex queries too.
In this embedded SQL, all the queries depend on the values of host variables, and the queries are static. That means, in the above example of the SELECT query, it
pulls student details for the student ID entered. But suppose the user enters a student name instead of a student ID. Then these SQL statements are not flexible enough to modify the
query to fetch details based on name. Similarly, suppose the query should be based on the name and address of a student. The code will not modify the query to fetch details based on name
and address. That means the queries are static and cannot be modified based on user input. Hence this kind of SQL is known as static SQL.
Error Handling
Like any other programming language, embedded SQL also needs error handling. The error handling method depends on
the host language. Here we are using C and we use the labeling method, i.e. when an error occurs we stop the current flow
of execution and jump to the error handling section of the code to continue. In order to handle errors, C programs
require a separate error handling structure which holds different variables to capture different sets of errors. This structure is called the
SQL Communication Area or SQLCA. Below is the structure of SQLCA.
struct sqlca {
    /* ub1 */ char sqlcaid[8];
    /* b4 */ long sqlabc;
    /* b4 */ long sqlcode; // returns the error code
    struct {
        /* ub2 */ unsigned short sqlerrml;
        /* ub1 */ char sqlerrmc[70];
    } sqlerrm;
    ….
    char sqlstate[6]; // returns predefined error statements
    ….
};
If we have to use this error handling structure, then we have to include the sqlca.h header file in the program, using an #include directive. In this structure, mainly
SQLCODE and SQLSTATE are used to see the type of error. SQLCODE returns different values for different types of errors.
Whenever an error occurs in the code, we have to redirect the execution to handle the error rather than executing the remaining statements.
This is done using the WHENEVER statement.
EXEC SQL WHENEVER condition action;
The condition in WHENEVER clause can be
• SQLWARNING – indicates SQL warning. It indicates the compiler that when SQL warning occurs perform action.
• SQLERROR – indicates SQL Error. The SQLCODE will have negative value.
• NOT FOUND – SQLCODE will have positive value indicating no records are fetched.
On receiving error or warning, action can be any one of the following:
• CONTINUE – indicates to continue with the normal execution of the code.
• DO – it calls a function and hence program will move to execute this error handling function.
• GOTO <label> – Program will jump to the location <label> to execute error handling.
• STOP – it immediately stops the execution of the program by calling exit (0) and all the incomplete transactions will be
rolled back.
EXEC SQL WHENEVER SQLWARNING DO display_warning();
EXEC SQL WHENEVER SQLERROR STOP;
EXEC SQL WHENEVER NOT FOUND GOTO lbl_no_records;
Whenever we use the ‘WHENEVER’ clause, the first statement should be ‘EXEC SQL INCLUDE SQLCA;’ in the code. This is to indicate
that error handling needs to be done for the following code.
Consider a simple Pro*C program to illustrate embedded SQL. The program below accepts a student name from the user and queries the
DB for that student's ID.
#include <stdio.h>
#include <stdlib.h>
#include <sqlca.h>
int main(){
    EXEC SQL INCLUDE SQLCA;
    EXEC SQL BEGIN DECLARE SECTION;
    BASED ON STUDENT.STD_ID SID; // host variable to store the value returned by query
    char STD_NAME[15]; // host variable to pass the value to the query
    short ind_sid; // indicator variable
    EXEC SQL END DECLARE SECTION;
    // Error handling: every WHENEVER statement starts with EXEC SQL
    EXEC SQL WHENEVER NOT FOUND GOTO error_msg1;
    EXEC SQL WHENEVER SQLERROR GOTO error_msg2;
    printf("Enter the Student name:");
    scanf("%14s", STD_NAME); // read at most 14 characters into the 15-byte buffer
    // Executes the query; host and indicator variables are prefixed with ':'
    EXEC SQL SELECT STD_ID INTO :SID INDICATOR :ind_sid FROM STUDENT WHERE STD_NAME = :STD_NAME;
    printf("STUDENT ID:%d\n", SID); // prints the result from DB
    exit(0);
    // Error handling labels
error_msg1:
    printf("Student %s is not found\n", STD_NAME);
    printf("ERROR:%ld\n", sqlca.sqlcode); // sqlca is a struct, not a pointer
    printf("ERROR State:%s\n", sqlca.sqlstate);
    exit(1);
error_msg2:
    printf("Error has occurred!\n");
    printf("ERROR:%ld\n", sqlca.sqlcode);
    printf("ERROR State:%s\n", sqlca.sqlstate);
    exit(1);
}

Pasted from <https://tutorialcup.com/dbms/embedded-sql.htm>
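For readers without a Pro*C precompiler, the same ideas can be tried in Python with the built-in sqlite3 module. This is an analogy rather than Pro*C: a named parameter plays the role of the host variable, a missing row stands in for the NOT FOUND condition, a raised sqlite3.Error stands in for SQLERROR, and None takes the place of the indicator variable for NULL. The function name find_student_id and the sample rows are assumptions made for illustration.

```python
import sqlite3


def find_student_id(conn, std_name):
    """Return (found, std_id); std_id is None when the column is NULL."""
    try:
        row = conn.execute(
            # :std_name is bound at execution time, like a host variable.
            "SELECT STD_ID FROM STUDENT WHERE STD_NAME = :std_name",
            {"std_name": std_name},
        ).fetchone()
    except sqlite3.Error as exc:  # comparable to WHENEVER SQLERROR
        raise RuntimeError(f"database error: {exc}") from exc
    if row is None:  # comparable to WHENEVER NOT FOUND
        return (False, None)
    return (True, row[0])  # a NULL column arrives as None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (STD_ID INTEGER, STD_NAME TEXT, ADDRESS TEXT)")
# Hypothetical sample rows; the second has a NULL STD_ID.
conn.execute("INSERT INTO STUDENT VALUES (100, 'Joseph', 'Chennai')")
conn.execute("INSERT INTO STUDENT VALUES (NULL, 'Allen', 'Madurai')")

print(find_student_id(conn, "Joseph"))
print(find_student_id(conn, "Allen"))
print(find_student_id(conn, "Nobody"))
```

As in the static SQL discussion above, the query text is fixed; only the bound value changes from call to call.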



View
19 February 2023
16:37

Views in SQL are a kind of virtual table. A view also has rows and columns, as in a real table in the database. We can create a
view by selecting fields from one or more tables present in the database. A View can either have all the rows of a table or specific
rows based on a certain condition. These notes cover creating, deleting and updating Views. Sample Tables:
StudentDetails

StudentMarks

CREATING VIEWS
We can create View using CREATE VIEW statement. A View can be created from a single table or multiple tables. Syntax:
CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE condition;
view_name: Name for the View
table_name: Name of the table
condition: Condition to select rows
Examples:
• Creating View from a single table:
• In this example we will create a View named DetailsView from the table StudentDetails. Query:
CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS
FROM StudentDetails
WHERE S_ID < 5;
• To see the data in the View, we can query the view in the same manner as we query a table.
SELECT * FROM DetailsView;
• Output:
• In this example, we will create a view named StudentNames from the table StudentDetails. Query:
CREATE VIEW StudentNames AS
SELECT S_ID, NAME
FROM StudentDetails

ORDER BY NAME;
• If we now query the view as,
SELECT * FROM StudentNames;
• Output:
• Creating View from multiple tables: In this example we will create a View named MarksView from two tables, StudentDetails and
StudentMarks. To create a View from multiple tables we can simply include multiple tables in the SELECT statement. Query:
CREATE VIEW MarksView AS
SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS
FROM StudentDetails, StudentMarks
WHERE StudentDetails.NAME = StudentMarks.NAME;
• To display data of View MarksView:
SELECT * FROM MarksView;
• Output:
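The sample tables and query outputs are not reproduced in these notes, so the view definitions above can be tried out with a small Python sketch using the built-in sqlite3 module. The rows inserted here, and the ID column of StudentMarks, are assumptions made up for illustration; only the view definitions come from the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical sample data for the two base tables.
    CREATE TABLE StudentDetails (S_ID INTEGER PRIMARY KEY, NAME TEXT, ADDRESS TEXT);
    CREATE TABLE StudentMarks (ID INTEGER PRIMARY KEY, NAME TEXT, MARKS INTEGER, AGE INTEGER);
    INSERT INTO StudentDetails VALUES (1, 'Harsh', 'Kolkata'), (2, 'Ashish', 'Durgapur'), (6, 'Ram', 'Delhi');
    INSERT INTO StudentMarks VALUES (1, 'Harsh', 90, 19), (2, 'Ashish', 80, 20);

    -- View over a single table.
    CREATE VIEW DetailsView AS
        SELECT NAME, ADDRESS FROM StudentDetails WHERE S_ID < 5;

    -- View over two tables.
    CREATE VIEW MarksView AS
        SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS
        FROM StudentDetails, StudentMarks
        WHERE StudentDetails.NAME = StudentMarks.NAME;
    """
)

details = conn.execute("SELECT * FROM DetailsView ORDER BY NAME").fetchall()
marks = conn.execute("SELECT * FROM MarksView ORDER BY NAME").fetchall()
print(details)
print(marks)

# A view stores no data of its own: a new base-table row shows up immediately.
conn.execute("INSERT INTO StudentDetails VALUES (3, 'Dhanraj', 'Bihar')")
count_after = conn.execute("SELECT COUNT(*) FROM DetailsView").fetchone()[0]
print(count_after)
```

The row with S_ID 6 never appears in DetailsView because it fails the WHERE condition of the view.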
DELETING VIEWS
We have learned about creating a View, but what if a created View is not needed any more? We will want to delete it.
SQL allows us to delete an existing View. We can delete or drop a View using the DROP statement. Syntax:
DROP VIEW view_name;
view_name: Name of the View which we want to delete.
For example, if we want to delete the View MarksView, we can do this as:
DROP VIEW MarksView;
UPDATING VIEWS
Certain conditions need to be satisfied to update a view. If any one of these conditions is not met, then we will not be
allowed to update the view.
1. The SELECT statement which is used to create the view should not include a GROUP BY clause or ORDER BY clause.
2. The SELECT statement should not have the DISTINCT keyword.
3. The view should include all NOT NULL columns of the base table.
4. The view should not be created using nested queries or complex queries.
5. The view should be created from a single table. If the view is created using multiple tables then we will not be allowed to update
the view.
• We can use the CREATE OR REPLACE VIEW statement to add or remove fields from a view. Syntax:
CREATE OR REPLACE VIEW view_name AS
SELECT column1,column2,..
FROM table_name
WHERE condition;
• For example, if we want to update the view MarksView and add the field AGE to this View from StudentMarks Table, we can do
this as:
CREATE OR REPLACE VIEW MarksView AS
SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS, StudentMarks.AGE
FROM StudentDetails, StudentMarks
WHERE StudentDetails.NAME = StudentMarks.NAME;
• If we fetch all the data from MarksView now as:
SELECT * FROM MarksView;
• Output:
• Inserting a row in a view: We can insert a row in a View in the same way as we do in a table. We can use the INSERT INTO
statement of SQL to insert a row in a View. Syntax:
INSERT INTO view_name(column1, column2 , column3,..)
VALUES(value1, value2, value3..);
view_name: Name of the View
• Example: In the below example we will insert a new row in the View DetailsView which we have created above in the example of
“creating views from a single table”.
INSERT INTO DetailsView(NAME, ADDRESS)
VALUES('Suresh','Gurgaon');
• If we fetch all the data from DetailsView now as,
SELECT * FROM DetailsView;
• Output:
• Deleting a row from a View: Deleting rows from a view is as simple as deleting rows from a table. We can use the DELETE
statement of SQL to delete rows from a view. Deleting a row from a view first deletes the row from the actual table, and the
change is then reflected in the view. Syntax:
DELETE FROM view_name
WHERE condition;
view_name:Name of view from where we want to delete rows
condition: Condition to select rows

• Example: In this example we will delete the row from the view DetailsView which we just added in the above example of
inserting rows.
DELETE FROM DetailsView
WHERE NAME='Suresh';
• If we fetch all the data from DetailsView now as,
SELECT * FROM DetailsView;
• Output:
WITH CHECK OPTION
The WITH CHECK OPTION clause in SQL is a very useful clause for views. It is applicable to an updatable view. If the view is not
updatable, then there is no point in including this clause in the CREATE VIEW statement.
• The WITH CHECK OPTION clause is used to prevent the insertion of rows in the view where the condition in the WHERE clause of the
CREATE VIEW statement is not satisfied.
• If we have used the WITH CHECK OPTION clause in the CREATE VIEW statement, and an UPDATE or INSERT statement does not
satisfy the conditions, then it will return an error.
Example: In the below example we are creating a View SampleView from StudentDetails Table with WITH CHECK OPTION clause.
CREATE VIEW SampleView AS
SELECT S_ID, NAME
FROM StudentDetails
WHERE NAME IS NOT NULL
WITH CHECK OPTION;
In this View, if we now try to insert a new row with a null value in the NAME column, it will give an error because the view is
created with the condition that the NAME column is NOT NULL. For example, even though the View is updatable, the query below
is not valid for this View:
INSERT INTO SampleView(S_ID)
VALUES(6);
NOTE: The default value of NAME column is null.
Uses of a View: A good database should contain views for the following
reasons:
1. Restricting data access – Views provide an additional level of table security by restricting access to a predetermined set of rows
and columns of a table.
2. Hiding data complexity – A view can hide the complexity that exists in a multiple table join.
3. Simplify commands for the user – Views allow the user to select information from multiple tables without requiring the users
to actually know how to perform a join.
4. Store complex queries – Views can be used to store complex queries.
5. Rename Columns – Views can also be used to rename columns without affecting the base tables, provided the number of
columns in the view matches the number of columns specified in the select statement. Renaming thus helps to hide the names of
the columns of the base tables.
6. Multiple view facility – Different views can be created on the same table for different users.

Pasted from <https://www.geeksforgeeks.org/sql-views/>



ER Diagram
19 February 2023
15:45

ER (Entity Relationship) Diagram in DBMS


• ER model stands for Entity-Relationship model. It is a high-level data model. This model is used to define the data elements and
relationships for a specified system.
• It develops a conceptual design for the database. It also develops a very simple and easy-to-design view of data.
• In ER modeling, the database structure is portrayed as a diagram called an entity-relationship diagram.
For example, suppose we design a school database. In this database, the student will be an entity with attributes like address, name, id, age,
etc. The address can be another entity with attributes like city, street name, pin code, etc., and there will be a relationship between them.

Components of ER Diagram

1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is represented as a rectangle.
Consider an organization as an example: manager, product, employee, department etc. can be taken as entities.

a. Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't contain any key attribute of its own. The weak entity is
represented by a double rectangle.

2. Attribute
The attribute is used to describe the property of an entity. An ellipse is used to represent an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.



a. Key Attribute
The key attribute is used to represent the main characteristics of an entity. It represents a primary key. The key attribute is represented by an
ellipse with the text underlined.

b. Composite Attribute
An attribute that is composed of many other attributes is known as a composite attribute. The composite attribute is represented by an ellipse,
and its component attributes are shown as ellipses connected to it.

c. Multivalued Attribute
An attribute that can have more than one value is known as a multivalued attribute. A double oval is used to represent a
multivalued attribute.
For example, a student can have more than one phone number.

d. Derived Attribute
An attribute that can be derived from other attributes is known as a derived attribute. It is represented by a dashed ellipse.
For example, a person's age changes over time and can be derived from another attribute like date of birth.



3. Relationship
A relationship is used to describe the relation between entities. Diamond or rhombus is used to represent the relationship.

Types of relationship are as follows:


a. One-to-One Relationship
When only one instance of each entity is associated with the relationship, it is known as a one-to-one relationship.
For example, a female can marry one male, and a male can marry one female.

b. One-to-many relationship
When only one instance of the entity on the left and more than one instance of the entity on the right are associated with the relationship, it is known as a one-to-many relationship.
For example, a scientist can invent many inventions, but each invention is made by one specific scientist.

c. Many-to-one relationship
When more than one instance of the entity on the left and only one instance of the entity on the right are associated with the relationship, it is known as a many-to-one relationship.
For example, a student enrolls in only one course, but a course can have many students.

d. Many-to-many relationship
When more than one instance of the entity on the left and more than one instance of the entity on the right are associated with the relationship, it is known as a many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can have many employees.

Join Operators
19 February 2023
15:37

SQL | Join (Inner, Left, Right and Full Joins)


SQL Join statement is used to combine data or rows from two or more tables based on a common field between them. Different types of
Joins are as follows:
• INNER JOIN
• LEFT JOIN
• RIGHT JOIN
• FULL JOIN
Consider the two tables below:
Student

StudentCourse

The simplest Join is INNER JOIN.

A. INNER JOIN

The INNER JOIN keyword selects rows from both tables as long as the join condition is satisfied. It creates the result set by combining the rows from both tables for which the condition holds, i.e., the value of the common field is the same.
Syntax:
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;
table1: First table.
table2: Second table
matching_column: Column common to both the tables.
Note: We can also write JOIN instead of INNER JOIN. JOIN is same as INNER JOIN.

Example Queries(INNER JOIN)
This query will show the names and age of students enrolled in different courses.
SELECT StudentCourse.COURSE_ID, Student.NAME, Student.AGE FROM Student
INNER JOIN StudentCourse
ON Student.ROLL_NO = StudentCourse.ROLL_NO;
Output:

B. LEFT JOIN

This join returns all the rows of the table on the left side of the join and the matching rows of the table on the right side of the join. For the rows for which there is no matching row on the right side, the result set will contain NULL. LEFT JOIN is also known as LEFT OUTER JOIN.
Syntax:
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;
table1: First table.
table2: Second table
matching_column: Column common to both the tables.
Note: We can also use LEFT OUTER JOIN instead of LEFT JOIN, both are the same.

Example Queries(LEFT JOIN):
SELECT Student.NAME,StudentCourse.COURSE_ID
FROM Student
LEFT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
Output:

C. RIGHT JOIN

RIGHT JOIN is similar to LEFT JOIN. This join returns all the rows of the table on the right side of the join and the matching rows of the table on the left side of the join. For the rows for which there is no matching row on the left side, the result set will contain NULL. RIGHT JOIN is also known as RIGHT OUTER JOIN.
Syntax:
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;
table1: First table.
table2: Second table
matching_column: Column common to both the tables.
Note: We can also use RIGHT OUTER JOIN instead of RIGHT JOIN, both are the same.

Example Queries(RIGHT JOIN):
SELECT Student.NAME,StudentCourse.COURSE_ID
FROM Student
RIGHT JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
Output:

D. FULL JOIN

FULL JOIN creates the result set by combining the results of both LEFT JOIN and RIGHT JOIN. The result set will contain all the rows from both tables. For the rows for which there is no match, the result set will contain NULL values.

Syntax:
SELECT table1.column1,table1.column2,table2.column1,....
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;
table1: First table.
table2: Second table
matching_column: Column common to both the tables.
Example Queries(FULL JOIN):
SELECT Student.NAME,StudentCourse.COURSE_ID
FROM Student
FULL JOIN StudentCourse
ON StudentCourse.ROLL_NO = Student.ROLL_NO;
Output:

NAME COURSE_ID
HARSH 1
PRATIK 2
RIYANKA 2
DEEP 3
SAPTARHI 1
DHANRAJ NULL
ROHIT NULL
NIRAJ NULL
NULL 4
NULL 5
NULL 4
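The Student and StudentCourse tables were shown as images in the original notes, so the sketch below assumes hypothetical roll numbers chosen to reproduce the FULL JOIN output above (the AGE column is omitted). It uses Python's built-in sqlite3 module. Because SQLite versions before 3.39 do not support RIGHT JOIN or FULL JOIN, the right join is emulated here by swapping the tables in a LEFT JOIN, and the full join by appending the unmatched right-hand rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (ROLL_NO INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE StudentCourse (COURSE_ID INTEGER, ROLL_NO INTEGER);
-- Assumed sample data; roll numbers 9, 10, 11 have no matching student.
INSERT INTO Student VALUES
  (1,'HARSH'),(2,'PRATIK'),(3,'RIYANKA'),(4,'DEEP'),
  (5,'SAPTARHI'),(6,'DHANRAJ'),(7,'ROHIT'),(8,'NIRAJ');
INSERT INTO StudentCourse VALUES
  (1,1),(2,2),(2,3),(3,4),(1,5),(4,9),(5,10),(4,11);
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# INNER JOIN: only students that have a matching course row.
inner_rows = run("""
    SELECT Student.NAME, StudentCourse.COURSE_ID
    FROM Student
    INNER JOIN StudentCourse ON Student.ROLL_NO = StudentCourse.ROLL_NO
    ORDER BY Student.ROLL_NO""")

# LEFT JOIN: every student; COURSE_ID is NULL (None) when there is no match.
left_rows = run("""
    SELECT Student.NAME, StudentCourse.COURSE_ID
    FROM Student
    LEFT JOIN StudentCourse ON Student.ROLL_NO = StudentCourse.ROLL_NO
    ORDER BY Student.ROLL_NO""")

# RIGHT JOIN emulated by swapping the tables in a LEFT JOIN.
right_rows = run("""
    SELECT Student.NAME, StudentCourse.COURSE_ID
    FROM StudentCourse
    LEFT JOIN Student ON Student.ROLL_NO = StudentCourse.ROLL_NO
    ORDER BY StudentCourse.ROLL_NO""")

# FULL JOIN emulated: all left-join rows plus the unmatched right-hand rows.
full_rows = left_rows + [row for row in right_rows if row[0] is None]

for name, course in full_rows:
    print(name, course)
```

With this assumed data, the inner join yields 5 rows, the left join 8, the right join 8, and the full join 11, which matches the shape of the output table above.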

RAID
19 February 2023
15:21

RAID
RAID stands for redundant array of independent disks. It is a technology used to connect multiple secondary storage devices for increased performance, data redundancy, or both. It lets you survive one or more drive failures, depending on the RAID level used.
It consists of an array of disks in which multiple disks are connected to achieve different goals.

RAID technology
There are seven standard RAID levels: RAID 0, RAID 1, ..., RAID 6.
These levels have the following characteristics:
• A RAID array consists of a set of physical disk drives.
• The operating system views these separate disks as a single logical disk.
• Data is distributed across the physical drives of the array.
• Redundant disk capacity is used to store parity information.
• In case of disk failure, the parity information can be used to recover the data.
Standard RAID levels
RAID 0
• RAID level 0 provides data striping, i.e., data is placed across multiple disks. Because it is based on striping alone, if one disk fails, all data in the array is lost.
• This level doesn't provide fault tolerance but increases system performance.
Example:
Disk 0 Disk 1 Disk 2 Disk 3
20 21 22 23
24 25 26 27
28 29 30 31
32 33 34 35
In this figure, blocks 20, 21, 22, 23 form a stripe.
In this level, instead of placing just one block on a disk at a time, we can place two or more blocks on a disk before moving on to the next one.
Disk 0 Disk 1 Disk 2 Disk 3
20 22 24 26
21 23 25 27
28 30 32 34
29 31 33 35
In the above figure, there is no duplication of data. Hence, a block once lost cannot be recovered.
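The two striping layouts above can be reproduced with a small address calculation. This is a minimal sketch: the function name locate_block and its parameters are hypothetical, and first_block is assumed only because the figures start numbering at block 20.

```python
def locate_block(block, num_disks, blocks_per_unit=1, first_block=0):
    """Return (disk, row) for a logical block in a RAID 0 array.

    blocks_per_unit is the number of consecutive blocks placed on one disk
    before moving on to the next disk.
    """
    rel = block - first_block                 # position relative to the first block
    unit = rel // blocks_per_unit             # stripe unit that holds the block
    disk = unit % num_disks                   # units are dealt to disks round-robin
    pass_number = unit // num_disks           # how many full passes over the disks
    row = pass_number * blocks_per_unit + rel % blocks_per_unit
    return disk, row

# First figure: one block per disk at a time.
print(locate_block(27, num_disks=4, first_block=20))                      # (3, 1)
# Second figure: two blocks per disk before moving on.
print(locate_block(29, num_disks=4, blocks_per_unit=2, first_block=20))   # (0, 3)
```

Consecutive blocks land on different disks (or on the same disk only within one stripe unit), which is why several requests can usually be served in parallel.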
Pros of RAID 0:
• In this level, throughput is increased because multiple data requests are probably not for the same disk.
• This level fully utilizes the disk space and provides high performance.
• It requires a minimum of 2 drives.
Cons of RAID 0:
• It doesn't contain any error detection mechanism.
• RAID 0 is not a true RAID because it is not fault-tolerant.
• In this level, failure of any one disk results in complete data loss in the respective array.
RAID 1
This level is called mirroring of data as it copies the data from drive 1 to drive 2. It provides 100% redundancy in case of a failure.
Example:
Disk 0 Disk 1 Disk 2 Disk 3
A A B B
C C D D
E E F F
G G H H
Only half of the drive space is used to store data. The other half is just a mirror of the data already stored.
Pros of RAID 1:
• The main advantage of RAID 1 is fault tolerance. In this level, if one disk fails, then the other automatically takes over.
• In this level, the array will function even if any one of the drives fails.
Cons of RAID 1:
• In this level, one extra drive is required per drive for mirroring, so the expense is higher.
RAID 2
• RAID 2 consists of bit-level striping using Hamming code parity. In this level, each data bit in a word is recorded on a separate disk, and the ECC code of the data words is stored on a different set of disks.
• Due to its high cost and complex structure, this level is not used commercially. The same performance can be achieved by RAID 3 at a lower cost.
Pros of RAID 2:
• This level uses one designated drive to store parity.
• It uses the hamming code for error detection.
Cons of RAID 2:
• It requires an additional drive for error detection.
RAID 3
• RAID 3 consists of byte-level striping with dedicated parity. In this level, parity information is computed for each disk section and written to a dedicated parity drive.
• In case of drive failure, the parity drive is accessed, and data is reconstructed from the remaining devices. Once the failed drive is
replaced, the missing data can be restored on the new drive.
• In this level, data can be transferred in bulk. Thus high-speed data transmission is possible.
Disk 0 Disk 1 Disk 2 Disk 3
A B C P(A, B, C)
D E F P(D, E, F)
G H I P(G, H, I)
J K L P(J, K, L)
Pros of RAID 3:
• In this level, data is regenerated using parity drive.
• It contains high data transfer rates.
• In this level, data is accessed in parallel.
Cons of RAID 3:
• It requires an additional drive for parity.
• It performs slowly when operating on small files.
RAID 4
• RAID 4 consists of block-level striping with a parity disk. Instead of duplicating data, RAID 4 adopts a parity-based approach.
• This level allows recovery from at most 1 disk failure because of the way parity works. If more than one disk fails, there is no way to recover the data.
• Levels 3 and 4 both require at least three disks.
Disk 0 Disk 1 Disk 2 Disk 3
A B C P0
D E F P1
G H I P2
J K L P3
In this figure, we can observe one disk dedicated to parity.
In this level, parity can be calculated using an XOR function. If the data bits are 0,1,0,0, then the parity bit is XOR(0,1,0,0) = 1. If the data bits are 0,0,1,1, then the parity bit is XOR(0,0,1,1) = 0. That means an even number of ones results in parity 0 and an odd number of ones results in parity 1.
C1 C2 C3 C4 Parity
0 1 0 0 1
0 0 1 1 0
Suppose that in the above figure, C2 is lost due to a disk failure. Then, using the values of all the other columns and the parity bit, we can recompute the data bit stored in C2. This level allows us to recover lost data.
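The parity calculation and the recovery of C2 can be sketched as follows. The helper names parity and recover are hypothetical; the bit values are the two rows of the table above.

```python
from functools import reduce
from operator import xor

def parity(bits):
    """XOR of all data bits: 0 for an even number of ones, 1 for an odd number."""
    return reduce(xor, bits, 0)

def recover(surviving_bits, parity_bit):
    """Recompute the single lost bit from the surviving bits and the parity bit."""
    return reduce(xor, surviving_bits, parity_bit)

rows = [[0, 1, 0, 0], [0, 0, 1, 1]]           # C1..C4 from the table
for c1, c2, c3, c4 in rows:
    p = parity([c1, c2, c3, c4])
    rebuilt_c2 = recover([c1, c3, c4], p)     # pretend the disk holding C2 failed
    print(p, rebuilt_c2 == c2)
```

Recovery works because XOR-ing a value twice cancels it out, so XOR-ing the parity with every surviving bit leaves exactly the missing bit. The same idea fails when two bits are missing, which is why a single parity disk tolerates only one failure.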
RAID 5
• RAID 5 is a slight modification of the RAID 4 system. The only difference is that in RAID 5, the parity rotates among the drives.
• It consists of block-level striping with DISTRIBUTED parity.
• Same as RAID 4, this level allows recovery of at most 1 disk failure. If more than one disk fails, then there is no way for data recovery.
Disk 0 Disk 1 Disk 2 Disk 3 Disk 4
0 1 2 3 P0
5 6 7 P1 4
10 11 P2 8 9
15 P3 12 13 14
P4 16 17 18 19
This figure shows how the parity block rotates among the disks.
This level was introduced to make the random write performance better.
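The rotation in the figure can be generated programmatically. This sketch assumes the placement rule that can be read off the figure: the parity block moves one disk to the left on each stripe, and the data blocks follow it in order, wrapping around. The function name raid5_layout is hypothetical, and real controllers may use other rotation schemes.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one list per stripe with the block label stored on each disk."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        # Parity starts on the last disk and moves one disk left per stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = [None] * num_disks
        row[parity_disk] = f"P{stripe}"
        # Data blocks fill the disks after the parity disk, wrapping around.
        for i in range(1, num_disks):
            row[(parity_disk + i) % num_disks] = str(block)
            block += 1
        rows.append(row)
    return rows

for row in raid5_layout(num_disks=5, num_stripes=5):
    print(" ".join(f"{label:>3}" for label in row))
```

Because each stripe keeps its parity on a different disk, parity updates are spread over all drives instead of queuing on one dedicated parity disk as in RAID 4.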
Pros of RAID 5:
• This level is cost effective and provides high performance.
• In this level, parity is distributed across the disks in an array.
• It is used to make the random write performance better.
Cons of RAID 5:
• In this level, recovery from a disk failure takes longer because the lost data has to be recalculated from all the remaining drives.
• This level cannot survive concurrent failures of two or more drives.
RAID 6
• This level is an extension of RAID 5. It uses block-level striping with two parity blocks per stripe.
• In RAID 6, you can survive 2 concurrent disk failures. With RAID 5 or RAID 1, when a disk fails you need to replace the failed disk quickly, because if another disk fails at the same time you won't be able to recover any of the data. This is where RAID 6 plays its part: you can survive two concurrent disk failures before you run out of options.
Disk 1 Disk 2 Disk 3 Disk 4
A0 B0 Q0 P0
A1 Q1 P1 D1
Q2 P2 C2 D2
P3 B3 C3 Q3
Pros of RAID 6:
• The array keeps working even if any two drives fail at the same time.
• As in RAID 5, data and parity are striped across all drives, so reads can be served by several drives in parallel.
Cons of RAID 6:
• The capacity of two drives is used for parity, and at least four drives are required.
• Writes are slower than in RAID 5 because two parity blocks must be updated for every write.

Pasted from <https://www.javatpoint.com/dbms-raid>

Integrity constraints
19 February 2023
15:10

Integrity Constraints
• Integrity constraints are a set of rules used to maintain the quality of information.
• Integrity constraints ensure that the data insertion, updating, and other processes have to be performed in such a way that data
integrity is not affected.
• Thus, integrity constraint is used to guard against accidental damage to the database.
Types of Integrity Constraint

1. Domain constraints
• A domain constraint defines the valid set of values for an attribute.
• The data type of a domain can be string, character, integer, time, date, currency, etc. The value of the attribute must belong to the corresponding domain.
Example:

2. Entity integrity constraints


• The entity integrity constraint states that a primary key value can't be null.
• This is because the primary key value is used to identify individual rows in a relation, and if the primary key has a null value, we can't identify those rows.
• A table can contain null values in fields other than the primary key.
Example:

3. Referential Integrity Constraints


• A referential integrity constraint is specified between two tables.
• In a referential integrity constraint, if a foreign key in Table 1 refers to the primary key of Table 2, then every value of the foreign key in Table 1 must be null or be available in Table 2.
Example:

4. Key constraints
• A key is a set of attributes used to identify an entity uniquely within its entity set.
• An entity set can have multiple keys, out of which one is chosen as the primary key. A primary key value must be unique and cannot be null in the relational table.
Example:
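The four constraint types above can be tried out with a small experiment. This is a sketch using Python's built-in sqlite3 module; the department and student tables, their columns, and the helper name violates are hypothetical and only serve the illustration. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")     # SQLite leaves this off by default
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY NOT NULL,
    name    TEXT NOT NULL
);
CREATE TABLE student (
    id      TEXT PRIMARY KEY NOT NULL,                   -- entity integrity + key
    name    TEXT NOT NULL,
    age     INTEGER CHECK (age BETWEEN 0 AND 150),       -- domain constraint
    dept_id INTEGER REFERENCES department(dept_id)       -- referential integrity
);
INSERT INTO department VALUES (11, 'CSE');
""")

def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

valid_row   = violates("INSERT INTO student VALUES ('S1', 'Asha', 20, 11)")
bad_domain  = violates("INSERT INTO student VALUES ('S2', 'Ravi', -5, 11)")
null_key    = violates("INSERT INTO student VALUES (NULL, 'Ravi', 20, 11)")
duplicate   = violates("INSERT INTO student VALUES ('S1', 'Other', 21, 11)")
bad_foreign = violates("INSERT INTO student VALUES ('S3', 'Ravi', 22, 99)")
null_fk     = violates("INSERT INTO student VALUES ('S4', 'Mina', 19, NULL)")

print(valid_row, bad_domain, null_key, duplicate, bad_foreign, null_fk)
```

Only the first and last inserts are accepted: a foreign key may be null, but a non-null foreign key value must exist in the referenced table.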
Recursive Queries
19 February 2023
15:07

A recursive query is a query that refers to a recursive CTE (common table expression). Recursive queries are helpful in many circumstances, such as querying hierarchical data like organizational structures, tracking lineage, etc.
Syntax:
WITH RECURSIVE cte_name AS (
CTE_query_definition -- non-recursive term
UNION [ALL]
CTE_query_definition -- recursive term
) SELECT * FROM cte_name;
Let's analyze the above syntax:
• The non-recursive term is a CTE query definition that forms the base result set of the CTE structure.
• The recursive term can be one or more CTE query definitions joined with the non-recursive term through the UNION or UNION ALL
operator. The recursive term references the CTE name itself.
• The recursion stops when no rows are returned from the previous iteration.
First, we create a sample table using the below commands to perform examples:
CREATE TABLE employees (
employee_id serial PRIMARY KEY,
full_name VARCHAR NOT NULL,
manager_id INT
);
Then we insert data into our employee table as follows:
INSERT INTO employees (
employee_id,
full_name,
manager_id
)
VALUES
(1, 'M.S Dhoni', NULL),
(2, 'Sachin Tendulkar', 1),
(3, 'R. Sharma', 1),
(4, 'S. Raina', 1),
(5, 'B. Kumar', 1),
(6, 'Y. Singh', 2),
(7, 'Virender Sehwag', 2),
(8, 'Ajinkya Rahane', 2),
(9, 'Shikhar Dhawan', 2),
(10, 'Mohammed Shami', 3),
(11, 'Shreyas Iyer', 3),
(12, 'Mayank Agarwal', 3),
(13, 'K. L. Rahul', 3),
(14, 'Hardik Pandya', 4),
(15, 'Dinesh Karthik', 4),
(16, 'Jasprit Bumrah', 7),
(17, 'Kuldeep Yadav', 7),
(18, 'Yuzvendra Chahal', 8),
(19, 'Rishabh Pant', 8),
(20, 'Sanju Samson', 8);
Now that the table is ready we can look into some examples.
Example 1:
The below query returns all subordinates of the manager with the id 3.
WITH RECURSIVE subordinates AS (
SELECT
employee_id,
manager_id,
full_name
FROM
employees
WHERE
employee_id = 3
UNION
SELECT
e.employee_id,
e.manager_id,
e.full_name
FROM
employees e
INNER JOIN subordinates s ON s.employee_id = e.manager_id
) SELECT
*
FROM
subordinates;
Output:

Example 2:
The below query returns all subordinates of the manager with the id 4.
WITH RECURSIVE subordinates AS (
SELECT
employee_id,
manager_id,
full_name
FROM
employees
WHERE
employee_id = 4
UNION
SELECT
e.employee_id,
e.manager_id,
e.full_name
FROM
employees e
INNER JOIN subordinates s ON s.employee_id = e.manager_id
) SELECT
*
FROM
subordinates;
Output:
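The output screenshots did not survive in these notes, so the sketch below re-runs both examples with Python's built-in sqlite3 module, which also supports WITH RECURSIVE. Two things are assumptions: INTEGER PRIMARY KEY replaces the PostgreSQL serial type, and the helper name subordinates_of is hypothetical. The query is otherwise the one shown above, with the manager id passed as a parameter and an ORDER BY added to make the result order deterministic.

```python
import sqlite3

EMPLOYEES = [
    (1, "M.S Dhoni", None), (2, "Sachin Tendulkar", 1), (3, "R. Sharma", 1),
    (4, "S. Raina", 1), (5, "B. Kumar", 1), (6, "Y. Singh", 2),
    (7, "Virender Sehwag", 2), (8, "Ajinkya Rahane", 2), (9, "Shikhar Dhawan", 2),
    (10, "Mohammed Shami", 3), (11, "Shreyas Iyer", 3), (12, "Mayank Agarwal", 3),
    (13, "K. L. Rahul", 3), (14, "Hardik Pandya", 4), (15, "Dinesh Karthik", 4),
    (16, "Jasprit Bumrah", 7), (17, "Kuldeep Yadav", 7), (18, "Yuzvendra Chahal", 8),
    (19, "Rishabh Pant", 8), (20, "Sanju Samson", 8),
]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    full_name   TEXT NOT NULL,
    manager_id  INTEGER)""")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", EMPLOYEES)

QUERY = """
WITH RECURSIVE subordinates AS (
    -- non-recursive term: the manager we start from
    SELECT employee_id, manager_id, full_name
    FROM employees
    WHERE employee_id = ?
    UNION
    -- recursive term: everyone who reports to a row found so far
    SELECT e.employee_id, e.manager_id, e.full_name
    FROM employees e
    INNER JOIN subordinates s ON s.employee_id = e.manager_id
)
SELECT employee_id, full_name FROM subordinates ORDER BY employee_id
"""

def subordinates_of(manager_id):
    """Return (employee_id, full_name) for the manager and all reports below them."""
    return conn.execute(QUERY, (manager_id,)).fetchall()

for manager in (3, 4):
    print(manager, subordinates_of(manager))
```

The result includes the starting manager as well as the subordinates, because the non-recursive term selects the manager's own row. For manager 2 the recursion runs two levels deep, since employees 7 and 8 have subordinates of their own.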

Buffer Management
19 February 2023
14:45

Buffer Management

Log-record buffering :

To implement a recovery scheme that ensures data consistency while imposing minimal overhead on interactions, we need to know how buffer management must function. Log-record buffering follows these rules:

a) Transaction Ti goes into the commit state after the <Ti commit> log record has been output to the stable storage.

b) Before the <Ti commit> log record can be output to stable storage, all log records concerning transaction Ti must have been output to stable storage.

c) Before a block of data in main memory can be output to the database, all log records concerning the data in that block must have been output to stable storage.

Rule (c) is called the ‘write-ahead logging’ (WAL) rule.
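The three rules can be illustrated with a toy buffer manager. This is a simplified sketch, not how a real DBMS is implemented: the class and method names are hypothetical, "stable storage" and "disk" are plain Python containers, and the whole log buffer is flushed at once so that log records always reach stable storage in order.

```python
class BufferManager:
    def __init__(self):
        self.log_buffer = []   # log records still in main memory
        self.stable_log = []   # log records already on stable storage
        self.memory = {}       # buffered blocks: block id -> value
        self.disk = {}         # database on disk: block id -> value

    def write(self, txn, block, value):
        """Modify a block in memory and buffer the matching log record."""
        old = self.memory.get(block, self.disk.get(block))
        self.log_buffer.append(("update", txn, block, old, value))
        self.memory[block] = value

    def flush_log(self):
        """Output all buffered log records to stable storage, in order."""
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()

    def output_block(self, block):
        """Rule (c): log records for the block go to stable storage first."""
        if any(rec[0] == "update" and rec[2] == block for rec in self.log_buffer):
            self.flush_log()
        self.disk[block] = self.memory[block]

    def commit(self, txn):
        """Rules (a) and (b): earlier records, then <txn commit>, are output."""
        self.log_buffer.append(("commit", txn))
        self.flush_log()

bm = BufferManager()
bm.write("T1", "B1", 10)
bm.output_block("B1")      # forces the update record out before the block
bm.write("T1", "B2", 5)
bm.commit("T1")            # B2 itself may still be only in memory
print(bm.stable_log)
print(bm.disk)
```

After the commit, block B2 is still only in memory, yet the transaction is durable because its log records are on stable storage and the update could be redone from them.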

Database buffering :

The system stores the database in non-volatile storage and brings blocks of data into main memory as required. In this process, if a block has been modified in main memory, it must be written to disk before its buffer space can be reused for another block.

One might expect that transactions would force-output all modified blocks to disk when they commit. Such a policy is called the ‘force policy’. Before a data block B1 is output to disk, all log records concerning the data in B1 must be output to stable storage.

Static hashing and why we need dynamic hashing
19 February 2023
14:31

Static Hashing
In static hashing, the resultant data bucket address will always be the same. That means if we generate an address for EMP_ID = 103 using the hash function mod (5), it will always result in the same bucket address, 3. Here, there will be no change in the bucket address.
Hence, in static hashing, the number of data buckets in memory remains constant throughout. In this example, we will have five data buckets in memory to store the data.

Operations of Static Hashing


• Searching a record
When a record needs to be searched, then the same hash function retrieves the address of the bucket where the data is stored.
• Insert a Record
When a new record is inserted into the table, we generate an address for the new record based on the hash key, and the record is stored at that location.
• Delete a Record
To delete a record, we first fetch the record that is to be deleted. Then we delete the record at that address in memory.
• Update a Record
To update a record, we first search for it using the hash function, and then the data record is updated.
Suppose we want to insert a new record into the file, but the data bucket address generated by the hash function is not empty, i.e., data already exists at that address. This situation in static hashing is known as bucket overflow. It is a critical situation in this method.
There are various methods to overcome this situation. Some commonly used methods are as follows:
1. Open Hashing
When a hash function generates an address at which data is already stored, the next available bucket is allocated to it. This mechanism is known as Linear Probing.
For example: suppose R3 is a new record which needs to be inserted, and the hash function generates the address 112 for R3. But the generated address is already full. So the system searches for the next available data bucket, 113, and assigns R3 to it.

2. Closed Hashing
When buckets are full, a new data bucket is allocated for the same hash result and is linked after the previous one. This mechanism is known as Overflow Chaining.
For example: suppose R3 is a new record which needs to be inserted into the table, and the hash function generates the address 110 for it. But this bucket is too full to store the new data. In this case, a new bucket is inserted at the end of bucket 110 and is linked to it.
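Both overflow-handling methods can be sketched with the mod (5) hash function used earlier. The function names, the one-record slots used for linear probing, and the bucket capacity of 2 used for chaining are assumptions made for this illustration.

```python
NUM_BUCKETS = 5

def bucket_address(key):
    """Static hash function: the address never changes for a given key."""
    return key % NUM_BUCKETS

def insert_linear_probing(table, key):
    """Open hashing as described above: on overflow, use the next free bucket.

    table is a list of NUM_BUCKETS slots, where None marks an empty slot.
    Returns the address where the key was finally stored.
    """
    start = bucket_address(key)
    for step in range(NUM_BUCKETS):
        slot = (start + step) % NUM_BUCKETS
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("all buckets are full")

def insert_overflow_chaining(table, key, capacity=2):
    """Closed hashing as described above: link a new bucket to the full one.

    table is a list of NUM_BUCKETS chains; each chain is a list of buckets.
    Returns the position in the chain of the bucket that received the key.
    """
    chain = table[bucket_address(key)]
    if len(chain[-1]) == capacity:
        chain.append([])                  # allocate and link an overflow bucket
    chain[-1].append(key)
    return len(chain) - 1

probing_table = [None] * NUM_BUCKETS
for emp_id in (103, 108, 113):            # all three keys hash to bucket 3
    print(emp_id, "->", insert_linear_probing(probing_table, emp_id))

chained_table = [[[]] for _ in range(NUM_BUCKETS)]
for emp_id in (103, 108, 113):
    insert_overflow_chaining(chained_table, emp_id)
print(chained_table[3])
```

With linear probing the colliding keys spill into neighbouring buckets (3, then 4, then 0), whereas with overflow chaining they stay under address 3 at the cost of following a chain during lookups.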

What is Dynamic Hashing?


It is a hashing technique that enables users to look up a dynamic data set. That is, the data set is modified by adding data to it or removing data from it on demand, hence the name ‘Dynamic’ hashing. Thus, the number of data buckets keeps increasing or decreasing depending on the number of records.
In this hashing technique, the resulting number of data buckets in memory is ever-changing.

Operations Provided by Dynamic Hashing


Dynamic hashing provides the following operations −
• Delete − Locate the desired location and support deleting data (or a chunk of data) at that location.
• Insertion − Support inserting new data into the data bucket if there is a space available in the data bucket.
• Query − Perform querying to compute the bucket address.
• Update − Perform a query to update the data.
Advantages of Dynamic Hashing
Dynamic hashing is advantageous in the following ways −
• It works well with scalable data.
• It can address a large amount of memory in which the data size keeps changing.
• Bucket overflow occurs rarely or very late.
Disadvantages of Dynamic Hashing
Dynamic hashing comes with the following disadvantage −
• The location of the data in memory keeps changing according to the bucket size. Hence, if there is a phenomenal increase in data, maintaining the bucket address table becomes a challenge.
Differences between Static and Dynamic Hashing
Here are some prominent differences by which Static Hashing is different than Dynamic Hashing −

Key Factor | Static Hashing | Dynamic Hashing
Form of Data | Fixed-size, non-changing data. | Variable-size, changing data.
Result | The resulting data bucket is of fixed length. | The resulting data bucket is of variable length.
Bucket Overflow | Bucket overflow can arise often, depending upon memory. | Bucket overflow occurs very late or doesn't occur at all.
Complexity | Simple | Complex

Normalization
19 February 2023
13:50

Normalization
A large database defined as a single relation may result in data duplication. This repetition of data may result in:
• Very large relations.
• Difficulty maintaining and updating data, as it would involve searching many records in the relation.
• Wastage and poor utilization of disk space and resources.
• An increased likelihood of errors and inconsistencies.
To handle these problems, we should analyze and decompose the relations with redundant data into smaller, simpler, and well-structured relations that satisfy desirable properties. Normalization is a process of decomposing relations into relations with fewer attributes.
What is Normalization?
• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize redundancy in a relation or set of relations. It is also used to eliminate undesirable characteristics like insertion, update, and deletion anomalies.
• Normalization divides the larger table into smaller and links them using relationships.
• The normal form is used to reduce redundancy from the database table.
Why do we need Normalization?
The main reason for normalizing relations is to remove these anomalies. Failure to eliminate anomalies leads to data redundancy and can cause data integrity and other problems as the database grows. Normalization consists of a series of guidelines that help you create a good database structure.

Data modification anomalies can be categorized into three types:


• Insertion Anomaly: An insertion anomaly occurs when one cannot insert a new tuple into a relation due to lack of data.
• Deletion Anomaly: A deletion anomaly occurs when the deletion of data results in the unintended loss of some other important data.
• Update Anomaly: An update anomaly occurs when an update of a single data value requires multiple rows of data to be updated.
Types of Normal Forms:
Normalization works through a series of stages called normal forms. The normal forms apply to individual relations. A relation is said to be in a particular normal form if it satisfies the constraints of that form.
Following are the various types of Normal forms:

Normal Form | Description
1NF | A relation is in 1NF if every attribute contains only atomic values.
2NF | A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key.
3NF | A relation is in 3NF if it is in 2NF and no transitive dependency exists.
BCNF | A stronger definition of 3NF is known as Boyce-Codd normal form.
4NF | A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued dependency.
5NF | A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining is lossless.

Advantages of Normalization
• Normalization helps to minimize data redundancy.
• Greater overall database organization.
• Data consistency within the database.
• Much more flexible database design.
• Enforces the concept of relational integrity.
Disadvantages of Normalization
• You cannot start building the database before knowing what the user needs.
• The performance degrades when normalizing the relations to higher normal forms, i.e., 4NF, 5NF.
• It is very time-consuming and difficult to normalize relations of a higher degree.
• Careless decomposition may lead to a bad database design, leading to serious problems.

Pasted from <https://www.javatpoint.com/dbms-normalization>

Set Operations
19 February 2023
13:03

Explain set operators in DBMS


Operators like union, intersect, and minus operate on relations. They correspond to the relational algebra operators ∪, ∩ and −. Relations participating in the operations must have the same set of attributes.
The syntax for the set operators is as follows −
<query1><set operator><query2>
Now, let us understand the set operators in the database management system (DBMS).
UNION − It returns a table which consists of all rows either appearing in the result of <query1> or in the result of <query2>
For example,
select ename from emp where job='manager' UNION select ename from emp where job='analyst';
UNION ALL − It returns all rows selected by either query, including all duplicates.
For example,
select salary from emp where job='manager' UNION ALL select salary from emp where job='analyst';
INTERSECT − It returns all rows that appear in both results <query1> and <query2>
For example,
select * from orderList1 INTERSECT select * from orderList2;
INTERSECT ALL − It is like INTERSECT but keeps duplicates: a row that appears m times in <query1> and n times in <query2> appears min(m, n) times in the result.
For example,
select * from orderList1 INTERSECT ALL select * from orderList2;
MINUS − It returns those rows which appear in result of <query1> but not in the result of <query2>
For example,
select * from (select salary from emp where job='manager' MINUS select salary from emp where job='CEO');
Example
Consider the step by step query given below −
Step 1
Create table T1(regno number(10), branch varchar2(10));
The output is given herewith: Table created.
Step 2
insert into T1 values(100,'CSE');
insert into T1 values(101,'CSE');
insert into T1 values(102,'CSE');
insert into T1 values(103,'CSE');
insert into T1 values(104,'CSE');
The output will be as follows: 5 rows inserted.
Step 3
create table T2 (regno number(10), branch varchar2(10));
The output is as follows: Table created.
Step 4
insert into T2 values(101,'CSE');
insert into T2 values(102,'CSE');
insert into T2 values(103,'CSE');
The output is given herewith: 3 rows inserted.
Step 5
select * from T1;
Output
You will get the following output −
100|CSE
101|CSE
102|CSE
103|CSE
104|CSE
Step 6
select * from T2;
Output
You will get the following output −
101|CSE
102|CSE
103|CSE
Application of set operators
Now apply the set operators on the two tables which are created above.
The syntax for use of set operators is as follows −
select columnname(s) from tablename1 operatorname select columnname(s) from tablename2;

Union
Given below is the command for usage of Union set operator −
select regno from T1 UNION select regno from T2;
Output
You will get the following output −
100
101
102
103
104
Intersect
Given below is the command for usage of Intersect set operator −
select regno from T1 INTERSECT select regno from T2;
Output
You will get the following output −
101
102
103
Minus
Given below is the command for usage of Minus set operator −
select regno from T1 MINUS select regno from T2;
Output
You will get the following output −
100
104
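The three results above can be reproduced with Python's built-in sqlite3 module. One assumption to note: MINUS is Oracle syntax, and SQLite (like standard SQL) spells the same operator EXCEPT, so the sketch uses EXCEPT. The column types are also simplified to INTEGER and TEXT, and the helper name regnos is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (regno INTEGER, branch TEXT);
CREATE TABLE T2 (regno INTEGER, branch TEXT);
INSERT INTO T1 VALUES (100,'CSE'),(101,'CSE'),(102,'CSE'),(103,'CSE'),(104,'CSE');
INSERT INTO T2 VALUES (101,'CSE'),(102,'CSE'),(103,'CSE');
""")

def regnos(operator):
    """Combine the regno columns of T1 and T2 with the given set operator."""
    sql = f"SELECT regno FROM T1 {operator} SELECT regno FROM T2 ORDER BY regno"
    return [row[0] for row in conn.execute(sql)]

union_result     = regnos("UNION")        # duplicates removed
union_all_result = regnos("UNION ALL")    # duplicates kept
intersect_result = regnos("INTERSECT")    # rows present in both tables
minus_result     = regnos("EXCEPT")       # rows in T1 but not in T2

print(union_result)
print(union_all_result)
print(intersect_result)
print(minus_result)
```

UNION ALL returns eight rows here because 101, 102 and 103 occur in both tables, while UNION collapses them to five distinct values.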

10 Marks
19 February 2023
17:10

1. Explain about Tuple relational calculus giving examples.


2. Explain ER model with the help of a suitable diagram.
3. Show insertion and deletion operations on a B-Tree with relevant example.(Page no 251)
4. How is deadlock handled? Explain.
5. Discuss in detail on database system architecture.
6. Explain about tuple and domain relational calculus.
7. What is Normalization? Explain different normal forms.
8. Explain in detail about file organization.
9. List and explain the types of serializability giving examples.
10. What is DML? List and explain different types of commands under this category with appropriate
syntax and example.
11. Illustrate on the basic aspects of SQL.
12. Elaborate on the concept of normalization.
13. Explain about B-Tree. Give examples.
14. Give detailed notes on conflict serializability.
15. Explain the directory system.
16. Discuss the various advantages and disadvantages of DBMS.
17. Explain basic notations of ER diagram with examples.
18. Explain in detail about hashing techniques.
19. Describe concurrency control with time stamping method. Give examples.
20. Describe the evolution from centralized DBMSs to Distributed DBMSs.

DML
20 February 2023
10:36

DML Commands in SQL


DML is an abbreviation of Data Manipulation Language.
The DML commands in Structured Query Language change the data present in the SQL database. We can easily access, store, modify,
and delete the existing records from the database using DML commands.
Following are the four main DML commands in SQL:
1. SELECT Command
2. INSERT Command
3. UPDATE Command
4. DELETE Command

SELECT DML Command


SELECT is the most important data manipulation command in Structured Query Language. The SELECT command shows the records of the specified table. With the WHERE clause, it shows only the records that satisfy a given condition.

Syntax of SELECT DML command


SELECT column_Name_1, column_Name_2, ….., column_Name_N FROM Name_of_table;
Here, column_Name_1, column_Name_2, ….., column_Name_N are the names of those columns whose data we want to retrieve from the
table.

If we want to retrieve the data from all the columns of the table, we have to use the following SELECT command:
SELECT * FROM table_name;

Examples of SELECT Command


Example 1: This example shows all the values of every column from the table.
SELECT * FROM Student;
This SQL statement displays the following values of the student table:
Student_ID Student_Name Student_Marks
BCA1001 Abhay 85
BCA1002 Anuj 75
BCA1003 Bheem 60
BCA1004 Ram 79
BCA1005 Sumit 80

Example 2: This example shows all the values of specific columns from the table.
SELECT Emp_Id, Emp_Salary FROM Employee;
This SELECT statement displays all the values of the Emp_Id and Emp_Salary columns of the Employee table:
Emp_Id Emp_Salary
201 25000
202 45000
203 30000
204 29000
205 40000
Example 3: This example describes how to use the WHERE clause with the SELECT DML command.
Let's take the following Student table:
Student_ID Student_Name Student_Marks
BCA1001 Abhay 80
BCA1002 Ankit 75
BCA1003 Bheem 80
BCA1004 Ram 79
BCA1005 Sumit 80
If you want to access all the records of those students whose marks are 80 from the above table, then you have to write the following DML command in SQL:
SELECT * FROM Student WHERE Student_Marks = 80;
The above SQL query shows the following table in result:

Student_ID Student_Name Student_Marks
BCA1001 Abhay 80
BCA1003 Bheem 80
BCA1005 Sumit 80
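The WHERE example above can be tried out with a small script. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; the column types are assumptions, since the notes do not specify them.

```python
import sqlite3

# In-memory database; the column types are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "Student_ID TEXT PRIMARY KEY, Student_Name TEXT, Student_Marks INTEGER)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [
        ("BCA1001", "Abhay", 80),
        ("BCA1002", "Ankit", 75),
        ("BCA1003", "Bheem", 80),
        ("BCA1004", "Ram", 79),
        ("BCA1005", "Sumit", 80),
    ],
)

# SELECT with WHERE keeps only the rows that satisfy the condition.
rows = conn.execute(
    "SELECT * FROM Student WHERE Student_Marks = ? ORDER BY Student_ID", (80,)
).fetchall()
for row in rows:
    print(row)
```

The value 80 is passed as a bound parameter rather than pasted into the SQL string, which is the usual way to supply values from a program.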

INSERT DML Command


INSERT is another important data manipulation command in Structured Query Language, which allows users to insert data into
tables.
Syntax of INSERT Command
INSERT INTO TABLE_NAME ( column_Name1 , column_Name2 , column_Name3 , .... column_NameN ) VALUES (value_1, value_2, value_3, .... value_N ) ;
Examples of INSERT Command
Example 1: This example describes how to insert the record in the database table.
Let's take the following student table, which consists of only 2 records of the student.
Stu_Id Stu_Name Stu_Marks Stu_Age
101 Ramesh 92 20
201 Jatin 83 19
Suppose you want to insert a new record into the student table. For this, you have to write the following DML INSERT command (text values such as the name must be enclosed in single quotes):
INSERT INTO Student (Stu_Id, Stu_Name, Stu_Marks, Stu_Age) VALUES (104, 'Anmol', 89, 19);

UPDATE DML Command


UPDATE is another important data manipulation command in Structured Query Language, which allows users to update or modify the
existing data in database tables.
Syntax of UPDATE Command
UPDATE Table_name SET [column_name1= value_1, ….., column_nameN = value_N] WHERE CONDITION;
Here, 'UPDATE', 'SET', and 'WHERE' are the SQL keywords, and 'Table_name' is the name of the table whose values you want to update.
Examples of the UPDATE command
Example 1: This example describes how to update the value of a single field.
Let's take a Product table consisting of the following records:
Product_Id Product_Name Product_Price Product_Quantity
P101 Chips 20 20
P102 Chocolates 60 40
P103 Maggi 75 5
P201 Biscuits 80 20
P203 Namkeen 40 50
Suppose, you want to update the Product_Price of the product whose Product_Id is P102. To do this, you have to write the following DML
UPDATE command:
UPDATE Product SET Product_Price = 80 WHERE Product_Id = 'P102' ;
Example 2: This example describes how to update the value of multiple fields of the database table.
Let's take a Student table consisting of the following records:
Stu_Id Stu_Name Stu_Marks Stu_Age
101 Ramesh 92 20
201 Jatin 83 19
202 Anuj 85 19
203 Monty 95 21
102 Saket 65 21
103 Sumit 78 19
104 Ashish 98 20
Suppose you want to update Stu_Marks and Stu_Age of the students whose Stu_Id is 103 or 202. A single row cannot have both Ids, so the two
conditions are combined with OR (with AND, no row would match). To do this, you have to write the DML UPDATE command:
UPDATE Student SET Stu_Marks = 80, Stu_Age = 21 WHERE Stu_Id = 103 OR Stu_Id = 202;
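A row's Stu_Id can never equal 103 and 202 at the same time, so the way the two conditions are combined decides how many rows change. The following sketch, assuming Python's sqlite3 module and assumed column types, compares an AND condition with the equivalent IN list and reports how many rows each statement touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "Stu_Id INTEGER PRIMARY KEY, Stu_Name TEXT, Stu_Marks INTEGER, Stu_Age INTEGER)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        (101, "Ramesh", 92, 20), (201, "Jatin", 83, 19), (202, "Anuj", 85, 19),
        (203, "Monty", 95, 21), (102, "Saket", 65, 21), (103, "Sumit", 78, 19),
        (104, "Ashish", 98, 20),
    ],
)

# AND: no single row has both Ids, so nothing is updated.
and_count = conn.execute(
    "UPDATE Student SET Stu_Marks = 80, Stu_Age = 21 "
    "WHERE Stu_Id = 103 AND Stu_Id = 202"
).rowcount

# IN (shorthand for Stu_Id = 103 OR Stu_Id = 202): both rows are updated.
in_count = conn.execute(
    "UPDATE Student SET Stu_Marks = 80, Stu_Age = 21 WHERE Stu_Id IN (103, 202)"
).rowcount

updated = conn.execute(
    "SELECT Stu_Id, Stu_Marks, Stu_Age FROM Student "
    "WHERE Stu_Id IN (103, 202) ORDER BY Stu_Id"
).fetchall()
print(and_count, in_count, updated)
```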

DELETE DML Command


DELETE is a DML command which allows SQL users to remove single or multiple existing records from the database tables.
Inside a transaction, this command does not remove the stored data permanently until the transaction is committed; before that, the deletion can be rolled back. We use the WHERE
clause with the DELETE command to select the specific rows to remove; without it, every row of the table is deleted.
Syntax of DELETE Command

DELETE FROM Table_Name WHERE condition;
Examples of DELETE Command
Example 1: This example describes how to delete a single record from the table.
Let's take a Product table consisting of the following records:
Product_Id Product_Name Product_Price Product_Quantity
P101 Chips 20 20
P102 Chocolates 60 40
P103 Maggi 75 5
P201 Biscuits 80 20
P203 Namkeen 40 50
Suppose you want to delete that product from the Product table whose Product_Id is P203. To do this, you have to write the following DML
DELETE command:
DELETE FROM Product WHERE Product_Id = 'P203' ;
Example 2: This example describes how to delete the multiple records or rows from the database table.
Let's take a Student table consisting of the following records:
Stu_Id Stu_Name Stu_Marks Stu_Age
101 Ramesh 92 20
201 Jatin 83 19
202 Anuj 85 19
203 Monty 95 21
102 Saket 65 21
103 Sumit 78 19
104 Ashish 98 20
Suppose you want to delete the records of those students whose marks are greater than 70. To do this, you have to write the following DML
DELETE command:
DELETE FROM Student WHERE Stu_Marks > 70 ;
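The remark that DELETE does not remove data permanently can be observed with a transaction. The sketch below is an illustration only, assuming Python's sqlite3 module with its default transaction handling; count_rows is a helper introduced here. It deletes the matching rows and then rolls the deletion back before it is committed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    "Stu_Id INTEGER PRIMARY KEY, Stu_Name TEXT, Stu_Marks INTEGER, Stu_Age INTEGER)"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        (101, "Ramesh", 92, 20), (201, "Jatin", 83, 19), (202, "Anuj", 85, 19),
        (203, "Monty", 95, 21), (102, "Saket", 65, 21), (103, "Sumit", 78, 19),
        (104, "Ashish", 98, 20),
    ],
)
conn.commit()  # make the seven rows permanent


def count_rows():
    """Hypothetical helper: number of rows currently visible in Student."""
    return conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]


# The DELETE runs inside an implicit transaction opened by sqlite3.
conn.execute("DELETE FROM Student WHERE Stu_Marks > 70")
after_delete = count_rows()    # only the student with 65 marks is left

conn.rollback()                # undo the uncommitted deletion
after_rollback = count_rows()  # all rows are back
print(after_delete, after_rollback)
```

Calling conn.commit() instead of conn.rollback() would make the deletion permanent.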



B+Tree
20 February 2023
11:16

B+ Tree
• The B+ tree is a balanced multi-way search tree (not a binary tree: each node can have many children). It follows a multi-level index format.
• In the B+ tree, the leaf nodes hold the actual data pointers. The B+ tree ensures that all leaf nodes remain at the same height.
• In the B+ tree, the leaf nodes are linked using a linked list. Therefore, a B+ tree can support random access as well as sequential access.
Structure of B+ Tree
• In the B+ tree, every leaf node is at equal distance from the root node. The B+ tree is of the order n where n is fixed for every B+ tree.
• It contains internal nodes and leaf nodes.

Internal node
• An internal node of the B+ tree, except the root node, must contain at least n/2 pointers.
• At most, an internal node of the tree contains n pointers.
Leaf node
• A leaf node of the B+ tree must contain at least n/2 record pointers and n/2 key values.
• At most, a leaf node contains n record pointers and n key values.
• Every leaf node of the B+ tree contains one block pointer P to point to the next leaf node.
Searching a record in B+ Tree
Suppose we have to search for 55 in the below B+ tree structure. First, we look in the intermediary node to find the branch that leads to the leaf
node that can contain a record for 55.
So, in the intermediary node, we follow the branch between the keys 50 and 75. At the end of that branch, we are directed to the third leaf node.
Here the DBMS performs a sequential search to find 55.
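The two-step search (pick a branch in the intermediary node, then scan the leaf) can be sketched as follows. The figure is not reproduced in these notes, so the keys below are assumptions chosen so that 55 falls between 50 and 75 and lands in the third leaf; only one intermediary level is modelled.

```python
from bisect import bisect_right

# Assumed example tree: one intermediary node above four leaf nodes.
internal_keys = [25, 50, 75]
leaves = [
    [5, 10, 15, 20],
    [25, 30, 35, 45],
    [50, 55, 65, 70],   # third leaf: keys from 50 up to (but not including) 75
    [75, 80, 85, 90],
]


def search(key):
    """Return (leaf_index, position) of key, or None if it is not stored."""
    # Step 1: choose the branch whose key range contains the search key.
    leaf_index = bisect_right(internal_keys, key)
    # Step 2: sequential search inside the selected leaf node.
    for position, value in enumerate(leaves[leaf_index]):
        if value == key:
            return leaf_index, position
    return None


print(search(55))  # leaf index 2 is the third leaf
print(search(60))  # 60 is not stored yet
```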

B+ Tree Insertion
Suppose we want to insert a record 60 in the below structure. It belongs in the 3rd leaf node, after 55. The tree is balanced, and that leaf node of
the tree is already full, so we cannot simply insert 60 there.
In this case, we have to split the leaf node, so that 60 can be inserted into the tree without affecting the fill factor, balance and order.

With 60 added, the 3rd leaf node would hold the values (50, 55, 60, 65, 70), and the key in the intermediate node that currently leads to it is 50. We split the leaf node in the middle so
that the balance of the tree is not altered. So we can group (50, 55) and (60, 65, 70) into 2 leaf nodes.
If these two have to be leaf nodes, the intermediate node cannot branch from 50 alone. It should have 60 added to it, and then we can have a pointer
to a new leaf node.


This is how we can insert an entry when there is overflow. In a normal scenario, it is very easy to find the node where the entry fits and then place it in
that leaf node.
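The leaf split described above can be written down in a few lines. This is a simplified sketch, not a full B+ tree: insert_into_leaf and the leaf capacity of 4 keys are assumptions made for illustration, and only the leaf level is handled.

```python
def insert_into_leaf(leaf, key, capacity):
    """Insert key into a sorted leaf.

    Returns (left, right, separator). When no split is needed, right and
    separator are None. On overflow the leaf is split in the middle and the
    first key of the right half is the separator copied to the parent.
    """
    keys = sorted(leaf + [key])
    if len(keys) <= capacity:
        return keys, None, None
    middle = len(keys) // 2
    left, right = keys[:middle], keys[middle:]
    return left, right, right[0]


# The full third leaf from the example, assumed to hold at most 4 keys.
left, right, separator = insert_into_leaf([50, 55, 65, 70], 60, capacity=4)
print(left, right, separator)
```

In a B+ tree the separator key stays in the right leaf as well, because all records live in the leaves; the parent only receives a copy of it.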
B+ Tree Deletion
Suppose we want to delete 60 from the above example. In this case, we have to remove 60 from the intermediate node as well as from the
leaf node too. If we remove it from the intermediate node, then the tree will not satisfy the rule of the B+ tree. So we need to modify it to
have a balanced tree.
After deleting node 60 from above B+ tree and re-arranging the nodes, it will show as follows:



B Tree
19 February 2023
21:41

B Tree
B Tree is a specialized m-way tree that is widely used for disk access. A B-Tree of order m can have at most m-1 keys and m children per node.
One of the main reasons for using a B tree is its capability to store a large number of keys in a single node and large key values while keeping the
height of the tree relatively small.
A B tree of order m contains all the properties of an M way tree. In addition, it contains the following properties.
1. Every node in a B-Tree contains at most m children.
2. Every node in a B-Tree except the root node and the leaf nodes contains at least m/2 children (rounded up).
3. The root node must have at least 2 children, unless it is a leaf.
4. All leaf nodes must be at the same level.
It is not necessary that all the nodes contain the same number of children, but each internal node other than the root must have at least m/2 children.
A B tree of order 4 is shown in the following image.

While performing some operations on a B Tree, a property of the B Tree may be violated, such as the minimum number of children a node can have. To
maintain the properties of the B Tree, nodes may be split or joined.
Operations
Searching :
Searching in B Trees is similar to that in a Binary search tree. For example, suppose we search for an item 49 in the following B Tree. The process will be
something like the following:
1. Compare item 49 with root node 78. Since 49 < 78, move to its left sub-tree.
2. Since 40 < 49 < 56, traverse the right sub-tree of 40.
3. 49 > 45, move to the right. Compare 49.
4. Match found, return.
Searching in a B tree depends upon the height of the tree. The search algorithm takes O(log n) time to search any element in a B tree.
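The search steps above can be sketched as a loop that picks one child per level. The figure is not available here, so the tree below is an assumed example that contains the keys mentioned in the steps (78, 40, 56, 45, 49); Node and search are names introduced for this sketch.

```python
from bisect import bisect_left


class Node:
    """A B tree node: sorted keys plus children (empty list for a leaf)."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while True:
        i = bisect_left(node.keys, key)      # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True                      # match found in this node
        if not node.children:
            return False                     # reached a leaf without a match
        node = node.children[i]              # descend into the i-th sub-tree


# Assumed example tree; every leaf is at the same level.
root = Node([78], [
    Node([40, 56], [Node([20, 35]), Node([45, 49]), Node([60, 70])]),
    Node([90], [Node([80, 85]), Node([95, 99])]),
])

print(search(root, 49))
print(search(root, 50))
```

Only one node per level is visited, which is why the cost depends on the height of the tree.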

Inserting
Insertions are done at the leaf node level. The following algorithm needs to be followed in order to insert an item into B Tree.
1. Traverse the B Tree in order to find the appropriate leaf node at which the key can be inserted.
2. If the leaf node contains fewer than m-1 keys, then insert the element in increasing order.
3. Else, if the leaf node contains m-1 keys, then follow the following steps.
○ Insert the new element in the increasing order of elements.
○ Split the node into two nodes at the median.
○ Push the median element up to its parent node.
○ If the parent node also contains m-1 keys, then split it too by following the same steps.
Example:
Insert the node 8 into the B Tree of order 5 shown in the following image.

8 will be inserted to the right of 5, therefore insert 8.


The node now contains 5 keys, which is greater than the maximum of 5 - 1 = 4 keys. Therefore split the node at the median, i.e. 8, and push it up to its
parent node, shown as follows.
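The split at the median can be sketched for a single leaf. The leaf keys below are assumptions (the figure is not reproduced), chosen so that 8 goes to the right of 5 and becomes the median as in the example; insert_and_split is a hypothetical helper that handles only one node and leaves the parent update to the caller.

```python
def insert_and_split(leaf_keys, key, order):
    """Insert key into a B tree leaf of the given order.

    Returns (left_keys, median, right_keys). median is None when the node
    still holds at most order - 1 keys and no split is required.
    """
    keys = sorted(leaf_keys + [key])
    if len(keys) <= order - 1:
        return keys, None, []
    mid = len(keys) // 2
    # The median moves up to the parent; it is kept in neither half.
    return keys[:mid], keys[mid], keys[mid + 1:]


# Assumed full leaf of an order-5 tree (4 keys), then insert 8.
left, median, right = insert_and_split([1, 5, 9, 10], 8, order=5)
print(left, median, right)
```

Unlike the B+ tree case, the median key is moved to the parent rather than copied, so it no longer appears in either half.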

Deletion
Deletion is also performed at the leaf nodes. The key which is to be deleted can be either in a leaf node or in an internal node. The following
algorithm needs to be followed in order to delete a key from a B tree.
1. Locate the leaf node.
2. If the leaf node has more than the minimum number of keys (m/2 rounded up, minus 1), then delete the desired key from the node.
3. If the leaf node would be left with fewer than the minimum number of keys, then complete the keys by taking an element from the right or left sibling.
○ If the left sibling contains more than the minimum number of keys, then push its largest element up to its parent and move the intervening
element down to the node where the key is deleted.
○ If the right sibling contains more than the minimum number of keys, then push its smallest element up to the parent and move the intervening
element down to the node where the key is deleted.
4. If neither of the siblings contains more than the minimum number of keys, then create a new leaf node by joining the two leaf nodes and the intervening
element of the parent node.
5. If the parent is left with fewer than the minimum number of keys, then apply the above process on the parent too.
If the key which is to be deleted is in an internal node, then replace it with its in-order successor or predecessor. Since the successor
or predecessor will always be in a leaf node, the process will then be similar to deleting the key from a leaf node.
Example 1
Delete the node 53 from the B Tree of order 5 shown in the following figure.

53 is present in the right child of element 49. Delete it.

Now, 57 is the only element left in the node, while the minimum number of elements that must be present in a node of a B tree of order 5 is 2. It
is less than that, and the elements in its left and right siblings are also not sufficient to lend one; therefore, merge it with the left sibling and the intervening
element of the parent, i.e. 49.
The final B tree is shown as follows.

Application of B tree
B tree is used to index the data and provides fast access to the actual data stored on the disks, since access to a value stored in a
database that is stored on a disk is a very time-consuming process.
Searching an un-indexed and unsorted database containing n key values needs O(n) running time in the worst case. However, if we use a B tree
to index this database, it can be searched in O(log n) time in the worst case.



Normalization
19 February 2023
13:50

Normalization
A large database defined as a single relation may result in data duplication. This repetition of data may result in:
• Relations becoming very large.
• Difficulty in maintaining and updating data, as it would involve searching many records in the relation.
• Wastage and poor utilization of disk space and resources.
• An increased likelihood of errors and inconsistencies.
So to handle these problems, we should analyze and decompose the relations with redundant data into smaller, simpler, and well-structured
relations that satisfy desirable properties. Normalization is a process of decomposing the relations into relations with fewer attributes.
What is Normalization?
• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize the redundancy from a relation or set of relations. It is also used to eliminate undesirable characteristics
like Insertion, Update, and Deletion Anomalies.
• Normalization divides the larger table into smaller tables and links them using relationships.
• The normal form is used to reduce redundancy from the database table.
Why do we need Normalization?
The main reason for normalizing the relations is removing these anomalies. Failure to eliminate anomalies leads to data redundancy and
can cause data integrity and other problems as the database grows. Normalization consists of a series of guidelines that help
you create a good database structure.

Data modification anomalies can be categorized into three types:


• Insertion Anomaly: An insertion anomaly occurs when one cannot insert a new tuple into a relation due to lack of data.
• Deletion Anomaly: A deletion anomaly refers to the situation where the deletion of data results in the unintended loss of some other
important data.
• Update Anomaly: An update anomaly occurs when an update of a single data value requires multiple rows of data to be updated.
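An update anomaly is easy to reproduce with a small data set. The relation, names and values below are hypothetical and serve only as an illustration: in the single large relation the instructor of a course is repeated on every enrollment row, whereas the decomposed design stores it once.

```python
# Hypothetical unnormalized relation: the instructor is repeated on every row.
enrollments = [
    {"student": "Ramesh", "course": "DBMS", "instructor": "Rao"},
    {"student": "Jatin", "course": "DBMS", "instructor": "Rao"},
    {"student": "Anuj", "course": "OS", "instructor": "Sen"},
]

# Update anomaly: changing the instructor in only one of the DBMS rows
# leaves the relation with two different instructors for the same course.
enrollments[0]["instructor"] = "Iyer"
dbms_instructors = {
    row["instructor"] for row in enrollments if row["course"] == "DBMS"
}

# Decomposed design: the fact "course -> instructor" is stored exactly once.
courses = {"DBMS": "Rao", "OS": "Sen"}
student_course = [("Ramesh", "DBMS"), ("Jatin", "DBMS"), ("Anuj", "OS")]
courses["DBMS"] = "Iyer"  # a single update is enough
normalized_instructors = {
    courses[course] for _, course in student_course if course == "DBMS"
}

print(dbms_instructors)        # inconsistent: two instructors
print(normalized_instructors)  # consistent: one instructor
```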
Types of Normal Forms:
Normalization works through a series of stages called Normal forms. The normal forms apply to individual relations. A relation is said to be in a
particular normal form if it satisfies the constraints of that form.
Following are the various types of Normal forms:

Normal Form   Description
1NF           A relation is in 1NF if every attribute contains only atomic values.
2NF           A relation will be in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key.
3NF           A relation will be in 3NF if it is in 2NF and no transitive dependency exists.
BCNF          Boyce-Codd Normal Form: a stronger definition of 3NF.
4NF           A relation will be in 4NF if it is in Boyce-Codd normal form and has no multi-valued dependency.
5NF           A relation is in 5NF if it is in 4NF and does not contain any join dependency; joining should be lossless.

Advantages of Normalization
• Normalization helps to minimize data redundancy.
• Greater overall database organization.
• Data consistency within the database.
• Much more flexible database design.
• Enforces the concept of relational integrity.
Disadvantages of Normalization
• You cannot start building the database before knowing what the user needs.
• The performance degrades when normalizing the relations to higher normal forms, i.e., 4NF, 5NF.
• It is very time-consuming and difficult to normalize relations of a higher degree.
• Careless decomposition may lead to a bad database design, leading to serious problems.

Pasted from <https://www.javatpoint.com/dbms-normalization>



Deadlock
19 February 2023
16:55

Deadlock in DBMS
A deadlock is a condition where two or more transactions are waiting indefinitely for one another to give up locks. Deadlock is said to be
one of the most feared complications in DBMS, as no task ever gets finished and the transactions involved remain in a waiting state forever.

For example: transaction T1 holds a lock on some rows in the Student table and needs to update some rows in the Grade table.
Simultaneously, transaction T2 holds locks on some rows in the Grade table and needs to update the rows in the Student table held
by transaction T1.
Now the main problem arises. Transaction T1 is waiting for T2 to release its lock and, similarly, transaction T2 is waiting for T1 to
release its lock. All activities come to a halt and remain at a standstill until the DBMS detects the
deadlock and aborts one of the transactions.

Deadlock Avoidance
• When a database is stuck in a deadlock state, transactions have to be aborted or restarted, which is a waste of time and resources. It is
therefore better to avoid the deadlock in the first place.
• A deadlock avoidance mechanism is used to detect any deadlock situation in advance. A method like "wait for graph" is used for
detecting the deadlock situation, but this method is suitable only for smaller databases. For larger databases, the deadlock
prevention method can be used.

Deadlock Detection
In a database, when a transaction waits indefinitely to obtain a lock, then the DBMS should detect whether the transaction is involved in
a deadlock or not. The lock manager maintains a wait-for graph to detect deadlock cycles in the database.

Wait for Graph


• This is a suitable method for deadlock detection. In this method, a graph is created based on the transactions and their locks. If the
created graph has a cycle or closed loop, then there is a deadlock.
• The wait-for graph is maintained by the system for every transaction which is waiting for some data held by the others. The
system keeps checking whether there is any cycle in the graph.
The wait-for graph for the above scenario is shown below:
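Checking a wait-for graph for a cycle can be sketched with a depth-first search. The representation below is an assumption made for illustration: a dictionary maps each transaction to the transactions it is waiting for, and has_cycle is a name introduced here.

```python
def has_cycle(wait_for):
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    in_progress, done = set(), set()

    def visit(txn):
        in_progress.add(txn)
        for other in wait_for.get(txn, ()):
            if other in in_progress:
                return True          # reached a transaction on the current path
            if other not in done and visit(other):
                return True
        in_progress.discard(txn)
        done.add(txn)
        return False

    return any(txn not in done and visit(txn) for txn in wait_for)


# T1 waits for T2 and T2 waits for T1, as in the Student/Grade scenario.
deadlocked = {"T1": ["T2"], "T2": ["T1"]}
# T1 waits for T2, but T2 is not waiting for anyone.
no_deadlock = {"T1": ["T2"], "T2": []}

print(has_cycle(deadlocked), has_cycle(no_deadlock))
```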



Deadlock Prevention
• The deadlock prevention method is suitable for a large database. If the resources are allocated in such a way that a deadlock never
occurs, then the deadlock can be prevented.
• The Database management system analyzes the operations of the transaction to determine whether they can create a deadlock situation or
not. If they can, then the DBMS never allows that transaction to be executed.
Wait-Die scheme
In this scheme, if a transaction requests a resource which is already held with a conflicting lock by another transaction, then the DBMS
simply checks the timestamps of both transactions. It allows only the older transaction to wait until the resource is available.
Let's assume there are two transactions Ti and Tj, and let TS(T) be the timestamp of any transaction T (a smaller timestamp means an older transaction). If Tj holds a lock on a
resource and Ti is requesting that resource, then the following actions are performed by the DBMS:
1. If TS(Ti) < TS(Tj): Ti is the older transaction and Tj holds the resource, so Ti is allowed to wait until the data item is
available. That means if the older transaction is waiting for a resource which is locked by the younger transaction,
then the older transaction is allowed to wait for the resource until it is available.
2. If TS(Ti) > TS(Tj): Ti is the younger transaction and the older transaction Tj holds the resource, so Ti is killed and
restarted later with a random delay but with the same timestamp.
Wound wait scheme
• In the wound-wait scheme, if the older transaction requests a resource which is held by the younger transaction, then the older
transaction forces the younger one to abort and release the resource. After a small delay, the younger transaction is
restarted but with the same timestamp.
• If the older transaction holds a resource which is requested by the younger transaction, then the younger transaction is asked to
wait until the older one releases it.
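Both schemes reduce to a comparison of two timestamps. The sketch below is a simplified illustration with hypothetical function names; it assumes that a smaller timestamp means an older transaction and returns what happens when one transaction requests a resource held by another.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits, a younger requester dies."""
    return "requester waits" if requester_ts < holder_ts else "requester dies"


def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds the holder, a younger one waits."""
    return "holder is aborted" if requester_ts < holder_ts else "requester waits"


# Timestamps are assumed: 5 is the older transaction, 9 the younger one.
print(wait_die(5, 9))    # older requests from younger
print(wait_die(9, 5))    # younger requests from older
print(wound_wait(5, 9))  # older requests from younger
print(wound_wait(9, 5))  # younger requests from older
```

In both schemes the aborted transaction is always the younger one, and it keeps its original timestamp when it restarts, so it eventually becomes the oldest and cannot be aborted forever.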

Differences between the Wait-Die and Wound-Wait prevention schemes:

Wait-Die                                              Wound-Wait
It is based on a non-preemptive technique.            It is based on a preemptive technique.
Older transactions must wait for the younger one      Older transactions never wait for younger
to release its data items.                            transactions.
The number of aborts and rollbacks is higher.         The number of aborts and rollbacks is lower.

Recovery from Deadlock (Page no 351)



ER Model
19 February 2023
17:13

The ER Model is used to model the logical view of the system from a data perspective. It
consists of these components:

Components of the E-R Model

Entity, Entity Type, Entity Set –


An Entity may be an object with a physical existence – a particular person, car,
house, or employee – or it may be an object with a conceptual existence – a
company, a job, or a university course.
An Entity is an object of an Entity Type, and the set of all entities of that type is called an entity set. For example,
E1 is an entity having Entity Type Student, and the set of all students is called an Entity Set. In an ER
diagram, an Entity Type is represented as:

Attribute(s):

Attributes are the properties that define the entity type. For example, Roll_No, Name,
DOB, Age, Address, Mobile_No are the attributes that define entity type Student. In ER
diagram, the attribute is represented by an oval.

1. Key Attribute –
The attribute which uniquely identifies each entity in the entity set is called key

attribute. For example, Roll_No will be unique for each student. In an ER diagram, a key
attribute is represented by an oval with the attribute name underlined.

2. Composite Attribute –
An attribute composed of many other attributes is called a composite attribute. For
example, the Address attribute of the Student entity type consists of Street, City, State, and
Country. In an ER diagram, a composite attribute is represented by an oval connected to its component
ovals.

3. Multivalued Attribute –
An attribute that can hold more than one value for a given entity. For example, a phone number such as Mobile_No
(there can be more than one for a given student). In an ER diagram, a multivalued attribute is
represented by a double oval.

4. Derived Attribute –
An attribute that can be derived from other attributes of the entity type is known as a
derived attribute. e.g.; Age (can be derived from DOB). In ER diagram, the derived
attribute is represented by a dashed oval.

The complete entity type Student with its attributes can be represented as:



Relationship Type and Relationship Set:
A relationship type represents the association between entity types. For
example, ‘Enrolled in’ is a relationship type that exists between the entity types Student
and Course. In an ER diagram, the relationship type is represented by a diamond
connected to the participating entities with lines.

A set of relationships of the same type is known as a relationship set. The following
relationship set depicts S1 as enrolled in C2, S2 as enrolled in C1, and S3 as enrolled in
C3.

Degree of a relationship set:

The number of different entity sets participating in a relationship set is called as the
degree of a relationship set.
1. Unary Relationship –
When there is only ONE entity set participating in a relationship, the relationship is called
a unary relationship. For example, one person is married to only one person.

2. Binary Relationship –

When there are TWO entity sets participating in a relationship, the relationship is
called a binary relationship. For example, a Student is enrolled in a Course.

3. n-ary Relationship –
When there are n entity sets participating in a relationship, the relationship is called an
n-ary relationship.
Cardinality:
The number of times an entity of an entity set participates in a relationship set is
known as cardinality. Cardinality can be of different types:
1. One-to-one – When each entity in each entity set can take part only once in the
relationship, the cardinality is one-to-one. Let us assume that a male can marry one
female and a female can marry one male. So the relationship will be one-to-one.
The total number of tables that can be used in this is 2.

Using Sets, it can be represented as:

2. Many to one – When entities in one entity set can take part only once in the
relationship set and entities in other entity sets can take part more than once in the
relationship set, cardinality is many to one. Let us assume that a student can take
only one course but one course can be taken by many students. So the cardinality will
be n to 1. It means that for one course there can be n students but for one student,
there will be only one course.
The total number of tables that can be used in this is 3.

Using Sets, it can be represented as:



In this case, each student is taking only 1 course but 1 course has been taken by many
students.
3. Many to many – When entities in all entity sets can take part more than once in
the relationship, the cardinality is many to many. Let us assume that a student can take
more than one course and one course can be taken by many students. So the
relationship will be many to many.
The total number of tables that can be used in this is 3.

Using sets, it can be represented as:

In this example, student S1 is enrolled in C1 and C3, and course C3 has several students enrolled, including S1
and S4. So it is a many-to-many relationship.
There is a one-to-many mapping as well, where each entity on one side can be related to
more than one entity on the other side, and the total number of tables that can be used in this
is 2.

Participation Constraint:

Participation Constraint is applied to the entity participating in the relationship set.


1. Total Participation – Each entity in the entity set must participate in the
relationship. If each student must enroll in a course, the participation of students
will be total. Total participation is shown by a double line in the ER diagram.
2. Partial Participation – The entity in the entity set may or may NOT participate in the
relationship. If some courses are not enrolled by any of the students, the
participation of the course will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with the Student Entity set having total
participation and the Course Entity set having partial participation.



Using set, it can be represented as,

Every student in the Student Entity set is participating in a relationship, but there exists a
course C4 that is not taking part in the relationship.
Weak Entity Type and Identifying Relationship:
As discussed before, an entity type has a key attribute that uniquely identifies each
entity in the entity set. But there exist some entity types for which key attributes can’t
be defined. These are called Weak Entity types.
For example, a company may store the information of dependents (Parents, Children,
Spouse) of an Employee. But the dependents cannot exist without the employee.
So Dependent will be a weak entity type and Employee will be the Identifying Entity type for
Dependent.
A weak entity type is represented by a double rectangle. The participation of weak entity
types is always total. The relationship between the weak entity type and its identifying
strong entity type is called the identifying relationship and it is represented by a double
diamond.



Tuple Relational Calculus (TRC) in DBMS
20 February 2023
10:21

Tuple Relational Calculus is a non-procedural query language, unlike relational algebra. Tuple Calculus provides only the description of the
query but it does not provide the methods to solve it. Thus, it explains what to do but not how to do it.
In Tuple Calculus, a query is expressed as
{t| P(t)}
where t = resulting tuples,
P(t) = known as Predicate and these are the conditions that are used to fetch t
Thus, it generates set of all tuples t, such that Predicate P(t) is true for t.
P(t) may have various conditions logically combined with OR (∨), AND (∧), NOT(¬).
It also uses quantifiers:
∃ t ∈ r (Q(t)) = ”there exists” a tuple t in relation r such that predicate Q(t) is true.
∀ t ∈ r (Q(t)) = Q(t) is true “for all” tuples in relation r.
Example:
Table-1: Customer

Customer name Street City


Saurabh A7 Patiala
Mehak B6 Jalandhar
Sumiti D9 Ludhiana
Ria A5 Patiala
Table-2: Branch

Branch name Branch city


ABC Patiala
DEF Ludhiana
GHI Jalandhar
Table-3: Account

Account number Branch name Balance


1111 ABC 50000
1112 DEF 10000
1113 GHI 9000
1114 ABC 7000
Table-4: Loan

Loan number Branch name Amount


L33 ABC 10000
L35 DEF 15000
L49 GHI 9000
L98 DEF 65000
Table-5: Borrower

Customer name Loan number


Saurabh L33
Mehak L49
Ria L98
Table-6: Depositor

Customer name Account number


Saurabh 1111
Mehak 1113
Sumiti 1114

Queries-1: Find the loan number, branch, amount of loans of greater than or equal to 10000 amount.
{t| t ∈ loan ∧ t[amount]>=10000}
Resulting relation:

Loan number Branch name Amount


L33 ABC 10000
L35 DEF 15000
L98 DEF 65000
In the above query, t is known as the tuple variable, and t[amount] denotes the value of tuple t on the amount attribute.
Queries-2: Find the loan number for each loan of an amount greater or equal to 10000.
{t| ∃ s ∈ loan(t[loan number] = s[loan number]
∧ s[amount]>=10000)}
Resulting relation:

Loan number
L33
L35
L98
Queries-3: Find the names of all customers who have a loan and an account at the bank.
{t | ∃ s ∈ borrower( t[customer-name] = s[customer-name])
∧ ∃ u ∈ depositor( t[customer-name] = u[customer-name])}
Resulting relation:

Customer name
Saurabh
Mehak
Queries-4: Find the names of all customers having a loan at the “ABC” branch.
{t | ∃ s ∈ borrower(t[customer-name] = s[customer-name]
∧ ∃ u ∈ loan(u[branch-name] = “ABC” ∧ u[loan-number] = s[loan-number]))}
Resulting relation:

Customer name
Saurabh
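Because a tuple calculus query only states which tuples qualify, it maps closely onto a set comprehension. The sketch below is one possible reading of Queries 1 to 4 in Python, using the Loan, Borrower and Depositor tables from above; the namedtuple field names are assumptions, and `any(...)` plays the role of the ∃ quantifier.

```python
from collections import namedtuple

Loan = namedtuple("Loan", "loan_number branch_name amount")
Borrower = namedtuple("Borrower", "customer_name loan_number")
Depositor = namedtuple("Depositor", "customer_name account_number")

loan = [Loan("L33", "ABC", 10000), Loan("L35", "DEF", 15000),
        Loan("L49", "GHI", 9000), Loan("L98", "DEF", 65000)]
borrower = [Borrower("Saurabh", "L33"), Borrower("Mehak", "L49"),
            Borrower("Ria", "L98")]
depositor = [Depositor("Saurabh", 1111), Depositor("Mehak", 1113),
             Depositor("Sumiti", 1114)]

# Query 1: {t | t ∈ loan ∧ t[amount] >= 10000}
query_1 = [t for t in loan if t.amount >= 10000]

# Query 2: only the loan number of each qualifying loan.
query_2 = {s.loan_number for s in loan if s.amount >= 10000}

# Query 3: customers who appear in both borrower and depositor.
query_3 = {s.customer_name for s in borrower
           if any(u.customer_name == s.customer_name for u in depositor)}

# Query 4: customers whose loan belongs to the "ABC" branch.
query_4 = {s.customer_name for s in borrower
           if any(u.branch_name == "ABC" and u.loan_number == s.loan_number
                  for u in loan)}

print(query_1, query_2, query_3, query_4, sep="\n")
```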



Domain Relational Calculus in DBMS
20 February 2023
10:24

Domain Relational Calculus is a non-procedural query language equivalent in power to Tuple Relational Calculus. Domain
Relational Calculus provides only the description of the query but it does not provide the methods to solve it. In Domain
Relational Calculus, a query is expressed as,
{ < x1, x2, x3, ..., xn > | P (x1, x2, x3, ..., xn ) }
where, < x1, x2, x3, …, xn > represents the resulting domain variables and P (x1, x2, x3, …, xn ) represents the condition or formula
equivalent to the Predicate calculus.

Predicate Calculus Formula:


1. Set of all comparison operators
2. Set of connectives like and, or, not
3. Set of quantifiers
Example:
Table-1: Customer

Customer name Street City


Debomit Kadamtala Alipurduar
Sayantan Udaypur Balurghat
Soumya Nutanchati Bankura
Ritu Juhu Mumbai

Table-2: Loan

Loan number Branch name Amount


L01 Main 200
L03 Main 150
L10 Sub 90
L08 Main 60

Table-3: Borrower

Customer name Loan number


Ritu L01
Debomit L08
Soumya L03
Query-1: Find the loan number, branch, amount of loans of greater than or equal to 100 amount.
{≺l, b, a≻ | ≺l, b, a≻ ∈ loan ∧ (a ≥ 100)}
Resulting relation:

Loan number Branch name Amount


L01 Main 200
L03 Main 150
Query-2: Find the loan number for each loan of an amount greater or equal to 150.
{≺l≻ | ∃ b, a (≺l, b, a≻ ∈ loan ∧ (a ≥ 150))}
Resulting relation:

Loan number
L01
L03

Query-3: Find the names of all customers having a loan at the “Main” branch and find the loan amount .
{≺c, a≻ | ∃ l (≺c, l≻ ∈ borrower ∧ ∃ b (≺l, b, a≻ ∈ loan ∧ (b = “Main”)))}
Resulting relation:

Customer Name Amount


Ritu 200
Debomit 60
Soumya 150
Note:
The domain variables that are to appear in the resulting relation must be listed before | within ≺ and ≻, and all the domain variables must
appear in the same order as the attributes of the original relation or table.
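Domain calculus works with one variable per attribute instead of one variable per tuple, which corresponds to unpacking each row into separate names. The following sketch is an illustration of Queries 1 to 3 in Python using the Loan and Borrower tables above; the variable names follow the queries (l, b, a, c), with l2 introduced here for the join.

```python
# Rows are plain tuples; each position is one domain (attribute).
loan = [("L01", "Main", 200), ("L03", "Main", 150),
        ("L10", "Sub", 90), ("L08", "Main", 60)]
borrower = [("Ritu", "L01"), ("Debomit", "L08"), ("Soumya", "L03")]

# Query 1: {<l, b, a> | <l, b, a> ∈ loan ∧ a >= 100}
query_1 = [(l, b, a) for (l, b, a) in loan if a >= 100]

# Query 2: {<l> | ∃ b, a (<l, b, a> ∈ loan ∧ a >= 150)}
query_2 = [l for (l, b, a) in loan if a >= 150]

# Query 3: customer name and amount for loans at the "Main" branch.
query_3 = [(c, a)
           for (c, l) in borrower
           for (l2, b, a) in loan
           if l2 == l and b == "Main"]

print(query_1, query_2, query_3, sep="\n")
```

Only the variables listed before the `|` are kept in each result, which is why Query 2 yields loan numbers alone.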

Pasted from <https://www.geeksforgeeks.org/domain-relational-calculus-in-dbms/>



File Organization in DBMS
20 February 2023
11:10

A database consists of a huge amount of data. The data is grouped within tables in an RDBMS, and each table has related records. A user
can see that the data is stored in the form of tables, but in reality this huge amount of data is stored in physical memory in the form of files.
File – A file is a named collection of related information that is recorded on secondary storage such as magnetic disks, magnetic tapes
and optical disks.
What is File Organization?
File Organization refers to the logical relationships among the various records that constitute the file, particularly with respect to the means
of identification and access to any specific record. In simple terms, storing the files in a certain order is called file Organization. File
Structure refers to the format of the label and data blocks and of any logical control record.

Types of File Organizations –

Various methods have been introduced to organize files. Each method has advantages and disadvantages on the basis of access or selection. Thus it is up to the programmer to decide the best-suited file organization method according to the requirements.
Some types of file organization are:

• Sequential File Organization
• Heap File Organization
• Hash File Organization
• B+ Tree File Organization
• Clustered File Organization

We will discuss each of these file organizations in later parts of this article, along with the differences and the advantages/disadvantages of each file organization method.

Sequential File Organization –

The simplest method of file organization is the sequential method. In this method the records are stored one after another in a sequential manner. There are two ways to implement this method:
• Pile File Method – This method is quite simple: records are stored in a sequence, i.e., one after the other, in the order in which they are inserted into the tables.

1. Insertion of new record –

Let R1, R3, and so on up to R5 and R4 be the records in the sequence. Here, a record is nothing but a row in a table. Suppose a new record R2 has to be inserted in the sequence; it is simply placed at the end of the file.

• Sorted File Method – In this method, as the name itself suggests, whenever a new record has to be inserted, it is always inserted in a sorted (ascending or descending) manner. Sorting of records may be based on the primary key or on any other key.

1. Insertion of new record –

Let us assume that there is a preexisting sorted sequence of records R1, R3, and so on up to R7 and R8. Suppose a new record R2 has to be inserted in the sequence; it is inserted at the end of the file, and then the sequence is sorted again.
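The two insertion strategies can be sketched in Python as follows. This is a minimal illustration that models the file as an in-memory list; the function names are assumptions, and the bisect-based variant is an alternative that the text does not describe.

```python
import bisect


def pile_insert(file_records, record):
    """Pile file method: the new record simply goes at the end of the file."""
    file_records.append(record)


def sorted_insert(file_records, record):
    """Sorted file method as described above: append, then sort the sequence."""
    file_records.append(record)
    file_records.sort()


def sorted_insert_in_place(file_records, record):
    """Alternative (not from the text): binary-search the slot and insert there."""
    bisect.insort(file_records, record)


pile = ["R1", "R3", "R5", "R4"]
pile_insert(pile, "R2")            # R2 ends up last

ordered = ["R1", "R3", "R7", "R8"]
sorted_insert(ordered, "R2")       # R2 ends up between R1 and R3

print(pile)
print(ordered)
```

Re-sorting after every insertion is what makes the sorted file method costly for large files, which is the drawback listed in the cons of sequential file organization.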

Pros and Cons of Sequential File Organization –


Pros –
• Fast and efficient for processing huge amounts of data.
• Simple design.
• Files can easily be stored on magnetic tapes, i.e., a cheaper storage mechanism.
Cons –
• Time is wasted because we cannot jump directly to the required record; we have to move through the file sequentially, which takes time.
• The sorted file method is inefficient, as it takes time and space to sort the records.

Heap File Organization –

Heap File Organization works with data blocks. In this method records are inserted at the end of the file, into the data blocks. No sorting or ordering is required in this method. If a data block is full, the new record is stored in some other block. This other data block need not be the very next data block; it can be any block in memory. It is the responsibility of the DBMS to store and manage the new records.

Insertion of new record –


Suppose we have five records in the heap, R1, R5, R6, R4 and R3, and suppose a new record R2 has to be inserted in the heap. Since the last data block, i.e., data block 3, is full, R2 is inserted in any of the data blocks selected by the DBMS, let's say data block 1.

If we want to search, delete or update data in a heap file organization, we traverse the data from the beginning of the file until we get the requested record. Thus if the database is very large, searching, deleting or updating a record takes a lot of time.
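The insertion and search behaviour described above can be sketched in Python. The block capacity of two records and the way the five records are spread over the three data blocks are assumptions made for this illustration, as are the function names.

```python
BLOCK_CAPACITY = 2  # assumed number of records per data block


def heap_insert(blocks, record):
    """Try the last block first; if it is full, use any block with free space."""
    if len(blocks[-1]) < BLOCK_CAPACITY:
        blocks[-1].append(record)
        return len(blocks) - 1
    for index, block in enumerate(blocks):
        if len(block) < BLOCK_CAPACITY:
            block.append(record)
            return index
    blocks.append([record])          # every block is full: allocate a new one
    return len(blocks) - 1


def heap_search(blocks, record):
    """Scan from the start; return (block index, records examined)."""
    examined = 0
    for index, block in enumerate(blocks):
        for stored in block:
            examined += 1
            if stored == record:
                return index, examined
    return None, examined


# Data blocks 1, 2 and 3 (list indexes 0, 1 and 2); data block 3 is full.
blocks = [["R1"], ["R5", "R6"], ["R4", "R3"]]
placed_in = heap_insert(blocks, "R2")
found_in, examined = heap_search(blocks, "R3")
print(placed_in, found_in, examined)
```

Because the records are unordered, the search has to examine every record that comes before the one requested, which is why the method becomes slow for large files.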
Pros and Cons of Heap File Organization –
Pros –
• Fetching and retrieving records is faster than in a sequential file, but only in the case of small databases.
• When a huge amount of data needs to be loaded into the database at one time, this method of file organization is best suited.
Cons –
• Problem of unused memory blocks.
• Inefficient for larger databases.

In a database management system, when we want to retrieve a particular piece of data, it becomes very inefficient to search all the index values to reach the desired data. In this situation, the hashing technique comes into the picture.
Hashing is an efficient technique to directly compute the location of the desired data on the disk without using an index structure. Data is stored in the data blocks whose address is generated by a hash function. The memory location where these records are stored is called a data block or data bucket.
Prerequisite - Hashing Data Structure

Hash File Organization:

• Data bucket – Data buckets are the memory locations where the records are stored. These buckets are also considered as Unit Of
Storage.
• Hash Function – A hash function is a mapping function that maps the set of search keys to actual record addresses. Generally, the hash function uses the primary key to generate the hash index, i.e., the address of the data block. The hash function can be anything from a simple mathematical function to a complex one.
• Hash Index – The prefix of an entire hash value is taken as a hash index. Every hash index has a depth value that signifies how many bits are used for computing the hash function. With n bits, 2^n buckets can be addressed. When all these bits are consumed, the depth value is increased linearly and twice as many buckets are allocated.

Static Hashing:

In static hashing, when a search-key value is provided, the hash function always computes the same address. For example, if we generate an address for STUDENT_ID = 104 using a mod (5) hash function, it always results in the same bucket address 4. There will not be any changes to the bucket address here. Hence the number of data buckets in memory for static hashing remains constant throughout.
Operations:
• Insertion – When a new record is inserted into the table, the hash function h generates a bucket address for the new record based on its hash key K. Bucket address = h(K)
• Searching – When a record needs to be searched, the same hash function is used to retrieve the bucket address for the record. For example, if we want to retrieve the whole record for ID 104, and the hash function is mod (5) on that ID, the bucket address generated is 4. Then we go directly to address 4 and retrieve the whole record for ID 104. Here ID acts as the hash key.
• Deletion – If we want to delete a record, we first use the hash function to fetch the record that is to be deleted. Then we remove the record stored at that address in memory.
• Updation – The data record that needs to be updated is first searched for using the hash function, and then the data record is updated.
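The four operations can be sketched in Python with the mod (5) hash function from the example. This is only an illustration: each bucket is modelled as a list so that several records can share an address, which sidesteps bucket overflow, and the function names and record contents are assumptions.

```python
NUM_BUCKETS = 5                       # fixed for the lifetime of the file
buckets = [[] for _ in range(NUM_BUCKETS)]


def h(key):
    """Static hash function: the same key always yields the same bucket address."""
    return key % NUM_BUCKETS


def insert(key, record):
    buckets[h(key)].append((key, record))


def search(key):
    for stored_key, record in buckets[h(key)]:
        if stored_key == key:
            return record
    return None


def update(key, new_record):
    bucket = buckets[h(key)]
    for position, (stored_key, _) in enumerate(bucket):
        if stored_key == key:
            bucket[position] = (key, new_record)
            return True
    return False


def delete(key):
    bucket = buckets[h(key)]
    bucket[:] = [(k, r) for (k, r) in bucket if k != key]


insert(104, "record for student 104")
insert(109, "record for student 109")     # 109 mod 5 is also 4
address = h(104)
found = search(104)
update(104, "updated record for student 104")
delete(109)
print(address, found, buckets[4])
```

Only the bucket at address h(K) is ever inspected, so the cost of each operation does not depend on the total number of records in the file.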
Now, suppose we want to insert a new record into the file, but the data bucket address generated by the hash function is not empty, i.e., data already exists at that address. This becomes a critical situation to handle. In static hashing this situation is called bucket overflow. How will we insert data in this case? There are several methods to overcome this situation. Some commonly used methods are discussed below:
1. Open Hashing – In the open hashing method, the next available data block is used to enter the new record, instead of overwriting the older one. This method is also called linear probing. For example, D3 is a new record that needs to be inserted, and the hash function generates the address 105. But that bucket is already full. So the system searches for the next available data bucket, 123, and assigns D3 to it.
2. Closed Hashing – In the closed hashing method, a new data bucket is allocated with the same address and is linked after the full data bucket. This method is also known as overflow chaining. For example, we have to insert a new record D3 into the tables. The static hash function generates the data bucket address 105, but this bucket is too full to store the new data. In this case a new data bucket is added at the end of data bucket 105 and is linked to it. Then the new record D3 is inserted into the new bucket.
• Quadratic probing: Quadratic probing is very similar to open hashing or linear probing. The only difference is that in linear probing the gap between the old and new bucket is linear, whereas here a quadratic function is used to determine the new bucket address.
• Double Hashing: Double hashing is another method similar to linear probing. Here the difference is fixed, as in linear probing, but this fixed difference is calculated by using another hash function. That is why it is called double hashing.
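The overflow-handling methods differ only in where the next candidate bucket is looked for. The Python sketch below is an illustration under stated assumptions: the table size of 11, the keys, the second hash function, and all function names are chosen for this example and do not come from the text.

```python
TABLE_SIZE = 11  # assumed number of buckets (a prime keeps the probes spread out)


def linear_probe(key, attempt):
    return (key + attempt) % TABLE_SIZE


def quadratic_probe(key, attempt):
    return (key + attempt * attempt) % TABLE_SIZE


def double_hash_probe(key, attempt):
    step = 1 + key % (TABLE_SIZE - 1)      # second hash function, never zero
    return (key + attempt * step) % TABLE_SIZE


def insert_open(table, key, probe):
    """Open hashing in the text's sense: probe until a free bucket is found."""
    for attempt in range(TABLE_SIZE):
        slot = probe(key, attempt)
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free bucket found along the probe sequence")


def place_all(keys, probe):
    table = [None] * TABLE_SIZE
    return [insert_open(table, key, probe) for key in keys]


def insert_chained(chains, key):
    """Closed hashing (overflow chaining): link the record to its home bucket."""
    chains[key % TABLE_SIZE].append(key)


keys = [105, 116, 127]                     # all three hash to bucket 6
linear_slots = place_all(keys, linear_probe)
quadratic_slots = place_all(keys, quadratic_probe)
double_slots = place_all(keys, double_hash_probe)

chains = [[] for _ in range(TABLE_SIZE)]
for key in keys:
    insert_chained(chains, key)

print(linear_slots, quadratic_slots, double_slots, chains[6])
```

With probing, colliding records end up in other buckets of the same table; with chaining they stay attached to their home bucket.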

Dynamic Hashing –

The drawback of static hashing is that it does not expand or shrink dynamically as the size of the database grows or shrinks. In dynamic hashing, data buckets grow or shrink (are added or removed dynamically) as the number of records increases or decreases. Dynamic hashing is also known as extended hashing. In dynamic hashing, the hash function is made to produce a large number of values. For example, there are three data records D1, D2 and D3. The hash function generates the three addresses 1001, 0101 and 1010 respectively. This method of storing considers only part of each address, in this case only the first bit. So it tries to load the three records at addresses 0 and 1.

But the problem is that no bucket address remains for D3. The bucket has to grow dynamically to accommodate D3. So it changes the address to have 2 bits rather than 1 bit, and then it updates the existing data to have 2-bit addresses. Then it tries to accommodate D3.

Advantages and Disadvantages of DBMS
20 February 2023
10:16

Advantages of DBMS
The use of a database management system, or DBMS, to store and manage data has several advantages:
Improves the effectiveness of data exchange
With a DBMS, data can be exchanged between users more effectively, and access to the data can be restricted so that only authorized users are permitted to view it, as opposed to earlier systems in which everyone with access to the system could access the data. We can manage the data more easily in a DBMS.
Heightens Data Protection
Data is one of the most precious resources in the modern world, so the need for data protection becomes more critical. A large number of people having access to the database raises the likelihood that the data may be compromised. The database management system can provide a security layout: the database administrator places limits on access to the information, and only users with the required permissions are able to access the data. Although this does not guarantee total security, it does offer a solid security design.
Safeguarding Data Integrity
When many users are given database access, it is essential to offer specific capabilities, such as executing numerous transactions and allowing continuous access to the data. Maintaining the accuracy of the information is essential to prevent data loss when numerous users attempt to alter the same piece of data at the same time. Data redundancy is reduced by the normalized format in which the data is kept, which also lessens discrepancies in the data. Inside a database, the entire set of data is kept in a single file, as opposed to a file system in which it is spread across numerous directories, files, and folders.
Enhance the Process of Decision-Making
It is considerably simpler to study the data because the DBMS presents it in a more organized format with rows and columns. We can reach certain conclusions by running straightforward database queries. Constraints that must be followed when storing data in the database improve data quality, which in turn improves decision-making. The productivity and utility of the data improve dramatically as a result.
Recovery and Back-up
As described before, data is the most precious resource for an organization; therefore, data preservation is just as critical as data protection. By performing regular backups using a DBMS, a user can store the most recent data on a drive or in the cloud. If the data is deleted from the system, the user can use the restore function to retrieve it from the drive or the cloud.

Disadvantages of DBMS
Although a DBMS provides a lot of benefits, it also has a lot of drawbacks. A DBMS has the following drawbacks:
Specifications for Hardware and Software
A system with a high configuration is needed to operate the DBMS effectively, so we unavoidably need hardware that performs well. As this hardware and the license for the software are relatively pricey, the cost of development rises. On your local system, they also take up comparatively more room. These systems also need upkeep.
Management scope and complexity
Due to the large range of functions it offers, the database project's scalability is increased. It supports GUIs for creating a user interface, and it may be used in conjunction with other powerful software. But all of these implementations increase the complexity of the system as a whole, and the process becomes highly complicated as a result. We need to know SQL and other languages to maintain the data and operate the database.
Huge Dimensions
For database management software to work correctly, a lot of disk space is needed. It needs extra software, and that software also needs storage space. Gigabytes of space may be needed for the whole DBMS configuration.
Regular updates
A DBMS prompts for updates frequently because it is regularly updated with new functionality and bug fixes. A new update may occasionally include features that the user does not require and may even alter the way a previous feature functions. The database administrator must be informed of these new features in configuration and should be aware of modifications to implementation. Some upgraded versions might need a machine with higher specifications to function properly. These upgrades could also be very expensive. DBMS use involves regular replacement phases.
Productivity
The productivity of complex procedures may increase thanks to the DBMS, but simple processes are also made more difficult.
Failure has an enormous effect
As was previously said, the DBMS stores all data together in one place. Therefore, if there is a problem with that file, it could affect all of the other processes as well and bring everything to a total halt.

Pasted from <https://www.javatpoint.com/advantages-and-disadvantages-of-dbms>

Distributed DBMS
20 February 2023
10:44

Distributed Database System


A distributed database is basically a database that is not limited to one system; it is spread over different sites, i.e., on multiple computers or over a network of computers. A distributed database system is located on various sites that don’t share physical components. This may be required when a particular database needs to be accessed by various users globally. It needs to be managed such that to the users it looks like one single database.
Types:
1. Homogeneous Database:
In a homogeneous database, all the different sites store the database identically. The operating system, the database management system, and the data structures used are all the same at all sites. Hence, they’re easy to manage.
2. Heterogeneous Database:
In a heterogeneous distributed database, different sites can use different schemas and software, which can lead to problems in query processing and transactions. Also, a particular site might be completely unaware of the other sites. Different computers may use different operating systems and different database applications. They may even use different data models for the database. Hence, translations are required for different sites to communicate.

Distributed Data Storage :


There are 2 ways in which data can be stored on different sites. These are:
1. Replication –
In this approach, the entire relation is stored redundantly at 2 or more sites. If the entire database is available at all sites, it is a fully redundant database. Hence, in replication, systems maintain copies of data.
This is advantageous as it increases the availability of data at different sites. Also, query requests can now be processed in parallel.
However, it has certain disadvantages as well. Data needs to be constantly updated. Any change made at one site needs to be recorded at every site where that relation is stored, or else it may lead to inconsistency. This is a lot of overhead. Also, concurrency control becomes way more complex, as concurrent access now needs to be checked over a number of sites.
2. Fragmentation –
In this approach, the relations are fragmented (i.e., they’re divided into smaller parts) and each of the fragments is stored at the different sites where it is required. It must be made sure that the fragments are such that they can be used to reconstruct the original relation (i.e., there isn’t any loss of data).
Fragmentation is advantageous because it doesn’t create copies of data, so consistency is not a problem.

Fragmentation of relations can be done in two ways:

• Horizontal fragmentation – Splitting by rows –
The relation is fragmented into groups of tuples so that each tuple is assigned to at least one fragment.
• Vertical fragmentation – Splitting by columns –
The schema of the relation is divided into smaller schemas. Each fragment must contain a common candidate key so as to ensure a lossless join.
In certain cases, an approach that is a hybrid of fragmentation and replication is used.
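Both kinds of fragmentation, and the requirement that the original relation can be reconstructed, can be sketched in Python. The loan relation, the split by branch, and the choice of loan_number as the common candidate key are assumptions made for this illustration.

```python
# Illustrative relation loan(loan_number, branch_name, amount).
loan = [
    {"loan_number": "L01", "branch_name": "Main", "amount": 200},
    {"loan_number": "L03", "branch_name": "Main", "amount": 150},
    {"loan_number": "L10", "branch_name": "Sub", "amount": 90},
    {"loan_number": "L08", "branch_name": "Main", "amount": 60},
]

# Horizontal fragmentation: split by rows, here one fragment per site.
main_site_rows = [row for row in loan if row["branch_name"] == "Main"]
sub_site_rows = [row for row in loan if row["branch_name"] != "Main"]

# Reconstruction is the union of the horizontal fragments.
union = sorted(main_site_rows + sub_site_rows, key=lambda row: row["loan_number"])

# Vertical fragmentation: split by columns; both fragments keep the key.
branch_fragment = [{"loan_number": r["loan_number"], "branch_name": r["branch_name"]}
                   for r in loan]
amount_fragment = [{"loan_number": r["loan_number"], "amount": r["amount"]}
                   for r in loan]

# Reconstruction is a join of the vertical fragments on the common key.
amount_by_key = {r["loan_number"]: r["amount"] for r in amount_fragment}
joined = [{**r, "amount": amount_by_key[r["loan_number"]]} for r in branch_fragment]

print(union == sorted(loan, key=lambda row: row["loan_number"]))
print(joined == loan)
```

If the key column were dropped from one of the vertical fragments, the join could no longer pair each branch name with its amount, which is the loss the lossless-join condition guards against.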
Applications of Distributed Database:
• It is used in corporate management information systems.
• It is used in multimedia applications.
• It is used in military control systems, hotel chains, etc.
• It is also used in manufacturing control systems.

Timestamp based Concurrency Control
20 February 2023
10:46

Timestamp based Concurrency Control


Concurrency control can be implemented in different ways. One way is to use locks. Now, let us discuss the Timestamp Ordering Protocol.
As introduced earlier, a timestamp is a unique identifier created by the DBMS to identify a transaction. Timestamps are usually assigned in the order in which transactions are submitted to the system. The timestamp of a transaction T is written TS(T).
Timestamp Ordering Protocol –
The main idea of this protocol is to order the transactions based on their timestamps. A schedule in which the transactions participate is then serializable, and the only equivalent serial schedule permitted has the transactions in the order of their timestamp values. Stated simply, the schedule is equivalent to the particular serial order corresponding to the order of the transaction timestamps. An algorithm must ensure that, for each item accessed by conflicting operations in the schedule, the order in which the item is accessed does not violate the timestamp ordering. To ensure this, two timestamp values are kept for each database item X.
• W_TS(X) is the largest timestamp of any transaction that executed write(X) successfully.
• R_TS(X) is the largest timestamp of any transaction that executed read(X) successfully.
Basic Timestamp Ordering –
Every transaction is issued a timestamp based on when it enters the system. If an old transaction Ti has timestamp TS(Ti), a new transaction Tj is assigned a timestamp TS(Tj) such that TS(Ti) < TS(Tj). The protocol manages concurrent execution such that the timestamps determine the serializability order. The timestamp ordering protocol ensures that any conflicting read and write operations are executed in timestamp order. Whenever some transaction T tries to issue a R_item(X) or a W_item(X), the Basic TO algorithm compares the timestamp of T with R_TS(X) and W_TS(X) to ensure that the timestamp order is not violated. The Basic TO protocol handles the following two cases.
1. Whenever a transaction T issues a W_item(X) operation, check the following conditions:
• If R_TS(X) > TS(T) or W_TS(X) > TS(T), then abort and roll back T and reject the operation; else,
• Execute the W_item(X) operation of T and set W_TS(X) to TS(T).
2. Whenever a transaction T issues a R_item(X) operation, check the following conditions:
• If W_TS(X) > TS(T), then abort and roll back T and reject the operation; else,
• If W_TS(X) <= TS(T), then execute the R_item(X) operation of T and set R_TS(X) to the larger of TS(T) and the current R_TS(X).
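The two cases above can be sketched in Python. This is a minimal illustration, not a full scheduler: the Item class, the Abort exception, the attempt helper, and the sample timestamps are assumptions, and rollback is reduced to raising an exception.

```python
class Abort(Exception):
    """Raised when an operation would violate the timestamp order."""


class Item:
    """A database item X with its two timestamps R_TS(X) and W_TS(X)."""

    def __init__(self):
        self.r_ts = 0
        self.w_ts = 0


def write_item(item, ts):
    # Case 1: a younger transaction has already read or written X.
    if item.r_ts > ts or item.w_ts > ts:
        raise Abort("write rejected")
    item.w_ts = ts


def read_item(item, ts):
    # Case 2: a younger transaction has already written X.
    if item.w_ts > ts:
        raise Abort("read rejected")
    item.r_ts = max(item.r_ts, ts)


def attempt(operation, item, ts):
    try:
        operation(item, ts)
        return "ok"
    except Abort:
        return "abort"


X = Item()
outcomes = [
    attempt(read_item, X, 10),    # ok, R_TS(X) becomes 10
    attempt(write_item, X, 5),    # abort, because R_TS(X) = 10 > 5
    attempt(write_item, X, 12),   # ok, W_TS(X) becomes 12
    attempt(read_item, X, 11),    # abort, because W_TS(X) = 12 > 11
    attempt(read_item, X, 15),    # ok, R_TS(X) becomes 15
]
print(outcomes, X.r_ts, X.w_ts)
```

No operation ever waits in this sketch; it either runs immediately or causes an abort.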

Whenever the Basic TO algorithm detects two conflicting operations that occur in an incorrect order, it rejects the later of the two operations by aborting the transaction that issued it. Schedules produced by Basic TO are guaranteed to be conflict serializable. Using timestamps also ensures that the schedule is deadlock free.
One drawback of the Basic TO protocol is that cascading rollback is still possible. Suppose we have transactions T1 and T2, and T2 has used a value written by T1. If T1 is aborted and resubmitted to the system, then T2 must also be aborted and rolled back. So the problem of cascading aborts still prevails.
To summarize the advantages and disadvantages of the Basic TO protocol:

• Timestamp Ordering protocol ensures serializability since the precedence graph will be of the form:

Image – Precedence Graph for TS ordering


• Timestamp protocol ensures freedom from deadlock as no transaction ever waits.
• But the schedule may not be cascade free, and may not even be recoverable.
Strict Timestamp Ordering –

A variation of Basic TO called Strict TO ensures that schedules are both strict and conflict serializable. In this variation, a transaction T that issues a R_item(X) or W_item(X) such that TS(T) > W_TS(X) has its read or write operation delayed until the transaction T′ that wrote the value of X has committed or aborted.

LDAP
20 February 2023
10:51

Lightweight Directory Access Protocol (LDAP)


Lightweight Directory Access Protocol (LDAP) is an internet protocol that works over TCP/IP and is used to access information from directories. LDAP is commonly used to access an active directory.
Features of LDAP:
1. The functional model of LDAP is simpler because it omits duplicate, rarely used, and esoteric features.
2. It is easier to understand and implement.
3. It uses strings to represent data.
Directories:
A directory is a set of objects with similar attributes, organised in a logical and hierarchical manner, for example a telephone directory. It is a distributed database application used to manage attributes in a directory.

LDAP defines operations for accessing and modifying directory entries such as:
• Searching for user specified criteria
• Adding an entry
• Deleting an entry
• Modifying an entry
• Modifying the distinguished name or relative distinguished name of an entry
• Comparing an entry
LDAP Models:
LDAP can be explained by using the four models upon which it is based:
1. Information Model:
This model describes the structure of information stored in an LDAP directory. The basic unit of information stored in the directory is called an entry. Entries represent objects of interest in the real world such as people, servers, organizations, etc. Entries contain a collection of attributes that hold information about the object. Every attribute has a type and one or more values. The type of an attribute is associated with a syntax, and the syntax specifies what kind of values can be stored.
2. Naming Model:
This model describes how information in an LDAP directory is organized and identified. Entries are organized in a tree-like structure called the Directory Information Tree (DIT). Entries are arranged within the DIT based on their distinguished name (DN). A DN is a unique name that unambiguously identifies a single entry.
3. Functional Model:
LDAP defines operations for accessing and modifying directory entries. Here the LDAP operations are discussed in a programming-language-independent manner. LDAP operations can be divided into the following categories:
• Query
• Update
• Authentication
4. Security Model:
This model describes how information in an LDAP directory can be protected from unauthorized access. It is based on the BIND operation. Several different bind operations can be performed.
LDAP Client and Server Interaction:
It is quite similar to any other client-server interaction. The client performs protocol functions against the server. The interaction takes place as follows:

1. A protocol request is sent to the server by the client.
2. The server performs operations on the directory such as search, update, delete, etc.
3. The response is sent back to the client.
Directory services from Microsoft, OpenLDAP, Sun, etc. can easily be set up as an LDAP server. If the user does not want to install a directory service but wants to use LDAP instructions against an available LDAP server, then the user can use four11, bigfoot, etc. Making an LDAP client is quite simple, as there are SDKs in many programming languages such as C, C++, Perl, Java, etc.
The user has to perform certain tasks to become an LDAP client:
(i) Get the SDK for your language
(ii) Use the functions of the SDK to connect to LDAP
(iii) Operate on LDAP
LDAP functions / operations:
• (a) For Authentication:
It includes the bind, unbind, and abandon operations, used to connect to and disconnect from an LDAP server, establish access rights, and protect information. In authentication, the client session is established and ended using the functions
-> BIND/UNBIND
-> Abandon
• (b) For Query:
It includes the search and compare operations, used to retrieve information from a directory. In query, the server performs the action using the functions
-> Search
-> Compare Entry
• (c) For Update:
It includes the add, delete, modify, and modify RDN operations, used to update stored information in a directory. In update, we can make changes in directories by using the functions
-> Add an entry
-> Delete an entry
-> Modify an entry
• The client establishes a session with the server (BIND) using the hostname or IP address and the port number. For security purposes, the user sets up user-ID and password based authentication.
• The server performs operations such as read, update, search, etc.
• The client ends the session using the UNBIND or Abandon function.
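The session flow above (bind, operate, unbind) can be illustrated with a toy in-memory directory in Python. This is not a real LDAP client and uses no LDAP SDK: the ToyDirectory class, its method names, the DNs, and the password are all hypothetical and exist only to show how the operation groups fit together.

```python
class ToyDirectory:
    """In-memory stand-in for an LDAP server (hypothetical, for illustration)."""

    def __init__(self, credentials):
        self.credentials = credentials      # user DN -> password
        self.entries = {}                   # DN -> {attribute: [values]}
        self.bound_as = None

    # Authentication
    def bind(self, user_dn, password):
        ok = user_dn in self.credentials and self.credentials[user_dn] == password
        self.bound_as = user_dn if ok else None
        return ok

    def unbind(self):
        self.bound_as = None

    def _check_session(self):
        if self.bound_as is None:
            raise PermissionError("no session: call bind() first")

    # Update
    def add(self, dn, attributes):
        self._check_session()
        self.entries[dn] = {name: list(vals) for name, vals in attributes.items()}

    def delete(self, dn):
        self._check_session()
        del self.entries[dn]

    def modify(self, dn, attribute, values):
        self._check_session()
        self.entries[dn][attribute] = list(values)

    # Query
    def search(self, base_dn, attribute, value):
        self._check_session()
        return sorted(dn for dn, attrs in self.entries.items()
                      if dn.endswith(base_dn) and value in attrs.get(attribute, []))

    def compare(self, dn, attribute, value):
        self._check_session()
        return value in self.entries[dn].get(attribute, [])


ADMIN = "cn=admin,dc=example,dc=com"
PEOPLE = "ou=people,dc=example,dc=com"

directory = ToyDirectory({ADMIN: "secret"})
session_ok = directory.bind(ADMIN, "secret")
directory.add("cn=Ritu," + PEOPLE, {"cn": ["Ritu"], "mail": ["ritu@example.com"]})
directory.add("cn=Soumya," + PEOPLE, {"cn": ["Soumya"], "mail": ["soumya@example.com"]})
found = directory.search(PEOPLE, "cn", "Ritu")
mail_matches = directory.compare("cn=Ritu," + PEOPLE, "mail", "ritu@example.com")
directory.modify("cn=Ritu," + PEOPLE, "mail", ["ritu@example.org"])
directory.delete("cn=Soumya," + PEOPLE)
remaining = sorted(directory.entries)
directory.unbind()
print(session_ok, found, mail_matches, remaining)
```

Each entry is identified by its DN, and every attribute holds a list of values, mirroring the information and naming models described earlier.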
Advantages of LDAP:
• Data present in LDAP is available to many clients and libraries.
• LDAP supports many types of applications.
• LDAP is very general and has basic security.
Disadvantages of LDAP:
It does not handle relational databases well.

What is Data Manipulation?


Data manipulation is the method of organizing data to make it easier to read or more structured. For instance, a collection of any kind of data could be organized in alphabetical order so that it can be understood easily. On the other hand, it can be difficult to find information about any particular employee in an organization if the employees' information is not organized. Therefore, all the employees' information could be organized in alphabetical order, which makes it easier to find the information of any individual employee.
Data manipulation helps website owners to monitor their sources of traffic and their most popular pages. Hence, it is frequently used on web server logs.
Data manipulation is also used in accounting and similar fields to organize data in order to figure out product costs, future tax obligations, pricing patterns, etc. It also helps stock market predictors to forecast developments and predict how stocks might perform in the near future. Furthermore, data manipulation may also be used by computers to display information to users in a more realistic way on the basis of web pages, the code in a software program, or data formatting.
Data is manipulated using DML, which is a programming language. DML is short for Data Manipulation Language, and it helps to modify data, for example by adding, removing, and altering records in databases. It means changing the information so that it can be read easily.
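As a concrete illustration of DML, the sketch below runs INSERT, UPDATE, DELETE, and SELECT statements through Python's built-in sqlite3 module against an in-memory database. The employee table, its columns, and the sample rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Setup (DDL, not DML): create a hypothetical employee table.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)

# INSERT: add records.
conn.executemany(
    "INSERT INTO employee (id, name, salary) VALUES (?, ?, ?)",
    [(1, "Ritu", 500), (2, "Debomit", 400), (3, "Soumya", 450)],
)

# UPDATE: alter an existing record.
conn.execute("UPDATE employee SET salary = ? WHERE name = ?", (550, "Ritu"))

# DELETE: remove a record.
conn.execute("DELETE FROM employee WHERE name = ?", ("Debomit",))

# SELECT: read the data back, organized in alphabetical order.
rows = conn.execute("SELECT name, salary FROM employee ORDER BY name").fetchall()
conn.close()

print(rows)
```

The ORDER BY clause in the final query is the alphabetical organization mentioned in the employee example above.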
Objective of Data Manipulation
Data manipulation is a key feature for business operations and optimization. You need to deal with data in a proper manner and manipulate it into meaningful information, for example through trend analysis, financial data, and consumer behavior. Data manipulation offers an organization multiple advantages; some are discussed below:
• Consistent data: Data manipulation provides a way to organize your data in a consistent format that makes it structured, so that it can be read easily and understood better. When you collect data from different sources, you may not have a unified view; data manipulation ensures that the data is well organized, structured, and stored consistently.
• Project data: Especially when it comes to finances, data manipulation is useful, as it helps to provide more in-depth analysis by using historical data to project the future.
• Delete or neglect redundant data: Data manipulation helps to maintain your data and delete the unusable data that is always present.
• Overall, you can perform many operations on the data, such as edit, delete, update, convert, and incorporate data into a database. This helps to create more value from the data. If you do not know how to use data in an effective manner, it becomes pointless. Therefore, being able to organize your data accordingly helps you make better business decisions.
Steps involved in Data Manipulation
Some important steps that may help you get started with data manipulation are given below.
1. First of all, data manipulation is possible only if you have data. Therefore, you are required to create a database that is generated from data sources.
2. This data needs restructuring and reorganization, which can be done with data manipulation that helps you to cleanse the information.
3. Then, you need to import and create a database to start working with the data.
4. With the help of data manipulation, you can edit, delete, merge, or combine your information.
5. Finally, data analysis becomes easier once the data has been manipulated.
Why use data manipulation?
Manipulating data is important for improving the growth of any business or organization. Manipulation of data helps to use the information properly by organizing the raw data in a structured way, which is crucial for boosting productivity, trend analysis, cost control, analyzing customer behavior, etc. Some examples of the benefits that describe the need for data manipulation are given below.
Format consistency
Data manipulation offers a way to organize data in a unified format, which helps C-suite members to understand business intelligence better. Data collected from various sources can be unstructured, whereas DML (Data Manipulation Language) allows data to be consistently organized and more transparent.
Historical overview
The manipulation of data can help you make the right decisions by providing easy access to data related to your previous work. It can also help with required team size, budget allocation, and deadline projections.
Efficiency
The manipulation of data provides efficiency in terms of collecting organized data or meaningful information. You may not be aware whether findings interfere or are redundant, whether information is relevant or not, or whether metrics have a low or significant impact. DML offers you the benefit of isolating and identifying these facts quickly.
We also see data manipulation in daily life: if you are receiving calls from telemarketers, getting targeted ads on the websites you visit, or receiving emails, it is all done through data manipulation. It also uses your online behavior to extract relevant data. For example, when you visit a website, share your email address with the site, and agree to its terms and conditions, it will track your behavior and likely generate relevant data for you.
Data manipulation tips
One of the widely used tools for data manipulation is Microsoft Excel. Below there are some tips to work on this tool.
• Formulas and functions: In Excel, you can use essential math functions easily to modify your data through desired values, such
as Addition, subtraction, multiplication, and division. However, you should know that how to use these basic math functions i n
Excel.
• Autofill: This feature is useful when you want to apply the same equation across multiple fields or cells without re-entering the
information from scratch. Without it, you have to retype the formula in every cell; with it, you drag the fill handle at the cell's
lower right corner to the last cell you want to fill. Users can therefore rely on this feature for efficiency throughout the
data manipulation process.
• Sort and Filter: When you are analyzing data and need to find specific records, you can save a lot of time by using the
sorting and filtering options. They help to isolate the information you wish to see.
• Removing duplicates: When you are collecting and assimilating data, the same sets of information often appear more than once. In
Excel, you can delete duplicate spreadsheet entries easily by using the Remove Duplicates feature.
• Combining columns: To paint a clearer picture, you can merge columns or rows in Excel or use other means to organize your data.
Splitting and merging columns or rows helps ensure that the most relevant cells are immediately visible.
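The same operations are not specific to Excel. The following is a minimal sketch of the tips above in plain Python, assuming a small, hypothetical list of order rows; the column names and the threshold of 100 are invented for illustration.

```python
# Hypothetical order rows, as they might appear in a spreadsheet.
rows = [
    {"first": "Asha", "last": "Rao", "qty": 2, "unit_price": 50},
    {"first": "Ben", "last": "Cole", "qty": 1, "unit_price": 300},
    {"first": "Asha", "last": "Rao", "qty": 2, "unit_price": 50},  # duplicate entry
    {"first": "Dev", "last": "Shah", "qty": 4, "unit_price": 20},
]

# Removing duplicates: keep the first occurrence of each identical row.
seen = set()
unique_rows = []
for row in rows:
    key = tuple(sorted(row.items()))
    if key not in seen:
        seen.add(key)
        unique_rows.append(row)

# Formulas and Autofill: apply the same formula (qty * unit_price) to every row.
# Combining columns: merge first and last name into a single column.
table = [
    {
        "customer": f"{row['first']} {row['last']}",
        "total": row["qty"] * row["unit_price"],
    }
    for row in unique_rows
]

# Sort: order the rows by total, largest first.
by_total = sorted(table, key=lambda r: r["total"], reverse=True)

# Filter: keep only the rows whose total is at least 100.
large_orders = [r for r in by_total if r["total"] >= 100]
```

Each step builds a new list and leaves `rows` untouched, which mirrors working on a copy of a sheet rather than overwriting the raw data.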
Difference between Data manipulation and Data modification
The terms data manipulation and data modification sound similar; however, they are not interchangeable. Generally, data manipulation is
the act of organizing data to make it easier to read or more refined. On the other hand, data modification is the process of changing
the existing data values or the data itself.
The two terms are easy to confuse, so here is an example that explains both. Take the value X=7. With data manipulation, it can be represented
as X=3+4, X=2+5, X=8-1, etc. With data modification, it can be changed to X=5.
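The X=7 example can be sketched with Python's built-in sqlite3 module. This follows the distinction drawn in these notes and is not a formal SQL classification; the table name `readings` and its columns are hypothetical.

```python
import sqlite3

# In-memory database holding a single hypothetical value, X = 7.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (name TEXT PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO readings VALUES ('X', 7)")


def stored_value():
    """Read back the value currently stored for X."""
    return conn.execute(
        "SELECT value FROM readings WHERE name = 'X'"
    ).fetchone()[0]


# Manipulation (as the term is used above): present the same value in
# different forms. The query only reads, so the stored data is untouched.
views = conn.execute(
    "SELECT value, 3 + 4, 2 + 5, 8 - 1 FROM readings WHERE name = 'X'"
).fetchone()
after_manipulation = stored_value()

# Modification: overwrite the stored value itself.
conn.execute("UPDATE readings SET value = 5 WHERE name = 'X'")
after_modification = stored_value()
```

Here every entry in `views` equals 7 and the stored value is still 7 after the query, whereas after the `UPDATE` the stored value is 5.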


videos
20 February 2023
14:06

B+ tree in ...

B tree inse...

B tree in d...

B tree in d...

DBMS - Ins...
