
DATABASE MANAGEMENT SYSTEMS

QUESTION BANK

DEPARTMENT OF COMPUTER SCIENCE ENGINEERING (CSE, AIML, DS)
(2023-2024)
SHORT QUESTION AND ANSWERS

UNIT-1

Question and answers:


1. What are the goals of DBMS?

A: A database system’s primary goal is to facilitate data retrieval and to provide a dependable storage platform for essential data. Database systems organize data in a structured way using predefined schemas and data models, which allows efficient storage and retrieval.
2. Explain about DML language and query processor.

A: Query Processor: It interprets the requests (queries) received from the end user via an application program into low-level instructions. It also executes the user request received from the DML compiler.
Data Manipulation Language (DML) is used to manipulate the data in the database through different commands. Using DML commands we can insert new data into a table, update existing data in a table, delete data from a table, and perform other operations on the data.
3. What is data model? Explain Relational Model and E-R model.

A: A Data Model in a Database Management System (DBMS) is a set of concepts and tools used to describe the structure of a database. Data models give us a transparent picture of the data, which helps us in creating the actual database: they carry the data from its design through to its proper implementation.
The Relational Model represents data and the relationships among data as a collection of tables (relations). The E-R (Entity-Relationship) Model represents data as a collection of basic objects called entities and the relationships among them.
4. Distinguish between Relational Model and E-R model.
A:
1. The ER model was developed by Peter Chen in 1976; the relational model was developed by E. F. Codd in 1970.
2. The ER model is a high-level or conceptual model; the relational model is a representational or implementation model.
3. The ER model is used by people who do not need to know how the database is implemented; the relational model is used by programmers.
4. The ER model represents a collection of entities and describes the relationships between them; the relational model represents data in the form of tables and describes the relationships between the tables.
5. The ER model consists of components like entity, entity type, and entity set; the relational model consists of components like domains, attributes, and tuples.
6. In the ER model it is easy to understand the relationships between entities; in the relational model it is less easy to derive the relationships between different tables.
7. The ER model describes cardinality; the relational model does not describe cardinality.
8. The ER model does not define data dependencies; the relational model defines dependencies in tables.
9. The ER model represents relationships as associations; the relational model represents relationships through join tables.
10. The ER model is more granular in terms of data representation; the relational model is less granular.
11. The ER model is more flexible than the relational model.
12. The ER model does not involve normalization; the relational model involves normalization.
13. The ER model is useful for initial planning and design; the relational model is useful for implementation and maintenance.
14. Popular languages and notations used with the ER model: Chen, UML, Crow's foot, Bachman, and others. With the relational model: SQL (e.g., in MySQL).

5. What is abstraction? What are the levels of Abstraction in a DBMS?

A: Data abstraction is the process of hiding irrelevant or unwanted details of the data from the end user.
Levels of Data Abstraction in DBMS

In DBMS, there are three levels of data abstraction:

o Physical or Internal Level
o Logical or Conceptual Level
o View or External Level

6. How to represent the strong entity set and weak entity set in the ER model?
A: A strong entity set is represented by a single rectangle, and its relationships by a single diamond. A weak entity set is represented by a double rectangle, and its identifying relationship by a double diamond; the partial key of a weak entity set is underlined with a dashed line.

7. Define Multi valued attribute.

A: Multivalued Attribute: An attribute which can have multiple values is known as a multivalued attribute. Multivalued attributes have multiple values for a single instance of an entity. It is represented by a double oval.
Ex: the phone numbers of a person, since one person can have several phone numbers.

8. What is cardinality? Write about the types of cardinality.

A: The number of times an entity of an entity set participates in a relationship set is known as cardinality.
Types of cardinality:
a) ONE-to-ONE relationship: A single record in table A is related to a single record in table B, and vice versa.
Ex: a person and their passport.
b) ONE-to-MANY relationship: Each record of table A can be related to one or more records of table B, but a record of table B can relate to only one record of table A.
Ex: a department and its many employees.
c) MANY-to-ONE relationship: Each record of table B can be related to one or more records of table A, but a record of table A can relate to only one record of table B.
Ex: many students enrolled in one course.
d) MANY-to-MANY relationship: Each record of table A can be related to one or more records of table B, and vice versa.
Ex: students and courses, where a student takes many courses and each course has many students.

9. Explain about Specialization.

A: In specialization, an entity is broken down into smaller sub-entities to simplify it further, and this is done on the basis of its characteristics. Inheritance also takes place in specialization: the sub-entities inherit the attributes of the higher-level entity.

Example of Specialization: a Person entity can be specialized into Student and Employee sub-entities.
10. Explain about Generalization and Specialization.

A: Generalization:
It works on the principle of a bottom-up approach. In generalization, lower-level entities are combined to form a higher-level entity, and this process can be repeated to form still higher-level entities. In the generalization process, common properties are drawn from the particular entities to create the generalized entity. In short, generalization combines subclasses to form a superclass.
Example of Generalization: Student and Employee entities can be generalized into a Person entity.
UNIT-2

Question and answers:


1. Define unique key.
A: A column or set of columns in a database system that uniquely
identifies each tuple in the table is called a unique key. Unique keys
can have one NULL value.
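For illustration, a minimal sketch using Python's built-in sqlite3 module (the table and data are invented for this example). A UNIQUE column accepts a NULL but rejects a duplicate value:

```python
import sqlite3

# In-memory database; `email` is declared as a unique key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll INTEGER PRIMARY KEY, email TEXT UNIQUE)")
conn.execute("INSERT INTO student VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO student VALUES (2, NULL)")        # NULL is accepted
try:
    conn.execute("INSERT INTO student VALUES (3, 'a@example.com')")  # duplicate
except sqlite3.IntegrityError as e:
    print("rejected:", e)       # the unique key blocks the duplicate value
```

Note that NULL handling in UNIQUE columns varies by DBMS (SQLite allows several NULLs, some systems only one).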
2. Distinguish between super key and Candidate key.
A:
1. A super key is an attribute (or set of attributes) used to uniquely identify all attributes in a relation; a candidate key is a minimal super key, i.e., a subset of a super key.
2. Not all super keys are candidate keys, but all candidate keys are super keys.
3. The super keys of a relation form the pool from which the candidate keys are selected; the candidate keys form the pool from which the primary key is selected.
4. In a relation, the number of super keys is greater than the number of candidate keys.
5. Super key attributes can contain NULL values; candidate key attributes can also contain NULL values.

3. Define integrity constraints.

A: Integrity constraints are a set of rules used to maintain the quality of information. Integrity constraints ensure that data insertion, updating, and other processes are performed in such a way that data integrity is not affected. Thus, integrity constraints guard against accidental damage to the database.
4. Define views and tables.
A: Views in SQL are a kind of virtual table. A view also has rows
and columns like tables, but a view doesn’t store data on the disk like
a table.
A table is a database object which is used to store data in relational
databases in the form of rows and columns. It actually stores the
data in DBMS. It is also known as a base table.

5. Define foreign key.

A: Foreign keys are a set of constraints in DBMS that establish relationships between tables and ensure the consistency and integrity of data. A foreign key is applied to a column of one table and references the primary key column of another table.
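A minimal sketch using Python's sqlite3 module (the `dept`/`emp` tables are invented for this example; SQLite enforces foreign keys only after the PRAGMA is enabled):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite-specific: turn on FK checks
conn.execute("CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT)")
conn.execute("""CREATE TABLE emp (
    eno INTEGER PRIMARY KEY,
    dno INTEGER REFERENCES dept(dno))""")  # emp.dno is a foreign key
conn.execute("INSERT INTO dept VALUES (10, 'CSE')")
conn.execute("INSERT INTO emp VALUES (1, 10)")      # valid: dept 10 exists
try:
    conn.execute("INSERT INTO emp VALUES (2, 99)")  # dept 99 absent
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # referential integrity is preserved
```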

6. What is a procedural query language?

A: In a procedural language, the program is written as a sequence of instructions: the user specifies both "what to do" and "how to do it" (a step-by-step procedure), and the instructions are executed in sequential order. In DBMS, relational algebra is a procedural query language, since a query spells out the sequence of operations to be applied.

Ex (general-purpose procedural languages): FORTRAN, COBOL, ALGOL, BASIC, C, and Pascal.

7. Explain about selection operations in relational algebra.

A: Select operation chooses the subset of tuples from the relation that
satisfies the given condition mentioned in the syntax of selection.
The selection operation is also known as horizontal partitioning since
it partitions the table or relation horizontally.
Syntax: σc(R), where c is the selection condition and R is the relation.
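In SQL, the selection operation corresponds to the WHERE clause. A small sketch using Python's sqlite3 module (table and data invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("Asha", 30000), ("Ravi", 50000), ("Mary", 70000)])

# sigma_{salary > 40000}(emp): keep only tuples satisfying the condition
rows = conn.execute("SELECT * FROM emp WHERE salary > 40000").fetchall()
print(rows)   # the horizontal subset of emp
```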
8. Explain key integrity constraint.
A: Key constraints
o A key is an attribute (or set of attributes) used to identify an entity within its entity set uniquely.
o An entity set can have multiple keys, but one of them is chosen as the primary key. A primary key must contain a unique value and cannot be NULL in the relational table.

Example:

9. Explain candidate key.


A: Candidate key: A Candidate key is a super key whose proper subset
is not a super key.
 Candidate key is minimal super key.
 Every Candidate key can be a super key.
 Every super key can’t be a candidate key.

10. Define cartesian product.


A: Cartesian product is used to combine each row in one table with
each row in the other table.
 It is also known as “cross product”.
 The symbol ‘✕’ is used to denote the CROSS PRODUCT operator.
Ex: Consider two relations STUDENT(SNO, FNAME, LNAME) and DETAIL(ROLLNO, AGE) below:

STUDENT
SNO  FNAME   LNAME
1    Albert  Singh
2    Nora    Fatehi

DETAIL
ROLLNO  AGE
5       18
9       21

On applying CROSS PRODUCT on STUDENT and DETAIL, STUDENT ✕ DETAIL:

SNO  FNAME   LNAME   ROLLNO  AGE
1    Albert  Singh   5       18
1    Albert  Singh   9       21
2    Nora    Fatehi  5       18
2    Nora    Fatehi  9       21
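The same example can be reproduced with SQL's CROSS JOIN, sketched here with Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sno INTEGER, fname TEXT, lname TEXT)")
conn.execute("CREATE TABLE detail (rollno INTEGER, age INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(1, "Albert", "Singh"), (2, "Nora", "Fatehi")])
conn.executemany("INSERT INTO detail VALUES (?, ?)", [(5, 18), (9, 21)])

# Every student row is paired with every detail row: 2 x 2 = 4 tuples.
rows = conn.execute("SELECT * FROM student CROSS JOIN detail").fetchall()
for r in rows:
    print(r)
```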
UNIT 3:

1. What are the parts of SQL language?


The SQL language has several parts: data definition language (DDL) and data manipulation language (DML).

2. What are the categories of SQL command?


SQL commands are divided into the following categories:
 Data definition language
 Data manipulation language
 Data query language
 Data control language
 Transaction control statements

3. What are the three classes of SQL expression?

An SQL expression consists of three clauses:
1. Select
2. From
3. Where

4. Give the general form of SQL query?

Select A1, A2, ..., An From R1, R2, ..., Rm Where P

5. What is the use of rename operation?


Rename operation is used to rename both relations and attributes. It uses the as clause, taking the form:
old-name as new-name

6. Define tuple variable?


Tuple variables are used for comparing two tuples in the same relation.
The tuple variables are defined in the from clause by way of the as
clause.

7. List the string operations supported by SQL?


 Pattern matching Operation
 Concatenation
 Extracting character strings
 Converting between uppercase and lower case letters.

8. List the set operations of SQL?


 Union
 Intersect operation
 The except operation
9. What is the use of Union and intersection operation?
Union: The result of this operation includes all tuples that are either in r1
or in r2 or in both r1 and r2.Duplicate tuples are automatically eliminated.
Intersection: The result of this relation includes all tuples that are in both r1
and r2.
10. What are aggregate functions? And list the aggregate functions supported by
SQL? Aggregate functions are functions that take a collection of values as
input and return a single value.
Aggregate functions supported by SQL are Average: avg, Minimum: min,
Maximum: max Total: sum, Count: count

11. What is the use of group by clause?


Group by clause is used to apply aggregate functions to a set of tuples. The
attributes given in the group by clause are used to form groups. Tuples with
the same value on all attributes in the group by clause are placed in one
group.
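The group by clause and the aggregate functions above can be demonstrated together; a sketch with Python's sqlite3 module (table and data invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("CSE", 40000), ("CSE", 60000), ("DS", 30000)])

# Tuples with the same dept value form one group; COUNT and AVG
# are then applied to each group separately.
rows = conn.execute(
    "SELECT dept, COUNT(*), AVG(salary) FROM emp GROUP BY dept ORDER BY dept"
).fetchall()
print(rows)
```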

12. What is the use of sub queries?


A subquery is a select-from-where expression that is nested within another query. A common use of subqueries is to perform tests for set membership, make set comparisons, and determine set cardinality.
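A sketch of a nested select-from-where expression, using Python's sqlite3 module (data invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("Asha", 30000), ("Ravi", 50000), ("Mary", 70000)])

# The inner query computes the average salary (50000); the outer query
# keeps only employees earning more than that.
rows = conn.execute(
    "SELECT name FROM emp WHERE salary > (SELECT AVG(salary) FROM emp)"
).fetchall()
print(rows)
```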

13. What is view in SQL? How is it defined?


Any relation that is not part of the logical model, but is made visible to a
user as a virtual relation is called a view.
We define a view in SQL by using the create view command, whose form is:
create view v as <query expression>
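A sketch of the create view command with Python's sqlite3 module (table, view name, and data invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [("Asha", "CSE", 30000), ("Ravi", "DS", 50000)])

# create view v as <query expression>
conn.execute(
    "CREATE VIEW cse_emp AS SELECT name, salary FROM emp WHERE dept = 'CSE'")

# The view is queried like a table, but it stores no data itself.
rows = conn.execute("SELECT * FROM cse_emp").fetchall()
print(rows)
```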

14. What is the use of with clause in SQL?


The with clause provides a way of defining a temporary view whose
definition is available only to the query in which the with clause occurs.

15. List the table modification commands in SQL?


 Deletion
 Insertion
 Updates
 Update of a view

16. List out the statements associated with a database transaction?


 Commit work
 Rollback work

17. What is transaction?


Transaction is a unit of program execution that accesses and possibly
updated various data items.

18. List the SQL domain Types?


SQL supports the following domain types.
1) Char(n) 2) varchar(n) 3) int 4) numeric(p,d) 5)
float(n) 6) date.

19. What is the use of integrity constraints?


Integrity constraints ensure that changes made to the database by authorized
users do not result in a loss of data consistency. Thus integrity constraints
guard against accidental damage to the database.
20. Name the various privileges in SQL?
 Delete
 Select
 Insert
 Update

21. Mention the various user privileges.


All privileges directly granted to the user or role.
All privileges granted to roles that have been granted to the user or role.

22. Give the limitations of SQL authorization.


 The code for checking authorization becomes intermixed with the rest
of the application code.
 Implementing authorization through application code rather than
specifying it declaratively in SQL makes it hard to ensure the absence of
loopholes.
23. What is meant by normalization of data?
o It is a process of analyzing the given relation schemas based on their Functional Dependencies (FDs) and primary keys to achieve the properties:
o Minimizing redundancy
o Minimizing insertion, deletion and updating anomalies.

24. What is meant by functional dependencies?


Consider a relation schema R with α ⊆ R and β ⊆ R. The functional dependency α → β holds on relation schema R if, in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β].
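This definition can be checked mechanically on a relation instance. A small Python sketch (the helper name `fd_holds` and the sample relation are invented for illustration):

```python
def fd_holds(rows, alpha, beta):
    """Check whether the FD alpha -> beta holds in the relation instance `rows`.

    rows:  list of dicts, one per tuple of the relation
    alpha: list of attribute names (the determinant)
    beta:  list of attribute names (the dependent attributes)
    """
    seen = {}
    for t in rows:
        key = tuple(t[a] for a in alpha)
        val = tuple(t[b] for b in beta)
        if key in seen and seen[key] != val:
            return False      # two tuples agree on alpha but differ on beta
        seen[key] = val
    return True

r = [{"sno": 1, "city": "Pune"},
     {"sno": 1, "city": "Pune"},
     {"sno": 2, "city": "Agra"}]
print(fd_holds(r, ["sno"], ["city"]))   # sno -> city holds in this instance
```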

25. What are the uses of functional dependencies?


 To test relations to see whether they are legal under a given set of
functional dependencies.
 To specify constraints on the set of legal relations.

26. Define Boyce codd normal form


A relation schema R is in BCNF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds: α → β is a trivial functional dependency, or α is a superkey for R.
27. List the disadvantages of relational database system


 Repetition of data
 Inability to represent certain information.

28. What is first normal form?


The domain of attribute must include only atomic (simple, indivisible) values.

29. Explain trivial dependency?


A functional dependency of the form α → β is trivial if β ⊆ α. Trivial functional dependencies are satisfied by all relations.

30. What are axioms?


Axioms or rules of inference provide a simpler technique for reasoning
about functional dependencies.

31. Define canonical cover?


A canonical cover Fc for F is a set of dependencies such that F logically implies all dependencies in Fc, and Fc logically implies all dependencies in F.

32. List the properties of canonical cover.


 Fc must have the following properties:
 No functional dependency in Fc contains an extraneous attribute.
 The left side of each functional dependency in Fc is unique.

33. What is meant by normalization of data?

It is a process of analysing the given relation schemas based


on their Functional Dependencies
(FDs) and primary key to achieve the properties
 Minimizing redundancy
 Minimizing insertion, deletion and updating anomalies

34. List out the desirable properties of decomposition.


Lossless-join decomposition
Dependency preservation
Repetition of information
35. What is 2NF?
A relation schema R is in 2NF if it is in 1NF and every non-prime attribute
A in R is fully functionally dependent on primary key.
36. Define instance.
An instance of a relation that satisfies all such real-world constraints is
called a legal instance of the relation; a legal instance of a database is one
where all the relation instances are legal instances.

37. What does Boyce-Codd Normal Form (BCNF) mean?

Boyce-Codd Normal Form (BCNF) is one of the forms of database


normalization.
A database table is in BCNF if and only if there are no non-trivial
functional dependencies of attributes on anything other than a
superset of a candidate key.
38. Define third normal form.

A relation schema R is in third normal form with respect to a set F of


functional dependencies if, for all functional dependencies in F + of the
form α→ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
 α→ β is a trivial functional dependency.
 α is a superkey for R.
 Each attribute A in β − α is contained in a candidate key for R.


44. What is meant by Multivalued Dependencies?

Multivalued dependencies do not rule out the existence of certain tuples; instead, they require that other tuples of a certain form be present in the relation. For this reason, functional dependencies are sometimes referred to as equality-generating dependencies, and multivalued dependencies as tuple-generating dependencies.
Unit 4

1. What is a transaction?
A transaction can be defined as a group of tasks that form a single
logical unit.
2. What does time to commit mean?
i. The COMMIT command is used to permanently save any transaction to the database.
ii. When we perform read or write operations on the database, those changes can be undone by a rollback operation. To make the changes permanent, we use COMMIT.
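A sketch of commit and rollback with Python's sqlite3 module (the `account` table and the simulated crash are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()                  # these rows are now permanent

def transfer(amount):
    conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'",
                 (amount,))
    raise RuntimeError("crash before the credit step")   # simulated failure
    # the matching credit to account B is never reached

try:
    transfer(50)
except RuntimeError:
    conn.rollback()            # undo the uncommitted debit

bal = conn.execute("SELECT balance FROM account WHERE name = 'A'").fetchone()[0]
print(bal)                     # the half-done transfer left no trace
```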
3. What are the various properties of transaction that the database system
maintains to ensure integrity of data. (OR) What are ACID properties?
In a database, each transaction should maintain ACID property to
meet the consistency and integrity of the database. These are (1)Atomicity
(2) Consistency (3) Isolation (4) Durability
4. Give the meaning of the expression ACID transaction.
The expression ACID transaction represents the transaction that
follows the ACID Properties.
5. State the atomicity property of a transaction.
This property states that each transaction must be considered as a
single unit and must be completed fully or not completed at all. No
transaction in the database is left half completed.
6. What is meant by concurrency control ?
A mechanism which ensures that the simultaneous execution of more than one transaction does not lead to any database inconsistency is called a concurrency control mechanism.
7. State the need for concurrency control. (OR) Why is it necessary to have
control of concurrent execution of transactions? How is it made
possible?
Following are the purposes of concurrency control-
a. To ensure isolation
b. To resolve read-write or write-write conflicts
c. To preserve consistency of database
8. List commonly used concurrency control techniques.
The commonly used concurrency control techniques are -
i) Lock
ii) Timestamp
iii) Snapshot Isolation
9. What is meant by serializability? How it is tested?
Serializability is the concept used to identify which non-serial schedules are equivalent to some serial schedule (and are therefore correct). It is tested using the precedence graph technique.
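The precedence graph test can be sketched in Python. The schedule format, triples of (transaction, operation, item), is an invented representation for illustration:

```python
def is_conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op in {'R', 'W'}.

    An edge Ti -> Tj is added when an operation of Ti conflicts with a
    later operation of Tj (same item, at least one write). The schedule
    is conflict-serializable iff the resulting graph has no cycle.
    """
    graph = {}
    for i, (ti, opi, xi) in enumerate(schedule):
        for tj, opj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (opi, opj):
                graph.setdefault(ti, []).append(tj)

    def cyclic(node, stack):
        if node in stack:
            return True
        return any(cyclic(n, stack | {node}) for n in graph.get(node, []))

    return not any(cyclic(n, set()) for n in graph)

# T1 reads A, T2 writes A, T1 writes A: edges T1->T2 and T2->T1, a cycle.
s1 = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
print(is_conflict_serializable(s1))   # not conflict-serializable
```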
10. What is a serial schedule?
The schedule in which the transactions execute one after the other is called a serial schedule. It is consistent in nature. For example, consider the following two transactions T1 and T2:

All the operations of transaction T1 on data items A and then B execute first, and then in transaction T2 all the operations on data items A and B execute. R stands for a read operation and W for a write operation.
11. When are two schedules conflict equivalent?
Two schedules are conflict equivalent if :
(1) They contain the same set of transactions.
(2) Every pair of conflicting actions is ordered in the same way.
For example -

Schedule S2 is a serial schedule because, in it, all operations of T1 are performed before any operation of T2 starts. Schedule S1 can be transformed into a serial schedule by swapping the non-conflicting operations of S1. Hence both of the above schedules are conflict equivalent.
12. Define two phase locking.
The two phase locking is a protocol in which there are two phases:
i) Growing Phase (Locking Phase): It is a phase in which the transaction
may obtain locks but does not release any lock.
ii) Shrinking Phase (Unlocking Phase): It is a phase in which the transaction
may release the locks but does not obtain any new lock.
13. What is the difference between shared lock and exclusive lock?
A shared (S) lock is taken to read a data item; several transactions can hold a shared lock on the same item at the same time. An exclusive (X) lock is taken to both read and write a data item; only one transaction can hold an exclusive lock on an item, and no other lock can coexist with it.
14. What type of lock is needed for insert and delete operations.
The exclusive lock is needed to insert and delete operations.
15. What benefit does strict two-phase locking provide? What disadvantages
result?
Benefits:
Strict two-phase locking ensures that any data written by an uncommitted transaction is locked in exclusive mode until the transaction commits, preventing other transactions from reading that data. This protocol solves the dirty read problem.
Disadvantage:
Concurrency is reduced.
16. What is rigorous two phase locking protocol ?
This is a stricter two-phase locking protocol, in which all locks are held until the transaction commits.
17. Differentiate strict two-phase locking and rigorous two-phase locking protocol.
(1) In the strict two-phase locking protocol, all exclusive-mode locks are held until the transaction commits.
(2) The rigorous two-phase locking protocol is stricter than the strict protocol: all locks (shared and exclusive) are held until the transaction commits.
18. Define deadlock.
Deadlock is a situation in which two or more transactions each hold a lock and wait for another lock currently held by one of the other transactions.
19. List four conditions for deadlock.
1. Mutual exclusion condition
2. Hold and wait condition
3. No preemption condition
4. Circular wait condition
20. Why is recovery needed?
(1) A recovery scheme can restore the database to the consistent state that existed before the failure.
(2) Because of the recovery mechanism, the database remains highly available to its users.
21. What are states of transaction?
Various states of transaction are - (1) Active, (2) Partially Committed (3)
Failed (4) Aborted (5) Committed.
22. What is meant by log based recovery?
Log is a most commonly used data structure for recording the
modifications that can be made to actual database.
Log based recovery is a technique in which a log of each transaction is
maintained in some stable storage so that if failure occurs then it can be
recovered from there.
23. List the responsibilities of a DBMS has whenever a transaction is
submitted to the system for execution.
The system is responsible for making sure that either (1) all the operations in the transaction are completed successfully and their effect is recorded permanently in the database, or (2) the transaction has no effect whatsoever on the database or on any other transaction.
24. Brief any two violations that may occur if a transaction executes a lower
isolation level than serializable.
(1) Dirty read: a transaction may read values written by another transaction that has not yet committed.
(2) Non-repeatable/phantom read: re-executing the same read within a transaction may return different values or a different set of rows.
Unit 5:

1. Give the comparison between ordered indices and hashing.


(1) If range of queries are common, ordered indices are to be used.
(2) The buckets containing records can be chained in sorted order in case of
ordered indices.
(3) Hashing is generally better at retrieving records having a specified value of
the key.
(4) Hash function assigns values randomly to buckets. Thus, there is no simple
notion of "next bucket in sorted order."
2.What are the causes of bucket overflow in a hash file organization?
Bucket overflow can occur for following reasons -
(1) Insufficient buckets: the number of buckets allocated is not enough to hold the total number of records.
(2) Skew: some buckets are assigned more records than others, so a bucket might overflow even while other buckets still have space. This situation is known as bucket skew.
3.What is the need for RAID?
RAID is a technology that is used to increase the performance.
• It is used for increased reliability of data storage.
• An array of multiple disks accessed in parallel will give greater throughput than a
single disk.
• With multiple disks and a suitable redundancy scheme, your system can stay up
and running when a disk fails, and even while the replacement disk is being
installed and its data restored.
4.Define Software and hardware RAID systems.
Hardware RAID: The hardware-based array manages the RAID subsystem independently from the host and presents a single disk per RAID array to the host.
Software RAID: Software RAID implements the various RAID levels in the kernel disk code. It offers the cheapest possible solution, as expensive disk controller cards are not required.
5. What are ordered indices?
This is type of indexing which is based on sorted ordering values. Various
ordered indices are primary indexing, secondary indexing.
6.What are the two types of ordered indices?
Two types of ordered indices are - Primary indexing and secondary
indexing. The primary indexing can be further classified into dense indexing and
sparse indexing and single level indexing and multilevel indexing.
7. What can be done to reduce the occurrences of bucket overflows in a hash file
organization?
(1) A bucket is a unit of storage containing one or more records (a bucket is
typically a disk block).
(2) The file blocks are divided into M equal-sized buckets, numbered bucket0,
bucket... bucketM-1. Typically, a bucket corresponds to one (or a fixed
number of) disk block.
(3) In a hash file organization we obtain the bucket of a record directly from
its search- key value using a hash function, h (K).
(4) To reduce overflow records, a hash file is typically kept 70-80% full.
(5) The hash function h should distribute the records uniformly among the
buckets; otherwise, search time will be increased because many overflow
records will exist.
8.Distinguish between dense and sparse indices.
1) Dense index:
• An index record appears for every search-key value in the file.
• This record contains the search-key value and a pointer to the actual record.
2) Sparse index:
• Index records are created only for some of the records.
• To locate a record, we find the index record with the largest search-key value less than or equal to the search-key value we are looking for.
• We start at the record pointed to by that index record and proceed along the pointers in the file (that is, sequentially) until we find the desired record.
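The sparse-index lookup procedure above can be sketched in Python (the blocks and keys are invented; `bisect_right` finds the largest index entry not exceeding the search key):

```python
import bisect

# Sorted data file, a few records per "block"; the sparse index stores only
# the first search-key value of each block plus a block pointer (its number).
blocks = [[(10, "r10"), (20, "r20")],
          [(30, "r30"), (40, "r40")],
          [(50, "r50"), (60, "r60")]]
index = [b[0][0] for b in blocks]        # [10, 30, 50]: one entry per block

def sparse_lookup(key):
    # Largest index entry <= key, then scan that block sequentially.
    i = bisect.bisect_right(index, key) - 1
    if i < 0:
        return None
    for k, rec in blocks[i]:
        if k == key:
            return rec
    return None

print(sparse_lookup(40))   # found by scanning within the second block
```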
9. When is it preferable to use a dense index rather than a sparse index?
Explain your answer.
1. It is preferable to use a dense index instead of a sparse index when the file is not
sorted on the indexed field.
2. Or when the index file is small compared to the size of memory.
10. How does a B-tree differ from a B+ tree? Why is a B+ tree usually preferred as an access structure to a data file?
In a B-tree, record pointers may appear in internal nodes as well as in leaves, and each search-key value appears only once. In a B+ tree, all record pointers are kept in the leaf nodes, internal nodes hold only search-key values to guide the search, and the leaf nodes are linked together. A B+ tree is usually preferred because every lookup travels the same height, the linked leaves support efficient range and sequential access, and internal nodes can hold more keys, reducing the height of the tree.

11. What are the disadvantages of B tree over B+ tree?


(1) Searching a key value is less uniform in a B-tree, because a record may be found at any level rather than always at a leaf.
(2) The leaf nodes are not linked together, so sequential (range) access to the data is inefficient.
12. Mention different hashing techniques.
Two types of hashing techniques are - i) Static hashing ii) Dynamic hashing.
13. List the mechanisms to avoid collision during hashing.
Collision Resolution techniques are: (1) Separate chaining
(2) Open addressing techniques: (i) Linear probing (ii) Quadratic probing
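Separate chaining can be illustrated with a small Python sketch (bucket count and keys invented; a real DBMS hash file chains overflow disk blocks rather than in-memory lists):

```python
class ChainedHash:
    """Separate chaining: each bucket holds a list (chain) of colliding keys."""

    def __init__(self, nbuckets=4):
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, key, value):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)    # key already present: update
                return
        chain.append((key, value))         # collision: extend the chain

    def lookup(self, key):
        for k, v in self.buckets[hash(key) % len(self.buckets)]:
            if k == key:
                return v
        return None

h = ChainedHash()
h.insert(5, "rec5")
h.insert(9, "rec9")    # 5 % 4 == 9 % 4 == 1: both land in bucket 1, chained
print(h.lookup(9))
```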
14. What is the basic difference between static hashing and dynamic hashing?
In static hashing, the number of buckets is fixed when the file is created, and the hash function always maps a key to the same fixed set of buckets. In dynamic hashing (for example, extendible hashing), the number of buckets grows or shrinks as records are inserted and deleted.
15. What is the need for query optimization?
Query optimization is required for fast execution of long-running, complex queries.
16. Which cost component are used most commonly as the basis for cost function.
Disk access or secondary storage access is considered most commonly as a basis
for cost function.
17. What is query execution plan?
To specify fully how to evaluate a query, we need not only to provide the
relational-algebra expression, but also to annotate it with instructions specifying
how to evaluate each operation. This annotated structure is called query execution
plan.
18. Mention all the operations of files.
Various file operations are: (1) creation of a file, (2) insertion of data, (3) deletion of data, (4) searching for desired data in the file.
19. Define dense index.
• An index record appears for every search key value in file.
• This record contains search key value and a pointer to the actual record.
• For example:
20. How do you represent leaf node of a B+ tree of order p?
To retrieve all the leaf pages efficiently we have to link them using page pointers.
The sequence of leaf pages is also called as sequence set.

21. What is the Indexed sequential access method (ISAM)?


ISAM (Indexed sequential access method) is an advanced sequential file
organization method. In this case, records are stored in the file with the help of
the primary key. For each primary key, an index value is created and mapped
to the record. This index contains the address of the record in the file.

If a record has to be obtained based on its index value, the data block’s
address is retrieved, and the record is retrieved from memory.
22. Describe the advantage of using ISAM?

 Because the index stores the address of each record's data block, finding a record in a large database is rapid and simple.
 Range retrieval and partial record retrieval are both supported by this
approach. We may obtain data for a specific range of values because the
index is based on primary key values. Similarly, the partial value can be
simply found, for example, in a student’s name that begins with the letter
‘JA’.

23. Explain the cons or disadvantages of ISAM.
• This approach requires additional disk space to hold the index entries.
• When new records are added, the file must be reorganized in order to maintain
the sequence.
• When a record is deleted, the space it occupied must be reclaimed; otherwise,
the database's performance will suffer.
LONG QUESTION AND ANSWERS
UNIT-1

1) What is DBMS? What are the advantages of DBMS?

DBMS stands for Database Management System.
Definition: A database management system is software used to perform different
operations on data, such as addition, access, updating, and deletion (for example,
adding your name as a customer to an online retail store's database). Examples:
MySQL, Oracle.
A database is a collection of related data which represents some aspect of the real
world. A database system is designed to be built and populated with data for a
certain task; for instance, a university database maintains information concerning
students, courses, and grades.
A management system is software for storing and retrieving users' data. Its main
advantages over plain file storage are reduced data redundancy, improved
consistency, multi-user access, security, and backup and recovery (see the file
system vs. DBMS comparison later in this unit).

2) Classify the data models.

Types of data model:
A database model defines the logical design and structure of a database: how data
will be stored, accessed, and updated in a database management system. While the
relational model is the most widely used database model, there are other models too.

1. Relational model: A collection of tables represents both the data and the
relationships among the data. In this model, data is organised in two-dimensional
tables, and a relationship is maintained by storing a common field in the related
tables.
This model was introduced by E.F. Codd in 1970, and since then it has been the
most widely used database model.
The basic structure of data in the relational model is the table. All the information
related to a particular type is stored in the rows of that table; hence, tables are also
known as relations in the relational model. In the coming tutorials we will learn
how to design tables, normalize them to reduce data redundancy, and use
Structured Query Language (SQL) to access data from tables.
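The "common field" idea above can be sketched with SQLite; the table and column names below are hypothetical, chosen only for illustration:

```python
import sqlite3

# Two relations linked by a common field, dept_id: the relationship is
# not stored explicitly, it is recovered by joining on that field.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dname TEXT)")
cur.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, sname TEXT, dept_id INTEGER)")
cur.execute("INSERT INTO department VALUES (1, 'CSE')")
cur.execute("INSERT INTO student VALUES (101, 'Ramu', 1)")

# Join the two tables on the common field to pair each student with a department.
cur.execute("""SELECT s.sname, d.dname
               FROM student s JOIN department d ON s.dept_id = d.dept_id""")
rows = cur.fetchall()
conn.close()
```

Here `rows` pairs each student with the department whose `dept_id` matches.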
2. Entity-Relationship model: The E-R model uses a collection of basic objects
called entities and the relationships among them.
In this model, each object of interest becomes an entity and its characteristics
become attributes; different entities are related using relationships.
E-R models represent these relationships in pictorial form, making the design
easier for different stakeholders to understand. This model is good for designing a
database, which can then be turned into tables in the relational model (explained
above).
For example, when designing a school database, Student would be an entity with
attributes such as name, age, and address. Since an address is generally complex, it
can be modelled as a separate entity with attributes street name, pincode, and city,
with a relationship between the two.
3. Object-based data model: Data is modelled as objects, as in object-oriented
programming languages such as Java, C++, and C#.

4. Hierarchical model: Data is represented in a tree format. It supports one-to-one
and one-to-many relationships, but it does not support many-to-many
relationships.
This model organises data into a tree-like structure with a single root, to which all
the other data is linked. The hierarchy starts from the root data and expands like a
tree, adding child nodes to the parent nodes; a child node can have only a single
parent node.
This model efficiently describes many real-world relationships, such as the index
of a book or recipes. Data is organised with a one-to-many relationship between
two different types of data; for example, one department can have many courses,
many professors, and of course many students.

5. Network model: Data is represented as nodes in a graph. This is an extension of
the hierarchical model: data is organised more like a graph, and a node is allowed
to have more than one parent node.
Because more relationships can be established, the data is more interconnected,
and accessing it is easier and faster. This model was used to map many-to-many
data relationships, and it was the most widely used database model before the
relational model was introduced.

3) Explain the database system structure (the database system environment) in
detail.

Structure of DBMS:
A DBMS is software that allows access to data stored in a database and provides
an easy and effective method of:
- defining the information
- storing the information
- manipulating the information
- protecting the information
The database system is divided into three components:
1. Query processor
2. Storage manager
3. Disk storage
1. Query processor:
It interprets the requests (queries) received from the end user via an application
program into instructions. It has the following components:
a) DML compiler: processes DML statements into low-level instructions.
b) DDL interpreter: processes DDL statements into a set of tables containing
metadata.
c) Embedded DML precompiler: converts DML statements embedded in an
application program into procedural calls.
d) Query optimizer: chooses an efficient evaluation plan for the instructions
generated by the DML compiler.

2. Storage manager:
It is a program that provides an interface between the data stored in the
database and the queries received. It is also known as the database control
system. It has the following components:
a) Authorization manager: ensures role-based access control, i.e., checks
whether a particular user is privileged to perform the requested operation.
b) Integrity manager: checks the integrity constraints when the database is
modified.
c) Transaction manager: controls concurrent access by scheduling the
operations of the transactions it receives.
d) File manager: manages the file space and the data structures used to
represent information in the database.
e) Buffer manager: handles the transfer of data between secondary storage
and main memory.
3. Disk storage:
a) Data files: store the data.
b) Data dictionary: contains information about the structure of every
database object.
c) Indices: provide faster retrieval of data items.
4) Describe the Entity-Relationship model thoroughly. Explain the basic
concepts of entity sets, relationship sets, and attributes in detail with
respective diagrams.

E-R model: It is a top-down approach to database design that is based on uniquely
identifiable objects.

Entities: An entity is an object, person, place, concept, or event in the user's
environment about which data is managed.
Ex: object: car, bike; person: employee, student; place: state, country;
concept: course, business.
There are two types of entities:
1. Strong entity: one that does not depend on other entities for its existence.
Ex: the chairman of a company does not depend on anyone for final decisions.
2. Weak entity: one that depends on other entities for its existence.
Ex: if an employee retires, we no longer need to store the details of his
dependents (children).
Entity set (entity type): A collection of similar entities is called an entity set.
Ex: a Student entity set contains the details of all students of a similar type.
Attributes: Entities are described by means of their properties, called attributes,
which are represented by ellipses in E-R diagrams.
Ex: a Student entity may have name, class, and age as attributes.
Types of attributes:
1. Simple attribute: an atomic value that cannot be divided further.
2. Composite attribute: an attribute that can be divided further, forming a
tree-like structure in which every node is connected to its sub-attributes.
3. Single-valued attribute: contains only a single value.
4. Multi-valued attribute: may contain more than one value.
5. Derived attribute: does not exist physically in the database; its value is
derived from other attributes present in the database.
Relationship: The association among entities is called a relationship.
Relationship set: A collection of relationships of a similar type is called a
relationship set. Like entities, a relationship can have attributes, called
descriptive attributes.
Degree of relationship: The number of participating entity sets in a relationship
defines the degree of the relationship:
1. Binary = degree 2
2. Ternary = degree 3
3. N-ary = degree n

Mapping cardinalities:
Cardinality defines the number of entities in one entity set that can be associated
with entities of another set via a relationship set.
1. One-to-one: an entity from entity set A can be associated with at most one
entity of entity set B, and vice versa.
2. One-to-many: an entity from entity set A can be associated with more than one
entity of entity set B, but an entity from B can be associated with at most one
entity from A.
3. Many-to-one: more than one entity from entity set A can be associated with at
most one entity of entity set B, but an entity from B can be associated with more
than one entity from A.
4. Many-to-many: an entity from A can be associated with more than one entity
from B, and vice versa.
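As a sketch of how these cardinalities are commonly realized in SQL (the schema below is hypothetical, not from the text): a one-to-many relationship becomes a foreign key on the "many" side, and a many-to-many relationship becomes a separate junction table with a composite key.

```python
import sqlite3

# dept 1:N course  -> foreign key dept_id in course
# student M:N course -> junction table enrolls with composite primary key
ddl = """
CREATE TABLE dept    (dept_id   INTEGER PRIMARY KEY);
CREATE TABLE course  (course_id INTEGER PRIMARY KEY,
                      dept_id   INTEGER REFERENCES dept);     -- one dept, many courses
CREATE TABLE student (sid       INTEGER PRIMARY KEY);
CREATE TABLE enrolls (sid       INTEGER REFERENCES student,   -- many-to-many between
                      course_id INTEGER REFERENCES course,    -- student and course
                      PRIMARY KEY (sid, course_id));
"""
conn = sqlite3.connect(":memory:")
conn.executescript(ddl)
tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'type' OR type = 'table'"))
conn.close()
```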
5) Develop an E-R diagram for a banking enterprise system.

The ER diagram of the bank has the following description:
• A bank has customers.
• Banks are identified by a name, code, and address of the main office.
• Banks have branches.
• Branches are identified by branch_no, branch_name, and address.
• Customers are identified by name, cust_id, phone number, and address.
• A customer can have one or more accounts.
• Accounts are identified by account_no, acc_type, and balance.
• A customer can avail loans.
• Loans are identified by loan_id, loan_type, and amount.
• Accounts and loans are related to a bank branch.

ER Diagram of Bank Management System:
This bank ER diagram illustrates key information about the bank, including
entities such as branches, customers, accounts, and loans, and allows us to
understand the relationships between the entities.
Entities and their attributes:
• Bank entity: Bank Name, Code, and Address. Code is the primary key for the
Bank entity.
• Customer entity: Customer_id, Name, Phone Number, and Address.
Customer_id is the primary key for the Customer entity.
• Branch entity: Branch_id, Name, and Address. Branch_id is the primary key
for the Branch entity.
• Account entity: Account_number, Account_Type, and Balance.
Account_number is the primary key for the Account entity.
• Loan entity: Loan_id, Loan_Type, and Amount. Loan_id is the primary key
for the Loan entity.

Relationships:
• Bank has Branches => 1 : N
One bank can have many branches, but one branch cannot belong to many
banks, so the relationship between Bank and Branch is one-to-many.
• Branch maintains Accounts => 1 : N
One branch can have many accounts, but one account cannot belong to many
branches, so the relationship between Branch and Account is one-to-many.
• Branch offers Loans => 1 : N
One branch can have many loans, but one loan cannot belong to many
branches, so the relationship between Branch and Loan is one-to-many.
• Account held by Customers => M : N
One customer can have more than one account, and one account can be held
by one or more customers, so the relationship between Account and Customer
is many-to-many.
• Loan availed by Customers => M : N
(Assume a loan can be held jointly by many customers.) One customer can
have more than one loan, and one loan can be availed by one or more
customers, so the relationship between Loan and Customer is many-to-many.

6) Difference between file system and DBMS

1. File system: a software system that manages and controls the data files in a
   computer system.
   DBMS: a software system for creating and managing databases; it provides a
   systematic way to access, update, and delete data.
2. File system: does not support multi-user access.
   DBMS: supports multi-user access.
3. File system: data consistency is lower.
   DBMS: data consistency is higher, due to the use of normalization.
4. File system: not secured.
   DBMS: highly secured.
5. File system: used for storing unstructured data.
   DBMS: used for storing structured data.
6. File system: data redundancy is high.
   DBMS: data redundancy is low.
7. File system: no data backup and recovery process is present.
   DBMS: provides backup and recovery of data.
8. File system: easy to handle.
   DBMS: complex to handle.
9. File system: costs less.
   DBMS: costs more than a file system.
10. File system: if one application fails, other applications in the system are
    unaffected.
    DBMS: if the database fails, all applications that depend on it are affected.
11. File system: data cannot easily be shared because it is distributed across
    different files.
    DBMS: data can be shared, as it is stored in one place in the database.
12. File system: does not provide a concurrency facility.
    DBMS: provides a concurrency facility.
13. File system examples: NTFS (New Technology File System), EXT
    (Extended File System), etc.
    DBMS examples: Oracle, MySQL, MS SQL Server, DB2, Microsoft Access,
    etc.

7) Discuss the activities of different database users and database
administrators.

A database administrator's (DBA) primary job is to ensure that data is available,
protected from loss and corruption, and easily accessible as needed. Below are
some of the chief responsibilities that make up the day-to-day work of a DBA.

1. Software installation and maintenance
A DBA often collaborates on the initial installation and configuration of a new
Oracle, SQL Server, or similar database. The system administrator sets up the
hardware and deploys the operating system for the database server; the DBA then
installs the database software and configures it for use. As updates and patches are
required, the DBA handles this ongoing maintenance, and if a new server is
needed, the DBA handles the transfer of data from the existing system to the new
platform.

2. Data extraction, transformation, and loading
Known as ETL, this refers to efficiently importing large volumes of data that have
been extracted from multiple systems into a data warehouse environment. The
external data is cleaned up and transformed to fit the desired format so that it can
be imported into a central repository.

3. Specialised data handling
Today's databases can be massive and may contain unstructured data types such as
images, documents, or sound and video files. Managing a very large database
(VLDB) may require higher-level skills and additional monitoring and tuning to
maintain efficiency.
4. Database Backup and Recovery
DBAs create backup and recovery plans and procedures based on industry best
practices, then make sure that the necessary steps are followed. Backups cost time
and money, so the DBA may have to persuade management to take necessary
precautions to preserve data.
System admins or other personnel may actually create the backups, but it is the
DBA’s responsibility to make sure that everything is done on schedule.
In the case of a server failure or other form of data loss, the DBA will use existing
backups to restore lost information to the system. Different types of failures may
require different recovery strategies, and the DBA must be prepared for any
eventuality. With technology change, it is becoming ever more typical for a DBA to
backup databases to the cloud, Oracle Cloud for Oracle Databases and MS Azure
for SQL Server.

5. Security
A DBA needs to know potential weaknesses of the database software and the
company’s overall system and work to minimise risks. No system is one hundred per
cent immune to attacks, but implementing best practices can minimise risks.
In the case of a security breach or irregularity, the DBA can consult audit logs to see
who has done what to the data. Audit trails are also important when working with
regulated data.

6. Authentication
Setting up employee access is an important aspect of database security. DBAs control
who has access and what type of access they are allowed. For instance, a user may
have permission to see only certain pieces of information, or they may be denied the
ability to make changes to the system.

7. Capacity Planning
The DBA needs to know how large the database currently is and how fast it is
growing in order to make predictions about future needs. Storage refers to how much
room the database takes up in server and backup space. Capacity refers to usage level.
If the company is growing quickly and adding many new users, the DBA will have
to create the capacity to handle the extra workload.

8. Performance Monitoring
Monitoring databases for performance issues is part of the on-going system
maintenance a DBA performs. If some part of the system is slowing down processing,
the DBA may need to make configuration changes to the software or add additional
hardware capacity. Many types of monitoring tools are available, and part of the
DBA’s job is to understand what they need to track to improve the system. Third-party
organisations can be ideal for outsourcing this aspect, but make sure they
offer modern DBA support.
9. Database Tuning
Performance monitoring shows where the database should be tweaked to operate as
efficiently as possible. The physical configuration, the way the database is indexed,
and how queries are handled can all have a dramatic effect on database performance.
With effective monitoring, it is possible to proactively tune a system based on
application and usage instead of waiting until a problem develops.

10. Troubleshooting
DBAs are on call for troubleshooting in case of any problems. Whether they need to
quickly restore lost data or correct an issue to minimise damage, a DBA needs to
quickly understand and respond to problems when they occur.
8) Explain in detail the levels of abstraction in DBMS.

Levels of Abstraction:

Database systems comprise of complex data structures. Thus, to make the system
efficient for retrieval of data and reduce the complexity of the users, developers use
the method of Data Abstraction.

There are mainly three levels of data abstraction:

1. Internal Level: Actual PHYSICAL storage structure and access paths.


2. Conceptual or Logical Level: Structure and constraints for the entire database
3. External or View level: Describes various user views
Internal Level/Schema

The internal schema defines the physical storage structure of the database. The
internal schema is a very low-level representation of the entire database. It contains
multiple occurrences of multiple types of internal record. In the ANSI term, it is
also called "stored record'.

Facts about the internal schema:
• The internal schema is the lowest level of data abstraction.
• It keeps information about the actual representation of the entire database,
such as the actual storage of the data on disk in the form of records.
• The internal view tells us what data is stored in the database and how.
• It never deals with physical devices directly; instead, the internal schema
views a physical device as a collection of physical pages.

Conceptual Schema/Level

The conceptual schema describes the Database structure of the whole database for
the community of users. This schema hides information about the physical storage
structures and focuses on describing data types, entities, relationships, etc.

This logical level comes between the user level and physical storage view.
However, there is only single conceptual view of a single database.

Facts about the conceptual schema:
• It defines all database entities, their attributes, and their relationships.
• It contains security and integrity information.
• At the conceptual level, the data available to a user must be contained in,
or derivable from, the physical level.

External Schema/Level

An external schema describes the part of the database which specific user is
interested in. It hides the unrelated details of the database from the user. There may
be "n" number of external views for each database.

Each external view is defined using an external schema, which consists of
definitions of the various types of external records in that specific view.

An external view is just the content of the database as it is seen by some
particular user. For example, a user from the sales department will see only
sales-related data.

Facts about external schema:

 An external level is only related to the data which is viewed by specific end
users.
 This level includes some external schemas.
 External schema level is nearest to the user
 The external schema describes the segment of the database which is needed
for a certain user group and hides the remaining details from the database
from the specific user group

Goal of 3 level/schema of Database

Here are some objectives of using the three-schema architecture:

• Every user should be able to access the same data but see a customized
view of the data.
• The user need not deal directly with physical database storage details.
• The DBA should be able to change the database storage structure without
disturbing the users' views.
• The internal structure of the database should remain unaffected when
changes are made to the physical aspects of storage.

9) What is DBMS? Describe different applications of databases.

A database is a collection of related data which represents some aspect of the real
world. A database system is designed to be built and populated with data for a
certain task; for example, a university database maintains information concerning
students, courses, and grades.
DBMS stands for Database Management System, software for storing and
retrieving users' data.

Applications of DBMS:

• Banking: customer information, account activities, payments, deposits,
loans, etc.
• Airlines: reservations and schedule information.
• Universities: student information, course registrations, colleges, and grades.
• Telecommunication: keeping call records, monthly bills, maintaining
balances, etc.
• Finance: storing information about stock, sales, and purchases of financial
instruments such as stocks and bonds.
• Sales: storing customer, product, and sales information.
• Manufacturing: management of the supply chain, tracking production of
items, and inventory status in warehouses.
• HR management: information about employees, salaries, payroll,
deductions, generation of paychecks, etc.
UNIT-2

1) What are integrity constraints? Define the terms primary key constraint and
foreign key constraint. How are these expressed in SQL?
Or
Compare super key, candidate key, and primary key for a relation,
with examples.
Or
What is the importance of integrity constraints in a database? Explain with
illustrations.

Types of integrity constraints:
1. Key constraints 2. Foreign key constraints 3. General constraints

1. Key constraints:
A key constraint is a statement that a certain minimal subset of the fields of a
relation is a unique identifier for a tuple.
Keys play an important role in the relational database: a key is used to uniquely
identify any record or row of data in a table, and it is also used to establish and
identify relationships between tables.
Ex: in a Student table, ID is used as a key because it is unique for each student;
in a Person table, passport number, license number, and SSN are keys since they
are unique for each person.
Related column constraints:
• NOT NULL: ensures that the specified column does not contain a null value.
• UNIQUE: ensures unique/distinct values in the specified columns.
• DEFAULT: provides a default value for a column if none is specified.
• CHECK: checks a predefined condition before inserting data into the table.
There are several types of keys:
a) primary key  b) candidate key  c) super key  d) alternate key  e) compound key
f) composite key  g) surrogate key  h) foreign key

a) Primary key:
A primary key is a column or group of columns in a table that uniquely identifies
every row in that table. A table cannot have more than one primary key.
Rules:
• Two rows cannot have the same primary key value.
• Every row must have a primary key value.
• The primary key field cannot be null.
• The value in a primary key column should never be modified or updated if any
foreign key refers to that primary key.

In SQL we can declare that a subset of the columns of a table constitutes a key by
using the UNIQUE constraint. At most one of these 'candidate' keys can be
declared to be a primary key, using the PRIMARY KEY constraint. (SQL does not
require that such constraints be declared for a table.) Let us revisit our example
table definition and specify its keys:

CREATE TABLE Students ( sid CHAR(20),
    name CHAR(30),
    login CHAR(20),
    age INTEGER,
    gpa REAL,
    UNIQUE (name, age),
    CONSTRAINT StudentsKey PRIMARY KEY (sid) )

This definition says that sid is the primary key and that the combination of name
and age is also a key. It also illustrates how we can name a constraint by preceding
it with CONSTRAINT constraint-name. If the constraint is violated, the constraint
name is returned and can be used to identify the error.
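The Students definition above can be exercised against SQLite (a sketch; the column types are simplified and the sample values are made up) to see the primary key constraint rejecting a duplicate sid:

```python
import sqlite3

# Same shape as the Students table in the text; the second INSERT reuses
# sid '53666', which violates the primary-key constraint.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Students (
                    sid  TEXT,
                    name TEXT,
                    age  INTEGER,
                    UNIQUE (name, age),
                    CONSTRAINT StudentsKey PRIMARY KEY (sid))""")
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones', 18)")
try:
    conn.execute("INSERT INTO Students VALUES ('53666', 'Smith', 19)")
    violated = False
except sqlite3.IntegrityError:   # duplicate primary-key value is rejected
    violated = True
conn.close()
```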
b) Candidate key:
A candidate key is a set of attributes that uniquely identifies tuples in a table; it is
a super key with no redundant attributes. Every table must have at least one
candidate key, and a table can have multiple candidate keys but only a single
primary key.
Rules:
• It must contain unique values.
• A candidate key may consist of multiple attributes.
• It must not contain null values.
• It should contain the minimum fields needed to ensure uniqueness.
• It uniquely identifies each record in the table.
c) Super key:
A super key is a group of one or more attributes that identifies rows in a table.
A super key may have additional attributes that are not needed for unique
identification.
Fname    Lastname   Start Time   End Time
Anne     Smith      09:00        18:00
Jack     Francis    08:00        17:00
Anna     McLean     11:00        20:00
Shown    Willam     14:00        23:00
d) Alternate key:
An alternate key is a column or group of columns in a table that uniquely
identifies every row in that table. A table can have multiple candidate keys, but
only one can be chosen as the primary key; all the candidate keys which are not
the primary key are called alternate keys.
e) Compound key:
A compound key has two or more attributes that together uniquely identify a
specific record. Each column may not be unique by itself within the database;
however, combined with the other column or columns, the combination becomes
unique. The purpose of the compound key is to uniquely identify each record in
the table.
Ex: in this example, neither OrderNo nor ProductID alone can be a primary key,
as neither uniquely identifies a record; however, the compound key (OrderNo,
ProductID) uniquely identifies each record.
OrderNo   ProductID   Product Name   Quantity
B005      JAP102459   Mouse          5
B005      DKT321573   USB            10
B005      OMG446789   LCD Monitor    20
B004      DKT321573   USB            15
f) Composite key:
A composite key is a combination of two or more columns that uniquely
identifies rows in a table. The combination of columns guarantees uniqueness,
though individually uniqueness is not guaranteed; hence they are combined to
uniquely identify records in a table.
The difference between a compound and a composite key is that any part of a
compound key can be a foreign key, whereas a part of a composite key may or
may not be a foreign key.
g) Surrogate key:
A surrogate key is an artificial key that aims to uniquely identify each record.
It is created when there is no natural primary key, it is usually an integer, and its
value is generated right before the record is inserted into a table.
Ex: the table below shows the shift timings of different employees. Here a
surrogate key is needed to uniquely identify each employee.
Fname   Lastname   Start Time   End Time
Anne    Smith      09:00        18:00
Jack    Francis    08:00        17:00
Anna    McLean     11:00        20:00
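A surrogate key of this kind can be sketched with SQLite's auto-incrementing integer primary key (the table and column names below are illustrative, not from the text):

```python
import sqlite3

# The surrogate key emp_id is generated by the system at insert time;
# it carries no business meaning, unlike the name and shift columns.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE shift (
                    emp_id     INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
                    fname      TEXT,
                    lname      TEXT,
                    start_time TEXT,
                    end_time   TEXT)""")
conn.execute("INSERT INTO shift (fname, lname, start_time, end_time) "
             "VALUES ('Anne', 'Smith', '09:00', '18:00')")
conn.execute("INSERT INTO shift (fname, lname, start_time, end_time) "
             "VALUES ('Jack', 'Francis', '08:00', '17:00')")
ids = [r[0] for r in conn.execute("SELECT emp_id FROM shift ORDER BY emp_id")]
conn.close()
```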

h) Foreign key constraints:
A foreign key is a column that creates a relationship between two tables. The
purpose of foreign keys is to maintain data integrity and to allow navigation
between two different instances of an entity. A foreign key acts as a
cross-reference between two tables, as it references the primary key of another
table.
Ex:
DeptCode   DeptName
001        Science
002        English
005        Computer

Teacher ID   Fname   Lname
B002         David   Warner
B017         Sara    Joseph
B009         Mike    Brunton

In this example, we have two tables, Teacher and Department, in a school;
however, there is no way to see which teacher works in which department.
By adding DeptCode as a foreign key to the Teacher table, we can create a
relationship between the two tables:

Teacher ID   DeptCode   Fname   Lname
B002         002        David   Warner
B017         002        Sara    Joseph
B009         001        Mike    Brunton

This concept is also known as referential integrity.


Specifying foreign key constraints in SQL:
Let us define Enrolled(sid: string, cid: string, grade: string):
CREATE TABLE Enrolled ( sid CHAR(20),
    cid CHAR(20),
    grade CHAR(10),
    PRIMARY KEY (sid, cid),
    FOREIGN KEY (sid) REFERENCES Students )
The foreign key constraint states that every sid value in Enrolled must also appear
in Students; that is, sid in Enrolled is a foreign key referencing Students.
Incidentally, the primary key constraint states that a student has exactly one grade
for each course that he or she is enrolled in. If we want to record more than one
grade per student per course, we should change the primary key constraint.
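The Enrolled/Students constraint can be checked against SQLite as a sketch (types simplified; note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`):

```python
import sqlite3

# Enrolled.sid references the primary key of Students; inserting an
# Enrolled row whose sid is absent from Students must fail.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE Enrolled (
                    sid   TEXT,
                    cid   TEXT,
                    grade TEXT,
                    PRIMARY KEY (sid, cid),
                    FOREIGN KEY (sid) REFERENCES Students)""")
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'CS101', 'A')")   # sid exists: ok
try:
    # sid '99999' does not appear in Students, so referential integrity fails.
    conn.execute("INSERT INTO Enrolled VALUES ('99999', 'CS101', 'B')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
conn.close()
```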

2) What is a view? Explain different types of views.

Introduction to views:
A view is a kind of virtual table. A view has rows and columns just like a real
table in the database, and it can contain either all the rows of a table or only
specific rows selected by a condition.
Creating views: A view can be created using the CREATE VIEW statement,
from a single table or from multiple tables.
Creating a table:
create table student1 (Name varchar(50), Roll_No varchar(20), Dept varchar(20),
Phone_number int(10));
Inserting values:
insert into student1 values ('ramu', 101, 'cse', 345678912);
insert into student1 values ('raju', 102, 'ds', 345678912);
insert into student1 values ('rohit', 103, 'ds', 325465476);
insert into student1 values ('rupa', 104, 'ds', 2243555677);
Creating a view on a single table:
Syntax:
CREATE VIEW view_name AS SELECT column1, column2, …
FROM table_name WHERE condition;
Example:
CREATE VIEW DetailsView AS SELECT Name, Dept FROM student1 WHERE
Roll_No < 104;

Creating view from multiple tables:


EX:
create table subject (Name varchar(20),subcode varchar(20),title char(10),sem
int(10),credits int(10));

insert into subject values('ramu',121,'Maths',1,2);


insert into subject values('raju',122,'java',2,3);
insert into subject values('rohit',123,'os',3,2);
insert into subject values('rupa',124,'dbms',4,3);
SYNTAX
CREATE VIEW view_name AS SELECT table_name1.column1, table_name1.column2, ...,
table_name2.column1, ... FROM table_name1, table_name2
WHERE condition;
Example:
CREATE VIEW SubjectViews AS SELECT student1.Name, student1.Dept,
subject.sem FROM student1, subject WHERE student1.NAME = subject.Name;

Updating a view: an existing view can be redefined by replacing, adding or
removing fields, using CREATE OR REPLACE VIEW
Syntax:
CREATE OR REPLACE VIEW view_name AS SELECT column1, column2,…
FROM table_name WHERE condition;
CREATE OR REPLACE VIEW SubjectViews AS SELECT student1.Name,
student1.Dept, subject.title FROM student1, subject WHERE student1.NAME =
subject.Name;

Destroying/Altering tables and views:


Destroying: Destroying or deleting a view can be done with the help of the
'DROP VIEW' command
Syntax:
DROP VIEW view_name;
EX-
DROP VIEW SubjectViews;
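The DetailsView example above can be run as-is against SQLite. The session below is illustrative and uses exactly the data from the INSERT statements in the text:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student1 (Name TEXT, Roll_No INTEGER, Dept TEXT, Phone_number INTEGER)")
rows = [('ramu', 101, 'cse', 345678912),
        ('raju', 102, 'ds', 345678912),
        ('rohit', 103, 'ds', 325465476),
        ('rupa', 104, 'ds', 2243555677)]
conn.executemany("INSERT INTO student1 VALUES (?,?,?,?)", rows)

# A view is a stored query: no data is copied into it.
conn.execute("CREATE VIEW DetailsView AS SELECT Name, Dept FROM student1 WHERE Roll_No < 104")
result = conn.execute("SELECT * FROM DetailsView").fetchall()
# only the rows with Roll_No 101..103 are visible through the view
```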
3) Explain the fundamental operations in relational algebra with examples .

Relational Algebra: =>It is a widely used procedural query language.


=>This language is used to manipulate the data in the data model.
=>It uses a set of operators that are applied on one or two relations (input)
and produce a new relation (output).
=>Queries are built by composing these operators on relations. The operators
fall into the following groups:
Unary relational operators:
1. Select(σ)
2. Project(∏)
3. Rename rho (ρ)
Relational Algebra operations from set theory:
4. Union (∪)
5. Intersection (∩)
6. Difference (-)
7. Cartesian Product (X)
Binary relational operations:
8. Join (⋈)
9. Division
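As a rough illustration of the unary operators, select (σ) and project (∏) can be sketched in Python over relations represented as lists of dicts (the helper names below are ours, purely for illustration):

```python
# sigma_predicate(relation): keep only tuples satisfying the predicate
def select(relation, predicate):
    return [t for t in relation if predicate(t)]

# pi_attrs(relation): keep only the named attributes; relations are sets,
# so duplicate result tuples are dropped
def project(relation, attrs):
    seen, out = set(), []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out

sailors = [{'sid': 22, 'sname': 'Dustin', 'rating': 7},
           {'sid': 31, 'sname': 'Lubber', 'rating': 8}]
high = select(sailors, lambda t: t['rating'] > 7)   # sigma_rating>7(Sailors)
names = project(sailors, ['sname'])                 # pi_sname(Sailors)
```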

4) Consider the following relations


Sailors(sid sname,rating,age)
Boats(bid,bname,color)
Reserves(sid,bid,day)
Write the statement in relation algebra,relation calculus
a) Find the names of sailors who have reserved a Red boat.
b) Find the names of sailors who have reserved at least one boat.
c) Find the names of sailors who have reserved a Red and a Green boat.
d) Find the names of sailors who have reserved a Red or a White boat.
e) Find the names of sailors who have reserved all boats.

a) Find the names of sailors who have reserved a red boat
Relational Algebra:
π sname ((σ color='red' (Boats) ⋈ Reserves) ⋈ Sailors)
Relational Calculus:
{T | ∃S ∈ Sailors ∃R ∈ Reserves ∃B ∈ Boats (S.sid = R.sid ∧ R.bid = B.bid ∧
B.color = 'red' ∧ T.sname = S.sname)}
b) Find the names of sailors who have reserved at least one boat
Relational Algebra:
π sname (Sailors ⋈ Reserves)
Relational Calculus:
{T | ∃S ∈ Sailors ∃R ∈ Reserves (S.sid = R.sid ∧ T.sname = S.sname)}
c) Find the names of sailors who have reserved a red and a green boat
Relational Algebra:
π sname ((σ color='red' (Boats) ⋈ Reserves) ⋈ Sailors) ∩
π sname ((σ color='green' (Boats) ⋈ Reserves) ⋈ Sailors)
Relational Calculus:
{T | ∃S ∈ Sailors ∃R1 ∈ Reserves ∃B1 ∈ Boats ∃R2 ∈ Reserves ∃B2 ∈ Boats
(S.sid = R1.sid ∧ R1.bid = B1.bid ∧ B1.color = 'red' ∧ S.sid = R2.sid ∧
R2.bid = B2.bid ∧ B2.color = 'green' ∧ T.sname = S.sname)}
d) Find the names of sailors who have reserved a red or a white boat
Relational Algebra:
π sname ((σ color='red' ∨ color='white' (Boats) ⋈ Reserves) ⋈ Sailors)
Relational Calculus:
{T | ∃S ∈ Sailors ∃R ∈ Reserves ∃B ∈ Boats (S.sid = R.sid ∧ R.bid = B.bid ∧
(B.color = 'red' ∨ B.color = 'white') ∧ T.sname = S.sname)}
e) Find the names of sailors who have reserved all boats
Relational Algebra (using division):
π sname ((π sid,bid (Reserves) ÷ π bid (Boats)) ⋈ Sailors)
Relational Calculus:
{T | ∃S ∈ Sailors (∀B ∈ Boats (∃R ∈ Reserves (S.sid = R.sid ∧ R.bid = B.bid))
∧ T.sname = S.sname)}
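Query (a) can be cross-checked in SQL, to which the relational algebra translates directly (select becomes WHERE, project becomes the SELECT list, join becomes the join condition). The rows below are hypothetical sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Boats   (bid INTEGER, bname TEXT, color TEXT);
CREATE TABLE Reserves(sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors  VALUES (22,'Dustin',7,45.0),(31,'Lubber',8,55.5);
INSERT INTO Boats    VALUES (101,'Interlake','red'),(102,'Clipper','green');
INSERT INTO Reserves VALUES (22,101,'2023-10-10'),(31,102,'2023-10-11');
""")

# pi_sname( (sigma_color='red'(Boats) JOIN Reserves) JOIN Sailors )
red = conn.execute("""
    SELECT DISTINCT S.sname
    FROM Sailors S, Reserves R, Boats B
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
""").fetchall()
# only Dustin reserved the red boat in this sample data
```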

5) What is EER model.Explain generalization,Specialization,Aggregation.


or
Explain generalization, specialization and aggregation in E-R Model
The Enhanced ER Model
As the complexity of data increased in the late 1980s, it became more and more
difficult to use the traditional ER Model for database modelling. Hence some
improvements or enhancements were made to the existing ER Model to make it
able to handle the complex applications better.
Hence, as part of the Enhanced ER Model, along with other improvements, three
new concepts were added to the existing ER Model, they were:
1. Generalization
2. Specialization
3. Aggregation
Generalization
Generalization is a bottom-up approach in which two lower level entities combine
to form a higher level entity. In generalization, the higher level entity can also
combine with other lower level entities to make further higher level entity.
It's more like Super class and Subclass system, but the only difference is the
approach, which is bottom-up. Hence, entities are combined to form a more
generalised entity, in other words, sub-classes are combined to form a super-class
For example, Saving and Current account types entities can be generalised and an
entity with name Account can be created, which covers both.

Specialization

Specialization is opposite to Generalization. It is a top-down approach in which one


higher level entity can be broken down into two lower level entity. In specialization,
a higher level entity may not have any lower-level entity sets, it's possible.

Aggregation
Aggregation is a process when relation between two entities is treated as a single
entity.
In the diagram above, the relationship between Center and Course together, is
acting as an Entity, which is in relationship with another entity Visitor. Now in real
world, if a Visitor or a Student visits a Coaching Center, he/she will never enquire
about the center only or just about the course, rather he/she will ask enquire about
both.

6) Explain about joins operation in relational algebra.


Join (⋈):
=>A Join operation combines related tuples from different
relations, if and only if a given join condition is satisfied.
=>It is denoted by ⋈.
Equi Join OR INNER JOIN
=>It is also known as an inner join. It is the most common join. It is
based on matched data as per the equality condition.
=>The equi join uses the comparison operator(=).

SYNTAX
SELECT COL1, COL2 FROM table_name1
INNER JOIN table_name2 USING(join_condition);

Left outer Join:


=>Left outer join contains the set of tuples of all combinations
in R and S that are equal on their common attribute names.
=>In the left outer join, tuples in R have no matching tuples in
S.
=>It is denoted by ⟕.
=>Returns all records from left table and the matched records
from right table.

SYNTAX
SELECT COL1, COL2 FROM table_name1
LEFT JOIN table_name2 USING(join_condition);
Right outer Join:
=>Right outer join contains the set of tuples of all combinations
in R and S that are equal on their common attribute names.
=>In right outer join, tuples in S have no matching tuples in R.
=>It is denoted by ⟖.
Returns all records from right table and the matched records
from left table.
SELECT COL1,COL2 FROM table_name1
RIGHT JOIN table_name2 USING(join_condtion);

Full outer Join:


=>Full outer join is like a left or right join except that it contains
all rows from both tables.
=>In full outer join, tuples in R that have no matching tuples in
S and tuples in S that have no matching tuples in R in their
common attribute name.
=>It is denoted by ⟗.
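The difference between an inner and a left outer join can be seen on two small hypothetical tables (SQLite session; older SQLite versions lack FULL OUTER JOIN, so only INNER and LEFT are shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp  (id INTEGER, name TEXT, dept_id INTEGER);
CREATE TABLE dept (dept_id INTEGER, dname TEXT);
INSERT INTO emp  VALUES (1,'Asha',10),(2,'Ravi',20),(3,'Kiran',NULL);
INSERT INTO dept VALUES (10,'CSE'),(20,'IT');
""")
inner = conn.execute(
    "SELECT name, dname FROM emp INNER JOIN dept USING (dept_id)").fetchall()
left = conn.execute(
    "SELECT name, dname FROM emp LEFT JOIN dept USING (dept_id)").fetchall()
# the inner join drops Kiran (no matching dept);
# the left outer join keeps Kiran with dname = NULL
```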
7) Explain about domain relational calculus with example
=>Relational calculus is a non-procedural query language. In a non-procedural
query language, the user is not concerned with the details of how to obtain
the end results.


=>The relational calculus tells what to do but never explains how to
do.

1. Tuple relational calculus:


=>The tuple relational calculus is specified to select the tuples in a
relation. In TRC, filtering variable uses the tuples of a relation.
=>The result of the relation can have one or more tuples.
Notation:
{T | P (T)} or {T | Condition (T)}
Where
T is the resulting tuples
P(T) is the condition used to fetch T.

For example:
{ T.name | Author(T) AND T.article = 'database' }
OUTPUT: This query selects the tuples from the AUTHOR relation. It
returns a tuple with 'name' from Author who has written an article
on 'database'.
Example: consider a Student relation with fields Last Name and Age.
Q. Display the last names of those students whose age is greater than 30.
sol: {t.LastName | Student(t) ∧ t.Age > 30}
2. Domain relational calculus:


=>The second form of relation is known as Domain relational
calculus. In domain relational calculus, filtering variable uses the
domain of attributes.
=>Domain relational calculus uses the same operators as tuple calculus. It
uses logical connectives ∧ (and), ∨ (or) and ¬ (not).

=>It uses Existential (∃) and Universal Quantifiers (∀) to bind the
variable.

Notation:
{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}
Where
a1, a2 are attributes
P stands for formula built by inner attributes
For example:
{<article, page, subject> | <article, page, subject> ∈ dbms ∧ subject = 'database'}
Output: This query will yield the article, page, and subject from the
relation dbms, where the subject is 'database'.
Example: consider a Student relation with fields First Name and Age.
Q. Find the first name and age of those students whose age is greater than 27.
sol: {<First Name, Age> | <First Name, Age> ∈ Student ∧ Age > 27}
UNIT-3

1) What is schema? Give example. [2]


Schema:
=>It can be defined as a complete description of database
Or
=>The overall plan of the database to store the information is called as
database schema
Schema Refinement:
=>It refers to refine the schema by using some techniques. The best technique
of schema refinement is decomposition
=>It is just a fancy term for polishing tables: think of it as checking tables
for redundancies and anomalies
=>Normalization or schema refinement is a technique of organizing the data in
the database. It is a systematic approach of decomposing tables to eliminate
data redundancy and undesirable characteristics like
insert, update, delete anomalies.

2) Distinguish between NULL and NOT NULL. [3]

With null values, we must define the logical operators AND, OR, and NOT using
a three-valued logic in which expressions evaluate to true, false, or unknown.
We can disallow null values by specifying NOT NULL as part of the field
definition, for example, sname CHAR(20) NOT NULL. In addition, the fields in a
primary key are not allowed to take on null values. Thus, there is an implicit
NOT NULL constraint for every field listed in a PRIMARY KEY constraint.

3) Explain various Domain constraints in SQL with examples


Complex integrity constraints in SQL:
Integrity constraints are a set of rules. It is used to maintain the quality
of information.
1) Domain constraints:
=>Domain constraints can be defined as the definition of a valid set of
values for an attribute.
=>The data type of domain includes string, character, integer, time,
date, currency, etc. The value of the attribute must be available in the
corresponding domain.
2) Entity integrity constraints:
=>The entity integrity constraint states that primary key value can't be
null.
=>This is because the primary key value is used to identify individual
rows in relation and if the primary key has a null value, then we can't
identify those rows.

=>A table can contain a null value other than the primary key field.

3) Referential Integrity constraints:


=>A referential integrity constraint is specified between two tables.
=>In the Referential integrity constraints, if a foreign key in Table 1
refers to the Primary Key of Table 2, then every value of the Foreign
Key in Table 1 must be null or be available in Table 2.
4) Key constraints:
=>Keys are the entity set that is used to identify an entity within its
entity set uniquely.
=>An entity set can have multiple keys, but out of which one key will be
the primary key. A primary key can contain a unique and null value in
the relational table.

3)Explain various DML functions in SQL with examples .


Data Manipulation Command
Data Manipulation commands are most widely used SQL commands and they are
❖ Insert

❖ Update
❖ Delete

❖ Select
• DML(DATA MANIPULATION LANGUAGE):
a) Insert: Insert data into the rows of a table.
Syntax:
Insert into tablename(col1,col2,…coln)values(val1,val2,…valn);
Or
Insert into tablename values(val1,val2,…..valn);

insert into student1(Name,Roll_No,Dept,Phone_number)


values('Rita','104','cse','12345678');
insert into student3(Name,Roll_No,Marks,status)
values('Rita','104','40','pass'),('Roja','105','40',NULL);
• b) Update: It is used to update or modify the value of a column
in the table
• Syntax:
• UPDATE tablename
SET colname1 = val1, colname2 = val2, ..., colnamen = valn
WHERE condition;

c) Delete: It is used to remove one or more rows from a table


Syntax:
Delete from tablename WHERE condition;
d) Select: It is used to select data from a database
Syntax:
Select col1,col2,…..
FROM tablename;
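The four DML commands above, run end-to-end in an illustrative SQLite session (table and column names follow the student1 example in the text):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student1 (Name TEXT, Roll_No INTEGER, Dept TEXT, Phone_number INTEGER)")

# INSERT: both forms, with and without an explicit column list
conn.execute("INSERT INTO student1 (Name, Roll_No, Dept, Phone_number) "
             "VALUES ('Rita', 104, 'cse', 12345678)")
conn.execute("INSERT INTO student1 VALUES ('Roja', 105, 'ds', NULL)")

# UPDATE: modify one column of one row
conn.execute("UPDATE student1 SET Dept = 'aiml' WHERE Roll_No = 105")

# SELECT: read the data back
depts = conn.execute("SELECT Name, Dept FROM student1 ORDER BY Roll_No").fetchall()

# DELETE: remove one row
conn.execute("DELETE FROM student1 WHERE Roll_No = 104")
remaining = conn.execute("SELECT COUNT(*) FROM student1").fetchone()[0]
```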

2) Explain various DDL functions in SQL with examples

Practicing DDL Commands

1. Data Definition Language


The data definition language is used to create an object, alter the structure

of an object and also drop already created object. The Data Definition

Languages used for table definition can be classified into following:


❖ Create table command
❖ Alter table command
❖ Truncate table command
❖ Drop table command
Creating of Tables on ROAD WAY TRAVELS:
Table is a primary object of database, used to store data in form of rows
and columns. It is created using following command:

Create Table <table_name> (column1 datatype(size), column2 datatype(size),
..., column(n) datatype(size));
Where table_name is the name of the table and column1, column2 ... column n
are the names of the columns in the table. Each column is separated by a
comma.


Pointes to be remember while creating a table.
❖ Table Name must be start with an alphabet.
❖ Table name and column name should be of maximum 30 character
long.
❖ Column name should not be repeated in same table.
❖ Reserve words of Oracle cannot be used as a table and column name.
❖ Two different tables should not have the same name.
❖ Underscores, numerals and letters are allowed but not blank space or
single quotes.
TO CREATE A DATABASE
SYNTAX
Create database database name;
1)TO CREATE A TABLE
Syntax
Create table table name ( column1 datatype(),column 2 datatype()….);

CREATE table student1(Name varchar(50), Roll_No varchar(20), Dept


varchar(20), Phone_number int(10));

To describe the table


desc student1;
Alter
Addition of column in table is done using:
TO ALTER TABLE AND ADD A NEW COLUMN AT FIRST
Alter table <table_name> add column1 datatype FIRST;

TO ALTER THE TABLE ADD A COLUMN UING “AFTER”


COMMAND
Alter table <table_name> add column1 datatype AFTER column2;

Modification in Column

The MODIFY option is used with ALTER TABLE when you want to change the
datatype of an existing column.
Alter table <table name> modify (column1 datatype, ...);

TO ALTER TABLE BY DROPPING A CLOUMN

Alter table <table name> drop column;

TO CHANGE THE NAME OF A COLUMN

Alter table <table name> CHANGE column name1 column name2 datatype();
TO CHANGE THE NAME OF A TABLE

Alter table <table name> RENAME TO <new table name>;

3) DROP TABLE
Drop table tablename;

4) Truncate Table
Truncate: It is used to delete all rows from the table and free the space
containing the table

Truncate table <table name> [Reuse Storage];


Example
SQL>Truncate Table Emp;
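A minimal CREATE/ALTER/DROP session, run against SQLite. Note that SQLite supports only a subset of ALTER TABLE (ADD COLUMN and RENAME); the MODIFY/CHANGE forms above are MySQL-specific:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student1 (Name TEXT, Roll_No INTEGER)")

conn.execute("ALTER TABLE student1 ADD COLUMN Dept TEXT")    # add a column
conn.execute("ALTER TABLE student1 RENAME TO student_info")  # rename the table

# PRAGMA table_info describes the columns (SQLite's equivalent of DESC)
cols = [r[1] for r in conn.execute("PRAGMA table_info(student_info)")]

conn.execute("DROP TABLE student_info")                      # drop the table
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'").fetchall()
```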

4) What is aggregate functions.Explain different aggregate operators.

The following are aggregate operators:


1) AVG:
=>The AVG function is used to calculate the average value of the
numeric type. AVG function returns the average of all non-Null values.
Syntax
AVG()
or
SELECT AVG( [ALL|DISTINCT] expression ) FROM TABLE NAME;
EX:
mysql> select avg(Marks) from student1;
2) MIN:
=>MIN function is used to find the minimum value of a certain column.
This function determines the smallest value of all selected values of a
column.
Syntax
MIN()
or
SELECT MIN( [ALL|DISTINCT] expression ) FROM TABLE NAME;
3) MAX:
=>MAX function is used to find the maximum value of a certain column.
This function determines the largest value of all selected values of a
column.

Syntax
MAX()
or
SELECT MAX( [ALL|DISTINCT] expression ) FROM TABLE NAME;
EX:
mysql> select max(Marks) from student1;


4)SUM:
=>Sum function is used to calculate the sum of all selected
columns. It works on numeric fields only.
Syntax:
SUM()
or
SELECT SUM( [ALL|DISTINCT] expression ) FROM TABLE NAME;
EX:
mysql> select sum(Marks) from student1;
5) COUNT:
=>COUNT function is used to Count the number of rows in a
database table. It can work on both numeric and non-numeric
data types.
=>COUNT function uses the COUNT(*) that returns the count of all
the rows in a specified table. COUNT(*) considers duplicate and
Null.
Syntax:
COUNT(*)
or
SELECT COUNT( [ALL|DISTINCT] expression ) FROM TABLE NAME;
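All five aggregate operators over one small hypothetical Marks column, including how each treats NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student1 (Name TEXT, Marks INTEGER)")
conn.executemany("INSERT INTO student1 VALUES (?,?)",
                 [('ramu', 40), ('raju', 50), ('rohit', 60), ('rupa', None)])

row = conn.execute(
    "SELECT AVG(Marks), MIN(Marks), MAX(Marks), SUM(Marks), "
    "COUNT(*), COUNT(Marks) FROM student1").fetchone()
# AVG/MIN/MAX/SUM ignore the NULL mark; COUNT(*) counts every row,
# while COUNT(Marks) counts only non-NULL values.
```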

5) Explain about the following different commands with example


1) GROUP By 2)ORDER BY 3)HAVING
Or explain about the SQL clauses used with aggregate functions.
The following are the various SQL clauses:
1) GROUP BY:
=>SQL GROUP BY statement is used to arrange identical data into
groups. The GROUP BY statement is used with the SQL SELECT
statement.
=>The GROUP BY statement follows the WHERE clause in a SELECT
statement and precedes the ORDER BY clause.
=>The GROUP BY statement is used with aggregation function.
Syntax:
SELECT column
FROM table_name
WHERE conditions
GROUP BY column
ORDER BY:
=>The ORDER BY clause sorts the result-set in ascending or descending
order.
=>It sorts the records in ascending order by default. DESC keyword is
used to sort the records in descending order.
Syntax:
SELECT column1, column2
FROM table_name
WHERE condition
ORDER BY column1, column2... ASC|DESC;
Where
ASC: It is used to sort the result set in ascending order by expression.
DESC: It sorts the result set in descending order by expression.
HAVING:
=>HAVING clause is used to specify a search condition for a group or an
aggregate.
=>HAVING is used with a GROUP BY clause. If you are not using a GROUP BY
clause, then HAVING behaves like a WHERE clause.

Syntax:
SELECT column1, column2
FROM table_name
WHERE conditions
GROUP BY column1, column2
HAVING conditions
ORDER BY column1, column2;
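GROUP BY, HAVING and ORDER BY combined in one query, over hypothetical marks per department:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student1 (Name TEXT, Dept TEXT, Marks INTEGER)")
conn.executemany("INSERT INTO student1 VALUES (?,?,?)",
                 [('ramu', 'cse', 40), ('raju', 'ds', 50),
                  ('rohit', 'ds', 60), ('rupa', 'ds', 70)])

result = conn.execute("""
    SELECT Dept, AVG(Marks)
    FROM student1
    GROUP BY Dept
    HAVING COUNT(*) > 1      -- keep only groups with more than one student
    ORDER BY Dept ASC
""").fetchall()
# 'cse' has a single student, so HAVING filters that group out
```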

5) Explain the following Operators in SQL with examples: i) ANY ii) IN iii)
NOT EXISTS iv) EXISTS

Nested Queries:
=>A Query inside a query is called as nested query. Inner query is
called as sub query
=>sub query is usually present in WHERE or HAVING clause

1) ANY OPERATOR
Compares values to each value returned by sub query.
SELECT column_name(s)
FROM table_name
WHERE column_name operator ANY
(SELECT column_name
FROM table_name
WHERE condition);

2) ALL OPERATOR
Compare values to every value returned by sub query
SELECT column_name(s)
FROM table_name
WHERE column_name operator ALL
(SELECT column_name
FROM table_name
WHERE condition);

3) IN OPERATOR
Equal to any member in the list.
SELECT column_name(s)
FROM table_name
WHERE column_name IN (select col1 from table name where condition);
4) BETWEEN OPERATOR

The BETWEEN operator selects values within a given range. The values can be
numbers, text, or dates.

The BETWEEN operator is inclusive: begin and end values are included.

SELECT column_name(s)
FROM table_name
WHERE column_name BETWEEN value1 AND value2;

EXISTS
The EXISTS condition in SQL is used to check whether the result of a
correlated nested query is empty (contains no tuples) or not. The result of
EXISTS is a boolean value, True or False.
Syntax:
SELECT column_name(s)
FROM table_name
WHERE EXISTS
(SELECT column_name(s)
FROM table_name
WHERE condition);
NOT EXISTS

The SQL NOT EXISTS Operator will act quite opposite to EXISTS Operator. It is
used to restrict the number of rows returned by the SELECT Statement.

The NOT EXISTS in SQL Server will check the Subquery for rows existence, and
if there are no rows then it will return TRUE, otherwise FALSE. Or we can simply
say, SQL Server Not Exists operator will return the results exactly opposite to the
result returned by the Subquery.

SELECT [Column Names]

FROM Table Name

WHERE NOT EXISTS (select col1 from table 2 where condition);
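IN, EXISTS and NOT EXISTS as nested queries, on hypothetical Students/Enrolled rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (sid TEXT, sname TEXT);
CREATE TABLE Enrolled (sid TEXT, cid TEXT);
INSERT INTO Students VALUES ('s1','Ravi'),('s2','Asha');
INSERT INTO Enrolled VALUES ('s1','c1');
""")

# IN: sid equal to any member of the subquery's result
enrolled_in = conn.execute(
    "SELECT sname FROM Students WHERE sid IN (SELECT sid FROM Enrolled)").fetchall()

# EXISTS: true when the correlated subquery returns at least one row
enrolled_exists = conn.execute("""
    SELECT sname FROM Students S
    WHERE EXISTS (SELECT * FROM Enrolled E WHERE E.sid = S.sid)""").fetchall()

# NOT EXISTS: true when the correlated subquery returns no rows
not_enrolled = conn.execute("""
    SELECT sname FROM Students S
    WHERE NOT EXISTS (SELECT * FROM Enrolled E WHERE E.sid = S.sid)""").fetchall()
```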

6) Explain about different types of triggers in SQL stored precedures


=>A trigger can be defined as a program that is executed by the DBMS whenever
updates are made to the database tables
=>More precisely, it is like an event which occurs whenever a change is made
to the tables or columns of the tables
=>Only the DBA can specify triggers
=>The general format of a trigger includes:
a) Event:
=>It describes the modifications done to the database which lead to the
activation of the trigger. The following fall under the category of events:
=>Inserting, updating, or deleting columns of the tables or rows of tables may
activate the trigger
=>Creating, altering, or dropping any database object may also lead to
activation of triggers
=>An error message, user log on, or user log off may also activate the trigger
b)Condition:
=>conditions are used to specify whether the particular action must be performed or
not
=>If the condition is evaluated to true then the respective action is taken otherwise
the action is rejected

c) Action:
=>It specifies the action to be taken when the corresponding event occurs and
condition evaluates to true
=>An action is a collection of SQL statements that are executed as part of trigger
activation
=>It is possible to activate the trigger before the event or after the event.

We can define the maximum six types of actions or events in the form of triggers:
1. Before Insert: It is activated before the insertion of data into the table.
2. After Insert: It is activated after the insertion of data into the table.
3. Before Update: It is activated before the update of data in the table.
4. After Update: It is activated after the update of the data in the table.
5. Before Delete: It is activated before the data is removed from the table.
6. After Delete: It is activated after the deletion of data from the table.

When we use a statement that does not use an INSERT, UPDATE or DELETE query to
change the data in a table, the triggers associated with the table will not be
invoked.

Before Insert: It is activated before the insertion of data into the table.

SYNTAX

DELIMITER $$
CREATE TRIGGER trigger_name BEFORE INSERT
ON table_name FOR EACH ROW
BEGIN
    -- variable declarations
    -- trigger code
END$$
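For comparison with the MySQL DELIMITER form above, here is a sketch of an AFTER INSERT trigger written in SQLite's trigger syntax, logging every inserted row (the table names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (name TEXT, marks INTEGER);
CREATE TABLE audit_log (msg TEXT);

-- fires once per inserted row; NEW refers to the row being inserted
CREATE TRIGGER log_insert AFTER INSERT ON student
FOR EACH ROW
BEGIN
    INSERT INTO audit_log VALUES ('inserted ' || NEW.name);
END;
""")

conn.execute("INSERT INTO student VALUES ('Ravi', 80)")
log = conn.execute("SELECT msg FROM audit_log").fetchall()
# the trigger wrote one audit row automatically
```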
7) What is Functional Dependency?

=>The functional dependency is a relationship that exists between two attributes. It


typically exists between the primary key and non-key attribute within a table.
X→Y
=>The left side of FD is known as a determinant, the right side of the production is
known as a dependent.

For example:
Assume we have an employee table with attributes: Emp_Id, Emp_Name,
Emp_Address.
=>Here Emp_Id attribute can uniquely identify the Emp_Name attribute of employee
table because if we know the Emp_Id, we can tell that employee name associated
with it. Emp_Id → Emp_Name We can say that Emp_Name is functionally
dependent on Emp_Id.
Types of functional dependency:

1. Trivial functional dependency:


=>A → B has trivial functional dependency if B is a subset of A.
=>The following dependencies are also trivial like: A → A, B → B
Example:
Consider a table with two columns Employee_Id and Employee_Name.
{Employee_id, Employee_Name} → Employee_Id is a trivial functional dependency
as
Employee_Id is a subset of {Employee_Id, Employee_Name}.
Also, Employee_Id → Employee_Id and Employee_Name → Employee_Name are
trivial dependencies too.

2. Non Trivial dependency:


=>A → B has a non-trivial functional dependency if B is not a subset of A.

=>When A intersection B is NULL, then A → B is called as complete non-


trivial.

Example:
ID → Name,
Name → DOB
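A functional dependency X → Y can be checked mechanically: the same X-value must never map to two different Y-values. A small illustrative sketch (the helper name fd_holds is ours):

```python
# Returns True iff X -> Y holds in the relation (a list of dicts):
# every tuple agreeing on X must also agree on Y.
def fd_holds(rows, X, Y):
    seen = {}
    for t in rows:
        lhs = tuple(t[a] for a in X)
        rhs = tuple(t[a] for a in Y)
        if lhs in seen and seen[lhs] != rhs:
            return False  # same determinant, different dependent
        seen[lhs] = rhs
    return True

emp = [{'Emp_Id': 1, 'Emp_Name': 'Asha', 'Emp_Address': 'Pune'},
       {'Emp_Id': 2, 'Emp_Name': 'Ravi', 'Emp_Address': 'Pune'},
       {'Emp_Id': 1, 'Emp_Name': 'Asha', 'Emp_Address': 'Pune'}]

ok = fd_holds(emp, ['Emp_Id'], ['Emp_Name'])        # Emp_Id -> Emp_Name holds
bad = fd_holds(emp, ['Emp_Address'], ['Emp_Name'])  # Address -> Name fails
```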
8) What are the Problems caused by redundancy?
OR
Explain the different anamolies caused by redundancy or decomposition of
relation.

Problems Without Normalization


If a table is not properly normalized and have data redundancy then it will not only
eat up extra memory space but will also make it difficult to handle and update the
database, without facing data loss. Insertion, Updation and Deletion Anomalies are
very frequent if database is not normalized. To understand these anomalies let us
take an example of a Student table.

rollno  name  branch  hod    office_tel
401     Akon  CSE     Mr. X  53337
402     Bkon  CSE     Mr. X  53337
403     Ckon  CSE     Mr. X  53337
404     Dkon  CSE     Mr. X  53337
In the table above, we have data of 4 Computer Science students. As we can
see, data for the fields branch, hod (Head of Department) and office_tel is
repeated.

Insertion Anomaly
Suppose for a new admission, until and unless a student opts for a branch, the
student's data cannot be inserted, or else we will have to set the branch
information to NULL.
Also, if we have to insert data of 100 students of the same branch, then the
branch information will be repeated for all those 100 students.
These scenarios are nothing but Insertion anomalies.
Updation Anomaly
What if Mr. X leaves the college? or is no longer the HOD of computer science
department? In that case all the student records will have to be updated, and if by
mistake we miss any record, it will lead to data inconsistency. This is Updation
anomaly.
Deletion Anomaly
In our Student table, two different informations are kept together, Student
information and Branch information. Hence, at the end of the academic year, if
student records are deleted, we will also lose the branch information. This is
Deletion anomaly

8) What is normalization ? What are the conditions required for a relation


to be in 1NF, 2NF ?
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow the following 4
rules:
1. It should only have single(atomic) valued attributes/columns.
2. Values stored in a column should be of the same domain
3. All the columns in a table should have unique names.
4. And the order in which data is stored, does not matter.
EXAMPLE
Create a table to store student data which will have student's roll no., their
name and the name of subjects theyhave opted for.
Here is the table, with some sample data added to it.
roll_no name subject
101 Akon OS, CN
103 Ckon Java
102 Bkon C, C++

The table already satisfies 3 of the 4 rules, as all our column names are
unique, we have stored data in the order we wanted to, and we have not
inter-mixed different types of data in columns.
But of the 3 students in our table, 2 have opted for more than 1 subject, and
we have stored the subject names in a single column. As per the First Normal
Form, each column must contain an atomic value.
The fix is simple: break the values into atomic values. Here is the updated
table, which now satisfies the First Normal Form.
roll_no name subject
101 Akon OS
101 Akon CN
103 Ckon Java
102 Bkon C
102 Bkon C++

By doing so, although a few values are repeated, the values in the subject
column are now atomic for each record/row. With the First Normal Form, data
redundancy increases, as there will be many rows with the same data in some
columns, but each row as a whole will be unique.
Second Normal Form (2NF)
For a table to be in the Second Normal Form,
1. It should be in the First Normal form.
2. And, it should not have Partial Dependency.
Dependency
Let's take an example of a Student table with columns student_id, name,
reg_no (registration number), branch and address (student's home address).

student_id  name  reg_no  branch  address
In this table, student_id is the primary key and will be unique for every row,
hence we can use student_id to fetch any row of data from this table
Even for a case, where student names are same, if we know the student_id we
can easily fetch the correct record.
student_id  name  reg_no  branch  address
10          Akon  07-WY   CSE     Kerala
11          Akon  08-WY   IT      Gujarat

Hence we can say a Primary Key for a table is the column or a group of
columns (composite key) which can uniquely identify each record in the table.
I can ask for the branch name of the student with student_id 10, and I can
get it. Similarly, if I ask for the name of the student with student_id 10 or
11, I will get it.
So all I need is student_id, and every other column depends on it, or can be
fetched using it. This is Dependency, and we also call it Functional
Dependency.
Partial Dependency
Now that we know what dependency is, we are in a better state to understand what
partial dependency is.
For a simple table like Student, a single column like student_id can uniquely
identify all the records in the table. But this is not always true. So now
let's extend our example to see if more than 1 column together can act as a
primary key.
Let's create another table for Subject, which will have subject_id and
subject_name fields, and subject_id will be the primary key.
subject_id  subject_name
1           Java
2           C++
3           Php

Now we have a Student table with student information and another table
Subject for storing subject information.
Let's create another table Score, to store the marks obtained by students in
the respective subjects. We will also be saving the name of the teacher who
teaches that subject along with the marks.

score_id  student_id  subject_id  marks  teacher
1         10          1           70     Java Teacher
2         10          2           75     C++ Teacher
3         11          1           80     Java Teacher

In the Score table we are saving the student_id to know which student's marks
these are, and subject_id to know which subject the marks are for.
Together, student_id + subject_id forms a Candidate Key which can be the
Primary Key.
To get the marks of the student with student_id 10, can you get them from
this table? No, because you don't know for which subject. And if I give you
subject_id, you would not know for which student. Hence we need
student_id + subject_id to uniquely identify any row.
But where is Partial Dependency?
Now if you look at the Score table, we have a column named teacher which is
only dependent on the subject: for Java it's Java Teacher, for C++ it's C++
Teacher, and so on.
As we just discussed, the primary key for this table is a composition of two
columns, student_id and subject_id, but the teacher's name depends only on
the subject, hence on subject_id, and has nothing to do with student_id.
This is Partial Dependency, where an attribute in a table depends on only a
part of the primary key and not on the whole key.
How to remove Partial Dependency?
There can be many different solutions for this, but our objective is to
remove the teacher's name from the Score table. The simplest solution is to
remove the column teacher from the Score table and add it to the Subject
table. Hence, the Subject table will become:

subject_id  subject_name  teacher
1           Java          Java Teacher
2           C++           C++ Teacher
3           Php           Php Teacher

And our Score table is now in the second normal form, with no partial
dependency.

score_id  student_id  subject_id  marks
1         10          1           70
2         10          2           75
3         11          1           80

9) Explain 3NF and BCNF normal forms with example. What is the difference
between 3NF and BCNF?

Third Normal Form (3NF)


A table is said to be in the Third Normal Form when,
1. It is in the Second Normal form.
2. And, it doesn't have Transitive Dependency.
So let's use the same example, where we have 3 tables, Student, Subject and
Score.
Student Table
student_id  name  reg_no  branch  address
10          Akon  07-WY   CSE     Kerala
11          Akon  08-WY   IT      Gujarat
12          Bkon  09-WY   IT      Rajasthan
Subject Table
subject_id  subject_name  teacher
1           Java          Java Teacher
2           C++           C++ Teacher
3           Php           Php Teacher
Score Table
In the Score table, we need to store some more information, namely the exam
name and total marks, so let's add 2 more columns to the Score table.

score_id | student_id | subject_id | marks
1        | 10         | 1          | 70
2        | 10         | 2          | 75
3        | 11         | 1          | 80

Transitive Dependency
With exam_name and total_marks added to our Score table, it stores more data
now. The primary key for the Score table is a composite key, made up of two
attributes: student_id + subject_id.
The new column exam_name depends on both student and subject. For example, a
mechanical engineering student will have a Workshop exam but a computer
science student won't. And for some subjects you have practical exams and for
some you don't. So we can say that exam_name depends on both student_id and
subject_id.
And what about our second new column, total_marks? Does it depend on the
Score table's primary key?
Well, total_marks depends on exam_name, as the total score changes with the
exam type. For example, practicals carry fewer marks while theory exams carry
more.
But exam_name is just another column in the Score table. It is not the
primary key, or even a part of the primary key, and total_marks depends on it.
This is transitive dependency: a non-prime attribute depends on other
non-prime attributes rather than on the prime attributes or the primary key.
How to remove transitive dependency?
Again the solution is simple: take the columns exam_name and total_marks out
of the Score table, put them in an Exam table, and use exam_id wherever
required.
Score Table: in 3rd Normal Form

score_id | student_id | subject_id | marks | exam_id

The new Exam table

exam_id | exam_name  | total_marks
1       | Workshop   | 200
2       | Mains      | 70
3       | Practicals | 30
Advantages of removing transitive dependency:
 The amount of data duplication is reduced.
 Data integrity is achieved.

Boyce-Codd Normal Form (BCNF)

Boyce-Codd Normal Form is a stricter version of the Third Normal Form. This
form deals with a certain type of anomaly that is not handled by 3NF. A 3NF
table which does not have multiple overlapping candidate keys is already in
BCNF. For a table to be in BCNF, the following conditions must be satisfied:
 R must be in 3rd Normal Form,
 and, for each functional dependency (X → Y), X must be a super key.
In simple words, for a dependency A → B, A cannot be a non-prime attribute
if B is a prime attribute.
Example
College enrolment table with columns student_id, subject and professor.
student_id | subject | professor
101        | Java    | P.Java
101        | C++     | P.Cpp
102        | Java    | P.Java2
103        | C#      | P.Chash
104        | Java    | P.Java
In the table above:
 One student can enroll for multiple subjects. For example, the student
with student_id 101 has opted for the subjects Java & C++.
 For each subject, a professor is assigned to the student.
 And there can be multiple professors teaching one subject, like Java.
What do you think should be the primary key?
Well, in the table above student_id and subject together form the primary
key, because using student_id and subject we can find all the columns of the
table.
One more important point to note here: one professor teaches only one
subject, but one subject may have two different professors.
Hence, there is a dependency between subject and professor here, where
subject depends on the professor name.
This table satisfies the 1st Normal Form because all the values are atomic,
column names are unique and all the values stored in a particular column are
of the same domain. It also satisfies the 2nd Normal Form, as there is no
partial dependency. And there is no transitive dependency, hence the table
also satisfies the 3rd Normal Form. But this table is not in Boyce-Codd
Normal Form.
Why is this table not in BCNF?
In the table above, student_id and subject form the primary key, which means
the subject column is a prime attribute. But there is one more dependency:
professor → subject.
While subject is a prime attribute, professor is a non-prime attribute, and
such a dependency is not allowed by BCNF.
How to satisfy BCNF?
To make this relation (table) satisfy BCNF, we decompose it into two tables:
a Student table and a Professor table.
Below we have the structure for both tables.
Student Table

student_id | p_id
101        | 1
101        | 2

Professor Table

p_id | professor | subject
1    | P.Java    | Java
2    | P.Cpp     | C++

And now this relation satisfies Boyce-Codd Normal Form.
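The BCNF decomposition above can be sketched as follows. This is an illustrative Python sketch (not from the original answer); the surrogate key p_id is assigned in order of first appearance of each professor.

```python
# Sketch: decomposing the enrolment table on the dependency
# professor -> subject to reach BCNF.

enrolment = [
    (101, "Java", "P.Java"),
    (101, "C++", "P.Cpp"),
    (102, "Java", "P.Java2"),
    (103, "C#", "P.Chash"),
    (104, "Java", "P.Java"),
]

# Professor table: professor determines subject, so one row per professor.
professor_table = {}   # p_id -> (professor, subject)
p_ids = {}             # professor -> surrogate key p_id
for _, subj, prof in enrolment:
    if prof not in p_ids:
        p_ids[prof] = len(p_ids) + 1
        professor_table[p_ids[prof]] = (prof, subj)

# Student table links each student_id to the professor's surrogate key.
student_table = [(sid, p_ids[prof]) for sid, _, prof in enrolment]

print(student_table)  # [(101, 1), (101, 2), (102, 3), (103, 4), (104, 1)]
```

Joining student_table with professor_table on p_id reconstructs the original enrolment rows.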

10) What is lossless decomposition? Explain with example.
Lossless join decomposition:

=>If the information is not lost from the relation that is decomposed, then
the decomposition is lossless.

=>Lossless decomposition guarantees that joining the decomposed relations
results in the same relation as the one that was decomposed.

=>A decomposition is said to be lossless if the natural join of all the
decomposed relations gives the original relation.
Example:
EMPLOYEE_DEPARTMENT table:

The above relation is decomposed into two relations, EMPLOYEE and
DEPARTMENT.

EMPLOYEE table:
DEPARTMENT table:

Now, when these two relations are joined on the common column "EMP_ID", the
resultant relation is the same as the original:

Employee ⋈ Department
Hence, the decomposition is a lossless join decomposition.
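A small Python sketch can demonstrate the lossless property: decompose a relation on the common column, then natural-join the pieces back and compare with the original. The row values here are illustrative, since the original tables are not reproduced in the text.

```python
# Sketch: checking that a decomposition is lossless by natural-joining the
# decomposed relations back and comparing with the original relation.

employee_department = {
    # EMP_ID -> (EMP_NAME, DEPT_ID, DEPT_NAME)   (illustrative rows)
    22: ("Denim", 1, "Sales"),
    33: ("Alina", 2, "IT"),
}

# Decompose on the common column EMP_ID.
employee = {eid: name for eid, (name, _, _) in employee_department.items()}
department = {eid: (did, dn) for eid, (_, did, dn) in employee_department.items()}

# Natural join on EMP_ID reconstructs the original relation.
joined = {eid: (employee[eid],) + department[eid] for eid in employee}

assert joined == employee_department  # lossless: nothing lost, nothing spurious
```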

Explain about Fourth Normal Form (4NF) and Fifth Normal Form (5NF)

=>A relation is in 4NF if it is in Boyce-Codd Normal Form and has no
multi-valued dependency.

=>For a dependency A → B, if for a single value of A multiple values of B
exist, then the relation has a multi-valued dependency.

Example:
STUDENT

=>The given STUDENT table is in 3NF, but COURSE and HOBBY are two
independent entities; hence there is no relationship between COURSE and
HOBBY.

=>In the STUDENT relation, a student with STU_ID 21 contains two courses
(Computer and Math) and two hobbies (Dancing and Singing). So there is a
multi-valued dependency on STU_ID, which leads to unnecessary repetition of
data.

=>So to bring the above table into 4NF, we can decompose it into two tables:
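The 4NF decomposition can be sketched as follows. This is an illustrative Python sketch; the rows are assumed from the STU_ID 21 example above. Projecting the STUDENT relation onto (STU_ID, COURSE) and (STU_ID, HOBBY) removes the multi-valued dependency.

```python
# Sketch: splitting the multi-valued dependencies STU_ID ->> COURSE and
# STU_ID ->> HOBBY into two tables (the 4NF decomposition).

student = [
    # (STU_ID, COURSE, HOBBY) -- every course/hobby combination is repeated
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (21, "Computer", "Singing"),
    (21, "Math", "Dancing"),
]

# Project onto the two independent facts; sets remove the duplication.
student_course = sorted({(sid, c) for sid, c, _ in student})
student_hobby = sorted({(sid, h) for sid, _, h in student})

print(student_course)  # [(21, 'Computer'), (21, 'Math')]
print(student_hobby)   # [(21, 'Dancing'), (21, 'Singing')]
```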

Fifth Normal Form (5NF):

=>A relation is in 5NF if it is in 4NF, contains no join dependency, and
joining is lossless.
=>5NF is satisfied when all the tables are broken into as many tables as
possible in order to avoid redundancy.
=>5NF is also known as Project-Join Normal Form (PJ/NF).

=>In the above table, John takes both Computer and Math classes for
Semester 1 but does not take the Math class for Semester 2.
=>In this case, a combination of all these fields is required to identify
valid data.
=>Suppose we add a new semester, Semester 3, but do not yet know the subject
or who will be teaching it, so we would have to leave Lecturer and Subject
as NULL. But all three columns together act as the primary key, so we can't
leave the other two columns blank.
=>So to bring the above table into 5NF, we can decompose it into three
relations P1, P2 & P3:
UNIT-4

1) What is a transaction? Explain the ACID properties of transactions.

Or: Explain the ACID properties of transactions with examples.

=>A transaction is a unit of program execution that accesses and possibly
updates various data items.
Or
=>A transaction is an execution of a user program and is seen by the DBMS as
a series or list of actions. The actions that can be executed by a
transaction include the reading and writing of database objects.

Example: a transaction to transfer $50 from account A to account B (fund
transfer):


1. read(A)
2. A:=A–50
3. write(A)
4. read(B)
5. B:=B+50
6. write(B)
Two main issues to deal with:
=>Failures of various kinds, such as hardware
failures and system crashes
=>Concurrent execution of multiple transactions

ACID Properties:

1. Atomicity: Either all operations of the transaction are properly


reflected in the database or none are.

2. Consistency: Execution of a transaction in isolation preserves the consistency


of the database.

3. Isolation: Although multiple transactions may execute concurrently,


each transaction must be unaware of other concurrently executing
transactions. Intermediate transaction results must be hidden from
other concurrently executed transactions. That is, for every pair of
transactions Ti and Tj, it appears to Ti that either Tj, finished execution
before Ti started, or Tj started execution after Ti
finished.

4. Durability: After a transaction completes successfully, the changes


it has made to the database persist, even if there are system failures.
Example of Fund Transfer Transaction to transfer $50 from account A to
account B:
1. read(A)
2. A:=A–50
3. write(A)
4. read(B)
5. B:=B+50
6. write(B)
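The fund-transfer steps above can be run as one real atomic transaction. A minimal sketch using Python's built-in sqlite3 module (the table and the starting balances are illustrative): if any step fails, the rollback restores both balances, so the sum of A and B is preserved.

```python
# Sketch: the $50 transfer as one atomic transaction with sqlite3.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 200)])
con.commit()

try:
    con.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
    con.execute("UPDATE account SET balance = balance + 50 WHERE name = 'B'")
    con.commit()      # both writes become durable together...
except Exception:
    con.rollback()    # ...or neither is applied (atomicity)

total = con.execute("SELECT SUM(balance) FROM account").fetchone()[0]
print(total)  # 300 -- the sum of A and B is unchanged (consistency)
```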

Atomicity requirement:
=>If the transaction fails after step 3 and before step 6, money will be
"lost", leading to an inconsistent database state.

=>The failure could be due to software or hardware; the system should ensure
that updates of a partially executed transaction are not reflected in the
database.

Durability requirement — once the user has been notified that the
transaction has completed (i.e., the transfer of the $50 has taken place),
the updates to the database by the transaction must persist even if there
are software or hardware failures.

Consistency requirement:
=>In the above example, the sum of A and B is unchanged by the execution of
the transaction. In general, consistency requirements include explicitly
specified integrity constraints, such as primary keys and foreign keys, as
well as implicit integrity constraints.
=>Example: the sum of balances of all accounts, minus the sum of loan
amounts, must equal the value of cash-in-hand.
=>A transaction must see a consistent database. During transaction execution
the database may be temporarily inconsistent; when the transaction completes
successfully, the database must again be consistent. Erroneous transaction
logic can lead to inconsistency.
Isolation requirement:
=>If between steps 3 and 6 another transaction T2 is allowed to access the
partially updated database, it will see an inconsistent database (the sum
A + B will be less than it should be).

T1                       T2
read(A)
A := A – 50
write(A)
                         read(A), read(B), print(A+B)
read(B)
B := B + 50
write(B)
=>Isolation can be ensured trivially by running transactions serially
that is, one after the other. However, executing multiple transactions
concurrently has significant benefits.

2) What is serializability?
=>Each transaction preserves database consistency. Thus serial execution of
a set of transactions preserves database consistency.
=>A (possibly concurrent) schedule is serializable if it is equivalent to a
serial schedule. Different forms of schedule equivalence give rise to the
notions of:

1. Conflict serializability
2. View serializability

=>In a serial schedule, transactions run one after another; in an
interleaved schedule, operations of several transactions are mixed. If an
interleaved schedule still yields the same results as some serial schedule,
it is serializable.

Not all schedules are serializable. Non-serializable schedules may still
arise for two reasons:
a) concurrency control methods allow more concurrency than strictly serial
execution, and
b) SQL embedded in a host language provides facilities that permit
non-serializable schedules.
Conflict Serializable Schedule:

=>A schedule is conflict serializable if, after swapping its non-conflicting
operations, it can be transformed into a serial schedule.

=>A schedule is conflict serializable if it is conflict equivalent to a
serial schedule.

Conflicting Operations:
Two operations conflict if all of the following conditions hold:
1. They belong to separate transactions.
2. They operate on the same data item.
3. At least one of them is a write operation.

Example:
Swapping is possible only if S1 and S2 are logically equal.
Here, S1 = S2. That means the operations do not conflict.

Here, S1 ≠ S2. That means the operations conflict.


Conflicts (WR, RW, WW) and Serializability of Transactions
1. Reading uncommitted data (WR conflict):
The first source of anomalies is that a transaction T2 could read a database
object A that has been modified by another transaction T1 which has not yet
committed. Such a read is called a dirty read.

2. Unrepeatable read (RW conflict):
The second way in which anomalous behavior could result is that a
transaction T2 could change the value of an object A that has been read by a
transaction T1, while T1 is still in progress.

3. Overwriting uncommitted data (WW conflict):
The third source of anomalous behavior is that a transaction T2 could
overwrite the value of an object A which has already been modified by a
transaction T1, while T1 is still in progress.
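Conflict serializability can be tested mechanically: build a precedence graph with an edge Ti → Tj for every conflicting pair, and check it for cycles. A minimal Python sketch (the schedule representation as (txn, op, item) tuples is an assumption for illustration):

```python
# Sketch: conflict-serializability test via a precedence graph.
def conflict_serializable(schedule):
    edges = set()
    for i, (ti, op1, x) in enumerate(schedule):
        for tj, op2, y in schedule[i + 1:]:
            # Conflicting: different txns, same item, at least one write.
            if ti != tj and x == y and "W" in (op1, op2):
                edges.add((ti, tj))

    def has_cycle(node, stack):
        # Depth-first search; a back edge into the stack means a cycle.
        stack.add(node)
        for a, b in edges:
            if a == node and (b in stack or has_cycle(b, stack)):
                return True
        stack.discard(node)
        return False

    return not any(has_cycle(t, set()) for t, _, _ in schedule)

# Interleaved lost-update pattern (not serializable) vs. a serial schedule:
bad = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
ok = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A")]
print(conflict_serializable(bad), conflict_serializable(ok))  # False True
```

In the bad schedule the graph contains both T1 → T2 and T2 → T1, so it has a cycle and is not conflict serializable.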

View Serializability:
=>A schedule is view serializable if it is view equivalent to a serial
schedule.
=>If a schedule is conflict serializable, then it is also view serializable.
=>A schedule that is view serializable but not conflict serializable
contains blind writes.
View Equivalent:
Two schedules S1 and S2 are said to be view equivalent if they
satisfy
the following conditions:
1. Initial Read:
The initial read of both schedules must be the same. Suppose two schedules
S1 and S2. If in schedule S1 a transaction T1 reads the data item A, then in
S2 transaction T1 should also read A.

The above two schedules are view equivalent because the initial read
operation in S1 is done by T1 and in S2 it is also done by T1.

2. Updated Read:
If in schedule S1 transaction Ti reads A which was updated by Tj, then in S2
also Ti should read A updated by Tj.

The above two schedules are not view equal because in S1 T3 reads A updated
by T2, while in S2 T3 reads A updated by T1.

3. Final Write:
The final write must be the same in both schedules. If in schedule S1 a
transaction T1 updates A last, then in S2 the final write operation should
also be done by T1.
The above two schedules are view equal because the final write operation in
S1 is done by T3 and in S2 the final write operation is also done by T3.

3) What is the 2PL protocol?

Two-phase locking (2PL):
o The two-phase locking protocol divides the execution of a transaction into
three parts.
o In the first part, when the execution of the transaction starts, it seeks
permission for the locks it requires.
o In the second part, the transaction acquires all the locks. The third part
starts as soon as the transaction releases its first lock.
o In the third part, the transaction cannot demand any new locks; it only
releases the acquired locks.
There are two phases of 2PL:

Growing phase: In the growing phase, a new lock on the data item may be
acquired by the transaction, but none can be released.

Shrinking phase: In the shrinking phase, existing locks held by the
transaction may be released, but no new locks can be acquired.
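The two phases can be sketched in code. This is an illustrative Python sketch of a single transaction enforcing the 2PL rule (the class name TwoPhaseTxn is invented for the example; it is not a full lock manager):

```python
# Sketch: a transaction obeying two-phase locking -- once it releases any
# lock (shrinking phase), it may not acquire new ones.
import threading

class TwoPhaseTxn:
    def __init__(self):
        self.held = {}          # item -> lock currently held
        self.shrinking = False  # becomes True at the first release

    def acquire(self, item, lock):
        # Growing phase only: acquiring after any release violates 2PL.
        if self.shrinking:
            raise RuntimeError("2PL violation: acquire in shrinking phase")
        lock.acquire()
        self.held[item] = lock

    def release(self, item):
        self.shrinking = True   # first release starts the shrinking phase
        self.held.pop(item).release()

locks = {"A": threading.Lock(), "B": threading.Lock()}
t = TwoPhaseTxn()
t.acquire("A", locks["A"])
t.acquire("B", locks["B"])      # still growing: allowed
t.release("A")                  # shrinking phase begins
try:
    t.acquire("A", locks["A"])  # not allowed any more
except RuntimeError as e:
    print(e)
```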

4) Explain the Time Stamp - Based Concurrency Control protocol .


Timestamp Ordering Protocol:
o The Timestamp Ordering Protocol is used to order transactions based on
their timestamps. The order of the transactions is simply the ascending
order of their creation.
o The priority of an older transaction is higher, which is why it executes
first. To determine the timestamp of a transaction, this protocol uses
system time or a logical counter.
o A lock-based protocol manages the order between conflicting pairs of
transactions at execution time, but timestamp-based protocols start working
as soon as a transaction is created.
o Let's assume there are two transactions T1 and T2. Suppose transaction T1
entered the system at time 007 and transaction T2 at time 009. T1 has the
higher priority, so it executes first, as it entered the system first.

Basic Timestamp ordering protocol works as follows:


1. Whenever a transaction Ti issues a Read(X) operation, check the
following conditions:

o If W_TS(X) > TS(Ti), then the operation is rejected and Ti is rolled back.
o If W_TS(X) <= TS(Ti), then the operation is executed and R_TS(X) is
updated to max(R_TS(X), TS(Ti)).

2. Whenever a transaction Ti issues a Write(X) operation, check the
following conditions:

o If TS(Ti) < R_TS(X), then the operation is rejected.
o If TS(Ti) < W_TS(X), then the operation is rejected and Ti is rolled
back; otherwise the operation is executed and W_TS(X) is set to TS(Ti).

Where,

TS(Ti) denotes the timestamp of the transaction Ti.

R_TS(X) denotes the Read time-stamp of data-item X.

W_TS(X) denotes the Write time-stamp of data-item X.

Ex: Read and write timestamps of a data item A.
Let TS(T1) = 10, TS(T2) = 20, TS(T3) = 30 and TS(T4) = 40, and let the
operations on A be: T1: R(A), W(A); then T3: R(A); then T2: R(A); then
T4: W(A). Both R_TS(A) and W_TS(A) start at 0.

Read timestamp of A:
T1 reads A: 0 < 10, so update R_TS(A) = 10
T3 reads A: 10 < 30, so update R_TS(A) = 30
T2 reads A: 30 < 20 is false, so R_TS(A) stays 30
Therefore R_TS(A) = 30

Write timestamp of A:
T1 writes A: 0 < 10, so update W_TS(A) = 10
T4 writes A: 10 < 40, so update W_TS(A) = 40
Therefore W_TS(A) = 40
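The read and write rules can be sketched as follows (illustrative Python; the timestamps follow the worked example, and a rejected operation simply returns False rather than rolling back a real transaction):

```python
# Sketch of the basic timestamp-ordering checks. R_TS/W_TS start at 0.
R_TS, W_TS = {"A": 0}, {"A": 0}

def read(ts, x):
    if W_TS[x] > ts:               # written by a "younger" txn: reject
        return False
    R_TS[x] = max(R_TS[x], ts)     # execute and update the read timestamp
    return True

def write(ts, x):
    if ts < R_TS[x] or ts < W_TS[x]:  # too late to write: reject
        return False
    W_TS[x] = ts
    return True

read(10, "A"); write(10, "A")   # T1 (TS=10)
read(30, "A")                   # T3 (TS=30): R_TS(A) becomes 30
read(20, "A")                   # T2 (TS=20): allowed, R_TS(A) stays 30
write(40, "A")                  # T4 (TS=40): W_TS(A) becomes 40
print(R_TS["A"], W_TS["A"])     # 30 40
```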

5) Explain about Validation -Based Protocol .


Validation-Based Protocol:

The validation-based protocol is also known as the optimistic concurrency
control technique. In this protocol, a transaction is executed in the
following three phases:

1. Read phase: In this phase, the transaction T is read and executed. It is used
to read the value of various data items and stores them in temporary local
variables. It can perform all the write operations on temporary variables
without an update to the actual database.
2. Validation phase: In this phase, the temporary variable value will be
validated against the actual data to see if it violates the serializability.
3. Write phase: If the validation of the transaction succeeds, then the
temporary results are written to the database or system; otherwise the
transaction is rolled back.

Here each phase has the following different timestamps:

Start(Ti): It contains the time when Ti started its execution.

Validation(Ti): It contains the time when Ti finishes its read phase and starts its
validation phase.

Finish(Ti): It contains the time when Ti finishes its write phase.

6) Explain multiple granularity of locking protocol with example

Multiple Granularity:
o It can be defined as hierarchically breaking up the database into blocks which
can be locked.
o The Multiple Granularity protocol enhances concurrency and reduces lock
overhead.
o It maintains track of what to lock and how to lock.
o It makes it easy to decide either to lock a data item or to unlock a data
item. This type of hierarchy can be graphically represented as a tree.

For example: Consider a tree which has four levels of nodes.

o The first level or higher level shows the entire database.


o The second level represents a node of type area. The higher level database
consists of exactly these areas.
o The area consists of children nodes which are known as files. No file can be
present in more than one area.
o Finally, each file contains child nodes known as records. The file has exactly
those records that are its child nodes. No records represent in more than one
file.
o Hence, the levels of the tree starting from the top level are as follows:
1. Database
2. Area
3. File
4. Record

Intention Mode Lock

Intention-shared (IS): It contains explicit locking at a lower level of the tree but
only with shared locks.

Intention-Exclusive (IX): It contains explicit locking at a lower level with exclusive


or shared locks.

Shared & Intention-Exclusive (SIX): the node itself is locked in shared
mode, and some lower-level node is locked in exclusive mode by the same
transaction.
Compatibility Matrix with Intention Lock Modes: the table below is the
standard compatibility matrix for these lock modes ("yes" means the two
modes can be held on the same node by different transactions at the same
time):

      IS    IX    S     SIX   X
IS    yes   yes   yes   yes   no
IX    yes   yes   no    no    no
S     yes   no    yes   no    no
SIX   yes   no    no    no    no
X     no    no    no    no    no

It uses the intention lock modes to ensure serializability. It requires that if a


transaction attempts to lock a node, then that node must follow these protocols:

o Transaction T1 should follow the lock-compatibility matrix.

o Transaction T1 firstly locks the root of the tree. It can lock it in any mode.

o If T1 currently has the parent of the node locked in either IX or IS mode, then the
transaction T1 will lock a node in S or IS mode only.

o If T1 currently has the parent of the node locked in either IX or SIX modes, then
the transaction T1 will lock a node in X, SIX, or IX mode only.

o T1 can lock a node only if it has not previously unlocked any node.

o T1 can unlock a node only if it currently holds no locks on any of that
node's children. Observe that in multiple granularity, locks are acquired
in top-down order, and locks must be released in bottom-up order.

o If transaction T1 reads record Ra2 in file Fa, then T1 needs to lock the
database, area A1 and file Fa in IS mode, and finally lock Ra2 in S mode.
o If transaction T2 modifies record Ra9 in file Fa, then it can do so after
locking the database, area A1 and file Fa in IX mode; finally, it needs to
lock Ra9 in X mode.

o If transaction T3 reads all the records in file Fa, then T3 needs to lock
the database and area A1 in IS mode; at last, it needs to lock Fa in S mode.

o If transaction T4 reads the entire database, then T4 needs to lock the
database in S mode.
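The top-down locking pattern in these examples can be sketched as follows (illustrative Python; the node names and the lock_path helper are invented for the example):

```python
# Sketch: acquiring locks top-down along the granularity hierarchy --
# intention locks on the ancestors, the real lock on the target node.

hierarchy = ["Database", "Area A1", "File Fa", "Record Ra2"]

def lock_path(path, leaf_mode):
    # A read (S) on the leaf needs IS on ancestors; a write (X) needs IX.
    intention = "IS" if leaf_mode == "S" else "IX"
    return [(node, intention) for node in path[:-1]] + [(path[-1], leaf_mode)]

print(lock_path(hierarchy, "S"))
# [('Database', 'IS'), ('Area A1', 'IS'), ('File Fa', 'IS'), ('Record Ra2', 'S')]
```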

7) Explain the checkpoint log-based recovery scheme for recovering the
database.

Log-Based Recovery:
o The log is a sequence of records. Log of each transaction is maintained in
some stable storage so that if any failure occurs, then it can be recovered
from there.
o If any operation is performed on the database, then it will be recorded in the
log.
o But the process of storing the logs should be done before the actual
transaction is applied in the database.

Let's assume there is a transaction to modify the City of a student. The following
logs are written for this transaction.

o When the transaction is initiated, then it writes 'start' log.

<Tn, Start>
o When the transaction modifies the City from 'Noida' to 'Bangalore', then
another log is written to the file.

<Tn, City, 'Noida', 'Bangalore' >


o When the transaction is finished, then it writes another log to indicate the end
of the transaction.

<Tn, Commit>

There are two approaches to modify the database:

1. Deferred database modification:


o The deferred modification technique occurs if the transaction does not
modify the database until it has committed.
o In this method, all the logs are created and stored in the stable storage, and
the database is updated when a transaction commits.

2. Immediate database modification:


o The Immediate modification technique occurs if database modification
occurs while the transaction is still active.
o In this technique, the database is modified immediately after every operation.
It follows an actual database modification.

Recovery using Log records:

When the system is crashed, then the system consults the log to find which
transactions need to be undone and which need to be redone.

1. If the log contains both the record <Ti, Start> and the record
<Ti, Commit>, then transaction Ti needs to be redone.
2. If the log contains the record <Ti, Start> but contains neither
<Ti, Commit> nor <Ti, Abort>, then transaction Ti needs to be undone.
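The redo/undo decision can be sketched directly from these two rules (illustrative Python; the (transaction, action) log record format is an assumption):

```python
# Sketch: classifying transactions as redo or undo from the log.
log = [
    ("T1", "start"), ("T1", "commit"),
    ("T2", "start"),                    # no commit/abort: crashed mid-flight
]

started = {t for t, a in log if a == "start"}
finished = {t for t, a in log if a in ("commit", "abort")}

redo = {t for t, a in log if a == "commit"}  # start + commit present: redo
undo = started - finished                    # start but no commit/abort: undo

print(sorted(redo), sorted(undo))  # ['T1'] ['T2']
```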

Checkpoint:
o The checkpoint is a type of mechanism where all the previous logs are
removed from the system and permanently stored in the storage disk.
o The checkpoint is like a bookmark. During the execution of transactions,
such checkpoints are marked, the transactions are executed, and log files
are created for their steps.
o When a checkpoint is reached, the transactions are applied to the
database, and up to that point the entire log file is removed. The log file
is then updated with the steps of transactions until the next checkpoint,
and so on.
o The checkpoint is used to declare a point before which the DBMS was in the
consistent state, and all transactions were committed.

Recovery with Concurrent Transaction:


o Whenever more than one transaction is being executed, their logs are
interleaved. During recovery, it would become difficult for the recovery
system to backtrack through all the logs and then start recovering.

To ease this situation, the 'checkpoint' concept is used by most DBMSs.
Explain the shadow paging recovery scheme for recovering the database.
UNIT -5

1. State and explain various file organisation methods. Give suitable
examples for each.
File Organization:
File Organization defines how file records are mapped onto disk blocks. We have
four types of File Organization to organize file records

Types of file organization:


1. Sequential File Organization:
=>Every file record contains a data field (attribute) to uniquely
identify that record.
=>In sequential file organization, records are placed in the file in some sequential
order based on the unique key field or search key.

2. Heap File Organization:


=>When a file is created using Heap File Organization, the Operating System
allocates memory area to that file without any further accounting details.
=>File records can be placed anywhere in that memory area.
=>It is the responsibility of the software to manage the records.
=> Heap File does not support any ordering, sequencing, or indexing on its own.

3. Hash File Organization:


=>Hash File Organization uses Hash function computation on some
fields of the records.
=>The output of the hash function determines the location of disk
block where the records are to be placed.

4. Clustered File Organization:


=>Clustered file organization is not considered good for large databases.
=>In this mechanism, related records from one or more relations are kept in the
same disk block, that is, the ordering of records is not based on primary key or
search key.

Indexing in DBMS:
=>Indexing is a data structure technique to efficiently retrieve records from the
database files based on some attributes on which the indexing has been done.
=>Indexing in database systems is similar to what we see in books.
=>Indexing is defined based on its indexing attributes.
Index structure:
Indexes can be created using some database columns.

o The first column of the database is the search key that contains a copy of the
primary key or candidate key of the table. The values of the primary key are stored
in sorted order so that the corresponding data can be accessed easily.
o The second column of the database is the data reference. It contains a set of
pointers holding the address of the disk block where the value of the particular key
can be found.
1) Difference between ISAM and B+ tree
ISAM (Indexed Sequential Access Method):
=>ISAM is an advanced sequential file organization. It uses static
searching.
=>In this method, records are stored in the file using the primary key.
=>An index value is generated for each primary key and mapped with the
record.
=>This index contains the address of the record in the file.

B+ tree:
=>It uses dynamic searching. A B+ tree is a balanced search tree that
follows a multi-level index format. The leaf nodes of a B+ tree hold the
actual data pointers.
=>A B+ tree ensures that all leaf nodes remain at the same height; thus it
is balanced.
=>Additionally, the leaf nodes are linked using a linked list; therefore, a
B+ tree can support random access as well as sequential access.
2) Explain deletion and insertion operations in ISAM with examples.
Search over ISAM
• In a binary search tree, the elements of the nodes can be compared with a
total-order semantics.
• The following two rules are followed for every node n: every element in
n's left subtree is less than or equal to the element in node n, and every
element in n's right subtree is greater than the element in node n.
Consider the tree shown. All searches begin at the root. For example, to
locate a record with the key value 27, we start at the root and follow the
left pointer, since 27 < 40. We then follow the middle pointer, since
20 <= 27 < 33.

Insert 23, 48, 41, 42 (assume each leaf page can contain two entries).

If we now insert a record with key value 23, the entry 23* belongs in the
second data page, which already contains 20* and 27* and has no more space.
We deal with this situation by adding an overflow page and putting 23* in
the overflow page. Chains of overflow pages can easily develop. For
instance, inserting 48*, 41*, and 42* leads to an overflow chain of two
pages. The tree of Figure 9.5 with all these insertions is shown below.
Delete over ISAM tree

The deletion of an entry k* is handled by simply removing the entry. If the
entry is on an overflow page and the overflow page becomes empty, the page
can be removed. If the entry is on a primary page and deletion makes the
primary page empty, the simplest approach is to leave the empty primary page
as a placeholder for future insertions (and possibly non-empty overflow
pages, because we do not move records from the overflow pages to the primary
page when deletions on the primary page create space). Thus, the number of
primary leaf pages is fixed at file creation time. Consider the tree after
deletion of the entries 42*, 51*, and 97*. Note that after deleting 51*, the
key value 51 continues to appear in the index level. A subsequent search for
51* would go to the correct leaf page and determine that the entry is not in
the tree.

3) Explain the Insertion and deletion Operations in B+ trees with example.


B+ Tree Search
This B+ tree is of order d = 2; that is, each node contains between 2 and 4
entries. Each non-leaf entry is a ⟨key value, node pointer⟩ pair; at the
leaf level, the entries are data records that we denote by k*. To search for
entry 5*, we follow the left-most child pointer, since 5 < 13. To search for
the entries 14* or 15*, we follow the second pointer, since 13 <= 14 < 17
and 13 <= 15 < 17. (We don't find 15* on the appropriate leaf, and we can
conclude that it is not present in the tree.) To find 24*, we follow the
fourth child pointer, since 24 <= 24 < 30.

B+ Tree Insertion:
B+ trees are filled from bottom and each entry is done at the leaf node.
If a leaf node overflows −
o Split node into two parts.
o Partition at i = ⌊(m+1)/2⌋.
o First i entries are stored in one node.
o Rest of the entries (i+1 onwards) are moved to a new node.
o ith key is duplicated at the parent of the leaf.

If a non-leaf node overflows −


o Split node into two parts.
o Partition the node at i = ⌈(m+1)/2⌉.
o Entries up to i are kept in one node.
o Rest of the entries are moved to a new node.

INSERT 8 INTO THE TREE BELOW

If we insert entry 8*, it belongs in the left-most leaf, which is already
full. This insertion causes a split of the leaf page; the split pages are
shown in Figure 9.12. The tree must now be adjusted to take the new leaf
page into account, so we insert an entry consisting of the pair ⟨5, pointer
to new page⟩ into the parent node. Notice how the key 5, which discriminates
between the split leaf page and its newly created sibling, is 'copied up'.
We cannot just 'push up' 5, because every data entry must appear in a leaf
page.

Now, since the split node was the old root, we need to create a new root
node to hold the entry that distinguishes the two split index pages. The
tree after completing the insertion of the entry 8* is shown in Figure 9.14.
B+ Tree Deletion:
B+tree entries are deleted at the leaf nodes.
The target entry is searched and deleted.
o If it is an internal node, delete and replace with the entry
from the left position.
After deletion, underflow is tested,
o If underflow occurs, distribute the entries from the nodes
left to it.

If distribution is not possible from left, then


o Distribute from the nodes right to it.
If distribution is not possible from left or from right, then
o Merge the node with left and right to it.

To illustrate deletion, let us consider the sample tree shown in Below


Figure.

To delete entry 19*, we simply remove it from the leaf page on which it
appears, and we are done because the leaf still contains two entries. If we
subsequently delete 20*, however, the leaf contains only one entry after the
deletion. The (only) sibling of the leaf node that contained 20* has three
entries, and we can therefore deal with the situation by redistribution; we
move entry 24* to the leaf page that contained 20* and 'copy up' the new
splitting key (27, which is the new low key value of the leaf from which we
borrowed 24*) into the parent. This process is illustrated in Figure 9.17.

4) Explain how insert and delete operations are handled in a static hash
index.
Or: When does a collision occur in hashing? Illustrate various
collision-resolution techniques.
Hash Based Indexing:
=>Hashing is an effective technique to calculate the direct location of a data
record on the disk without using an index structure.
=> Hashing uses hash functions with search keys as parameters to generate the
address of a data record.
Hash Organization:
Bucket –
=>A hash file stores data in bucket format.
=> Bucket is considered a unit of storage.
=>A bucket typically stores one complete disk block, which in turn can store
one or more records.
Hash Function –
=>A hash function, h, is a mapping function that maps the set of search-key
values K to the addresses where the actual records are placed.
=>It is a function from search keys to bucket addresses.
Static Hashing:
=>In static hashing, when a search-key value is provided, the hash function
always computes the same address.
=>For example, if a mod-4 hash function is used, it can generate only 4
values (0 to 3). The output address is always the same for a given key, and
the number of buckets provided remains unchanged at all times.

Operation:
Insertion − When a record is required to be entered using static hash, the
hash function h computes the bucket address for search key K, where the record
will be stored. Bucket address = h(K)
Search − When a record needs to be retrieved, the same hash function can
be used to retrieve the address of the bucket where the data is stored.
Delete − This is simply a search followed by a deletion operation.
Example: insert the keys (24, 52, 91, 67, 48, 83) into a hash table with
n = 10 buckets, using h(k) = k mod n:
h(24) = 24 mod 10 = 4
h(52) = 52 mod 10 = 2
h(91) = 91 mod 10 = 1
h(67) = 67 mod 10 = 7
h(48) = 48 mod 10 = 8
h(83) = 83 mod 10 = 3
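The static-hash insertions above can be reproduced with a few lines of Python (a minimal sketch of the idea, with one bucket list per hash value; the function name is illustrative):

```python
def build_static_hash(keys, n):
    """Place each key into bucket h(k) = k mod n (static hashing:
    the bucket count n never changes)."""
    buckets = {i: [] for i in range(n)}
    for k in keys:
        buckets[k % n].append(k)     # bucket address = h(K)
    return buckets

buckets = build_static_hash([24, 52, 91, 67, 48, 83], n=10)
print(buckets[4], buckets[2], buckets[1])   # [24] [52] [91]
```

A search uses the same computation: to find 67 we hash it, get bucket 7, and look only there.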
Bucket Overflow:
The condition of bucket-overflow is known as collision. This is a fatal state for
any static hash function. In this case, overflow chaining can be used.
Overflow Chaining − When buckets are full, a new bucket is allocated for
the same hash result and is linked after the previous one. This mechanism is
called Closed Hashing.
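Chaining can be sketched in Python with a list per bucket standing in for the linked overflow pages (an illustrative simplification, not a disk-based implementation):

```python
def insert_chained(buckets, key, n):
    """Append the key to the chain for bucket h(k) = k mod n;
    colliding keys simply extend the same chain."""
    buckets.setdefault(key % n, []).append(key)

buckets = {}
for k in [24, 25, 32, 44, 58, 40]:
    insert_chained(buckets, k, n=6)
print(buckets[2])   # [32, 44] -- both keys hash to 2, sharing a chain
print(buckets[4])   # [58, 40]
```

No key is ever turned away: a collision just lengthens the chain, at the cost of a longer scan within that bucket.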

Example: insert the keys (24, 25, 32, 44, 58, 40) with h(k) = k mod 6:
h(24) = 24 mod 6 = 0
h(25) = 25 mod 6 = 1
h(32) = 32 mod 6 = 2
h(44) = 44 mod 6 = 2 (collision with 32)
h(58) = 58 mod 6 = 4
h(40) = 40 mod 6 = 4 (collision with 58)
Linear Probing − When the hash function generates an address at which data is
already stored, the next free bucket is allocated to the new entry. This
mechanism is known as open addressing or closed hashing.
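Linear probing on the same six keys can be sketched as follows (a minimal in-memory illustration; a fixed-size list models the bucket array):

```python
def insert_probing(table, key):
    """Place the key at h(k) = k mod n, scanning forward (with
    wrap-around) past occupied slots until a free one is found."""
    n = len(table)
    i = key % n
    while table[i] is not None:   # collision: probe the next slot
        i = (i + 1) % n
    table[i] = key

table = [None] * 6
for k in [24, 25, 32, 44, 58, 40]:
    insert_probing(table, k)
print(table)   # [24, 25, 32, 44, 58, 40]
```

Here 44 collides with 32 at slot 2 and lands in slot 3, and 40 collides with 58 at slot 4 and lands in slot 5, so every slot ends up occupied.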

5) What is indexing? Explain sparse and dense indexes.


Indexing in DBMS:
=>Indexing is a data structure technique used to efficiently retrieve records
from database files based on the attributes on which the indexing has been
done.
=>Indexing in database systems is similar to what we see in books.
=>Indexing is defined based on its indexing attributes.
Index structure:

Indexes can be created using some database columns.


o The first column of the index is the search key: it contains a copy of the
primary key or candidate key of the table. These values are stored in sorted
order so that the corresponding data can be accessed quickly.
o The second column of the index is the data reference: it contains a set of
pointers holding the address of the disk block where the value of the
particular key can be found.

Ordered Indexing is of two types −


Dense Index
Sparse Index
1. Dense Index:
=>In a dense index, there is an index record for every search-key value in
the database.
=>This makes searching faster but requires more space to store the index
records themselves.
=>Each index record contains a search-key value and a pointer to the actual
record on the disk.

2. Sparse Index:
=>In the data file, an index record appears only for a few items; each index
entry points to a block.
=>Instead of pointing to every record in the main table, the index points to
records in the main table at intervals (gaps).
=>To search for a record, we first use the index to reach the approximate
location of the data.
=>If the data we are looking for is not where the index leads us directly,
the system performs a sequential search from that point until the desired
data is found.
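A sparse-index lookup can be sketched in Python: the index holds one entry per block (the block's first key), so a search locates the right block via the index and then scans it sequentially. The data values and the `find` helper are illustrative assumptions.

```python
import bisect

# Three "disk blocks" of sorted records; the sparse index stores only
# each block's first key.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
index = [blk[0] for blk in blocks]      # sparse: one entry per block

def find(key):
    """Follow the index to the candidate block, then scan it."""
    b = bisect.bisect_right(index, key) - 1   # last entry <= key
    if b < 0:
        return None                            # key below all blocks
    return key if key in blocks[b] else None   # sequential scan

print(find(50))   # 50 -- index points to block 1, the scan finds it
print(find(55))   # None -- the scan of block 1 comes up empty
```

A dense index would instead hold all nine keys, trading extra space for skipping the final scan.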

6) Compare dynamic hashing with static hashing.

Static Hashing:
=>In static hashing, when a search-key value is provided, the hash function
always computes the same address.
=>For example, if a mod-4 hash function is used, it can generate only 4
values (0 to 3). The output address is always the same for a given key, and
the number of buckets provided remains unchanged at all times.

Dynamic Hashing:
=>The problem with static hashing is that it does not expand or
shrink dynamically as the size of the database grows or shrinks.
=>Dynamic hashing provides a mechanism in which data buckets are
added and removed dynamically and on-demand.
=>Dynamic hashing is also known as extended hashing.
=>Hash function, in dynamic hashing, is made to produce a large
number of values and only a few are used initially.
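The "produce many values, use only a few initially" idea can be sketched as a toy in Python: only `depth` low-order bits of the hash are used, and when a bucket overflows, the depth grows and keys are rehashed. This is a deliberate simplification of real extendible hashing (no directory or per-bucket local depth), with illustrative names throughout.

```python
BUCKET_SIZE = 2   # assumed capacity of each bucket

def insert(buckets, depth, key):
    """Insert using h(k) = k mod 2**depth; on overflow, grow depth
    by one bit and rehash every key (toy global rehash)."""
    addr = key % (2 ** depth)
    buckets.setdefault(addr, []).append(key)
    if len(buckets[addr]) > BUCKET_SIZE:
        depth += 1                       # use one more bit of the hash
        old = [k for b in buckets.values() for k in b]
        buckets.clear()
        for k in old:
            buckets.setdefault(k % (2 ** depth), []).append(k)
    return depth

buckets, depth = {}, 1                   # start with just 2 buckets
for k in [4, 6, 8, 5, 16]:
    depth = insert(buckets, depth, k)
print(depth)   # 3 -- the bucket space grew on demand
```

Under static hashing, by contrast, `depth` would be fixed forever and the overflowing keys would pile up in overflow chains instead.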
