Unit 4: DBMS

This unit provides an overview of transaction processing and concurrency control in database management systems (DBMS). It explains single-user and multi-user systems, transaction properties such as atomicity and consistency, and concurrency control problems like dirty reads and lost updates. It also discusses concurrency control protocols, including lock-based and timestamp-based methods, along with their advantages and disadvantages.

Introduction to Transaction Processing




Single-user system :
At most one user at a time can use the system.
Multi-user system :
Many users can access the system concurrently.
Concurrency can be provided through :
1. Interleaved Processing –
The concurrent execution of processes is interleaved on a single CPU: a second transaction may start before the first one finishes, and execution can switch back and forth among two or more transactions. Uncontrolled interleaving can leave the system in an inconsistent state.
2. Parallel Processing –
A large task is divided into various smaller tasks, and those smaller tasks execute concurrently on several nodes. Here the processes genuinely run at the same time on multiple CPUs.
Transaction :
A transaction is a logical unit of database processing that includes one or more access operations (read/retrieval, write/insert or update). It is a unit of program execution that accesses, and if required updates, various data items.
A transaction is a set of operations that can either be embedded within an application program or be specified interactively via a high-level language such as SQL.
Example –
Consider a transaction that involves transferring $1700 from a
customer’s savings account to a customer’s checking account. This
transaction involves two separate operations: debiting the savings
account by $1700 and crediting the checking account by $1700. If
one operation succeeds but the other doesn’t, the books of the bank
will not balance.
Transaction boundaries :
Every transaction has a begin boundary and an end boundary. An application program may contain several transactions, separated by these begin and end markers.
Granularity of data :
 The size of a data item is called its granularity.
 A data item can be an individual field (attribute) value of some record, a record, or a whole disk block.
 Transaction-processing concepts are independent of granularity.
Advantages :
 Batch processing or real-time processing is available.
 Reduction in processing time, lead time and order cycle time.
 Reduction in inventory, personnel and ordering costs.
 Increase in productivity and customer satisfaction.
Disadvantages :
 High setup costs.
 Lack of standard formats.
 Hardware and software incompatibility.

Concurrency Control in DBMS





Concurrency control is a very important concept of DBMS which ensures the simultaneous execution or manipulation of data by several processes or users without resulting in data inconsistency. Concurrency control deals with the interleaved execution of more than one transaction.
What is Transaction?
A set of logically related operations is known as a transaction. The main operations of a transaction are:
 Read(A): Read operation Read(A) or R(A) reads the value of A from the database and stores it in a buffer in main memory.
 Write(A): Write operation Write(A) or W(A) writes the value back to the database from the buffer.
(Note: A write does not always reach the database immediately; it may only update the buffer. This is exactly why the dirty read problem can arise.)
Let us take a debit transaction from an account that consists of the following operations:
1. R(A);
2. A = A - 1000;
3. W(A);
Assume A’s value before starting the transaction is 5000.
 The first operation reads the value of A from the database and stores it in a buffer.
 The second operation decreases its value by 1000, so the buffer will contain 4000.
 The third operation writes the value from the buffer to the database, so A’s final value will be 4000.
But the transaction may also fail after executing some of its operations. The failure can be caused by hardware, software or power problems, etc. For example, if the debit transaction discussed above fails after executing operation 2, the value of A will remain 5000 in the database, which is not acceptable to the bank. To avoid this, the database has two important operations (sketched in code below):
 Commit: After all instructions of a transaction are successfully executed, the changes made by the transaction are made permanent in the database.
 Rollback: If a transaction is not able to execute all operations successfully, all the changes made by the transaction are undone.
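To make the read/write-via-buffer model and the two operations concrete, here is a minimal Python sketch. It is illustrative only: the `database` dictionary, the in-memory `buffer`, and the helper names are assumptions of this sketch, not a real DBMS API.

```python
# Minimal sketch of R(A)/W(A) with commit and rollback.
# 'database' and 'buffer' are stand-ins for disk and main memory.
database = {"A": 5000}   # persistent store
buffer = {}              # main-memory buffer of the running transaction

def read(item):
    buffer[item] = database[item]      # R(A): copy the value into the buffer

def write(item):
    database[item] = buffer[item]      # W(A): copy the buffer value back

def debit(item, amount, fail=False):
    read(item)
    buffer[item] -= amount             # A = A - 1000 happens only in the buffer
    if fail:
        buffer.clear()                 # rollback: discard the buffered change
        return
    write(item)                        # commit: the change becomes permanent

debit("A", 1000)
print(database["A"])          # 4000
debit("A", 1000, fail=True)
print(database["A"])          # still 4000: the failed debit changed nothing
```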
Properties of a Transaction
Atomicity: As a transaction is a set of logically related operations, either all of them should be executed or none. The debit transaction discussed above should either execute all three operations or none. If the debit transaction fails after executing operations 1 and 2, its new value 4000 will not be updated in the database, which leads to inconsistency.
Consistency: If operations of debit and credit transactions on the
same account are executed concurrently, it may leave the database
in an inconsistent state.
 For Example, with T1 (debit of Rs. 1000 from A) and T2
(credit of 500 to A) executing concurrently, the database
reaches an inconsistent state.
 Let us assume the Account balance of A is Rs. 5000. T1
reads A(5000) and stores the value in its local buffer space.
Then T2 reads A(5000) and also stores the value in its local
buffer space.
 T1 performs A=A-1000 (5000-1000=4000) and 4000 is
stored in T1 buffer space. Then T2 performs A=A+500
(5000+500=5500) and 5500 is stored in the T2 buffer
space. T1 writes the value from its buffer back to the
database.
 A’s value is updated to 4000 in the database and then T2
writes the value from its buffer back to the database. A’s
value is updated to 5500 which shows that the effect of the
debit transaction is lost and the database has become
inconsistent.
 To maintain consistency of the database, we need concurrency control protocols, which are discussed later in this unit. The operations of T1 and T2 with their buffers and the database are shown in Table 1.
Table 1

T1         | T1’s buffer space | T2        | T2’s buffer space | Database
           |                   |           |                   | A=5000
R(A);      | A=5000            |           |                   | A=5000
           | A=5000            | R(A);     | A=5000            | A=5000
A=A-1000;  | A=4000            |           | A=5000            | A=5000
           | A=4000            | A=A+500;  | A=5500            | A=5000
W(A);      | A=4000            |           | A=5500            | A=4000
           |                   | W(A);     | A=5500            | A=5500

Isolation: The result of a transaction should not be visible to others before the transaction is committed. For example, let us assume that A’s balance is Rs. 5000 and T1 debits Rs. 1000 from A. A’s new balance will be 4000. If T2 credits Rs. 500 to A’s new balance, A will become 4500, and after this T1 fails. Then we have to roll back T2 as well, because it used the value produced by T1. So the results of a transaction are not made visible to other transactions before it commits.
Durability: Once the database has committed a transaction, the changes made by the transaction should be permanent. E.g., if a person has credited $500000 to his account, the bank can’t say that the update has been lost. To avoid this problem, multiple copies of the database are stored at different locations.
What is a Schedule?
A schedule is a series of operations from one or more transactions. A schedule can be of two types:
Serial Schedule: When one transaction completely executes before another transaction starts, the schedule is called a serial schedule. A serial schedule is always consistent. E.g., if a schedule S has debit transaction T1 and credit transaction T2, the possible serial schedules are T1 followed by T2 (T1->T2) or T2 followed by T1 (T2->T1). A serial schedule has low throughput and poor resource utilization.
Concurrent Schedule: When operations of a transaction are interleaved with operations of other transactions in a schedule, the schedule is called a concurrent schedule. E.g., the schedule of debit and credit transactions shown in Table 1 is concurrent. But concurrency can lead to inconsistency in the database, as that example shows.
Difference between Serial Schedule and Serializable Schedule

Serial Schedule | Serializable Schedule
In a serial schedule, transactions are executed one after the other. | In a serializable schedule, transactions are executed concurrently.
Serial schedules are less efficient. | Serializable schedules are more efficient.
In a serial schedule, only one transaction is executed at a time. | In a serializable schedule, multiple transactions can be executed at a time.
A serial schedule takes more time for execution. | In a serializable schedule, execution is fast.

Concurrency Control in DBMS

 Executing a single transaction at a time increases the waiting time of the other transactions, which may delay the overall execution. Hence, to increase the overall throughput and efficiency of the system, several transactions are executed concurrently.
 Concurrency control, as defined above, ensures the simultaneous execution or manipulation of data by several processes or users without resulting in data inconsistency.
 Concurrency control provides a procedure that is able to control the concurrent execution of the operations in the database.
Concurrency Control Problems
Several problems arise when numerous transactions execute simultaneously in a random manner. A database transaction consists of two major operations, “Read” and “Write”. It is very important to manage these operations in the concurrent execution of transactions in order to maintain the consistency of the data.

Dirty Read Problem (Write-Read conflict)

The dirty read problem occurs when one transaction updates an item but, due to some unexpected event, that transaction fails, and before it can roll back, some other transaction reads the updated value. This creates an inconsistency in the database. The dirty read problem arises from a Write-Read conflict between transactions. It can be illustrated with the following scenario between two transactions T1 and T2 (a minimal simulation follows the list):
1. Transaction T1 modifies a database record without committing the changes.
2. T2 reads the uncommitted data changed by T1.
3. T1 performs a rollback.
4. T2 has already read the uncommitted data of T1, which is no longer valid, thus creating inconsistency in the database.
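Here is a minimal simulation of that scenario, assuming a toy model in which a transaction's uncommitted write lives in its private buffer. All names (`database`, `t1_buffer`, `t2_view`) are hypothetical, introduced only for this sketch.

```python
# Illustrative sketch of the dirty-read scenario: T2 reads a value
# that T1 later rolls back. Not a real DBMS API.
database = {"A": 5000}

# T1 modifies A but has not committed: the new value lives in T1's buffer.
t1_buffer = {"A": database["A"] - 1000}   # uncommitted write, A = 4000

# T2 is allowed to read T1's uncommitted value (a dirty read).
t2_view = t1_buffer["A"]                  # T2 sees 4000

# T1 fails and rolls back: its buffered change is simply discarded.
t1_buffer.clear()

print(database["A"])  # 5000 -- the real, committed value
print(t2_view)        # 4000 -- T2 acted on data that never officially existed
```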

Lost Update Problem

The lost update problem occurs when two or more transactions modify the same data, resulting in one update being overwritten or lost by another transaction. It can be illustrated with the following scenario between two transactions T1 and T2 (a runnable sketch follows the list):
1. T1 reads the value of an item from the database.
2. T2 starts and reads the same database item.
3. T1 updates the value of that data and performs a commit.
4. T2 updates the same data item based on its initial read and performs a commit.
5. T1’s modification is lost under T2’s write, which causes the lost update problem in the database.
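The race can be reproduced with two Python threads that both follow the read-compute-write pattern with no locking; the `sleep` exists only to force the two reads to interleave, and the dictionary-based `database` is an assumption of this sketch.

```python
# Runnable sketch of the lost-update race: two unsynchronized transactions.
import threading
import time

database = {"A": 5000}

def transaction(delta):
    local = database["A"]           # 1. read into a local buffer
    time.sleep(0.1)                 # force the two reads to interleave
    database["A"] = local + delta   # 2. write back, clobbering the other update

t1 = threading.Thread(target=transaction, args=(-1000,))  # debit
t2 = threading.Thread(target=transaction, args=(+500,))   # credit
t1.start(); t2.start()
t1.join(); t2.join()

# A serial execution would give 4500; with this interleaving one update is lost.
print(database["A"])   # 4000 or 5500, depending on which write lands last
```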
Concurrency Control Protocols
Concurrency control protocols are the set of rules maintained in order to solve the concurrency control problems in the database. They ensure that concurrent transactions can execute properly while maintaining database consistency. The concurrent execution of transactions is provided with atomicity, consistency, isolation, durability, and serializability via the concurrency control protocols:
 Lock-based concurrency control protocol
 Timestamp-based concurrency control protocol

Lock-based Protocol

In a lock-based protocol, each transaction needs to acquire locks before it starts accessing or modifying data items. There are two types of locks used in databases (a toy lock table is sketched after this list):
 Shared Lock : A shared lock, also known as a read lock, allows multiple transactions to read the data simultaneously. A transaction holding a shared lock can only read the data item; it cannot modify it.
 Exclusive Lock : An exclusive lock is also known as a write lock. An exclusive lock allows a transaction to update a data item. Only one transaction can hold the exclusive lock on a data item at a time. While a transaction holds an exclusive lock on a data item, no other transaction is allowed to acquire a shared or exclusive lock on the same data item.
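The compatibility rules can be illustrated with a toy lock table. This is a sketch, not a real lock manager: it grants or refuses a request immediately instead of blocking the caller, and deadlock handling is omitted; the class and method names are assumptions of this example.

```python
# Minimal single-threaded sketch of a shared/exclusive lock table.
class LockTable:
    def __init__(self):
        self.locks = {}   # item -> (mode, set of holding transactions)

    def acquire(self, txn, item, mode):
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        # Shared locks are compatible only with other shared locks.
        if mode == "S" and held_mode == "S":
            holders.add(txn)
            return True
        return False   # any combination involving an "X" lock conflicts

    def release(self, txn, item):
        mode, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]

lt = LockTable()
print(lt.acquire("T1", "A", "S"))  # True  - read lock granted
print(lt.acquire("T2", "A", "S"))  # True  - shared locks coexist
print(lt.acquire("T3", "A", "X"))  # False - write lock conflicts with readers
```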
There are two kinds of lock-based protocol mostly used in databases (the two-phase rule itself is sketched in code after this list):
 Two-Phase Locking Protocol : Two-phase locking is a widely used technique which ensures a strict ordering of lock acquisition and release. The protocol works in two phases.
 Growing Phase : In this phase, the transaction only acquires the locks it needs for reading and modifying data items; it does not release any lock.
 Shrinking Phase : In this phase, the transaction releases its acquired locks. Once the transaction releases its first lock, it cannot acquire any further locks.
 Strict Two-Phase Locking Protocol : It is almost identical to the two-phase locking protocol. The only difference is that under plain two-phase locking a transaction may release its locks before it commits, whereas under strict two-phase locking a transaction is allowed to release its locks only when it commits.
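The two-phase rule can be expressed in a few lines. The sketch below uses a hypothetical class, not a DBMS API: it rejects any lock acquired after the first release, and models strict 2PL by allowing releases only at commit.

```python
# Sketch of the two-phase rule: once a transaction releases any lock
# (entering its shrinking phase), further acquisitions are rejected.
class TwoPhaseTxn:
    def __init__(self, strict=False):
        self.held, self.shrinking, self.strict = set(), False, strict

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: acquiring after a release")
        self.held.add(item)

    def unlock(self, item):
        if self.strict:
            raise RuntimeError("strict 2PL: locks are released only at commit")
        self.shrinking = True          # first release ends the growing phase
        self.held.discard(item)

    def commit(self):
        self.held.clear()              # release all remaining locks at once

t = TwoPhaseTxn()
t.lock("A"); t.lock("B")   # growing phase
t.unlock("A")              # shrinking phase begins
try:
    t.lock("C")            # illegal under 2PL
except RuntimeError as e:
    print(e)
```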
Timestamp-based Protocol
 In this protocol each transaction has a timestamp attached to it. The timestamp is simply the time at which the transaction enters the system.
 Conflicting pairs of operations are resolved by the timestamp ordering protocol using the timestamp values of the transactions, thereby guaranteeing that the transactions take effect in timestamp order.
Advantages of Concurrency
In general, concurrency means that more than one transaction can work on a system at a time. The advantages of a concurrent system are:
 Waiting Time: The time a process spends in the ready state waiting to get the system for execution is called waiting time. Concurrency leads to less waiting time.
 Response Time: The time taken to get the first response from the CPU is called response time. Concurrency leads to less response time.
 Resource Utilization: Multiple transactions can run in parallel in a system, so concurrency leads to higher resource utilization.
 Efficiency: The amount of output produced in comparison to the given input is called efficiency. Concurrency leads to more efficiency.
Disadvantages of Concurrency
 Overhead: Implementing concurrency control requires
additional overhead, such as acquiring and releasing locks
on database objects. This overhead can lead to slower
performance and increased resource consumption,
particularly in systems with high levels of concurrency.
 Deadlocks: Deadlocks can occur when two or more
transactions are waiting for each other to release resources,
causing a circular dependency that can prevent any of the
transactions from completing. Deadlocks can be difficult to
detect and resolve, and can result in reduced throughput
and increased latency.
 Reduced concurrency: Concurrency control can limit the
number of users or applications that can access the
database simultaneously. This can lead to reduced
concurrency and slower performance in systems with high
levels of concurrency.
 Complexity: Implementing concurrency control can be
complex, particularly in distributed systems or in systems
with complex transactional logic. This complexity can lead
to increased development and maintenance costs.
 Inconsistency: In some cases, concurrency control can
lead to inconsistencies in the database. For example, a
transaction that is rolled back may leave the database in an
inconsistent state, or a long-running transaction may cause
other transactions to wait for extended periods, leading to
data staleness and reduced accuracy.

Conclusion

Concurrency control ensures transaction atomicity, isolation,


consistency, and serializability. Concurrency control issues occur
when many transactions execute randomly. A dirty read happens
when a transaction reads data changed by an uncommitted
transaction. When two transactions update data simultaneously, the
Lost Update issue occurs. Lock-based protocol prevents incorrect
read/write activities. Timestamp-based protocols organise
transactions by timestamp.

ACID Properties in DBMS




A transaction is a single logical unit of work that accesses and


possibly modifies the contents of a database. Transactions access
data using read and write operations.
In order to maintain consistency in a database, before and after the
transaction, certain properties are followed. These are
called ACID properties.
Atomicity:
By this, we mean that either the entire transaction takes place at once or doesn’t happen at all. There is no midway, i.e. transactions do not occur partially. Each transaction is considered as one unit and either runs to completion or is not executed at all. It involves the following two operations.
—Abort: If a transaction aborts, changes made to the database are not visible.
—Commit: If a transaction commits, changes made are visible.
Atomicity is also known as the ‘All or nothing rule’.
Consider the following transaction T consisting of T1 and T2: transfer of 100 from account X to account Y, i.e. T1: read(X), X = X - 100, write(X), and T2: read(Y), Y = Y + 100, write(Y).

If the transaction fails after completion of T1 but before completion of T2 (say, after write(X) but before write(Y)), then the amount has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be executed in its entirety in order to ensure the correctness of the database state, as the sketch below illustrates.
Consistency:

This means that integrity constraints must be maintained so that the database is consistent before and after the transaction. It refers to the correctness of a database. Referring to the example above, the total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, the database is consistent. Inconsistency occurs in case T1 completes but T2 fails, leaving T incomplete.

Isolation:

This property ensures that multiple transactions can occur concurrently without leading to inconsistency of the database state. Transactions occur independently, without interference. Changes occurring in a particular transaction will not be visible to any other transaction until that particular change is written to memory or has been committed. This property ensures that concurrent execution of transactions results in a state equivalent to one obtained by executing them serially in some order.
Let X = 500, Y = 500, and consider two transactions T and T’’, where T performs read(X), X = X * 100, write(X), read(Y), Y = Y - 50, write(Y), and T’’ reads X and Y and computes the sum X + Y.

Suppose T has been executed till read(Y) and then T’’ starts. Because of the interleaving of operations, T’’ reads the updated value of X but the old value of Y, and the sum computed by
T’’: (X + Y = 50,000 + 500 = 50,500)
is thus not consistent with the sum at the end of the transaction:
T: (X + Y = 50,000 + 450 = 50,450).
This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take place in isolation, and changes should be visible only after they have been made to the main memory.
Durability:

This property ensures that once the transaction has completed


execution, the updates and modifications to the database are stored
in and written to disk and they persist even if a system failure
occurs. These updates now become permanent and are stored in
non-volatile memory. The effects of the transaction, thus, are never
lost.
Some important points:

Property    | Responsibility for maintaining the property
Atomicity   | Transaction Manager
Consistency | Application programmer
Isolation   | Concurrency Control Manager
Durability  | Recovery Manager

The ACID properties, in totality, provide a mechanism to ensure the


correctness and consistency of a database in a way such that each
transaction is a group of operations that acts as a single unit,
produces consistent results, acts in isolation from other operations,
and updates that it makes are durably stored.
ACID properties are the four key characteristics that define the
reliability and consistency of a transaction in a Database
Management System (DBMS). The acronym ACID stands for
Atomicity, Consistency, Isolation, and Durability. Here is a brief
description of each of these properties:
1. Atomicity: Atomicity ensures that a transaction is treated as
a single, indivisible unit of work. Either all the operations
within the transaction are completed successfully, or none
of them are. If any part of the transaction fails, the entire
transaction is rolled back to its original state, ensuring data
consistency and integrity.
2. Consistency: Consistency ensures that a transaction takes
the database from one consistent state to another
consistent state. The database is in a consistent state both
before and after the transaction is executed. Constraints,
such as unique keys and foreign keys, must be maintained
to ensure data consistency.
3. Isolation: Isolation ensures that multiple transactions can
execute concurrently without interfering with each other.
Each transaction must be isolated from other transactions
until it is completed. This isolation prevents dirty reads, non-
repeatable reads, and phantom reads.
4. Durability: Durability ensures that once a transaction is
committed, its changes are permanent and will survive any
subsequent system failures. The transaction’s changes are
saved to the database permanently, and even if the system
crashes, the changes remain intact and can be recovered.
Overall, ACID properties provide a framework for ensuring data
consistency, integrity, and reliability in DBMS. They ensure that
transactions are executed in a reliable and consistent manner, even
in the presence of system failures, network issues, or other
problems. These properties make DBMS a reliable and efficient tool
for managing data in modern organizations.

Advantages of ACID Properties in DBMS:

1. Data Consistency: ACID properties ensure that the data


remains consistent and accurate after any transaction
execution.
2. Data Integrity: ACID properties maintain the integrity of the
data by ensuring that any changes to the database are
permanent and cannot be lost.
3. Concurrency Control: ACID properties help to manage
multiple transactions occurring concurrently by preventing
interference between them.
4. Recovery: ACID properties ensure that in case of any failure
or crash, the system can recover the data up to the point of
failure or crash.
Disadvantages of ACID Properties in DBMS:

1. Performance: The ACID properties can cause a performance


overhead in the system, as they require additional
processing to ensure data consistency and integrity.
2. Scalability: The ACID properties may cause scalability issues
in large distributed systems where multiple transactions
occur concurrently.
3. Complexity: Implementing the ACID properties can increase
the complexity of the system and require significant
expertise and resources.
Overall, the advantages of ACID properties in DBMS outweigh the disadvantages. They provide a reliable and consistent approach to data management, ensuring data integrity, accuracy, and reliability. However, in some cases the overhead of implementing ACID properties can cause performance and scalability issues. Therefore, it is important to balance the benefits of ACID properties against the specific needs and requirements of the system.

Types of Schedules Based on Recoverability in DBMS

In this section, we deal with the types of schedules based on recoverability in database management systems (DBMS). Generally, there are three such types of schedule, given as follows:
Schedules Based on Recoverability
 Recoverable Schedule: A schedule is recoverable if it allows recovery of the database to a consistent state after a transaction failure. In a recoverable schedule, a transaction that has updated the database must commit before any other transaction that has read its updates commits. If a transaction fails before committing, its updates must be rolled back, and any transactions that have read its uncommitted data must also be rolled back.
 Cascadeless Schedule: A schedule is cascadeless if it does not result in a cascading rollback of transactions after a failure. In a cascadeless schedule, a transaction may read only data written by already committed transactions, so if a transaction fails before committing, its updates must be rolled back, but no other transaction needs to be rolled back with it.
 Strict Schedule: A schedule is strict if it is both recoverable and cascadeless. In a strict schedule, a transaction can neither read nor overwrite a data item written by another transaction until that transaction commits. If a transaction fails before committing, only its own updates must be rolled back.
These types of schedules are important because they affect the consistency and reliability of the database system. It is essential to ensure that schedules are recoverable, cascadeless, or strict to avoid inconsistencies and data loss in the database.

Recoverability

Recoverable Schedule
A schedule is recoverable if every transaction commits only after all the transactions it has read from have committed. Reading the value written by another transaction is permissible only when, for the read dependency Ti->Tj, the commit order is Ci before Cj.
Example:

S1: R1(x), W1(x), R2(x), R1(y), R2(y),
W2(x), W1(y), C1, C2;

The given schedule follows the order Ti->Tj => C1->C2. R2(x) appears after W1(x), so T2 reads from T1, and T1, the transaction that performed the first update on data item x, commits before T2. Hence the given schedule is recoverable.
Let us see an example of an unrecoverable schedule to make the concept clearer.

S2: R1(x), R2(x), R1(z), R3(x), R3(y), W1(x),
W3(y), R2(y), W2(z), W2(y), C1, C2, C3;

Here R2(y) comes after W3(y), so T2 reads from T3 (Ti->Tj gives T3->T2), yet the commit order is C2->C3: the reader commits before the writer it depends on. The given schedule is therefore unrecoverable. If the commit order were C3->C2 instead, it would become a recoverable schedule.
Note: A committed transaction should never be rolled back. If a transaction reads a value from an uncommitted transaction and then commits, the schedule becomes inconsistent or unrecoverable; this is the dirty read problem again. A small recoverability checker is sketched below.
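The reads-from rule above can be checked mechanically. In this sketch the schedule encoding — tuples like ("R", "T1", "x") and ("C", "T1") — is an assumption introduced for the example.

```python
# Checker for recoverability: if Tj reads a value written by Ti, then
# Ti must commit before Tj commits.
def is_recoverable(schedule):
    last_writer = {}    # item -> transaction that last wrote it
    reads_from = set()  # (writer, reader) pairs
    committed = []      # commit order, in schedule order
    for op in schedule:
        if op[0] == "W":
            last_writer[op[2]] = op[1]
        elif op[0] == "R":
            w = last_writer.get(op[2])
            if w is not None and w != op[1]:
                reads_from.add((w, op[1]))
        elif op[0] == "C":
            committed.append(op[1])
    return all(committed.index(w) < committed.index(r)
               for (w, r) in reads_from
               if w in committed and r in committed)

S = [("R", "T1", "x"), ("W", "T1", "x"), ("R", "T2", "x"),
     ("W", "T1", "y"), ("C", "T1"), ("C", "T2")]
print(is_recoverable(S))   # True: T1 commits before T2, which read its write
```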
Example:
(Figure: dirty read leading to an unrecoverable schedule.)

Cascadeless Schedule

A schedule is cascadeless when no transaction reads a value written by a transaction that has not yet committed; every read happens only after the writer commits, so an abort never cascades. Write-write conflicts on uncommitted data are still permitted.
Example:

S3: R1(x), R2(z), R3(x), R1(z), R2(y), R3(y), W1(x), C1,
W2(z), W3(y), W2(y), C3, C2;

In this schedule W2(y) overwrites W3(y) before T3 commits, a write-write conflict, but no transaction reads uncommitted data; therefore the given schedule is a cascadeless schedule.

Special Case: a committed transaction may later be desired to abort. As all the transactions in the figure read only committed data, the schedule is cascadeless.
(Figure: cascadeless schedule.)

Strict Schedule
A schedule is strict if no transaction reads or writes a data item written by another transaction until that writer commits; nothing in the schedule touches uncommitted writes.

Example:

S4: R1(x), R2(x), R1(z), R3(x), R3(y),
W1(x), C1, W3(y), C3, R2(y), W2(z), W2(y), C2;
In this schedule, no read-write or write-write conflict on an uncommitted value arises before the writer commits, hence it is a strict schedule:

(Figure: strict schedule.)

Cascading Abort: a cascading abort is a rollback forced by another rollback. If transaction T1 aborts after T2 has read data written by the not-yet-committed T1, then T2 must also be rolled back; hence the rollback cascades.
Example:
(Figure: cascading abort.)

Co-relation between Strict, Cascadeless, and Recoverable Schedules

Strict schedules are a subset of cascadeless schedules, which in turn are a subset of recoverable schedules.
(Figure: relationship between strict, cascadeless, and recoverable schedules.)
Serializability in DBMS

In this section we explain the concept of serializability, how it affects a DBMS, and illustrate it with some examples, finishing with a note on why serializability matters. A properly designed database is the foundation of most modern applications, providing high performance and reliable storage to the application.
What is a serializable schedule, and what is it used for?
If a non-serial schedule can be transformed into an equivalent serial schedule, it is said to be serializable. Simply put, a non-serial schedule is referred to as a serializable schedule if it yields the same results as some serial schedule.
Non-serial Schedule
A non-serial schedule is one in which the transactions overlap or switch places. Multiple transactions run at once, as they do when carrying out actual database operations, and these transactions may be working on the same data. It is therefore crucial that non-serial schedules be serializable, so that our database is consistent both before and after the transactions execute.
Example:

Transaction-1 | Transaction-2
R(a)          |
W(a)          |
              | R(b)
              | W(b)
R(b)          |
              | R(a)
W(b)          |
              | W(a)

We can observe that Transaction-2 begins its execution before Transaction-1 is finished, and they are both working on the same data, i.e., “a” and “b”, interchangeably. Here “R” = Read and “W” = Write.
Serializability testing
We can use the serialization graph, or precedence graph, to test a schedule’s serializability. A serialization graph is a directed graph built over all the transactions of a schedule.

Precedence Graph
It can be described as a graph G(V, E) with vertices V = {V1, V2, V3, …, Vn} and directed edges E = {E1, E2, E3, …, En}. An edge is created by a pair of conflicting READ or WRITE operations from two transactions: an edge Ti -> Tj means transaction Ti performs a conflicting read or write on a data item before transaction Tj does.
Types of Serializability
There are two ways to check whether a non-serial schedule is serializable: conflict serializability and view serializability.

1. Conflict serializability
Conflict serializability is the subset of serializability that preserves database consistency by requiring conflicting operations on the same data item to keep their relative order. Two operations conflict only when all of the following conditions hold:
1. The two operations belong to different transactions.
2. Both operations access the identical data item.
3. At least one of the two operations is a write operation.
For example, consider an order table and a customer table as two instances: each order is associated with one customer, even though a single customer may place many orders, so concurrent transactions touching the same customer row must keep their conflicting operations ordered.
Example
Three transactions, t1, t2, and t3, are active on a schedule “S” at once. Let’s create the precedence graph.

Transaction-1 (t1) | Transaction-2 (t2) | Transaction-3 (t3)
R(a)               |                    |
                   | R(b)               |
                   |                    | R(b)
                   | W(b)               |
                   |                    | W(a)
                   | W(a)               |
                   | R(a)               |
                   | W(a)               |

It is a conflict serializable schedule, equivalent to a serial schedule, because the precedence graph (a DAG) has no cycles. Since it is conflict serializable, we can also determine an equivalent serial order of the transactions.
(Figure: the precedence graph of the schedule, a DAG.)

As there is no incoming edge on transaction t1, t1 is executed first. t3 runs second because it depends only on t1. Due to its dependence on both t1 and t3, t2 is executed last. Therefore, the equivalent serial order is: t1 -> t3 -> t2. The sketch below builds this precedence graph programmatically.
Note: A schedule is unquestionably consistent if it is conflict serializable. A schedule that is not conflict serializable, on the other hand, might or might not be serializable; we employ the idea of view serializability to examine its serial behavior further.
2. View Serializability
View serializability is the weaker notion of serializability under which a schedule is acceptable if it is view-equivalent to some serial schedule, i.e. every transaction observes the same values, and the final state of each data item is the same, as in that serial execution. In contrast to conflict serializability, view serializability is concerned only with avoiding database inconsistency in what transactions read and what they finally write.
To further understand view serializability in DBMS, consider two schedules S1 and S2 established over the same two transactions T1 and T2. For the schedules to be equivalent, each must satisfy the three conditions listed below.
1. The first condition is that the same set of transactions appears in both schedules. If one schedule commits a transaction that does not appear in the other, the schedules are not equal to one another.
2. The second condition is that the schedules must not differ in their write operations. If schedule S1 has two write operations on an item while schedule S2 has only one, the two schedules are not similar: the number of write operations must be the same in both schedules, though there is no issue if the number of read operations differs.
3. The third condition is that the two schedules must not conflict in the execution order on any single data item. Assume, for instance, that in schedule S1 transaction T1 writes data item A before T2 does, while in S2 the order is reversed; the schedules are then not equal. If the final writes on every data item agree, the schedules can be considered equivalent to one another.
What is view equivalency?
Schedules S1 and S2 must satisfy these requirements in order to be view equivalent:
1. Initial read: the same piece of data must be read first by the same transaction. For instance, if transaction t1 reads the initial value of “A” from the database in schedule S1, then t1 must also read the initial value of A in schedule S2.
2. Final write: the final write on each data item must be by the same transaction. As an illustration, if transaction t1 updated A last in S1, it should also perform the final write on A in S2.
3. The middle sequence must follow suit: if in S1 t1 reads a value of A that t2 updated, then in S2 t1 should also read the value of A written by t2.
View serializability refers to the process of determining whether a schedule is view-equivalent to a serial schedule. A small checker for these three conditions is sketched below.
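The three conditions can be checked mechanically. The sketch below extracts initial reads, intermediate reads-from pairs, and final writes, and compares them; the (txn, op, item) encoding of schedules is an assumption made for this example.

```python
# Sketch of the three view-equivalence checks: same initial reads, same
# reads-from pairs, and same final writes.
def view_info(schedule):
    initial_reads, reads_from, last_writer = set(), set(), {}
    for txn, op, item in schedule:
        if op == "R":
            writer = last_writer.get(item)
            if writer is None:
                initial_reads.add((txn, item))       # read of the initial value
            elif writer != txn:
                reads_from.add((writer, txn, item))  # intermediate read-from
        else:
            last_writer[item] = txn
    # last_writer now records the final writer of each item.
    return initial_reads, reads_from, last_writer

def view_equivalent(s1, s2):
    return view_info(s1) == view_info(s2)

S  = [("t1", "R", "a"), ("t1", "W", "a"), ("t2", "R", "a"), ("t2", "W", "a"),
      ("t1", "R", "b"), ("t1", "W", "b"), ("t2", "R", "b"), ("t2", "W", "b")]
S_ = [("t1", "R", "a"), ("t1", "W", "a"), ("t1", "R", "b"), ("t1", "W", "b"),
      ("t2", "R", "a"), ("t2", "W", "a"), ("t2", "R", "b"), ("t2", "W", "b")]
print(view_equivalent(S, S_))   # True -> S is view serializable
```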
Example
We have a schedule “S” with two concurrently running transactions, “t1” and “t2”.
Schedule – S:

Transaction-1 (t1) | Transaction-2 (t2)
R(a)               |
W(a)               |
                   | R(a)
                   | W(a)
R(b)               |
W(b)               |
                   | R(b)
                   | W(b)

By switching the mid read-write operations of the two transactions, let’s create its view-equivalent schedule (S’).
Schedule – S’:

Transaction-1 (t1) | Transaction-2 (t2)
R(a)               |
W(a)               |
R(b)               |
W(b)               |
                   | R(a)
                   | W(a)
                   | R(b)
                   | W(b)

S is a view serializable schedule, since a view-equivalent serial schedule (S’) exists.
Note: A conflict serializable schedule is always view serializable, but the converse is not always true.

Advantages of Serializability
1. Execution is predictable: under a serializable execution, the DBMS’s concurrently running transactions behave as if they ran one at a time, so the DBMS holds no surprises: no data loss or corruption occurs, and all variables are updated as intended.
2. Easier debugging: because each transaction can be reasoned about independently, understanding and troubleshooting each database thread is much simpler, which greatly simplifies the debugging process; the concurrent interleaving is then not a concern for us.
3. Lower costs: the serializability property can help decrease the cost of the hardware required for the efficient operation of the database. It may also lower the cost of developing the software.
4. Increased performance: since serializable executions give developers the opportunity to optimize their code for performance, they occasionally outperform non-serializable equivalents.
For a DBMS transaction to be regarded as serializable, it must adhere to the ACID properties. Serializability in DBMS comes in a variety of forms, each with advantages and disadvantages of its own. Most of the time, choosing the best sort of serializability involves a trade-off between performance and correctness, and making the wrong choice can result in database issues that are challenging to track down and resolve. This section should have given you a better understanding of how serializability in DBMS functions and of the different types that are available.

Concurrency Control Techniques





Concurrency control is provided in a database to:
 (i) enforce isolation among transactions,
 (ii) preserve database consistency through consistency-preserving execution of transactions, and
 (iii) resolve read-write and write-read conflicts.
Various concurrency control techniques are:
1. Two-phase locking protocol
2. Timestamp ordering protocol
3. Multiversion concurrency control
4. Validation concurrency control
These are briefly explained below.
1. Two-Phase Locking Protocol: Locking is an operation which secures permission to read or permission to write a data item. Two-phase locking is a process used to gain ownership of shared resources without creating the possibility of deadlock. The three activities taking place in the two-phase update algorithm are:
(i) Lock acquisition
(ii) Modification of data
(iii) Release of locks
The conservative variant described here prevents deadlock in distributed systems by having a transaction release all the resources it has acquired whenever it cannot acquire everything it requires without waiting for another process to finish using a lock. No process is then ever holding some shared resources while waiting for another process to release a resource it needs, so deadlock cannot occur due to resource contention. (Note that basic two-phase locking on its own does not prevent deadlocks; it is this all-or-nothing lock acquisition that does.) A transaction in the two-phase locking protocol can be in one of two phases:
 (i) Growing Phase: In this phase a transaction can only acquire locks but cannot release any lock. The point when a transaction has acquired all the locks it needs is called the lock point.
 (ii) Shrinking Phase: In this phase a transaction can only release locks but cannot acquire any.
2. Timestamp Ordering Protocol: A timestamp is a tag that can be attached to any transaction or any data item, denoting the specific time at which the transaction or data item was last used in any way. A timestamp can be implemented in two ways: one is to directly assign the current value of the clock to the transaction or data item; the other is to attach the value of a logical counter that is incremented each time a new timestamp is required. The timestamps of a data item are of two types:
 (i) W-timestamp(X): the latest time at which data item X was written.
 (ii) R-timestamp(X): the latest time at which data item X was read. These two timestamps are updated each time a successful read/write operation is performed on the data item X. (The accept/reject rules built on them are sketched below.)
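The following sketch applies the standard textbook accept/reject rules of basic timestamp ordering on top of these two timestamps; the text above defines only the timestamps themselves, so the rules here are an addition, and returning False stands in for rolling back the requesting transaction.

```python
# Sketch of basic timestamp ordering over W-timestamp(X) and R-timestamp(X).
W_ts, R_ts = {}, {}   # per-item write / read timestamps

def read(txn_ts, item):
    if txn_ts < W_ts.get(item, 0):
        return False              # a younger transaction already wrote the item
    R_ts[item] = max(R_ts.get(item, 0), txn_ts)
    return True

def write(txn_ts, item):
    if txn_ts < R_ts.get(item, 0) or txn_ts < W_ts.get(item, 0):
        return False              # a younger transaction already read or wrote it
    W_ts[item] = txn_ts
    return True

print(write(10, "X"))   # True:  T(10) writes X
print(read(5, "X"))     # False: T(5) is older than X's writer -> roll back T(5)
print(read(20, "X"))    # True:  T(20) may read X
print(write(15, "X"))   # False: T(20) already read X -> roll back T(15)
```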
3. Multiversion Concurrency Control: Multiversion schemes keep old versions of data items to increase concurrency. In multiversion two-phase locking, each successful write results in the creation of a new version of the data item written, and timestamps are used to label the versions. When a read(X) operation is issued, an appropriate version of X is selected based on the timestamp of the transaction.
4. Validation Concurrency Control: The optimistic approach is based on the assumption that the majority of database operations do not conflict. It requires neither locking nor timestamping techniques; instead, a transaction is executed without restrictions until it is committed. Using an optimistic approach, each transaction moves through two or three phases, referred to as read, validation and write.
 (i) During the read phase, the transaction reads the database, executes the needed computations and makes the updates to a private copy of the database values. All update operations of the transaction are recorded in a temporary update file, which is not accessed by the remaining transactions.
 (ii) During the validation phase, the transaction is validated to ensure that the changes made will not affect the integrity and consistency of the database. If the validation test is positive, the transaction goes to the write phase; if it is negative, the transaction is restarted and the changes are discarded.
 (iii) During the write phase, the changes are permanently applied to the database. (A compact sketch of the three phases follows.)

Difference between Deferred Update and Immediate Update
1. Deferred Update: This is a technique for the maintenance of the transaction log files of the DBMS, also called the NO-UNDO/REDO technique. It is used for recovery from transaction failures that occur due to power, memory, or OS failures. Whenever a transaction is executed, the updates are not made immediately to the database; they are first recorded in the log file, and those changes are applied once commit is done. This is called the “re-doing” process. Once a rollback is done, none of the changes are applied to the database, and the changes in the log file are discarded. If commit occurred before the system crashed, then after restarting the system the changes recorded in the log file are applied to the database.
2. Immediate Update: This is also a technique for the maintenance of the transaction log files of the DBMS, called the UNDO/REDO technique. It too is used for recovery from transaction failures that occur due to power, memory, or OS failures. Whenever a transaction is executed, the updates are made directly to the database, and a log file is maintained containing both old and new values. Once commit is done, all the changes are stored permanently in the database, and the records in the log file are discarded. Once a rollback is done, the old values are restored in the database and all the changes made to the database are discarded. This is called the “un-doing” process. If commit occurred before the system crashed, then after restarting the system the changes remain stored permanently in the database. Both techniques are sketched in code below.
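The contrast can be sketched on a single data item: deferred update keeps a redo-only log and leaves the database untouched until commit, while immediate update writes through and keeps before-images for undo. All structures here (`database`, `redo_log`, `undo_log`) are illustrative.

```python
# Sketch contrasting deferred update (NO-UNDO/REDO) with immediate
# update (UNDO/REDO) on one data item.
database = {"A": 100}

# --- Deferred update: log first, apply only at commit ---
redo_log = [("A", 150)]                  # (item, new_value) records
for item, new in redo_log:               # commit: "re-doing" the logged changes
    database[item] = new
print(database["A"])                     # 150; a rollback would instead just
                                         # discard the log, leaving A untouched

# --- Immediate update: write through, keep before-images ---
undo_log = [("A", database["A"])]        # old value saved before overwriting
database["A"] = 200                      # change applied to the database at once
for item, old in reversed(undo_log):     # rollback: "un-doing" from before-images
    database[item] = old
print(database["A"])                     # back to 150
```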
Difference between deferred update and immediate update:

Deferred Update | Immediate Update
The changes are not applied immediately to the database. | The changes are applied directly to the database.
The log file contains all the changes that are to be applied to the database. | The log file contains both old as well as new values.
Once rollback is done, all the records of the log file are discarded and no changes are applied to the database. | Once rollback is done, the old values are restored to the database using the records of the log file.
Concepts of buffering and caching are used in the deferred update method. | The concept of shadow paging is used in the immediate update method.
The major disadvantage of this method is that it requires a lot of time for recovery in case of system failure. | The major disadvantage of this method is that there are frequent I/O operations while the transaction is active.
Changes carried out by a transaction are first recorded in the log file and applied to the database on commit; on rollback the maintained records are discarded and never applied. | The database is updated directly as the transaction makes changes, and the log file keeps the old and new values; in case of rollback these records are used to restore the old values.

Feature | Deferred Update | Immediate Update
Update timing | Updates occur after instruction execution. | Updates occur during instruction execution.
Processor speed | May be faster: updating after execution allows multiple updates to be performed at once. | May be slower: updating during execution can cause the processor to stall.
Complexity | More complex: requires additional instructions or mechanisms to handle updates after execution. | Less complex: updates occur immediately during execution.
Consistency | May result in temporary inconsistency between data in registers and memory. | Data in registers and memory are always consistent.
Flexibility | May be more flexible: allows for more complex data manipulations and algorithms. | Less flexible: immediate updates can limit the range of data manipulations and algorithms that can be performed.
Shadow Paging in DBMS: Example, Advantages, Disadvantages

Shadow paging is a process for recovering data in a database management system. It is a recovery technique where transactions are performed in main memory and, on completion, their updates are applied to the database. If a failure happens during transaction execution, the updates simply never show up in the database.

Shadow paging is a recovery method for retrieving data in DBMS. Its main use is maintaining the consistency of the data if a failure happens. The technique is also known as out-of-place updating, because updated pages are written to new locations rather than overwritten in place. No log is needed in a single-user environment while performing this technique; however, the system needs a log for concurrency control in a multiuser environment.

It offers durability and atomicity to the system in the entire process. The technique uses a number of disk pages to complete the operation, and it gives the power to manipulate pages in a database, which is a very important step to perform.
What Kinds of Functions does Shadow Paging Perform in Different Environments?
Shadow paging behaves differently in different environments. Although most programmers don’t take this seriously, it is worth understanding. The two cases are explained below:

 Single-User Environment

In a single-user environment, the recovery scheme never requires a log; the entire operation can easily be completed without using a log in any manner.
 Multi-User Environment
Things are entirely different when shadow paging is performed in a multiuser environment: a log is required, because a concurrency control procedure is involved.

Example of Shadow Paging

Shadow paging is considered a good alternative to log-based recovery. The idea is to maintain two page tables (directories) throughout the life of a transaction: the shadow page table and the current page table.

In shadow paging, the database is made up of a number of fixed-size disk pages, and a directory over these pages is used for the recovery process. The directory has n entries, where the i-th entry points to the i-th database page on disk. When a transaction starts, the current directory, whose entries point to the current database pages, is copied into a directory called the shadow directory.
The current directory is used during the operation, whereas the shadow directory is saved on non-volatile disk. No modification is performed on the shadow directory during the execution of the transaction. Instead, when a write operation is executed, a copy of the modified page is created without overwriting the old page.

The new page is written to a previously unused disk block. The current directory is modified to point to the new disk block, while the shadow directory is not modified and continues to point to the old, unmodified disk block. Two versions are therefore maintained for updated pages: the shadow directory references the older version, whereas the current directory points to the new version. If recovery from a failure is needed, the modified pages are released and the current directory is thrown away.

The state of the data before the execution of the transaction remains accessible through the shadow directory, and recovery reinstates the shadow directory to restore that state. This is how the entire process of shadow paging is performed; a minimal sketch follows.
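A minimal sketch of the mechanism, assuming an in-memory dictionary as the "disk" and plain dictionaries as the two directories; all names are illustrative.

```python
# Sketch of shadow paging: pages are never updated in place. A write copies
# the page to a fresh block and repoints only the current directory;
# recovery reinstates the shadow directory.
disk = {0: "page-0 v1", 1: "page-1 v1"}   # block number -> page contents
next_free_block = 2

current = {0: 0, 1: 1}     # page number -> block number
shadow = dict(current)     # frozen copy taken when the transaction starts

def write_page(page, contents):
    global next_free_block
    disk[next_free_block] = contents      # out-of-place write to a new block
    current[page] = next_free_block       # only the current directory moves
    next_free_block += 1

def recover():
    global current
    current = dict(shadow)                # throw away the current directory

write_page(0, "page-0 v2")
print(disk[current[0]])   # page-0 v2  (seen via the current directory)
print(disk[shadow[0]])    # page-0 v1  (shadow still points at the old block)
recover()
print(disk[current[0]])   # page-0 v1  (state before the transaction)
```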
Advantages of Shadow Paging in DBMS
Compared with log-based methods, shadow paging offers a good set of advantages, described below:
 The overhead of log-record output is removed, so recovery from a crash proceeds in a much better way.
 Shadow paging requires fewer disk accesses to perform the whole operation.
 The Undo and Redo operations do not need to be performed.
 It is an inexpensive and faster method of recovery after a crash.
Disadvantages of Shadow Paging in DBMS
The disadvantages of shadow paging are listed below:
 It becomes tough to keep related pages close together on disk, because their locations change.
 There is a chance of data fragmentation while performing shadow paging.
 The system needs multiple blocks to commit a single transaction, which reduces execution speed.
 Extending the algorithm to let transactions operate concurrently is difficult.

How is shadow paging different from log-based recovery?
Log-based recovery uses a log structure to record database modifications. Many people ask how shadow paging differs from it. The major differences between the two are:
 A log record is made up of fields such as the data item identifier, the transaction identifier, the old value and the new value. Shadow paging, on the other hand, works with a number of fixed-size disk pages.
 The search process in log-based recovery can be quite long, while shadow paging is faster thanks to the elimination of the overhead of log-record output.
 Only one block write is needed to complete a single transaction in log-based recovery, but the commit of one transaction requires multiple block writes in shadow paging.
 The locality property of pages can be lost in shadow paging, while this never happens in log-based recovery.