r18 Dbms Unit-IV
Transaction in Database:
A transaction is a set of operations that performs a logical unit of work. For example, the work may be withdrawing money: to withdraw money, some amount must be transferred from one account to another through an ATM, online banking, etc.
A transaction generally represents a change in the database.
Every bank maintains a database server that holds all the information about its account holders. When a transaction is needed, the following basic operations are performed on the data.
READ: The sender's account information is fetched from the hard disk of the database server into the main memory of the processing device.
WRITE: The fetched information is then changed by the CPU according to the needs of the transaction. At this point all the changes are still in main memory.
Finally, the COMMIT command makes the changes permanent in the database on the server's hard disk.
Example:
Let’s take an example of a simple transaction. Suppose a bank employee transfers Rs. 1000 from A’s account to B’s account. This simple transaction contains several low-level mini tasks.
A’s Account:
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 1000
A.balance = New_Balance
Close_Account(A)
B’s Account:
Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 1000
B.balance = New_Balance
Close_Account(B)
Transaction Properties:
A transaction has four properties that ensure the database maintains consistency,
accuracy, completeness, and data integrity before and after the transaction executes.
These properties are called the ACID properties of a transaction.
ACID Properties
Atomicity
Consistency
Isolation
Durability
The properties mentioned above are commonly known as the ACID properties. Let us explain each of them one
by one.
Atomicity:
This property states that either all of a transaction's operations are executed or none of them are.
Execution of a transaction should start from the first step (fetch) and end with the last step
(commit). There should be no abortion or failure in between; an atomic transaction cannot stop partway through.
If there is any failure or abortion, even at point 99.9 out of 100, then there must be a rollback.
A rollback eliminates all the work done so far and transfers control back to the start of execution, from where
the transaction can be restarted.
Let us explain with an example:
If we want to transfer money from one account (Account “A”) to another account (Account “B”),
then the transaction will be
Transaction = (Debit account “A”) + (Credit account “B”)
As the following diagram shows,
If there is no atomicity, then debiting account “A” with Rs. 1000 may not credit account
“B”.
If there is atomicity, then debiting account “A” with Rs. 1000 will also credit account “B”
with Rs. 1000.
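A minimal Python sketch of this all-or-nothing behaviour, assuming a simple in-memory accounts dictionary (the names accounts and transfer are illustrative, not part of any real banking API):

# All-or-nothing transfer: either both the debit and the credit happen, or neither does.
accounts = {"A": 5000, "B": 3000}

def transfer(source, target, amount):
    snapshot = dict(accounts)              # remember the state before the transaction
    try:
        if accounts[source] < amount:
            raise ValueError("insufficient balance")
        accounts[source] -= amount          # debit account "A"
        accounts[target] += amount          # credit account "B"
        # reaching this point plays the role of COMMIT: the changes are kept
    except Exception:
        accounts.clear()
        accounts.update(snapshot)           # ROLLBACK: undo every partial change
        raise

transfer("A", "B", 1000)                    # accounts now hold A = 4000, B = 4000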
2. Consistency:
The database will be in a consistent state if
“the sum of the balances before and after the transaction execution is the same”.
Suppose SUM1 is the total balance of the two accounts before the transaction and SUM2 is the total balance after
the transaction.
The condition for consistency is that SUM1 must be equal to SUM2.
Now, suppose Account A has Rs. 1500 and transfers Rs. 1000 to Account B, which already contains Rs. 3000. Before execution SUM1 = 1500 + 3000 = 4500; after execution A holds 500 and B holds 4000, so SUM2 = 500 + 4000 = 4500, and consistency is preserved.
Note: if the account is debited but the money is not received, then the database is in an inconsistent state. To resolve this,
rollback is again used.
3. Isolation:
The term ‘isolation’ means separation. With isolation, the data of one transaction should not
be affected by another transaction.
Isolation is achieved when a parallel (concurrent) schedule behaves like some serial schedule; a serial schedule is always
consistent.
IMPORTANT NOTE:
If two operations run in parallel on two different accounts or data items, then the values
of the two accounts do not affect each other.
If two operations run in parallel on the same account, then the values can
affect each other. To resolve this problem, the transactions are made to execute serially (or equivalently to a serial order).
4. Durability:
Durability ensures permanency.
The database should be durable: once the database is updated by a committed transaction,
it should hold those changes permanently. Permanent changes mean that the
modified information does not change by itself later, unless and until another
transaction performs some action on it.
Note: The COMMIT command is used to store or update the data permanently.
Transaction States:
States through which a transaction goes during its lifetime are known as transaction states. These
states tell us the current state of the transaction.
There are six major transaction states, which are given below:
1. Active state
2. Partially committed state
3. Committed state
4. Failed state
5. Aborted state
6. Terminated state
A detailed diagram of the transaction states is shown below.
1. Active State
While the instructions of the transaction are executing, the transaction is in the active state.
2. Partially Committed State
After all the instructions of a transaction have executed, the transaction moves from the active state
into the partially committed state.
At this stage changes are still possible, because all the changes made by the
transaction are still stored in the buffer of main memory.
3. Committed State
In the committed state, all changes made by the transaction are permanently stored in the database.
4. Failed State
When a transaction is in the “active state” or “partially committed state” and some failure
occurs that makes it impossible to resume its execution, it enters the failed state.
Note: From this state, the transaction cannot go back to the “partially committed state” or the “active state”.
5. Aborted State
As we know, a failed transaction can never be resumed, but it is possible to restart it.
To restart the failed transaction, the rollback method comes into the picture.
When we roll back (restart) the failed transaction, all the changes made earlier by that
transaction have to be undone.
6. Terminated State
When a transaction reaches the “aborted state” or the “committed state”, it is
terminated, and the system is ready to execute new transactions.
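The six states and the legal transitions between them can be summarised in a small Python sketch (the dictionary below simply encodes the transitions described above; the function name move is illustrative):

# Legal transitions between the six transaction states described above.
ALLOWED = {
    "active":              {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "committed":           {"terminated"},
    "failed":              {"aborted"},
    "aborted":             {"terminated"},
    "terminated":          set(),
}

def move(current, new):
    # Refuse any transition that is not part of the state diagram.
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new

state = "active"
state = move(state, "partially committed")
state = move(state, "committed")
state = move(state, "terminated")           # a successful transaction's lifetime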
Schedules and Their Types:
When multiple transactions are running and an order of execution is needed so that their
operations do not overlap, schedules come into the picture.
In simple words, a schedule defines the order of the operations of each transaction.
1. Serial Schedules:
Schedules in which no transaction can start until the currently running transaction has ended are called serial
schedules.
Example: Consider the following schedules (A and B), each holding two transactions T1 and T2. In a
serial schedule the two transactions execute separately; they cannot execute at the same time, as shown
below.
This is a serial schedule, since the transactions execute serially in the order T1 → T2 or T2 →
T1.
Advantage of Serial Schedule:
The main benefit of a serial schedule is that there is no concurrency problem.
Concurrency Problem:
This problem occurs when two transactions access the same data in such a way that the
execution of one transaction affects the other transaction.
We need parallel transactions to save time, but parallel execution causes the concurrency
problem and read-write conflict problems.
Explanation with a Real Life Example:
If there is only one ATM machine, then only one person can withdraw money at a time and
other persons have to wait. This is a serial schedule.
Using a bank's online application, thousands of people can send and receive
money at the same time. This is a parallel schedule.
Two operations (read or write) are considered conflicting operations if they satisfy all of the
following conditions (see the sketch after this list):
Both operations are on the same data item.
Both operations belong to different transactions.
At least one of the two operations is a write operation.
Two operations are considered non-conflicting operations if they satisfy the
following conditions:
The operations are on different data items.
The operations belong to different transactions.
Note: Two READ operations, whether on the same or different data, also form a non-conflicting pair.
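These conditions can be captured in a short Python sketch; each operation is assumed to be a (transaction, action, data_item) tuple, which is only an illustrative representation:

# Two operations conflict when they belong to different transactions,
# work on the same data item, and at least one of them is a write.
def conflicts(op1, op2):
    t1, action1, item1 = op1
    t2, action2, item2 = op2
    return t1 != t2 and item1 == item2 and "W" in (action1, action2)

print(conflicts(("T1", "R", "x"), ("T2", "W", "x")))   # True  (read-write on the same item)
print(conflicts(("T1", "R", "x"), ("T2", "R", "x")))   # False (read-read never conflicts)
print(conflicts(("T1", "W", "x"), ("T2", "W", "y")))   # False (different data items)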
Serializability:
As we know, some non-serial schedules are also consistent and behave like serial schedules.
To check whether a given non-serial schedule is equivalent to a serial one, the concept of serializability is
used.
According to serializability:
“If a given non-serial schedule of ‘n’ transactions is equivalent to some serial schedule of the same ‘n’
transactions, then it is called a serializable schedule”.
Example:
Suppose there are 3 transactions (T1, T2, T3), where T1 executes first, then T2, then T3. The
serial schedule will then contain the following series of transaction execution:
T1 → T2 → T3
A non-serial schedule, to be serializable, must be equivalent to a series in which no transaction repeats itself, just as in a
serial schedule, but the positions of the transactions may be interchanged. So a serializable non-serial schedule will be equivalent to one of the
following:
T1 → T2 → T3
T1 → T3 → T2
T2 → T1 → T3
T2 → T3 → T1
T3 → T1 → T2
T3 → T2 → T1
If a schedule is equivalent only to a series in which transactions repeat, such as T1 → T2 → T1 → T3
where T1 repeats, then that series is not serial.
Types of Serializability:
1. Conflict Serializability
Conflict serializability is used to check whether a given non-serial schedule is equivalent to a serial schedule or not.
Testing of conflict serializability:
In conflict serializability, we use the conflicting pairs to draw a precedence graph from the given
schedule.
If the precedence graph derived from the schedule contains no loop (cycle), then the schedule is conflict
serializable.
Keep in mind that such a schedule will always be
Consistent
Serializable (equivalent to a serial schedule)
Let us explain with examples.
Example 01: Check whether the given schedule is serializable or not.
Solution:
Here x, y and z are data items, and R, W denote read and write operations.
First find the precedence graph from the above schedule. To draw the precedence graph:
At time 1, R(x) of T1 exists; the conflict of R(x) is W(x), so check for W(x) in T2 and T3. No W(x)
exists in T2 or T3 after time 1, so no edge is drawn from T1 to T2 or T3.
At time 2, R(y) of T3 exists; the conflict of R(y) is W(y), so check for W(y) in T2 and T1. No W(y)
exists in T2 or T1 after time 2, so no edge is drawn from T3 to T1 or T2.
At time 3, R(x) of T3 exists; the conflict of R(x) is W(x), so check for W(x) in T2 and T1. A W(x) exists
in T1 after time 3, so an edge is drawn from T3 to T1.
At time 4, R(y) of T2 exists; the conflict of R(y) is W(y), so check for W(y) in T1 and T3. A W(y)
exists in T3 after time 4, so an edge is drawn from T2 to T3.
At time 5, R(z) of T2 exists; the conflict of R(z) is W(z), so check for W(z) in T3 and T1 after time 5.
A W(z) exists in T1 after time 5, so an edge is drawn from T2 to T1.
At time 6, W(y) of T3 exists; the conflicts of W(y) are R(y) and W(y), so check for R(y) and W(y) in T2
and T1 after time 6. No R(y) or W(y) exists in T2 or T1 after time 6, so no edge is drawn
from T3 to T1 or T2.
At time 7, W(z) of T2 exists; the conflicts of W(z) are R(z) and W(z), so check for W(z) and R(z) in T1
and T3 after time 7. A W(z) exists in T1 after time 7, so an edge is drawn from T2 to T1.
This edge already exists.
At times 8, 9 and 10, the operations R(z), W(x) and W(z) of T1 are executed respectively; after
time 8 no operations of T2 or T3 remain, so no further edges are drawn.
After all the time steps have been processed, the following precedence graph is drawn.
If the precedence graph has no loop (i.e., it is acyclic), the given schedule will be
Conflict serializable
Serializable (equivalent to a serial schedule)
Consistent
The order or sequence of transaction execution is derived using the topological sort method.
According to the topological sort method, we repeatedly select the vertex with the lowest in-degree first. So,
in the above example the sequence will be T2 → T3 → T1.
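The whole test (build the precedence graph, check it for a cycle, and derive the serial order by topological sort) can be sketched in Python as follows; the schedule is assumed to be a list of (transaction, action, item) tuples in time order, and the sample schedule below is written in the spirit of Example 01, not copied from a real exam table:

from collections import defaultdict
from graphlib import TopologicalSorter, CycleError

def precedence_graph(schedule):
    # Draw an edge Ti -> Tj whenever an operation of Ti conflicts with a later operation of Tj.
    edges = defaultdict(set)
    txns = {t for t, _, _ in schedule}
    for i, (ti, a1, x1) in enumerate(schedule):
        for tj, a2, x2 in schedule[i + 1:]:
            if ti != tj and x1 == x2 and "W" in (a1, a2):
                edges[ti].add(tj)
    return txns, edges

def serial_order(schedule):
    txns, edges = precedence_graph(schedule)
    preds = {t: {u for u in txns if t in edges[u]} for t in txns}
    try:
        return list(TopologicalSorter(preds).static_order())   # acyclic: conflict serializable
    except CycleError:
        return None                                            # cycle: not conflict serializable

s = [("T1", "R", "x"), ("T3", "R", "y"), ("T3", "R", "x"), ("T2", "R", "y"),
     ("T2", "R", "z"), ("T3", "W", "y"), ("T2", "W", "z"), ("T1", "R", "z"),
     ("T1", "W", "x"), ("T1", "W", "z")]
print(serial_order(s))   # ['T2', 'T3', 'T1'] for this sample schedule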
Example 02:
List all the conflicting operations and determine the dependencies between the transactions:
R2(A), W1(A)  (T2 → T1)
Step-02:
Example 03:
Check whether the given schedule S is conflict serializable and recoverable or not.
Solution:
Checking whether S is conflict serializable or not:
Step-01:
List all the conflicting operations and determine the dependencies between the transactions.
Step-02:
Draw the precedence graph.
View Equivalent:
Two schedules S1 and S2 are said to be view equivalent if they satisfy the following conditions:
1. Initial Read:
The initial read in both schedules must be the same. Suppose we have two schedules S1 and S2. If in schedule S1
a transaction T1 reads the data item A first, then in S2 transaction T1 should also be the one that reads A first.
The two schedules above are view equivalent because the initial read operation in S1 is done by T1 and in
S2 it is also done by T1.
2. Updated Read:
If in schedule S1 Ti reads A as updated by Tj, then in S2 Ti should also read A as
updated by Tj.
The two schedules above are not view equivalent because in S1 T3 reads A updated by T2, whereas in S2 T3
reads A updated by T1.
3. Final Write:
The final write must be the same in both schedules. If in schedule S1 transaction T1
updates A last, then in S2 the final write on A should also be done by T1.
The two schedules above are view equivalent because the final write operation in S1 is done by T3 and in S2 the
final write operation is also done by T3.
Example:
Schedule S
With 3 transactions, the total number of possible serial schedules
= 3! = 6
S1 = <T1 T2 T3>
S2 = <T1 T3 T2>
S3 = <T2 T3 T1>
S4 = <T2 T1 T3>
S5 = <T3 T1 T2>
S6 = <T3 T2 T1>
Schedule S1
In both schedules S and S1 there is no read other than the initial read, so we do not need to
check the updated-read condition.
The initial read operation in S is done by T1 and in S1 it is also done by T1.
Step 3: Final Write
The final write operation in S is done by T3 and in S1 it is also done by T3. So S and S1 are view
equivalent.
The first serial schedule S1 satisfies all three conditions, so we do not need to check the other schedules. The given schedule S is therefore view serializable, with equivalent serial order
T1 → T2 → T3
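The three view-equivalence conditions can be checked mechanically. The sketch below, with schedules assumed to be lists of (transaction, action, item) tuples and purely hypothetical data, records for every read which transaction wrote the item last (None means an initial read) and compares the final writers:

from collections import Counter

def reads_from(schedule):
    last_writer, reads = {}, []
    for txn, action, item in schedule:
        if action == "R":
            reads.append((txn, item, last_writer.get(item)))   # None = initial read
        else:
            last_writer[item] = txn                            # remember the latest writer
    return reads, last_writer

def view_equivalent(s1, s2):
    r1, final1 = reads_from(s1)
    r2, final2 = reads_from(s2)
    # Conditions 1 and 2: every read sees the same writer; condition 3: same final write per item.
    return Counter(r1) == Counter(r2) and final1 == final2

s_nonserial = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A"), ("T3", "W", "A")]
s_serial    = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A"), ("T3", "W", "A")]
print(view_equivalent(s_nonserial, s_serial))   # True: same initial read and same final write on A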
PRACTICE PROBLEMS BASED ON VIEW SERIALIZABILITY:
Problem-01:
Check whether the given schedule S is view serializable or not.
Solution:
We know that if a schedule is conflict serializable, then it is surely view serializable.
So, let us check whether the given schedule is conflict serializable or not.
List all the conflicting operations and determine the dependencies between the transactions.
Problem-02:
Solution:
Let us first check whether the given schedule is conflict serializable or not.
List all the conflicting operations and determine the dependencies between the transactions.
Now,
since the given schedule S is not conflict serializable, it may or may not be view
serializable.
Now,
to check whether S is view serializable or not, let us use another method.
Recoverable and Irrecoverable Schedules:
Sometimes a transaction may not execute completely due to problems such as hardware
failure, a software issue, or a system crash.
In that case, the failed transaction has to be rolled back to its starting point in time. If the rollback can be
done successfully, the schedule is a recoverable schedule; otherwise it is an irrecoverable
schedule.
1. Irrecoverable Schedule:
When some value written by a transaction (say T1) is read by another transaction (say T2), T2
commits before T1, and T1 fails before it can commit, as shown below, then this case is
irrecoverable.
Explanation:
The above irrecoverable schedule shows two transactions (T1 and T2). T1 reads and writes the
value of data “A”. T2 also reads the same value of “A” which was written by T1. T2 commits, but
later on T1 fails.
Due to the failure of T1, we have to roll back T1 and T2.
T1 is rolled back due to its failure, and T2 has to be rolled back because it read the data written by T1.
But T2 cannot be rolled back because it has already committed. That is why this type of schedule is called an
irrecoverable schedule.
Note: After the rollback of T1, the database should return to its original value of 1500, but this is not possible
because the database was already updated to 3000 by the committed transaction T2. This is the big problem with an
irrecoverable schedule.
2. Recoverable Schedule
A schedule is recoverable if, whenever Tj (say T2) reads a value updated by Ti (say T1), Ti
commits before Tj commits.
Explanation:
The above table of a recoverable schedule shows two transactions. T1 reads and writes the value
of data “A”. T2 also reads the same value of “A” which was written by T1. But later on, T1 fails. Due
to this, we have to roll back T1. Transaction T2 must also be rolled back, because T2 has read the
same value of data “A” written by T1.
Since transaction T2 has not committed before T1, we can roll back transaction T2 as well. So
it is recoverable with cascading rollback.
Note: As nothing was committed by T1 or T2, the database value can be rolled back to its original
value of 1500 successfully. This is called a recoverable schedule.
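The recoverability condition ("whenever Tj reads from Ti, Ti must commit first") can be checked with a small Python sketch; schedules are assumed to be lists of (transaction, operation, item) tuples where the operation is "R", "W" or "C" (commit), which is only an illustrative encoding:

def is_recoverable(schedule):
    last_writer, reads_from, commit_pos = {}, set(), {}
    for pos, (txn, op, item) in enumerate(schedule):
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.add((txn, writer))          # txn reads a value written by writer
        elif op == "C":
            commit_pos[txn] = pos
    # For every reads-from pair, the writer must commit before the reader commits.
    for reader, writer in reads_from:
        if reader in commit_pos and (writer not in commit_pos or commit_pos[writer] > commit_pos[reader]):
            return False
    return True

bad  = [("T1", "W", "A"), ("T2", "R", "A"), ("T2", "C", None), ("T1", "C", None)]
good = [("T1", "W", "A"), ("T2", "R", "A"), ("T1", "C", None), ("T2", "C", None)]
print(is_recoverable(bad), is_recoverable(good))   # False True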
1. Cascading Schedule (Cascading Rollback):
Here,
T2 reads from transaction T1.
T3 reads from transaction T2.
T4 reads from transaction T3.
T5 reads from transaction T4.
And so on.
In this schedule,
if transaction T1 fails, then the other dependent transactions (T2, T3, T4, T5) also have to be rolled back; this chain of rollbacks is called a cascading rollback.
Note: If any of the transactions T2, T3, T4 or T5 are committed before the failure of
transaction T1, then the schedule will not be recoverable.
2. Cascadeless Schedule:
In a cascadeless schedule, a transaction is never allowed to read a data item until
the transaction that last wrote that item has committed or aborted.
It is still problematic in some cases, which are overcome by the third type (strict schedule).
Problem with Cascadeless Schedule:
It still allows a dependent transaction (T2) to perform a write operation after a write operation of an
uncommitted transaction (T1), as shown below.
Remember:
All serial schedules are strict, cascadeless and recoverable, but not vice
versa.
All cascadeless schedules may or may not be strict, but every cascadeless
schedule is always recoverable.
Concurrency Control:
Concurrency control is the management procedure required for controlling the concurrent
execution of operations on a database.
But before learning about concurrency control, we should know about concurrent execution.
Note: 'Uncontrolled manner' means the execution of conflicting pairs without any restriction.
Example: In a multi-user environment, multiple users can access the same database at a
time. If there is a bank database and multiple users perform transactions on it
without any restriction, then the following concurrency problems can occur.
Types of Concurrency Problems:
There are four major concurrency problems.
Here,
T1 reads the value of A (say = 15).
T2 reads the value of A (= 15).
T1 updates the value of A (say from 15 to 25) in the buffer.
T2 again reads the value of A (but now = 25).
Conclusion:
T2 wonders how the value of data item “A” changed from 15 to 25 even though T2 did not
update it, because T2 believes it is running in isolation. (This is the unrepeatable read problem.)
Conclusion: In the same way, if another transaction deletes “A”, T2 wonders who deleted the data item “A”, because it believes it is running in
isolation.
Here,
T1 reads the value of A (say = 20).
T1 updates the value of A (say to 30) in the buffer.
T2 does a blind write A = 50 (a write without a read) in the buffer.
T2 commits.
When T1 commits, the value that ends up in the database is A = 50.
Conclusion:
T1's update of A is over-written in the database.
So the write W(A) made by T1 gets lost. (This is the lost update problem.)
Lock-Based Protocol:
1. Shared Lock
A shared lock is taken when a transaction only wants to read a data item.
If a transaction (say T1) holds a shared lock on a data item (say A) and some other transaction (say
T2) wants to perform a read operation on the same data item (A), then T2 can also acquire a shared lock
without waiting for T1 to unlock it. This is because Read-Read is not a conflict.
2. Exclusive Lock
With an exclusive lock, if a transaction wants to perform both read and write operations on the same
data item, it must acquire the exclusive lock first.
Under exclusive locking, multiple transactions cannot perform write operations on the same data
simultaneously.
If a transaction (say T1) holds an exclusive lock on a data item (say A) and some other transaction
(say T2) wants to access the same data item (A), then T2 has to wait until T1 releases the exclusive
lock.
Note: Shared and exclusive locks on different data items can be granted any number of times without any
problem.
Compatibility Lock Table:
The following compatibility table is used when multiple transactions want to perform read or write
operations on the same data item.
Suppose T1 and T2 are parallel transactions and both want to perform read and write
operations on the same data item, say “A”. A shared lock is denoted by “S” and an exclusive lock is denoted by
“X”.
Case 01 (Shared then Shared): If T1 has a shared lock on data item A, then T2 is allowed to take a shared
lock on the same data item A without T1 unlocking it.
Explanation: When transaction T1 holds a shared lock to read the data and T2 wants a shared
lock for a read operation on the same data “A”, the shared lock is granted, because Read-Read is not a
conflict.
Case 02 (Shared then Exclusive): If T1 has a shared lock on data item A, then T2 cannot be given
an exclusive lock on the same data item A unless and until T1 unlocks it.
Explanation: When transaction T1 holds a shared lock to read data “A” and T2 wants an
exclusive lock for read and write operations on the same data “A”, the exclusive lock is not granted,
because Read-Write is a conflict.
Case 03 (Exclusive then Shared): If T1 has an exclusive lock on data item A, then T2 cannot be given
either a shared or an exclusive lock on the same data item A unless and until T1 unlocks it.
Explanation: When transaction T1 holds an exclusive lock to read and write data “A” and T2
wants a shared or exclusive lock on the same data “A”, the lock is
not granted, because Read-Write and Write-Write are conflicts.
Case 04 (Exclusive then Exclusive): If T1 has an exclusive lock on data item A, then T2 cannot be given
an exclusive lock on the same data item A unless and until T1 unlocks it.
Explanation: When transaction T1 holds an exclusive lock to read and write data “A” and T2
wants an exclusive lock for read and write operations on the same data “A”, the exclusive lock is
not granted, because Read-Write and Write-Write are conflicts.
Keep in mind: all of the above conditions apply only when multiple transactions are using the same
data item. If T1 has an exclusive lock on data item A, then T2 can take an exclusive lock on a different data item B
concurrently, without waiting for T1 to unlock A.
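The four cases reduce to a small compatibility matrix. A minimal Python sketch (the names COMPATIBLE and can_grant are illustrative, not any real DBMS API):

# Lock compatibility: only Shared-Shared is compatible on the same data item.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(held_mode, requested_mode):
    # held_mode is the lock another transaction already holds on the item; None means unlocked.
    return held_mode is None or COMPATIBLE[(held_mode, requested_mode)]

print(can_grant("S", "S"))   # True  (Case 01: shared, then shared)
print(can_grant("S", "X"))   # False (Case 02: shared, then exclusive -> wait)
print(can_grant("X", "S"))   # False (Case 03: exclusive, then shared -> wait)
print(can_grant("X", "X"))   # False (Case 04: exclusive, then exclusive -> wait)
print(can_grant(None, "X"))  # True  (a different, currently unlocked data item)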
Problems in Shared-Exclusive Locking:
Even when shared-exclusive locking is applied correctly, the following problems may still exist.
Problem-1: The schedule produced through shared-exclusive locking may not be serializable.
Explanation: See the above example, where read/write locking is applied properly but a loop is still
present in the precedence graph of T1 and T2. We know that if a loop exists, the schedule is not conflict serializable.
Problem-2: The schedule produced through shared-exclusive locking may be irrecoverable.
Problem-3: Shared-exclusive locking may lead to starvation.
Explanation:
T2 requests a shared lock on data “A”, which is granted directly. Now suppose T1 requests
an exclusive lock on data “A”; it will not be granted until T2 unlocks A.
Suppose that as T2 is about to unlock A, T3 also acquires a shared lock on the same data A. Two shared
locks can be held on the same data at the same time, so T1 now has to wait until T3 unlocks A.
Suppose that as T3 is about to unlock A, T4 also gets a shared lock on the same data A. Now T1 has to
wait until T4 unlocks A.
So transaction T1 waits from time 2 to time 8 to get the exclusive lock on data “A”. Therefore, this
is a case of starvation.
To remove these problems we use the Two Phase Locking (2PL) Protocol.
Two Phase Locking (2-PL) Protocol:
2-PL is an extension of shared/exclusive locking.
It is used to reduce the problems of shared/exclusive locking.
Any schedule that follows 2PL will always be serializable, which was not guaranteed by plain shared/exclusive
locking.
Phases of 2PL:
There are two basic phases of two phase locking, which are explained below.
1. Growing Phase: In the growing phase, a transaction only acquires locks; no locks are
released by the transaction during this time.
2. Shrinking Phase: In the shrinking phase, a transaction only releases locks; no locks are
acquired by the transaction during this time.
An important concept in 2PL is the lock point:
Lock Point: The point at which the growing phase ends (i.e., when the transaction takes its final
lock) is called the lock point.
Example: The following diagrams show the growing and shrinking phases in 2-PL.
Transaction T1
Growing Phase: From Time 1 to 3.
Shrinking Phase: From Time 5 to 7
Lock Point: At time 3
Transaction T2
Growing Phase: From Time 2 to 6
Shrinking Phase: From Time 8 to 9
Lock Point: At Time 6
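A short Python sketch of the two-phase rule itself: a transaction's lock/unlock sequence is two-phase if no lock is acquired after the first unlock, and the lock point is simply the position of the last lock acquired (the tuples below are an illustrative encoding, not a real lock-manager API):

def follows_2pl(actions):
    # actions: list of ("lock", item) / ("unlock", item) tuples of one transaction, in order.
    shrinking, lock_point = False, None
    for i, (kind, item) in enumerate(actions):
        if kind == "lock":
            if shrinking:
                return False, None       # acquiring a lock after a release violates 2PL
            lock_point = i               # last acquisition seen so far = candidate lock point
        elif kind == "unlock":
            shrinking = True             # the shrinking phase has begun
    return True, lock_point

t1 = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
t2 = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
print(follows_2pl(t1))   # (True, 1): two-phase, lock point at the second lock
print(follows_2pl(t2))   # (False, None): locks again after an unlock, so not 2PL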
2PL Transaction Execution Sequence:
Q) Since 2PL generates a serializable schedule, what will be the equivalent serial sequence of transaction
execution?
Answer: If the lock point of transaction T1 comes earlier than that of transaction T2, then T1 is considered to
execute before T2, as shown in the diagram below.
Important: Keep in mind that when multiple transactions need to acquire shared or exclusive locks,
they must follow the lock compatibility table.
In the following schedule, T1 holds a shared lock on “A” and T2 also requires an
exclusive lock on “A”. This is not possible according to the lock compatibility table, so T2
will be blocked (put in a waiting state) until T1 releases its lock on “A”.
Advantages of 2-PL:
2PL always ensures serializability. This means a schedule produced under 2PL is equivalent to a serial schedule; no loop
exists in its precedence (transaction execution sequence) graph.
Disadvantages of 2-PL:
Basic 2PL guarantees serializability, but the irrecoverability, cascading rollback, deadlock and starvation problems can still
occur. To remove such problems we use different types (categories) of 2PL.
Categories of Two Phase Locking:
As we know, basic 2-PL achieves serializability, but if we also want cascadelessness,
recoverability and deadlock removal, then we have to use the categories of two
phase locking.
Categories of 2-PL:
There are three basic categories of 2-PL.
1. Strict 2-PL:
A schedule is in strict 2PL if
it satisfies basic 2-PL, and
each transaction holds all its exclusive (X) locks until the transaction
commits or aborts.
2. Rigorous 2-PL:
It is similar to strict 2-PL in its advantages and disadvantages, but a little more strict than strict
2-PL.
A schedule is in rigorous 2PL if
it satisfies basic 2-PL, and
each transaction holds all its exclusive (X) locks as well as its shared (S)
locks until the transaction commits or aborts.
In strict 2PL, if T1 acquires a shared lock on some data item (say A), it may release that shared lock before
it commits, so another transaction can then lock A (even exclusively) before T1 commits.
But in rigorous 2PL, T1 holds its shared lock on data “A” until it commits or aborts, so a conflicting (exclusive)
lock request from T2 on the same data “A” cannot be granted until T1 has committed or
aborted.
The following figure explains this.
Advantages of Strict/Rigorous 2PL:
There are some advantages and disadvantages of strict/rigorous 2PL, explained below.
Recoverable: T2 cannot acquire a conflicting lock until T1 has committed or aborted, so if T1 fails it has no
effect on T2, and the schedule is recoverable.
Cascadeless: For the same reason, if T1 fails there is no
effect on T2, so the schedule is cascadeless. Cascadeless means that the rollback of one
transaction does not force the rollback of other transactions.
Note: The deadlock and starvation problems still exist in strict 2-PL.
3. Conservative 2-PL:
In conservative 2PL, a transaction must acquire all the locks it needs before it begins execution, i.e. it
pre-declares its read-set and write-set; because a transaction never waits for a lock once it has started, deadlock cannot occur.
Keep in mind: it is difficult to use in practice because of the need to pre-declare the read-set and the
write-set, which is not possible in many situations.
2. Timestamp Ordering Protocol (TOP):
A timestamp is a numeric value; when a transaction arrives in a schedule, a
timestamp value is assigned to it.
The timestamp ordering protocol is used to order transactions based on their timestamps.
Timestamp values are assigned in ascending order of arrival.
Example: Suppose there are three transactions T1, T2, T3, all executing in parallel on the same
data item (say “A”):
T1 arrives at 8:00, so it is assigned timestamp value 100 and is the oldest.
T2 arrives at 8:10, so it is assigned timestamp value 200 and is younger.
T3 arrives at 8:15, so it is assigned timestamp value 300 and is the youngest.
Keep in mind that the transaction which arrived first should logically execute first. In other words, the
transaction holding the minimum timestamp value is given priority.
Timestamp Types:
The timestamp of any data item is of two types:
1. Read timestamp (RTS)
2. Write timestamp (WTS)
The timestamp of the last (latest) transaction that successfully performed a read operation on a data item
is called the read timestamp of that item. It is denoted by RTS(data item) = timestamp value.
Example:
As in the above diagram, T3 is the latest transaction that performed a read operation on data “A”, so the read
timestamp of data “A” is the timestamp of that latest transaction.
The timestamp of the last (latest) transaction that successfully performed a write operation on a
data item is called the write timestamp.
It is denoted by WTS(data item) = timestamp value.
Example:
As in the above diagram, T2 is the latest transaction that performed a write operation on
data “A”, so the write timestamp of data “A” is
WTS(A) = 200
If there are multiple data items in the schedule (e.g. A, B, C, ...), then each data item holds its own
read and write timestamps, as given below:
RTS(A) = 300
WTS(A) = 200
RTS(B) = 100
WTS(B) = 100
Thomas rules:
The timestamp protocol follows some rules to perform read and write operations. These rules are also known as
Thomas rules.
Case 01:
If the older transaction (T1) performs its read or write operation on a data item before the younger transaction (T2), there is no problem.
Note: Although Read-Write, Write-Write and Write-Read are conflicts, they are never problematic in this
case, because the older transaction has higher priority and always executes before the younger one.
Case 02:
If the younger transaction (T2) wants to perform its read or write operation before the read or write
operation of T1 on the same data, then the following conflicts can arise:
Read-Write conflicts
Write-Write conflicts
Write-Read conflicts
Let us explain all of the above conflicts one by one.
I. Read-Write Conflict: Dirty Read Problem:
Suppose the younger transaction T2 reads the value of “A” first, and the older transaction T1 aborts or fails later on. Then this leads to the dirty
read problem.
II. Write-Read Conflict: A Kind of Dirty Read Problem:
When the younger transaction T2 writes the value of data “A” first, the older transaction T1 reads the same data and
commits, but later the younger T2 aborts or fails, then this is also a kind of dirty read problem.
Conclusion: if the younger transaction reads or writes a data item (e.g. “A”) first and the older transaction
wants to write or read the same data later on, then there is a conflict. But keep in
mind that a Read-Read conflict does not exist in any case.
Example:
Let us explain with an example. Look at the following table.
Solution:
Draw the following table.
In the above table A, B, C are data items, and their read and write timestamp values are initially “0”. The
example table covers time 0 to time 7; let us discuss the steps one by one.
At time 1, transaction T1 wants to perform a read operation on data “A”, so according to Rule
No. 01 the read is allowed and RTS(A) becomes TS(T1).
At time 2, transaction T2 wants to perform a read operation on data “B”, so according to Rule
No. 01 the read is allowed and RTS(B) becomes TS(T2).
At time 4, transaction T3 wants to perform a read operation on data “B”, so according to Rule
No. 01:
WTS(B) > TS(T3), i.e. 0 > 300 // condition false
Go to the else part and set RTS(B) = MAX{RTS(B), TS(T3)}, so
RTS(B) = MAX{200, 300} = 300.
So finally RTS(B) replaces 200 and is updated to 300.
The updated table will appear as follows.
At time 5, transaction T1 wants to perform a read operation on data “C”, so Rule
No. 01 is applied in the same way.
At time 7, transaction T3 wants to perform a write operation on data “A”, so Rule
No. 02 is applied.
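The two rules used in this walk-through can be sketched in Python; the timestamp values 100, 200, 300 follow the earlier example, and the function names read_item/write_item are illustrative only:

# Basic timestamp-ordering checks (a sketch of Rule No. 01 and Rule No. 02 used above).
RTS = {"A": 0, "B": 0, "C": 0}    # read timestamps per data item
WTS = {"A": 0, "B": 0, "C": 0}    # write timestamps per data item

def read_item(ts, item):
    # Rule 01: reject the read if a younger transaction has already written the item.
    if WTS[item] > ts:
        return "rollback"
    RTS[item] = max(RTS[item], ts)
    return "read allowed"

def write_item(ts, item):
    # Rule 02: reject the write if a younger transaction has already read or written the item.
    if RTS[item] > ts or WTS[item] > ts:
        return "rollback"
    WTS[item] = ts
    return "write allowed"

print(read_item(100, "A"))    # T1 (TS = 100) reads A
print(read_item(200, "B"))    # T2 (TS = 200) reads B -> RTS(B) = 200
print(read_item(300, "B"))    # T3 (TS = 300) reads B -> RTS(B) becomes 300
print(write_item(300, "A"))   # T3 writes A -> allowed, since no younger read or write exists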
3. Validation Based Protocol:
The validation based protocol works based upon the following three phases.
Read and Execution Phase: The read phase involves reading and executing the operations of
transaction T1. The values of the data items are read in this phase, and the
protocol writes the results into temporary variables. A temporary variable is a local variable
that holds the data item instead of writing it to the database.
Validation Phase: The validation phase is an important phase of the concurrency protocol. It
involves validating the temporary values against the actual values in the database and checking the
view serializability condition.
Write Phase: The write phase ensures that only data validated in the validation phase is
written to the database. The protocol performs a rollback operation if the
validation phase fails.
Next, we will discuss the various timestamps associated with each phase of the validation protocol.
There are three timestamps that control the serializability of the validation based protocol in a
database management system:
Start(Timestamp): The start timestamp is the timestamp at which the data items begin to be
read and executed in the read phase of the validation protocol.
Validation(Timestamp): The validation timestamp is the timestamp at which T1 completes
the read phase and starts the validation phase.
Finish(Timestamp): The finish timestamp is the timestamp at which T1 completes the writing
process in the write phase.
To manage the concurrency between transactions T1 and T2, the validation test for T1
should check that all of T1's operations follow TS(T1) < TS(T2), where TS is the timestamp,
and that one of the following conditions is satisfied.
Finish(T1) < Start(T2):
In this condition, T1 completes all its execution before T2 starts its
operations.
This maintains serializability.
Finish(T1) < Validate(T2):
The validation phase of T2 should occur after the finish phase of T1. This scenario is useful
for serializability of concurrent transactions.
The transactions access the mutually exclusive database resources at particular
timestamps while the protocol conditions are validated.
The validation based protocol relies on timestamps to achieve serializability. The validation
phase is the deciding phase for whether the transaction is committed or rolled back in the database. It
works on the equation TS(T1) = Validation(T1), where TS is the timestamp and T1 is the
transaction.
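The timestamp test described above can be sketched as a single Python function; the numeric timestamps below are arbitrary and only their relative order matters:

def validation_ok(finish_t1, start_t2, validate_t2):
    # T2 passes validation against the earlier transaction T1 if either condition holds.
    return finish_t1 < start_t2 or finish_t1 < validate_t2

print(validation_ok(finish_t1=10, start_t2=12, validate_t2=20))   # True: T1 finished before T2 started
print(validation_ok(finish_t1=15, start_t2=12, validate_t2=20))   # True: T1 finished before T2 validated
print(validation_ok(finish_t1=25, start_t2=12, validate_t2=20))   # False: T2 must be rolled back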
Example:
Transaction T1          Transaction T2
Read(A)
                        Read(A)
                        A = A - 40
                        Read(B)
                        B = B + 80
Read(B)
<Validate>
Display(B+A)
                        <Validate>
                        Write(A)
                        Write(B)
The transaction table shown in the example is associated with transaction T1 and transaction
T2. It represents a schedule produced using the validation protocol.
The concurrent execution starts with T1 performing a reading operation, Read(A), where
A is a numeric data element in the database.
In the next step, transaction T2 also reads the same data variable A after some time.
Transaction T2 then performs an arithmetic operation by subtracting the constant value 40 from
variable A. It is represented as A = A - 40 in the transaction table.
The next step is a read operation in transaction T2, which reads another numeric variable B
with Read(B). After the read operation completes, transaction T2
immediately performs an arithmetic operation on variable B, using the addition operator
‘+’ to add the constant value 80 to B. The addition operation is represented as
B = B + 80 in the transaction table.
In the next step of the concurrent execution, T1 reads variable B with the operation Read(B).
Now the validation based protocol comes into action for transaction T1: its validation phase checks the
timestamp conditions given above, i.e. that the finish timestamp of the earlier transaction precedes the start
timestamp, or at least the validation timestamp, of the later one.
Similarly, in transaction T2 the validation based protocol validates the timestamps. In the
example shown in the table, both validations produce a valid result based on the timestamp
conditions, and as the concluding operations the writes are
performed by transaction T2 using the Write(A) and Write(B) statements.
Advantages:
Avoids cascading rollbacks: This validation based scheme avoids cascading rollbacks, since
the final write operations to the database are performed only after the transaction passes the
validation phase. If the transaction fails, no update operation is performed in the
database, so no dirty read can happen and the possibility of a cascading rollback is eliminated.
Avoids deadlock: Since a strict timestamp-based technique is used to maintain a
specific order of transactions, deadlock is not possible in this scheme.
Disadvantages:
Starvation: There is a possibility of starvation for long-running transactions, because a
sequence of conflicting short transactions can cause repeated restarts of
the long transaction. To avoid starvation, conflicting transactions
must be temporarily blocked for some time to let the long transaction finish.
Multiple Granularities:
Multiple granularity can be defined as hierarchically breaking up the database into blocks that can be locked.
The multiple granularity protocol enhances concurrency and reduces lock overhead.
It keeps track of what to lock and how to lock.
It makes it easy to decide whether to lock or unlock a data item. This type of
hierarchy can be represented graphically as a tree.
For example: Consider a tree which has four levels of nodes.
The first (highest) level represents the entire database.
The second level represents nodes of type area. The database consists of
exactly these areas.
Each area consists of child nodes known as files. No file can be present in more
than one area.
Finally, each file contains child nodes known as records. A file has exactly those records
that are its child nodes, and no record is present in more than one file.
Hence, the levels of the tree, starting from the top, are as follows:
1. Database
2. Area
3. File
4. Record
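The four-level hierarchy can be pictured with a tiny Python sketch: a nested dictionary represents database -> areas -> files -> records, and locking any node implicitly covers every record in its subtree (all names here are made-up sample data):

tree = {
    "database": {
        "area1": {"file1": ["rec1", "rec2"], "file2": ["rec3"]},
        "area2": {"file3": ["rec4", "rec5"]},
    }
}

def records_covered(node):
    # Walk down to the record level and collect everything a lock on this node would cover.
    if isinstance(node, list):
        return list(node)
    covered = []
    for child in node.values():
        covered.extend(records_covered(child))
    return covered

print(records_covered(tree["database"]["area1"]))   # locking area1 covers rec1, rec2, rec3
print(records_covered(tree["database"]))            # locking the database covers every record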
Recovery and Atomicity:
When a system crashes, it may have several transactions being executed and various files opened
for them to modify data items. Transactions are made of various operations, which are atomic
in nature. But according to the ACID properties of a DBMS, atomicity of a transaction as a whole must be
maintained, that is, either all of its operations are executed or none of them.
When a DBMS recovers from a crash:
It should check the states of all the transactions that were being executed.
A transaction may have been in the middle of some operation; the DBMS must ensure the atomicity
of the transaction in this case.
It should check whether the transaction can be completed now or needs to be rolled back.
No transaction should be allowed to leave the DBMS in an inconsistent state.
There are two types of techniques which can help a DBMS recover while maintaining the
atomicity of a transaction:
Maintaining a log of each transaction, and writing the log records onto some stable storage before
actually modifying the database.
Maintaining shadow paging, where the changes are done in volatile memory and, later, the
actual database is updated.
Log-Based Recovery:
The log is a sequence of records. The log of each transaction is maintained in some stable
storage so that if any failure occurs, the database can be recovered from it.
If any operation is performed on the database, it will be recorded in the log.
The process of storing the log records must be done before the actual changes are applied to
the database.
Let's assume there is a transaction to modify the City of a student. The following log records are written
for this transaction.
When the transaction is initiated, it writes a 'start' log:
<Tn, Start>
When the transaction modifies the City from 'Noida' to 'Bangalore', another log record is
written to the file:
<Tn, City, 'Noida', 'Bangalore'>
When the transaction is finished, it writes another log record to indicate the end of the
transaction:
<Tn, Commit>
There are two approaches to modifying the database:
The deferred modification technique is used when the transaction does not modify the database
until it has committed.
In this method, all the log records are created and stored in stable storage, and the database is
updated only when the transaction commits.
The immediate modification technique is used when database modification occurs while the
transaction is still active.
In this technique, the database is modified immediately after every operation, following the
actual sequence of modifications.
When the system crashes, it consults the log to find which transactions need to be
undone and which need to be redone:
1. If the log contains both the record <Ti, Start> and the record <Ti, Commit>, then
transaction Ti needs to be redone.
2. If the log contains the record <Ti, Start> but contains neither <Ti, Commit> nor <Ti,
Abort>, then transaction Ti needs to be undone.
When more than one transaction is being executed in parallel, their log records are interleaved. At the time
of recovery, it would become hard for the recovery system to backtrack through all the logs and then start
recovering.
To ease this situation, most modern DBMSs use the concept of 'checkpoints'.
Checkpoints:
A checkpoint is a mechanism by which all the previous log records are removed from the
system and stored permanently on the storage disk.
The checkpoint is like a bookmark. During the execution of transactions, such checkpoints
are marked, and as the transactions execute, log records are created for their steps.
When a checkpoint is reached, all the updates recorded so far are written into the database,
and up to that point the log records are removed from the log file. The log file is then
filled with the records of the new transaction steps until the next checkpoint, and so on.
The checkpoint is used to declare a point before which the DBMS was in a consistent state
and all transactions were committed.
The recovery system recovers the database from this failure in the following manner:
The recovery system reads the log file from the end towards the start, i.e. from T4 back to T1.
A transaction is put into the redo list if the recovery system sees a log with <Tn, Start> and
<Tn, Commit>, or with just <Tn, Commit>. All the transactions in the redo list are redone
using their log records.
For example: In the log file, transactions T2 and T3 will have <Tn, Start> and <Tn, Commit>.
Transaction T1 will have only <Tn, Commit> in the log file, because it started before the checkpoint and committed
after the checkpoint was crossed. Hence T1, T2 and T3 are put into the redo list.
A transaction is put into the undo list if the recovery system sees a log with <Tn, Start> but no
commit or abort log record. All the transactions in the undo list are undone and their log records are
removed.
For example: Transaction T4 will have only <Tn, Start>, so T4 is put into the undo list, since this
transaction is not yet complete and failed in the middle.
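The redo/undo classification can be sketched in Python; log records are assumed to be simple (transaction, event) tuples and the sample log mirrors the T2/T3/T4 example above (the sketch ignores checkpoints and assumes each transaction's start record is in the scanned part of the log):

def redo_undo_lists(log):
    started, committed, ended = set(), set(), set()
    for txn, event in log:
        if event == "start":
            started.add(txn)
        elif event == "commit":
            committed.add(txn)
            ended.add(txn)
        elif event == "abort":
            ended.add(txn)
    redo = sorted(started & committed)   # <Ti, Start> and <Ti, Commit> present -> redo
    undo = sorted(started - ended)       # started but neither committed nor aborted -> undo
    return redo, undo

log = [("T2", "start"), ("T3", "start"), ("T2", "commit"),
       ("T3", "commit"), ("T4", "start")]
print(redo_undo_lists(log))   # (['T2', 'T3'], ['T4'])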