Unit IV
Syllabus
Transaction processing
Concurrency control,
ACID property,
Serializability of scheduling,
Locking and timestamp-based schedulers,
Multi-version and
Optimistic Concurrency Control schemes,
Database Recovery.
1. Transactions
A transaction is a collection of operations that form a single logical unit of work. It is a unit of
program execution that accesses and possibly updates various data items. A transaction is initiated by
a user program written in a high-level data manipulation language, where it is delimited by statements
of the form 'begin transaction' and 'end transaction'. The transaction consists of all operations
executed between the begin transaction and end transaction statements.
1.1 Example
Consider the following fund-transfer transaction:
• Begin transaction
• Transfer 100 from Account X to Account Y
• Commit
2. Transaction Processing
Transaction processing systems are systems with large databases and hundreds of concurrent users
executing database transactions. Examples: systems for reservations, banking, and credit-card
processing.
Step 1: Initially the transaction is in the "modify" state, entered when the first statement of the
transaction is executed. At the end of the modify state, there is a transition into one of the following
states.
Step 2: If the transaction completes the modification state, it enters the start-to-commit state, where it
instructs the DBMS to reflect the changes made by it into the database.
Step 3: Once all the changes made by the transaction are propagated to the database, the transaction is
said to be in the "commit" state, and from there the transaction is terminated, the database once again
being in a "consistent" state.
Step 4: The abort state can be entered from the modify state if there are system errors, for example,
division by zero. In case the transaction detects an error while in the modify state, it decides to
terminate itself (suicide) and enters the error state and then the "rollback" state. If the system aborts a
transaction, it may have to initiate a rollback to undo the partial changes made by the transaction.
The transaction outcome can be either successful (if the transaction goes through the commit state),
suicidal (if the transaction goes through the rollback state) or murdered (if the transaction goes
through the abort state).
Atomicity: The atomicity property requires that either all operations of a transaction are reflected
properly in the database, or none are; a transaction is an indivisible unit of work.
Consistency: The consistency property of a transaction implies that if the database was in a consistent
state before the start of a transaction, then on termination of the transaction the database will also be
in a consistent state.
Isolation: The isolation property of a transaction indicates that actions performed by a transaction will
be isolated or hidden from outside the transaction until the transaction terminates. This property gives
the transaction a measure of relative independence.
Durability: The durability property of a transaction ensures that the commit action of a transaction,
on its termination, will be reflected in the database. The permanence of the commit action of a
transaction requires that any failure after the commit operation will not cause loss of the updates
made by the transaction.
Example
Let Ti be a transaction that transfers $50 from account A to account B. This transaction can be defined
as
Ti: read(A);
A := A − 50;
write(A);
read(B);
B := B + 50;
write(B).
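As a concrete sketch, the same transfer can be written against Python's built-in sqlite3 module; the
accounts table, its columns, and the helper name transfer are assumptions for illustration, not part of
the original example.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 1000), ('B', 2000)")

def transfer(conn, src, dst, amount):
    try:
        # read(A); A := A - amount; write(A)
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                     (amount, src))
        # read(B); B := B + amount; write(B)
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                     (amount, dst))
        conn.commit()    # durability: both updates become permanent together
    except Exception:
        conn.rollback()  # atomicity: neither update survives a failure
        raise

transfer(conn, "A", "B", 50)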
Consistency: The consistency requirement here is that the sum of A and B be unchanged by the
execution of the transaction. It can be verified easily that, if the database is consistent before an
execution of the transaction, the database remains consistent after the execution of the transaction.
Atomicity: If the atomicity property is present, all actions of the transaction are reflected in the
database, or none are. Ensuring atomicity is the responsibility of the database system itself;
specifically, it is handled by a component called the transaction-management component.
Durability: Once the execution of the transaction completes successfully, and the user who initiated
the transaction has been notified that the transfer of funds has taken place, it must be the case that no
system failure will result in a loss of data corresponding to this transfer of funds. The durability
property guarantees that, once a transaction completes successfully, all the updates that it carried out
on the database persist, even if there is a system failure after the transaction completes execution.
Durability can be guaranteed by ensuring that either:
1. The updates carried out by the transaction have been written to disk before the transaction
completes, or
2. Information about the updates carried out by the transaction and written to disk is sufficient to
enable the database to reconstruct the updates when the database system is restarted after the failure.
Ensuring durability is the responsibility of a component of the database system called the recovery-
management component.
Isolation: The database is temporarily inconsistent while the transaction to transfer funds from A to B
is executing, with the deducted total written to A and the increased total yet to be written to B. If a
second concurrently running transaction reads A and B at this intermediate point and computes A + B,
it will observe an inconsistent value.
The advantages of concurrent execution are:
o increased processor and disk utilization: one transaction can be using the CPU while another
is reading from or writing to the disk
o reduced average response time for transactions: short transactions need not wait behind long
ones.
If control of concurrent execution is left entirely to the operating system, many possible schedules,
including ones that leave the database in an inconsistent state, are possible. It is the job of the database
system to ensure that any schedule that gets executed will leave the database in a consistent state. The
concurrency-control component of the database system carries out this task.
Transactions access data items using two operations:
read(X): transfers the data item X from the database to a local buffer belonging to the transaction that
executed the read operation.
write(X): transfers the data item X from the local buffer of the transaction that executed the write
back to the database.
SCHEDULES
Schedules represent the chronological order in which instructions of concurrent transactions are
executed in the system. A schedule for a set of transactions must consist of all instructions of those
transactions and must preserve the order in which the instructions appear in each individual
transaction. A transaction that successfully completes its execution will have a commit instruction as
its last statement; a transaction that fails to complete its execution will have an abort instruction as its
last statement.
Schedule 1: Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. A serial
schedule in which T1 is followed by T2 (initially A = 1000, B = 2000).
Schedule 3: Let T1 and T2 be the transactions defined previously. The following schedule is not a
serial schedule, but it is equivalent to Schedule 1. In Schedules 1, 2 and 3, the sum A + B is preserved.
Schedule 4: The following concurrent schedule does not preserve the value of A + B.
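To see the failure concretely, here is a small Python simulation of a Schedule-4-style interleaving; the
variable names and the exact interleaving are illustrative. Both transactions read stale values into their
local buffers, so the sum A + B is not preserved.

db = {"A": 1000, "B": 2000}

a1 = db["A"]         # T1: read(A)
a2 = db["A"]         # T2: read(A)
temp = a2 * 0.10     # T2: temp := A * 0.1
db["A"] = a2 - temp  # T2: write(A) -> 900
b2 = db["B"]         # T2: read(B)
db["A"] = a1 - 50    # T1: write(A) -> 950, T2's update to A is lost
b1 = db["B"]         # T1: read(B)
db["B"] = b1 + 50    # T1: write(B) -> 2050
db["B"] = b2 + temp  # T2: write(B) -> 2100, T1's update to B is lost

print(db["A"] + db["B"])  # 3050.0, not the original 3000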
The database system must ensure two properties of schedules:
I. Serializability
II. Recoverability
Conflict operations
Two operations conflict if they belong to different transactions, they access the same data item, and at
least one of them is a write operation.
Conflict equivalent
If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting
instructions, we say that S and S´ are conflict equivalent.
Conflict Serializability
An execution is conflict-serializable if it is conflict equivalent to a serial schedule.
Example
Schedule 3 can be transformed into Schedule 6, a serial schedule where T2 follows T1, by a series of
swaps of non-conflicting instructions. Therefore Schedule 3 is conflict serializable.
We are unable to swap instructions in the above schedule to obtain either the serial schedule < T3, T4
>, or the serial schedule < T4, T3 >.
View Serializability
Let S and S´ be two schedules with the same set of transactions. S and S´ are view equivalent if the
following three conditions are met:
i. For each data item Q, if transaction Ti reads the initial value of Q in S, then Ti must also read the
initial value of Q in S´.
ii. For each data item Q, if Ti reads a value of Q written by Tj in S, then Ti must also read the value of
Q written by Tj in S´.
iii. For each data item Q, if Ti writes the final value of Q in S, then Ti must also write the final value
of Q in S´.
Conditions 1 and 2 ensure that each transaction reads the same values in both schedules and,
therefore, performs the same computation. Condition 3, coupled with conditions 1 and 2, ensures that
both schedules result in the same final system state.
A schedule S is view serializable if it is view equivalent to a serial schedule. Every conflict-
serializable schedule is also view serializable, but there are view-serializable schedules that are not
conflict serializable. In schedule 9 below, transactions T4 and T6 perform write(Q) operations without
having performed a read(Q) operation; writes of this sort are called blind writes. Schedule 9 is view
serializable but not conflict serializable, since every pair of consecutive instructions conflicts, and
thus no swapping of instructions is possible.
Precedence graph
A precedence graph for a schedule is a directed graph whose vertices are the transactions. The set of
edges consists of all edges Ti → Tj for which one of three conditions holds:
i. Ti executes write(Q) before Tj executes read(Q).
ii. Ti executes read(Q) before Tj executes write(Q).
iii. Ti executes write(Q) before Tj executes write(Q).
A schedule is conflict serializable if and only if its precedence graph is acyclic.
Question 1: Consider the following precedence graph. Is the corresponding schedule conflict
serializable? Explain your answer.
Solution 1: There is a conflict serializable schedule corresponding to the precedence graph below,
since the graph is acyclic.
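The acyclicity test is mechanical, as the following Python sketch shows. The input format (a schedule
as a list of (transaction, action, item) triples) is an assumption; the edge rules are the three conditions
listed above.

def precedence_graph(schedule):
    edges = set()
    for i, (ti, act_i, item_i) in enumerate(schedule):
        for tj, act_j, item_j in schedule[i + 1:]:
            # two operations conflict if different transactions touch the
            # same item and at least one of the operations is a write
            if item_i == item_j and ti != tj and "write" in (act_i, act_j):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
    visiting, done = set(), set()
    def dfs(u):
        visiting.add(u)
        for v in graph.get(u, ()):
            if v in visiting or (v not in done and dfs(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False
    return any(dfs(u) for u in list(graph) if u not in done)

s = [("T3", "read", "Q"), ("T4", "write", "Q"), ("T3", "write", "Q")]
print(has_cycle(precedence_graph(s)))  # True: not conflict serializable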
Question 2
Define and explain conflict serializability. Show that the following schedule S1 is conflict
serializable whereas S2 is not.
Solution 2:
Solution 3:
• The precedence graph is cyclic.
• So the schedule is not conflict serializable.
CONCURRENCY CONTROL PROTOCOLS
The following classes of protocols are used to ensure serializability:
1. Lock-Based Protocols
2. Timestamp-Based Protocols
3. Validation-Based Protocols/Optimistic concurrency control
4. Multi-version Schemes
1. Lock-Based Protocols
A lock is a mechanism to control concurrent access to a data item.
Data items can be locked in two modes:
I. Exclusive (X) mode: Data item can be both read as well as written. X-lock is requested using
lock-X instruction.
II. Shared (S) mode: Data item can only be read. S-lock is requested using lock-S instruction.
Lock requests are made to concurrency-control manager. Transaction can proceed only after request is
granted.
Lock-compatibility matrix (whether a requested mode, column, is compatible with a held mode, row):
        S       X
S       true    false
X       false   false
A transaction may be granted a lock on an item if the requested lock is compatible with locks already
held on the item by other transactions. Any number of transactions can hold shared locks on an item.
If a lock cannot be granted, the requesting transaction is made to wait till all incompatible locks held
by other transactions have been released. The lock is then granted.
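The grant decision itself is just a lookup in the compatibility matrix, as in this minimal Python sketch
(the function name and the list-of-pairs representation of granted locks are assumptions):

COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(requested_mode, held):
    # held: list of (transaction_id, mode) already granted on the item
    return all(COMPATIBLE[(mode, requested_mode)] for _, mode in held)

print(can_grant("S", [("T1", "S")]))  # True: any number of shared locks
print(can_grant("X", [("T1", "S")]))  # False: the requester must wait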
A locking protocol is a set of rules followed by all transactions while requesting and releasing locks.
The potential for deadlock exists in most locking protocols; deadlocks are a necessary evil. Starvation
is also possible if the concurrency-control manager is badly designed. For example, a transaction may
be waiting for an X-lock on an item while a sequence of other transactions request and are granted an
S-lock on the same item.
The Two-Phase Locking Protocol
This is a protocol which ensures serializable schedules. It requires that each transaction issue lock and
unlock requests in two phases:
Phase 1: Growing Phase: A transaction may obtain locks, but may not release any lock
Phase 2: Shrinking Phase: A transaction may release locks, but may not obtain any new locks
Initially, a transaction is in the growing phase. The transaction acquires locks as needed. Once the
transaction releases a lock, it enters the shrinking phase, and it can issue no more lock requests. The
point in the schedule where the transaction has obtained its final lock is called the lock point of the
transaction. Transactions can be ordered according to their lock points—this ordering is, in fact, a
serializability ordering for the transactions. Two-phase locking does not ensure freedom from
deadlocks. Cascading roll-back is possible under two-phase locking.
To avoid this, follow a modified protocol called strict two-phase locking. Here a transaction must
hold all its exclusive locks till it commits/aborts. Rigorous two-phase locking is even stricter: here
all locks are held till commit/abort. Most database systems implement either strict or rigorous two-
phase locking.
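The two-phase rule itself is easy to enforce per transaction, as in this sketch (the class is a
hypothetical illustration, not how real lock managers are structured):

class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False  # False while still in the growing phase

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(self.name + ": lock request after an unlock "
                               "violates two-phase locking")
        self.locks.add(item)    # growing phase: acquire freely

    def unlock(self, item):
        self.shrinking = True   # the first release ends the growing phase
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.lock("A"); t.lock("B"); t.unlock("A")
# t.lock("C") would now raise: no new locks in the shrinking phase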
Lock Conversions
A mechanism for upgrading a shared lock to an exclusive lock and downgrading an exclusive lock to
a shared lock. We denote conversion from shared to exclusive modes by upgrade, and from
exclusive to shared by downgrade.
Lock Table
In the usual depiction of a lock table, black rectangles indicate granted locks and white ones indicate
waiting requests. The lock table also records the type of lock granted or requested. A new request is
added to the end of the linked list of requests for the data item, and granted if it is compatible with all
earlier locks. Unlock requests result in the request being deleted, and later requests are checked to see
if they can now be granted. If a transaction aborts, all waiting or granted requests of the transaction
are deleted. To implement this efficiently, the lock manager may keep a list of locks held by each
transaction.
2. Timestamp-Based Protocols
Each transaction is issued a timestamp when it enters the system. The timestamp is a unique value
assigned to every transaction, and it indicates the order in which transactions entered the system. If an
old transaction Ti has timestamp TS(Ti), a new transaction Tj is assigned timestamp TS(Tj) such that
TS(Ti) < TS(Tj).
The protocol manages concurrent execution such that the time-stamps determine the serializability
order.
There are two simple methods for implementing this scheme:
i. Use the value of the system clock as the timestamp; that is, a transaction’s timestamp is
equal to the value of the clock when the transaction enters the system.
ii. Use a logical counter that is incremented after a new timestamp has been assigned; that
is, a transaction’s timestamp is equal to the value of the counter when the transaction enters
the system.
The timestamp-ordering protocol ensures that any conflicting read and write operations are executed
in timestamp order. The protocol maintains, for each data item Q, a W-timestamp(Q) (the largest
timestamp of any transaction that executed write(Q) successfully) and an R-timestamp(Q) (the largest
timestamp of any transaction that executed read(Q) successfully). It operates as follows:
i. Suppose transaction Ti issues read(Q). If TS(Ti) < W-timestamp(Q), then Ti would read a value that
was already overwritten, so the read is rejected and Ti is rolled back; otherwise the read is executed
and R-timestamp(Q) is set to the maximum of R-timestamp(Q) and TS(Ti).
ii. Suppose transaction Ti issues write(Q). If TS(Ti) < R-timestamp(Q) or TS(Ti) < W-timestamp(Q),
then the write arrives too late, so it is rejected and Ti is rolled back; otherwise the write is executed
and W-timestamp(Q) is set to TS(Ti).
Disadvantages
• Each value stored in the database requires two additional timestamp fields: one for the last time the
item was read and one for the last time it was updated.
• Timestamping thus increases the memory needs and the database processing overhead.
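A minimal sketch of the read and write checks described above, with per-item R- and W-timestamps
kept in dictionaries; signalling rollback by raising an exception is an implementation assumption:

class Rollback(Exception):
    pass

rts, wts = {}, {}  # R-timestamp and W-timestamp for each data item

def read(ts, item):
    if ts < wts.get(item, 0):
        raise Rollback("Ti would read a value that was already overwritten")
    rts[item] = max(rts.get(item, 0), ts)

def write(ts, item):
    if ts < rts.get(item, 0) or ts < wts.get(item, 0):
        raise Rollback("Ti's write arrives too late")
    wts[item] = ts

read(ts=2, item="Q")       # ok: R-timestamp(Q) becomes 2
try:
    write(ts=1, item="Q")  # TS(Ti) = 1 < R-timestamp(Q) = 2
except Rollback as e:
    print("rolled back:", e)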
3. Validation-Based Protocols (Optimistic Concurrency Control)
In this scheme each transaction Ti executes in three phases: a read phase, during which it reads values
and performs all writes on temporary local variables; a validation phase, during which the system
tests whether applying its updates would violate serializability; and a write phase, during which its
updates are applied to the database if validation succeeds. To perform the validation test, we need to
know when the various phases of transaction Ti took place:
1. Start(Ti): the time when Ti started its execution.
2. Validation(Ti): the time when Ti finished its read phase and started its validation phase.
3. Finish(Ti): the time when Ti finished its write phase.
The validation test for Ti requires that, for every transaction Tj with TS(Tj) < TS(Ti), one of the
following holds: either Finish(Tj) < Start(Ti), or Start(Ti) < Finish(Tj) < Validation(Ti) and the set of
data items written by Tj does not intersect the set of data items read by Ti. If validation fails, Ti is
rolled back.
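A sketch of that test in Python, assuming each transaction is summarized by a dict of its phase times
plus its read and write sets (the dict layout is an assumption for illustration):

def validate(ti, older):
    # older: transactions Tj with TS(Tj) < TS(Ti)
    for tj in older:
        if tj["finish"] < ti["start"]:
            continue  # fully serial: Tj finished before Ti started
        if tj["finish"] < ti["validation"] and not (tj["write_set"] & ti["read_set"]):
            continue  # overlapping, but Tj wrote nothing that Ti read
        return False  # validation fails: Ti must be rolled back
    return True

t1 = {"start": 1, "validation": 5, "finish": 6, "write_set": {"B"}}
t2 = {"start": 2, "validation": 7, "finish": 8,
      "read_set": {"A"}, "write_set": {"A"}}
print(validate(t2, [t1]))  # True: T1 wrote only B, which T2 never read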
4. Multiversion Schemes
In Multiversion concurrency control schemes, each write(Q) operation creates a new version of Q.
When a transaction issues a read (Q) operation, the concurrency control manager selects one of the
versions of Q to be read. The concurrency-control scheme must ensure that the version to be read is
selected in a manner that ensures serializability.
Each data item Q has a sequence of versions <Q1, Q2,...., Qm>. Each version Qk contains three data
fields:
• Content -- the value of version Qk.
• W-timestamp(Qk) -- timestamp of the transaction that created (wrote) version Qk
• R-timestamp(Qk) -- largest timestamp of a transaction that successfully read version Qk
When a transaction Ti creates a new version Qk of Q by issuing a write(Q) operation, the
W-timestamp and R-timestamp of the new version are initialized to TS(Ti).
Suppose that transaction Ti issues a read(Q) or write(Q) operation. Let Qk denote the version of Q
whose write timestamp is the largest write timestamp less than or equal to TS(Ti).
• If transaction Ti issues a read(Q), then the value returned is the content of version Qk, and
R-timestamp(Qk) is updated to TS(Ti) if TS(Ti) > R-timestamp(Qk).
• If transaction Ti issues a write(Q):
o if TS(Ti) < R-timestamp(Qk), then transaction Ti is rolled back;
o if TS(Ti) = W-timestamp(Qk), the contents of Qk are overwritten;
o otherwise, a new version of Q is created.
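The rules above translate into a short sketch; keeping versions in a list sorted by W-timestamp and
assuming an initial version with W-timestamp 0 are illustrative choices:

import bisect

versions = [[0, 0, 100]]  # [W-timestamp, R-timestamp, content] of item Q

def pick_version(ts):
    # the version whose W-timestamp is the largest one <= TS(Ti)
    i = bisect.bisect_right([v[0] for v in versions], ts) - 1
    return versions[i]

def read(ts):
    qk = pick_version(ts)
    qk[1] = max(qk[1], ts)  # advance R-timestamp(Qk); reads never fail
    return qk[2]

def write(ts, value):
    qk = pick_version(ts)
    if ts < qk[1]:
        raise RuntimeError("rollback: a younger transaction already read Qk")
    if ts == qk[0]:
        qk[2] = value  # Ti overwrites its own version
    else:
        bisect.insort(versions, [ts, ts, value])  # create a new version

print(read(ts=5))       # 100; R-timestamp of that version becomes 5
write(ts=7, value=150)  # ok: TS = 7 >= R-timestamp(Qk) = 5
# write(ts=3, value=99) would roll back: a reader with TS 5 came first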
RECOVERY TECHNIQUES
1. Log Based recovery technique
o Deferred update
o Immediate update
2. Shadow Paging
Log-Based Recovery
The most widely used structure for recording database modifications is the log. The log is a sequence
of log records, recording all the update activities in the database. There are several types of log
records. An update log record describes a single database write. It has these fields:
• Transaction identifier is the unique identifier of the transaction that performed the write operation.
• Data-item identifier is the unique identifier of the data item written. Typically, it is the location on
disk of the data item.
• Old value is the value of the data item prior to the write.
• New value is the value that the data item will have after the write.
Other special log records exist to record significant events during transaction processing, such as the
start of a transaction and the commit or abort of a transaction.
Whenever a transaction performs a write, it is essential that the log record for that write be created
before the database is modified. Once a log record exists, we can output the modification to the
database if that is desirable. Also, we have the ability to undo a modification that has already been
output to the database. We undo it by using the old-value field in log records. For log records to be
useful for recovery from system and disk failures, the log must reside in stable storage.
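A sketch of such an update record and of this write-ahead discipline, using a Python dataclass; the
in-memory list standing in for stable storage is an assumption:

from dataclasses import dataclass

@dataclass
class UpdateLogRecord:
    txn_id: str     # transaction identifier
    item_id: str    # data-item identifier
    old_value: int  # value prior to the write (for undo)
    new_value: int  # value after the write (for redo)

log = []  # stands in for the log on stable storage

def write_item(db, txn_id, item, new_value):
    # the log record is created before the database is modified
    log.append(UpdateLogRecord(txn_id, item, db[item], new_value))
    db[item] = new_value

db = {"A": 1000}
write_item(db, "T0", "A", 950)
print(log[0])  # UpdateLogRecord(txn_id='T0', item_id='A', old_value=1000, new_value=950)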
Deferred Database Modification
The deferred database modification scheme records all modifications to the log, but defers all the
writes to the database until after the transaction partially commits. When a transaction partially
commits, the information on the log associated with the transaction is used in executing the deferred
writes. If the system crashes before the transaction completes its execution, or if the transaction
aborts, then the information on the log is simply ignored.
Assume that transactions execute serially.
• A transaction starts by writing a <Ti start> record to the log.
• A write(X) operation results in a log record <Ti, X, V> being written, where V is the new value for
X. (The old value is not needed for this scheme.) The write is not performed on X at this time, but is
deferred.
• When Ti partially commits, <Ti commit> is written to the log.
• Finally, the log records are read and used to actually execute the previously deferred writes.
Suppose that these transactions are executed serially, in the order T0 followed by T1, and that the
values of accounts A, B, and C before the execution took place were $1000, $2000, and $700,
respectively. The portion of the log containing the relevant information on these two transactions
appears in Figure 5.1. There are various orders in which the actual outputs can take place to both the
database system and the log as a result of the execution of T0 and T1. One such order appears in
Figure 5.2.
<T0 start>
<T0, A, 950>
<T0, B, 2050>
<T0 commit>
<T1 start>
<T1, C, 600>
<T1 commit>
Figure 5.1 Portion of the database log corresponding to T0 and T1.
Figure 5.2 State of the log and database corresponding to T0 and T1.
Figure 5.3 The same log as that in Figure 5.1, shown at three different times.
Case a:
In Figure 5.3 a. When the system comes back up, no redo actions need to be taken, since no commit
record appears in the log. The values of accounts A and B remain $1000 and $2000, respectively. The
log records of the incomplete transaction T0 can be deleted from the log.
Case b:
In Figure 5.3 b. When the system comes back up, the operation redo(T0) is performed, since the
record < T0 commit> appears in the log on the disk. After this operation is executed, the values of
accounts A and B are $950 and $2050, respectively. The value of account C remains $700. As before,
the log records of the incomplete transaction T1 can be deleted from the log.
Case c:
In Figure 5.3 c. When the system comes back up, two commit records are in the log: one for T0 and
one for T1. Therefore, the system must perform operations redo(T0) and redo(T1), in the order in
which their commit records appear in the log. After the system executes these operations, the values
of accounts A, B, and C are $950, $2050, and $600, respectively.
Immediate Database Modification
The immediate database modification scheme allows database modifications to be output to the
database while the transaction is still in the active state. Because updates may need to be undone,
update log records must contain both the old value and the new value of the data item. Returning to
our banking example, the corresponding log is:
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
<T0 commit>
<T1 start>
<T1, C, 700, 600>
<T1 commit>
Figure 5.4 Portion of the system log corresponding to T0 and T1.
Log                       Database
<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                          A = 950
                          B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                          C = 600
<T1 commit>
Figure 5.5 State of system log and database corresponding to T0 and T1.
Using the log, the system can handle any failure that does not result in the loss of information in
nonvolatile storage. The recovery scheme uses two recovery procedures:
• undo(Ti) restores the value of all data items updated by transaction Ti to the old values.
• redo(Ti) sets the value of all data items updated by transaction Ti to the new values.
After a failure has occurred, the recovery scheme consults the log to determine which transactions
need to be redone, and which need to be undone:
i. Transaction Ti needs to be undone if the log contains the record < Ti start>, but does not
contain the record < Ti commit>.
ii. Transaction Ti needs to be redone if the log contains both the record < Ti start> and the
record < Ti commit>.
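That decision rule is a single scan of the log, sketched here with log records encoded as simple tuples
(an illustrative encoding):

def recovery_sets(log):
    started, committed = set(), set()
    for txn, action in log:
        if action == "start":
            started.add(txn)
        elif action == "commit":
            committed.add(txn)
    redo = committed            # both <Ti start> and <Ti commit> present
    undo = started - committed  # <Ti start> without <Ti commit>
    return redo, undo

log = [("T0", "start"), ("T0", "commit"), ("T1", "start")]
print(recovery_sets(log))  # ({'T0'}, {'T1'}): redo T0, undo T1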
As an illustration, return to our banking example, with transactions T0 and T1 executed one after the
other in the order T0 followed by T1. Suppose that the system crashes before the completion of the
transactions. We shall consider three cases. The state of the logs for each of these cases appears in
Figure 5.6.
Figure 5.6 a, when the system comes back up, it finds the record < T0 start> in the log, but no
corresponding < T0 commit> record. Thus, transaction T0 must be undone, so an undo(T0) is
performed. As a result, the values in accounts A and B (on the disk) are restored to $1000 and $2000,
respectively.
Figure 5.6 b, when the system comes back up, two recovery actions need to be taken. The operation
undo(T1) must be performed, since the record < T1 start> appears in the log, but there is no record <
T1 commit>. The operation redo(T0)must be performed, since the log contains both the record < T0
start> and the record < T0 commit>. At the end of the entire recovery procedure, the values of
accounts A, B, and C are $950, $2050, and $700, respectively. Note that the undo(T1) operation is
performed before the redo(T0). In this example, the same outcome would result if the order were
reversed.
Figure 5.6 c, when the system comes back up, both T0 and T1 need to be redone, since the records <
T0 start> and < T0 commit> appear in the log, as do the records < T1 start> and < T1 commit>. After
the system performs the recovery procedures redo(T0) and redo(T1), the values in accounts A, B, and
C are $950, $2050, and $600, respectively.
Checkpoints
When a system failure occurs, we must consult the log to determine those transactions that need to be
redone and those that need to be undone. In principle, we need to search the entire log to determine
this information. There are two major difficulties with this approach:
1. The search process is time consuming.
2. Most of the transactions that, according to our algorithm, need to be redone have already
written their updates into the database. Although redoing them will cause no harm, it will
nevertheless cause recovery to take longer.
To reduce these types of overhead, we introduce checkpoints. During execution, the system maintains
the log, using one of the two techniques, i.e. deferred or immediate database modification. In addition,
the system periodically performs checkpoints, which require the following sequence of actions to
take place:
1. Output onto stable storage all log records currently residing in main memory.
2. Output to the disk all modified buffer blocks.
3. Output onto stable storage a log record <checkpoint>.
During recovery we need to consider only the most recent transaction Ti that started before the
checkpoint, and transactions that started after Ti. The recovery procedure is:
1. Scan backwards from end of log to find the most recent <checkpoint> record.
2. Continue scanning backwards till a record <Ti start> is found.
3. Once the system has identified transaction Ti the redo and undo operations need to be
applied to only transaction Ti and all transactions Tj that started executing after
transaction Ti.
4. For all transactions (starting from Ti or later) with no <Ti commit>, execute
undo(Ti).
5. Scanning forward in the log, for all transactions starting from Ti or later with a <Ti
commit>, execute redo(Ti).
As an illustration, consider the set of transactions { T0, T1, . . ., T100 } executed in the order of the
subscripts. Suppose that the most recent checkpoint took place during the execution of transaction T67.
Thus, only transactions T67, T68, . . ., T100 need to be considered during the recovery scheme. Each of
them needs to be redone if it has committed; otherwise, it needs to be undone.
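A sketch of that backwards scan, with <checkpoint> encoded as a one-element tuple (the encoding
and the helper name are assumptions):

def relevant_transactions(log):
    # 1. scan backwards for the most recent <checkpoint> record
    cp = max(i for i, rec in enumerate(log) if rec == ("checkpoint",))
    # 2. keep scanning backwards to the nearest <Ti start> before it
    ti = next(rec[0] for rec in reversed(log[:cp])
              if len(rec) == 2 and rec[1] == "start")
    # 3. keep Ti and every transaction that started after Ti did
    seen, keep = False, set()
    for rec in log:
        if len(rec) == 2 and rec[1] == "start":
            seen = seen or rec[0] == ti
            if seen:
                keep.add(rec[0])
    return keep

log = [("T66", "start"), ("T66", "commit"), ("T67", "start"),
       ("checkpoint",), ("T68", "start"), ("T67", "commit")]
print(relevant_transactions(log))  # {'T67', 'T68'}: T66 can be ignored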
Shadow Paging
The key idea behind the shadow-paging technique is to maintain two page tables during the life of a
transaction: the current page table and the shadow page table. When the transaction starts, both
page tables are identical. The shadow page table is never changed over the duration of the transaction.
The current page table may be changed when a transaction performs a write operation. All input and
output operations use the current page table to locate database pages on disk.
To commit a transaction, the system performs the following steps:
1. Ensure that all buffer pages in main memory are output to disk.
2. Output the current page table to disk.
3. Output the disk address of the current page table to the fixed location in stable storage
containing the address of the shadow page table. This action overwrites the address of the old
shadow page table. Therefore, the current page table has become the shadow page table, and
the transaction is committed.
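A toy sketch of these ideas, with page tables as Python dicts mapping page numbers to disk addresses;
the class and its methods are illustrative, not a real implementation:

class ShadowPagedDB:
    def __init__(self, pages):
        self.disk = dict(pages)              # disk address -> page contents
        self.next_addr = max(pages) + 1
        self.shadow = {p: p for p in pages}  # page table: page no -> address
        self.current = dict(self.shadow)

    def write(self, page_no, data):
        addr = self.next_addr                # copy-on-write to a fresh block
        self.next_addr += 1
        self.disk[addr] = data
        self.current[page_no] = addr         # only the current table changes

    def commit(self):
        self.shadow = dict(self.current)     # the single atomic pointer swap

    def abort(self):
        self.current = dict(self.shadow)     # old pages are still intact

db = ShadowPagedDB({0: "A=1000", 1: "B=2000"})
db.write(0, "A=950")
db.abort()                    # no undo work: the shadow table never changed
print(db.disk[db.shadow[0]])  # A=1000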
Shadow paging offers advantages over log-based techniques:
• No recovery is needed after a crash; new transactions can start right away, using the shadow page
table.
• Pages not pointed to from the current/shadow page table should be freed (garbage collected).
It also has drawbacks:
• Commit overhead. The commit of a single transaction using shadow paging requires multiple
blocks to be output—the actual data blocks, the current page table, and the disk address of the current
page table. Log-based schemes need to output only the log records, which, for typical small
transactions, fit within one block.
The overhead of writing an entire page table can be reduced by implementing the page table as a tree
structure, with page table entries at the leaves.
• Garbage collection. Each time that a transaction commits, the database pages containing the old
versions of data changed by the transaction become inaccessible: they are pointed to by neither the
new shadow page table nor the free list. Such pages are considered garbage, since they are not part of
free space and do not contain usable information. Garbage may also be created as a side effect of
crashes. Periodically, it is necessary to find all the garbage pages and add them to the list of free
pages. This process, called garbage collection, imposes additional overhead and complexity on the
system. There are several standard algorithms for garbage collection.