ADBS ch-3
(CoSc2042)
Introduction
Introduction to Transaction Processing
• Transaction
• Logical unit of database processing that includes one or more access operations (read -
retrieval, write - insert or update, delete).
• A transaction (set of operations) may be specified stand-alone in a high-level language
such as SQL and submitted interactively, or may be embedded within an application program.
• Transaction boundaries:
• Any single transaction in an application program is bounded with Begin and End
statements.
• An application program may contain several transactions separated by the Begin and End
transaction boundaries.
Simple Model of Database
• A database is a collection of named data items
• Granularity of data: the size of a data item, which may be a field, a record, or a
whole disk block
• Basic operations that a transaction can perform are read and write
• read_item(X): Reads a database item named X into a program variable.
To simplify our notation, we assume that the program variable is also
named X.
• write_item(X): Writes the value of program variable X into the database
item named X.
• In general, a data item (what is read or written) will be the field of some
record in the database, although it may be a larger unit such as a record or
even a whole block.
Properties and States of Transactions
Desirable Properties of Transactions
ACID Properties:
• Atomicity: A transaction is an atomic unit of processing; it is either performed entirely or
not performed at all.
• All or nothing.
• Monitored by the transaction management component of the DBMS
• It does so by maintaining a log until the commit statement is reached.
• After the commit, the transaction's log entries are no longer needed to undo it.
• Consistency preservation: A correct execution of the transaction must take the database
from one consistent state to another.
• No violation of integrity constraint rule
• The rule states that “if the database was consistent before a particular
transaction, then it should also be consistent after the execution of the
transaction”.
• The user or programmer is responsible for this property.
• This is done by defining integrity constraints.
• Isolation: the changes made by concurrent transactions are invisible to one another.
• Rule/principle: “concurrent execution of two or more transactions should not
cause any inconsistency”.
• It should be as if each transaction executed independently of the others.
• The concurrency control subsystem of the DBMS enforces the isolation
property.
• Durability or permanency: Once a transaction changes the database and the changes
are committed, these changes must never be lost because of subsequent failure.
• The recovery subsystem of the DBMS has the responsibility of Durability property.
• How is it ensured?
• RAID (Redundant Array of Independent Disks) keeps multiple copies of
information in various places, so that there is an alternative
location from which to retrieve information in case of a crash
Transaction States
• A transaction is an atomic unit of work that is either completed in its entirety or not
done at all.
• For recovery purposes, the system needs to keep track of when the transaction starts,
terminates, and commits or aborts.
• Transaction states:
• Active state: indicates the beginning of a transaction execution
• Partially committed state: the read/write operations have ended, but the modifications
are not yet guaranteed to be permanent in the database
• Committed state: all the changes made by the transaction have been recorded
persistently
• Failed state: the transaction is aborted during its active state, or one of the
checks fails
• Terminated State: corresponds to the transaction leaving the system
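The states above can be read as a small state machine. The following is a minimal Python sketch, assuming the usual textbook transitions (active to partially committed or failed, partially committed to committed or failed, and committed or failed to terminated); the names `TRANSITIONS` and `is_valid_path` are hypothetical.

```python
# Allowed transitions between transaction states.
# Assumption: this transition set follows the usual textbook state diagram.
TRANSITIONS = {
    "active": {"partially_committed", "failed"},
    "partially_committed": {"committed", "failed"},
    "committed": {"terminated"},
    "failed": {"terminated"},
    "terminated": set(),
}


def is_valid_path(states):
    """Return True if every consecutive pair of states is an allowed transition."""
    return all(b in TRANSITIONS[a] for a, b in zip(states, states[1:]))
```

For example, a path that jumps from active straight to committed is rejected, because a transaction must pass through the partially committed state first.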
What are the causes of transaction failure?
1. Computer failure (system crash):
• A hardware or software error occurs in the computer system during transaction
execution.
• If the hardware crashes, the contents of the computer’s internal memory may be lost.
2. Transaction or system error:
• Some operation in the transaction may cause it to fail, such as integer overflow or
division by zero.
• Transaction failure may also occur because of erroneous parameter values or because of
a logical programming error.
• In addition, the user may interrupt the transaction during its execution.
3. Exception conditions detected by the transaction:
• Certain conditions force cancellation of the transaction.
• For example, data for the transaction may not be found, or an insufficient account balance
in a banking database may cause a transaction, such as a fund withdrawal from that account,
to be canceled.
4. Concurrency control enforcement:
• The concurrency control method may decide to abort the
transaction, to be restarted later, because it violates serializability or
because several transactions are in a state of deadlock.
5. Disk failure:
• Some disk blocks may lose their data because of a read or write
malfunction or because of a disk read/write head crash. This may
happen during a read or a write operation of the transaction.
6. Physical problems and catastrophes:
• This refers to an endless list of problems that includes power or air-
conditioning failure, fire, theft, overwriting disks or tapes by mistake.
The recovery manager keeps track of the following operations
• Log file: a file that contains a sequence of records (a history) used by the recovery manager
in its recovery techniques.
• Types of log record:
• [start_transaction,T]: This marks the beginning of transaction execution.
• [write_item,T,X,old_value,new_value]: Record that Transaction T has changed the value
of database item X from old value to a new value.
• [read_item,T,X]: record that transaction T has read the value of database item X.
• [end_transaction,T]: This specifies that the transaction's read and write operations have ended
and marks the end of transaction execution.
• [Commit_transaction,T]: This signals a successful end of the transaction so that any
changes (updates) executed by the transaction can be safely committed to the database and
will not be undone.
• [abort,T]: This signals that the transaction has ended unsuccessfully, so that any changes or
effects that the transaction may have applied to the database must be undone.
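To illustrate how the old_value field of a write_item record supports undoing an aborted transaction, here is a minimal Python sketch. The tuple layout mirrors the log record types above; the function name `undo_transaction` and the sample values are hypothetical, and a real recovery manager does considerably more.

```python
def undo_transaction(db, log, t):
    """Restore the old value of every item written by transaction t, newest first."""
    for record in reversed(log):
        if record[0] == "write_item" and record[1] == t:
            _, _, item, old_value, _new_value = record
            db[item] = old_value
    return db


# Hypothetical example: T1 changed X and Y, T2 changed Z, then T1 aborted.
db = {"X": 90, "Y": 60, "Z": 7}
log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 90),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Z", 5, 7),
    ("write_item", "T1", "Y", 50, 60),
    ("abort", "T1"),
]
undo_transaction(db, log, "T1")
```

Only the writes of T1 are reversed; the change that T2 made to Z is left in place.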
Transaction Processing - Concurrency
• Concurrency: allowing many concurrent users to access the database.
• Interleaved processing:
• Concurrent execution of processes is interleaved on a single CPU using, for example,
a round-robin algorithm
Advantages:
• Keeps the CPU busy when a process requires I/O by switching to another process
rather than remaining idle during I/O time; this increases system throughput
(the average number of transactions completed within a given time)
• Prevents a long process from delaying other processes (minimizes unpredictable delay in the
response time).
• Summary:
• Waiting time decreases
• Response time decreases
• Resource utilization increases
• Efficiency increases
• Parallel processing: processes are executed concurrently on multiple CPUs.
Problems of Concurrent Transaction
• The main objective in developing a database system is to allow many concurrent users to
access the database.
• Concurrent access to a database causes problems during write operations rather than read operations.
• Simultaneous execution of transactions over a shared database can create several data
integrity and consistency problems.
• It is necessary to control concurrent access in order to keep the database consistent (concurrency
control).
• The three main concurrency problems are:
• Lost Update Problem
• Temporary Update (Dirty Read) Problem
• Incorrect Summary Problem
Lost Update Problem (write-write conflict)
• This occurs when two transactions that access the same database items have
their operations interleaved in a way that makes the value of some database item
incorrect.
• It arises when data being updated by one transaction is overwritten by
the update operation of another transaction.
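A minimal Python sketch of a lost update, assuming two hypothetical transactions: T1 subtracts 5 from X and T2 adds 3 to X, starting from X = 80. Each transaction keeps its own local copy of X, as in the read_item/write_item model; all function names and values are illustrative.

```python
def run(schedule, db):
    """Execute (transaction, action) steps; each transaction has private local variables."""
    local = {}
    for txn, action in schedule:
        action(db, local.setdefault(txn, {}))
    return db


def read_x(db, v):   # read_item(X): copy the database item into a local variable
    v["X"] = db["X"]


def write_x(db, v):  # write_item(X): copy the local variable back to the database
    db["X"] = v["X"]


def sub_5(db, v):    # T1's computation on its local copy
    v["X"] -= 5


def add_3(db, v):    # T2's computation on its local copy
    v["X"] += 3


interleaved = [("T1", read_x), ("T1", sub_5), ("T2", read_x),
               ("T2", add_3), ("T1", write_x), ("T2", write_x)]
serial = [("T1", read_x), ("T1", sub_5), ("T1", write_x),
          ("T2", read_x), ("T2", add_3), ("T2", write_x)]

lost = run(interleaved, {"X": 80})["X"]   # T2 overwrites T1's write
correct = run(serial, {"X": 80})["X"]     # both updates are applied
```

In the interleaved schedule T2 reads X before T1 writes it, so T2's later write discards T1's subtraction.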
Temporary Update (Dirty Read) Problem
• Reading uncommitted data (write-read conflict)
• This occurs when one transaction updates a database item and then the
transaction fails for some reason.
• The updated item is accessed by another transaction before it is changed
back to its original value.
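The sequence of events can be sketched in a few lines of Python. The values and variable names below are hypothetical; the point is only that T2 ends up holding a value that, after T1's rollback, never officially existed in the database.

```python
db = {"X": 100}

# T1 updates X but has not committed yet.
t1_old_value = db["X"]        # saved so the update can be undone
db["X"] = t1_old_value - 30   # write_item(X) by T1, still uncommitted

# T2 reads the uncommitted (dirty) value and uses it.
t2_seen = db["X"]

# T1 fails, so its update is rolled back to the original value.
db["X"] = t1_old_value

# T2 now holds a value that differs from the database contents.
dirty = t2_seen != db["X"]
```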
Incorrect Summary Problem
• Occurs if one transaction is calculating an aggregate summary function on a
number of records while other transactions are updating some of these records
before the first transaction is committed.
• The aggregate function may calculate some values before they are updated
and others after they are updated.
T1                  T2
sum = 0
read_item(A)
sum = sum + A
read_item(B)
sum = sum + B
                    read_item(B)
                    B = 50
read_item(C)
sum = sum + C
Concept of schedules and serializability
Schedules-Definition
• Transaction schedule or history:
• When transactions are executing concurrently in an interleaved fashion, the
order of execution of operations from the various transactions forms what is
known as a transaction schedule (history).
• A schedule S of n transactions T1, T2, …, Tn:
• It is an ordering of the operations of the transactions, subject to the
following constraint:
• for each transaction Ti that participates in S, the operations of Ti in S must
appear in the same order in which they occur in Ti.
• Note that operations from other transactions Tj can be interleaved with the
operations of Ti in S.
• The scheduler decides the order of execution of concurrent database accesses.
• The order in which two actions of a transaction T appear in a schedule must be
the same as the order in which they appear in T.
Example:
T1: R(V) W(V)
T2: R(Y) W(Y)
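To make the ordering constraint concrete, the sketch below enumerates every schedule of the two example transactions in Python. It is a minimal illustration; the function name `interleavings` and the operation labels such as "r1(V)" are hypothetical notation for the example above.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


T1 = ["r1(V)", "w1(V)"]
T2 = ["r2(Y)", "w2(Y)"]
schedules = list(interleavings(T1, T2))
```

Two transactions with two operations each give six possible schedules; two of them are the serial schedules (T1 then T2, and T2 then T1) and the other four are interleaved.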
Conflict schedules
• Two operations in a schedule are said to conflict if they satisfy all three of the following
conditions:
• They belong to different transactions
• They access the same data item X
• At least one of the operations is a write_item(X)
• Three types of conflict:
• Read-write conflict(RW)
• Write-read conflict(WR)
• Write-write conflict(WW)
Eg. Sa: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y);
• Conflicting operations, why?
• r1(X) and w2(X)
• r2(X) and w1(X)
• w1(X) and w2(X)
• Non-conflicting operations, why?
• r1(X) and r2(X)
• w2(X) and w1(Y)
• r1(X) and w1(X)
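The three conditions can be checked mechanically. The following Python sketch parses schedule Sa and lists its conflicting pairs; the helper names `parse`, `conflicts`, and `conflicting_pairs` are hypothetical.

```python
import re
from itertools import combinations


def parse(schedule):
    """Turn 'r1(X); w2(X)' into [('r', 1, 'X'), ('w', 2, 'X')]."""
    return [(op.lower(), int(t), item.upper())
            for op, t, item in re.findall(r"([rwRW])(\d+)\((\w+)\)", schedule)]


def conflicts(a, b):
    """Apply the three conditions: different transactions, same item, at least one write."""
    return a[1] != b[1] and a[2] == b[2] and "w" in (a[0], b[0])


def conflicting_pairs(schedule):
    """Return every pair of operations in the schedule that conflicts."""
    ops = parse(schedule)
    return [(a, b) for a, b in combinations(ops, 2) if conflicts(a, b)]


Sa = "r1(X); r2(x); w1(X); r1(Y); W2(X); W1(Y);"
pairs = conflicting_pairs(Sa)
```

Two reads of the same item never conflict, operations on different items never conflict, and operations of the same transaction are excluded by the first condition.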
Types of Schedules
• Schedules are classified into four types based on recoverability:
1. Recoverable schedule:
• One where no committed transaction needs to be rolled back.
• A schedule S is recoverable if no transaction Tj in S commits until all transactions Ti that
have written an item that Tj reads have committed.
• Examples:
• Sc: r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1; (not recoverable)
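The recoverability condition can be tested with a short Python sketch. This is a simplified check under a stated assumption: it tracks only the most recent writer of each item and does not model an abort restoring an earlier value. The function name `is_recoverable` is hypothetical.

```python
import re


def is_recoverable(schedule):
    """Check that no transaction commits before every transaction it read from has committed.

    Simplification: only the most recent writer of each item is tracked, and
    aborts are not modeled as restoring an earlier writer's value.
    """
    ops = re.findall(r"([rwca])(\d+)(?:\((\w+)\))?", schedule.lower())
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # reader -> set of transactions whose writes it has read
    committed = set()
    for op, t, item in ops:
        if op == "w":
            last_writer[item] = t
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != t:
                reads_from.setdefault(t, set()).add(writer)
        elif op == "c":
            if not reads_from.get(t, set()) <= committed:
                return False
            committed.add(t)
    return True


Sc = "r1(X); w1(X); r2(X); r1(Y); w2(X); c2; a1;"
```

In Sc, T2 reads the value of X written by T1 and then commits before T1 does, so the check fails; moving a commit of T1 ahead of c2 would make the schedule pass.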
Schedules based on Serializability
• Result equivalent:
• Two schedules are called result equivalent if they produce the same final state of
the database
• Two types of equivalent schedule:
• Conflict and view
i. Conflict equivalent:
• Two schedules are said to be conflict equivalent if the order of any two
conflicting operations is the same in both schedules. E.g., neither of the
following pairs is conflict equivalent:
• S1: r1(x); w2(x); and S2: w2(x); r1(x);
• S1: w1(x); w2(x); and S2: w2(x); w1(x);
• Conflict serializable:
• A schedule S is said to be conflict serializable if it is conflict equivalent to
some serial schedule S’.
• Every conflict serializable schedule is serializable.
Conflict Equivalence
• If you can transform an interleaved schedule by swapping consecutive non-
conflicting operations of different transactions into a serial schedule, then
the original schedule is conflict serializable.
• Example:
R(A) W(A)
R(B) W(B)
View Equivalence
• View equivalence: a less restrictive definition of equivalence of schedules.
• View serializability:
• A schedule is view serializable if it is view equivalent to a serial schedule.
• Two schedules S and S’ are said to be view equivalent if the following three conditions hold:
• The same set of transactions participates in S and S’, and S and S’ include the same
operations of those transactions.
• If Ti reads a value of A written by Tj in S, it must also read the value of A written
by Tj in S’.
• For each data object A, the transaction that performs the final write on A in S
must also perform the final write on A in S’.
• Conflict serializability and view serializability are the same under the constrained write
assumption, which assumes that if T writes X, the value written is constrained by the value
of X it read; i.e., new X = f(old X)
• Any conflict serializable schedule is also view serializable, but not vice versa.
Relationship between view and conflict equivalence…
• Consider the following schedule of three transactions
• T1: r1(X), w1(X); T2: w2(X); and T3: w3(X):
• Schedule Sa: r1(X); w2(X); w1(X); w3(X); c1; c2; c3;
• In Sa, the operations w2(X) and w3(X) are blind writes, since T2 and T3 do not read the value
of X.
• Sa is view serializable, since it is view equivalent to the serial schedule T1, T2, T3.
• However, Sa is not conflict serializable, since it is not conflict equivalent to any serial
schedule.
Testing for conflict serializability: Algorithm
• Considers only the read_item(X) and write_item(X) operations
• Constructs a precedence graph (serialization graph), a graph with directed edges
• An edge is created from Ti to Tj if one of the operations in Ti appears before a
conflicting operation in Tj
• The schedule is conflict serializable if and only if the precedence graph has no cycles.
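The algorithm can be sketched in Python as follows. This is a minimal illustration, with hypothetical function names, that builds the edge set from a schedule string and looks for a cycle with depth-first search. It is applied to the schedule Sa of three transactions given earlier (r1(X); w2(X); w1(X); w3(X); ...).

```python
import re


def precedence_graph(schedule):
    """Return edges (Ti, Tj): an operation of Ti precedes a conflicting operation of Tj."""
    ops = [(op, int(t), item) for op, t, item in
           re.findall(r"([rw])(\d+)\((\w+)\)", schedule.lower())]
    edges = set()
    for i, (op_a, ta, xa) in enumerate(ops):
        for op_b, tb, xb in ops[i + 1:]:
            if ta != tb and xa == xb and "w" in (op_a, op_b):
                edges.add((ta, tb))
    return edges


def has_cycle(edges):
    """Detect a cycle with depth-first search over the directed edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True   # reached a node already on the current path: cycle
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(visit(n) for n in graph[node])
        state[node] = "done"
        return found

    return any(visit(n) for n in graph)


def is_conflict_serializable(schedule):
    """A schedule is conflict serializable iff its precedence graph is acyclic."""
    return not has_cycle(precedence_graph(schedule))


Sa = "r1(X); w2(X); w1(X); w3(X); c1; c2; c3;"
```

For Sa the graph contains both T1 → T2 (r1(X) before w2(X)) and T2 → T1 (w2(X) before w1(X)), which forms a cycle, so Sa is not conflict serializable.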
Constructing the Precedence Graphs
• FIGURE 5: Constructing the precedence graphs for schedules A to D from Figure
4 (from slide No 30) to test for conflict serializability.
• (a) Precedence graph for serial schedule A.
• (b) Precedence graph for serial schedule B.
• (c) Precedence graph for schedule C (not serializable).
• (d) Precedence graph for schedule D (serializable, equivalent to schedule A).
Summary of Schedule Types
Transaction Support in SQL
• A single SQL statement is always considered to be atomic.
• Either the statement completes execution without error or it fails and leaves the database
unchanged.
• Every transaction has three characteristics: access mode, diagnostic size, and isolation level
1. Access mode: READ ONLY or READ WRITE
• If the access mode is READ ONLY, the INSERT, DELETE, UPDATE, and CREATE
commands cannot be executed on the database
• The default is READ WRITE unless the isolation level of READ UNCOMMITTED is
specified, in which case READ ONLY is assumed.
2. Diagnostic size n: specifies an integer value n, indicating the number of error conditions
that can be held simultaneously in the diagnostic area.
3. Isolation level can be
• READ UNCOMMITTED,
• READ COMMITTED,
• REPEATABLE READ or
• SERIALIZABLE. The default is SERIALIZABLE.
• Sample SQL transaction:
EXEC SQL WHENEVER SQLERROR GOTO UNDO;
EXEC SQL SET TRANSACTION
    READ WRITE
    DIAGNOSTICS SIZE 5
    ISOLATION LEVEL SERIALIZABLE;
EXEC SQL INSERT
    INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY)
    VALUES ('Robert', 'Smith', '991004321', 2, 35000);
EXEC SQL UPDATE EMPLOYEE
    SET SALARY = SALARY * 1.1
    WHERE DNO = 2;
EXEC SQL COMMIT;
GOTO THE_END;
UNDO: EXEC SQL ROLLBACK;
THE_END: ...
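The same commit-or-rollback pattern can be tried out with Python's built-in sqlite3 module. This is only a rough analogue of the embedded SQL sample, not a translation: SQLite has no SET TRANSACTION statement, so the access mode, diagnostics size, and isolation level settings are omitted. The table definition, the function names, and the `fail` flag that simulates an error are assumptions made for the illustration.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical EMPLOYEE table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, SSN TEXT PRIMARY KEY, "
        "DNO INTEGER, SALARY REAL)"
    )
    conn.commit()
    return conn


def hire_and_raise(conn, fail=False):
    """Run the INSERT and UPDATE as one transaction; return True if it committed."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY) "
                "VALUES (?, ?, ?, ?, ?)",
                ("Robert", "Smith", "991004321", 2, 35000),
            )
            if fail:
                raise RuntimeError("simulated failure between the two statements")
            conn.execute("UPDATE EMPLOYEE SET SALARY = SALARY * 1.1 WHERE DNO = 2")
        return True
    except RuntimeError:
        return False  # plays the role of the UNDO label: the INSERT was rolled back


def employee_count(conn):
    """Count the rows currently in EMPLOYEE."""
    return conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
```

Calling `hire_and_raise(conn, fail=True)` leaves the table empty, which shows the all-or-nothing behavior: the INSERT that ran before the failure is undone.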
iii. Overwriting Uncommitted Data: WW Conflicts
• A transaction T2 could overwrite the value of an object A, which has already been
modified by a transaction T1, while T1 is still in progress.
iv. Phantoms:
• New rows appear when the same conditional read is repeated.
• A transaction T1 may read a set of rows from a table, perhaps based on some
condition specified in the SQL WHERE clause.
• Now suppose that a transaction T2 inserts a new row that also satisfies the
WHERE clause condition of T1, into the table used by T1.
• If T1 is repeated, then T1 will see a row that previously did not exist, called a
phantom.
Possible Violation of Serializability
                                Type of Violation
Isolation level       Dirty read    Nonrepeatable read    Phantom
------------------------------------------------------------------
READ UNCOMMITTED      yes           yes                   yes
READ COMMITTED        no            yes                   yes
REPEATABLE READ       no            no                    yes
SERIALIZABLE          no            no                    no