0% found this document useful (0 votes)
5 views

Chapter 3 Transaction Processing Concepts

transaction processing

Uploaded by

mtaddis19
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
5 views

Chapter 3 Transaction Processing Concepts

transaction processing

Uploaded by

mtaddis19
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 87

Chapter 3

Transaction Processing Concepts


Difference between single and multi- User
system?
• Single-User System
– At most one user at a time can use the DBMS.
• Eg. Personal computer system.
• Multi-user System
– Many users can access the DBMS
concurrently.
• Eg. Airline reservation, Bank and the like system
are operated by many users who submit transaction
concurrently to the system.
– This is achieved by multi programming, which
allows the computer to execute multiple
programs /processes at the same time.
Concurrency of Transaction
• Interleaved processing
– Concurrent execution of processes is interleaved in a
single CPU that using round robin algorithm as an
examples.
– The advantages interleaving process are
• keeps the CPU busy when the process requires I/O by switching
to execute another process rather than remaining idle during
I/O time and
• Increase system throughput (average no. of transactions
completed within a given time) and
• prevents long process from delaying other processes (minimize
unpredictable delay in the response time).
• Parallel processing: -The process is concurrently
executed in multiple CPUs.
What is Transaction?
• It is a unit of program execution that
accesses and possibly updates various
data items.
• Transaction is initiated by user program
written in
• A high-level data-manipulation language (typically SQL), or

• Programming language (for example, C++, or Java), with


embedded database accesses in JDBC or ODBC.
Cont’d
• It is a logical unit of database processing that
includes one or more access operations (read -
retrieval, write - insert or update, delete).
– Examples include ATM transactions, credit card
approvals, flight reservations, hotel check-in,
phone calls, supermarket scanning, academic
registration and billing.

• It is a collection of operations that form a single


logical unit of work.
Transaction boundaries
• Any single transaction in an application
program is bounded with Begin and End
statements.
– An application program may contain several
transactions separated by the Begin and End
transaction boundaries.

• This collection of steps must appear to the


user as a single, indivisible unit.
Cont’d
• Since transaction is indivisible, it either
executes in its entirety or not at all.
– Thus, if a transaction begins to execute but
fails for whatever reason, any changes to the
database that the transaction may have made
must be undone.
• This requirement holds regardless of
whether the transaction itself failed. For
example,
– if it divided by zero,
– the operating system crashed, or
– The computer itself stopped operating.
ACID properties of the transactions

• Atomicity.
– Either all operations of the transaction are reflected properly in the

database, or none are.

– This “all-or-none” property is referred to as atomicity.

• Consistency.
– Execution of a transaction in isolation (with no other transaction

executing concurrently) preserves the consistency of the database.

– A transaction must preserve database consistency—if a transaction

is run atomically in isolation starting from a consistent database, the

database must again be consistent at the end of the transaction.


Cont’d
• Isolation
– Even though multiple transactions may execute
concurrently,
• The system guarantees that, for every pair of
transactions Ti and Tj, it appears to Ti that either Tj
finished execution before Ti started or Tj started
execution after Ti finished.

– Thus, each transaction is unaware of other


transactions executing concurrently in the system.
Cont’d
• The database system must take special actions
– to ensure that transactions operate properly without
interference from concurrently executing database
statements. This property is referred to as isolation.

• Durability
– After a transaction completes successfully, the changes it
has made to the database persist, even if there are
system failures.
Simple Model of a Database
• A database is a collection of data items.
– Because SQL is a powerful and complex language, we
focus on when data are moved from disk to main
memory and main memory to disk.

• Granularity of data a field, a record ,or a whole


disk block that measure the size of the data item

• Basic operations that a transaction can perform are


read and write.
Cont’d
• Transactions access data using two
operations.
– read_item(X):
• Reads a database item named X into a program variable.

• To simplify our notation, we assume that the program


variable is also named X.

– write_item(X):
• Writes the value of program variable X into the database
item named X.
Cont’d
• Basic unit of data transfer from the disk to the

computer main memory is one block.

• read_item(X) command includes the following steps:

1. Find the address of the disk block that contains item X.

2. Copy that disk block into a buffer in main memory (if

that disk block is not already in some main memory buffer).

3. Copy item X from the buffer to the program variable

named X.
Cont’d
• write_item(X) command includes the following steps:
1. Find the address of the disk block that contains item X.

2. Copy that disk block into a buffer in main memory (if


that disk block is not already in some main memory buffer).

3. Copy item X from the program variable named X into its


correct location in the buffer.

4. Store the updated block from the buffer back to disk (either
immediately or later).
Cont’d
• The DBMS maintains a number of buffers in the main

memory that holds database disk blocks which

contains the database items being processed.


– When this buffers are occupied and if there is a need for

additional database block to be copied to the main memory.

• Some buffer management policy is used to choose for

replacement but if the chosen buffer has been modified,

it must be written back to disk before it is used.


Transaction Atomicity and Durability

• A transaction may not always complete its

execution successfully.
– Such a transaction is aborted.

• If we are to ensure the atomicity property, an

aborted transaction must have no effect on the

state of the database.


– Thus, any changes that the aborted transaction made to

the database must be undone.


Cont’d
– Once the changes caused by an aborted transaction have

been undone, we say that the transaction has been rolled

back.

• It is part of the responsibility of the recovery scheme

to manage transaction aborts.


– This is done typically by maintaining a log.

• Each database modification made by a transaction is

first recorded in the log.


Cont’d

• We record

– the identifier of the transaction performing the

modification,

– the identifier of the data item being modified, and

– both the old value (prior to modification) and the

new value (after modification) of the data item.

• Only then is the database itself modified.


Cont’d

• Maintaining a log provides


– the possibility of redoing a modification to ensure
atomicity and durability.
– the possibility of undoing a modification to ensure
atomicity in case of a failure during transaction
execution.

• A transaction that completes its execution


successfully is said to be committed.
Cont’d
• A committed transaction that has performed updates

transforms the database into a new consistent state,


which must persist even if there is a system failure.
– Once a transaction has committed, we cannot undo its effects by

aborting it.

• The only way to undo the effects of a committed transaction

is to execute a compensating transaction (pay costs).


– For instance, if a transaction added $20 to an account, the

compensating transaction would subtract $20 from the account.


Transaction States
– Active: - the transaction stays in this state while it is in
executing.

– Partially committed, after the final statement has been


executed.

– Failed, after the discovery that normal execution can no longer


proceed.

– Aborted, after the transaction has been rolled back and the
database has been restored to its state prior to the start of
the transaction.

– Committed, after successful completion.


Cont’d
• A transaction has committed only if it has entered the
committed state.
• A transaction has aborted only if it has entered the
aborted state.
• A transaction is said to have terminated if it has either
committed or aborted.
Cont’d

Figure 1: - State diagram of a


transaction
Cont’d
• A transaction starts in the active state.

• When it finishes its final statement, it enters the


partially committed state.
– At this point, the transaction has completed its
execution, but it is still possible that it may have to be
aborted, since
• The actual output may still be temporarily residing in main memory,
and

• Thus a hardware failure may prevent its successful completion.


Cont’d

• The database system then writes out enough

information to disk that,


– even in the event of a failure, the updates

performed by the transaction can be re-created

when the system restarts after the failure.

• When the last of this information is written out,

the transaction enters the committed state.


Cont’d

• A transaction enters the failed state


after the system determines that
– The transaction can no longer proceed
with its normal execution (example,
because of hardware or logical errors).

• Such a transaction must be rolled back.


– Then, it enters the aborted state.
Cont’d
• At this point(Aborted state), the system has two options:

1. It can restart the transaction, but only if the transaction was


aborted as a result of some hardware or software error that
was not created through the internal logic of the transaction.
– A restarted transaction is considered to be a new transaction.

2. It can kill the transaction.


– It usually does so because of some internal logical error that can be
corrected only by rewriting the application program, or because the input
was bad, or because the desired data were not found in the database.
Transaction Isolation
• Transaction-processing systems usually allow
multiple transactions to run concurrently.
• Allowing multiple transactions to update data
concurrently causes several problems with
consistency of the data.
Cont’d
• There are two good reasons for allowing concurrency:

1. Improved throughput and resource utilization.


– To increases the throughput of the system—that is, the

number of transactions executed in a given amount of time.

– Correspondingly, the processor (CPU) and disk

utilization also increase;

– In other words, the processor and disk spend less time idle,

or not performing any useful work.


Cont’d
2. Reduced waiting time
– If transactions run serially, a short transaction may have to wait for a

preceding long transaction to complete, which can lead to unpredictable

delays in running a transaction.

– If the transactions are operating on different parts of the database, it is

better to let them run concurrently, sharing the CPU cycles and disk

accesses among them.

– Concurrent execution reduces the unpredictable delays in running

transactions. Moreover, it also reduces the average response time:

the average time for a transaction to be completed after it has been

submitted.
Cont’d

• The database system must control


– The interaction among the concurrent
transactions to prevent them from destroying
the consistency of the database.

• It does so through a variety of


mechanisms called concurrency-control
schemes.
Cont’d
• Example: - Let T1 and T2 be two transactions
that transfer money from one account to
another.
Cont’d
• Suppose the current values of accounts A and B are 1000

birr and 2000 birr, respectively.

• Suppose two transactions are executed one at a time in

the order T1 followed by T2.

• In the figure, the sequence of instruction steps is in

chronological order from top to bottom.


Cont’d

• The final values of accounts A and B, after the

execution in Figure 2 takes place, are 855 birr

and 2145 birr, respectively.

• Thus, the total amount of money in accounts A

and B—that is, the sum A + B—is preserved after

the execution of both transactions.


Cont’d
Cont’d
• Schedule 1 is serial schedule in which T1 is

followed by T2.

• Similarly, if the transactions are executed one at a

time in the order T2 followed by T1, then the

corresponding execution sequence is that of Figure 3.


– Again, as expected, the sum A + B is preserved, and the

final values of accounts A and B are 850 birr and 2150

birr, respectively.
Cont’d
Cont’d
• The execution sequences just described are called
schedules.
– They represent the chronological order in which instructions
are executed in the system.

– These schedules are serial.

• Each serial schedule consists of a sequence of


instructions from various transactions, where the
instructions belonging to one single transaction appear
together in that schedule.
Cont’d
• When the database system executes several transactions

concurrently, the corresponding schedule no longer needs to

be serial.
– If two transactions are running concurrently,
• the operating system may execute one transaction for a little while, then

perform a context switch, execute the second transaction for some time,

and then switch back to the first transaction for some time, and so on.

• With multiple transactions, the CPU time is shared among all

the transactions.
Cont’d
• Not all concurrent executions result in a correct state.

• Consider the schedule of Figure 5.


– After the execution of this schedule, we arrive at a state where

the final values of accounts A and B are 950 birr and 2100

birr, respectively.

– This final state is an inconsistent state, since we have gained

50 birr in the process of the concurrent execution.

– Indeed, the sum A + B is not preserved by the execution of the

two transactions.
Cont’d
• It is the job of the database system to ensure that
any schedule that is executed will leave the
database in a consistent state.
• The concurrency-control component of the
database system carries out this task.
Cont’d
• We can ensure consistency of the database under
concurrent execution
– by making sure that any schedule that is executed has
the same effect as a schedule that could have occurred
without any concurrent execution.

• That is, the schedule should, in some sense, be


equivalent to a serial schedule.
• Such schedules are called serializable schedules.
Serializable
• Serial schedules are serializable, but if steps of multiple

transactions are interleaved, it is harder to determine whether a

schedule is serializable or not.

• Since transactions are programs, it is difficult to determine

exactly
– What operations a transaction performs and

– How operations of various transactions interact.

• For this reason, we shall consider only on two types of operations

that a transaction can perform on a data item: read and write.


Cont’d
• We assume that, between a read (Q) instruction

and a write (Q) instruction on a data item Q,


– A transaction may perform an arbitrary sequence of

operations on the copy of Q that is residing in the local

buffer of the transaction.

• Let us consider a schedule S in which there are two

consecutive instructions, I and J, of transactions

Ti and Tj, respectively (i = j).


Cont’d
• If I and J refer to different data items, then we can swap (exchange)

I and J without affecting the results of any instruction in the schedule.

• If I and J refer to the same data item Q, then the order of the two

steps may matter.


1. I = read (Q), J = read (Q). The order of I and J does not matter.

2. I = read (Q), J = write (Q).

– If I comes before J, then Ti does not read the value of Q that is written by Tj

in instruction J .

– If J comes before I, then Ti reads the value of Q that is written by Tj.

– Thus, the order of I and J matters.


Cont’d
3. I = write (Q), J = read (Q).
– The order of I and J matters for reasons similar to those of the previous case.

4. I = write (Q), J = write (Q).


– Since both instructions are write operations, the order of these instructions does

not affect either Ti or Tj.

– However, the value obtained by the next read (Q) instruction of S is affected,

since the result of only the latter of the two write instructions is preserved in the

database.

• If there is no other write (Q) instruction after I and J in S, then the order

of I and J directly affects the final value of Q in the database state that

results from schedule S.


Cont’d
• We say that I and J conflict
– if they are operations by different transactions on the same data item,

and at least one of these instructions is a write operation.

• To illustrate the concept of conflicting instructions, we consider

schedule 3.
– The write (A) instruction of T1 conflicts with the read (A)

instruction of T2.

– However, the write (A) instruction of T2 does not conflict with the

read (B) instruction of T1, because the two instructions access

different data items.


Cont’d
Cont’d
Cont’d
• Let I and J be consecutive instructions of a schedule S.
– If I and J are instructions of different transactions and I and J do not

conflict, then we can swap the order of I and J to produce a new

schedule S.

• S is equivalent to S, since all instructions appear in the same

order in both schedules except for I and J, whose order does not

matter.
– Since the write (A) instruction of T2 in schedule 3 does not conflict

with the read (B) instruction of T1, we can swap these instructions

to generate an equivalent schedule, schedule 5, in Figure 7.


Cont’d
• Regardless of the initial system state, schedules 3 and 5 both

produce the same final system state.


– We continue to swap non conflicting instructions:

– Swap the read (B) instruction of T1 with the read (A) instruction of T2.

– Swap the write (B) instruction of T1 with the write (A) instruction of T2.

• Swap the write (B) instruction of T1 with the read (A) instruction

of T2.

• The final result of these swaps, schedule 6 of Figure 8, is a serial

schedule.
Transaction Isolation and Atomicity
• Effect of transaction failures during concurrent
execution.
Cont’d
• If a transaction Ti fails, for whatever reason, we need to
undo the effect of this transaction to ensure the
atomicity property of the transaction.
– In a system that allows concurrent execution, the atomicity

property requires that any transaction Tj that is dependent on Ti


(that is, Tj has read data written by Ti) is also aborted.

– To achieve this, we need to place restrictions on the type of

schedules permitted in the system. In the following two


subsections, we address the issue of what schedules are acceptable
from the viewpoint of recovery from transaction failure.
Recoverable Schedules

• A recoverable schedule is one


where,
– for each pair of transactions Ti and Tj
such that Tj reads a data item
previously written by Ti, the commit
operation of Ti appears before the
commit operation of Tj.
Cascadeless Schedules

• Even if a schedule is recoverable, to recover

correctly from the failure of a transaction Ti, we

may have to roll back several transactions.

• Such situations occur if transactions have read

data written by Ti.


Cont’d
• Example

– transaction T8 writes a value of A that is read by transaction T9.

– Transaction T9 writes a value of A that is read by transaction T10.


• Suppose that, at this point, T8 fails. T8 must be rolled back. Since T9 is

dependent on T8, T9 must be rolled back. Since T10 is dependent on T9,

T10 must be rolled back.

• This phenomenon, in which a single transaction failure leads to

a series of transaction rollbacks, is called cascading rollback.


Cont’d
Cont’d
• Cascading rollback is undesirable, since it leads to the undoing of

a significant amount of work.

• It is desirable to restrict the schedules to those where cascading

rollbacks cannot occur.

• Such schedules are called Cascadeless schedules.


– Formally, a Cascadeless schedule is one where, for each pair of

transactions Ti and Tj such that Tj reads a data item previously written

by Ti, the commit operation of Ti appears before the read operation of Tj.

– It is easy to verify that every Cascadeless schedule is also

recoverable.
Transaction Isolation Levels
• Serializability is allows programmers to ignore issues

related to concurrency when they code transactions.

• The SQL standard also allows a transaction to specify that it

may be executed in such a way that it becomes non-

serializable with respect to other transactions.


– For instance, a transaction may operate at the isolation level of

read uncommitted, which permits the transaction to read a data

item even if it was written by a transaction that has not been

committed.
The isolation levels specified by the SQL
standard are as follows:
• Serializable usually ensures serializable execution.

• Repeatable read allows only committed data to be read and

further requires that, between two reads of a data item by a

transaction, no other transaction is allowed to update it

• Read committed allows only committed data to be read, but

does not require repeatable reads.


– For instance, between two reads of a data item by the transaction,

another transaction may have updated the data item and

committed.
Cont’d
• Read uncommitted allows uncommitted data to be

read.
– It is the lowest isolation level allowed by SQL.

• All the isolation levels above additionally disallow

dirty writes, that is,


– they disallow writes to a data item that has already been

written by another transaction that has not yet

committed or aborted.
Locking
• Instead of locking the entire database, a transaction

could, lock only those data items that it accesses.

• Under such a policy, the transaction must hold locks long

enough to ensure Serializability, but for a period short

enough not to harm performance excessively.

• There are two kinds of locks.

1. Shared locks are used for data that the transaction reads and

2. Exclusive locks are used for those it writes.


Cont’d
• Many transactions can hold shared locks on the
same data item at the same time,
– but a transaction is allowed an exclusive lock on a data
item only if no other transaction holds any lock (regardless
of whether shared or exclusive) on the data item.

• This use of two modes of locks along with two-


phase locking allows concurrent reading of
data while still ensuring Serializability.
Transactions as SQL Statements
• In SQL, insert statements create new data and delete

statements delete data.


– These two statements are, write operations, since they change

the database, but their interactions with the actions of other

transactions are different.

• Example, consider the following SQL query on our

university database that finds all instructors who earn more

than 90,000 birr.


– Select ID, name from instructor where salary > 90000;
Cont’d
• Using our sample instructor relation,

– we find that only Ein-stein and Brandt satisfy the

condition.

• Now assume that around the same time we are

running our query, another user inserts a new

instructor named James‖ whose salary is $100,000.


– Insert into instructor values (’11111’, ’James’,

’Marketing’, 100000);
Cont’d
• The result of our query will be different depending on

whether this insert comes before or after our

query is run.
– In a concurrent execution of these transactions, it is

intuitively clear that they conflict, but this is a conflict not

captured by our simple model.

– This situation is referred to as the phantom phenomenon,

because a conflict may exist on phantom data.


Cont’d

• But in an SQL statement, the specific data

items (tuples) referenced may be determined

by a where clause predicate.

• So the same transaction, if run more than

once, might reference different data items

each time it is run if the values in the

database change between runs.


Cont’d

• One way of dealing with the above problem is

to recognize that it is not sufficient for

concurrency control to consider only the tuples

that are accessed by a transaction;


– the information used to find the tuples that are

accessed by the transaction must also be

considered for the purpose of concurrency control.


Cont’d
• The information used to find tuples could be updated by an

insertion or deletion, or in the case of an index, even by an

update to a search-key attribute.


– For example, if locking is used for concurrency control, the data

structures that track the tuples in a relation, as well as index structures,

must be appropriately locked.

• However, such locking can lead to poor concurrency in some

situations; index-locking protocols which maximize concurrency,

while ensuring Serializability in spite of inserts, deletes, and

predicates in queries.
Cont’d
• Let us consider again the query:
– Select ID, name from instructor where salary>
90000;

• And the following SQL update:


– Update instructor set salary = salary *0.9 where
name = ’Wu’;
Cont’d
• If our query reads the entire instructor relation,

then it reads the tuple with Wu’s data and conflicts

with the update.


– However, if an index were available that allowed our

query direct access to those tuples with salary > 90000,


• then our query would not have accessed Wu’s data at all

because Wu’s salary is initially 90,000 birr in our example

instructor relation, and reduces to 81,000 birr after the update.


Cont’d
– However, using the above approach, it would appear that

the existence of a conflict depends on a low-level query

processing decision by the system that is unrelated to a

user-level view of the meaning of the two SQL statements!

– An alternative approach to concurrency control treats an

insert, delete or update as conflicting with a predicate

on a relation, if it could affect the set of tuples selected by

a predicate.
Cont’d
• In our example query above,
– the predicate is salary > 90000‖, and an update of Wu’s
salary from 90,000 birr to a value greater than 90,000
birr, or an update of Einstein’s salary from a value
greater than 90,000 birr to a value less than or equal to
90,000 birr, would conflict with this predicate.

• Locking based on this idea is called predicate


locking; however predicate locking is expensive, and
not used in practice.
Summary
• A transaction is a unit of program execution that accesses and

updates various data items.

• Transactions are required to have the ACID properties:

atomicity, consistency, isolation, and durability.


– Atomicity ensures that either all the effects of a transaction are

reflected in the database, or none are; a failure cannot leave the

database in a state where a transaction is partially executed.

– Consistency ensures that, if the database is initially consistent, the

execution of the transaction (by itself) leaves the database in a

consistent state.
Cont’d

– Isolation ensures that concurrently executing

transactions are isolated from one another, so

that each has the impression that no other

transaction is executing concurrently with it.

– Durability ensures that, once a transaction has

been committed, that transaction’s updates do

not get lost, even if there is a system failure.


Cont’d
• Concurrent execution of transactions improves throughput of transactions

and system utilization, and also reduces waiting time of transactions.

• The various types of storage in a computer are volatile storage, nonvolatile

storage, and stable storage.

• Data in volatile storage, such as in RAM, are lost when the computer crashes.

• Data in nonvolatile storage, such as disk, are not lost when the computer

crashes, but may occasionally be lost because of failures such as disk crashes.

• Data in stable storage are never lost.


Cont’d
• Stable storage that must be accessible online is approximated with mirrored

disks, or other forms of RAID, which provide redundant data storage.

• Offline or archival, stable storage may consist of multiple tape copies of data

stored in physically secure locations.

• When several transactions execute concurrently on the database, the

consistency of data may no longer be preserved.

• It is therefore necessary for the system to control the interaction among the

concurrent transactions.
Cont’d
• Since a transaction is a unit that preserves consistency, a serial

execution of transactions guarantees that consistency is

preserved.

• A schedule captures the key actions of transactions that affect

concurrent execution, such as read and write operations, while

abstracting away internal details of the execution of the

transaction.
– We require that any schedule produced by concurrent processing of a set

of transactions will have an effect equivalent to a schedule produced

when these transactions are run serially in some order.


Cont’d
• A system that guarantees this property is said to ensure

serializability.

• There are several different notions of equivalence

leading to the concepts of conflict serializability and

view serializability.

• Serializability of schedules generated by concurrently executing

transactions can be ensured through one of a variety of mechanisms

called concurrency-control policies.


Cont’d

• We can test a given schedule for conflict

serializability by constructing a precedence graph

for the schedule, and by searching for absence of

cycles in the graph.

• However, there are more efficient concurrency-

control policies for ensuring serializability.


Cont’d
• Schedules must be recoverable, to make sure that if transaction a

sees the effects of transaction b, and b then aborts, then a also

gets aborted.
– Schedules should preferably be Cascadeless, so that the abort of a

transaction does not result in cascading aborts of other transactions.

– Cascadelessness is ensured by allowing transactions to only read

committed data.

• The concurrency-control–management component of the

database is responsible for handling the concurrency-control

policies.
Questions

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy