Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

DBMS TRANSACTIONS part 1&2, Lecture notes of Database Management Systems (DBMS)

Transactions in database management system

Typology: Lecture notes

2018/2019

Uploaded on 11/08/2019

rp-tv
rp-tv 🇮🇳

1 document

1 / 16

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
UNIT III TRANSACTION PROCESSING AND CONCURRENCY CONTROL
Introduction-Properties of Transaction- Serializability- Concurrency Control Locking
Mechanisms-Two Phase Commit Protocol-Dead lock.
Transaction
A transaction is a unit of program execution that accesses and updates various data
items.
Usually, a transaction is initiated by a user program written in a high-level data-
manipulation language or programming language (for example, SQL, COBOL, C, C++,
or Java), where it is delimited by statements (or function calls) of the form begin
transaction and end transaction.
The transaction consists of all operations executed between the begin transaction
and end transaction.
ACID Properties
To ensure integrity of the data, we require that the database system maintain the following
properties of the transactions:
1. Atomicity. A transaction is said to be atomic if all the operation of the transaction are
executed at once or none of them is executed.
2. Consistency. A database is said to be consistent if a transaction is executed in a isolated
manner.
3. Isolation. Even though multiple transactions may execute concurrently, the system
guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished
execution before Ti started, or Tj started execution after Ti finished. Thus, each transaction
is unaware of other transactions executing concurrently in the system.
4. Durability. After a transaction completes successfully, the changes it has made to the
database persist, even if there are system failures.
These properties are often called the ACID properties; the acronym is derived from the first
letter of each of the four properties.
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff

Partial preview of the text

Download DBMS TRANSACTIONS part 1&2 and more Lecture notes Database Management Systems (DBMS) in PDF only on Docsity!

UNIT III TRANSACTION PROCESSING AND CONCURRENCY CONTROL

Introduction-Properties of Transaction- Serializability- Concurrency Control – Locking

Mechanisms-Two Phase Commit Protocol-Dead lock.

Transaction

A transaction is a unit of program execution that accesses and updates various data

items.

Usually, a transaction is initiated by a user program written in a high-level data-

manipulation language or programming language (for example, SQL, COBOL, C, C++,

or Java), where it is delimited by statements (or function calls) of the form begin

transaction and end transaction.

The transaction consists of all operations executed between the begin transaction

and end transaction.

ACID Properties

To ensure integrity of the data, we require that the database system maintain the following

properties of the transactions:

1. Atomicity. A transaction is said to be atomic if all the operation of the transaction are

executed at once or none of them is executed.

2. Consistency. A database is said to be consistent if a transaction is executed in a isolated

manner.

3. Isolation. Even though multiple transactions may execute concurrently, the system

guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished

execution before Ti started, or Tj started execution after Ti finished. Thus, each transaction

is unaware of other transactions executing concurrently in the system.

4. Durability. After a transaction completes successfully, the changes it has made to the

database persist, even if there are system failures.

These properties are often called the ACID properties ; the acronym is derived from the first

letter of each of the four properties.

Let Ti be a transaction that transfers 50 from account A to account B. This transaction can be

defined as:

Read (A);

A := A − 50;

Write (A);

read(B);

B := B + 50;

Write (B).

read (X), which transfers the data item X from the database to a local buffer belonging to

the transaction that executed the read operation.

write (X), which transfers the data item X from the the local buffer of the transaction that

executed the write back to the database.

Let us now consider each of the ACID requirements. (For ease of presentation, we consider

them in an order different from the order A-C-I-D).

Consistency : The consistency requirement here is that the sum of A and B be unchanged by

the execution of the transaction. Without the consistency requirement, money could be

created or destroyed by the transaction! It can be verified easily that, if the database is

consistent before an execution of the transaction, the database remains consistent after the

execution of the transaction. Ensuring consistency for an individual transaction is the

responsibility of the application programmer who codes the transaction.

Atomicity : Suppose that, just before the execution of transaction Ti the values of accounts A

and B are 1000 and 2000, respectively. Now suppose that, during the execution of

transaction Ti, a failure occurs that prevents Ti from completing its execution successfully.

Examples of such failures include power failures, hardware failures, and software errors.

Question 1 : During its execution, a transaction passes through several states, until it finally

commits or aborts. List all possible sequences of states through which a transaction may

pass. Explain why each state transition may occur.

Answer: The possible sequences of states are:-

a. active→ partially committed →committed. This is the normal sequence a successful

transaction will follow. After executing all its statements it enters the partially committed

state. After enough recovery information has been written to disk, the transaction finally

enters the committed state.

b. active → partially committed → aborted. After executing the last statement of the

transaction, it enters the partially committed state. But before enough recovery information

is written to disk, a hardware failure may occur destroying the memory contents. In this case

the changes which it made to the database are undone, and the transaction enters the

aborted state.

c. active →failed →aborted. After the transaction starts, if it is discovered at some point that

normal execution cannot continue (either due to internal program errors or external errors),

it enters the failed state. It is then rolled back, after which it enters the aborted state.

Concurrent Executions

Transaction-processing systems usually allow multiple transactions to run

concurrently. Allowing multiple transactions to update data concurrently causes

several complications with consistency of the data. Ensuring consistency in spite of

concurrent execution of transactions requires extra work; it is far easier to insist that

transactions run serially —that is, one at a time, each starting only after the previous

one has completed.The motivation for using concurrent execution in a database is

essentially the same as the motivation for using multiprogramming in an operating

system.

There are two good reasons for allowing concurrency:

  1. Improved throughput and resource utilization
  2. Reduced waiting time.

Conflict Serializability

Let us consider a schedule S in which there are two consecutive instructions Ii and Ij, of

transactions Ti and Tj , respectively (i _= j). If Ii and Ij refer to different data items, then we

can swap Ii and Ij without affecting the results of any instruction in the schedule. However, if

Ii and Ij refer to the same data item Q, then the order of the two steps may matter. Since we

are dealing with only read and write instructions, there are four cases that we need to

consider:

1. Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not matter, since the same value of Q is

read by Ti and Tj , regardless of the order.

2. Ii = read(Q), Ij = write(Q). If Ii comes before Ij, then Ti does not read the value of Q that is

written by Tj in instruction Ij. If Ij comes before Ii, then Ti reads the value of Q that is written

by Tj. Thus, the order of Ii and Ij matters.

3. Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters for reasons similar to those of the

previous case.

4. Ii = write(Q), Ij = write(Q). Since both instructions are write operations, the order of these

instructions does not affect either Ti or Tj. However, the value obtained by the next read(Q)

instruction of S is affected, since the result of only the latter of the two write instructions is

preserved in the database. If there is no other write (Q) instruction after Ii and Ij in S, then

the order of Ii and Ij directly affects the final value of Q in the database state that results

from schedule S.

View Serializability

Consider two schedules S and S’, where the same set of transactions participates in both

schedules. The schedules S and S’ are said to be view equivalent if three conditions are met:

schedule T2 with the single edge T2 - > T1, since all the instructions of T2 are executed before

the first instructions of T1 is executed.

If the precedence graph for S has a cycle then schedule S is not conflict serializable. If

the graph contains no cycle, then the schedule is known as conflict serializable.

A serializability order of the transactions can be obtained through topological sorting,

which determines a linear order consistent with the partial partial order of the precedence

graph.

Illustration of topological sorting

Thus, to test for conflict serializability, we need to construct the precedence graph and to

invoke a cycle detection algorithm. Cycle detection algorithm can be found in standard

textbooks on algorithms. Cycle detection algorithms, such as those based on depth first

search, require on the order of n2 operations, and where n is the number of vertices in the

graph (that is, number of transactions). Thus we have a practical scheme for determining

conflict serializability.

Transaction Definition in SQL

A data-manipulation language must include a construct for specifying the set of actions that

constitute a transaction. The SQL standard specifies that a transaction begins implicitly.

Transactions are ended by one of these SQL statements:

  • Commit work commits the current transaction and begins a new one.
  • Rollback work causes the current transaction to abort.

There a simple and efficient method for determining conflict serializability of a schedule.

Consider a schedule S. Then construct a directed graph, called a precedence graph , from S. This

graph consists of a pair G = (V, E), where V is a set of vertices and E is a set of edges. The set of

vertices consists of all the transactions participating in the schedule. The set of edges consists

of all edges T i

→T

j for which one of three conditions holds:

1. Ti executes write(Q) before T j

executes read(Q).

2. Ti executes read(Q) before T j

executes write(Q).

3. Ti executes write(Q) before T j

executes write(Q).

Concurrency Control

Concurrent control is a technique to ensure that concurrent execution of transaction

must results in a consistent database.

It uses certain protocols to ensure the serializability of a schedule of concurrent

executing transactions.

Some of these protocols are:

Lock based protocol

Time stamp based protocol

Lock-Based Protocols

One way to ensure serializability is to require that data items be accessed in a mutually

exclusive manner; that is, while one transaction is accessing a data item, no other

transaction can modify that data item. The most common method used to implement this

requirement is to allow a transaction to access a data item only if it is currently holding a lock

on that item.

Lock

It is a variable associated with a data item X that reflects the status of data item X

with respect to possible operations that can be applied. A data item X can be locked

in two modes:

Shared lock

If a transaction T has obtained shared mode lock, then T can read but cannot write. It

is denoted by S.

Exclusive lock

If a transaction T has obtained exclusive mode lock, then T can both read and write. It

is denoted by X.

Transaction Ti may unlock a data item that it had locked at some earlier point. Note

that a transaction must hold a lock on a data item as long as it accesses that item.

Moreover, for a transaction to unlock a data item immediately after its final access of

that data item is not always desirable, since serializability may not be ensured.

Concurrency-

T1 T2 Control Manager

lock-X(B)

grant-X(B, T1)

read(B)

B := B 50

write(B)

unlock(B) lock-S(A)

grant-S(A, T2)

read(A)

unlock(A)

lock-S(B)

grant-S(B, T2)

read(B)

unlock(B)

display(A + B)

lock-X(A) grant-X(A, T2)

read(A)

A := A + 50

write(A)

unlock(A)

Granting of Locks

When a transaction requests a lock on a data item in a particular mode, and no other

transaction has a lock on the same data item in a conflicting mode, the lock can be granted.

However, care must be taken to avoid the following scenario. Suppose a transaction T2 has a

shared-mode lock on a data item, and another transaction T1 requests an exclusive-mode

lock on the data item. Clearly, T1 has to wait for T2 to release the shared-mode lock.

Meanwhile, a transaction T3 may request a shared-mode lock on the same data item. The

lock request is compatible with the lock granted to T2, so T3 may be granted the shared-

mode lock. At this point T2 may release the lock, but still T1 has to wait for T3 to finish. But

again, there may be a new transaction T4 that requests a shared-mode lock on the same

data item, and is granted the lock before T3 releases it. In fact, it is possible that there is a

sequence of transactions that each requests a shared-mode lock on the data item, and each

transaction releases the lock a short while after it is granted, but T1 never gets the exclusive-

mode lock on the data item. The transaction T1 may never make progress, and is said to be

starved.

The Two-Phase Locking Protocol

One protocol that ensures serializability is the two-phase locking protocol. This protocol

requires that each transaction issue lock and unlock requests in two phases:

1. Growing phase. A transaction may obtain locks, but may not release any lock. 2. Shrinking phase. A transaction may release locks, but may not obtain any new locks.

Initially, a transaction is in the growing phase. The transaction acquires locks as needed.

Once the transaction releases a lock, it enters the shrinking phase, and it can issue no more

lock requests.

Timestamps base Protocol

Timestamps:

With each transaction Ti in the system, we associate a unique fixed timestamp, denoted by

TS(Ti). This timestamp is assigned by the database system before the transaction Ti starts

execution. If a transaction Ti has been assigned timestamp TS(Ti), and a new transaction Tj

enters the system, then TS(Ti) < TS(Tj ). There are two simple methods for implementing this

scheme:

wait–die

The wait–die scheme is a non-preemptive technique. When transaction Ti requests a data

item currently held by T j

, T

i

is allowed to wait only if it has a timestamp smaller than that of

T

j

(that is, Ti is older than T j

). Otherwise, Ti is rolled back (dies).

For example, suppose that transactions T22, T23, and T24 have timestamps 5, 10, and 15,

respectively. If T22 requests a data item held by T23, then T22 will wait. If T24 requests a data

item held by T23, then T24 will be rolled back.

wound–wait

The wound–wait scheme is a preemptive technique. It is a counterpart to the wait–die

scheme. When transaction Ti requests a data item currently held by T j

, T

i

is allowed to wait

only if it has a timestamp larger than that of T j (that is, Ti is younger than T j ). Otherwise, T j is

rolled back (T j

is wounded by T i

Returning to our example, with transactions T22, T23, and T24, if T22 requests a data item

held by T23, then the data item will be preempted from T23, and T23 will be rolled back. If

T24 requests a data item held by T23, then T24 will wait.

Deadlock Detection and Recovery

If a system does not employ some protocol that ensures deadlock freedom, then a detection

and recovery scheme must be used. An algorithm that examines the state of the system is

invoked periodically to determine whether a deadlock has occurred. If one has, then the

system must attempt to recover from the deadlock. To do so, the system must:

O Maintain information about the current allocation of data items to

transactions, as well as any outstanding data item requests.

O Provide an algorithm that uses this information to determine whether the

system has entered a deadlock state.

O Recover from the deadlock when the detection algorithm determines that a

deadlock exists.

Database Recovery System

There are various types of failure that may occur in a system, each of which needs to be

dealt with in a different manner.

Transaction failure. There are two types of errors that may cause a transaction to fail:

Logical error. The transaction can no longer continue with its normal execution

because of some internal condition, such as bad input, data not found, overflow, or

resource limit exceeded.

System error. The system has entered an undesirable state (for example, deadlock),

as a result of which a transaction cannot continue with its normal execution. The

transaction, however, can be reexecuted at a later time.

System crash. There is a hardware malfunction, or a bug in the database software or the

operating system, that causes the loss of the content of volatile storage, and brings transaction

processing to a halt. The content of nonvolatile storage remains intact, and is not corrupted.

Disk failure. A disk block loses its content as a result of either a head crash or failure during a

data transfer operation. Copies of the data on other disks, or archival backups on tertiary

media, such as tapes, are used to recovers from the failure.

Storage Types

Volatile storage. Information residing in volatile storage does not usually survive

system crashes. Examples of such storage are main memory and cache memory.

Access to volatile storage is extremely fast, both because of the speed of the

memory access itself, and because it is possible to access any data item in volatile

storage directly.

Nonvolatile storage. Information residing in nonvolatile storage survives system

crashes. Examples of such storage are disk and magnetic tapes. Disks are used for

online storage, whereas tapes are used for archival storage.

Stable storage. Information residing in stable storage is never lost.

Deferred Database Modification

The deferred-modification technique ensures transaction atomicity by recording all

database modifications in the log, but deferring the execution of all write operations

of a transaction until the transaction partially commits.

When a transaction partially commits, the information on the log associated with the

transaction is used in executing the deferred writes. If the system crashes before the

transaction completes its execution, or if the transaction aborts, then the

information on the log is simply ignored.

The execution of transaction Ti proceeds as follows. Before T i

starts its execution, a

record <T i

start> is written to the log. A write(X) operation by T i

results in the writing

of a new record to the log. Finally, when T i

partially commits, a record <T i

commit> is

written to the log.

Immediate Database Modification

The immediate-modification technique allows database modifications to be output to

the database while the transaction is still in the active state. Data modifications

written by active transactions are called uncommitted modifications.

Before a transaction Ti starts its execution, the system writes the record <T i

start> to

the log. During its execution, any write(X) operation by T i is preceded by the writing

of the appropriate new update record to the log. When T i

partially commits, the

system writes the record <T i

commit> to the log.