DBMS II
TYBSC(CS)
THE CONCEPT OF A TRANSACTION
 A transaction is an execution of a user program, seen by the DBMS as a series of reads and writes
of database objects.
 Database ‘objects’ are the units in which programs read or write information.
 The units could be pages, records, and so on, but this is dependent on the DBMS and is not
central to the principles underlying concurrency control or recovery.
ACID PROPERTIES
There are four important properties of transactions that a DBMS must ensure to maintain data in
the face of concurrent access and system failures.
1. ATOMICITY:
Users should be able to regard the execution of each transaction as atomic: either all actions are
carried out or none are. Users should not have to worry about the effect of incomplete
transactions (say, when a system crash occurs).
2. CONSISTENCY:
Each transaction, run by itself with no concurrent execution of other transactions, must preserve
the consistency of the database. This property is called consistency, and the DBMS assumes that it
holds for each transaction. Ensuring this property of a transaction is the responsibility of the user.
3. ISOLATION:
Users should be able to understand a transaction without considering the effect of other
concurrently executing transactions, even if the DBMS interleaves the actions of several
transactions for performance reasons. This property is sometimes referred to as isolation:
Transactions are isolated, or protected, from the effects of concurrently scheduling other
transactions.
4. DURABILITY:
Once the DBMS informs the user that a transaction has been successfully completed, its effects
should persist even if the system crashes before all its changes are reflected on disk. This property
is called durability.
The acronym ACID is sometimes used to refer to the four properties of transactions that we have
presented here: atomicity, consistency, isolation and durability.
Atomicity and Durability
Transactions can be incomplete for three kinds of reasons.
1. A transaction can be aborted, or terminated unsuccessfully, by the DBMS because some
anomaly arises during execution. If a transaction is aborted by the DBMS for some internal
reason, it is automatically restarted and executed anew.
2. The system may crash (e.g. Power supply is interrupted) while one or more transactions are in
progress.
3. A transaction may encounter an unexpected situation (for example, read an unexpected data
value or be unable to access some disk) and decide to abort (i.e., terminate itself).
How does a DBMS maintain Atomicity and Durability?
 A transaction that is interrupted in the middle may leave the database in an inconsistent
state.
 Thus a DBMS must find a way to remove the effects of partial transactions from the
database, that is, it must ensure transaction atomicity: either all of a transaction's actions
are carried out, or none are.
 A DBMS ensures transaction atomicity by undoing the actions of incomplete transactions.
 This means that users can ignore incomplete transactions in thinking about how the
database is modified by transactions over time.
 To be able to do this, the DBMS maintains a record, called the log, of all writes to the
database.
 The log is also used to ensure durability: If the system crashes before the changes made by
a completed transaction are written to disk, the log is used to remember and restore these
changes when the system restarts.
 The DBMS component that ensures atomicity and durability is called the recovery manager.
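For illustration only, here is a tiny Python sketch of this idea (hypothetical names, not real DBMS internals): every write first records the object's before-image in a log, and aborting a transaction restores the before-images in reverse order.

```python
# Minimal sketch: undo via a log of before-images. For simplicity, an
# object that did not exist before the write is restored as None.

class Database:
    def __init__(self):
        self.objects = {}   # object id -> current value
        self.log = []       # (transaction id, object id, before-image)

    def write(self, xact, obj, value):
        # Record the before-image *before* changing the object.
        self.log.append((xact, obj, self.objects.get(obj)))
        self.objects[obj] = value

    def abort(self, xact):
        # Undo this transaction's writes in reverse order.
        for t, obj, before in reversed(self.log):
            if t == xact:
                self.objects[obj] = before

db = Database()
db.objects["A"] = 50
db.write("T1", "A", 100)
db.abort("T1")              # A reverts to 50
```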
TRANSACTIONS AND SCHEDULES
A transaction is seen by the DBMS as a series, or list, of actions.
The actions that can be executed by a transaction include
1. reads: R_T(O): Transaction T is reading database object O.
2. writes: W_T(O): Transaction T is writing database object O.
3. commit (i.e., complete successfully)
4. abort (i.e., terminate and undo all the actions carried out thus far).
When the transaction T is clear from the context, we can omit the subscript.
Schedule:
A schedule is a list of actions (reading, writing, aborting, or committing) from a set of
transactions, and the order in which two actions of a transaction T appear in a schedule must be
the same as the order in which they appear in T. Intuitively, a schedule represents an actual or
potential execution sequence.
For example, the following schedule shows an execution order for actions of two
transactions T1 and T2. The schedule does not contain an abort or commit action for either
transaction.
T1              T2
R(A)
W(A)
                R(B)
                W(B)
R(C)
W(C)
A Schedule Involving Two Transactions
Complete Schedule:
A schedule that contains either an abort or a commit for each transaction whose actions are
listed in it is called a complete schedule. A complete schedule must contain all the actions of every
transaction that appears in it.
T1              T2
R(A)
W(A)
                R(B)
                W(B)
                commit
R(C)
W(C)
commit
A Complete Schedule
Serial Schedule:
If the actions of different transactions are not interleaved, that is, transactions are executed
from start to finish, one by one, we call the schedule a serial schedule.
T1              T2
R(A)
W(A)
R(C)
W(C)
commit
                R(B)
                W(B)
                commit

or

T1              T2
                R(B)
                W(B)
                commit
R(A)
W(A)
R(C)
W(C)
commit

Serial Schedules
CONCURRENT EXECUTION OF TRANSACTIONS
The DBMS interleaves the actions of different transactions to improve performance, in terms of
increased throughput or improved response times for short transactions, but not all
interleavings should be allowed.
Motivation for Concurrent Execution
Ensuring transaction isolation while permitting concurrent execution is difficult, but is necessary
for the following performance reasons.
1. While one transaction is waiting for a page to be read in from disk, the CPU can process
another transaction. Overlapping I/O and CPU activity reduces the amount of time disks and
processors are idle, and increases system throughput (the average number of transactions
completed in a given time).
2. Interleaved execution of a short transaction with a long transaction usually allows the short
transaction to complete quickly. In serial execution, a short transaction could get stuck
behind a long transaction leading to unpredictable delays in response time, or average
time taken to complete a transaction.
CONCURRENCY CONTROL
Serializability :-
Definition:
A serializable schedule over a set S of transactions is a schedule whose effect on any consistent
database instance is guaranteed to be identical to that of some complete serial schedule over the
set of committed transactions in S.
T1              T2
R(A)
W(A)
                R(B)
                W(B)
                commit
R(C)
W(C)
commit
A Serializable Schedule
Example:
The given schedule is serializable: if we execute it, its effect on any consistent database
instance is the same as executing T1 followed by T2, or T2 followed by T1; that is, its effect is
identical to some serial order. The database instance that results from executing the given
schedule is identical to the database instance that results from executing the transactions in some
serial order.
Some important points:
1. Executing the transactions serially in different orders may produce different results, but all
are presumed to be acceptable; the DBMS makes no guarantees about which of them will be
the outcome of an interleaved execution.
2. If a transaction computes a value and prints it to the screen, this is an `effect' that is not
directly captured in the state of the database. We will assume that all such values are also
written into the database, for simplicity.
Some Anomalies Associated with Interleaved Execution
Anomalies occur, leaving the database inconsistent, when a schedule involves conflicting actions.
A schedule involves conflicting actions if:
1. the schedule involves at least two transactions,
2. both transactions access the same data object, and
3. at least one of them modifies the data object (i.e., performs a write).
The three anomalous situations can be described in terms of when the actions of two transactions
T1 and T2 conflict with each other.
1. Reading Uncommitted Data (WR Conflicts)
2. Unrepeatable Reads (RW Conflicts)
3. Overwriting Uncommitted Data (WW Conflicts)
1. Reading Uncommitted Data (WR Conflicts)
Consider two transactions T1 and T2. A transaction T2 could read a database object A that
has been modified by another transaction T1, which has not yet committed. Such a read is called a
dirty read.
Example:
 Consider two transactions T1 and T2, each of which, run alone, preserves database
consistency: T1 transfers Rs.1000 from A to B, and T2 increments both A and B by 6 percent
(e.g., annual interest is deposited into these two accounts).
 Suppose that their actions are interleaved so that the account transfer program T1 deducts
Rs1000 from account A, then the interest deposit program T2 reads the current values of
accounts A and B and adds 6 percent interest to each, and then the account transfer
program credits Rs1000 to account B.
 The result of this schedule is different from any result that we would get by running one of
the two transactions first and then the other. The problem can be traced to the fact that the
value of A written by T1 is read by T2 before T1 has completed all its changes.
Note:
Although a transaction must leave a database in a consistent state after it completes, it is not
required to keep the database consistent while it is still in progress. Such a requirement would be
too restrictive: To transfer money from one account to another, a transaction must debit one
account, temporarily leaving the database inconsistent, and then credit the second account,
restoring consistency again.
2. Unrepeatable Reads (RW Conflicts)
 A transaction T2 could change the value of an object A that has been read by a transaction
T1, while T1 is still in progress.
 This situation causes two problems.
(1) If T1 tries to read the value of A again, it will get a different result, even though it has not
modified A in the meantime. This situation could not arise in a serial execution of two
transactions; it is called an unrepeatable read.
(2) Suppose that A is the number of available copies for a book. A transaction that places an
order first reads A, checks that it is greater than 0, and then decrements it. Transaction T1
reads A and sees the value 1. Transaction T2 also reads A and sees value 1, decrements A to
0, and commits. Transaction T1 then tries to decrement A and gets an error (violation of
integrity constraint).
T1              T2
R(A)
                R(A)
                W(A)
                W(B)
                commit
W(A)
commit
Unrepeatable read
3. Overwriting Uncommitted Data (WW Conflicts)
 A transaction T2 could overwrite the value of an object A, which has already been modified
by a transaction T1, while T1 is still in progress.
 Example:
o Suppose that Harry and Larry are two employees, and their salaries must be kept
equal.
o Transaction T1 sets their salaries to Rs1,000 and transaction T2 sets their salaries to
Rs2,000.
o If we execute these in the serial order T1 followed by T2, both receive the salary
Rs2,000; the serial order T2 followed by T1 gives each the salary Rs1,000.
o Either of these is acceptable from a consistency standpoint (although Harry and
Larry may prefer a higher salary!).
o Here, neither transaction reads a salary value before writing it; such a write is called
a blind write.
o Now, consider the following interleaving of the actions of T1 and T2:
 T1 sets Harry's salary to Rs1,000,
 T2 sets Larry's salary to Rs2,000,
 T1 sets Larry's salary to Rs1,000,
 and finally T2 sets Harry's salary to Rs2,000.
o The result is not identical to the result of either of the two possible serial executions,
and the interleaved schedule is therefore not serializable. It violates the desired
consistency criterion that the two salaries must be equal.
o The problem is that we have a lost update: T1 overwrites Larry's salary as set by T2,
and T2 overwrites Harry's salary as set by T1, and we cannot get the overwritten values back.
T1                T2
W(H)=1000
                  W(L)=2000
W(L)=1000
commit
                  W(H)=2000
                  commit
Overwriting uncommitted data
Schedules Involving Aborted Transactions
Unrecoverable Schedule:
Example:
Suppose that (1) an account transfer program T1 deducts $100 from account A, then (2) an
interest deposit program T2 reads the current values of accounts A and B and adds 6 percent
interest to each, then commits, and then (3) T1 is aborted. Now, T2 has read a value for A that
should never have been there! If T2 had not yet committed, we could deal with the situation by
cascading the abort of T1 and also aborting T2; this process would recursively abort any
transaction that read data written by T2, and so on. But T2 has already committed, and so we
cannot undo its actions! We say that such a schedule is unrecoverable.
Recoverable Schedule:
A recoverable schedule is one in which transactions commit only after (and if!) all
transactions whose changes they read commit. If transactions read only the changes of committed
transactions, not only is the schedule recoverable, but also aborting a transaction can be
accomplished without cascading the abort to other transactions. Such a schedule is said to avoid
cascading aborts.
Problem in undoing the actions of a transaction
 Consider two transactions T1 and T2. Suppose that T2 overwrites the value of an object A that
has been modified by T1, while T1 is still in progress, and T1 subsequently aborts.
 All of T1's changes to database objects are undone by restoring the value of any object that it
modified to the value of the object before T1's changes.
 When T1 is aborted, and its changes are undone in this manner, T2's changes are lost as well,
even if T2 decides to commit.
 So, for example, if A originally had the value 5, then was changed by T1 to 6, and by T2 to 7, if T1
now aborts, the value of A becomes 5 again.
 Even if T2 commits, its change to A is inadvertently lost.
LOCK-BASED CONCURRENCY CONTROL
 A DBMS must be able to ensure that only serializable, recoverable schedules are allowed,
and that no actions of committed transactions are lost while undoing aborted
transactions.
 A DBMS typically uses a locking protocol to achieve this.
 A locking protocol is a set of rules to be followed by each transaction (and enforced by the
DBMS), in order to ensure that even though actions of several transactions might be
interleaved, the net effect is identical to executing all transactions in some serial order.
Strict Two-Phase Locking (Strict 2PL)
 The most widely used locking protocol, called Strict Two-Phase Locking, or Strict 2PL, has
two rules.
(1) If a transaction T wants to read (respectively, modify) an object, it first requests
a shared (respectively, exclusive) lock on the object.
(2) All locks held by a transaction are released when the transaction is completed.
 A transaction that has an exclusive lock can also read the object; an additional shared lock is
not required.
 A transaction that requests a lock is suspended until the DBMS is able to grant it the
requested lock.
 The DBMS keeps track of the locks it has granted and ensures that if a transaction holds an
exclusive lock on an object, no other transaction holds a shared or exclusive lock on the
same object.
 Requests to acquire and release locks can be automatically inserted into transactions by the
DBMS; users need not worry about these details.
 These rules of Strict 2PL, however, reduce concurrency.
 To increase concurrency without sacrificing serializability, we can relax the second rule of
Strict 2PL to: A transaction cannot request additional locks once it releases any lock.
 That is, this variant (plain 2PL) allows transactions to release locks before the end, that is,
before the commit or abort action.
 Thus, every transaction has a `growing' phase in which it acquires locks, followed by a
`shrinking' phase in which it releases locks.
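These two rules can be checked mechanically. The following illustrative Python sketch (hypothetical, not a DBMS component) verifies whether one transaction's event sequence obeys them: under 2PL no lock is acquired after the first release, and under Strict 2PL no lock is released before the commit or abort.

```python
# Events are pairs: ('acquire', obj), ('release', obj), ('commit', None).

def is_two_phase(events):
    # 2PL: once any lock is released, no further lock may be acquired.
    released = False
    for action, _ in events:
        if action == "release":
            released = True
        elif action == "acquire" and released:
            return False
    return True

def is_strict_two_phase(events):
    # Strict 2PL: 2PL, plus no lock released before commit or abort.
    done = False
    for action, _ in events:
        if action in ("commit", "abort"):
            done = True
        elif action == "release" and not done:
            return False
    return is_two_phase(events)

# Plain 2PL but not strict: a lock is released before the commit.
ev = [("acquire", "A"), ("acquire", "B"), ("release", "A"),
      ("commit", None), ("release", "B")]
print(is_two_phase(ev), is_strict_two_phase(ev))   # True False
```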
Safe interleaving
The locking protocol allows only `safe' interleavings of transactions.
(1) If two transactions access completely independent parts of the database, they will be able to
concurrently obtain the locks that they need and proceed on their ways.
(2) If two transactions access the same object, and one of them wants to modify it, their actions
are effectively ordered serially: all actions of one of these transactions (the one that gets the
lock on the common object first) are completed before (this lock is released and) the other
transaction can proceed.
S_T(O): Transaction T requesting a shared lock on object O.
X_T(O): Transaction T requesting an exclusive lock on object O.
Schedule illustrating Strict 2PL
Consider the following schedule.
T1              T2
R(A)
W(A)
                R(A)
                W(A)
                R(B)
                W(B)
                commit
R(B)
W(B)
commit
 This interleaving could result in a state that cannot result from any serial execution of the two
transactions. For instance,
(1) T1 could change A from 10 to 20,
(2) then T2 (which reads the value 20 for A) could change B from 100 to 200, and
(3) then T1 would read the value 200 for B.
 If run serially, either T1 or T2 would execute first, and read the values 10 for A and 100 for B.
 Clearly, the interleaved execution is not equivalent to either serial execution.
 If the Strict 2PL protocol is used, the above interleaving is disallowed.
 Assuming that the transactions proceed at the same relative speed as before, T1
would obtain an exclusive lock on A first and then read and write A.
 Then, T2 would request a lock on A. However, this request cannot be granted until T1 releases
its exclusive lock on A, and the DBMS therefore suspends T2.
 T1 now proceeds to obtain an exclusive lock on B, reads and writes B, then finally commits, at
which time its locks are released.
 T2's lock request is now granted, and it proceeds. In this example the locking protocol results in
a serial execution of the two transactions,
 In general, however, the actions of different transactions could be interleaved.
T1              T2
X(A)
R(A)
W(A)
X(B)
R(B)
W(B)
commit
                X(A)
                R(A)
                W(A)
                X(B)
                R(B)
                W(B)
                commit
Illustrating Strict 2PL with serial execution
 In the following schedule, transaction T1 first requests a shared lock on data object A; then
transaction T2 requests a shared lock on A as well.
 Since two shared locks can be held on the same database object at the same time, the DBMS grants
this request. Then T2 requests an exclusive lock on data object B.
T1              T2
S(A)
R(A)
                S(A)
                R(A)
                X(B)
                R(B)
                W(B)
                commit
X(C)
R(C)
W(C)
commit
 T2 reads and writes B, then finally commits, at which time its locks are released.
 Now T1 requests an exclusive lock on data object C. The DBMS grants this request.
 T1 reads and writes C, then finally commits, at which time its locks are released.
 In this case there are no conflicting actions, so this interleaving is allowed.
LOCK-BASED CONCURRENCY CONTROL REVISITED
Conflict serializability
A schedule is conflict serializable if it is conflict equivalent to some serial schedule.
Conflict Equivalent
 Two schedules are said to be conflict equivalent if they involve the (same set of) actions of
the same transactions and they order every pair of conflicting actions of two committed
transactions in the same way.
 Two actions conflict if they operate on the same data object and at least one of them is a
write.
 The outcome of a schedule depends only on the order of conflicting operations; we can
interchange any pair of non-conflicting operations without altering the effect of the
schedule on the database.
 If two schedules are conflict equivalent, it is easy to see that they have the same effect on a
database.
 Indeed, because they order all pairs of conflicting operations in the same way, we can obtain
one of them from the other by repeatedly swapping pairs of non-conflicting actions, that is,
by swapping pairs of actions whose relative order does not alter the outcome.
 Every conflict serializable schedule is serializable, if we assume that the set of items in the
database does not grow or shrink; that is, values can be modified but items are not added or
deleted.
 However, some serializable schedules are not conflict serializable. For example, a schedule
containing blind writes can be equivalent to executing the transactions serially in the order T1,
T2, T3 and yet not be conflict equivalent to this serial schedule, because the writes of T1 and T2
are ordered differently.
Example:
Consider the following schedule:
T1              T2
W(X)
                R(Y)
R(Y)
Commit
                R(X)
                Commit
Conflict serializability
To check whether this schedule is conflict serializable, we check whether it is conflict
equivalent to some serial schedule. Consider the following serial schedule:
T1              T2
W(X)
R(Y)
Commit
                R(Y)
                R(X)
                Commit
Serial schedule of the given schedule
In this case the conflicting actions are the actions on data object X, and they are ordered in the
same way in both schedules. So the two schedules are conflict equivalent, and thus the given
schedule is conflict serializable.
Precedence graph:
 A schedule S is conflict serializable if and only if its precedence graph is acyclic.
 Strict 2PL ensures that the precedence graph for any schedule that it allows is acyclic.
 It is useful to capture all potential conflicts between the transactions in a schedule in a
precedence graph, also called a serializability graph.
 The precedence graph for a schedule S contains:
o A node for each committed transaction in S.
o An arc from Ti to Tj if an action of Ti precedes and conflicts with one of Tj's actions.
 If the precedence graph contains a cycle, the schedule is not conflict serializable.
 For example, given a schedule, we can build its precedence graph from the pairs of conflicting
actions and then test the graph for cycles, as in the sketch below.
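The following Python sketch (illustrative names; it assumes every transaction in the schedule commits) builds the precedence graph from a schedule given as (transaction, action, object) triples and tests it for cycles.

```python
def precedence_graph(schedule):
    """Return the arcs Ti -> Tj for a schedule of (xact, action, obj)."""
    edges = set()
    for i, (ti, ai, oi) in enumerate(schedule):
        for tj, aj, oj in schedule[i + 1:]:
            # Conflict: same object, different transactions, >= one write.
            if oi == oj and ti != tj and "W" in (ai, aj):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)

    def visit(node, path, done):
        if node in path:
            return True            # back edge: cycle found
        if node in done:
            return False
        path.add(node)
        found = any(visit(n, path, done) for n in graph.get(node, []))
        path.discard(node)
        done.add(node)
        return found

    return any(visit(n, set(), set()) for n in list(graph))

# The conflict-serializability example from above (commits omitted):
s = [("T1", "W", "X"), ("T2", "R", "Y"), ("T1", "R", "Y"), ("T2", "R", "X")]
print(has_cycle(precedence_graph(s)))   # False: conflict serializable
```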
Strict Schedule:
 A schedule is said to be strict if a value written by a transaction T is not read or overwritten
by other transactions until T either aborts or commits.
 Strict schedules are recoverable, do not require cascading aborts, and actions of aborted
transactions can be undone by restoring the original values of modified objects.
 Strict 2PL improves upon 2PL by guaranteeing that every allowed schedule is strict, in
addition to being conflict serializable.
 The reason is that when a transaction T writes an object under Strict 2PL, it holds the
(exclusive) lock until it commits or aborts. Thus, no other transaction can see or modify this
object until T is complete.
View Serializability
 A schedule is view serializable if it is view equivalent to some serial schedule.
 View Equivalent
o Two schedules S1 and S2 over the same set of transactions (i.e., any transaction that
appears in either S1 or S2 must also appear in the other) are view equivalent if they
satisfy the following conditions:
i. If Ti reads the initial value of object A in S1, it must also read the initial value of A
in S2.
ii. If Ti reads a value of A written by Tj in S1, it must also read the value of A written
by Tj in S2.
iii. For each data object A, the transaction (if any) that performs the final write on A
in S1 must also perform the final write on A in S2.
 Every conflict serializable schedule is view serializable, although the converse is not true.
 It can be shown that any view serializable schedule that is not conflict serializable contains
one or more blind writes.
LOCK MANAGEMENT
 The part of the DBMS that keeps track of the locks issued to transactions is called the lock
manager.
 The lock manager maintains a lock table, which is a hash table with the data object
identifier as the key. The DBMS also maintains a descriptive entry for each transaction in a
transaction table; among other things, the entry contains a pointer to a list of locks held
by the transaction.
 A lock table entry for an object which can be a page, a record, and so on, depending on the
DBMS contains the following information: the number of transactions currently holding a
lock on the object (this can be more than one if the object is locked in shared mode), the
nature of the lock (shared or exclusive), and a pointer to a queue of lock requests.
Implementing Lock and Unlock Requests
 According to the Strict 2PL protocol, before a transaction T reads or writes a database
object O, it must obtain a shared or exclusive lock on O and must hold on to the lock until it
commits or aborts.
 When a transaction needs a lock on an object, it issues a lock request to the lock manager:
(1) If a shared lock is requested, the queue of requests is empty, and the object is not
currently locked in exclusive mode, the lock manager grants the lock and updates
the lock table entry for the object (indicating that the object is locked in shared
mode, and incrementing the number of transactions holding a lock by one).
(2) If an exclusive lock is requested, and no transaction currently holds a lock on the
object (which also implies the queue of requests is empty), the lock manager grants
the lock and updates the lock table entry.
(3) Otherwise, the requested lock cannot be immediately granted, and the lock request
is added to the queue of lock requests for this object. The transaction requesting the
lock is suspended.
 When a transaction aborts or commits, it releases all its locks. When a lock on an object is
released, the lock manager updates the lock table entry for the object and examines the lock
request at the head of the queue for this object.
 If this request can now be granted, the transaction that made the request is woken up and
given the lock. Indeed, if there are several requests for a shared lock on the object at the
front of the queue, all of these requests can now be granted together.
Note:
 If T1 has a shared lock on O, and T2 requests an exclusive lock, T2's request is queued. Now,
if T3 requests a shared lock, its request enters the queue behind that of T2, even though the
requested lock is compatible with the lock held by T1.
 This rule ensures that T2 does not starve, that is, wait indefinitely while a stream of other
transactions acquire shared locks and thereby prevent T2 from getting the exclusive lock
that it is waiting for.
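A simplified, single-threaded Python sketch of this lock-table logic follows (hypothetical names; a real lock manager must additionally make each request and release atomic, for instance by guarding the table with a semaphore, as the next subsection explains).

```python
from collections import deque

class LockTable:
    def __init__(self):
        # object id -> [mode ('S'/'X'/None), set of holders, FIFO queue]
        self.entries = {}

    def request(self, obj, xact, mode):
        """Return True if granted; False means the transaction waits."""
        entry = self.entries.setdefault(obj, [None, set(), deque()])
        held, holders, queue = entry
        # (1) Shared lock: not locked in X mode and the queue is empty
        #     (so a queued X request is not starved by new readers).
        if mode == "S" and held != "X" and not queue:
            entry[0] = "S"
            holders.add(xact)
            return True
        # (2) Exclusive lock: no transaction holds any lock on the object.
        if mode == "X" and not holders:
            entry[0] = "X"
            holders.add(xact)
            return True
        # (3) Otherwise the request is queued and the transaction waits.
        queue.append((xact, mode))
        return False

    def release(self, obj, xact):
        entry = self.entries[obj]
        entry[1].discard(xact)
        if entry[1]:
            return                 # other shared holders remain
        entry[0] = None
        queue = entry[2]
        if queue:                  # wake the request at the queue head
            nxt, mode = queue.popleft()
            entry[0], entry[1] = mode, {nxt}
            # Several S requests at the front are granted together.
            while mode == "S" and queue and queue[0][1] == "S":
                entry[1].add(queue.popleft()[0])
```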
Atomicity of Locking and Unlocking
 The implementation of lock and unlock commands must ensure that these are atomic
operations.
 To ensure atomicity of these operations when several instances of the lock manager code
can execute concurrently, access to the lock table has to be guarded by an operating system
synchronization mechanism such as a semaphore.
 Suppose that a transaction requests an exclusive lock. The lock manager checks and finds
that no other transaction holds a lock on the object and therefore decides to grant the
request.
 But in the meantime, another transaction might have requested and received a conflicting
lock!
 To prevent this, the entire sequence of actions in a lock request call (checking to see if the
request can be granted, updating the lock table, etc.) must be implemented as an atomic
operation.
LOCK CONVERSIONS
 The DBMS maintains a transaction table, which contains (among other things) a list of the
locks currently held by a transaction.
 This list can be checked before requesting a lock, to ensure that the same transaction does
not request the same lock twice.
 However, a transaction may need to acquire an exclusive lock on an object for which it
already holds a shared lock. Such a lock upgrade request is handled specially by granting
the write lock immediately if no other transaction holds a shared lock on the object and
inserting the request at the front of the queue otherwise.
 The rationale for favoring the transaction thus is that it already holds a shared lock on the
object and queuing it behind another transaction that wants an exclusive lock on the same
object causes both transactions to wait for each other and therefore be blocked forever.
 Lock upgrades can lead to deadlocks caused by conflicting upgrade requests.
 For example, if two transactions that hold a shared lock on an object both request an
upgrade to an exclusive lock, this leads to a deadlock: each transaction is waiting for the
other to release its shared lock.
 A better approach is to avoid the need for lock upgrades altogether by obtaining exclusive
locks initially and downgrading to a shared lock once it is clear that this is sufficient.
 For example, in an SQL update statement, rows in a table are locked in exclusive mode first.
If a row does not satisfy the condition for being updated, the lock on the row is downgraded
to a shared lock.
 The downgrade approach reduces concurrency by obtaining write locks in some cases where
they are not required.
 On the whole, however, it improves throughput by reducing deadlocks.
 This approach is therefore widely used in current commercial systems.
 Concurrency can be increased further by introducing a new kind of lock, called an update lock,
that is compatible with shared locks but not with other update or exclusive locks.
 By setting an update lock initially, rather than an exclusive lock, we prevent conflicts with
read operations.
 Once we are sure we need not update the object, we can downgrade to a shared lock.
 If we need to update the object, we must first upgrade to an exclusive lock. This upgrade
does not lead to a deadlock because no other transaction can hold an update or exclusive lock
on the object.
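The resulting compatibilities for shared (S), update (U), and exclusive (X) locks can be written down directly; the following illustrative Python fragment encodes them as a table.

```python
# True = a held lock (row) is compatible with a requested lock (column).
COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}

def can_grant(requested, held_modes):
    return all(COMPATIBLE[(held, requested)] for held in held_modes)

print(can_grant("U", ["S"]))   # True: an update lock coexists with readers
print(can_grant("U", ["U"]))   # False: at most one update lock at a time
```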
Additional Issues: Lock Upgrades, Convoys, Latches
 We have concentrated thus far on how the DBMS schedules transactions, based on their
requests for locks. This interleaving interacts with the operating system's scheduling of
processes' access to the CPU and can lead to a situation called a convoy, where most of the
CPU cycles are spent on process switching. The problem is that a transaction T holding a
heavily used lock may be suspended by the operating system. Until T is resumed, every
other transaction that needs this lock is queued. Such queues, called convoys, can quickly
become very long; a convoy, once formed, tends to be stable.
 Convoys are one of the drawbacks of building a DBMS on top of a general-purpose operating
system with preemptive scheduling.
 In addition to locks, which are held over a long duration, a DBMS also supports short
duration latches. Setting a latch before reading or writing a page ensures that the physical
read or write operation is atomic; otherwise, two read/write operations might conflict if the
objects being locked do not correspond to disk pages (the units of I/O).
 Latches are unset immediately after the physical read or write operation is completed.
Deadlocks
 Consider the following example:
 Transaction T1 gets an exclusive lock on object A, T2 gets an exclusive lock on B.
 T1 requests an exclusive lock on B and is queued, and T2 requests an exclusive lock on A
and is queued. Now, T1 is waiting for T2 to release its lock and T2 is waiting for T1 to
release its lock!
 Such a cycle of transactions waiting for locks to be released is called a deadlock.
 Clearly, these two transactions will make no further progress. Worse, they hold locks that
may be required by other transactions.
 The DBMS must either prevent or detect (and resolve) such deadlock situations.
Deadlock Detection
 In the detection approach, the DBMS must periodically check for deadlocks.
 When a transaction Ti is suspended because a lock that it requests cannot be granted, it
must wait until all transactions Tj that currently hold conflicting locks release them.
 The lock manager maintains a structure called a waits-for graph to detect deadlock cycles.
 The nodes correspond to active transactions, and there is an arc from Ti to Tj if (and only
if) Ti is waiting for Tj to release a lock.
 The lock manager adds edges to this graph when it queues lock requests and removes edges
when it grants lock requests.
 The waits-for graph describes all active transactions, some of which will eventually abort.
 The waits-for graph is periodically checked for cycles, which indicate deadlock.
 A deadlock is resolved by aborting a transaction that is on a cycle and releasing its locks.
 This action allows some of the waiting transactions to proceed.
 The choice of which transaction to abort can be made using several criteria, such as:
o The one with the fewest locks
o The one that has done the least work
o The one that is farthest from completion
 The transaction might have been repeatedly restarted; if so, it should eventually be favored
during deadlock detection and allowed to complete.
For example, the last action of a schedule may add an edge that creates a cycle in the waits-for
graph; comparing the graph before and after that action shows how the deadlock is detected.
Deadlock Prevention
 We can prevent deadlocks by giving each transaction a priority and ensuring that lower
priority transactions are not allowed to wait for higher priority transactions (or vice versa).
 One way to assign priorities is to give each transaction a timestamp when it starts up. The
lower the timestamp, the higher the transaction's priority, that is, the oldest transaction has
the highest priority.
 If a transaction Ti requests a lock and transaction Tj holds a conflicting lock, the lock
manager can use one of the following two policies:
Wait-die: If Ti has higher priority, it is allowed to wait; otherwise it is aborted.
Wound-wait: If Ti has higher priority, abort Tj; otherwise Ti waits.
Wait-die policy:
 In the wait-die scheme, lower priority transactions can never wait for higher priority
transactions.
 The wait-die scheme is nonpreemptive; only a transaction requesting a lock can be aborted.
Wound-wait policy
 In the wound-wait scheme, higher priority transactions never wait for lower priority
transactions.
 In either case no deadlock cycle can develop.
 The wound-wait scheme is preemptive.
 Advantage (of wait-die):
A transaction that has all the locks it needs is never aborted for deadlock reasons, since only
a requester is ever aborted.
 Disadvantage (of wait-die):
A younger transaction that conflicts with an older transaction may be repeatedly aborted.
 We must also ensure that no transaction is perennially aborted because it never has a
sufficiently high priority. (Note that in both schemes, the higher priority transaction is
never aborted.)
 When a transaction is aborted and restarted, it should be given the same timestamp that it
had originally.
 Reissuing timestamps in this way ensures that each transaction will eventually become the
oldest transaction, and thus the one with the highest priority, and will get all the locks that
it requires.
 As a transaction grows older (and its priority increases), it tends to wait for more and more
younger transactions
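Both policies reduce to a comparison of timestamps. The following illustrative Python sketch assumes, as above, that a lower timestamp means an older and therefore higher priority transaction, and that Ti is requesting a lock held by Tj.

```python
def wait_die(ts_i, ts_j):
    # Wait-die: an older requester waits; a younger requester dies.
    return "Ti waits" if ts_i < ts_j else "abort Ti"

def wound_wait(ts_i, ts_j):
    # Wound-wait: an older requester wounds (aborts) the holder Tj;
    # a younger requester waits.
    return "abort Tj" if ts_i < ts_j else "Ti waits"

print(wait_die(1, 2))     # Ti waits   (Ti is older)
print(wait_die(2, 1))     # abort Ti   (Ti is younger, so it dies)
print(wound_wait(1, 2))   # abort Tj   (Ti is older, so it wounds Tj)
print(wound_wait(2, 1))   # Ti waits   (Ti is younger, so it waits)
```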
Conservative 2PL:
 A variant of 2PL, called Conservative 2PL, can also prevent deadlocks. Under Conservative
2PL, a transaction obtains all the locks it will ever need when it begins, or blocks waiting for
these locks to become available.
 This scheme ensures that there will be no deadlocks and, perhaps more important, that a
transaction that already holds some locks will not block waiting for other locks.
 If lock contention is heavy, Conservative 2PL can reduce the time that locks are held on
average, because transactions that hold locks are never blocked waiting for other locks.
Timestamp-Based Concurrency Control
 In optimistic concurrency control, a timestamp ordering is imposed on transactions, and
validation checks that all conflicting actions occurred in the same order.
 Timestamps can also be used in another way: each transaction can be assigned a timestamp
at startup, and we can ensure, at execution time, that if action ai of transaction Ti conflicts
with action aj of transaction Tj, ai occurs before aj if TS(Ti) < TS(Tj).
 If an action violates this ordering, the transaction is aborted and restarted.
 To implement this concurrency control scheme, every database object O is given a read
timestamp RTS(O) and a write timestamp WTS(O).
 If transaction T wants to read object O, and TS(T) < WTS(O), the order of this read with
respect to the most recent write on O would violate the timestamp order between this
transaction and the writer. Therefore, T is aborted and restarted with a new, larger
timestamp.
 If TS(T) > WTS(O), T reads O, and RTS(O) is set to the larger of RTS(O) and TS(T).
 If T is restarted with the same timestamp, it is guaranteed to be aborted again, due to the
same conflict.
 Now consider what happens when transaction T wants to write object O:
1. If TS(T) < RTS(O), the write action conflicts with the most recent read of O, and T is
therefore aborted and restarted.
2. If TS(T) < WTS(O), a naive approach would be to abort T because its write action
conflicts with the most recent write of O and is out of timestamp order. However, we can
safely ignore such a write and continue; ignoring outdated writes is called the Thomas
Write Rule.
3. Otherwise, T writes O and WTS(O) is set to TS(T).
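A minimal Python sketch of the whole scheme follows (illustrative names; RTS and WTS default to 0, and raising Abort stands for aborting T and restarting it with a new, larger timestamp). Case (2) is the Thomas Write Rule, discussed next.

```python
class Abort(Exception):
    pass

rts, wts, vals = {}, {}, {}      # RTS(O), WTS(O), current values

def read(ts, obj):
    if ts < wts.get(obj, 0):     # read would precede a newer write
        raise Abort
    rts[obj] = max(rts.get(obj, 0), ts)
    return vals.get(obj)

def write(ts, obj, value):
    if ts < rts.get(obj, 0):     # case 1: conflicts with a newer read
        raise Abort
    if ts < wts.get(obj, 0):     # case 2: Thomas Write Rule, skip write
        return
    vals[obj] = value            # case 3: perform the write
    wts[obj] = ts

write(1, "A", 10)                # T with TS(T) = 1 writes A
read(2, "A")                     # T with TS(T) = 2 reads A; RTS(A) = 2
try:
    write(1, "A", 99)            # case 1: RTS(A) = 2 > 1, so abort
except Abort:
    print("writer aborted; restart with a new timestamp")
```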
The Thomas Write Rule
If TS(T) < WTS(O), the current write action has, in effect, been made obsolete by the most recent
write of O, which follows the current write according to the timestamp ordering.
We can think of T's write action as if it had occurred immediately before the most recent write of
O and was never read by anyone.
If the Thomas Write Rule is not used, that is, if T is aborted in case (2), the timestamp protocol,
like 2PL, allows only conflict serializable schedules.
If the Thomas Write Rule is used, some schedules are permitted that are not conflict serializable.
Consider the following schedule: because T2's write follows T1's read and precedes T1's write of
the same object, it is not conflict serializable.
T1              T2
R(A)
                W(A)
                Commit
W(A)
Commit
The Thomas Write Rule relies on the observation that T2's write is never seen by any transaction;
the schedule above is therefore equivalent to the serializable schedule obtained by deleting this
write action, shown below:
T1              T2
R(A)
                Commit
W(A)
Commit
Recoverability:
 Unfortunately, the timestamp protocol just presented permits schedules that are not
recoverable.
 If TS(T1) = 1 and TS(T2) = 2, the schedule shown below is permitted by the timestamp protocol
(with or without the Thomas Write Rule).
 The timestamp protocol can be modified to disallow such schedules by buffering all write
actions until the transaction commits.
 In the example, when T1 wants to write A, WTS (A) is updated to reflect this action, but the
change to A is not carried out immediately; instead it is recorded in a private workspace, or
buffer.
 When T2 wants to read A subsequently, its timestamp is compared with WTS(A), and the
read is seen to be permissible.
 However, T2 is blocked until T1 completes. If T1 commits, its change to A is copied from the
buffer; otherwise, the changes in the buffer are discarded. T2 is then allowed to read A.
 This blocking of T2 is similar to the effect of T1 obtaining an exclusive lock on A.
 Nonetheless, even with this modification, the timestamp protocol permits some schedules
not permitted by 2PL; the two protocols are not quite the same.
T1              T2
W(A)
                R(A)
                W(B)
                Commit
 Because recoverability is essential, such a modification must be used for the timestamp
protocol to be practical.
CRASH RECOVERY
 The recovery manager of a DBMS is responsible for ensuring two important properties of
transactions: atomicity and durability.
 It ensures atomicity by undoing the actions of transactions that do not commit and
durability by making sure that all actions of committed transactions survive system
crashes.
 The recovery manager must deal with a wide variety of database states because it is called
on during system failures.
INTRODUCTION TO ARIES
 ARIES is a recovery algorithm that is designed to work with a steal, no-force approach.
 When the recovery manager is invoked after a crash, restart proceeds in three phases:
1. Analysis: Identifies dirty pages in the buffer pool (i.e., changes that have not been
written to disk) and active transactions at the time of the crash.
2. Redo: Repeats all actions, starting from an appropriate point in the log, and restores
the database state to what it was at the time of the crash.
3. Undo: Undoes the actions of transactions that did not commit, so that the database
reflects only the actions of committed transactions.
Consider a simple execution history involving transactions T1, T2, and T3 and pages P1, P3, and
P5, with a crash occurring while T1 and T3 are still active.
 When the system is restarted, the Analysis phase identifies T1 and T3 as transactions that
were active at the time of the crash.
 It also identifies the dirty pages P1, P3, and P5.
 All the updates (including those of T1 and T3) are reapplied in their original order during the
Redo phase.
 Since T2 is a committed transaction, all its actions must survive and therefore be written to
disk.
 Finally, the actions of T1 and T3 are undone in reverse order during the Undo phase; that is,
T3's write of P3 is undone, T3's write of P1 is undone, and then T1's write of P5 is undone.
Three main principles behind the ARIES recovery algorithm:
Write-ahead logging:
 Any change to a database object is first recorded in the log.
 The record in the log must be written to stable storage before the change to the database
object is written to disk.
Repeating history during Redo:
 Upon restart following a crash, ARIES retraces all actions of the DBMS before the crash and
brings the system back to the exact state that it was in at the time of the crash.
 Then, it undoes the actions of transactions that were still active at the time of the crash
(effectively aborting them).
Logging changes during Undo:
 Changes made to the database while undoing a transaction are logged to ensure such an
action is not repeated in the event of repeated restart.
The Log
 The log, sometimes called the trail or journal, is a history of actions executed by the DBMS.
 Physically, the log is a file of records stored in stable storage, which is assumed to survive
crashes.
 This durability can be achieved by maintaining two or more copies of the log on different
disks (perhaps in different locations), so that the chance of all copies of the log being
simultaneously lost is negligibly small.
 The most recent portion of the log, called the log tail, is kept in main memory and is
periodically forced to stable storage. This way, log records and data records are written to
disk at the same granularity (pages or sets of pages).
 Every log record is given a unique id called the log sequence number (LSN). As with any
record id, we can fetch a log record with one disk access given the LSN.
 LSNs should be assigned in monotonically increasing order; this property is required for the
ARIES recovery algorithm.
 If the log is a sequential file, in principle growing indefinitely, the LSN can simply be the
address of the first byte of the log record.
 For recovery purposes, every page in the database contains the LSN of the most recent log
record that describes a change to this page. This LSN is called the pageLSN.
 A log record is written for each of the following actions:
Updating a page:
 After modifying the page, an update type record is appended to the log tail.
 The pageLSN of the page is then set to the LSN of the update log record.
Commit:
 When a transaction decides to commit, it force-writes a commit type log record
containing the transaction id.
 That is, the log record is appended to the log, and the log tail is written to stable
storage, up to and including the commit record.
 The transaction is considered to have committed at the instant that its commit log
record is written to stable storage.
Abort:
 When a transaction is aborted, an abort type log record containing the transaction
id is appended to the log,
 And Undo is initiated for this transaction.
End:
 When a transaction is aborted or committed, some additional actions such as
removing the transaction's entry in the transaction table must be taken beyond
writing the abort or commit log record.
 After all these additional steps are completed, an end type log record containing the
transaction id is appended to the log.
Undoing an update:
 When a transaction is rolled back (because the transaction is aborted, or during
recovery from a crash), its updates are undone.
 When the action described by an update log record is undone, a compensation log
record, or CLR, is written.
FIELDS FOR LOG RECORD
 Every log record has certain fields: prevLSN, transID, and type.
 The set of all log records for a given transaction is maintained as a linked list going
back in time: the prevLSN field holds the LSN of the transaction's previous log record, and
the list is updated whenever a log record is added.
 The transID field is the id of the transaction generating the log record.
 The type field obviously indicates the type of the log record.
 Additional fields depend on the type of the log record.
ADDITIONAL FIELDS OF UPDATE TYPE OF LOG RECORD
 The fields in an update log record are as follows:
o The pageID field is the page id of the modified page.
o The length field gives the number of bytes modified.
o The offset field gives the position on the page at which the change starts.
o The before-image is the value of the changed bytes before the change.
o The after-image is the value of the changed bytes after the change.
 An update log record that contains both before- and after-images can be used to redo the
change and to undo it.
 A redo-only update log record contains just the after-image.
 Similarly, an undo-only update log record contains just the before-image.
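As an illustration, the layout of an update log record can be written as a Python dataclass with the field names above (this is a sketch, not the actual ARIES record format); the sample values follow the T1000/P500 example used later in these notes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UpdateLogRecord:
    lsn: int                   # log sequence number of this record
    prev_lsn: Optional[int]    # previous record of the same transaction
    trans_id: int              # transaction generating the record
    rec_type: str = "update"   # type field
    page_id: int = 0           # page that was modified
    offset: int = 0            # position on the page where change starts
    length: int = 0            # number of bytes modified
    before_image: bytes = b""  # old bytes (enough to undo the change)
    after_image: bytes = b""   # new bytes (enough to redo the change)

rec = UpdateLogRecord(lsn=10, prev_lsn=None, trans_id=1000,
                      page_id=500, offset=21, length=3,
                      before_image=b"ABC", after_image=b"DEF")
```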
Compensation log record (CLR)
 A compensation log record (CLR) is written just before the change recorded in an update
log record U is undone.
 A compensation log record C describes the action taken to undo the actions recorded in the
corresponding update log record and is appended to the log tail just like any other log
record.
 The compensation log record C also contains a field called undoNextLSN, which is the LSN
of the next log record that is to be undone for the transaction that wrote update record U;
 This field in C is set to the value of prevLSN in U.
 Unlike an update log record, a CLR describes an action that will never be undone, that is, we
never undo an undo action.
 The reason is simple: an update log record describes a change made by a transaction during
normal execution and the transaction may subsequently be aborted, whereas a CLR
describes an action taken to rollback a transaction for which the decision to abort has
already been made.
 Thus, the transaction must be rolled back, and the undo action described by the CLR is
definitely required.
 The number of CLRs that can be written during Undo is no more than the number of update
log records for active transactions at the time of the crash.
 It may well happen that a CLR is written to stable storage but that the undo action that it
describes is not yet written to disk when the system crashes again.
 In this case the undo action described in the CLR is reapplied during the Redo phase, just
like the action described in update log records.
Other Recovery-Related Data Structures
In addition to the log, the following two tables contain important recovery-related information:
Transaction table:
 This table contains one entry for each active transaction.
 The entry contains (among other things) the transaction id, the status, and a field called
lastLSN, which is the LSN of the most recent log record for this transaction.
 The status of a transaction can be that it is in progress, is committed, or is aborted. (In the
latter two cases, the transaction will be removed from the table once certain `clean up' steps
are completed.)
Dirty page table:
 This table contains one entry for each dirty page in the buffer pool, that is, each page with
changes that are not yet reflected on disk.
 The entry contains a field recLSN, which is the LSN of the first log record that caused the
page to become dirty.
 Consider the following simple example. Transaction T1000 changes the value of bytes 21 to
23 on page P500 from `ABC' to `DEF'.
 Transaction T2000 changes `HIJ' to `KLM' on page P600; then transaction T2000 changes bytes
20 through 22 from `GDE' to `QRS' on page P500.
 Then transaction T1000 changes `TUV' to `WXY' on page P505.
 At this instant, the active transactions are T1000 and T2000, the dirty pages are P500, P505,
and P600, and the dirty page table, the transaction table, and the log all reflect these changes.
The Write-Ahead Log Protocol
 Before writing a page to disk, every update log record that describes a change to this page
must be forced to stable storage.
 This is accomplished by forcing all log records up to and including the one with LSN equal to
the pageLSN to stable storage before writing the page to disk.
 WAL is the fundamental rule that ensures that a record of every change to the database is
available while attempting to recover from a crash.
 If a transaction made a change and committed, the no-force approach means that some of
these changes may not have been written to disk at the time of a subsequent crash.
 Without a record of these changes, there would be no way to ensure that the changes of a
committed transaction survive crashes.
 When a transaction is committed, the log tail is forced to stable storage, even if a no-force
approach is being used.
 If a force approach is used, all the pages modified by the transaction, rather than a portion
of the log that includes all its records, must be forced to disk when the transaction commits.
 The set of all changed pages is typically much larger than the log tail because the size of an
update log record is close to (twice) the size of the changed bytes, which is likely to be much
smaller than the page size.
 The cost of forcing the log tail is much smaller than the cost of writing all changed pages to disk.
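A Python sketch of the WAL rule (hypothetical structures: log_tail holds the in-memory log records, flushed_lsn is the highest LSN already on stable storage):

```python
log_tail = []       # in-memory log records, each a dict with an 'lsn'
flushed_lsn = 0     # highest LSN already written to stable storage

def force_log_up_to(lsn):
    """Write all log records with LSN <= lsn to stable storage."""
    global flushed_lsn, log_tail
    # ... write the qualifying records to stable storage here ...
    log_tail = [r for r in log_tail if r["lsn"] > lsn]
    flushed_lsn = max(flushed_lsn, lsn)

def write_page_to_disk(page):
    # WAL: the log must reach stable storage before the page does.
    if page["pageLSN"] > flushed_lsn:
        force_log_up_to(page["pageLSN"])
    # ... now the data page itself may be written to disk ...

log_tail.append({"lsn": 7})
write_page_to_disk({"pageLSN": 7})   # forces log record 7 to disk first
```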
Checkpointing
 A checkpoint is like a snapshot of the DBMS state, and by taking checkpoints periodically,
the DBMS can reduce the amount of work to be done during restart in the event of a
subsequent crash.
 Checkpointing in ARIES has three steps.
o First, a begin_checkpoint record is written to indicate when the checkpoint starts.
o Second, an end_checkpoint record is constructed, including in it the current
contents of the transaction table and the dirty page table, and appended to the log.
o The third step is carried out after the end_checkpoint record is written to stable
storage: A special master record containing the LSN of the begin_checkpoint log
record is written to a known place on stable storage.
 While the end_checkpoint record is being constructed, the DBMS continues executing
transactions and writing other log records.
 The only guarantee we have is that the transaction table and dirty page table are accurate as
of the time of the begin_checkpoint record.
 This kind of checkpoint is called a fuzzy checkpoint and is inexpensive because it does not
require quiescing the system or writing out pages in the buffer pool.
 On the other hand, the effectiveness of this checkpointing technique is limited by the
earliest recLSN of pages in the dirty page table, because during restart we must redo
changes starting from the log record whose LSN is equal to this recLSN.
 When the system comes back up after a crash, the restart process begins by locating the
most recent checkpoint record.
RECOVERING FROM A SYSTEM CRASH
 When the system is restarted after a crash, the recovery manager proceeds in three phases:
 The Analysis phase begins by examining the most recent begin_checkpoint record and
proceeds forward in the log until the last log record.
 The Redo phase follows Analysis and redoes all changes to any page that might have been
dirty at the time of the crash; this set of pages and the starting point for Redo (the smallest
recLSN of any dirty page) are determined during Analysis.
 The Undo phase follows Redo and undoes the changes of all transactions that were active at
the time of the crash; again, this set of transactions is identified during the Analysis phase.
 Redo reapplies changes in the order in which they were originally carried out; Undo
reverses changes in the opposite order, reversing the most recent change first.
Analysis Phase
 The Analysis phase performs three tasks:
1. It determines the point in the log at which to start the Redo pass.
2. It determines (a conservative superset of the) pages in the buffer pool that were
dirty at the time of the crash.
3. It identifies transactions that were active at the time of the crash and must be
undone.
 Analysis begins by examining the most recent begin_checkpoint log record and initializing
the dirty page table and transaction table to the copies of those structures in the next
end_checkpoint record.
 Thus, these tables are initialized to the set of dirty pages and active transactions at the time
of the checkpoint.
 Analysis then scans the log in the forward direction until it reaches the end of the log:
 If an end log record for a transaction T is encountered, T is removed from the transaction
table because it is no longer active.
 If a log record other than an end record for a transaction T is encountered, an entry for T is
added to the transaction table if it is not already there. Further, the entry for T is modified:
1. The lastLSN field is set to the LSN of this log record.
2. If the log record is a commit record, the status is set to C, otherwise it is set to U
(indicating that it is to be undone).
 If a redoable log record affecting page P is encountered, and P is not in the dirty page table,
an entry is inserted into this table with page id P and recLSN equal to the LSN of this
redoable log record. This LSN identifies the oldest change affecting page P that may not have
been written to disk.
 At the end of the Analysis phase, the transaction table contains an accurate list of all
transactions that were active at the time of the crash.
 This is the set of transactions with status U. The dirty page table includes all pages that were
dirty at the time of the crash, but may also contain some pages that were written to disk.
 If an end_write log record were written at the completion of each write operation, the dirty
page table constructed during Analysis could be made more accurate, but in ARIES, the
additional cost of writing end_write log records is not considered to be worth the gain.
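A compact Python sketch of the Analysis scan (log records are simplified to dictionaries with 'lsn', 'type', 'trans', and, for redoable records, 'page'; the two tables start from the checkpoint copies):

```python
def analysis(log, xact_table, dirty_pages):
    """Forward scan from the checkpoint; returns the updated tables."""
    for rec in log:
        t = rec["trans"]
        if rec["type"] == "end":
            xact_table.pop(t, None)        # T is no longer active
            continue
        entry = xact_table.setdefault(t, {"lastLSN": None, "status": "U"})
        entry["lastLSN"] = rec["lsn"]
        if rec["type"] == "commit":
            entry["status"] = "C"
        if rec["type"] in ("update", "CLR"):
            # The first record that dirtied this page supplies its recLSN.
            dirty_pages.setdefault(rec["page"], rec["lsn"])
    return xact_table, dirty_pages
```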
Redo Phase
 During the Redo phase, ARIES reapplies the updates of all transactions, committed or
otherwise. Further, if a transaction was aborted before the crash and its updates were
undone, as indicated by CLRs, the actions described in the CLRs are also reapplied.
 This repeating history paradigm distinguishes ARIES from other proposed WAL based
recovery algorithms and causes the database to be brought to the same state that it was in
at the time of the crash.
 The Redo phase begins with the log record that has the smallest recLSN of all pages in the
dirty page table constructed by the Analysis pass because this log record identifies the
oldest update that may not have been written to disk prior to the crash.
 Starting from this log record, Redo scans forward until the end of the log.
 For each redoable log record (update or CLR) encountered, Redo checks whether the logged
action must be redone. The action must be redone unless one of the following conditions
holds:
1. The affected page is not in the dirty page table.
2. The affected page is in the dirty page table, but the recLSN for the entry is greater
than the LSN of the log record being checked.
3. The pageLSN (stored on the page, which must be retrieved to check this
condition) is greater than or equal to the LSN of the log record being checked.
 The first condition obviously means that all changes to this page have been written to disk
because the recLSN is the first update to this page that may not have been written to disk.
 The second condition means that the update being checked was indeed propagated to
disk.
 The third condition, which is checked last because it requires us to retrieve the page, also
ensures that the update being checked was written to disk, because either this update or a
later update to the page was written.
If the logged action must be redone:
1) The logged action is reapplied.
2) The pageLSN on the page is set to the LSN of the redone log record. No additional log record is
written at this time.
 Processing the log records in this way during the Redo phase brings the system back to the
exact state it was in at the time of the crash.
 At the end of the Redo phase, end type records are written for all transactions with status C,
which are then removed from the transaction table.
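The redo test translates directly into code. The following Python sketch (simplified records; fetch_page is a hypothetical callback) checks the three conditions in the order given, retrieving the page only for the last one.

```python
def must_redo(rec, dirty_pages, fetch_page):
    page_id = rec["page"]
    if page_id not in dirty_pages:           # condition 1
        return False
    if dirty_pages[page_id] > rec["lsn"]:    # condition 2: recLSN > LSN
        return False
    page = fetch_page(page_id)               # condition 3 needs the page
    return page["pageLSN"] < rec["lsn"]

def redo(rec, dirty_pages, fetch_page):
    if must_redo(rec, dirty_pages, fetch_page):
        page = fetch_page(rec["page"])
        # ... reapply the logged change to the page here ...
        page["pageLSN"] = rec["lsn"]   # no new log record is written
```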
Undo Phase
 The Undo phase, unlike the other two phases, scans backward from the end of the log.
 The goal of this phase is to undo the actions of all transactions that were active at the time
of the crash, that is, to effectively abort them.
 This set of transactions is identified in the transaction table constructed by the Analysis
phase.
The Undo Algorithm
 Undo begins with the transaction table constructed by the Analysis phase, which identifies
all transactions that were active at the time of the crash, and includes the LSN of the most
recent log record (the lastLSN field) for each such transaction.
 Such transactions are called loser transactions. All actions of losers must be undone, and
further, these actions must be undone in the reverse of the order in which they appear in
the log.
 Consider the set of lastLSN values for all loser transactions. Let us call this set ToUndo.
 Undo repeatedly chooses the largest (i.e., most recent) LSN value in this set and processes it,
until ToUndo is empty.
 To process a log record:
1. If it is a CLR, and the undoNextLSN value is not null, the undoNextLSN value is added to
the set ToUndo; if the undoNextLSN is null, an end record is written for the transaction
because it is completely undone, and the CLR is discarded.
2. If it is an update record, a CLR is written and the corresponding action is undone, and the
prevLSN value in the update log record is added to the set ToUndo.
 When the set ToUndo is empty, the Undo phase is complete. Restart is now complete, and
the system can proceed with normal operations.
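The ToUndo bookkeeping can be sketched in PL/SQL as follows. This is only an illustration, not ARIES itself: an associative array indexed by LSN serves as the ToUndo set, so that .LAST yields the largest (most recent) LSN; the sample LSNs 60 and 50 anticipate the example discussed below.
DECLARE
TYPE LsnSet IS TABLE OF BOOLEAN INDEX BY PLS_INTEGER;
to_undo LsnSet; -- the ToUndo set, keyed by LSN
lsn PLS_INTEGER;
BEGIN
to_undo(60) := TRUE; -- lastLSN of one loser transaction (sample value)
to_undo(50) := TRUE; -- lastLSN of another loser transaction (sample value)
WHILE to_undo.COUNT > 0 LOOP
lsn := to_undo.LAST; -- choose the largest, i.e. most recent, LSN
to_undo.DELETE(lsn);
dbms_output.put_line('Process log record with LSN ' || lsn);
-- for an update record: write a CLR, undo the action, and add its prevLSN here
-- for a CLR: add its undoNextLSN here, or write an end record if that is null
END LOOP;
END;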
Aborting a Transaction
 Aborting a transaction is just a special case of the Undo phase of Restart in which a single
transaction, rather than a set of transactions, is undone.
Crashes during Restart
It is important to understand how the Undo algorithm handles repeated system crashes.
 The log shows the order in which the DBMS executed various actions; the LSNs are in
ascending order, and that each log record for a transaction has a prevLSN field that points to
the previous log record for that transaction.
 The prevLSN field of the first log record for a transaction holds a null, a special value
indicating that there is no previous log record for that transaction.
 Log record (with LSN) 30 indicates that T1 aborts. All actions of this transaction should be
undone in reverse order, and the only action of T1, described by the update log record 10, is
indeed undone as indicated by CLR 40.
 After the first crash, Analysis identifies P1 (with recLSN 50), P3 (with recLSN 20), and P5
(with recLSN 10) as dirty pages.
 Log record 45 shows that T1 is a completed transaction; thus, the transaction table
identifies T2 (with lastLSN 60) and T3 (with lastLSN 50) as active at the time of the crash.
 The Redo phase begins with log record 10, which is the minimum recLSN in the dirty page
table, and reapplies all actions (for the update and CLR records), as per the Redo algorithm.
 The ToUndo set consists of LSNs 60, for T2, and 50, for T3. The Undo phase now begins by
processing the log record with LSN 60 because 60 is the largest LSN in the ToUndo set.
 The update is undone, and a CLR (with LSN 70) is written to the log.
 This CLR has undoNextLSN equal to 20, which is the prevLSN value in log record 60; 20 is
the next action to be undone for T2.
 Now the largest remaining LSN in the ToUndo set is 50. The write corresponding to log
record 50 is now undone, and a CLR describing the change is written.
 This CLR has LSN 80, and its undoNextLSN field is null because 50 is the only log record for
transaction T3.
 Thus T3 is completely undone, and an end record is written. Log records 70, 80, and 85 are
written to stable storage before the system crashes a second time; however, the changes
described by these records may not have been written to disk.
 When the system is restarted after the second crash, Analysis determines that the only
active transaction at the time of the crash was T2; in addition, the dirty page table is
identical to what it was during the previous restart.
 Log records 10 through 85 are processed again during Redo.
 The Undo phase considers the only LSN in the ToUndo set, 70, and processes it by adding
the undoNextLSN value (20) to the ToUndo set.
 Next, log record 20 is processed by undoing T2's write of page P3, and a CLR is written (LSN
90). Because 20 is the first of T2's log records, and therefore the last of its records to be
undone, the undoNextLSN field in this CLR is null; an end record is written for T2, and the
ToUndo set is now empty.
 Recovery is now complete, and normal execution can resume with the writing of a
checkpoint record.
 For completeness, consider what happens if the system crashes while Restart is in the
Analysis or Redo phases. If a crash occurs during the Analysis phase, all the work done in
this phase is lost, and on restart the Analysis phase starts afresh with the same information
as before.
 If a crash occurs during the Redo phase, the only effect that survives the crash is that some
of the changes made during Redo may have been written to disk prior to the crash.
 Restart starts again with the Analysis phase and then the Redo phase, and some update log
records that were redone the first time around will not be redone a second time because the
pageLSN will now be equal to the update record's LSN.
We can take checkpoints during Restart to minimize repeated work in the event of a crash.
UNIT 2
SEQUENCES
The quickest way to retrieve data from a table is to have a column in the table whose
data uniquely identifies a row.
By using this column and a specific value, in the WHERE condition of a SELECT
sentence the Oracle engine will be able to identify and retrieve the row the fastest.
To achieve this, a constraint is attached to a specific column in the table that ensures that the
column is never left empty and that the data values in the column are unique.
Since human beings do data entry, it is quite likely that a duplicate value will be entered,
which violates this constraint, and the entire row will be rejected.
If the value entered into this column is computer generated, it will always fulfill the
unique constraint and the row will always be accepted for storage.
Oracle provides an object called a Sequence that can generate numeric values. The
value generated can have a maximum of 38 digits. A sequence can be defined to:
 Generate numbers in ascending or descending order
 Provide intervals between numbers
 Cache sequence numbers in memory to speed up their availability
A sequence is an independent object and can be used with any table that requires its output.
Creating Sequences
The minimum information required for generating numbers using a sequence is:
 The starting number
 The maximum number that can be generated by a sequence
 The increment value for generating the next number.
This information is provided to Oracle at the time of sequence creation.
Syntax:
CREATE SEQUENCE < SequenceName >
[INCREMENT BY < IntegerValue >
START WITH < IntegerValue >
MAXVALUE < IntegerValue > / NOMAXVALUE
MINVALUE < integervalue > / NOMINVALUE
CYCLE / NOCYCLE
CACHE < IntegerValue > / NOCACHE
ORDER / NOORDER]
Note: - Sequence is always given a name so that it can be referenced later when required.
Keywords And Parameters
INCREMENT BY: Specifies the interval between sequence numbers. It can be any positive or
negative value but not zero. If this clause is omitted, the default value is 1.
MINVALUE: Specifies the sequence minimum value.
NOMINVALUE: Specifies a minimum value of 1 for an ascending sequence and -(10^26) for a
descending sequence.
MAXVALUE: Specifies the maximum value that a sequence can generate.
NOMAXVALUE: Specifies a maximum of 10^27 for an ascending sequence or -1 for a descending
sequence. This is the default clause.
START WITH: Specifies the first sequence number to be generated. The default for an ascending
sequence is the sequence minimum value (1) and for a descending sequence, it is the maximum
value (-1).
CYCLE: Specifies that the sequence continues to generate values after reaching either its
maximum or minimum value.
NOCYCLE: Specifies that a sequence cannot generate more values after reaching the maximum
value.
CACHE: Specifies how many values of a sequence Oracle pre-allocates and keeps in memory for
faster access. The minimum value for this parameter is two.
NOCACHE: Specifies that values of a sequence are not pre-allocated.
Note: - If the CACHE / NOCACHE clause is omitted, ORACLE caches 20 sequence numbers by default.
ORDER: This guarantees that sequence numbers are generated in the order of request. This is only
necessary if using Parallel Server in Parallel mode option. In exclusive mode option, a sequence
always generates numbers in order.
NOORDER: This does not guarantee sequence numbers are generated in order of request. This is
only necessary if you are using Parallel Server in Parallel mode option. If the ORDER/NOORDER
clause is omitted, a sequence takes the NOORDER clause by default.
Note
The ORDER/NOORDER clause has no significance if Oracle is configured with the Single Server option.
Example 20: - Create a sequence by the name ADDR_SEQ, which will
generate numbers from 1 up to 999 in ascending order with an
interval of 1. The sequence must restart from the number 1 after
generating the number 999.
CREATE SEQUENCE ADDR_SEQ INCREMENT BY 1 START WITH 1 MINVALUE 1 MAXVALUE 999
CYCLE;
Referencing A Sequence
 Once a sequence is created SQL can be used to view the values held in its cache.
 To simply view sequence value use a SELECT sentence as described below:
SELECT <SequenceName>.NextVal FROM DUAL;
 This will display the next value held in the cache on the VDU screen.
 Every time NextVal references a sequence, its output is automatically incremented
from the old value to the new value, ready for use.
The example below explains how to access a sequence and use its generated value in
the INSERT statement.
Example 21: - Insert values for ADDR_TYPE, ADDR1, ADDR2, CITY, STATE and
PINCODE in the ADDR_DTLS table. The ADDR_SEQ sequence must be used to
generate ADDR_NO and CODE_NO must be a value held in the BRANCH_NO column
of the BRANCH_MSTR table.
Table Name: ADDR_DTLS
Column Name Data Type Size Attributes
ADDR_NO Number 6
CODE_NO VarChar2 10 Foreign Key references BRANCH_NO of the BRANCH_MSTR table.
ADDR_TYPE VarChar2 1 Can hold the values: H for Head Office or B for Branch
ADDR1 VarChar2 50
ADDR2 VarChar2 50
CITY VarChar2 25
STATE VarChar2 25
PINCODE VarChar2 6
INSERT INTO ADDR_DTLS (ADDR_NO, CODE_NO, ADDR_TYPE, ADDR1, ADDR2, CITY, STATE,
PINCODE) VALUES (ADDR_SEQ.NextVal, 'B5', 'B', 'Vertex Plaza, Shop 4,', 'Western Express Highway,
Dahisar (East),', 'Mumbai', 'Maharashtra', '400078');
To reference the current value of a sequence:
SELECT <SequenceName>.CurrVal FROM DUAL;
Altering a Sequence
 A sequence once created can be altered.
 This is achieved by using the ALTER SEQUENCE statement.
Syntax:
ALTER SEQUENCE <SequenceName>
[INCREMENT BY <IntegerValue> MAXVALUE <IntegerValue> / NOMAXVALUE
MINVALUE <IntegerValue> / NOMINVALUE CYCLE / NOCYCLE
CACHE <IntegerValue> / NOCACHE ORDER / NOORDER]
Note
The START value of the sequence cannot be altered.
Example 23:
Change the Cache value of the sequence ADDR_SEQ to 30 and interval between two
numbers as 2.
ALTER SEQUENCE ADDR_SEQ INCREMENT BY 2 CACHE 30;
Dropping A Sequence
The DROP SEQUENCE command is used to remove the sequence from the database.
Syntax:
DROP SEQUENCE < SequenceName > ;
Example 24:
Destroy the sequence ADDR_SEQ.
DROP SEQUENCE ADDR_SEQ;
FUNDAMENTALS OF PL/SQL
WHAT IS SQL?
 SQL is nothing but a Structured Query Language.
 SQL is the natural language of the DBA.
 But it suffers from various disadvantages when used as a conventional programming
language.
DISADVANTAGES OF SQL
 SQL does not have any procedural capabilities, that is, it does not provide the programming
techniques of condition checking, looping and branching that are vital for data testing before
its permanent storage.
 SQL statements are passed to the Oracle Engine one at a time. Each time an SQL statement is
executed, a call is made to the engine's resources. This adds to the traffic on the network,
thereby decreasing the speed of data processing, especially in a multi-user
environment.
 While processing an SQL sentence if an error occurs, the Oracle engine displays its own
error messages. SQL has no facility of programmed handling of errors.
INTRODUCTION TO PL/SQL
 PL/SQL is a superset of SQL.
 PL/SQL is a block structured language that enables developers to combine the power of SQL
with procedural statements.
 PL/SQL bridges the gap between database technology and procedural programming
languages.
ADVANTAGES OF PL/SQL
 Support for SQL and Support for object-oriented programming
 PL/SQL is a development tool that not only supports SQL data manipulation but also
provides facilities for conditional checking, branching and looping.
 Block Structures
 PL/SQL consists of blocks of code, which can be nested within each other. PL/SQL sends an
entire block of statements to the Oracle engine in one go. Communication between the
program block and the Oracle engine reduces considerably, reducing network traffic.
 Since the Oracle engine receives the SQL statements as a single block, it processes this code
much faster than if it received the code one sentence at a time.
 Since an entire block of code is passed to the Oracle engine at one time for execution, all
changes made to the data in the table are done or undone, in one go.
 Error Handling
 PL/SQL also permits dealing with errors as required, and facilitates displaying user-friendly
messages when errors are encountered.
 Use of variables
 PL/SQL allows declaration and use of variables in block of code.
 These variables can be used to store intermediate results of a query for later processing, or
calculate values and insert them into an Oracle table later.
 PL/SQL variables can be used anywhere, either in SQL statements or in PL/SQL blocks.
 Performance and Efficiency
 Via PL/SQL, all sorts of calculations can be done quickly and efficiently without the use of
the Oracle engine.
 This considerably improves transaction performance.
 Portability
 Applications written in PL/SQL are portable to any computer hardware and operating
system where Oracle is operational.
 Hence, PL/SQL code blocks written for a DOS version of Oracle will run on its Linux/Unix
version, without any modifications at all.
THE GENERIC PL/SQL BLOCK
 PL/SQL permits the creation of structured logical blocks of code that describe processes,
which have to be applied to data. A single PL/SQL code block consists of a set of SQL
statements, clubbed together and passed to the Oracle engine entirely.
 A PL/SQL block has a definite structure, which can be divided into sections. The sections of
a PL/SQL block are:
 The Declare Section
 The Master Begin and End section that also (optionally) contains an Exception
section.
Fig. PL/SQL block structure
 Declare section
 Code blocks start with a declaration section, in which, memory variables and other
Oracle objects can be declared and if required initialized. Once declared, they can be
used in SQL statements for data manipulation.
 Begin section
 It consists of a set of SQL and PL/SQL statements, which describe processes that
have to be applied to table data. Actual data manipulation, retrieval, looping and
branching constructs are specified in this section.
 The Exception Section
 This section deals with handling of errors that arise during execution of the data
manipulation statements, which make up the PL/SQL code block. Errors can arise
due to syntax, logic and/or validation rule violation.
 The End Section
 This marks the end of a PL/SQL block.
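A minimal block showing the sections together (the variable name mMSG and the messages are illustrative only):
DECLARE
mMSG VARCHAR2(30) := 'Inside the Begin section';
BEGIN
dbms_output.put_line(mMSG);
EXCEPTION
WHEN OTHERS THEN
dbms_output.put_line('An error occurred');
END;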
PL/SQL IN THE ORACLE ENGINE
 The PL/SQL engine resides in the Oracle engine.
 The Oracle engine can process PL/SQL blocks as well as single SQL statements.
 The PL/SQL block is sent to the PL/SQL engine, where procedural statements are executed
and SQL statements are sent to the SQL executor in the Oracle engine.
 The call to the Oracle engine needs to be made only once to execute any number of statements.
 Since the Oracle engine is called only once for each block, the speed of SQL statement execution is
vastly enhanced, when compared to the Oracle engine being called once for each SQL
statement.
Fundamentals of PL/SQL
 The Character Set
 Uppercase alphabets(A-Z)
 Lowercase alphabets (a-z)
 Numerals (0-9)
 Symbols
( ) + - * / < > = ! ; : . ' @ % , " # $ ^ & _ { } ? [ ]
 Compound symbols used in PL/SQL block are
<> != ~= ^= <= >= := ** || << >>
 Literals
 A literal is a numeric value or a character string used to represent itself.
 Numeric Literal
 These can be either integers or floats. If a float is being represented, the integer part
must be separated from the float part by a period.
 Example
 25, 6.34, -5, 25e-03, .1
 Logical (Boolean) Literal
 These are predetermined constants. The values that can be assigned to this type are:
TRUE, FALSE, NULL
 String Literal
 These are represented by one or more legal characters and must be enclosed within
single quotes. The single quote character can be represented, by writing it twice in a
string literal.
 Example
 'Hello world'
 'Don''t go without saving your work'
 Character Literal
 These are string literals consisting of single characters.
 Example
 ‘*’
 ‘A’
PL/SQL Datatypes
 Predefined Datatypes
 Number Types
 Character Types
 National Character Types
 Boolean Types
 LOB Types
 Date and Interval Types
 Number Types
 Number types let you store numeric data (integers, real numbers, and floating-point
numbers)
 BINARY_INTEGER
 We use the BINARY_INTEGER datatype to store signed integers.
 BINARY_INTEGER values require less storage than NUMBER values.
 BINARY_INTEGER Subtypes
 NATURAL - Restricts an integer variable to non-negative values (0, 1, 2, ...)
 NATURALN - Like NATURAL, and also prevents the assigning of nulls to the variable
 POSITIVE - Restricts an integer variable to positive values (1, 2, 3, ...)
 POSITIVEN - Like POSITIVE, and also prevents the assigning of nulls to the variable
 SIGNTYPE - Restricts an integer variable to the values -1, 0, and 1.
 NUMBER
 We use the NUMBER datatype to store fixed-point or floating-point numbers.
 We can specify precision, which is the total number of digits, and scale, which is the
number of digits to the right of the decimal point. The syntax follows:
 NUMBER[(precision,scale)]
 To declare fixed-point numbers, for which you must specify scale, use the following
form:
 NUMBER(precision,scale)
 NUMBER Subtypes
 DEC
 DECIMAL
 NUMERIC
 INTEGER
 INT
 SMALLINT
 DOUBLE PRECISION
 FLOAT
 REAL
Use the subtypes DEC, DECIMAL, and NUMERIC to declare fixed-point numbers with a
maximum precision of 38 decimal digits.
Use the subtypes DOUBLE PRECISION and FLOAT to declare floating-point numbers with a
maximum precision of 126 binary digits, which is roughly equivalent to 38 decimal
digits.
Use the subtype REAL to declare floating-point numbers with a maximum precision of 63 binary
digits, which is roughly equivalent to 18 decimal digits.
Use the subtypes INTEGER, INT, and SMALLINT to declare integers with a maximum
precision of 38 decimal digits.
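As an illustration, the declarations below (all names and values are hypothetical) combine BINARY_INTEGER, NUMBER with precision and scale, and two of the subtypes listed above:
DECLARE
counter BINARY_INTEGER := 0; -- signed integer
salary NUMBER(8,2); -- fixed-point: 8 digits in all, 2 to the right of the decimal point
emp_count INTEGER; -- NUMBER subtype for whole numbers
ratio FLOAT; -- floating-point subtype
BEGIN
counter := counter + 1;
salary := 125000.50;
emp_count := 25;
ratio := salary / emp_count;
dbms_output.put_line('Ratio = ' || ratio);
END;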
 PLS_INTEGER
 You use the PLS_INTEGER datatype to store signed integers
 PLS_INTEGER values require less storage than NUMBER values. Also, PLS_INTEGER
operations use machine arithmetic, so they are faster than NUMBER and
BINARY_INTEGER operations.
 Character Types
 Character types let you store alphanumeric data, represent words and text, and manipulate
character strings.
 CHAR
 We use the CHAR datatype to store fixed-length character data. How the data is
represented internally depends on the database character set.
 Maximum size up to 32767 bytes.
 We can specify the size in terms of bytes or characters, where each character contains
one or more bytes, depending on the character set encoding.
 CHAR[(maximum_size [CHAR | BYTE] )]
 If you do not specify a maximum size, it defaults to 1.
 VARCHAR2
 You use the VARCHAR2 datatype to store variable-length character data.
 VARCHAR2(maximum_size [CHAR | BYTE])
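For example (names illustrative), a fixed-length and a variable-length declaration look like this:
DECLARE
acct_type CHAR(2); -- always stored blank-padded to 2 characters
cust_name VARCHAR2(30); -- stores only the characters assigned, up to 30
BEGIN
acct_type := 'SB';
cust_name := 'Ivan';
dbms_output.put_line(acct_type || ' ' || cust_name);
END;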
 National Character Types
 The widely used one-byte ASCII and EBCDIC character sets are adequate to represent the
Roman alphabet, but some Asian languages, such as Japanese, contain thousands of
characters.
 These languages require two or three bytes to represent each character.
 NCHAR
 You use the NCHAR datatype to store fixed-length (blank-padded if necessary) national
character data
 The NCHAR datatype takes an optional parameter that lets you specify a maximum size
in characters. The syntax follows:
 NCHAR[(maximum_size)]
 NVARCHAR2
 You use the NVARCHAR2 datatype to store variable-length Unicode character data.
 The NVARCHAR2 datatype takes a required parameter that specifies a maximum size in
characters. The syntax follows:
 NVARCHAR2(maximum_size)
 Boolean Type
 BOOLEAN
 You use the BOOLEAN datatype to store the logical values TRUE, FALSE, and NULL.
 Only logic operations are allowed on BOOLEAN variables.
 The BOOLEAN datatype takes no parameters.
 LOB Types
 The LOB (large object) datatypes BFILE, BLOB, CLOB, and NCLOB let you store blocks of
unstructured data (such as text, graphic images, video clips, and sound waveforms) up to
four gigabytes in size.
 PL/SQL operates on LOBs through locators.
 BFILE
 You use the BFILE datatype to store large binary objects in operating system files
outside the database. Every BFILE variable stores a file locator, which points to a large
binary file on the server.
 BFILEs are read-only, so you cannot modify them. The size of a BFILE is system
dependent but cannot exceed four gigabytes
 BLOB
 You use the BLOB datatype to store large binary objects in the database.
 Every BLOB variable stores a locator, which points to a large binary object.
 The size of a BLOB cannot exceed four gigabytes.
 CLOB
 You use the CLOB datatype to store large blocks of character data in the database.
 The size of a CLOB cannot exceed four gigabytes.
 NCLOB
 You use the NCLOB datatype to store large blocks of NCHAR data in the database.
 The size of a NCLOB cannot exceed four gigabytes.
 Date and Interval Types
 The datatypes in this section let you store and manipulate dates, times, and intervals
(periods of time).
 A variable that has a date/time datatype holds values called datetimes; a variable that has an
interval datatype holds values called intervals.
 A datetime or interval consists of fields, which determine its value.
 DATE
 You use the DATE datatype to store fixed-length datetimes, which include the time of
day in seconds since midnight.
 The date portion defaults to the first day of the current month; the time portion defaults
to midnight.
 The date function SYSDATE returns the current date and time.
 You can add and subtract dates.
 Example, the following statement returns the number of days since an employee was
hired:
SELECT SYSDATE - hiredate INTO days_worked FROM emp
WHERE empno = 7499;
 INTERVAL YEAR TO MONTH
 You use the datatype INTERVAL YEAR TO MONTH to store and manipulate intervals of
years and months.
 The syntax is:
INTERVAL YEAR[(precision)] TO MONTH
where precision specifies the number of digits in the years field. You cannot
use a symbolic constant or variable to specify the precision; you must use an integer
literal in the range 0 .. 4. The default is 2.
 INTERVAL DAY TO SECOND
 You use the datatype INTERVAL DAY TO SECOND to store and manipulate intervals
of days, hours, minutes, and seconds.
 The syntax is:
INTERVAL DAY[(leading_precision)] TO
SECOND[(fractional_seconds_precision)]
where leading_precision and fractional_seconds_precision specify the number of digits
in the days field and seconds field, respectively. In both cases, you cannot use a
symbolic constant or variable to specify the precision; you must use an integer
literal in the range 0 .. 9. The defaults are 2 and 6, respectively.
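A small illustrative block declaring and assigning both interval types (the names and values are assumptions, not from the text):
DECLARE
service_period INTERVAL YEAR(3) TO MONTH;
call_duration INTERVAL DAY(2) TO SECOND(4);
BEGIN
service_period := INTERVAL '2-6' YEAR TO MONTH; -- 2 years, 6 months
call_duration := INTERVAL '0 00:05:30.25' DAY TO SECOND; -- 5 minutes, 30.25 seconds
dbms_output.put_line(service_period || ' / ' || call_duration);
END;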
 VARIABLES
 Variables may be used to store the result of a query or calculations. Variables must be
declared before being used.
 Variable Name
 A variable name must begin with a character.
 A variable name can be a maximum of 30 characters long.
 Reserved words cannot be used as variable names unless enclosed within double
quotes.
 Variables must be separated from each other by at least one space or by a punctuation
mark.
 The case (upper/lower) is insignificant when declaring variable names.
 A space cannot be used in a variable name.
 Declaring variables
 We can declare a variable of any data type either native to the ORACLE or native to
PL/SQL.
 Variables are declared in the DECLARE section of the PL/SQL block.
 Declaration involves the name of the variable followed by its data type, followed by a
semicolon (;).
 To assign a value to the variable the assignment operator (:=) is used.
 Syntax:
 <Variable name> <type> [ :=<value> ];
 Example:
 Ename CHAR(10);
 Assigning value to a variable
 There are two ways to assign a value to a variable.
 Using the assignment operator ( := )
Ex: sal := 1000.00;
Total_sal := sal – tax;
 Selecting or fetching table data values in to variables.
Ex: SELECT sal INTO pay
FROM Employee WHERE emp_id = ‘E001’;
 CONSTANT
 A variable can be modified, a constant cannot.
 Declaring Constant
 Declaring a constant is similar to declaring a variable, except that you have to add
the keyword CONSTANT and immediately assign a value to it.
 Syntax:
 <variable_name> CONSTANT <datatype> := <value>;
 Example:
 Pi CONSTANT NUMBER(3,2) := 3.14;
 USE OF %TYPE
 While creating a table, the user attaches certain attributes like data type and constraints
to its columns.
 These attributes can be passed on to the variables being created in PL/SQL using %TYPE
attribute.
 This simplifies the declaration of variables and constants.
 The %TYPE attribute is used in the declaration of a variable when the variable’s attributes
must be picked from a table field.
 Advantages of using %TYPE
 You do not need to know the data type of the table column
 If you change the parameters of the table column, the variable’s parameters will
change as well.
 Syntax: <variable_name> Tablename.column_name %TYPE;
 Example:
mSal Employee.Sal %TYPE;
Here, mSal is the variable. It gets the datatype and constraints of the column Sal
belonging to the table Employee.
 USE OF % ROWTYPE
 In case variables for the entire row of a table need to be declared, then instead of declaring
them individually, %ROWTYPE is used.
 In this case, the variable is a composite variable, consisting of the column names of the
table as its members.
 Syntax:
 <variable_name> Tablename %ROWTYPE;
 Example:
 mEmployee_Row Employee %ROWTYPE;
 Here, variable mEmployee_Row is a composite variable, consisting of the column
names of the table as its members. To refer to a specific field such as sal, we can use
 mEmployee_Row.sal
 IDENTIFIER
 The name of any ORACLE object (variable, memory variable, constant, record, cursor etc) is
known as an Identifier.
 Working with Identifier:
 An identifier cannot be declared twice in the same block
 The same identifier can be declared in two different blocks. In this case, the two identifiers
are unique and any change in one does not affect the other.
LIKE OPERATOR
LIKE Operator is a Pattern matching operator.
It is used to compare a character string against a pattern.
Wild card characters:
o Percentage sign (%) - It matches any number of characters in a string.
o Underscore ( _ ) - It matches exactly one character.
Example:
 SELECT EName FROM Employee WHERE EName LIKE ‘P%’;
 It displays EName field of Employee table where ENames starts with P.
IN OPERATOR
It checks to see if a value lies within a specified list of values.
IN operator returns a BOOLEAN result, either TRUE or FALSE.
Syntax:
the_value [NOT] IN (value1, value2, value3, ...)
Example:
3 IN (4, 8, 7, 5, 3, 2) Returns TRUE
BETWEEN
It checks to see if a value lies within a specified range of values.
The low_end and high_end of the range are inclusive.
Syntax:
the_value [NOT] BETWEEN low_end AND high_end
Example:
5 BETWEEN -5 AND 10 Returns TRUE
IS NULL
It checks to see if a value is NULL.
Syntax:
the_value IS [NOT] NULL
Example:
IF balance IS NULL THEN
Sequence_of_statements;
END IF;
LOGICAL OPERATORS
PL/SQL implements 3 logical operations AND, OR and NOT.
A B A AND B A OR B NOT A
TRUE TRUE TRUE TRUE FALSE
TRUE FALSE FALSE TRUE FALSE
TRUE NULL NULL TRUE FALSE
FALSE TRUE FALSE TRUE TRUE
FALSE FALSE FALSE FALSE TRUE
FALSE NULL FALSE NULL TRUE
NULL TRUE NULL TRUE NULL
NULL FALSE FALSE NULL NULL
NULL NULL NULL NULL NULL
STRING OPERATORS
PL/SQL has two operators specially designed to operate only on character string type data.
They are
 LIKE
 LIKE Operator is a Pattern matching operator.
 It is used to compare a character string against a pattern.
 Wild card characters:
 Percentage sign (%) - It matches any number of characters in a string.
 Underscore ( _ ) - It matches exactly one character.
 Example:
SELECT EName FROM Employee WHERE EName LIKE ‘P%’;
It displays EName field of Employee table where ENames starts with P.
 Concatenation ( || )
The concatenation operator returns a resultant string consisting of all the characters in
string_1 followed by all the characters in string_2.
Syntax:
String_1 || string_2;
If A := 'XX', B := 'YY' and C is declared as VARCHAR2(50), then
C := A || ' ' || B; returns the value 'XX YY' in variable C.
DISPLAYING USER MESSAGES ON THE SCREEN
DBMS_OUTPUT is a package that includes number of procedures and functions that
accumulate information in a buffer so that it can be retrieved later.
PUT_LINE puts a piece of information in the package buffer followed by an end-of-line
marker.
It can also be used to display message.
PUT_LINE expects a single parameter of character data type.
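For instance, assuming server output has been enabled in SQL*Plus with SET SERVEROUTPUT ON, the block below writes one line to the buffer and displays it:
BEGIN
dbms_output.put_line('This message comes from the package buffer');
END;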
CONDITIONAL CONTROL IN PL/SQL
PL/SQL allows the use of an IF statement to control the execution of a block of code.
The different forms are:
 IF condition THEN
Statements;
END IF;
 IF condition THEN
Statements;
ELSE
Statements;
END IF;
 IF condition THEN
Statements;
ELSIF condition THEN
Statements;
ELSE
Statements;
END IF;
Example: Write a PL/SQL block of code that checks whether the current balance of an account is
less than the minimum balance; if so, a fine of Rs.100 is deducted from the current balance.
DECLARE
mACCT_NO VARCHAR2(10);
mCUR_BAL NUMBER(11,2);
mFINE number(4) := 100;
mMIN_BAL constant number(7,2) := 5000.00;
BEGIN
/* Accept the Account number from the user*/
mACCT_NO := &mACCT_NO;
/* Retrieving the current balance from the ACCT_MSTR table where the ACCT_NO in the
table is equal to the mACCT_NO entered by the user.*/
SELECT CURBAL INTO mCUR_BAL FROM ACCT_MSTR WHERE ACCT_NO= mACCT_NO;
/* Checking if the resultant balance is less than the minimum balance of Rs.5000. If the
condition is satisfied an amount of Rs.100 is deducted as a fine from the current balance of
the corresponding ACCT_NO.*/
IF mCUR_BAL < mMIN_BAL THEN
UPDATE ACCT_MSTR SET CURBAL = CURBAL – mFINE
WHERE ACCT_NO = mACCT_NO;
END IF;
END;
EXIT and EXIT WHEN statement:
EXIT and EXIT WHEN statements enable you to escape out of the control of a loop.
The format of the EXIT statement is as follows :
Syntax: EXIT;
The EXIT WHEN statement has the following syntax:
Syntax: EXIT WHEN <condition>;
 The EXIT WHEN statement enables you to specify the condition required to exit the
execution of the loop.
 In this case, a separate IF statement is not required.
Ex-1: IF count > = 10 THEN
EXIT;
END IF;
Ex-2: EXIT WHEN count > = 10;
Iterative Control
Iterative control indicates the ability to repeat or skip sections of a code block.
A loop marks a sequence of statements that has to be repeated.
The keyword loop has to be placed before the first statement in the sequence of statements
to be repeated, while keyword end loop is placed immediately after the last statement in
the sequence.
Once a LOOP begins to run, it will go on forever. Hence, loops are always accompanied by
conditional statements that keep control on the number of times it is executed.
Simple Loop
In simple loop, the keyword loop should be placed before the first statement in the
sequence and the keyword end loop should be written at the end of the sequence to end the
loop.
The format is as follows.
LOOP
Statements;
END LOOP;
Example: Create a simple loop such that a message is displayed when a loop exceeds a
particular value.
DECLARE
i number := 0;
BEGIN
LOOP
i := i + 2;
EXIT WHEN i > 10;
END LOOP;
dbms_output.put_line('Loop exited as the value of i has reached ' || to_char(i));
END;
While loop
The WHILE loop enables you to evaluate a condition before a sequence of
statements is executed.
If the condition is TRUE, the sequence of statements is executed.
The syntax for the WHILE loop is as follows:
Syntax:
WHILE < Condition is TRUE >
LOOP
< Statements >
END LOOP;
Example: - Write a PL/SQL code block to calculate the area of a circle for a value of radius varying
from 3 to 7. Store the radius and the corresponding values of calculated area in an empty table
named AREAS, consisting of two columns Radius and Area.
CREATE TABLE AREAS(RADIUS NUMBER(5), AREA NUMBER(14,2));
DECLARE
/* Declaration of memory variables and constants to be used in the Execution section.*/
pi constant number(4,2) := 3.14 ;
radius number(5);
area number(14,2);
BEGIN
/* Initialize the radius to 3, since calculations are required for radius 3 to 7 */
radius := 3;
/* Set a loop so that it fires till the radius value reaches 7 */
WHILE RADIUS <= 7
LOOP
/* Area calculation for a circle */
area := pi * power(radius,2);
/* Insert the value for the radius and its corresponding area calculated in the table */
INSERT INTO areas VALUES (radius, area);
/* Increment the value of the variable radius by 1 */
radius := radius + 1;
END LOOP;
END;
FOR LOOP
The FOR LOOP enables you to execute a loop for predetermined number of times.
The variable in the FOR loop need not be declared.
The increment value cannot be specified.
The for loop variable is always incremented by 1.
Reverse is an optional keyword. If we specify the keyword REVERSE, the loop variable
starts at the last value and is decremented down to the start value.
The syntax for FOR LOOP is as follows:
FOR var IN [REVERSE] start..end
LOOP
Statements;
END LOOP;
Example: Write a PL/SQL block of code for inverting a number 5639 to 9365.
DECLARE
/* Declaration of memory variables and constants to be used in the Execution section.*/
given_number varchar(5) := '5639';
str_length number(2);
inverted_number varchar(5);
BEGIN
/* Store the length of the given number */
str_length := length(given_number);
/* Initialize the loop such that it repeats for the number of times equal to the length of the given
number. Also, since the number is required to be inverted, the loop should consider the last number
first and store it i.e. in reverse order */
FOR cntr IN REVERSE 1..str_length
/* Variables used as counter in the for loop need not be declared i.e. cntr declaration is not required
*/
LOOP
/* The last digit of the number is obtained using the substr function, and stored in a
variable, while retaining the previous digit stored in the variable*/
inverted_number := inverted_number || substr(given_number, cntr, 1);
END LOOP;
/* Display the initial number, as well as the inverted number, which is stored in the variable on
screen */
dbms_output.put_line ('The Given number is ' || given_number );
dbms_output.put_line ('The Inverted number is ' || inverted_number );
END;
GOTO Statement:
The GOTO statement allows you to change the flow of control within a PL/SQL block.
This statement allows execution of a section of code, which is not in the normal flow of control.
The entry point into such a block of code is marked using tags <<user defined name>>.
The syntax is as follows
Syntax: GOTO <label name> ;
The label is surrounded by double angle brackets (<< >>) and must not be followed by a
semicolon, because a label is not a PL/SQL statement.
Example: Write a PL/SQL block of code to achieve the following. If there are no transactions
taken place in the last 365 days then mark the account status as inactive, and then record the
account number, the opening date and the type of account in the INACTV_ACCT_MSTR table.
CREATE TABLE INACTV_ACCT_MSTR (
ACCT_NO VARCHAR2(10), OPNDT DATE, TYPE VARCHAR2(2));
DECLARE
/* Declaration of memory variables and constants to be used in the Execution section.*/
mACCT_NO VARCHAR2(10);
mANS VARCHAR2(3);
mOPNDT DATE;
mTYPE VARCHAR2(2);
BEGIN
/* Accept the Account number from the user*/
mACCT_NO := &mACCT_NO;
/* Fetch the account number into a variable */
SELECT 'YES' INTO mANS FROM TRANS_MSTR WHERE ACCT_NO = mACCT_NO HAVING
MIN(SYSDATE - DT) >365;
/* If there are no transactions taken place in last 365 days the execution control is transferred to a
user labelled section of code, labelled as mark_status in this example. */
IF mANS = 'YES' THEN
GOTO mark_status;
ELSE
dbms_output.put_line('Account number: ' || mACCT_NO || ' is active');
END IF;
/* A labelled section of code which updates the STATUS of account number held in the ACCT_MSTR
table. Further the ACCT_NO, OPNDT and the TYPE are inserted in to the table INACTV_ACCT_MSTR.
*/
<<mark_status>>
UPDATE ACCT_MSTR SET STATUS = 'I' WHERE ACCT_NO = mACCT_NO;
SELECT OPNDT, TYPE INTO mOPNDT, mTYPE
FROM ACCT_MSTR WHERE ACCT_NO = mACCT_NO;
INSERT INTO INACTV_ACCT_MSTR (ACCT_NO, OPNDT, TYPE)
VALUES (mACCT_NO, mOPNDT, mTYPE);
dbms_output.put_line('Account number: ' || mACCT_NO || ' is marked as inactive');
END;
NULL Statements
The NULL statement does nothing other than pass control to the next statement.
In a conditional construct, the NULL statement tells readers that a possibility has been
considered, but no action is necessary.
In IF statements or other places that require at least one executable statement, the NULL
statement can be used to satisfy the syntax.
In the following example, the NULL statement emphasizes that only top-rated employees get
bonuses:
IF rating > 90 THEN
compute_bonus(emp_id);
ELSE
NULL;
END IF;
CASE Expression
A CASE expression selects a result from one or more alternatives, and returns the
result.
The CASE expression uses a selector, an expression whose value determines which
alternative to return.
A CASE expression has the following form:
CASE selector
WHEN expression1 THEN result1
WHEN expression2 THEN result2
...
WHEN expressionN THEN resultN
[ELSE resultN+1]
END;
The selector is followed by one or more WHEN clauses, which are checked
sequentially.
The value of the selector determines which clause is executed.
The first WHEN clause that matches the value of the selector determines the result
value, and subsequent WHEN clauses are not evaluated.
An example follows:
DECLARE
grade CHAR(1) := 'B';
appraisal VARCHAR2(20);
BEGIN
appraisal :=
CASE grade
WHEN 'A' THEN 'Excellent'
WHEN 'B' THEN 'Very Good'
WHEN 'C' THEN 'Good'
WHEN 'D' THEN 'Fair'
WHEN 'F' THEN 'Poor'
ELSE 'No such grade'
END;
END;
The optional ELSE clause works similarly to the ELSE clause in an IF statement.
If the value of the selector is not one of the choices covered by a WHEN clause, the
ELSE clause is executed.
If no ELSE clause is provided and none of the WHEN clauses are matched, the
expression returns NULL.
Searched CASE Expression
PL/SQL also provides a searched CASE expression, which has the form:
CASE
WHEN search_condition1 THEN result1
WHEN search_condition2 THEN result2
...
WHEN search_conditionN THEN resultN
[ELSE resultN+1]
END;
A searched CASE expression has no selector.
Each WHEN clause contains a search condition that yields a Boolean value, which lets
you test different variables or multiple conditions in a single WHEN clause.
An example follows:
DECLARE
grade CHAR(1);
appraisal VARCHAR2(20);
BEGIN
...
appraisal :=
CASE
WHEN grade = 'A' THEN 'Excellent'
WHEN grade = 'B' THEN 'Very Good'
WHEN grade = 'C' THEN 'Good'
WHEN grade = 'D' THEN 'Fair'
WHEN grade = 'F' THEN 'Poor'
ELSE 'No such grade'
END;
...
END;
The search conditions are evaluated sequentially.
The Boolean value of each search condition determines which WHEN clause is
executed.
If a search condition yields TRUE, its WHEN clause is executed.
After any WHEN clause is executed, subsequent search conditions are not evaluated.
If none of the search conditions yields TRUE, the optional ELSE clause is executed.
If no WHEN clause is executed and no ELSE clause is supplied, the value of the
expression is NULL.
CASE Statement
 Like the IF statement, the CASE statement selects one sequence of statements to execute.
 However, to select the sequence, the CASE statement uses a selector rather than multiple
Boolean expressions.
 The CASE statement is more readable and more efficient. So, when possible, rewrite
lengthy IF-THEN-ELSIF statements as CASE statements.
 The CASE statement begins with the keyword CASE.
 The keyword is followed by a selector, which is the variable grade in the example.
 The selector expression can be arbitrarily complex. For example, it can contain function
calls.
 Usually, however, it consists of a single variable. The selector expression is evaluated
only once.
 The selector is followed by one or more WHEN clauses, which are checked sequentially.
 The value of the selector determines which clause is executed. If the value of the
selector equals the value of a WHEN-clause expression, that WHEN clause isexecuted.
 The ELSE clause works similarly to the ELSE clause in an IF statement. In the example, if
the grade is not one of the choices covered by a WHEN clause, the ELSE clause is selected,
and the phrase 'No such grade' is output.
 The ELSE clause is optional. However, if you omit the ELSE clause, PL/SQL adds the
following implicit ELSE clause:
ELSE RAISE CASE_NOT_FOUND;
Consider the following code that outputs descriptions of school grades:
Example
CASE grade
WHEN 'A' THEN dbms_output.put_line('Excellent');
WHEN 'B' THEN dbms_output.put_line('Very Good');
WHEN 'C' THEN dbms_output.put_line('Good');
WHEN 'D' THEN dbms_output.put_line('Fair');
WHEN 'F' THEN dbms_output.put_line('Poor');
ELSE dbms_output.put_line('No such grade');
END CASE;
Searched Case Statement:
PL/SQL also provides a searched CASE statement, which has the form:
CASE
WHEN search_condition1 THEN sequence_of_statements1;
WHEN search_condition2 THEN sequence_of_statements2;
...
WHEN search_conditionN THEN sequence_of_statementsN;
[ELSE sequence_of_statementsN+1;]
END CASE;
The searched CASE statement has no selector.
Its WHEN clauses contain search conditions that yield a Boolean value, not
expressions that can yield a value of any type.
The search conditions are evaluated sequentially.
The Boolean value of each search condition determines which WHEN clause is
executed.
If a search condition yields TRUE, its WHEN clause is executed. If any WHEN clause
is executed, control passes to the next statement, so subsequent search conditions
are not evaluated.
If none of the search conditions yields TRUE, the ELSE clause is executed. The ELSE
clause is optional. However, if you omit the ELSE clause, PL/SQL adds the following
implicit ELSE clause:
ELSE RAISE CASE_NOT_FOUND;
An example follows:
CASE
WHEN grade = 'A' THEN dbms_output.put_line('Excellent');
WHEN grade = 'B' THEN dbms_output.put_line('Very Good');
WHEN grade = 'C' THEN dbms_output.put_line('Good');
WHEN grade = 'D' THEN dbms_output.put_line('Fair');
WHEN grade = 'F' THEN dbms_output.put_line('Poor');
ELSE dbms_output.put_line('No such grade');
END CASE;
Handling Null Values in Comparisons and Conditional Statements
When working with nulls, you can avoid some common mistakes by keeping in mind the following
rules:
Comparisons involving nulls always yield NULL
Applying the logical operator NOT to a null yields NULL
In conditional control statements, if the condition yields NULL, its associated sequence of
statements is not executed
If the expression in a simple CASE statement or CASE expression yields NULL, it cannot be
matched by using WHEN NULL. In this case, you would need to use the searched case syntax
and test WHEN expression IS NULL.
Example 1:
x := 5;
y := NULL;
...
IF x != y THEN -- yields NULL, not TRUE
sequence_of_statements; -- not executed
END IF;
Example 2:
a := NULL;
b := NULL;
...
IF a = b THEN -- yields NULL, not TRUE
sequence_of_statements; -- not executed
END IF;
Concept of Nested tables
Within the database, nested tables can be considered one-column database tables.
Oracle stores the rows of a nested table in no particular order. But, when you retrieve the
nested table into a PL/SQL variable, the rows are given consecutive subscripts starting at 1.
That gives you array-like access to individual rows.
PL/SQL nested tables are like one-dimensional arrays. You can model multi-dimensional
arrays by creating nested tables whose elements are also nested tables.
Nested tables are also singly dimensioned, unbounded collections of homogeneous
elements.
Nested tables are available in both PL/SQL and the database.
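A minimal illustration (the type and variable names are hypothetical): a nested table of numbers is declared, initialized with a constructor, and accessed with array-like subscripts starting at 1.
DECLARE
TYPE NumList IS TABLE OF NUMBER; -- a nested table type
nums NumList := NumList(10, 20, 30); -- the constructor assigns subscripts 1, 2, 3
BEGIN
dbms_output.put_line('First element: ' || nums(1));
nums.EXTEND; -- unbounded: grow the collection by one element
nums(4) := 40;
dbms_output.put_line('Count is now ' || nums.COUNT);
END;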
ORACLE TRANSACTIONS
PL / SQL TRANSACTIONS
A series of one or more SQL statements that are logically related or a series of operations
performed on Oracle table data is termed as a Transaction.
Oracle treats this logical unit as a single entity.
Oracle treats changes to table data as a two-step process. First, the changes requested are
done.
To make these changes permanent a COMMIT statement has to be given at the SQL
prompt.
A ROLLBACK statement given at the SQL prompt can be used to undo a part of or the entire
transaction.
Specifically, a transaction is a group of events that occur between any of the following
events:
 Connecting to Oracle
 Disconnecting from Oracle
 Committing changes to the database table
 Rollback
Closing Transactions
 A transaction can be closed by using either a commit or a rollback
statement.
 By using these statements, table data can be changed or all the changes made
to the table data undone.
Using COMMIT:
A COMMIT ends the current transaction and makes permanent any changes made
during the transaction.
All transactional locks acquired on tables are released.
Syntax:
COMMIT;
Using ROLLBACK:
A ROLLBACK does exactly the opposite of COMMIT.
It ends the transaction but undoes any changes made during the transaction.
All transactional locks acquired on tables are released.
Syntax:
ROLLBACK [WORK] [TO [SAVEPOINT] <SavePointName>];
where,
WORK Is optional and is provided for ANSI compatibility
SAVEPOINT Is optional and is used to rollback a transaction partially, as far as the
specified savepoint
SavePointName Is a savepoint created during the current transaction
Creating A SAVEPOINT
SAVEPOINT marks and saves the current point in the processing of a transaction.
When a SAVEPOINT is used with a ROLLBACK statement, parts of a transaction can be
undone.
An active SAVEPOINT is one that is specified since the last COMMIT or ROLLBACK.
Syntax:
SAVEPOINT <SavePointName>;
A ROLLBACK operation performed without the SAVEPOINT clause amounts to the
following:
Ends the transaction.
Undoes all the changes in the current transaction.
Erases all savepoints in that transaction.
Releases the transactional locks.
A ROLLBACK operation performed with the TO SAVEPOINT clause amounts to the
following:
 A predetermined portion of the transaction is rolled back
 Retains the savepoint rolled back to, but loses those created after the named savepoint
 Releases all transactional locks that were acquired since the savepoint was taken
Example 1: - Write a PL/SQL block of code that first withdraws an amount of Rs. 1,000.
Then deposits an amount of Rs.1,40,000. Update the current balance. Then check to see
that the current balance of all the accounts in the bank does not exceed Rs.2,00,000. If the
balance exceeds Rs.2,00,000 then undo the deposit just made.
DECLARE
mBAL number(8,2);
BEGIN
/* Insertion of a record in the 'TRANS_MSTR' table for withdrawals */
INSERT INTO TRANS_MSTR
(TRANS_NO, ACCT_NO, DT, TYPE, PARTICULAR, DR_CR, AMT, BALANCE)
VALUES('T100','CA10','04-JUL-2004','C','Telephone Bill','W', 1000, 31000);
/* Updating the current balance of account number CA10 in the
'ACCT_MSTR' table. */
UPDATE ACCT_MSTR SET CURBAL = CURBAL - 1000 WHERE ACCT_NO = 'CA10';
/* Defining a savepoint. */
SAVEPOINT no_update;
/* Insertion of a record in the 'TRANS_MSTR' table for deposits. */
INSERT INTO TRANS_MSTR(TRANS_NO, ACCT_NO, DT, TYPE, PARTICULAR, DR_CR, AMT,
BALANCE) VALUES('T101', 'CA10', '04-JUL-2004', 'C', 'Deposit', 'D', 140000, 171000);
/* Updating the current balance of account number CA10 in the
'ACCT_MSTR' table. */
UPDATE ACCT_MSTR SET CURBAL = CURBAL + 140000 WHERE ACCT_NO = 'CA10';
/* Storing the total current balance from the 'ACCT_MSTR' table into a
variable. */
SELECT SUM(CURBAL) INTO mBAL FROM ACCT_MSTR;
/* Checking if the total current balance exceeds 200000. */
IF mBAL > 200000 THEN
/* Undo the changes made to the 'TRANS_MSTR' table. */
ROLLBACK TO SAVEPOINT no_update;
END IF;
/* Make the changes permanent. */
COMMIT;
END;
Output:
PL/SQL procedure successfully completed.
PL/SQL SECURITY
 An Oracle transaction can be made up of a single SQL sentence or several SQL
sentences.
 This gives rise to Single Query Transactions and Multiple Query Transactions
(i.e. SQT and MQT).
 These transactions (whether SQT or MQT) access an Oracle table(s).
 Since Oracle works on a multi-user platform, it is more than likely that several
people will access data either for viewing or for manipulating (inserting, updating
and deleting records) from the same tables at the same time via different SQL
statements.
 The Oracle table is therefore a global resource, i.e. it is shared by several users.
 The Oracle Engine has to allow simultaneous access to table data without causing
damage to the data.
 The technique employed by the Oracle engine to protect table data when several
people are accessing it is called Concurrency Control.
 Oracle uses a method called Locking to implement concurrency control when
multiple users access a table to manipulate its data at the same time.
LOCKS
 Locks are mechanisms used to ensure data integrity while allowing maximum
concurrent access to data. Oracle's locking is fully automatic and requires no user
intervention.
 The Oracle engine automatically locks table data while executing SQL statements.
This type of locking is called Implicit Locking.
Oracle's Default Locking Strategy - Implicit Locking
Since the Oracle engine has a fully automatic locking strategy, it has to decide on two
issues:
Type of Lock to be applied
Level of Lock to be applied
Types Of Locks
The type of lock to be placed on a resource depends on the operation being performed on
that resource. Operations on tables can be distinctly grouped into the following two
categories:
Read Operations : SELECT statements
Write Operations : INSERT, UPDATE, DELETE statements
 Since Read operations make no changes to data in a table and are meant only
for viewing purposes, simultaneous read operations can be performed on a
table without any danger to the table's data.
 Hence, the Oracle engine places a Shared lock on a table when its data is
being viewed.
 Write operations cause a change in table data, i.e. any insert, update or delete
statement affects table data directly; hence, simultaneous write
operations can adversely affect table data integrity.
 Simultaneous write operations will cause loss of data consistency in the
table.
 Hence, the Oracle engine places an Exclusive lock on a table or specific
sections of the table's resources when data is being written to a table.
The rules of locking can be summarized as:
DATA being CHANGED cannot be READ.
Writers wait for other writers, if they attempt to update the same rows at the same time.
The two Types of locks supported by Oracle are:
Shared Locks
 Shared locks are placed on resources whenever a Read operation (SELECT) is performed
 Multiple shared locks can be simultaneously set on a resource
Exclusive Locks
 Exclusive locks are placed on resources whenever Write operations (INSERT, UPDATE and
DELETE) are performed
 Only one exclusive lock can be placed on a resource at a time i.e. the first user who acquires an
exclusive lock will continue to have the sole ownership of the resource, and no other user can
acquire an exclusive lock on that resource
Note
 In the absence of explicit user-defined locking, if a default Exclusive lock is taken
on a table, a Shared lock on the very same data is still permitted.
 Automatic application of locks on resources by the Oracle engine results in a
high degree of data consistency.
Levels of Locks
 A table can be decomposed into rows and a row can be further decomposed into
fields.
 Hence, if an automatic locking system is designed so as to be able to lock the fields of
a record, it will be the most flexible locking system available.
 Oracle does not provide a field level lock.
Oracle provides the following three levels of locking:
 Row level
 Page level
 Table level
The Oracle engine decides on the level of lock to be used by the presence or absence of a
WHERE condition in the SQL sentence.
 If the WHERE clause evaluates to only one row in the table, a row level lock is used
 If the WHERE clause evaluates to a set of data, a page level lock is used
 If there is no WHERE clause (i.e. the query accesses the entire table), a table level lock is used
Example for Implicit Locking
Example: The BRANCH_MSTR table will be used to check the behavior of the Oracle Engine
in multi-user environment when an insert operation is performed.
Table Name: BRANCH_MSTR
BRANCH_NO NAME
B1 Vile Parle (HO)
B2 Andheri
B3 Churchgate
B4 Mahim
B5 Borivali
B6 Darya Ganj
Client A performs an insert operation on the BRANCH_MSTR table:
Client A> INSERT INTO BRANCH_MSTR (BRANCH_NO, NAME) VALUES('B7','Dahisar');
Output:
1 row created.
Client A fires a SELECT statement on the BRANCH_MSTR table:
Client A> SELECT * FROM BRANCH_MSTR;
Output:
BRANCH_NO NAME
B1 Vile Parle (HO)
B2 Andheri
B3 Churchgate
B4 Mahim
B5 Borivali
B6 Darya Ganj
B7 Dahisar
7 rows selected.
Client B fires a SELECT statement on the BRANCH_MSTR table:
Client B> SELECT * FROM BRANCH_MSTR;
Output:
BRANCH_NO NAME
B1 Vile Parle (HO)
B2 Andheri
B3 Churchgate
B4 Mahim
B5 Borivali
B6 Darya Ganj
6 rows selected.
Observation:
 Client A can see the newly inserted record B7
 Client B cannot see the newly inserted record, as Client A has not committed it
Inferences:
Since Client A has not fired a commit statement for permanently saving the newly
inserted record in the BRANCH_MSTR table, Client B cannot access the newly
inserted record or manipulate it in any way
Note
Client A can view, update or delete the newly inserted record since it exists in the buffer on the
Client A's computer. However, this record does not exist in the Server's table, because Client A has
not committed the transaction.
Explicit Locking
Although the Oracle engine has a default locking strategy, in commercial
applications explicit user-defined locking is often required.
Consider the example below:
If two client computers (Client A and Client B) are entering sales orders, each time a
sales order is prepared, the quantity on hand of the product for which the order is
being generated needs to be updated in the PRODUCT_MSTR table.
Now, if Client A fires an update command on a record in the PRODUCT_MSTR table, then
Oracle will implicitly lock the record so that no further data manipulation can be done by
any other user till the lock is released. The lock will be released only when Client A fires a
commit or rollback.
In the meantime, if Client B tries to view the same record, the Oracle engine will
display the old set of values for the record, as the transaction for that record has not
been completed by Client A. This leads to wrong information being displayed to
Client B.
In such cases, Client A must explicitly lock the record such that no other user can
access the record even for viewing purposes till Client A's transaction is
completed.
A Lock so defined is called Explicit Lock. User defined explicit locking always
overrides Oracle's default locking strategy.
Explicit Locking
The technique of a user taking a lock on a table or its resources is called Explicit Locking.
Users can lock tables they own or any tables on which they have been granted table
privileges (such as select, insert, update, delete).
Oracle provides facilities by which the default locking strategy can be overridden and
table(s) or row(s) can be locked explicitly.
Explicit locking can be achieved in two ways:
o SELECT … FOR UPDATE statement;
o LOCK TABLE statement;
The SELECT ... FOR UPDATE Statement
It is used for acquiring exclusive row level locks in anticipation of performing updates on
records.
This clause is generally used to signal the Oracle engine that data currently being used
needs to be updated.
It is often followed by one or more update statements with a where clause.
Example 1:
Two client machines Client A and Client B are recording the transactions performed in a
bank for a particular account number simultaneously.
Client A fires the following select statement:
Client A> SELECT * FROM ACCT_MSTR WHERE ACCT_NO = 'SB9' FOR UPDATE;
When the above SELECT statement is fired, the Oracle engine locks the record SB9. This lock is
released when a commit or rollback is fired by Client A.
Now Client B fires a SELECT statement, which points to record SB9, which has already been locked
by Client A:
Client B> SELECT * FROM ACCT_MSTR WHERE ACCT_NO ='SB9' FOR UPDATE;
The Oracle engine will make Client B's SQL statement wait indefinitely until the lock on
ACCT_MSTR is released by a commit or rollback statement fired by Client A.
SELECT … FOR UPDATE with NOWAIT Option
In order to avoid unnecessary waiting time, a NOWAIT option can be used to inform the Oracle
engine to terminate the SQL statement if the record has already been locked. If this happens the
Oracle engine terminates the running DML and comes up with a message indicating that the
resource is busy.
If Client B fires the following select statement now with a NOWAIT clause:
Client B> SELECT * FROM ACCT_MSTR WHERE ACCT_NO ='SB9' FOR UPDATE NOWAIT;
Output:
Since Client A has already locked the record SB9, when Client B tries to acquire a
lock on the same record the Oracle engine displays the following message:
ORA-00054: resource busy and acquire with NOWAIT specified.
The SELECT ... FOR UPDATE cannot be used with the following:
 Distinct and the Group by clause
 Set operators and Group functions
Using Lock Table Statement
The LOCK TABLE statement is used to manually override Oracle's default locking strategy by
creating a data lock in a specific mode.
Syntax:
LOCK TABLE <TableName> [, <TableName>] ...
IN {ROW SHARE | ROW EXCLUSIVE | SHARE UPDATE | SHARE | SHARE ROW EXCLUSIVE | EXCLUSIVE}
[NOWAIT];
where,
TableName Indicates the name of the table(s) or view(s) to be locked. In
case of views, the lock is placed on the underlying tables.
IN Decides what other locks on the same resource can
exist simultaneously. For example, if there is an
exclusive lock on the table no user can update rows in
the table. It can have any of the following values:
Exclusive: Allows queries on the locked resource but prohibits
any other activity on it.
Share: Allows queries but prohibits updates to the table.
Row Exclusive: The same as row share locks, but they also
prohibit locking in share mode. These locks are acquired when
updating, inserting or deleting.
Share Row Exclusive: Used to look at a whole table, to perform
selective updates, and to allow other users to look at rows in
the table, but not to lock the table in share mode or update
rows.
NOWAIT Indicates that the Oracle engine should return to the user
immediately with a message if the resources are busy. If
omitted, the Oracle engine will wait indefinitely till the
resources are available.
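For instance, a session that needs a consistent read of a table while preventing updates by others could lock it in share mode; the sketch below reuses the BRANCH_MSTR table from the earlier examples:
Client A> LOCK TABLE BRANCH_MSTR IN SHARE MODE NOWAIT;
Output:
Table(s) Locked.
If another session already holds a conflicting lock, the NOWAIT clause makes the statement return immediately with the resource-busy error instead of waiting.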
Oracle Table Lock Mode (RS)
Row Share Table Lock (RS)
• Indicates a transaction holding the lock on the table has locked rows in the table and
intends to update them.
• Permitted Operations: Allows other transactions to query, insert, update, delete, or lock
rows concurrently in the same table. Therefore, other transactions can obtain
simultaneous row share, row exclusive, share, and share row exclusive table locks for
the same table.
• Prohibited Operations: Lock Table in Exclusive Mode.
Oracle Table Lock Mode (RX)
Row Exclusive Table Lock (RX)
• Indicates that a transaction holding the lock has made one or more updates to rows in
the table. A row exclusive table lock is acquired automatically by: INSERT, UPDATE,
DELETE, LOCK TABLE.. IN ROW EXCLUSIVE MODE; A row exclusive table lock is slightly
more restrictive than a row share table lock.
• Permitted Operations: Allows other transactions to query, insert, update, delete, or lock
rows in the same table. The row exclusive table locks allow multiple transactions to
obtain simultaneous row exclusive and row share table locks in the same table.
• Prohibited Operations: Prevents locking the table for exclusive reading or writing.
Therefore, other transactions cannot concurrently lock the table IN SHARE MODE, IN
SHARE ROW EXCLUSIVE MODE, or IN EXCLUSIVE MODE.
Oracle Table Lock Mode (S)
Share Table Lock (S)
• Acquired automatically for the table specified in the following statement: LOCK TABLE
<table> IN SHARE MODE;
• Permitted Operations: Allows other transactions only to query the table, to lock specific
rows with SELECT ... FOR UPDATE, or to execute LOCK TABLE ... IN SHARE MODE; no
updates are allowed by other transactions. Multiple transactions can hold share table
locks for the same table concurrently. Therefore, a transaction that has a share table
lock can update the table only if no other transaction has a share table lock on the same
table.
• Prohibited Operations: Prevents other transactions from modifying the same table or
locking the table IN SHARE ROW EXCLUSIVE MODE, IN EXCLUSIVE MODE, or IN ROW
EXCLUSIVE MODE.
Oracle Table Lock Mode (SRX)
Share Row Exclusive Table Lock (SRX)
• More restrictive than a share table lock. A share row exclusive table lock is acquired for
a table as follows: LOCK TABLE <table> IN SHARE ROW EXCLUSIVE MODE;
• Permitted Operations: Only one transaction at a time can acquire a share row exclusive
table lock on a given table. A share row exclusive table lock held by a transaction allows
other transactions to query or lock specific rows using SELECT with the FOR UPDATE
clause, but not to update the table.
• Prohibited Operations: Prevents other transactions from obtaining row exclusive table
locks and modifying the same table. A share row exclusive table lock also prohibits
other transactions from obtaining share, share row exclusive, and exclusive table locks.
Oracle Table Lock Mode (X)
Exclusive Table Lock (X)
• Most restrictive mode of table lock, allowing the transaction that holds the lock
exclusive write access to the table. An exclusive table lock is acquired by: LOCK TABLE
<table> IN EXCLUSIVE MODE;
• Permitted Operations: Only one transaction can obtain an exclusive table lock for a table.
An exclusive table lock permits other transactions only to query the table.
• Prohibited Operations: Prohibits other transactions from performing any type of DML
statement or placing any type of lock on the table.
Example: Two client machines, Client A and Client B, are performing data
manipulation on the table EMP_MSTR.
Table Name: EMP_MSTR
EMP_NO BRANCH_NO FNAME MNAME LNAME DEPT DESIG MNGR_NO
E1 B1 Ivan Nelson Bayross Administration Managing Director
E2 B2 Amit Desai Loans & Financing Finance Manager
E3 B3 Maya Mahima Joshi Client Servicing Sales Manager
E4 B1 Peter Iyer Joseph Loans & Financing Clerk E2
E5 B4 Mandhar Dilip Dalvi Marketing Marketing Manager
E6 B6 Sonal Abdul Khan Administration Admin. Executive E1
E7 B4 Anil Ashutosh Kambli Marketing Sales Asst. E5
E8 B3 Seema P. Apte Client Servicing Clerk E3
E9 B2 Vikram Vilas Randive Marketing Sales Asst. E5
E10 B6 Anjali Sameer Pathak Administration HR Manager E1
Client A has locked the table in exclusive mode (i.e. only querying of records is allowed on
the EMP_MSTR table by Client B):
Client A> LOCK TABLE EMP_MSTR IN EXCLUSIVE MODE NOWAIT;
Output:
Table(s) Locked.
Client A performs an insert operation but does not commit the transaction:
Client A> INSERT INTO EMP_MSTR(EMP_NO, BRANCH_NO, FNAME, MNAME, LNAME, DEPT,
DESIG, MNGR_NO) VALUES('E100', 'B1', 'Sharanam', 'Chaitanya', 'Shah', 'Administration', 'Project
Leader', NULL);
Output:
1 row created.
Client B performs a view operation:
Client B> SELECT EMP_NO, FNAME, MNAME, LNAME FROM EMP_MSTR;
Output:
EMP_NO FNAME MNAME LNAME
E1 Ivan Nelson Bayross
E2 Amit Desai
E3 Maya Mahima Joshi
E4 Peter Iyer Joseph
E5 Mandhar Dilip Dalvi
E6 Sonal Abdul Khan
E7 Anil Ashutosh Kambli
E8 Seema P. Apte
E9 Vikram Vilas Randive
E10 Anjali Sameer Pathak
Client B performs an insert operation:
Client B> INSERT INTO EMP_MSTR(EMP_NO, BRANCH_NO, FNAME, MNAME, LNAME, DEPT, DESIG,
MNGR_NO) VALUES('E101', 'B1', 'Vaishali', 'Sharanam', 'Shah', 'Tech Team', 'Programmer', 'E100');
Output:
Client B's SQL DML enters into a wait state waiting for Client A to release the locked
resource by using a Commit or Rollback statement.
Inferences:
 When Client A locks the table EMP_MSTR in exclusive mode, the table is available only for
querying by other users. No other data manipulation (i.e. insert, update and delete operations)
can be performed on the EMP_MSTR table by other users
 Since Client A has inserted a record in the EMP_MSTR table and not committed the changes,
when Client B fires a select statement the newly inserted record is not visible to Client B
 As the EMP_MSTR table has been locked, when Client B tries to insert a record the system
enters into an indefinite wait period till all locks taken on the EMP_MSTR table are released
by Client A
Releasing Locks
Locks are released under the following circumstances:
The transaction is committed successfully using the Commit verb
A rollback is performed
A rollback to a savepoint will release locks set after the specified savepoint
Note
All locks are released on commit or unqualified Rollback.
Table locks are released by rolling back to a savepoint.
Row-level locks are not released by rolling back to a savepoint.
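A small sketch of this behavior, reusing the ACCT_MSTR table; the savepoint name before_fee is illustrative:
UPDATE ACCT_MSTR SET CURBAL = CURBAL - 1000 WHERE ACCT_NO = 'SB1';
SAVEPOINT before_fee;
UPDATE ACCT_MSTR SET CURBAL = CURBAL - 50 WHERE ACCT_NO = 'SB1';
ROLLBACK TO SAVEPOINT before_fee;
COMMIT;
The rollback undoes only the second update, while the commit makes the first update permanent and releases all remaining locks.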
Examples of Explicit Locking Using SQL and the Behavior of the Oracle Engine.
The ACCT_MSTR table will be used to check the behavior of the Oracle Engine in a multi-
user environment. An update operation is performed after specifying the FOR UPDATE
clause.
Table name: ACCT_MSTR (Partial Extract)
ACCT_NO CURBAL ACCT_NO CURBAL ACCT_NO CURBAL ACCT_NO CURBAL
SB1 500 CA2 3000 SB3 500 CA4 12000
SB5 500 SB6 500 CA7 22000 SB8 500
SB9 500 CA10 32000 SB11 500 CA12 5000
SB13 500 CA14 10000 SB15 500
Client A selects all the records from the ACCT_MSTR table with the FOR UPDATE clause:
Client A> SELECT ACCT_NO, CURBAL FROM ACCT_MSTR FOR UPDATE;
Client A performs an update operation on the record SB9 in the ACCT_MSTR table:
Client A> UPDATE ACCT_MSTR SET CURBAL = CURBAL + 5000 WHERE ACCT_NO = 'SB9';
Output:
1 row updated.
Client A fires a SELECT statement on the ACCT_MSTR table:
Client A> SELECT ACCT_NO, CURBAL FROM ACCT_MSTR;
Output:
ACCT_NO CURBAL
SB1 500
CA2 3000
SB3 500
CA4 12000
SB5 500
SB6 500
CA7 22000
SB8 500
SB9 5500
CA10 32000
SB11 500
CA12 5000
SB13 500
CA14 10000
SB15 500
15 rows selected.
Client B fires a SELECT statement with a FOR UPDATE clause on the ACCT_MSTR table:
Client B> SELECT ACCT_NO, CURBAL FROM ACCT_MSTR FOR UPDATE;
Output:
Client B's SQL DML enters into an indefinite wait state waiting for Client A to release the locked
resource by using a commit or Rollback statement.
CURSORS
WHAT IS A CURSOR?
 The Oracle Engine uses a work area for its internal processing in order to execute an SQL
statement. This work area is private to SQL's operations and is called a Cursor.
 The data that is stored in the cursor is called the Active Data Set.
 Conceptually, the size of the cursor in memory is the size required to hold the number of
rows in the Active Data Set.
 The actual size, however, is determined by the Oracle engine's built in memory
management capabilities and the amount of RAM available.
 Oracle has a pre-defined area in main memory set aside, within which cursors are opened.
 Hence the cursor's size will be limited by the size of this pre-defined area.
Types Of Cursors
1. Implicit Cursors
I. The Oracle engine implicitly opens a cursor on the server to process each SQL
statement.
II. Since the implicit cursor is opened and managed by the Oracle engine internally,
the functions of reserving an area in memory, populating this area with appropriate
data, processing the data in the memory area, and releasing the memory when
processing is over are taken care of by the Oracle engine.
III. The resultant data is then passed to the client machine via the network, and a cursor
is opened on the client machine to hold the rows returned by the Oracle engine.
IV. The number of rows held in the cursor on the client is managed by the client's
operating system and its swap area.
V. Implicit cursor attributes can be used to access information about the status of the
last insert, update, delete or single-row select statement.
VI. This can be done by preceding the implicit cursor attribute with the cursor name,
that is, SQL.
Example
1. The HRD manager has decided to raise the salary of an employee by 15% (0.15). Write a
program to accept the employee number and update the salary of that employee. Display
an appropriate message if the employee does not exist.
DECLARE
v_eno emp.empno%type;
BEGIN
v_eno := &v_eno;
UPDATE emp SET sal = sal + (sal * 0.15)
WHERE empno = v_eno;
IF SQL%FOUND THEN
Dbms_output.put_line('Employee exists; salary updated');
ELSE
Dbms_output.put_line('Employee does not exist');
END IF;
END;
2. The HRD manager has decided to raise the salary of employees whose job is Programmer.
Display the number of employees having the job Programmer.
BEGIN
UPDATE emp SET sal = sal + (sal * 0.5)
WHERE job = 'Programmer';
IF SQL%ROWCOUNT > 0 THEN
Dbms_output.put_line(SQL%ROWCOUNT || ' employee records updated successfully');
ELSE
Dbms_output.put_line('No employees working as Programmer');
END IF;
END;
2. EXPLICIT CURSORS
When individual records in a table have to be processed inside a PL/SQL code block, a cursor is
used. Such a cursor is declared and mapped to an SQL query in the declare section of the PL/SQL
block and used within its executable section. A cursor created and used in this way is known as an
explicit cursor.
Examples-
1. Print the name and job of employees having the job manager or analyst.
DECLARE
vname emp.ename%type;
vjob emp.job%type;
CURSOR c1 IS SELECT ename, job FROM emp WHERE job = 'manager' OR job = 'analyst';
BEGIN
OPEN c1;
Dbms_output.put_line('Name' || ' ' || 'Job');
LOOP
FETCH c1 INTO vname, vjob;
EXIT WHEN c1%NOTFOUND;
Dbms_output.put_line(vname || ' ' || vjob);
END LOOP;
IF c1%ROWCOUNT = 0 THEN
Dbms_output.put_line('No records found');
END IF;
CLOSE c1;
END;
General Cursor Attributes:
 When the Oracle engine creates an implicit or explicit cursor, cursor control variables are
also created to control the execution of the cursor.
 These are a set of four system variables, which keep track of the current status of a cursor.
 These cursor variables can be accessed and used in a PL/SQL code block.
Both Implicit and Explicit cursors have four attributes. They are described below:
Attribute Name Description
%ISOPEN Returns TRUE if cursor is open, FALSE otherwise.
%FOUND Returns TRUE if record was fetched successfully, FALSE otherwise.
%NOTFOUND Returns TRUE if record was not fetched successfully, FALSE otherwise.
%ROWCOUNT Returns number of records processed from the cursor.
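As a small sketch of these attributes in use (assuming the standard emp table used in the examples above):
DECLARE
CURSOR c_emp IS SELECT ename FROM emp;
vname emp.ename%type;
BEGIN
OPEN c_emp;
IF c_emp%ISOPEN THEN
LOOP
FETCH c_emp INTO vname;
EXIT WHEN c_emp%NOTFOUND;
END LOOP;
Dbms_output.put_line('Rows processed: ' || c_emp%ROWCOUNT);
END IF;
CLOSE c_emp;
END;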
Explicit Cursor Management
The steps involved in using an explicit cursor and manipulating data in its active set are:
 Declare a cursor mapped to a SQL select statement that retrieves data for processing
 Open the cursor
 Fetch data from the cursor one row at a time into memory variables
 Process the data held in the memory variables as required using a loop
 Exit from the loop after processing is complete
 Close the cursor
Cursor Declaration
A cursor is defined in the declarative part of a PL/SQL block. This is done by naming the
cursor and mapping it to a query.
When a cursor is declared, the Oracle engine is informed that a cursor of the said name
needs to be opened. The declaration is only intimation.
There is no memory allocation at this point in time.
Syntax:
CURSOR CursorName IS SELECT statement;
Opening a Cursor
Initialization of a cursor takes place via the OPEN statement, which:
Defines a private SQL area named after the cursor name.
Executes the query associated with the cursor.
Retrieves table data and populates the named private SQL area in memory, i.e. creates the
Active Data Set.
Sets the cursor row pointer in the Active Data Set to the first record.
Syntax:
OPEN CursorName;
Fetching data from a Cursor
A fetch statement then moves the data held in the Active Data Set into memory
variables.
The fetch statement is placed inside a Loop ... End Loop construct, which causes the data to
be fetched into the memory variables and processed until all the rows in the Active Data
Set are processed.
The fetch loop then exits.
The exiting of the fetch loop is user controlled.
Syntax:
FETCH CursorName INTO Variable1, Variable2, ...;
Note:
There must be a memory variable for each column value of the Active Data Set.
Datatypes must match.
These variables will be declared in the DECLARE section of the PL/SQL block.
Processing data
Data held in the memory variables can be processed as desired.
Exiting loop
A standard loop structure (Loop-End Loop) is used to fetch records from the cursor
into memory variables one row at a time.
Closing A Cursor
After the fetch loop exits, the cursor must be closed with the closestatement.
This will release the memory occupied by the cursor and its Active Data Set
The close statement disables the cursor and the active set becomes undefined.
This will release the memory occupied by the cursor and its Data Set both on the
Client and on the Server.
Syntax:
CLOSE CursorName;
Note
Once a cursor is closed, it can be reopened by firing the OPEN statement again.
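Putting the above steps together, a minimal sketch assuming the standard dept table (deptno, dname):
DECLARE
CURSOR c_dept IS SELECT deptno, dname FROM dept; -- 1. declare: map the cursor to a query
v_deptno dept.deptno%type;
v_dname dept.dname%type;
BEGIN
OPEN c_dept; -- 2. open: execute the query and create the Active Data Set
LOOP
FETCH c_dept INTO v_deptno, v_dname; -- 3. fetch one row at a time into memory variables
EXIT WHEN c_dept%NOTFOUND; -- 5. exit after all rows are processed
Dbms_output.put_line(v_deptno || ' ' || v_dname); -- 4. process the data
END LOOP;
CLOSE c_dept; -- 6. close: release the memory
END;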
CURSOR FOR LOOPS
Another technique commonly used to control the Loop... End Loop within a PL/SQL block is the
FOR variable IN value construct.
This is an example of a machine defined loop exit i.e. when all the values in the FOR construct are
exhausted looping stops.
Syntax:
FOR memory variable IN CursorName
Here, the verb FOR automatically creates the memory variable of the %rowtype.
Each record in the opened cursor becomes a value for the memory variable of the %rowtype.
The FOR verb ensures that a row from the cursor is loaded in the declared memory variable and
the loop executes once.
This goes on until all the rows of the cursor have been loaded into the memory variable. After this
the loop stops.
A cursor for loop automatically does the following:
 Implicitly declares its loop index as a %rowtype record
 Opens a cursor
 Fetches a row from the cursor for each loop iteration
 Closes the cursor when all rows have been processed
The cursor is closed automatically even when an exit or a goto statement is used to leave the loop
prematurely, or if an exception is raised inside the loop.
Example :
1. Assign commission = 500 to those employees who currently have a null commission, using a
cursor for loop:
DECLARE
CURSOR c_comm IS SELECT * FROM emp WHERE comm IS NULL;
BEGIN
FOR emp_rec IN c_comm
LOOP
UPDATE emp SET comm = 500 WHERE empno = emp_rec.empno;
Dbms_output.put_line(emp_rec.ename || ' ' || 500);
END LOOP;
COMMIT;
END;
Output:
PL/SQL procedure successfully completed.
The FOR LOOP is responsible to:
 Automatically fetch the data retrieved by the cursor into the NoTrans_Rec recordset
 Execute the sequence of statements inside the loop once for every row that is fetched, i.e.
check if any data is retrieved
Update the STATUS field of the ACCT_MSTR table to reflect the inactivity
Insert a record in the INACTV_ACCT_MSTR table to reflect the updation
 Automatically repeat the above steps until the data retrieval process is complete
 Close the cursor automatically when all the records in the cursor have been processed. This is
because there are no more rows left to load into NoTrans_Rec. This situation is sensed by the
FOR verb, which causes the loop to exit.
Finally a COMMIT is fired to make the changes permanent (a sketch of such a block is given below).
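The PL/SQL block this passage describes is not reproduced in these notes. The sketch below shows what such a block could look like; the cursor query, the STATUS codes 'A' and 'I', and the column list of INACTV_ACCT_MSTR are all assumptions made purely for illustration:
DECLARE
CURSOR c_no_trans IS
SELECT acct_no FROM ACCT_MSTR WHERE status = 'A'; -- assumed test for accounts with no activity
BEGIN
FOR NoTrans_Rec IN c_no_trans
LOOP
UPDATE ACCT_MSTR SET status = 'I' -- reflect the inactivity
WHERE acct_no = NoTrans_Rec.acct_no;
INSERT INTO INACTV_ACCT_MSTR (acct_no, status) -- assumed column list
VALUES (NoTrans_Rec.acct_no, 'I');
END LOOP;
COMMIT; -- make the changes permanent
END;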
PARAMETERIZED CURSORS
Till now, a cursor's Active Data Set contained only those records which satisfied the conditions
set in the WHERE clause of the SELECT statement mapped to the cursor.
In other words, the criterion on which the Active Data Set is determined is hard
coded and never changes.
Commercial applications require that the query which defines the cursor be
generic, and that the data retrieved from the table be allowed to change according
to need.
Oracle recognizes this and permits the creation of parameterized cursors.
The contents of a parameterized cursor will constantly change depending upon the
value passed to its parameter.
Since the cursor accepts user-defined values into its parameters, thus changing the
result set extracted, it is called a parameterized cursor.
Declaring A Parameterized Cursor
Syntax:
CURSOR CursorName (VariableName Datatype) IS <SELECT statement... >
Syntax:
OPEN CursorName (Value / Variable / Expression )
Note
The scope of cursor parameters is local to that cursor, which means that they can be
referenced only within the query declared in the cursor declaration.
Each parameter in the declaration must have a corresponding value in the open statement.
Example-
Accept a salary and print the name and salary of employees having a salary less than or equal to
the accepted salary.
DECLARE
CURSOR c1(p_sal emp.sal%type) IS SELECT * FROM emp WHERE sal <= p_sal;
vsal emp.sal%type;
BEGIN
vsal := &vsal;
FOR emp_rec IN c1(vsal)
LOOP
Dbms_output.put_line(emp_rec.ename || ' ' || emp_rec.sal);
END LOOP;
END;
Cursor Variables
Like a cursor, a cursor variable points to the current row in the result set of a multi-
row query.
But, unlike a cursor, a cursor variable can be opened for any type-compatible query.
It is not tied to a specific query.
Cursor variables are true PL/SQL variables, to which you can assign new values and
which you can pass to subprograms stored in an Oracle database.
This gives you more flexibility and a convenient way to centralize data retrieval.
Typically, you open a cursor variable by passing it to a stored procedure that
declares a cursor variable as one of its formal parameters.
To execute a multi-row query, Oracle opens an unnamed work area that stores
processing information. You can access this area through an explicit cursor, which
names the work area, or through a cursor variable, which points to the work area.
To create cursor variables, you define a REF CURSOR type, then declare cursor
variables of that type.
ref_cursor_type_definition ::=
TYPE type_name IS REF CURSOR
[RETURN
{ {db_table_name | cursor_name | cursor_variable_name}%ROWTYPE
| record_name%TYPE
| record_type_name
| ref_cursor_type_name
}];
ref_cursor_variable_declaration ::=
cursor_variable_name type_name;
Keyword and Parameter Description
cursor_name
o An explicit cursor previously declared within the current scope.
cursor_variable_name
o A PL/SQL cursor variable previously declared within the current scope.
db_table_name
o A database table or view, which must be accessible when the declaration is
elaborated.
record_name
o A user-defined record previously declared within the current scope.
record_type_name
o A user-defined record type that was defined using the datatype specifier RECORD.
REF CURSOR
o Cursor variables all have the datatype REF CURSOR.
RETURN
o Specifies the datatype of a cursor variable return value. You can use the %ROWTYPE
attribute in the RETURN clause to provide a record type that represents a row in a
database table, or a row from a cursor or strongly typed cursor variable. You can use
the %TYPE attribute to provide the datatype of a previously declared record.
Types of Cursor Variables
REF CURSOR types can be strong (with a return type) or weak (with no return type).
Strong REF CURSOR types are less error prone because the PL/SQL compiler lets you
associate a strongly typed cursor variable only with queries that return the right set of
columns.
Weak REF CURSOR types are more flexible because the compiler lets you associate a weakly
typed cursor variable with any query.
Because there is no type checking with a weak REF CURSOR, all such types are
interchangeable. Instead of creating a new type, you can use the predefined type
SYS_REFCURSOR.
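As a sketch of the difference, assuming the standard emp table: the strong type below fixes the row shape with a RETURN clause, while the weak, predefined SYS_REFCURSOR accepts any query:
DECLARE
TYPE EmpCurTyp IS REF CURSOR RETURN emp%ROWTYPE; -- strong: the compiler checks the query's columns
strong_cv EmpCurTyp;
weak_cv SYS_REFCURSOR; -- weak: no RETURN clause, no checking
BEGIN
OPEN strong_cv FOR SELECT * FROM emp; -- must return emp%ROWTYPE rows
OPEN weak_cv FOR SELECT ename FROM emp; -- any column list is accepted
CLOSE strong_cv;
CLOSE weak_cv;
END;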
The following procedure opens the cursor variable generic_cv for the chosen query:
PROCEDURE open_cv (generic_cv IN OUT GenericCurTyp, choice NUMBER) IS
BEGIN
IF choice = 1 THEN
OPEN generic_cv FOR SELECT * FROM emp;
ELSIF choice = 2 THEN
OPEN generic_cv FOR SELECT * FROM dept;
ELSIF choice = 3 THEN
OPEN generic_cv FOR SELECT * FROM salgrade;
END IF;
...
END;
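For context, GenericCurTyp must be declared before the procedure can be compiled; a minimal self-contained sketch, in which the local declarations and the call are assumptions for illustration:
DECLARE
TYPE GenericCurTyp IS REF CURSOR; -- the weak REF CURSOR type the procedure expects
generic_cv GenericCurTyp;
PROCEDURE open_cv (generic_cv IN OUT GenericCurTyp, choice NUMBER) IS
BEGIN
IF choice = 1 THEN
OPEN generic_cv FOR SELECT * FROM emp;
END IF;
END;
BEGIN
open_cv(generic_cv, 1); -- generic_cv now points at the result set of the emp query
CLOSE generic_cv;
END;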
Query Evaluation
THE SYSTEM CATALOG
A relational DBMS maintains information about every table and index that it contains. The
descriptive information is itself stored in a collection of special tables called the catalog
tables.
The catalog tables are also called the data dictionary, the system catalog, or simply the
catalog.
Information in the Catalog
The system catalog stores system-wide information, such as the size of the buffer pool and the
page size, and the following information about individual tables, indexes, and views:
 For each table:
- Its table name, the file name (or some identifier), and the file structure (e.g. heap file) of the
file in which it is stored.
- The attribute name and type of each of its attributes.
- The index name of each index on the table.
- The integrity constraints (e.g. primary key and foreign key constraints) on the table.
 For each index:
- The index name and the structure (e.g. B+ tree) of the index.
- The search key attributes.
 For each view:
- Its view name and definition.
The following information is commonly stored:
 Cardinality: The number of tuples. NTuples (R) for each table R.
 Size: The number of pages NPages (R) for each table R.
 Index Cardinality: The number of distinct key values NKeys (I) for each index I.
 Index Size: The number of pages INPages(I) for each index I. (For a B+ tree index I, we take
INPages to be the number of leaf pages.)
 Index Height: The number of nonleaf levels IHeight(I) for each tree index I.
 Index Range: The minimum present key value ILow(I) and the maximum present key value
IHigh(I) for each index I.
The catalogs also contain information about users, such as accounting information and
authorization information (e.g. Joe User can modify the Reserves table but only read the Sailors
table).
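As an illustration, on Oracle the catalog is itself queried with ordinary SQL; USER_TABLES and USER_INDEXES are standard data dictionary views holding exactly this kind of statistics:
SELECT table_name, num_rows FROM USER_TABLES; -- per-table cardinality
SELECT index_name, blevel, leaf_blocks, distinct_keys FROM USER_INDEXES; -- index height, size and key cardinality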
Evaluation of Relational Operators
1. A relational database consists of a collection of tables, each of which is assigned a unique
name. A row in a table represents a relationship among a set of values.
2. As a table is a collection of such relationships, there is a close correspondence between the
concept of a table and the mathematical concept of a relation.
3. A query language is a language in which a user requests information from the database.
Query languages can be categorized as procedural or non-procedural.
4. In a procedural language, the user instructs the system to perform a sequence of operations
on the database to compute the desired result.
5. In a non-procedural language, the user describes the desired information without giving a
specific procedure for obtaining it.
6. Relational algebra is a procedural language whose fundamental operations are select,
project, union, set difference and Cartesian product.
7. In addition to the fundamental operations there are several other operations, such as set
intersection, natural join, division and assignment.
I. SELECT OPERATOR: The select operation selects the tuples (rows) that satisfy a given
predicate. The lower-case Greek letter σ is used to denote selection. The argument
relation is given in parentheses after 'σ'.
Examples-
1. Select those tuples of the loan relation where the branch is Nerul.
σ branchname="Nerul"(loan)
2. Select all tuples in which the amount of the loan is more than 1200.
σ amount > 1200(loan)
II. PROJECT OPERATOR: This operator is used to select only the required attributes
(columns). Projection is denoted by the Greek letter 'Π'.
Examples-
1. List all loan numbers and amounts of loans.
Π loan_no,amount(loan)
III. SET DIFFERENCE OPERATOR: The set difference operator, denoted by the '-' sign, allows
us to find tuples that are in one relation but not in another.
Examples-
1. List all the customers of the bank that have an account but no loan.
Π customer_name(depositor) - Π customer_name(loan)
IV. SET INTERSECTION OPERATOR: It is denoted by '∩' and allows us to find tuples that are
in both relations.
Example-
1. List all the customers of the bank that have both an account and a loan.
Π customer_name(depositor) ∩ Π customer_name(loan)
V. NATURAL JOIN OPERATOR: This operator allows us to combine a Cartesian product and
certain selections into one operation. It is denoted by the '⋈' symbol. The natural join
operation takes two relations as arguments, forces equality on those attributes that
appear in both relation schemas, and finally removes duplicate attributes.
Examples-
1. Find all customers who have a loan and an account in the bank.
Π customer_name (depositor ⋈ loan)
2. Find the names of all branches with customers who have an account in the bank and
live in Vashi city.
Π branch_name (σ customer_city="Vashi"(customer ⋈ account ⋈ depositor))
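For comparison, the same requests can be written in SQL. The sketch below assumes tables loan(loan_no, branchname, amount, customer_name) and depositor(customer_name, ...) matching the relations used above; MINUS is Oracle's set difference operator:
SELECT * FROM loan WHERE branchname = 'Nerul'; -- selection
SELECT loan_no, amount FROM loan; -- projection
SELECT customer_name FROM depositor -- set difference
MINUS
SELECT customer_name FROM loan;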
INTRODUCTION TO QUERY OPTIMIZATION
It is the process of selecting the most efficient query evaluation plan from among the
many possible plans for processing a given query, especially if the query is complex.
Users are not expected to write their queries so that they can be processed
efficiently.
Rather, the system is expected to construct a query evaluation plan that minimizes the
cost of query evaluation. This is where query optimization comes into play.
One aspect of optimization occurs at the relational algebra level, where the system finds
an expression that is equivalent to the given expression but more efficient to execute.
Another aspect is selecting a detailed strategy for processing the query, such as choosing
the algorithm to use for executing an operation, choosing the specific indices to use, and
so on.
It is the job of the query optimizer to come up with a query evaluation plan that computes
the same result as the given expression and is the least costly way of generating that result.
To find the least costly query evaluation plan, the query optimizer needs to generate
alternative plans that produce the same result as the given expression.
Choosing the least costly plan involves the following three steps:
i. Generating expressions that are logically equivalent to the given expression.
ii. Annotating the resultant expressions in alternative ways to generate alternative
query evaluation plans.
iii. Estimating the cost of each evaluation plan and choosing the least costly one.
Steps i and ii are interleaved in the query optimizer: some expressions are generated
and annotated, then more expressions are generated, and so on.
Step iii is done in the background by collecting statistical information about the relations,
such as relation sizes and index depths, to make a good estimate of the cost of each plan.
Query optimization is one of the most important tasks of a relational DBMS.
One of the strengths of relational query languages is the wide variety of ways in which a
user can express, and thus the system can evaluate, a query.
Although this flexibility makes it easy to write queries, good performance relies greatly on
the quality of the query optimizer: a given query can be evaluated in many ways, and the
difference in cost between the best and worst plan may be several orders of magnitude.
Realistically, we cannot expect to always find the best plan, but we expect to consistently
find a plan that is quite good.
Queries are parsed and then presented to a query optimizer, which is responsible for
identifying an efficient execution plan.
The optimizer generates alternative plans and chooses the plan with the least estimated
cost.
 The space of plans considered by a typical relational query optimizer can be understood by
recognizing that a query is essentially treated as a σ-π-× (select-project-cross-product)
algebra expression, with the remaining operations (if any, in a given query) carried out on
the result of the σ-π-× expression. Optimizing such a relational algebra expression involves
two basic steps:
 Enumerating alternative plans for evaluating the expression. Typically, an optimizer
considers a subset of all possible plans because the number of possible plans is very
large.
 Estimating the cost of each enumerated plan and choosing the plan with the lowest
estimated cost.
Commercial Optimizers: current relational DBMS optimizers are very complex pieces of
software with many closely guarded details, and they typically represent 40 to 50 man-years
of development effort!
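On Oracle, the plan the optimizer actually chooses can be inspected; a minimal sketch reusing the ACCT_MSTR table from earlier (DBMS_XPLAN.DISPLAY is Oracle's standard plan-formatting function):
EXPLAIN PLAN FOR
SELECT * FROM ACCT_MSTR WHERE ACCT_NO = 'SB9';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);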
The process of decomposition of a relation R into a set of relations R1, R2, ..., Rn is based on
identifying attributes and using them as the basis of decomposition, such that
R = R1 ∪ R2 ∪ ... ∪ Rn.
This is the process of dividing one table into multiple tables using the projection operator.
We may decompose tables into vertical segments. Vertical fragmentation is done with the help of
the projection operator: by taking projections of the original table we can create multiple
vertically fragmented tables, as shown in the example below.
Q.] Explain decomposition?
If a relation is not in the required normal form and we wish the relation to be normalized so that
some of the anomalies (like insert, update or delete anomalies) can be eliminated, it is necessary
to decompose the relation into two or more relations.
Original table
Eno Ename Class
1 Mahesh BE
2 Yogesh SE
3 Amit TE
Vertically Decomposed Tables
Eno Ename
1 Mahesh
2 Yogesh
3 Amit
Eno Class
1 BE
2 SE
3 TE
Q.] Explain the desirable Properties of decomposition:
The main properties of decomposition are as listed below,
a. Lossless-join decomposition
b. Dependency preservation
c. Lack of redundancy (Repetition of information)
1) Lossless join decomposition
It is clear that the decomposition must be lossless so that we do not lose any information from the
relation that is decomposed. A lossless join decomposition ensures that we never get the situation
where false tuples are generated in a relation: for every value of the join attributes there should
be a unique tuple in one of the relations. For the following table, the steps below give a lossless
decomposition.
Dept_Id Dname Stud_Id Sname Location
10 Development 1 Sushant Mahim
20 Teaching 2 Snehal Vashi
30 HR 3 Pratiksha Warli
20 Admin 4 Supraja Dadar
a.Let R1 and R2 form decomposition of relation R.
b.Decompose the relation schema Department-Student into
Department-schema = (Dept_Id, Dname)
Student-schema = (Stud_id, Sname, Location)
c. The attributes in common must be a key for one of the relations for the decomposition to be
lossless, and R1 ∩ R2 ≠ Φ (the common attribute set must not be empty).
Note: After a lossless decomposition, you are joining a primary key and a foreign key of table.
For the above table, we have the following decomposed tables:
Student
Stud_Id Dept_Id Sname Location
1 10 Sushant Mahim
2 20 Snehal Vashi
3 30 Pratiksha Worli
4 20 Supraja Dadar
Department
Dept_Id Dname
10 Development
20 Teaching
30 HR
2.) Dependency preservation
Dependency preservation is another important requirement, since a dependency is a very
important constraint on the database. As a result of database updates, illegal relations should
not be created. Hence, our design should allow us to check updates without computing natural
joins.
If X → Y holds, then we know that the two (sets of) attributes are closely related or functionally
dependent, and it would be useful if both attributes are in the same relation so that the
dependency can be checked easily. This can be done by preserving functional dependencies in
the decomposed relations.
Example:
Student Schema = (Stud_Id, Stud_Name, Dept_Id, Dname, Location, Subj_Id, SubjName)
Student-Department Schema = (Stud_Id → Dept_Id, Dname, Location)
Student-Subject Schema = (Stud_Id → Subj_Id, SubjName)
3) Lack of Redundancy ( Repetition of information)
The decomposition that we have done should not suffer from any repetition-of-information
problem.
For example, STUDENT and SECTION data are separated into distinct relations. Thus we do not
have to repeat STUDENT data for each SECTION, and if a single SECTION has several STUDENTS,
we do not have to repeat the SECTION data for each STUDENT.
It is desirable not to have any redundancy in the database. This property may be achieved by the
normalization process.
Q.] What is a Lossy- join decomposition?
We decomposed a relation intuitively, but we still need a better basis for deciding decompositions,
since intuition may not always be correct. A careless decomposition may lead to problems such as
loss of information.
Department-Student Schema = (Dept_Id, Dname, Stud_Id, Sname, Location)
Dept_Id Dname Stud_Id Sname Location
10 Development 1 Sushant Mahim
20 Teaching 2 Snehal Vashi
30 HR 3 Pratiksha Warli
20 Admin 4 Supraja Dadar
Suppose we decompose the above relation into two relations 'Department' and 'Student' as
follows:
Schema 1: Department Schema contains (Dept_Id, Dname)
Dept_Id Dname
10 Development
20 Teaching
30 HR
Schema 2: Student Schema contains (Stud_Id, Sname, Location)
Stud_Id Sname Location
1 Sushant Mahim
2 Snehal Vashi
3 Pratiksha Warli
4 Supraja Dadar
All the information that was in the relation 'Department-Student' appears to be still available in
the decomposed relations, but this is not so. Suppose we needed to join the 'Department' and
'Student' tables: this join becomes a lossy join decomposition.
As R1 ∩ R2 = Φ
there is no column common between them, therefore the join is not possible. A lossless
decomposition is one which guarantees that the join will result in exactly the same relation as
was decomposed.
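To see the loss concretely: since the two schemas share no common column, any join of Department and Student degenerates into a cross product, e.g. in SQL:
SELECT * FROM Department, Student; -- 3 x 4 = 12 rows, from which the original 4-row relation cannot be recovered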
Q.] Explain Functional Dependency:
The concept of functional dependency was given by E. F. Codd and underlies the normalization
process. This concept is used to define the various normal forms.
A functional dependency is a type of constraint that exists between attributes of a relation.
A functional dependency (FD) determines the set of values of an attribute based on another
attribute.
It is denoted by (→).
An FD X → Y can be read as "Y is functionally dependent on X" or "X determines Y", for example:
Ecode → Ename
We say column Y is functionally dependent on column X if the data value in column Y is
determined by the data value in column X.
an FD X  Y essentially says that if two tuples agree on the values in attributes X, they must
also agree on the values in attributes Y.
Functional Dependency provides a formal mechanism to express constraints between various
attributes of a relation.
For example, in the table below, for a different employee name a different ID is given, so the
Name column is functionally dependent on the employee ID column (ID → Name).
Employee Table
ID Name
Mah001 Mahesh
Q.] Explain the Types of Functional Dependencies
1. Full Functional Dependency:
A Functional Dependency A —> B is a full functional dependency if removal of any attributes
from A means that the dependency does not hold any more.
Example
{Emp_no, Project_no} → Hours
In the above example, Hours is fully functionally dependent on the combination of Emp_no and
Project_no together. The number of hours spent on the project by a particular employee cannot
be determined by the project number (Project_no) alone; it needs the employee number
(Emp_no) as well.
2. Partial Functional Dependency:
A partial dependency means that a non-key column depends on only some of the columns in the
composite primary key of a table. An FD A → B is a partial dependency if there is some attribute
X ⊂ A that can be removed from A and the dependency will still hold.
Example
{Emp_no, Project_no} → Ename, that is, Emp_no → Ename
In the above example, Ename is partially dependent on {Emp_no, Project_no} since employee
name (ename) can be determined using the employee id (Emp_no) alone even if project_no is
removed from the relation.
Note: For a table to be in 2nd Normal form there should be no partial dependencies.
Q.] Explain Transitive Dependency
This concept is relevant when there is redundancy in the database. If changing any non-key
column (a column other than the key columns) causes a change in another non-key column, you
may have a transitive dependency.
When one non key attribute is functionally dependent on another non key attribute then such a
dependency is called as transitive dependency.
 Non-key Attribute → Non-key Attribute
An FD X → Y in a relation R is a transitive dependency if there is a set of attributes Z that is
not a subset of any key of R, and both X → Z and Z → Y hold true.
For eg:
EMP_DEPT (Eno, Ename, Dnumber, DeptMgrNo)
Eno → DeptMgrNo is transitive.
The dependency of DeptMgrNo on the key attribute Eno is transitive, as DeptMgrNo depends on
Dnumber and Dnumber itself is dependent on Eno:
Eno → Dnumber and Dnumber → DeptMgrNo
Q.] Armstrong's Axioms - Closures of Functional Dependency
Given that A, B and C are sets of attributes in a relation R; one can derive several properties of
functional dependencies.
Axioms are nothing but rules of inference which provides a simple technique for reasoning
about functional dependencies.
1) Primary Rules
a. Subset Property (Axiom of Reflexivity): If Y is a subset of X, then X → Y
b. Augmentation (Axiom of Augmentation): If X → Y, then XZ → YZ
c. Transitivity (Axiom of Transitivity): If X → Y and Y → Z, then X → Z
2) Secondary Rules (based on the above rules)
a. Union: If X → Y and X → Z, then X → YZ
b. Decomposition: If X → YZ, then X → Y and X → Z
c. Pseudo-transitivity: If X → Y and YZ → W, then XZ → W
Example: Consider the relation R = (A, B, C, D, E, F) having the set of FDs:
A → B, A → C, BC → D, B → E, BC → F, AC → F
Calculate the following members of the closure:
1.) A → E  2.) BC → DF  3.) AC → D  4.) AC → DF
Solution:
1. A → E
As A → B and B → E, using the transitivity rule, A → E.
2. BC → DF
As BC → D ...(i) and BC → F ...(ii), using the union rule on (i) and (ii), BC → DF.
3. AC → D
As A → B ...(i) and BC → D ...(ii), using pseudo-transitivity, AC → D.
4. AC → DF
From solution (3), AC → D, and given AC → F; therefore, using the union rule, AC → DF.
Normalization
Normalization is a step-by-step decomposition of complex records into simple records.
Normalization results in tables that satisfy some constraints and are represented in a simple
manner. This process is also called canonical synthesis. It is a relational database design process
that avoids data redundancy by applying constraints on the data, thereby avoiding various data
anomalies.
For example, if the same information is repeated in multiple tables of a database, then there is a
chance that these tables will become inconsistent when data is updated, inserted or deleted. This
may lead to problems of data integrity. A normalized table is less vulnerable to such data
anomalies.
Normalization is a process of designing a consistent database by minimizing redundancy
and ensuring data integrity through decomposition which is lossless.
Q.] Goals/ Importance of Database Normalization
1. Ensures Data Integrity
Data integrity ensures the correctness of data stored within the database. It is achieved by
imposing integrity constraints. An integrity constraint is a rule, which restricts values present in
the database.
There are three integrity constraints:
(i) Entity constraints: The entity integrity rule states that the value of the primary key can never
be a null value. Because a primary key is used to identify a unique row in a relational table, its
value must always be specified and should never be unknown. The integrity rule requires that
insert, update and delete operations maintain the uniqueness and existence of all primary keys.
(ii) Domain Constraints: Only permissible values of an attribute are allowed in a relation.
(iii) Referential Integrity constraints: The referential integrity rule states that if a relational
table has a foreign key, then every value of the foreign key must either be null or match the
values in the relational table (referenced table) in which that foreign key is a primary key.
2.) Prevents Redundancy in data: A non-normalized database is vulnerable to data
anomalies, if it stores data redundantly. If data is stored in two locations, but the data is updated
in only one of the locations, then that data becomes inconsistent. A normalized database stores
non-primary key data in only one location.
3.) To avoid Data Anomaly: A non-normalized table can suffer from logical inconsistencies of
various types, and from data anomalies. A Relational database table should be designed in
such a way that it will avoid all data anomalies.
Q.] Which are the different anomalies related to normalization:-
1) Update anomaly: Same information can be present in multiple records of various relations;
updates to only one table may result in logical inconsistencies.
Example: each record in an "Emp_Salary" table might contain Emp_ID, Ename, Address and
Salary. A change of address for a particular employee will then potentially need to be applied to
multiple tables, such as the Employee table. If all the records are not updated, some tables may
be left in an inconsistent state.
2) An insertion anomaly: There is a possibility in which certain facts cannot be recorded at all
or they are not yet recorded.
Example
Consider a table Faculty (Faculty_ID, FName, Subject_Code, Subject, Class). We can add the
details of any faculty member who teaches a certain subject in a certain class, but we cannot
record the details of a new faculty member who has not yet been assigned to teach any subject
or class; the Subject and Class columns would have to be left empty initially.
3) Deletion Anomaly: The deletion of some data from a relation necessitating the deletion of
some unrelated data as well is called a deletion anomaly. If data is deleted from one table, all
relevant data in related tables must also be deleted, otherwise it will create a redundancy
problem.
Example
In the previous example, the table suffers from this type of anomaly. If a faculty member
temporarily ceases to be assigned a subject, we must delete the entire record on which that
faculty member appears.
Normal Forms
Normal forms are designed to logically address potential problems, such as inconsistencies and
redundancy, in the information stored in the database. A database is said to be in one of the
normal forms if it satisfies the rules required by that form as well as the previous forms; it also
will not suffer from any of the problems addressed by the form.
Q.] State and explain Types of Normal Forms
a. First normal form (1NF)
b. Second normal form (2NF)
c. Third normal form (3NF)
d. Boyce-Codd normal form (BCNF)
e. Fourth normal form (4NF)
f. Fifth Normal Form (5NF)
1.) First Normal Form
This is the simplest form of normalization; it simplifies each attribute in a relation. This normal
form was given by E. F. Codd (1970), with a later version by C. J. Date (2003).
A relation is in 1NF if every row contains exactly one value for each attribute. 1NF states that
attributes included in a relation must have atomic (simple, indivisible) values and every attribute
in a tuple must have a single value from the domain of that attribute.
In short, the rules of 1NF are:
Table columns should contain atomic data
There should not be any repeating group of data.
Example
• Consider a table 'Faculty' which has information about the faculty, the subjects and the number
of hours allotted to each subject they teach in a class.
Faculty
Faculty code Faculty Name Date of Birth Subject Hours
100 Yogesh 17/07/64 DSA 16
SS 8
IS 12
101 Amit 24/12/72 MIS 16
PM 8
IS 12
102 Omprakash 03/02/80 PWRC 8
PCOM 8
IP 16
103 Nitin 28/11/66 DT 10
PCOM 8
SS 8
104 Mahesh 01/01/86 DT 10
ADBMS 8
PWRC 8
The above table does not have atomic values in the 'Subject' column. Hence, it is called an un-
normalized table. Inserting, updating and deleting would be a problem in such a table. Hence it
has to be normalized.
For the above table to be in first normal form, each row should have atomic values. Hence a 'Sr.
No.' column is included in the table to uniquely identify each row.
1NF Table
Sr. No. Faculty code Faculty Name Date of Birth Subject Hours
1 100 Yogesh 17/07/64 DSA 16
2 100 Yogesh 17/07/64 SS 8
3 100 Yogesh 17/07/64 IS 12
4 101 Amit 24/12/72 MIS 16
5 101 Amit 24/12/72 PM 8
6 101 Amit 24/12/72 IS 12
7 102 Omprakash 03/02/80 PWRC 8
8 102 Omprakash 03/02/80 PCOM 8
9 102 Omprakash 03/02/80 IP 16
10 103 Nitin 28/11/66 DT 10
11 103 Nitin 28/11/66 PCOM 8
12 103 Nitin 28/11/66 SS 8
13 104 Mahesh 01/01/86 DT 10
14 104 Mahesh 01/01/86 ADBMS 8
15 104 Mahesh 01/01/86 PWRC 8
This table shows the same data as the previous table, but we have eliminated the repeating
groups. Hence the table is now said to be in First Normal Form (1NF).
But now we have introduced redundancy into the table. This can be eliminated using Second
Normal Form (2NF).
2.) Second Normal form
This normal form makes use of functional dependencies and tries to remove the problem of
redundant data that was introduced by 1NF. Therefore, before applying 2NF to a relation, it
needs to satisfy the 1NF condition.
A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on
the primary key of the relation.
OR
OR
A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on
the whole, and not just a part, of the primary key of the relation.
In short, 2NF means:
It should be in 1NF
There should not be any partial dependency
To make the relation in 2NF:-
a. Find and remove attributes that are related to only a part of the key or not related to key.
b. Group the removed attributes in another table.
c. Assign the new table a key that consists of that part of the old composite key.
d. If a relation is not in 2NF, it can be further normalized into a number of 2NF relations.
Let us consider the table we obtained after first normalization.
Sr. No. Faculty code Faculty Name Date of Birth Subject Hours
1 100 Yogesh 17/07/64 DSA 16
2 100 Yogesh 17/07/64 SS 8
3 100 Yogesh 17/07/64 IS 12
4 101 Amit 24/12/72 MIS 16
5 101 Amit 24/12/72 PM 8
6 101 Amit 24/12/72 IS 12
7 102 Omprakash 03/02/80 PWRC 8
8 102 Omprakash 03/02/80 PCOM 8
9 102 Omprakash 03/02/80 IP 16
10 103 Nitin 28/11/66 DT 10
11 103 Nitin 28/11/66 PCOM 8
12 103 Nitin 28/11/66 SS 8
13 104 Mahesh 01/01/86 DT 10
14 104 Mahesh 01/01/86 ADBMS 8
15 104 Mahesh 01/01/86 PWRC 8
While eliminating the repeating groups, we have introduced redundancy into the table. Faculty
Code, Name and Date of Birth are repeated since the same faculty is multi-skilled.
To eliminate this, let us split the table into two parts: one with the non-repeating groups and the
other with the repeating groups.
Faculty
Faculty code Faculty Name Date of Birth
100 Yogesh 17/07/64
101 Amit 24/12/72
102 Omprakash 03/02/80
103 Nitin 28/11/66
104 Mahesh 01/01/86
Sr. No Faculty code Subject Hours
1 100 DSA 16
2 100 SS 8
3 100 IS 12
4 101 MIS 16
5 101 PM 8
6 101 IS 12
7 102 PWRC 8
8 102 PCOM 8
9 102 IP 16
10 103 DT 10
11 103 PCOM 8
12 103 SS 8
13 104 DT 10
14 104 ADBMS 8
15 104 PWRC 8
Faculty Code is the only key needed to identify the faculty name and the date of birth. Hence,
Faculty code is the primary key in the first table and a foreign key in the second table.
Faculty code is repeated in the Subject table. Hence, we have to take 'Sr. No.' into account to
form a composite key in the Subject table. Now, Sr. No. and Faculty code can uniquely identify
each row in this table.
Hence, the relation is now in Second Normal Form.
3.) Third Normal Form
This normal form is used to minimize transitive redundancy. In order to remove the anomalies
that arise in Second Normal Form and to remove transitive dependencies, if any, we have to
perform third normalization.
A relation is in 3NF if it is in 2NF and no non-key attribute of the relation is transitively
dependent on the primary key. 3NF prohibits transitive dependencies.
In short, 3NF means:
1. It should be in 2NF
2. There should not be any transitive dependency
Example
Now let us see how to normalize the 'Subject' table obtained after 2NF.
Subject
Sr. No Faculty code Subject Hours
1 100 DSA 16
2 100 SS 8
3 100 IS 12
4 101 MIS 16
5 101 PM 8
6 101 IS 12
7 102 PWRC 8
8 102 PCOM 8
9 102 IP 16
10 103 DT 10
11 103 PCOM 8
12 103 SS 8
13 104 DT 10
14 104 ADBMS 8
15 104 PWRC 8
In this table, hours depend on the subject, and the subject depends on the Faculty code and Sr.
No. But hours is neither dependent on the faculty code nor on the Sr. No. Hence, there exists a
transitive dependency between Sr. No., Subject and Hours.
If a faculty code is deleted, due to the transitive dependency, information regarding the subject
and the hours allotted to it will be lost. For a table to be in Third Normal Form, transitive
dependencies must be eliminated. So, we need to decompose the table further to normalize it.
Fac_Sub
Sr. No Faculty code Subject
1 100 DSA
2 100 SS
3 100 IS
4 101 MIS
5 101 PM
6 101 IS
7 102 PWRC
8 102 PCOM
9 102 IP
10 103 DT
11 103 PCOM
12 103 SS
13 104 DT
14 104 ADBMS
15 104 PWRC
Sub_Hrs
Subject Hours
DSA 16
SS 8
IS 12
MIS 16
PM 8
PWRC 8
PCOM 8
IP 16
DT 10
ADBMS 8
After decomposing the 'Subject' table, we now have the 'Fac_Sub' and 'Sub_Hrs' tables
respectively.
Note: In most cases, third normal form is a sufficient level of decomposition. But some cases
require the design to be further normalized, up to the level of 4NF and 5NF.
4. BCNF Normal Form
BCNF is a more precise form of 3NF. The motivation for Boyce-Codd Normal Form (BCNF) is that
3NF does not satisfactorily handle the case of a relation possessing two or more composite or
overlapping candidate keys. A candidate key is a column (or combination of columns) in a table
which has the ability to become the primary key. A determinant is any attribute (simple or
composite) on which some other attribute is fully functionally dependent.
If a → b, then attribute 'a' is a determinant.
A relation R is said to be in BCNF if and only if every determinant is a candidate key.
For example:-
Soldiers are part of one or many units, and each unit is under the control of an officer.
SOLDIERID OFFICERID UNITID
1 A 1
2 A 1
3 B 2
Firstly, we identify the dependencies. There is a dependency between (SOLDIERID +
OFFICERID) and UNITID: a soldier and an officer imply their respective unit. But there is also
a dependency between UNITID and OFFICERID.
SOLDIERID → UNITID
UNITID → OFFICERID
SOLDIERID, OFFICERID → UNITID
This last dependency, however, is neither partial (dependence on part of a prime attribute) nor
transitive (dependence of a non-prime attribute on another non-prime attribute). What we have
is a table where a determinant (UNITID) is not a candidate key.
Thus we can convert the above to BCNF by dividing the table along the two dependencies, the
solution being as follows:
Candidate key (SOLDIERID) and SOLDIERID → UNITID:
SOLDIERID UNITID
1 1
2 1
3 2
Candidate key (UNITID) and UNITID → OFFICERID:
UNITID OFFICERID
1 A
2 B
The above tables are now in BCNF.
Q.] Explain Multivalued Dependency and 4th NF
A multivalued dependency is a relationship which follows a cross-product pattern.
A multivalued dependency, defined by X →→ Y, is said to hold for a relation R(X, Y, Z) if for a
given set of values of X there is a set of associated values of attribute Y, and the Y values
depend only on the X values and have no dependence on the set of attributes Z.
Multivalued dependencies occur when the presence of one or more rows in a table implies the
presence of one or more other rows in that same table.
For example:-
Imagine a car company that manufactures many models of car, but always makes both red and
blue colors of each model. If you have a table that contains the model name, color and year of
each car the company manufactures, there is a multivalued dependency in that table. If there is
a row for a certain model name and year in blue, there must also be a similar row corresponding
to the red version of that same car.
5. 4th Normal Form
This normal form was given by Ronald Fagin (1977).
Fourth Normal Form tries to remove multivalued dependencies among attributes.
A relation is said to be in fourth normal form if each table contains no more than one
multivalued dependency per key attribute.
A Boyce Codd normal form relation is in fourth normal form if :-
 there is no multi value dependency in the relation or
 there are multi value dependency but the attributes, which are multi value dependent on
a specific attribute, are dependent between themselves.
Or
 If a relation scheme is in BCNF and at least one of its keys consists of a single attribute ,
it is also in 4th NF
Example
Seminar Faculty Topic
DBP-1 Brown Database Principles
DAT-2 Brown Database Advanced Techniques
DBP-1 Brown Data Modeling Techniques
DBP-1 Robert Database Principles
DBP-1 Robert Data Modeling Techniques
DAT-2 Maria Database Advanced Techniques
In the above example, the same topic is being taught in a seminar by more than one faculty, and
each faculty takes up different topics in the same seminar. Hence, topic names are repeated
several times. This is an example of multivalued dependency.
To eliminate the multivalued dependency, split the table such that there is no multivalued
dependency:
Seminar Topic
DBP-1 Database Principles
DAT-2 Database Advanced Techniques
DBP-1 Data Modeling Techniques
Seminar Faculty
DBP-1 Brown
DAT-2 Brown
DBP-1 Robert
DAT-2 Maria
Q.] Explain Join dependency and 5th
NF
A table T is subject to a join dependency if T can always be recreated by joining multiple
tables, each having a subset of the attributes of T. If one of the tables in the join has all the
attributes of the table T, the join dependency is called trivial.
The join dependency plays an important role in 5NF normalization, also known as
project-join normal form.
6. 5th Normal Form
This normal form was given by Ronald Fagin (1979). It decomposes relations to
reduce redundancy. A relation is said to be in 5NF if and only if it is in 4NF and every join
dependency in it is implied by the candidate keys.
Fifth normal form deals with cases where information can be reconstructed from smaller pieces
of information that can be maintained with less redundancy. Fifth normal form mainly
emphasizes lossless decomposition.
A relation is in 5NF if every join dependency in the relation is implied by the keys of the relation.
This implies that relations that have been decomposed in previous normal forms can be recombined
via natural joins to recreate the original relation. If a relation is in 3NF and each of its keys consists
of a single attribute, it is also in 5NF.
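Since 5NF emphasizes lossless decomposition, here is a small self-contained sketch (the helper names and dict-based rows are hypothetical) of the project-then-join test: decompose a relation into projections, natural-join them back, and check that exactly the original rows reappear:

```python
def project(rows, attrs):
    """Projection: keep only the named attributes, eliminating duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(r1, r2):
    """Natural join of two projections (sets of attribute/value tuples)."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1 if a in d2):  # agree on common attrs
                out.add(tuple(sorted({**d1, **d2}.items())))
    return out

# Splitting on Seminar and joining back reproduces exactly the original
# rows here, so this particular decomposition is lossless.
rows = [
    {"Seminar": "DBP-1", "Faculty": "Brown",  "Topic": "Database Principles"},
    {"Seminar": "DBP-1", "Faculty": "Robert", "Topic": "Database Principles"},
]
joined = natural_join(project(rows, ["Seminar", "Topic"]),
                      project(rows, ["Seminar", "Faculty"]))
original = {tuple(sorted(r.items())) for r in rows}
print(joined == original)  # True: no spurious tuples were introduced
```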
TRANSACTIONS AND SCHEDULES
A transaction is seen by the DBMS as a series, or list, of actions. The actions that can be
executed by a transaction include:
1. reads: RT(O) :- transaction T is reading database object O.
2. writes: WT(O) :- transaction T is writing database object O.
3. commit (i.e., complete successfully).
4. abort (i.e., terminate and undo all the actions carried out thus far).
When the transaction T is clear from the context, we can omit the subscript.
Schedule:
A schedule is a list of actions (reading, writing, aborting, or committing) from a set of
transactions, and the order in which two actions of a transaction T appear in a schedule must
be the same as the order in which they appear in T.
Intuitively, a schedule represents an actual or potential execution sequence. For example, the
following schedule shows an execution order for actions of two transactions T1 and T2. The
schedule does not contain an abort or commit action for either transaction.

T1        T2
R(A)
W(A)
          R(B)
          W(B)
R(C)
W(C)
A Schedule Involving Two Transactions

Complete Schedule:
A schedule that contains either an abort or a commit for each transaction whose actions are
listed in it is called a complete schedule. A complete schedule must contain all the actions of
every transaction that appears in it.

T1        T2
R(A)
W(A)
          R(B)
          W(B)
          commit
R(C)
W(C)
commit
A Complete Schedule
Serial Schedule:
If the actions of different transactions are not interleaved, that is, transactions are executed
from start to finish one by one, we call the schedule a serial schedule. The two serial schedules
below execute the same transactions in the orders T1 then T2, and T2 then T1.

T1        T2
R(A)
W(A)
R(C)
W(C)
commit
          R(B)
          W(B)
          commit

T1        T2
          R(B)
          W(B)
          commit
R(A)
W(A)
R(C)
W(C)
commit
Serial Schedules

CONCURRENT EXECUTION OF TRANSACTIONS
The DBMS interleaves the actions of different transactions to improve performance, in terms
of increased throughput or improved response times for short transactions, but not all
interleavings should be allowed.

Motivation for Concurrent Execution
Ensuring transaction isolation while permitting concurrent execution is difficult, but is
necessary for the following performance reasons.
1. While one transaction is waiting for a page to be read in from disk, the CPU can process
another transaction. Overlapping I/O and CPU activity reduces the amount of time disks and
processors are idle and increases system throughput (the average number of transactions
completed in a given time).
2. Interleaved execution of a short transaction with a long transaction usually allows the short
transaction to complete quickly. In serial execution, a short transaction could get stuck behind
a long transaction, leading to unpredictable delays in response time, the average time taken to
complete a transaction.

CONCURRENCY CONTROL
Serializability:
Definition: A serializable schedule over a set S of transactions is a schedule whose effect on
any consistent database instance is guaranteed to be identical to that of some complete serial
schedule over the set of committed transactions in S.

T1        T2
R(A)
W(A)
          R(B)
          W(B)
          commit
R(C)
W(C)
commit
A Serializable Schedule
Example: The given schedule is serializable because its effect on any consistent database
instance is the same as executing T1 followed by T2, or T2 followed by T1; that is, its effect
is the same as that of some serial order. The database instance that results from executing the
given schedule is identical to the database instance that results from executing the transactions
in some serial order.

Some important points:
1. Executing the transactions serially in different orders may produce different results, but all
are presumed to be acceptable; the DBMS makes no guarantees about which of them will be
the outcome of an interleaved execution.
2. If a transaction computes a value and prints it to the screen, this is an 'effect' that is not
directly captured in the state of the database. We will assume that all such values are also
written into the database, for simplicity.

Some Anomalies Associated with Interleaved Execution
Anomalies leave the database inconsistent when a schedule involves conflicting actions. Two
actions on the same data object conflict if:
1. the schedule involves at least two transactions,
2. both transactions use the same data object, and
3. one of them modifies the data object (i.e., has a write operation).
The three anomalous situations can be described in terms of when the actions of two
transactions T1 and T2 conflict with each other:
1. Reading Uncommitted Data (WR Conflicts)
2. Unrepeatable Reads (RW Conflicts)
3. Overwriting Uncommitted Data (WW Conflicts)

1. Reading Uncommitted Data (WR Conflicts)
Consider two transactions T1 and T2. A transaction T2 could read a database object A that has
been modified by another transaction T1, which has not yet committed. Such a read is called
a dirty read.
Example:
 Consider two transactions T1 and T2, each of which, run alone, preserves database
consistency: T1 transfers Rs 1000 from A to B, and T2 increments both A and B by 6 percent
(e.g., annual interest is deposited into these two accounts).
 Suppose that their actions are interleaved so that the account transfer program T1 deducts
Rs 1000 from account A, then the interest deposit program T2 reads the current values of
accounts A and B and adds 6 percent interest to each, and then the account transfer program
credits Rs 1000 to account B.
 The result of this schedule is different from any result that we would get by running one of
the two transactions first and then the other. The problem can be traced to the fact that the
value of A written by T1 is read by T2 before T1 has completed all its changes.
Note: Although a transaction must leave a database in a consistent state after it completes, it
is not required to keep the database consistent while it is still in progress. Such a requirement
would be too restrictive: to transfer money from one account to another, a transaction must
debit one account, temporarily leaving the database inconsistent, and then credit the second
account, restoring consistency again.
2. Unrepeatable Reads (RW Conflicts)
 A transaction T2 could change the value of an object A that has been read by a transaction
T1, while T1 is still in progress.
 This situation causes two problems:
(1) If T1 tries to read the value of A again, it will get a different result, even though it has not
modified A in the meantime. This situation could not arise in a serial execution of two
transactions; it is called an unrepeatable read.
(2) Suppose that A is the number of available copies of a book. A transaction that places an
order first reads A, checks that it is greater than 0, and then decrements it. Transaction T1
reads A and sees the value 1. Transaction T2 also reads A and sees the value 1, decrements A
to 0, and commits. Transaction T1 then tries to decrement A and gets an error (violation of an
integrity constraint).

T1        T2
R(A)
          R(A)
          W(A)
          commit
W(A)
commit
Unrepeatable Read

3. Overwriting Uncommitted Data (WW Conflicts)
 A transaction T2 could overwrite the value of an object A, which has already been modified
by a transaction T1, while T1 is still in progress.
 Example:
o Suppose that Harry and Larry are two employees, and their salaries must be kept equal.
o Transaction T1 sets their salaries to $1,000 and transaction T2 sets their salaries to $2,000.
o If we execute these in the serial order T1 followed by T2, both receive the salary $2,000;
the serial order T2 followed by T1 gives each the salary $1,000.
o Either of these is acceptable from a consistency standpoint (although Harry and Larry may
prefer a higher salary!).
o Here, neither transaction reads a salary value before writing it; such a write is called a
blind write.
o Now, consider the following interleaving of the actions of T1 and T2:
 T1 sets Harry's salary to $1,000,
 T2 sets Larry's salary to $2,000,
 T1 sets Larry's salary to $1,000,
 and finally T2 sets Harry's salary to $2,000.
o The result is not identical to the result of either of the two possible serial executions, and
the interleaved schedule is therefore not serializable. It violates the desired consistency
criterion that the two salaries must be equal.
o The problem is that we have lost updates: T1's write of Harry's salary is overwritten by
T2, and T2's write of Larry's salary is overwritten by T1, so the salaries end up unequal.

T1              T2
W(H) = 1,000
                W(L) = 2,000
W(L) = 1,000
commit
                W(H) = 2,000
                commit
Overwriting Uncommitted Data

Schedules Involving Aborted Transactions
Unrecoverable Schedule:
Example:
Suppose that (1) an account transfer program T1 deducts $100 from account A, then (2) an
interest deposit program T2 reads the current values of accounts A and B, adds 6 percent
interest to each, and then commits, and then (3) T1 is aborted. Now T2 has read a value for A
that should never have been there!
If T2 had not yet committed, we could deal with the situation by cascading the abort of T1 and
also aborting T2; this process would recursively abort any transaction that read data written
by T2, and so on. But T2 has already committed, and so we cannot undo its actions! We say
that such a schedule is unrecoverable.

Recoverable Schedule:
A recoverable schedule is one in which transactions commit only after (and if!) all transactions
whose changes they read commit. If transactions read only the changes of committed
transactions, not only is the schedule recoverable, but also aborting a transaction can be
accomplished without cascading the abort to other transactions. Such a schedule is said to
avoid cascading aborts.

Problem in undoing the actions of a transaction
 Consider two transactions T1 and T2, and refer to the unrecoverable schedule above.
Suppose that transaction T2 overwrites the value of an object A that has been modified by
transaction T1, while T1 is still in progress, and T1 subsequently aborts.
 All of T1's changes to database objects are undone by restoring the value of any object that
it modified to the value of the object before T1's changes.
 When T1 is aborted and its changes are undone in this manner, T2's changes are lost as
well, even if T2 decides to commit.
 For example, if A originally had the value 5, then was changed by T1 to 6, and by T2 to 7,
and T1 now aborts, the value of A becomes 5 again.
 Even if T2 commits, its change to A is inadvertently lost.
LOCK-BASED CONCURRENCY CONTROL
A DBMS must be able to ensure that only serializable, recoverable schedules are allowed and
that no actions of committed transactions are lost while undoing aborted transactions.
 A DBMS typically uses a locking protocol to achieve this.
 A locking protocol is a set of rules to be followed by each transaction (and enforced by the
DBMS) in order to ensure that, even though actions of several transactions might be
interleaved, the net effect is identical to executing all transactions in some serial order.

Strict Two-Phase Locking (Strict 2PL)
 The most widely used locking protocol, called Strict Two-Phase Locking, or Strict 2PL,
has two rules:
(1) If a transaction T wants to read (respectively, modify) an object, it first requests a shared
(respectively, exclusive) lock on the object.
(2) All locks held by a transaction are released when the transaction is completed.
 A transaction that has an exclusive lock can also read the object; an additional shared lock
is not required.
 A transaction that requests a lock is suspended until the DBMS is able to grant it the
requested lock.
 The DBMS keeps track of the locks it has granted and ensures that if a transaction holds an
exclusive lock on an object, no other transaction holds a shared or exclusive lock on the same
object.
 Requests to acquire and release locks can be automatically inserted into transactions by the
DBMS; users need not worry about these details.
 However, these rules of Strict 2PL reduce concurrency.
 To increase concurrency without sacrificing serializability, we can relax the second rule of
Strict 2PL to: a transaction cannot request additional locks once it releases any lock.
 That is, the relaxed protocol (plain 2PL) allows transactions to release locks before the end,
that is, before the commit or abort action.
 Thus, every transaction has a 'growing' phase in which it acquires locks, followed by a
'shrinking' phase in which it releases locks.

Safe interleaving
The locking protocol allows only 'safe' interleavings of transactions:
(1) If two transactions access completely independent parts of the database, they will be able
to concurrently obtain the locks that they need and proceed on their ways.
(2) If two transactions access the same object, and one of them wants to modify it, their actions
are effectively ordered serially: all actions of one of these transactions (the one that gets the
lock on the common object first) are completed before (this lock is released and) the other
transaction can proceed.
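As a rough illustration of the Strict 2PL discipline, here is a minimal sketch (all class and method names are hypothetical; a conflicting request raises instead of queueing, and there is no deadlock handling) in which a transaction locks before every action and releases everything only at completion:

```python
class LockManager:
    """Toy lock table for the sketch: obj -> [mode, set of holder tids]."""
    def __init__(self, db):
        self.db = db                      # the "database": obj -> value
        self.locks = {}

    def acquire(self, tid, obj, mode):
        entry = self.locks.setdefault(obj, ["S", set()])
        lock_mode, holders = entry
        if not holders or holders == {tid} or (mode == "S" and lock_mode == "S"):
            if mode == "X" or not holders:
                entry[0] = mode           # set (or upgrade to) the stronger mode
            holders.add(tid)
        else:
            raise RuntimeError(f"T{tid} must wait for a lock on {obj}")

    def release(self, tid, obj):
        self.locks[obj][1].discard(tid)


class Strict2PLTransaction:
    """Rule (1): lock before every read/write; rule (2): release only at the end."""
    def __init__(self, tid, lm):
        self.tid, self.lm, self.held = tid, lm, set()

    def read(self, obj):
        self.lm.acquire(self.tid, obj, mode="S")   # shared lock before reading
        self.held.add(obj)
        return self.lm.db[obj]

    def write(self, obj, value):
        self.lm.acquire(self.tid, obj, mode="X")   # exclusive lock before writing
        self.held.add(obj)
        self.lm.db[obj] = value

    def commit(self):
        for obj in self.held:                      # all locks released together
            self.lm.release(self.tid, obj)
        self.held.clear()


db = {"A": 10, "B": 100}
lm = LockManager(db)
t1 = Strict2PLTransaction(1, lm)
t1.write("A", t1.read("A") + 5)                    # upgrade S -> X on A
t1.commit()
print(db["A"])                                     # 15
```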
ST(O): transaction T requests a shared lock on object O.
XT(O): transaction T requests an exclusive lock on object O.

Schedule illustrating Strict 2PL
Consider the following schedule:

T1        T2
R(A)
W(A)
          R(A)
          W(A)
          R(B)
          W(B)
          commit
R(B)
W(B)
commit

 This interleaving could result in a state that cannot result from any serial execution of the
two transactions. For instance, (1) T1 could change A from 10 to 20, (2) then T2 (which reads
the value 20 for A) could change B from 100 to 200, and (3) then T1 would read the value 200
for B.
 If run serially, either T1 or T2 would execute first and read the values 10 for A and 100
for B.
 Clearly, the interleaved execution is not equivalent to either serial execution.
 If the Strict 2PL protocol is used, the above interleaving is disallowed.
 Assuming that the transactions proceed at the same relative speed as before, T1 would
obtain an exclusive lock on A first and then read and write A.
 Then, T2 would request a lock on A. However, this request cannot be granted until T1
releases its exclusive lock on A, and the DBMS therefore suspends T2.
 T1 now proceeds to obtain an exclusive lock on B, reads and writes B, then finally commits,
at which time its locks are released.
 T2's lock request is now granted, and it proceeds.
 In this example the locking protocol results in a serial execution of the two transactions. In
general, however, the actions of different transactions could be interleaved.

T1        T2
X(A)
R(A)
W(A)
X(B)
R(B)
W(B)
commit
          X(A)
          R(A)
          W(A)
          X(B)
          R(B)
          W(B)
          commit
Schedule Illustrating Strict 2PL with Serial Execution
The next schedule illustrates Strict 2PL with interleaving:

T1        T2
S(A)
R(A)
          S(A)
          R(A)
          X(B)
          R(B)
          W(B)
          commit
X(C)
R(C)
W(C)
commit

 In the above schedule, transaction T1 first requests a shared lock on data object A. Then
transaction T2 requests a shared lock on data object A.
 Since we can have two shared locks on the same database object at the same time, the
DBMS grants this request. Then T2 requests an exclusive lock on data object B.
 T2 reads and writes B, then finally commits, at which time its locks are released.
 Now T1 requests an exclusive lock on data object C. The DBMS grants this request.
 T1 reads and writes C, then finally commits, at which time its locks are released.
 In this case there are no conflicting actions, so this interleaving is allowed.

LOCK-BASED CONCURRENCY CONTROL REVISITED
Conflict serializability
A schedule is conflict serializable if it is conflict equivalent to some serial schedule.

Conflict Equivalent
 Two schedules are said to be conflict equivalent if they involve the (same set of) actions of
the same transactions and they order every pair of conflicting actions of two committed
transactions in the same way.
 Two actions conflict if they operate on the same data object and at least one of them is a
write.
 The outcome of a schedule depends only on the order of conflicting operations; we can
interchange any pair of non-conflicting operations without altering the effect of the schedule
on the database.
 If two schedules are conflict equivalent, it is easy to see that they have the same effect on
a database.
 Indeed, because they order all pairs of conflicting operations in the same way, we can obtain
one of them from the other by repeatedly swapping pairs of non-conflicting actions, that is,
by swapping pairs of actions whose relative order does not alter the outcome.
 Every conflict serializable schedule is serializable, if we assume that the set of items in the
database does not grow or shrink; that is, values can be modified but items are not added or
deleted.
 However, some serializable schedules are not conflict serializable. For example, a schedule
can be equivalent to executing the transactions serially in the order T1, T2, T3 and yet not be
conflict equivalent to that serial schedule, because the writes of T1 and T2 are ordered
differently.
Example: Consider the following schedule:

T1        T2
          W(X)
R(Y)
          R(Y)
          commit
R(X)
commit
Conflict Serializability

To check whether this schedule is conflict serializable, we check whether it is conflict
equivalent to some serial schedule of the same transactions. Consider the serial schedule
T2 followed by T1:

T1        T2
          W(X)
          R(Y)
          commit
R(Y)
R(X)
commit
Serial schedule of the given schedule
In this case the conflicting actions are the actions on data object X, and they are ordered in the
same way in both schedules. So the two schedules are conflict equivalent, and thus the given
schedule is conflict serializable.

Precedence graph:
 A schedule S is conflict serializable if and only if its precedence graph is acyclic.
 Strict 2PL ensures that the precedence graph for any schedule that it allows is acyclic.
 It is useful to capture all potential conflicts between the transactions in a schedule in a
precedence graph, also called a serializability graph.
 The precedence graph for a schedule S contains:
o a node for each committed transaction in S;
o an arc from Ti to Tj if an action of Ti precedes and conflicts with one of Tj's actions.
 If the precedence graph contains a cycle, the schedule is not conflict serializable.
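A short sketch (hypothetical helper names; actions are given as (transaction, operation, object) triples) of building a precedence graph and testing it for a cycle, which is the conflict-serializability test just described:

```python
def precedence_graph(schedule):
    """Edges Ti -> Tj for every pair of conflicting actions: same object,
    different transactions, and at least one of the two actions a write."""
    edges = set()
    for i, (t1, op1, obj1) in enumerate(schedule):
        for t2, op2, obj2 in schedule[i + 1:]:
            if t1 != t2 and obj1 == obj2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a cycle in the precedence graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in graph.get(node, ()):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True
        on_stack.discard(node)
        return False

    return any(n not in visited and dfs(n) for n in list(graph))

# The example schedule above: W2(X), R1(Y), R2(Y), R1(X)
schedule = [("T2", "W", "X"), ("T1", "R", "Y"),
            ("T2", "R", "Y"), ("T1", "R", "X")]
edges = precedence_graph(schedule)
print(edges)             # {('T2', 'T1')}: only W2(X) before R1(X) conflicts
print(has_cycle(edges))  # False, so the schedule is conflict serializable
```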
Strict Schedule:
 A schedule is said to be strict if a value written by a transaction T is not read or overwritten
by other transactions until T either aborts or commits.
 Strict schedules are recoverable, do not require cascading aborts, and actions of aborted
transactions can be undone by restoring the original values of modified objects.
 Strict 2PL improves upon 2PL by guaranteeing that every allowed schedule is strict, in
addition to being conflict serializable.
 The reason is that when a transaction T writes an object under Strict 2PL, it holds the
(exclusive) lock until it commits or aborts. Thus, no other transaction can see or modify this
object until T is complete.

View Serializability
 A schedule is view serializable if it is view equivalent to some serial schedule.
 View Equivalent:
o Two schedules S1 and S2 over the same set of transactions (i.e., any transaction that appears
in either S1 or S2 must also appear in the other) are view equivalent if they satisfy the
following conditions:
i. If Ti reads the initial value of object A in S1, it must also read the initial value of A in S2.
ii. If Ti reads a value of A written by Tj in S1, it must also read the value of A written by Tj
in S2.
iii. For each data object A, the transaction (if any) that performs the final write on A in S1
must also perform the final write on A in S2.
 Every conflict serializable schedule is view serializable, although the converse is not true.
 It can be shown that any view serializable schedule that is not conflict serializable contains
one or more blind writes.

LOCK MANAGEMENT
 The part of the DBMS that keeps track of the locks issued to transactions is called the lock
manager.
 The lock manager maintains a lock table, which is a hash table with the data object identifier
as the key. The DBMS also maintains a descriptive entry for each transaction in a transaction
table; among other things, the entry contains a pointer to a list of locks held by the transaction.
 A lock table entry for an object (which can be a page, a record, and so on, depending on
the DBMS) contains the following information: the number of transactions currently holding
a lock on the object (this can be more than one if the object is locked in shared mode), the
nature of the lock (shared or exclusive), and a pointer to a queue of lock requests.
Implementing Lock and Unlock Requests
 According to the Strict 2PL protocol, before a transaction T reads or writes a database
object O, it must obtain a shared or exclusive lock on O and must hold on to the lock until it
commits or aborts.
 When a transaction needs a lock on an object, it issues a lock request to the lock manager:
(1) If a shared lock is requested, the queue of requests is empty, and the object is not currently
locked in exclusive mode, the lock manager grants the lock and updates the lock table entry
for the object (indicating that the object is locked in shared mode and incrementing the number
of transactions holding a lock by one).
(2) If an exclusive lock is requested, and no transaction currently holds a lock on the object
(which also implies the queue of requests is empty), the lock manager grants the lock and
updates the lock table entry.
(3) Otherwise, the requested lock cannot be immediately granted, and the lock request is added
to the queue of lock requests for this object. The transaction requesting the lock is suspended.
 When a transaction aborts or commits, it releases all its locks. When a lock on an object is
released, the lock manager updates the lock table entry for the object and examines the lock
request at the head of the queue for this object.
 If this request can now be granted, the transaction that made the request is woken up and
given the lock. Indeed, if there are several requests for a shared lock on the object at the front
of the queue, all of these requests can now be granted together.
Note:
 If T1 has a shared lock on O and T2 requests an exclusive lock, T2's request is queued.
Now, if T3 requests a shared lock, its request enters the queue behind that of T2, even though
the requested lock is compatible with the lock held by T1.
 This rule ensures that T2 does not starve, that is, wait indefinitely while a stream of other
transactions acquires shared locks and thereby prevents T2 from getting the exclusive lock that
it is waiting for.

Atomicity of Locking and Unlocking
 The implementation of lock and unlock commands must ensure that these are atomic
operations.
 To ensure atomicity of these operations when several instances of the lock manager code
can execute concurrently, access to the lock table has to be guarded by an operating system
synchronization mechanism such as a semaphore.
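The grant/queue decision in rules (1)-(3) above can be sketched directly (hypothetical names; a real lock manager would also guard this code with a semaphore or latch, as discussed next):

```python
from collections import deque

class LockTableEntry:
    """One lock table entry: lock mode, current holders, and a FIFO queue
    of waiting requests, following rules (1)-(3) above."""
    def __init__(self):
        self.mode = None              # "S", "X", or None when unlocked
        self.holders = set()
        self.queue = deque()          # waiting (tid, mode) requests

    def request(self, tid, mode):
        """Grant immediately when the rules allow it; otherwise queue."""
        if mode == "S" and not self.queue and self.mode != "X":
            self.mode = "S"                        # rule (1)
            self.holders.add(tid)
            return True
        if mode == "X" and not self.holders:       # rule (2)
            self.mode = "X"
            self.holders.add(tid)
            return True
        self.queue.append((tid, mode))             # rule (3): suspend
        return False

    def release(self, tid):
        """On release, examine the head of the queue and wake what can run."""
        self.holders.discard(tid)
        granted = []
        if self.holders:
            return granted                         # others still hold the lock
        self.mode = None
        while self.queue:
            ntid, nmode = self.queue[0]
            if nmode == "X":
                if not self.holders:               # grant a single exclusive
                    self.queue.popleft()
                    self.mode = "X"
                    self.holders.add(ntid)
                    granted.append(ntid)
                break
            self.queue.popleft()                   # a run of shared requests
            self.mode = "S"                        # can be granted together
            self.holders.add(ntid)
            granted.append(ntid)
        return granted

entry = LockTableEntry()
entry.request(1, "S")     # granted
entry.request(2, "X")     # queued behind T1's shared lock
entry.request(3, "S")     # queued behind T2, so T2 does not starve
print(entry.release(1))   # [2]: T2's exclusive request is granted first
```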
 Suppose that a transaction requests an exclusive lock. The lock manager checks and finds
that no other transaction holds a lock on the object and therefore decides to grant the request.
 But in the meantime, another transaction might have requested and received a conflicting
lock!
 To prevent this, the entire sequence of actions in a lock request call (checking to see if the
request can be granted, updating the lock table, etc.) must be implemented as an atomic
operation.

Lock Conversions
 The DBMS maintains a transaction table, which contains (among other things) a list of the
locks currently held by a transaction.
 This list can be checked before requesting a lock, to ensure that the same transaction does
not request the same lock twice.
 However, a transaction may need to acquire an exclusive lock on an object for which it
already holds a shared lock. Such a lock upgrade request is handled specially, by granting the
write lock immediately if no other transaction holds a shared lock on the object, and inserting
the request at the front of the queue otherwise.
 The rationale for favoring the transaction is that it already holds a shared lock on the object;
queuing it behind another transaction that wants an exclusive lock on the same object would
cause both transactions to wait for each other and therefore be blocked forever.
 Lock upgrades can lead to deadlocks caused by conflicting upgrade requests.
 For example, if two transactions that hold a shared lock on an object both request an upgrade
to an exclusive lock, this leads to a deadlock because each transaction is waiting for the other
to release its shared lock.
 A better approach is to avoid the need for lock upgrades altogether by obtaining exclusive
locks initially and downgrading to a shared lock once it is clear that this is sufficient.
 For example, in an SQL update statement, rows in a table are locked in exclusive mode
first. If a row does not satisfy the condition for being updated, the lock on the row is
downgraded to a shared lock.
 The downgrade approach reduces concurrency by obtaining write locks in some cases where
they are not required.
 On the whole, however, it improves throughput by reducing deadlocks.
 This approach is therefore widely used in current commercial systems.
 Concurrency can be increased by introducing a new kind of lock, called an update lock,
that is compatible with shared locks but not with other update and exclusive locks.
 By setting an update lock initially, rather than an exclusive lock, we prevent conflicts with
read operations.
 Once we are sure we need not update the object, we can downgrade to a shared lock.
 If we need to update the object, we must first upgrade to an exclusive lock. This upgrade
does not lead to a deadlock because no other transaction can hold an update or exclusive lock
on the object.

Additional Issues: Lock Upgrades, Convoys, Latches
 We have concentrated thus far on how the DBMS schedules transactions, based on their
requests for locks. This interleaving interacts with the operating system's scheduling of
processes' access to the CPU and can lead to a situation called a convoy, where most of the
CPU cycles are spent on process switching. The problem is that a transaction T holding a
heavily used lock may be suspended by the operating system. Until T is resumed, every other
transaction that needs this lock is queued. Such queues, called convoys, can quickly become
very long; a convoy, once formed, tends to be stable.
 Convoys are one of the drawbacks of building a DBMS on top of a general-purpose
operating system with preemptive scheduling.
 In addition to locks, which are held over a long duration, a DBMS also supports
short-duration latches. Setting a latch before reading or writing a page ensures that the physical
read or write operation is atomic; otherwise, two read/write operations might conflict if the
objects being locked do not correspond to disk pages (the units of I/O).
 Latches are unset immediately after the physical read or write operation is completed.

Deadlocks
 Consider the following example:
 Transaction T1 gets an exclusive lock on object A, and T2 gets an exclusive lock on B.
 T1 requests an exclusive lock on B and is queued, and T2 requests an exclusive lock on A
and is queued. Now, T1 is waiting for T2 to release its lock and T2 is waiting for T1 to release
its lock!
 Such a cycle of transactions waiting for locks to be released is called a deadlock.
 Clearly, these two transactions will make no further progress. Worse, they hold locks that
may be required by other transactions.
 The DBMS must either prevent or detect (and resolve) such deadlock situations.

Deadlock Detection
 In the detection approach, the DBMS must periodically check for deadlocks.
 When a transaction Ti is suspended because a lock that it requests cannot be granted, it must
wait until all transactions Tj that currently hold conflicting locks release them.
 The lock manager maintains a structure called a waits-for graph to detect deadlock cycles.
 The nodes correspond to active transactions, and there is an arc from Ti to Tj if (and only
if) Ti is waiting for Tj to release a lock.
 The lock manager adds edges to this graph when it queues lock requests and removes edges
when it grants lock requests.
 The waits-for graph describes all active transactions, some of which will eventually abort.
 The waits-for graph is periodically checked for cycles, which indicate deadlock.
 A deadlock is resolved by aborting a transaction that is on a cycle and releasing its locks;
this action allows some of the waiting transactions to proceed.
 The choice of which transaction to abort can be made using several criteria:
o the one with the fewest locks,
o the one that has done the least work,
o the one that is farthest from completion.
 A transaction might have been repeatedly restarted; if so, it should eventually be favored
during deadlock detection and allowed to complete.
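As a rough sketch (hypothetical names), deadlock detection amounts to maintaining the waits-for edges described above and periodically running a cycle check over them:

```python
class WaitsForGraph:
    """Waits-for edges Ti -> Tj: Ti is waiting for Tj to release a lock."""
    def __init__(self):
        self.waits = {}   # tid -> set of tids it waits for

    def add_wait(self, ti, tj):
        self.waits.setdefault(ti, set()).add(tj)

    def remove_waits_of(self, ti):
        self.waits.pop(ti, None)   # called when ti's request is granted

    def find_deadlock(self):
        """Return one cycle of transactions, or None. Run periodically."""
        def dfs(node, path, seen):
            seen.add(node)
            path.append(node)
            for nxt in self.waits.get(node, ()):
                if nxt in path:
                    return path[path.index(nxt):]   # the deadlock cycle
                if nxt not in seen:
                    cycle = dfs(nxt, path, seen)
                    if cycle:
                        return cycle
            path.pop()
            return None

        seen = set()
        for tid in list(self.waits):
            if tid not in seen:
                cycle = dfs(tid, [], seen)
                if cycle:
                    return cycle
        return None

g = WaitsForGraph()
g.add_wait("T1", "T2")    # T1 waits for T2 ...
g.add_wait("T2", "T1")    # ... and T2 waits for T1: a deadlock
print(g.find_deadlock())  # ['T1', 'T2']: abort one to break the cycle
```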
Deadlock Prevention
 We can prevent deadlocks by giving each transaction a priority and ensuring that lower
priority transactions are not allowed to wait for higher priority transactions (or vice versa).
 One way to assign priorities is to give each transaction a timestamp when it starts up. The
lower the timestamp, the higher the transaction's priority; that is, the oldest transaction has the
highest priority.
 Suppose a transaction Ti requests a lock and transaction Tj holds a conflicting lock. The
lock manager can then use one of the following two policies:
  • 24.
    Wait-die: If Tihas higher priority, it is allowed to wait; otherwise it is aborted. Wound-wait: If Ti has higher priority, abort Tj; otherwise Ti waits. Wait-die policy:  In the wait-die scheme, lower priority transactions can never wait for higher priority transactions.  The wait-die scheme is non preemptive; only a transaction requesting a lock can beaborted. Wound-wait policy  In the wound-wait scheme, higher priority transactions never wait for lower priority transactions.  In either case no deadlock cycle can develop.  The wound-wait scheme is preemptive.  Advantage: A transaction that has all the locks it needs will never be aborted for deadlock reasons.  Disadvantage: A younger transaction that conflicts with an older transaction may be repeatedly aborted  We must also ensure that no transaction is perennially aborted because it never has a sufficiently high priority. (Note that in both schemes, the higher priority transaction is never aborted.)  When a transaction is aborted and restarted, it should be given the same timestamp that it had originally.  Reissuing timestamps in this way ensures that each transaction will eventually become the oldest transaction, and thus the one with the highest priority, and will get all the locks that it requires.  As a transaction grows older (and its priority increases), it tends to wait for more and more younger transactions Conservative 2PL:-  A variant of 2PL, called Conservative 2PL, can also prevent deadlocks. Under Conservative 2PL, a transaction obtains all the locks it will ever need when it begins, or blocks waiting for these lock to become available.  This scheme ensure that there will be no deadlocks, and, perhaps more important, that a transaction that already holds some locks will not block waiting for other locks.  If lock contention is heavy, Conservative 2PL can reduce the time that locks are held on average, because transactions that hold locks are blocked.
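A compact sketch (the function name is hypothetical; priorities are start-up timestamps, lower timestamp meaning higher priority) of the two prevention policies:

```python
def on_conflict(ts_requester, ts_holder, policy):
    """Decide the fate of a lock request by Ti (requester) that conflicts
    with a lock held by Tj (holder). Lower timestamp = higher priority."""
    ti_higher = ts_requester < ts_holder
    if policy == "wait-die":
        # Non-preemptive: only the requesting transaction can be aborted.
        return "Ti waits" if ti_higher else "abort Ti"
    if policy == "wound-wait":
        # Preemptive: the higher priority requester wounds the holder.
        return "abort Tj" if ti_higher else "Ti waits"
    raise ValueError("unknown policy")

# An old transaction (ts=1) conflicting with a young lock holder (ts=9):
print(on_conflict(1, 9, "wait-die"))    # Ti waits
print(on_conflict(1, 9, "wound-wait"))  # abort Tj
print(on_conflict(9, 1, "wait-die"))    # abort Ti
print(on_conflict(9, 1, "wound-wait"))  # Ti waits
```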
Timestamp-Based Concurrency Control
In optimistic concurrency control, a timestamp ordering is imposed on transactions, and
validation checks that all conflicting actions occurred in the same order.
 Timestamps can also be used in another way: each transaction can be assigned a timestamp
at startup, and we can ensure, at execution time, that if action ai of transaction Ti conflicts
with action aj of transaction Tj, ai occurs before aj if TS(Ti) < TS(Tj).
 If an action violates this ordering, the transaction is aborted and restarted.
 To implement this concurrency control scheme, every database object O is given a read
timestamp RTS(O) and a write timestamp WTS(O).
 If transaction T wants to read object O, and TS(T) < WTS(O), the order of this read with
respect to the most recent write on O would violate the timestamp order between this
transaction and the writer. Therefore, T is aborted and restarted with a new, larger timestamp.
(If T were restarted with the same timestamp, it would be guaranteed to be aborted again, due
to the same conflict.)
 If TS(T) > WTS(O), T reads O, and RTS(O) is set to the larger of RTS(O) and TS(T).
 Now consider what happens when transaction T wants to write object O:
1. If TS(T) < RTS(O), the write action conflicts with the most recent read action of O, and T
is therefore aborted and restarted.
2. If TS(T) < WTS(O), a naive approach would be to abort T because its write action conflicts
with the most recent write of O and is out of timestamp order. However, we can safely ignore
such a write and continue; ignoring outdated writes is called the Thomas Write Rule.
3. Otherwise, T writes O and WTS(O) is set to TS(T).

The Thomas Write Rule
If TS(T) < WTS(O), the current write action has, in effect, been made obsolete by the most
recent write of O, which follows the current write according to the timestamp ordering. We
can think of T's write action as if it had occurred immediately before the most recent write of
O and was never read by anyone.
If the Thomas Write Rule is not used, that is, if T is aborted in case (2), the timestamp protocol,
like 2PL, allows only conflict serializable schedules. If the Thomas Write Rule is used, some
schedules are permitted that are not conflict serializable. Consider the following schedule:
because T2's write follows T1's read and precedes T1's write of the same object, this schedule
is not conflict serializable.
T1        T2
R(A)
          W(A)
          commit
W(A)
commit

The Thomas Write Rule relies on the observation that T1's write is never seen by any
transaction: the schedule above is therefore equivalent to the serializable schedule obtained by
deleting this write action, shown below.

T1        T2
R(A)
          commit
W(A)
commit
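A minimal sketch (hypothetical names; writes are applied immediately rather than buffered, so it would also exhibit the recoverability problem discussed next) of the timestamp tests, including the Thomas Write Rule:

```python
class TimestampCC:
    """Basic timestamp ordering with the Thomas Write Rule. Each object
    carries a read timestamp RTS and a write timestamp WTS."""
    def __init__(self):
        self.rts = {}   # object -> largest timestamp of any reader
        self.wts = {}   # object -> timestamp of the most recent writer

    def read(self, ts, obj):
        if ts < self.wts.get(obj, 0):
            return "abort"          # the read arrives after a younger write
        self.rts[obj] = max(self.rts.get(obj, 0), ts)
        return "ok"

    def write(self, ts, obj):
        if ts < self.rts.get(obj, 0):
            return "abort"          # conflicts with a younger read
        if ts < self.wts.get(obj, 0):
            return "skip"           # Thomas Write Rule: obsolete write, ignore
        self.wts[obj] = ts
        return "ok"

# The schedule above: T1 (TS=1) reads A, T2 (TS=2) writes A, T1 writes A.
cc = TimestampCC()
print(cc.read(1, "A"))    # ok
print(cc.write(2, "A"))   # ok
print(cc.write(1, "A"))   # skip: T1's write is obsolete (Thomas Write Rule)
```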
Recoverability:
 Unfortunately, the timestamp protocol just presented permits schedules that are not
recoverable.
 If TS(T1) = 1 and TS(T2) = 2, the schedule below is permitted by the timestamp protocol
(with or without the Thomas Write Rule), even though it is unrecoverable.

T1        T2
W(A)
          R(A)
          W(B)
          commit
An unrecoverable schedule permitted by the timestamp protocol

 The timestamp protocol can be modified to disallow such schedules by buffering all write
actions until the transaction commits.
 In the example, when T1 wants to write A, WTS(A) is updated to reflect this action, but the
change to A is not carried out immediately; instead, it is recorded in a private workspace, or
buffer.
 When T2 wants to read A subsequently, its timestamp is compared with WTS(A), and the
read is seen to be permissible.
 However, T2 is blocked until T1 completes. If T1 commits, its change to A is copied from
the buffer; otherwise, the changes in the buffer are discarded. T2 is then allowed to read A.
 This blocking of T2 is similar to the effect of T1 obtaining an exclusive lock on A.
 Nonetheless, even with this modification, the timestamp protocol permits some schedules
not permitted by 2PL; the two protocols are not quite the same.
 Because recoverability is essential, such a modification must be used for the timestamp
protocol to be practical.

CRASH RECOVERY
 The recovery manager of a DBMS is responsible for ensuring two important properties of
transactions: atomicity and durability.
 It ensures atomicity by undoing the actions of transactions that do not commit, and
durability by making sure that all actions of committed transactions survive system crashes.
 The recovery manager must deal with a wide variety of database states because it is called
on during system failures.

INTRODUCTION TO ARIES
 ARIES is a recovery algorithm that is designed to work with a steal, no-force approach.
 When the recovery manager is invoked after a crash, restart proceeds in three phases:
1. Analysis: identifies dirty pages in the buffer pool (i.e., changes that have not been written
to disk) and active transactions at the time of the crash.
2. Redo: repeats all actions, starting from an appropriate point in the log, and restores the
database state to what it was at the time of the crash.
3. Undo: undoes the actions of transactions that did not commit, so that the database reflects
only the actions of committed transactions.
Consider a simple execution history.
 When the system is restarted, the Analysis phase identifies T1 and T3 as transactions that
were active at the time of the crash.
 It also identifies the dirty pages P1, P3, and P5.
 All the updates (including those of T1 and T3) are reapplied in the order shown during the
Redo phase.
 Since T2 is a committed transaction, all its actions must eventually be written to disk, and
reapplying them guarantees this.
 Finally, the actions of T1 and T3 are undone in reverse order during the Undo phase; that
is, T3's write of P3 is undone, T3's write of P1 is undone, and then T1's write of P5 is undone.

Three main principles behind the ARIES recovery algorithm:
Write-ahead logging:
 Any change to a database object is first recorded in the log.
 The record in the log must be written to stable storage before the change to the database
object is written to disk.
Repeating history during Redo:
 Upon restart following a crash, ARIES retraces all actions of the DBMS before the crash
and brings the system back to the exact state that it was in at the time of the crash.
 Then, it undoes the actions of transactions that were still active at the time of the crash
(effectively aborting them).
Logging changes during Undo:
 Changes made to the database while undoing a transaction are logged, to ensure that such
an action is not repeated in the event of repeated restarts.

The Log
 The log, sometimes called the trail or journal, is a history of actions executed by the DBMS.
 Physically, the log is a file of records stored in stable storage, which is assumed to survive
crashes.
 This durability can be achieved by maintaining two or more copies of the log on different
disks (perhaps in different locations), so that the chance of all copies of the log being
simultaneously lost is negligibly small.
 The most recent portion of the log, called the log tail, is kept in main memory and is
periodically forced to stable storage. This way, log records and data records are written to disk
at the same granularity (pages or sets of pages).
 Every log record is given a unique id called the log sequence number (LSN). As with any
record id, we can fetch a log record with one disk access given the LSN.
 LSNs should be assigned in monotonically increasing order; this property is required for
the ARIES recovery algorithm.
 If the log is a sequential file, in principle growing indefinitely, the LSN can simply be the
address of the first byte of the log record.
 For recovery purposes, every page in the database contains the LSN of the most recent log
record that describes a change to this page. This LSN is called the pageLSN.
 A log record is written for each of the following actions:
Updating a page:
 After modifying the page, an update type record is appended to the log tail.
 The pageLSN of the page is then set to the LSN of the update log record.
Commit:
 When a transaction decides to commit, it force-writes a commit type log record containing
the transaction id.
 That is, the log record is appended to the log, and the log tail is written to stable storage,
up to and including the commit record.
 The transaction is considered to have committed at the instant that its commit log record is
written to stable storage.
Abort:
 When a transaction is aborted, an abort type log record containing the transaction id is
appended to the log, and Undo is initiated for this transaction.
End:
 When a transaction is aborted or committed, some additional actions, such as removing the
transaction's entry from the transaction table, must be taken beyond writing the abort or
commit log record.
 After all these additional steps are completed, an end type log record containing the
transaction id is appended to the log.
Undoing an update:
 When a transaction is rolled back (because the transaction is aborted, or during recovery
from a crash), its updates are undone.
 When the action described by an update log record is undone, a compensation log record,
or CLR, is written.

FIELDS OF A LOG RECORD
 Every log record has certain fields: prevLSN, transID, and type.
 The set of all log records for a given transaction is maintained as a linked list going back in
time; using the prevLSN field, this list must be updated whenever a log record is added.
 The transID field is the id of the transaction generating the log record.
 The type field indicates the type of the log record.
 Additional fields depend on the type of the log record.

ADDITIONAL FIELDS OF AN UPDATE LOG RECORD
 The fields in an update log record are as follows:
o The pageID field is the page id of the modified page.
o The length field, in bytes, is the number of bytes modified.
o The offset field indicates the position within the page at which the change starts.
o The before-image is the value of the changed bytes before the change.
o The after-image is the value of the changed bytes after the change.
 An update log record that contains both before- and after-images can be used both to redo
the change and to undo it.
 A redo-only update log record contains just the after-image; similarly, an undo-only update
record contains just the before-image.
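A sketch of what these log records might look like as a data structure (the field types and class names are hypothetical; ARIES itself does not prescribe a particular layout). The example values echo the T1000/P500 example given later in these notes:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    """Fields common to every ARIES log record."""
    lsn: int                   # log sequence number, monotonically increasing
    prev_lsn: Optional[int]    # previous record of the same transaction
    trans_id: int
    rec_type: str              # "update", "commit", "abort", "end", "CLR"

@dataclass
class UpdateLogRecord(LogRecord):
    """Extra fields carried by an update type record."""
    page_id: int
    offset: int                # position in the page where the change starts
    length: int                # number of bytes changed
    before_image: bytes        # used to undo the change
    after_image: bytes         # used to redo the change

rec = UpdateLogRecord(lsn=10, prev_lsn=None, trans_id=1000, rec_type="update",
                      page_id=500, offset=21, length=3,
                      before_image=b"ABC", after_image=b"DEF")
```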
Compensation Log Record (CLR)
 A compensation log record (CLR) is written just before the change recorded in an update
log record U is undone.
 A compensation log record C describes the action taken to undo the action recorded in the
corresponding update log record, and it is appended to the log tail just like any other log record.
 The compensation log record C also contains a field called undoNextLSN, which is the LSN
of the next log record that is to be undone for the transaction that wrote update record U; this
field in C is set to the value of prevLSN in U.
 Unlike an update log record, a CLR describes an action that will never be undone; that is,
we never undo an undo action.
 The reason is simple: an update log record describes a change made by a transaction during
normal execution, and the transaction may subsequently be aborted, whereas a CLR describes
an action taken to roll back a transaction for which the decision to abort has already been made.
 Thus, the transaction must be rolled back, and the undo action described by the CLR is
definitely required.
 The number of CLRs that can be written during Undo is no more than the number of update
log records for active transactions at the time of the crash.
 It may well happen that a CLR is written to stable storage but the undo action that it
describes is not yet written to disk when the system crashes again.
 In this case, the undo action described in the CLR is reapplied during the Redo phase, just
like the actions described in update log records.

Other Recovery-Related Data Structures
In addition to the log, the following two tables contain important recovery-related information:
Transaction table:
 This table contains one entry for each active transaction.
 The entry contains (among other things) the transaction id, the status, and a field called
lastLSN, which is the LSN of the most recent log record for this transaction.
 The status of a transaction can be in progress, committed, or aborted. (In the latter two
cases, the transaction will be removed from the table once certain 'clean up' steps are
completed.)

Dirty page table:
 This table contains one entry for each dirty page in the buffer pool, that is, each page with
changes that are not yet reflected on disk.
 The entry contains a field recLSN, which is the LSN of the first log record that caused the
page to become dirty.
 Consider the following simple example. Transaction T1000 changes the value of bytes 21
to 23 on page P500 from 'ABC' to 'DEF'.
 Transaction T2000 changes 'HIJ' to 'KLM' on page P600, then transaction T2000 changes
bytes 20 through 22 from 'GDE' to 'QRS' on page P500.
 Then transaction T1000 changes 'TUV' to 'WXY' on page P505.
 At this instant, the active transactions are T1000 and T2000, and the dirty pages are P500,
P505, and P600.

The Write-Ahead Log Protocol
 Before writing a page to disk, every update log record that describes a change to this page
must be forced to stable storage.
 This is accomplished by forcing all log records up to and including the one with LSN equal
to the pageLSN to stable storage before writing the page to disk.
 WAL is the fundamental rule that ensures that a record of every change to the database is
available while attempting to recover from a crash.
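The WAL rule just stated can be sketched as a check performed before any page write (all names here are hypothetical; a real buffer manager would do an actual disk write where the comment indicates):

```python
class Log:
    """Append-only log with flushed_lsn marking the stable-storage frontier."""
    def __init__(self):
        self.records = []         # (lsn, payload) in LSN order
        self.flushed_lsn = 0      # all records with LSN <= this are on disk

    def append(self, lsn, payload):
        self.records.append((lsn, payload))

    def flush(self, lsn):
        """Force the log tail to stable storage up to and including lsn."""
        self.flushed_lsn = max(self.flushed_lsn, lsn)


class Page:
    def __init__(self, page_id):
        self.page_id = page_id
        self.page_lsn = 0         # LSN of the latest log record for this page


def write_page_to_disk(page, log):
    """WAL: a page may reach disk only after its log records do."""
    if page.page_lsn > log.flushed_lsn:
        log.flush(page.page_lsn)  # force the log records first
    # ... the actual disk write of the page would happen here ...


log = Log()
page = Page(500)
log.append(10, "update to P500")
page.page_lsn = 10
write_page_to_disk(page, log)
print(log.flushed_lsn)            # 10: the log was forced before the page write
```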
 If a transaction made a change and committed, the no-force approach means that some of
these changes may not have been written to disk at the time of a subsequent crash.
 Without a record of these changes, there would be no way to ensure that the changes of a
committed transaction survive crashes.
 When a transaction is committed, the log tail is forced to stable storage, even if a no-force
approach is being used.
 If a force approach were used, all the pages modified by the transaction, rather than a
portion of the log that includes all its records, would have to be forced to disk when the
transaction commits.
 The set of all changed pages is typically much larger than the log tail, because the size of
an update log record is close to (twice) the size of the changed bytes, which is likely to be
much smaller than the page size.
 The cost of forcing the log tail is therefore much smaller than the cost of writing all changed
pages to disk.

Checkpointing
 A checkpoint is like a snapshot of the DBMS state, and by taking checkpoints periodically,
the DBMS can reduce the amount of work to be done during restart in the event of a subsequent
crash.
 Checkpointing in ARIES has three steps:
o First, a begin_checkpoint record is written to indicate when the checkpoint starts.
o Second, an end_checkpoint record is constructed, including in it the current contents of the
transaction table and the dirty page table, and appended to the log.
o The third step is carried out after the end_checkpoint record is written to stable storage: a
special master record containing the LSN of the begin_checkpoint log record is written to a
known place on stable storage.
 While the end_checkpoint record is being constructed, the DBMS continues executing
transactions and writing other log records.
 The only guarantee we have is that the transaction table and dirty page table are accurate
as of the time of the begin_checkpoint record.
 This kind of checkpoint is called a fuzzy checkpoint and is inexpensive because it does not
require quiescing the system or writing out pages in the buffer pool.
 On the other hand, the effectiveness of this checkpointing technique is limited by the earliest
recLSN of pages in the dirty page table, because during restart we must redo changes starting
from the log record whose LSN is equal to this recLSN.
 When the system comes back up after a crash, the restart process begins by locating the
most recent checkpoint record.

RECOVERING FROM A SYSTEM CRASH
 When the system is restarted after a crash, the recovery manager proceeds in three phases:
 The Analysis phase begins by examining the most recent begin_checkpoint record, whose
LSN is denoted as C, and proceeds forward in the log until the last log record.
 The Redo phase follows Analysis and redoes all changes to any page that might have been
dirty at the time of the crash; this set of pages and the starting point for Redo (the smallest
recLSN of any dirty page) are determined during Analysis.
 The Undo phase follows Redo and undoes the changes of all transactions that were active
at the time of the crash; again, this set of transactions is identified during the Analysis phase.
 Redo reapplies changes in the order in which they were originally carried out; Undo
reverses changes in the opposite order, reversing the most recent change first.
Analysis Phase
 The Analysis phase performs three tasks:
1. It determines the point in the log at which to start the Redo pass.
2. It determines (a conservative superset of the) pages in the buffer pool that were dirty at the
time of the crash.
3. It identifies transactions that were active at the time of the crash and must be undone.
 Analysis begins by examining the most recent begin_checkpoint log record and initializing
the dirty page table and transaction table to the copies of those structures in the next
end_checkpoint record.
 Thus, these tables are initialized to the set of dirty pages and active transactions at the time
of the checkpoint.
 Analysis then scans the log in the forward direction until it reaches the end of the log:
 If an end log record for a transaction T is encountered, T is removed from the transaction
table because it is no longer active.
 If a log record other than an end record for a transaction T is encountered, an entry for T is
added to the transaction table if it is not already there. Further, the entry for T is modified:
1. The lastLSN field is set to the LSN of this log record.
2. If the log record is a commit record, the status is set to C; otherwise it is set to U (indicating
that it is to be undone).
 If a redoable log record affecting page P is encountered, and P is not in the dirty page table,
an entry is inserted into this table with page id P and recLSN equal to the LSN of this redoable
log record. This LSN identifies the oldest change affecting page P that may not have been
written to disk.
 At the end of the Analysis phase, the transaction table contains an accurate list of all
transactions that were active at the time of the crash; this is the set of transactions with
status U.
 The dirty page table includes all pages that were dirty at the time of the crash, but may also
contain some pages that were in fact written to disk.
 If an end_write log record were written at the completion of each write operation, the dirty
page table constructed during Analysis could be made more accurate, but in ARIES the
additional cost of writing end_write log records is not considered to be worth the gain.

Redo Phase
 During the Redo phase, ARIES reapplies the updates of all transactions, committed or
otherwise. Further, if a transaction was aborted before the crash and its updates were undone,
as indicated by CLRs, the actions described in the CLRs are also reapplied.
 This repeating history paradigm distinguishes ARIES from other proposed WAL-based
recovery algorithms and causes the database to be brought to the same state that it was in at
the time of the crash.
 The Redo phase begins with the log record that has the smallest recLSN of all pages in the
dirty page table constructed by the Analysis pass, because this log record identifies the oldest
update that may not have been written to disk prior to the crash.
 Starting from this log record, Redo scans forward until the end of the log.
 For each redoable log record (update or CLR) encountered, Redo checks whether the
logged action must be redone. The action must be redone unless one of the following
conditions holds:
1. The affected page is not in the dirty page table.
2. The affected page is in the dirty page table, but the recLSN for the entry is greater than the
LSN of the log record being checked.
3. The pageLSN (stored on the page, which must be retrieved to check this condition) is
greater than or equal to the LSN of the log record being checked.
 The first condition means that all changes to this page have been written to disk: because
the recLSN is the first update to this page that may not have been written to disk, the page's
absence from the table means no such update exists.
 The second condition means that the update being checked was indeed propagated to disk.
 The third condition, which is checked last because it requires us to retrieve the page, also
ensures that the update being checked was written to disk, because either this update or a later
update to the page was written.
If the logged action must be redone:
1. The logged action is reapplied.
2. The pageLSN on the page is set to the LSN of the redone log record. No additional log
record is written at this time.
 Processing the log records in this way during the Redo phase brings the system back to the
exact state it was in at the time of the crash.
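The three "skip" conditions of Redo can be expressed directly; here is a sketch with hypothetical arguments (the dirty page table as a dict from page id to recLSN, and a callable that fetches a page's pageLSN):

```python
def must_redo(log_lsn, page_id, dirty_page_table, fetch_page_lsn):
    """Return True if the logged action at log_lsn must be reapplied
    to page_id during the ARIES Redo phase."""
    if page_id not in dirty_page_table:
        return False                        # (1) page was flushed before crash
    if dirty_page_table[page_id] > log_lsn:
        return False                        # (2) recLSN is after this record
    if fetch_page_lsn(page_id) >= log_lsn:
        return False                        # (3) pageLSN shows change on disk
    return True

# Example: dirty page table from Analysis; P500 was dirtied first at LSN 10.
dpt = {500: 10, 600: 20}
page_lsns = {500: 0, 600: 30}               # pageLSN as read from disk
print(must_redo(10, 500, dpt, page_lsns.get))   # True: reapply
print(must_redo(25, 600, dpt, page_lsns.get))   # False: pageLSN 30 >= 25
```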
 At the end of the Redo phase, end type records are written for all transactions with status C, and these transactions are removed from the transaction table.

Undo Phase
 The Undo phase, unlike the other two phases, scans backward from the end of the log.
 The goal of this phase is to undo the actions of all transactions that were active at the time of the crash, that is, to effectively abort them.
 This set of transactions is identified in the transaction table constructed by the Analysis phase.

The Undo Algorithm
 Undo begins with the transaction table constructed by the Analysis phase, which identifies all transactions that were active at the time of the crash, and includes the LSN of the most recent log record (the lastLSN field) for each such transaction.
 Such transactions are called loser transactions. All actions of losers must be undone, and further, these actions must be undone in the reverse of the order in which they appear in the log.
 Consider the set of lastLSN values for all loser transactions. Let us call this set ToUndo.
 Undo repeatedly chooses the largest (i.e., most recent) LSN value in this set and processes it, until ToUndo is empty.
 To process a log record:
1. If it is a CLR and the undoNextLSN value is not null, the undoNextLSN value is added to the set ToUndo; if the undoNextLSN is null, an end record is written for the transaction because it is completely undone, and the CLR is discarded.
2. If it is an update record, a CLR is written and the corresponding action is undone, and the prevLSN value in the update log record is added to the set ToUndo.
 When the set ToUndo is empty, the Undo phase is complete. Restart is now complete, and the system can proceed with normal operations.

Aborting a Transaction
 Aborting a transaction is just a special case of the Undo phase of Restart in which a single transaction, rather than a set of transactions, is undone.

Crashes during Restart
It is important to understand how the Undo algorithm handles repeated system crashes.
 The log shows the order in which the DBMS executed various actions; the LSNs are in ascending order, and each log record for a transaction has a prevLSN field that points to the previous log record for that transaction.
 In this case, there are no null prevLSNs, that is, no log record uses the special value placed in the prevLSN field of the first log record for a transaction to indicate that there is no previous log record.
 Log record 30 indicates that T1 aborts. All actions of this transaction should be undone in reverse order, and the only action of T1, described by the update log record 10, is indeed undone, as indicated by CLR 40.
 After the first crash, Analysis identifies P1 (with recLSN 50), P3 (with recLSN 20), and P5 (with recLSN 10) as dirty pages.
 Log record 45 shows that T1 is a completed transaction; thus, the transaction table identifies T2 (with lastLSN 60) and T3 (with lastLSN 50) as active at the time of the crash.
 The Redo phase begins with log record 10, which is the minimum recLSN in the dirty page table, and reapplies all actions (for the update and CLR records), as per the Redo algorithm.
 The ToUndo set consists of LSNs 60, for T2, and 50, for T3. The Undo phase now begins by processing the log record with LSN 60, because 60 is the largest LSN in the ToUndo set.
 The update is undone, and a CLR (with LSN 70) is written to the log.
 This CLR has undoNextLSN equal to 20, which is the prevLSN value in log record 60; 20 is the next action to be undone for T2.
 Now the largest remaining LSN in the ToUndo set is 50. The write corresponding to log record 50 is now undone, and a CLR describing the change is written.
 This CLR has LSN 80, and its undoNextLSN field is null because 50 is the only log record for transaction T3.
 Thus T3 is completely undone, and an end record is written. Log records 70, 80, and 85 are written to stable storage before the system crashes a second time; however, the changes described by these records may not have been written to disk.
 When the system is restarted after the second crash, Analysis determines that the only active transaction at the time of the crash was T2; in addition, the dirty page table is identical to what it was during the previous restart.
 Log records 10 through 85 are processed again during Redo.
 The Undo phase considers the only LSN in the ToUndo set, 70, and processes it by adding the undoNextLSN value (20) to the ToUndo set.
 Next, log record 20 is processed by undoing T2's write of page P3, and a CLR is written (LSN 90). Because 20 is the first of T2's log records, and therefore the last of its records to be undone, the undoNextLSN field in this CLR is null; an end record is written for T2, and the ToUndo set is now empty.
 Recovery is now complete, and normal execution can resume with the writing of a checkpoint record.
 For completeness, consider what happens if the system crashes while Restart is in the Analysis or Redo phase. If a crash occurs during the Analysis phase, all the work done in this phase is lost, and on restart the Analysis phase starts afresh with the same information as before.
 If a crash occurs during the Redo phase, the only effect that survives the crash is that some of the changes made during Redo may have been written to disk prior to the crash.
 Restart starts again with the Analysis phase and then the Redo phase, and some update log records that were redone the first time around will not be redone a second time, because the pageLSN will now be equal to the update record's LSN.
 We can take checkpoints during Restart to minimize repeated work in the event of a crash.
UNIT 2

SEQUENCES
The quickest way to retrieve data from a table is to have a column in the table whose data uniquely identifies a row. By using this column and a specific value in the WHERE condition of a SELECT sentence, the Oracle engine will be able to identify and retrieve the row the fastest.
To achieve this, a constraint is attached to a specific column in the table that ensures that the column is never left empty and that the data values in the column are unique. Since human beings do data entry, it is quite likely that a duplicate value could be entered, which violates this constraint and causes the entire row to be rejected.
If the value entered into this column is computer generated, it will always fulfill the unique constraint and the row will always be accepted for storage.
Oracle provides an object called a Sequence that can generate numeric values. The value generated can have a maximum of 38 digits. A sequence can be defined to:
 Generate numbers in ascending or descending order
 Provide intervals between numbers
 Cache sequence numbers in memory to speed up their availability
A sequence is an independent object and can be used with any table that requires its output.

Creating Sequences
The minimum information required for generating numbers using a sequence is:
 The starting number
 The maximum number that can be generated by the sequence
 The increment value for generating the next number
This information is provided to Oracle at the time of sequence creation.
Syntax:
CREATE SEQUENCE <SequenceName>
[INCREMENT BY <IntegerValue>
START WITH <IntegerValue>
MAXVALUE <IntegerValue> / NOMAXVALUE
MINVALUE <IntegerValue> / NOMINVALUE
CYCLE / NOCYCLE
CACHE <IntegerValue> / NOCACHE
ORDER / NOORDER]
Note: A sequence is always given a name so that it can be referenced later when required.

Keywords And Parameters
INCREMENT BY: Specifies the interval between sequence numbers. It can be any positive or negative value but not zero. If this clause is omitted, the default value is 1.
MINVALUE: Specifies the sequence minimum value.
NOMINVALUE: Specifies a minimum value of 1 for an ascending sequence and -(10^26) for a descending sequence.
MAXVALUE: Specifies the maximum value that a sequence can generate.
NOMAXVALUE: Specifies a maximum of 10^27 for an ascending sequence or -1 for a descending sequence. This is the default clause.
START WITH: Specifies the first sequence number to be generated. The default for an ascending sequence is the sequence minimum value (1) and for a descending sequence, it is the maximum value (-1).
CYCLE: Specifies that the sequence continues to generate repeat values after reaching either its maximum or minimum value.
NOCYCLE: Specifies that a sequence cannot generate more values after reaching the maximum value.
CACHE: Specifies how many values of a sequence Oracle pre-allocates and keeps in memory for faster access. The minimum value for this parameter is two.
NOCACHE: Specifies that values of a sequence are not pre-allocated.
Note: If the CACHE / NOCACHE clause is omitted, Oracle caches 20 sequence numbers by default.
ORDER: This guarantees that sequence numbers are generated in the order of request. This is only necessary if you are using Parallel Server in Parallel mode option. In exclusive mode option, a sequence always generates numbers in order.
NOORDER: This does not guarantee that sequence numbers are generated in order of request. This is only necessary if you are using Parallel Server in Parallel mode option.
If the ORDER / NOORDER clause is omitted, a sequence takes the NOORDER clause by default.
Note: The ORDER / NOORDER clause has no significance if Oracle is configured with the Single Server option.
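Before the worked examples that follow, here is a small illustrative sketch combining several of the clauses above (the sequence name INV_SEQ and its limits are hypothetical, not taken from the text):

CREATE SEQUENCE INV_SEQ
    START WITH 1000
    INCREMENT BY -1       -- a negative interval gives a descending sequence
    MINVALUE 1
    MAXVALUE 1000
    NOCYCLE               -- stop (with an error) once the minimum has been generated
    CACHE 30;             -- pre-allocate 30 numbers in memory

Because INCREMENT BY is negative, this sequence descends from 1000 towards 1 and, being NOCYCLE, cannot generate further values once the minimum has been reached.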
Example 20: Create a sequence by the name ADDR_SEQ, which will generate numbers from 1 upto 999 in ascending order with an interval of 1. The sequence must restart from the number 1 after generating the number 999.

CREATE SEQUENCE ADDR_SEQ
    INCREMENT BY 1
    START WITH 1
    MINVALUE 1
    MAXVALUE 999
    CYCLE;

Referencing A Sequence
 Once a sequence is created, SQL can be used to view the values held in its cache.
 To simply view a sequence value, use a SELECT sentence as described below:
SELECT <SequenceName>.NextVal FROM DUAL;
 This will display the next value held in the cache on the VDU screen.
 Every time NextVal references a sequence, its output is automatically incremented from the old value to the new value, ready for use.
 The example below explains how to access a sequence and use its generated value in the INSERT statement.

Example 21: Insert values for ADDR_TYPE, ADDR1, ADDR2, CITY, STATE and PINCODE in the ADDR_DTLS table. The ADDR_SEQ sequence must be used to generate ADDR_NO, and CODE_NO must be a value held in the BRANCH_NO column of the BRANCH_MSTR table.

Table Name: ADDR_DTLS
Column Name   Data Type   Size   Attributes
ADDR_NO       Number      6
CODE_NO       VarChar2    10     Foreign Key references BRANCH_NO of the BRANCH_MSTR table.
ADDR_TYPE     VarChar2    1      Can hold the values: H for Head Office or B for Branch
ADDR1         VarChar2    50
ADDR2         VarChar2    50
CITY          VarChar2    25
STATE         VarChar2    25
PINCODE       VarChar2    6

INSERT INTO ADDR_DTLS (ADDR_NO, CODE_NO, ADDR_TYPE, ADDR1, ADDR2, CITY, STATE, PINCODE)
VALUES (ADDR_SEQ.NextVal, 'B5', 'B', 'Vertex Plaza, Shop 4,', 'Western Express Highway, Dahisar (East),', 'Mumbai', 'Maharashtra', '400078');

To reference the current value of a sequence:
SELECT <SequenceName>.CurrVal FROM DUAL;

Altering A Sequence
 A sequence once created can be altered.
 This is achieved by using the ALTER SEQUENCE statement.
Syntax:
ALTER SEQUENCE <SequenceName>
[INCREMENT BY <IntegerValue>
MAXVALUE <IntegerValue> / NOMAXVALUE
MINVALUE <IntegerValue> / NOMINVALUE
CYCLE / NOCYCLE
CACHE <IntegerValue> / NOCACHE
ORDER / NOORDER]
Note: The START value of the sequence cannot be altered.

Example 23: Change the cache value of the sequence ADDR_SEQ to 30 and the interval between two numbers to 2.
ALTER SEQUENCE ADDR_SEQ INCREMENT BY 2 CACHE 30;

Dropping A Sequence
The DROP SEQUENCE command is used to remove the sequence from the database.
Syntax:
DROP SEQUENCE <SequenceName>;
Example 24: Destroy the sequence ADDR_SEQ.
DROP SEQUENCE ADDR_SEQ;

FUNDAMENTALS OF PL/SQL

WHAT IS SQL?
 SQL is nothing but a Structured Query Language.
 SQL is the natural language of the DBA.
 But it suffers from various disadvantages when used as a conventional programming language.

DISADVANTAGES OF SQL
 SQL does not have any procedural capabilities, that is, it does not provide the programming techniques of condition checking, looping and branching that are vital for data testing before its permanent storage.
 SQL statements are passed to the Oracle engine one at a time. Each time an SQL statement is executed, a call is made to the engine's resources. This adds to the traffic on the network, thereby decreasing the speed of data processing, especially in a multi-user environment.
 While processing an SQL sentence, if an error occurs, the Oracle engine displays its own error messages. SQL has no facility for programmed handling of errors.

INTRODUCTION TO PL/SQL
 PL/SQL is a superset of SQL.
 PL/SQL is a block structured language that enables developers to combine the power of SQL with procedural statements.
 PL/SQL bridges the gap between database technology and procedural programming languages.

ADVANTAGES OF PL/SQL
 Support for SQL and support for object-oriented programming.
 PL/SQL is a development tool that not only supports SQL data manipulation but also provides facilities for conditional checking, branching and looping.
 Block Structures
 PL/SQL consists of blocks of code, which can be nested within each other. PL/SQL sends an entire block of SQL statements to the Oracle engine all in one go. Communication between the program block and the Oracle engine reduces considerably, reducing network traffic.
 Since the Oracle engine gets the SQL statements as a single block, it processes this code much faster than if it got the code one sentence at a time.
 Since an entire block of code is passed to the Oracle engine at one time for execution, all changes made to the data in the table are done or undone, in one go.
 Error Handling
 PL/SQL also permits dealing with errors as required, and facilitates displaying user-friendly messages when errors are encountered.
 Use of Variables
 PL/SQL allows declaration and use of variables in a block of code.
 These variables can be used to store intermediate results of a query for later processing, or to calculate values and insert them into an Oracle table later.
 PL/SQL variables can be used anywhere, either in SQL statements or in PL/SQL blocks.
 Performance and Efficiency
 Via PL/SQL, all sorts of calculations can be done quickly and efficiently without the use of the Oracle engine.
 This considerably improves transaction performance.
 Portability
 Applications written in PL/SQL are portable to any computer hardware and operating system where Oracle is operational.
 Hence, PL/SQL code blocks written for a DOS version of Oracle will run on its Linux/Unix version without any modifications at all.

THE GENERIC PL/SQL BLOCK
 PL/SQL permits the creation of structured logical blocks of code that describe processes which have to be applied to data. A single PL/SQL code block consists of a set of SQL statements, clubbed together and passed to the Oracle engine entirely.
 A PL/SQL block has a definite structure, which can be divided into sections. The sections of a PL/SQL block are:
 The Declare section
 The master Begin and End section that also (optionally) contains an Exception section.
Fig. PL/SQL block structure

 Declare Section
 Code blocks start with a declaration section, in which memory variables and other Oracle objects can be declared and, if required, initialized. Once declared, they can be used in SQL statements for data manipulation.
 Begin Section
 It consists of a set of SQL and PL/SQL statements which describe the processes that have to be applied to table data. Actual data manipulation, retrieval, looping and branching constructs are specified in this section.
 The Exception Section
 This section deals with the handling of errors that arise during execution of the data manipulation statements which make up the PL/SQL code block. Errors can arise due to syntax, logic and/or validation rule violation.
 The End Section
 This marks the end of a PL/SQL block.

PL/SQL IN THE ORACLE ENGINE
 The PL/SQL engine resides in the Oracle engine.
 The Oracle engine can process PL/SQL blocks as well as single SQL statements.
 The PL/SQL block is sent to the PL/SQL engine, where procedural statements are executed; SQL statements are sent to the SQL executor in the Oracle engine.
 The call to the Oracle engine needs to be made only once to execute any number of statements.
 Since the Oracle engine is called only once for each block, the speed of SQL statement execution is vastly enhanced when compared to the Oracle engine being called once for each SQL statement.
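Tying these sections together, the following is a minimal illustrative sketch of a complete PL/SQL block (the variable name and messages are hypothetical):

DECLARE                               -- Declare section: memory variables
   mMessage VARCHAR2(30);
BEGIN                                 -- Begin section: actual processing
   mMessage := 'Hello from PL/SQL';
   dbms_output.put_line(mMessage);
EXCEPTION                             -- Exception section: error handling
   WHEN OTHERS THEN
      dbms_output.put_line('An error occurred');
END;                                  -- End section
/

Under SQL*Plus, SET SERVEROUTPUT ON must be issued first for the message to appear on screen.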
Fundamentals of PL/SQL
 The Character Set
 Uppercase alphabets (A-Z)
 Lowercase alphabets (a-z)
 Numerals (0-9)
 Symbols: ( ) + - * / < > = ! ; : . ' @ % , " # $ ^ & _ { } ? [ ]
 Compound symbols used in PL/SQL blocks are: <> != ~= ^= <= >= := ** || << >>
 Literals
 A literal is a numeric value or a character string used to represent itself.
 Numeric Literals
 These can be either integers or floats. If a float is being represented, then the integer part must be separated from the float part by a period.
 Example
 25, 6.34, -5, 25e-03, .1
 Logical (Boolean) Literals
 These are predetermined constants. The values that can be assigned to this type are: TRUE, FALSE, NULL.
 String Literals
 These are represented by one or more legal characters and must be enclosed within single quotes. The single quote character can be represented by writing it twice in a string literal.
 Example
 'Hello world'
 'Don''t go without saving your work'
 Character Literals
 These are string literals consisting of single characters.
 Example
 '*'
 'A'

PL/SQL Datatypes
 Predefined Datatypes
 Number Types
 Character Types
 National Character Types
 Boolean Types
 LOB Types
 Date and Interval Types
 Number Types
 Number types let you store numeric data (integers, real numbers, and floating-point numbers).
 BINARY_INTEGER
 We use the BINARY_INTEGER datatype to store signed integers.
 BINARY_INTEGER values require less storage than NUMBER values.
 BINARY_INTEGER Subtypes
 NATURAL - Restricts an integer variable to non-negative values
 NATURALN - Prevents the assigning of nulls to a non-negative integer variable
 POSITIVE - Restricts an integer variable to positive values
 POSITIVEN - Prevents the assigning of nulls to a positive integer variable
 SIGNTYPE - Lets you restrict an integer variable to the values -1, 0, and 1
 NUMBER
 We use the NUMBER datatype to store fixed-point or floating-point numbers.
 We can specify precision, which is the total number of digits, and scale, which is the number of digits to the right of the decimal point. The syntax follows:
 NUMBER[(precision,scale)]
 To declare fixed-point numbers, for which you must specify scale, use the following form:
 NUMBER(precision,scale)
 NUMBER Subtypes
 DEC
 DECIMAL
 NUMERIC
 INTEGER
 INT
 SMALLINT
 DOUBLE PRECISION
 FLOAT
 REAL
Use the subtypes DEC, DECIMAL, and NUMERIC to declare fixed-point numbers with a maximum precision of 38 decimal digits.
Use the subtypes DOUBLE PRECISION and FLOAT to declare floating-point numbers with a maximum precision of 126 binary digits, which is roughly equivalent to 38 decimal digits.
Use the subtype REAL to declare floating-point numbers with a maximum precision of 63 binary digits, which is roughly equivalent to 18 decimal digits.
Use the subtypes INTEGER, INT, and SMALLINT to declare integers with a maximum precision of 38 decimal digits.
 PLS_INTEGER
 You use the PLS_INTEGER datatype to store signed integers.
 PLS_INTEGER values require less storage than NUMBER values. Also, PLS_INTEGER operations use machine arithmetic, so they are faster than NUMBER and BINARY_INTEGER operations.
 Character Types
 Character types let you store alphanumeric data, represent words and text, and manipulate character strings.
 CHAR
 We use the CHAR datatype to store fixed-length character data. How the data is represented internally depends on the database character set.
 The maximum size is up to 32767 bytes.
 We can specify the size in terms of bytes or characters, where each character contains one or more bytes, depending on the character set encoding.
 CHAR[(maximum_size [CHAR | BYTE])]
 If you do not specify a maximum size, it defaults to 1.
 VARCHAR2
 You use the VARCHAR2 datatype to store variable-length character data.
 VARCHAR2(maximum_size [CHAR | BYTE])
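The following declaration section is a small illustrative sketch (the variable names are hypothetical) showing several of the number and character types described above:

DECLARE
   price    NUMBER(8,2);          -- fixed-point: 8 digits in all, 2 after the decimal point
   counter  BINARY_INTEGER := 0;  -- signed integer
   qty      PLS_INTEGER := 0;     -- signed integer using machine arithmetic
   grade    CHAR(1);              -- fixed-length character data
   ename    VARCHAR2(30);         -- variable-length character data
BEGIN
   price := 1234.56;
   ename := 'SMITH';
END;
/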
 National Character Types
 The widely used one-byte ASCII and EBCDIC character sets are adequate to represent the Roman alphabet, but some Asian languages, such as Japanese, contain thousands of characters.
 These languages require two or three bytes to represent each character.
 NCHAR
 You use the NCHAR datatype to store fixed-length (blank-padded if necessary) national character data.
 The NCHAR datatype takes an optional parameter that lets you specify a maximum size in characters. The syntax follows:
 NCHAR[(maximum_size)]
 NVARCHAR2
 You use the NVARCHAR2 datatype to store variable-length Unicode character data.
 The NVARCHAR2 datatype takes a required parameter that specifies a maximum size in characters. The syntax follows:
 NVARCHAR2(maximum_size)
 Boolean Type
 BOOLEAN
 You use the BOOLEAN datatype to store the logical values TRUE, FALSE, and NULL.
 Only logic operations are allowed on BOOLEAN variables.
 The BOOLEAN datatype takes no parameters.
 LOB Types
 The LOB (large object) datatypes BFILE, BLOB, CLOB, and NCLOB let you store blocks of unstructured data (such as text, graphic images, video clips, and sound waveforms) up to four gigabytes in size.
 PL/SQL operates on LOBs through locators.
 BFILE
 You use the BFILE datatype to store large binary objects in operating system files outside the database. Every BFILE variable stores a file locator, which points to a large binary file on the server.
 BFILEs are read-only, so you cannot modify them. The size of a BFILE is system dependent but cannot exceed four gigabytes.
 BLOB
 You use the BLOB datatype to store large binary objects in the database.
 Every BLOB variable stores a locator, which points to a large binary object.
 The size of a BLOB cannot exceed four gigabytes.
 CLOB
 You use the CLOB datatype to store large blocks of character data in the database.
 The size of a CLOB cannot exceed four gigabytes.
 NCLOB
 You use the NCLOB datatype to store large blocks of NCHAR data in the database.
 The size of an NCLOB cannot exceed four gigabytes.
 Date and Interval Types
 The datatypes in this section let you store and manipulate dates, times, and intervals (periods of time).
 A variable that has a date/time datatype holds values called datetimes; a variable that has an interval datatype holds values called intervals.
 A datetime or interval consists of fields, which determine its value.
 DATE
 You use the DATE datatype to store fixed-length datetimes, which include the time of day in seconds since midnight.
 The date portion defaults to the first day of the current month; the time portion defaults to midnight.
 The date function SYSDATE returns the current date and time.
 You can add and subtract dates.
 For example, the following statement returns the number of days since an employee was hired:
SELECT SYSDATE - hiredate INTO days_worked FROM emp WHERE empno = 7499;
 INTERVAL YEAR TO MONTH
 You use the datatype INTERVAL YEAR TO MONTH to store and manipulate intervals of years and months.
 The syntax is: INTERVAL YEAR[(precision)] TO MONTH, where precision specifies the number of digits in the years field. You cannot use a symbolic constant or variable to specify the precision; you must use an integer literal in the range 0 .. 4. The default is 2.
 INTERVAL DAY TO SECOND
 You use the datatype INTERVAL DAY TO SECOND to store and manipulate intervals of days, hours, minutes, and seconds.
 The syntax is: INTERVAL DAY[(leading_precision)] TO SECOND[(fractional_seconds_precision)], where leading_precision and fractional_seconds_precision specify the number of digits in the days field and seconds field, respectively. In both cases, you cannot use a symbolic constant or variable to specify the precision; you must use an integer literal in the range 0 .. 9. The defaults are 2 and 6, respectively.

 VARIABLES
 Variables may be used to store the results of a query or calculations. Variables must be declared before being used.
 Variable Names
 A variable name must begin with a character.
 The maximum length of a variable name is 30 characters.
 Reserved words cannot be used as variable names unless enclosed within double quotes.
 Variables must be separated from each other by at least one space or by a punctuation mark.
 The case (upper/lower) is insignificant when declaring variable names.
 Spaces cannot be used in a variable name.
 Declaring Variables
 We can declare a variable of any data type either native to ORACLE or native to PL/SQL.
 Variables are declared in the DECLARE section of the PL/SQL block.
 Declaration involves the name of the variable followed by its data type, followed by a semicolon (;).
 To assign a value to the variable, the assignment operator (:=) is used.
 Syntax:
 <VariableName> <type> [:= <value>];
 Example:
 Ename CHAR(10);
 Assigning Values to a Variable
 There are two ways to assign a value to a variable:
 Using the assignment operator (:=)
Ex: sal := 1000.00;
Total_sal := sal - tax;
 Selecting or fetching table data values into variables.
Ex: SELECT sal INTO pay FROM Employee WHERE emp_id = 'E001';
 CONSTANTS
 A variable can be modified; a constant cannot.
 Declaring a Constant
 Declaring a constant is similar to declaring a variable, except that you have to add the keyword CONSTANT and immediately assign a value to it.
 Syntax:
 <VariableName> CONSTANT <datatype> := <value>;
 Example:
 Pi CONSTANT NUMBER(3,2) := 3.14;
 USE OF %TYPE
 While creating a table, the user attaches certain attributes like data type and constraints to its columns.
 These attributes can be passed on to the variables being created in PL/SQL using the %TYPE attribute.
 This simplifies the declaration of variables and constants.
 The %TYPE attribute is used in the declaration of a variable when the variable's attributes must be picked up from a table field.
 Advantages of using %TYPE
 You do not need to know the data type of the table column.
 If you change the parameters of the table column, the variable's parameters will change as well.
 Syntax: <variable_name> Tablename.column_name%TYPE;
 Example: mSal Employee.Sal%TYPE;
Here, mSal is the variable. It gets the datatype and constraints of the column Sal of the table Employee.
 USE OF %ROWTYPE
 In case variables for an entire row of a table need to be declared, then instead of declaring them individually, %ROWTYPE is used.
 In this case, the variable is a composite variable, consisting of the column names of the table as its members.
 Syntax:
 <variable_name> Tablename%ROWTYPE;
 Example:
 mEmployee_Row Employee%ROWTYPE;
 Here, the variable mEmployee_Row is a composite variable, consisting of the column names of the table as its members. To refer to a specific field such as sal, we can use
 mEmployee_Row.sal
 IDENTIFIERS
 The name of any ORACLE object (variable, memory variable, constant, record, cursor etc.) is known as an identifier.
 Working with Identifiers:
 An identifier cannot be declared twice in the same block.
 The same identifier can be declared in two different blocks. In this case, the two identifiers are unique, and any change in one does not affect the other.
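Putting the %TYPE and %ROWTYPE attributes together, here is a short sketch; it assumes the Employee table used in the examples above, with columns emp_id and sal:

DECLARE
   mSal Employee.Sal%TYPE;     -- inherits the datatype of the Sal column
   mEmp Employee%ROWTYPE;      -- one member per column of the Employee table
BEGIN
   SELECT * INTO mEmp FROM Employee WHERE emp_id = 'E001';
   mSal := mEmp.sal;           -- refer to a specific field of the composite variable
   dbms_output.put_line('Salary of E001 is ' || mSal);
END;
/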
COMPARISON OPERATORS
Comparison operators compare one expression with another and yield a BOOLEAN result (TRUE, FALSE or NULL). They are:
=  Equal to operator
!= Not equal to operator
<  Less than operator
>  Greater than operator
<= Less than or equal to operator
>= Greater than or equal to operator
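A brief sketch (the variable names and values are hypothetical) showing a comparison operator inside a PL/SQL condition:

DECLARE
   mBal    NUMBER(8,2) := 4500.00;
   mMinBal NUMBER(8,2) := 5000.00;
BEGIN
   IF mBal >= mMinBal THEN
      dbms_output.put_line('Balance is adequate');
   ELSE
      dbms_output.put_line('Balance is below the minimum');
   END IF;
END;
/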
LIKE OPERATOR
The LIKE operator is a pattern matching operator. It is used to compare a character string against a pattern.
Wild card characters:
o Percentage sign (%) - It matches any number of characters in a string.
o Underscore ( _ ) - It matches exactly one character.
Example:
 SELECT EName FROM Employee WHERE EName LIKE 'P%';
 It displays the EName field of the Employee table where EName starts with P.

IN OPERATOR
It checks to see if a value lies within a specified list of values. The IN operator returns a BOOLEAN result, either TRUE or FALSE.
Syntax: the_value [NOT] IN (value1, value2, value3, ...)
Example: 3 IN (4, 8, 7, 5, 3, 2) returns TRUE.

BETWEEN
It checks to see if a value lies within a specified range of values. The low end and the high end are inclusive.
Syntax: the_value [NOT] BETWEEN low_end AND high_end
Example: 5 BETWEEN -5 AND 10 returns TRUE.

IS NULL
It checks to see if a value is NULL.
Syntax: the_value IS [NOT] NULL
Example:
IF balance IS NULL THEN
Sequence_of_statements;
END IF;

LOGICAL OPERATORS
PL/SQL implements 3 logical operators: AND, OR and NOT.

A      B      A AND B   A OR B   NOT A
TRUE   TRUE   TRUE      TRUE     FALSE
TRUE   FALSE  FALSE     TRUE     FALSE
TRUE   NULL   NULL      TRUE     FALSE
FALSE  TRUE   FALSE     TRUE     TRUE
FALSE  FALSE  FALSE     FALSE    TRUE
FALSE  NULL   FALSE     NULL     TRUE
NULL   TRUE   NULL      TRUE     NULL
NULL   FALSE  FALSE     NULL     NULL
NULL   NULL   NULL      NULL     NULL

STRING OPERATORS
PL/SQL has two operators specially designed to operate only on character string type data. They are:
 LIKE
 LIKE is a pattern matching operator.
 It is used to compare a character string against a pattern.
 Wild card characters:
 Percentage sign (%) - It matches any number of characters in a string.
 Underscore ( _ ) - It matches exactly one character.
     Example: SELECT ENameFROM Employee WHERE EName LIKE ‘P%’; It displays EName field of Employee table where ENames starts with P.  Concatenation ( || ) The concatenation operator returns a resultant string consisting of all the charactersin string_1 followed by all the characters in string_2. Syntax: String_1 || string_2; A=’XX’ B=’YY’ C=VARCHAR2 (50) C=A || ‘ ‘ || B Returns a value to variable C as ‘XX YY’. DISPLAYING USER MESSAGES ON THE SCREEN DBMS_OUTPUT is a package that includes number of procedures and functions that accumulate information in a buffer so that it can be retrieved later. PUT_LINE puts a piece of information in the package buffer followed by an end-of-line marker. It can also be used to display message. PUT_LINE expects a single parameter of character data type. CONDITIONAL CONTROL IN PL/SQL In PL/SQL, the if statement allows you to control the execution of a block ofcode. PL/SQL allows the use of an IF statement to control the execution of a block of code. The different forms are:  IF condition THEN Statements; END IF;
 IF condition THEN
Statements;
ELSE
Statements;
END IF;
 IF condition THEN
Statements;
ELSIF condition THEN
Statements;
ELSE
Statements;
END IF;

Example: Write a PL/SQL block of code to deduct a fine of Rs.100 from accounts whose current balance is less than the minimum balance of Rs.5000.

DECLARE
mACCT_NO VARCHAR2(10);
mCUR_BAL NUMBER(8,2);
mFINE NUMBER(4) := 100;
mMIN_BAL CONSTANT NUMBER(7,2) := 5000.00;
BEGIN
/* Accept the account number from the user. */
mACCT_NO := &mACCT_NO;
/* Retrieve the current balance from the ACCT_MSTR table where the ACCT_NO in the table is equal to the mACCT_NO entered by the user. */
SELECT CURBAL INTO mCUR_BAL FROM ACCT_MSTR WHERE ACCT_NO = mACCT_NO;
/* Check if the resultant balance is less than the minimum balance of Rs.5000. If the condition is satisfied, an amount of Rs.100 is deducted as a fine from the current balance of the corresponding ACCT_NO. */
IF mCUR_BAL < mMIN_BAL THEN
UPDATE ACCT_MSTR SET CURBAL = CURBAL - mFINE WHERE ACCT_NO = mACCT_NO;
END IF;
END;

EXIT and EXIT WHEN Statements
EXIT and EXIT WHEN statements enable you to escape out of the control of a loop.
The format of the EXIT statement is as follows:
Syntax:
EXIT;
The EXIT WHEN statement has the following syntax:
Syntax:
EXIT WHEN <condition>;
 The EXIT WHEN statement enables you to specify the condition required to exit the execution of the loop.
 In this case, no IF statement is required.
Ex-1:
IF count >= 10 THEN
EXIT;
END IF;
Ex-2:
EXIT WHEN count >= 10;

Iterative Control
Iterative control indicates the ability to repeat or skip sections of a code block.
A loop marks a sequence of statements that has to be repeated. The keyword LOOP has to be placed before the first statement in the sequence of statements to be repeated, while the keyword END LOOP is placed immediately after the last statement in the sequence.
Once a loop begins to run, it would go on forever. Hence loops are always accompanied by conditional statements that keep control over the number of times they are executed.

Simple Loop
In a simple loop, the keyword LOOP should be placed before the first statement in the sequence and the keyword END LOOP should be written at the end of the sequence to end the loop.
The format is as follows:
LOOP
Statements;
END LOOP;

Example: Create a simple loop such that a message is displayed when the loop exceeds a particular value.
DECLARE
i number := 0;
BEGIN
LOOP
i := i + 2;
EXIT WHEN i > 10;
END LOOP;
dbms_output.put_line('Loop exited as the value of i has reached ' || to_char(i));
END;

While Loop
The WHILE loop enables you to evaluate a condition before a sequence of statements is executed. If the condition is TRUE, the sequence of statements is executed.
The syntax for the WHILE loop is as follows:
Syntax:
WHILE <condition>
LOOP
<Statements>
END LOOP;

Example: Write a PL/SQL code block to calculate the area of a circle for a value of radius varying from 3 to 7. Store the radius and the corresponding values of calculated area in an empty table named AREAS, consisting of two columns, Radius and Area.

CREATE TABLE AREAS (RADIUS NUMBER(5), AREA NUMBER(14,2));

DECLARE
/* Declaration of memory variables and constants to be used in the Execution section. */
pi constant number(4,2) := 3.14;
radius number(5);
area number(14,2);
BEGIN
/* Initialize the radius to 3, since calculations are required for radius 3 to 7. */
radius := 3;
/* Set a loop so that it fires till the radius value reaches 7. */
WHILE radius <= 7
LOOP
/* Area calculation for a circle. */
area := pi * power(radius, 2);
/* Insert the value of the radius and its corresponding calculated area into the table. */
INSERT INTO areas VALUES (radius, area);
/* Increment the value of the variable radius by 1. */
radius := radius + 1;
END LOOP;
END;

FOR LOOP
The FOR loop enables you to execute a loop a predetermined number of times.
The variable in the FOR loop need not be declared. The increment value cannot be specified; the FOR loop variable is always incremented by 1.
REVERSE is an optional keyword. If the keyword REVERSE is specified, the variable takes the last value first and is then decremented down to the start value.
The syntax for the FOR loop is as follows:
FOR var IN [REVERSE] start..end
LOOP
Statements;
END LOOP;
Example: Write a PL/SQL block of code for inverting a number, e.g. 5639 to 9365.

DECLARE
/* Declaration of memory variables and constants to be used in the Execution section. */
given_number varchar(5) := '5639';
str_length number(2);
inverted_number varchar(5);
BEGIN
/* Store the length of the given number. */
str_length := length(given_number);
/* Initialize the loop such that it repeats for the number of times equal to the length of the given number. Also, since the number is required to be inverted, the loop should consider the last digit first, i.e. process the digits in reverse order. */
FOR cntr IN REVERSE 1..str_length
/* Variables used as counters in the FOR loop need not be declared, i.e. the declaration of cntr is not required. */
LOOP
/* The digit at position cntr is obtained using the substr function and appended to the variable, while retaining the digits already stored in it. */
inverted_number := inverted_number || substr(given_number, cntr, 1);
END LOOP;
/* Display the given number, as well as the inverted number, which is stored in the variable, on screen. */
dbms_output.put_line('The Given number is ' || given_number);
dbms_output.put_line('The Inverted number is ' || inverted_number);
END;
GOTO Statement
The GOTO statement allows you to change the flow of control within a PL/SQL block.
This statement allows execution of a section of code which is not in the normal flow of control.
The entry point into such a block of code is marked using a tag of the form <<user_defined_name>>.
The syntax is as follows:
Syntax:
GOTO <label_name>;
The label is surrounded by double angle brackets (<< >>), and the label must not have a semicolon after the label name, because the label is not a PL/SQL statement.

Example: Write a PL/SQL block of code to achieve the following: if no transactions have taken place in the last 365 days, then mark the account status as inactive, and then record the account number, the opening date and the type of account in the INACTV_ACCT_MSTR table.

CREATE TABLE INACTV_ACCT_MSTR (
ACCT_NO VARCHAR2(10), OPNDT DATE, TYPE VARCHAR2(2));

DECLARE
/* Declaration of memory variables and constants to be used in the Execution section. */
mACCT_NO VARCHAR2(10);
mANS VARCHAR2(3);
mOPNDT DATE;
mTYPE VARCHAR2(2);
BEGIN
/* Accept the account number from the user. */
mACCT_NO := &mACCT_NO;
/* Check whether any transaction has been performed on this account in the last 365 days. */
SELECT 'YES' INTO mANS FROM TRANS_MSTR WHERE ACCT_NO = mACCT_NO HAVING MIN(SYSDATE - DT) > 365;
/* If no transactions have taken place in the last 365 days, the execution control is transferred to a user-labelled section of code, labelled as mark_status in this example. */
IF mANS = 'YES' THEN
GOTO mark_status;
ELSE
dbms_output.put_line('Account number: ' || mACCT_NO || ' is active');
/* Skip the labelled section for active accounts. */
GOTO exit_block;
END IF;
/* A labelled section of code which updates the STATUS of the account number held in the ACCT_MSTR table. Further, the ACCT_NO, OPNDT and TYPE are inserted into the table INACTV_ACCT_MSTR. */
<<mark_status>>
UPDATE ACCT_MSTR SET STATUS = 'I' WHERE ACCT_NO = mACCT_NO;
SELECT OPNDT, TYPE INTO mOPNDT, mTYPE FROM ACCT_MSTR WHERE ACCT_NO = mACCT_NO;
INSERT INTO INACTV_ACCT_MSTR (ACCT_NO, OPNDT, TYPE) VALUES (mACCT_NO, mOPNDT, mTYPE);
dbms_output.put_line('Account number: ' || mACCT_NO || ' is marked as inactive');
<<exit_block>>
NULL;
END;

NULL Statement
The NULL statement does nothing other than pass control to the next statement.
In a conditional construct, the NULL statement tells readers that a possibility has been considered, but no action is necessary.
In IF statements or other places that require at least one executable statement, the NULL statement satisfies the syntax.
In the following example, the NULL statement emphasizes that only top-rated employees get bonuses:
IF rating > 90 THEN
compute_bonus(emp_id);
ELSE
NULL;
END IF;

CASE Expression
A CASE expression selects a result from one or more alternatives and returns the result.
The CASE expression uses a selector, an expression whose value determines which alternative to return.
A CASE expression has the following form:
CASE selector
WHEN expression1 THEN result1
WHEN expression2 THEN result2
...
WHEN expressionN THEN resultN
[ELSE resultN+1]
END;
The selector is followed by one or more WHEN clauses, which are checked sequentially. The value of the selector determines which clause is executed. The first WHEN clause that matches the value of the selector determines the result value, and subsequent WHEN clauses are not evaluated. An example follows:
DECLARE
grade CHAR(1) := 'B';
appraisal VARCHAR2(20);
BEGIN
appraisal :=
CASE grade
WHEN 'A' THEN 'Excellent'
WHEN 'B' THEN 'Very Good'
WHEN 'C' THEN 'Good'
WHEN 'D' THEN 'Fair'
WHEN 'F' THEN 'Poor'
ELSE 'No such grade'
END;
END;
The optional ELSE clause works similarly to the ELSE clause in an IF statement. If the value of the selector is not one of the choices covered by a WHEN clause, the ELSE clause is executed.
If no ELSE clause is provided and none of the WHEN clauses are matched, the expression returns NULL.

Searched CASE Expression
PL/SQL also provides a searched CASE expression, which has the form:
CASE
WHEN search_condition1 THEN result1
WHEN search_condition2 THEN result2
...
WHEN search_conditionN THEN resultN
[ELSE resultN+1]
END;
A searched CASE expression has no selector. Each WHEN clause contains a search condition that yields a Boolean value, which lets you test different variables or multiple conditions in a single WHEN clause. An example follows:
DECLARE
grade CHAR(1);
appraisal VARCHAR2(20);
BEGIN
...
appraisal :=
CASE
WHEN grade = 'A' THEN 'Excellent'
WHEN grade = 'B' THEN 'Very Good'
WHEN grade = 'C' THEN 'Good'
WHEN grade = 'D' THEN 'Fair'
WHEN grade = 'F' THEN 'Poor'
ELSE 'No such grade'
END;
...
END;
The search conditions are evaluated sequentially. The Boolean value of each search condition determines which WHEN clause is executed. If a search condition yields TRUE, its WHEN clause is executed. After any WHEN clause is executed, subsequent search conditions are not evaluated. If none of the search conditions yields TRUE, the optional ELSE clause is executed.
If no WHEN clause is executed and no ELSE clause is supplied, the value of the expression is NULL.

CASE Statement
 Like the IF statement, the CASE statement selects one sequence of statements to execute.
 However, to select the sequence, the CASE statement uses a selector rather than multiple Boolean expressions.
 The CASE statement is more readable and more efficient. So, when possible, rewrite lengthy IF-THEN-ELSIF statements as CASE statements.
 The CASE statement begins with the keyword CASE.
 The keyword is followed by a selector, which is the variable grade in the example below.
 The selector expression can be arbitrarily complex. For example, it can contain function calls.
 Usually, however, it consists of a single variable. The selector expression is evaluated only once.
 The selector is followed by one or more WHEN clauses, which are checked sequentially.
 The value of the selector determines which clause is executed. If the value of the selector equals the value of a WHEN-clause expression, that WHEN clause is executed.
 The ELSE clause works similarly to the ELSE clause in an IF statement. In the example, if the grade is not one of the choices covered by a WHEN clause, the ELSE clause is selected, and the phrase 'No such grade' is output.
 The ELSE clause is optional. However, if you omit the ELSE clause, PL/SQL adds the following implicit ELSE clause:
ELSE RAISE CASE_NOT_FOUND;
Consider the following code that outputs descriptions of school grades:
Example
CASE grade
WHEN 'A' THEN dbms_output.put_line('Excellent');
WHEN 'B' THEN dbms_output.put_line('Very Good');
WHEN 'C' THEN dbms_output.put_line('Good');
WHEN 'D' THEN dbms_output.put_line('Fair');
WHEN 'F' THEN dbms_output.put_line('Poor');
ELSE dbms_output.put_line('No such grade');
END CASE;

Searched CASE Statement
PL/SQL also provides a searched CASE statement, which has the form:
CASE
WHEN search_condition1 THEN sequence_of_statements1;
WHEN search_condition2 THEN sequence_of_statements2;
...
WHEN search_conditionN THEN sequence_of_statementsN;
[ELSE sequence_of_statementsN+1;]
END CASE;
The searched CASE statement has no selector. Its WHEN clauses contain search conditions that yield a Boolean value, not expressions that can yield a value of any type.
The search conditions are evaluated sequentially. The Boolean value of each search condition determines which WHEN clause is executed. If a search condition yields TRUE, its WHEN clause is executed. If any WHEN clause is executed, control passes to the next statement, so subsequent search conditions are not evaluated.
If none of the search conditions yields TRUE, the ELSE clause is executed. The ELSE clause is optional. However, if you omit the ELSE clause, PL/SQL adds the following implicit ELSE clause:
ELSE RAISE CASE_NOT_FOUND;
An example follows:
CASE
WHEN grade = 'A' THEN dbms_output.put_line('Excellent');
WHEN grade = 'B' THEN dbms_output.put_line('Very Good');
WHEN grade = 'C' THEN dbms_output.put_line('Good');
WHEN grade = 'D' THEN dbms_output.put_line('Fair');
WHEN grade = 'F' THEN dbms_output.put_line('Poor');
ELSE dbms_output.put_line('No such grade');
END CASE;

Handling Null Values in Comparisons and Conditional Statements
When working with nulls, you can avoid some common mistakes by keeping in mind the following rules:
 Comparisons involving nulls always yield NULL.
 Applying the logical operator NOT to a null yields NULL.
 In conditional control statements, if the condition yields NULL, its associated sequence of statements is not executed.
 If the expression in a simple CASE statement or CASE expression yields NULL, it cannot be matched by using WHEN NULL. In this case, you would need to use the searched CASE syntax and test WHEN expression IS NULL.
Example 1:
x := 5;
y := NULL;
...
IF x != y THEN -- yields NULL, not TRUE
sequence_of_statements; -- not executed
END IF;
Example 2:
a := NULL;
b := NULL;
...
IF a = b THEN -- yields NULL, not TRUE
sequence_of_statements; -- not executed
END IF;

Concept of Nested Tables
Within the database, nested tables can be considered one-column database tables.
Oracle stores the rows of a nested table in no particular order. But, when you retrieve the nested table into a PL/SQL variable, the rows are given consecutive subscripts starting at 1. That gives you array-like access to individual rows.
PL/SQL nested tables are like one-dimensional arrays. You can model multi-dimensional arrays by creating nested tables whose elements are also nested tables.
Nested tables are singly dimensioned, unbounded collections of homogeneous elements.
Nested tables are available in both PL/SQL and the database.
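As a small illustrative sketch (the type and variable names are hypothetical), a nested table can be declared, initialized and extended in PL/SQL as follows:

DECLARE
   TYPE NameList IS TABLE OF VARCHAR2(20);        -- a nested table type
   names NameList := NameList('Ajay', 'Rahul');   -- the constructor fills rows 1 and 2
BEGIN
   names.EXTEND;                                  -- make room for a third element
   names(3) := 'Sunita';
   dbms_output.put_line('First name: ' || names(1));  -- subscripts start at 1
   dbms_output.put_line('Count: ' || names.COUNT);
END;
/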
ORACLE TRANSACTIONS

PL/SQL TRANSACTIONS
A series of one or more SQL statements that are logically related, or a series of operations performed on Oracle table data, is termed a transaction. Oracle treats this logical unit as a single entity.
Oracle treats changes to table data as a two-step process. First, the changes requested are done. To make these changes permanent, a COMMIT statement has to be given at the SQL prompt. A ROLLBACK statement given at the SQL prompt can be used to undo a part of or the entire transaction.
Specifically, a transaction is a group of events that occur between any of the following events:
 Connecting to Oracle
 Disconnecting from Oracle
 Committing changes to the database table
 Rollback

Closing Transactions
 A transaction can be closed by using either a COMMIT or a ROLLBACK statement.
 By using these statements, table data can be changed, or all the changes made to the table data can be undone.

Using COMMIT:
A COMMIT ends the current transaction and makes permanent any changes made during the transaction. All transactional locks acquired on tables are released.
Syntax:
COMMIT;

Using ROLLBACK:
A ROLLBACK does exactly the opposite of COMMIT. It ends the transaction but undoes any changes made during the transaction. All transactional locks acquired on tables are released.
Syntax:
ROLLBACK [WORK] [TO [SAVEPOINT] <SavePointName>];
where,
WORK: is optional and is provided for ANSI compatibility
SAVEPOINT: is optional and is used to rollback a transaction partially, as far as the specified savepoint
SavePointName: is a savepoint created during the current transaction

Creating A SAVEPOINT
SAVEPOINT marks and saves the current point in the processing of a transaction. When a SAVEPOINT is used with a ROLLBACK statement, parts of a transaction can be undone. An active SAVEPOINT is one that was specified since the last COMMIT or ROLLBACK.
Syntax:
SAVEPOINT <SavePointName>;

A ROLLBACK operation performed without the SAVEPOINT clause amounts to the following:
 Ends the transaction
 Undoes all the changes in the current transaction
 Erases all savepoints in that transaction
 Releases the transactional locks

A ROLLBACK operation performed with the TO SAVEPOINT clause amounts to the following:
 A predetermined portion of the transaction is rolled back
 Retains the savepoint rolled back to, but loses those created after the named savepoint
 Releases all transactional locks that were acquired since the savepoint was taken

Example 1: Write a PL/SQL block of code that first withdraws an amount of Rs.1,000. Then deposit an amount of Rs.1,40,000. Update the current balance. Then check to see that the current balance of all the accounts in the bank does not exceed Rs.2,00,000. If the balance exceeds Rs.2,00,000, then undo the deposit just made.

DECLARE
mBAL number(8,2);
BEGIN
/* Insertion of a record in the TRANS_MSTR table for the withdrawal. */
INSERT INTO TRANS_MSTR (TRANS_NO, ACCT_NO, DT, TYPE, PARTICULAR, DR_CR, AMT, BALANCE)
VALUES ('T100', 'CA10', '04-JUL-2004', 'C', 'Telephone Bill', 'W', 1000, 31000);
/* Updating the current balance of account number CA10 in the ACCT_MSTR table. */
UPDATE ACCT_MSTR SET CURBAL = CURBAL - 1000 WHERE ACCT_NO = 'CA10';
/* Defining a savepoint. */
SAVEPOINT no_update;
/* Insertion of a record in the TRANS_MSTR table for the deposit. */
INSERT INTO TRANS_MSTR (TRANS_NO, ACCT_NO, DT, TYPE, PARTICULAR, DR_CR, AMT, BALANCE)
VALUES ('T101', 'CA10', '04-JUL-2004', 'C', 'Deposit', 'D', 140000, 171000);
/* Updating the current balance of account number CA10 in the ACCT_MSTR table. */
UPDATE ACCT_MSTR SET CURBAL = CURBAL + 140000 WHERE ACCT_NO = 'CA10';
/* Storing the total current balance from the ACCT_MSTR table into a variable. */
SELECT SUM(CURBAL) INTO mBAL FROM ACCT_MSTR;
/* Checking if the total current balance exceeds 200000. */
IF mBAL > 200000 THEN
/* Undo the deposit: all changes made after the savepoint are rolled back. */
ROLLBACK TO SAVEPOINT no_update;
END IF;
/* Make the changes permanent. */
COMMIT;
END;
Output:
PL/SQL procedure successfully completed.

PL/SQL SECURITY
 An Oracle transaction can be made up of a single SQL sentence or several SQL sentences.
 This gives rise to Single Query Transactions and Multiple Query Transactions (i.e. SQT and MQT).
 These transactions (whether SQT or MQT) access an Oracle table(s).
 Since Oracle works on a multi-user platform, it is more than likely that several people will access data, either for viewing or for manipulating (inserting, updating and deleting records), from the same tables at the same time via different SQL statements.
 The Oracle table is therefore a global resource, i.e. it is shared by several users.
 The Oracle engine has to allow simultaneous access to table data without causing damage to the data.
 The technique employed by the Oracle engine to protect table data when several people are accessing it is called Concurrency Control.
 Oracle uses a method called Locking to implement concurrency control when multiple users access a table to manipulate its data at the same time.

LOCKS
 Locks are mechanisms used to ensure data integrity while allowing maximum concurrent access to data. Oracle's locking is fully automatic and requires no user intervention.
 The Oracle engine automatically locks table data while executing SQL statements. This type of locking is called Implicit Locking.

Oracle's Default Locking Strategy - Implicit Locking
Since the Oracle engine has a fully automatic locking strategy, it has to decide on two issues:
 The type of lock to be applied
 The level of lock to be applied

Types Of Locks
The type of lock to be placed on a resource depends on the operation being performed on that resource. Operations on tables can be distinctly grouped into the following two categories:
 Read operations: SELECT statements
 Write operations: INSERT, UPDATE, DELETE statements
 Since read operations make no changes to data in a table and are meant only for viewing purposes, simultaneous read operations can be performed on a table without any danger to the table's data.
 Hence, the Oracle engine places a Shared lock on a table when its data is being viewed.
 Write operations cause a change in table data, i.e. any insert, update or delete statement affects table data directly, and hence simultaneous write operations can adversely affect table data integrity.
 Simultaneous write operations would cause loss of data consistency in the table.
 Hence, the Oracle engine places an Exclusive lock on a table, or on specific sections of the table's resources, when data is being written to a table.
The rules of locking can be summarized as:
 Data being changed cannot be read.
 Writers wait for other writers if they attempt to update the same rows at the same time.
The two types of locks supported by Oracle are:
Shared Locks
 Shared locks are placed on resources whenever a read operation (SELECT) is performed.
 Multiple shared locks can be simultaneously set on a resource.

Exclusive Locks
 Exclusive locks are placed on resources whenever write operations (INSERT, UPDATE and DELETE) are performed.
 Only one exclusive lock can be placed on a resource at a time, i.e. the first user who acquires an exclusive lock will continue to have sole ownership of the resource, and no other user can acquire an exclusive lock on that resource.

Note
 In the absence of explicit user-defined locking, even when a default Exclusive lock has been taken on a table, a Shared lock on the very same data is still permitted.
 Automatic application of locks on resources by the Oracle engine results in a high degree of data consistency.

Levels Of Locks
 A table can be decomposed into rows, and a row can be further decomposed into fields.
 Hence, if an automatic locking system were designed so as to be able to lock the fields of a record, it would be the most flexible locking system available.
 Oracle does not provide a field level lock. Oracle provides the following three levels of locking:
 Row level
 Page level
 Table level
The Oracle engine decides on the level of lock to be used by the presence or absence of a WHERE condition in the SQL sentence:
 If the WHERE clause evaluates to only one row in the table, a row level lock is used.
 If the WHERE clause evaluates to a set of data, a page level lock is used.
 If there is no WHERE clause (i.e. the query accesses the entire table), a table level lock is used.

Example of Implicit Locking
Example: The BRANCH_MSTR table will be used to check the behaviour of the Oracle engine in a multi-user environment when an insert operation is performed.

Table Name: BRANCH_MSTR
BRANCH_NO   NAME
B1          Vile Parle (HO)
B2          Andheri
B3          Churchgate
B4          Mahim
B5          Borivali
B6          Darya Ganj

Client A performs an insert operation on the BRANCH_MSTR table:
Client A> INSERT INTO BRANCH_MSTR (BRANCH_NO, NAME) VALUES ('B7', 'Dahisar');
Output:
1 row created.

Client A fires a SELECT statement on the BRANCH_MSTR table:
Client A> SELECT * FROM BRANCH_MSTR;
Output:
BRANCH_NO   NAME
B1          Vile Parle (HO)
B2          Andheri
B3          Churchgate
B4          Mahim
B5          Borivali
B6          Darya Ganj
B7          Dahisar
7 rows selected.

Client B fires a SELECT statement on the BRANCH_MSTR table:
Client B> SELECT * FROM BRANCH_MSTR;
Output:
BRANCH_NO   NAME
B1          Vile Parle (HO)
B2          Andheri
B3          Churchgate
B4          Mahim
B5          Borivali
B6          Darya Ganj
6 rows selected.

Observations:
 Client A can see the newly inserted record B7.
 Client B cannot see the newly inserted record, as Client A has not committed it.

Inference:
Since Client A has not fired a commit statement for permanently saving the newly inserted record in the BRANCH_MSTR table, Client B cannot access the newly inserted record or manipulate it in any way.
Note: Client A can view, update or delete the newly inserted record, since it exists in the buffer on Client A's computer. However, this record does not exist in the server's table, because Client A has not committed the transaction.

Explicit Locking
Although the Oracle engine has a default locking strategy, in commercial applications explicit user-defined locking is often required. Consider the example below:
If two client computers (Client A and Client B) are entering sales orders, each time a sales order is prepared, the quantity on hand of the product for which the order is being generated needs to be updated in the PRODUCT_MSTR table.
Now, if Client A fires an update command on a record in the PRODUCT_MSTR table, then Oracle will implicitly lock the record so that no further data manipulation can be done by any other user till the lock is released. The lock will be released only when Client A fires a commit or rollback.
In the meantime, if Client B tries to view the same record, the Oracle engine will display the old set of values for the record, as the transaction for that record has not been completed by Client A. This leads to wrong information being displayed to Client B.
In such cases, Client A must explicitly lock the record such that no other user can access the record, even for viewing purposes, till Client A's transaction is completed. A lock so defined is called an Explicit Lock. User-defined explicit locking always overrides Oracle's default locking strategy.

Explicit Locking
The technique of a lock being taken on a table or its resources by a user is called Explicit Locking. Users can lock tables they own or any tables on which they have been granted table privileges (such as select, insert, update, delete).
Oracle provides facilities by which the default locking strategy can be overridden; table(s) or row(s) can be explicitly locked.
Explicit locking can be achieved in two ways:
o The SELECT ... FOR UPDATE statement
o The LOCK TABLE statement

The SELECT ... FOR UPDATE Statement
It is used for acquiring exclusive row level locks in anticipation of performing updates on records. This clause is generally used to signal the Oracle engine that data currently being used needs to be updated. It is often followed by one or more update statements with a WHERE clause.

Example 1: Two client machines, Client A and Client B, are recording the transactions performed in a bank for a particular account number simultaneously.
Client A fires the following SELECT statement:
Client A> SELECT * FROM ACCT_MSTR WHERE ACCT_NO = 'SB9' FOR UPDATE;
When the above SELECT statement is fired, the Oracle engine locks the record SB9. This lock is released when a commit or rollback is fired by Client A.
Now Client B fires a SELECT statement, which points to record SB9, which has already been locked by Client A:
Client B> SELECT * FROM ACCT_MSTR WHERE ACCT_NO = 'SB9' FOR UPDATE;
The Oracle engine will make Client B's SQL statement wait indefinitely, until the lock on ACCT_MSTR is released by a commit or rollback statement fired by Client A.
SELECT ... FOR UPDATE with the NOWAIT Option
In order to avoid unnecessary waiting time, a NOWAIT option can be used to inform the Oracle engine to terminate the SQL statement if the record has already been locked. If this happens, the Oracle engine terminates the running DML and comes up with a message indicating that the resource is busy.
If Client B now fires the following select statement with a NOWAIT clause:
Client B> SELECT * FROM ACCT_MSTR WHERE ACCT_NO = 'SB9' FOR UPDATE NOWAIT;
Output:
Since Client A has already locked the record SB9, when Client B tries to acquire a lock on the same record the Oracle engine displays the following message:
ORA-00054: resource busy and acquire with NOWAIT specified.
The SELECT ... FOR UPDATE clause cannot be used with the following:
 The DISTINCT and the GROUP BY clause
 Set operators and group functions
Using the LOCK TABLE Statement
The LOCK TABLE statement is used to manually override Oracle's default locking strategy by creating a data lock in a specific mode.
Syntax:
LOCK TABLE <TableName> [, <TableName>] ...
IN {ROW SHARE | ROW EXCLUSIVE | SHARE UPDATE | SHARE | SHARE ROW EXCLUSIVE | EXCLUSIVE} MODE [NOWAIT]
where,
TableName: Indicates the name of the table(s) or view(s) to be locked. In the case of views, the lock is placed on the underlying tables.
IN: Decides what other locks on the same resource can exist simultaneously. For example, if there is an exclusive lock on a table, no other user can update rows in the table. It can have any of the following values:
Exclusive: Exclusive locks allow queries on the locked resource but prohibit any other activity on it.
Share: Share locks allow queries but prohibit updates to the table.
Row Exclusive: Row exclusive locks are the same as row share locks but also prohibit locking in share mode. These locks are acquired when updating, inserting or deleting.
Share Row Exclusive: Share row exclusive locks are used to look at a whole table, to allow selective updates, and to allow other users to look at rows in the table, but not to lock the table in share mode or to update rows.
NOWAIT: Indicates that the Oracle engine should return to the user immediately with a message if the resource is busy. If omitted, the Oracle engine will wait indefinitely till the resource is available.
Oracle Table Lock Modes
Row Share Table Lock (RS)
• Indicates that the transaction holding the lock on the table has locked rows in the table and intends to update them.
• Permitted Operations: Allows other transactions to query, insert, update, delete, or lock rows concurrently in the same table. Therefore, other transactions can obtain simultaneous row share, row exclusive, share, and share row exclusive table locks on the same table.
• Prohibited Operations: Locking the table in exclusive mode.
Row Exclusive Table Lock (RX)
• Indicates that the transaction holding the lock has made one or more updates to rows in the table. A row exclusive table lock is acquired automatically by INSERT, UPDATE, DELETE, and LOCK TABLE ... IN ROW EXCLUSIVE MODE. A row exclusive table lock is slightly more restrictive than a row share table lock.
• Permitted Operations: Allows other transactions to query, insert, update, delete, or lock rows in the same table. Row exclusive table locks allow multiple transactions to obtain simultaneous row exclusive and row share table locks on the same table.
• Prohibited Operations: Prevents locking the table for exclusive reading or writing. Therefore, other transactions cannot concurrently lock the table IN SHARE MODE, IN SHARE ROW EXCLUSIVE MODE, or IN EXCLUSIVE MODE.
Share Table Lock (S)
• Acquired automatically for the table specified in the following statement:
LOCK TABLE <table> IN SHARE MODE;
• Permitted Operations: Allows other transactions only to query the table, to lock specific rows with SELECT ... FOR UPDATE, or to execute LOCK TABLE ... IN SHARE MODE; no updates are allowed by other transactions. Multiple transactions can hold share table locks on the same table concurrently, and no such transaction can update the table (even with SELECT ... FOR UPDATE). Therefore, a transaction that holds a share table lock can update the table only if no other transaction has a share table lock on the same table.
• Prohibited Operations: Prevents other transactions from modifying the same table or from locking the table IN SHARE ROW EXCLUSIVE MODE, IN EXCLUSIVE MODE, or IN ROW EXCLUSIVE MODE.
Share Row Exclusive Table Lock (SRX)
• More restrictive than a share table lock. A share row exclusive table lock is acquired for a table as follows:
LOCK TABLE <table> IN SHARE ROW EXCLUSIVE MODE;
• Permitted Operations: Only one transaction at a time can acquire a share row exclusive table lock on a given table. A share row exclusive table lock held by a transaction allows other transactions to query the table or lock specific rows using SELECT with the FOR UPDATE clause, but not to update the table.
• Prohibited Operations: Prevents other transactions from obtaining row exclusive table locks and from modifying the same table. A share row exclusive table lock also prohibits other transactions from obtaining share, share row exclusive, and exclusive table locks.
Exclusive Table Lock (X)
• The most restrictive mode of table lock, allowing the transaction that holds the lock exclusive write access to the table. An exclusive table lock is acquired by:
LOCK TABLE <table> IN EXCLUSIVE MODE;
• Permitted Operations: Only one transaction can obtain an exclusive table lock on a table. An exclusive table lock permits other transactions only to query the table.
• Prohibited Operations: Prohibits other transactions from performing any type of DML statement or placing any type of lock on the table.
Example: Two client machines, Client A and Client B, are performing data manipulation on the table EMP_MSTR.
Table name: EMP_MSTR
EMP_NO  BRANCH_NO  FNAME    MNAME     LNAME    DEPT               DESIG              MNGR_NO
E1      B1         Ivan     Nelson    Bayross  Administration     Managing Director
E2      B2         Amit               Desai    Loans & Financing  Finance Manager
E3      B3         Maya     Mahima    Joshi    Client Servicing   Sales Manager
E4      B1         Peter    Iyer      Joseph   Loans & Financing  Clerk              E2
E5      B4         Mandhar  Dilip     Dalvi    Marketing          Marketing Manager
E6      B6         Sonal    Abdul     Khan     Administration     Admin. Executive   E1
E7      B4         Anil     Ashutosh  Kambli   Marketing          Sales Asst.        E5
E8      B3         Seema    P.        Apte     Client Servicing   Clerk              E3
E9      B2         Vikram   Vilas     Randive  Marketing          Sales Asst.        E5
E10     B6         Anjali   Sameer    Pathak   Administration     HR Manager         E1
Client A locks the table in exclusive mode (i.e. only the querying of records on the EMP_MSTR table is allowed to Client B):
Client A> LOCK TABLE EMP_MSTR IN EXCLUSIVE MODE NOWAIT;
Output:
Table(s) Locked.
Client A performs an insert operation but does not commit the transaction:
Client A> INSERT INTO EMP_MSTR (EMP_NO, BRANCH_NO, FNAME, MNAME, LNAME, DEPT, DESIG, MNGR_NO) VALUES ('E100', 'B1', 'Sharanam', 'Chaitanya', 'Shah', 'Administration', 'Project Leader', NULL);
Output:
1 row created.
Client B performs a view operation:
Client B> SELECT EMP_NO, FNAME, MNAME, LNAME FROM EMP_MSTR;
Output:
EMP_NO  FNAME    MNAME     LNAME
E1      Ivan     Nelson    Bayross
E2      Amit               Desai
E3      Maya     Mahima    Joshi
E4      Peter    Iyer      Joseph
E5      Mandhar  Dilip     Dalvi
E6      Sonal    Abdul     Khan
E7      Anil     Ashutosh  Kambli
E8      Seema    P.        Apte
E9      Vikram   Vilas     Randive
E10     Anjali   Sameer    Pathak
Client B performs an insert operation:
Client B> INSERT INTO EMP_MSTR (EMP_NO, BRANCH_NO, FNAME, MNAME, LNAME, DEPT, DESIG, MNGR_NO) VALUES ('E101', 'B1', 'Vaishali', 'Sharanam', 'Shah', 'Tech Team', 'Programmer', 'E100');
Output:
Client B's SQL DML enters into a wait state, waiting for Client A to release the locked resource by using a commit or rollback statement.
Inferences:
 When Client A locks the table EMP_MSTR in exclusive mode, the table is available only for querying by other users. No other data manipulation (i.e. insert, update or delete operations) can be performed on the EMP_MSTR table by other users.
 Since Client A has inserted a record in the EMP_MSTR table and not committed the change, the newly inserted record is not visible to Client B when Client B fires a select statement.
 As the EMP_MSTR table has been locked, when Client B tries to insert a record the system enters an indefinite wait period, till all the locks taken on the EMP_MSTR table are released by Client A.
Releasing Locks
Locks are released under the following circumstances:
 The transaction is committed successfully using the COMMIT verb.
 A rollback is performed.
 A rollback to a savepoint releases the locks set after the specified savepoint.
Note: All locks are released on a commit or an unqualified rollback; a rollback to a savepoint releases only the locks acquired after that savepoint.
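The note above can be illustrated with a small sketch (the account numbers and the savepoint name sp1 are illustrative):

Client A> UPDATE ACCT_MSTR SET CURBAL = CURBAL + 100 WHERE ACCT_NO = 'SB1';
Client A> SAVEPOINT sp1;
Client A> UPDATE ACCT_MSTR SET CURBAL = CURBAL - 100 WHERE ACCT_NO = 'SB3';
Client A> ROLLBACK TO SAVEPOINT sp1;
The change to SB3 is undone and the lock acquired on SB3 after the savepoint is released; the lock on SB1 is still held.
Client A> COMMIT;
All remaining locks held by Client A are now released.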
Examples of Explicit Locking Using SQL and the Behaviour of the Oracle Engine
The ACCT_MSTR table will be used to check the behaviour of the Oracle engine in a multi-user environment. An update operation is performed after specifying the FOR UPDATE clause.
Table name: ACCT_MSTR (Partial Extract)
ACCT_NO  CURBAL
SB1      500
CA2      3000
SB3      500
CA4      12000
SB5      500
SB6      500
CA7      22000
SB8      500
SB9      500
CA10     32000
SB11     500
CA12     5000
SB13     500
CA14     10000
SB15     500
Client A selects all the records from the ACCT_MSTR table with the FOR UPDATE clause:
Client A> SELECT ACCT_NO, CURBAL FROM ACCT_MSTR FOR UPDATE;
Client A performs an update operation on the record SB9 in the ACCT_MSTR table:
Client A> UPDATE ACCT_MSTR SET CURBAL = CURBAL + 5000 WHERE ACCT_NO = 'SB9';
Output:
1 row updated.
Client A fires a SELECT statement on the ACCT_MSTR table:
Client A> SELECT ACCT_NO, CURBAL FROM ACCT_MSTR;
Output:
ACCT_NO  CURBAL
SB1      500
CA2      3000
SB3      500
CA4      12000
SB5      500
SB6      500
CA7      22000
SB8      500
SB9      5500
CA10     32000
SB11     500
CA12     5000
SB13     500
CA14     10000
SB15     500
15 rows selected.
Client B fires a SELECT statement with a FOR UPDATE clause on the ACCT_MSTR table:
Client B> SELECT ACCT_NO, CURBAL FROM ACCT_MSTR FOR UPDATE;
Output:
Client B's SQL DML enters into an indefinite wait state, waiting for Client A to release the locked resource by using a commit or rollback statement.
CURSORS
WHAT IS A CURSOR?
 The Oracle engine uses a work area for its internal processing in order to execute an SQL statement. This work area is private to SQL's operations and is called a cursor.
 The data that is stored in the cursor is called the Active Data Set.
 Conceptually, the size of the cursor in memory is the size required to hold the number of rows in the Active Data Set.
 The actual size, however, is determined by the Oracle engine's built-in memory management capabilities and the amount of RAM available.
 Oracle has a pre-defined area set aside in main memory, within which cursors are opened.
 Hence the cursor's size will be limited by the size of this pre-defined area.
Types of Cursors
1. Implicit Cursors
I. The Oracle engine implicitly opens a cursor on the server to process each SQL statement.
II. Since the implicit cursor is opened and managed by the Oracle engine internally, the functions of reserving an area in memory, populating this area with appropriate data, processing the data in the memory area, and releasing the memory when processing is over are taken care of by the Oracle engine.
III. The resultant data is then passed to the client machine via the network, and a cursor is opened on the client machine to hold the rows returned by the Oracle engine.
IV. The number of rows held in the cursor on the client is managed by the client's operating system and its swap area.
V. Implicit cursor attributes can be used to access information about the status of the last insert, update, delete or single-row select statement.
VI. This is done by preceding the implicit cursor attribute with the implicit cursor's name, which is SQL.
Examples:
1. The HRD manager has decided to raise the salary of an employee by 15%. Write a program to accept the employee number and update the salary of that employee. Display an appropriate message if the employee does not exist.

DECLARE
   v_eno emp.empno%type;
BEGIN
   v_eno := &v_eno;
   UPDATE emp SET sal = sal + (sal * 0.15)
   WHERE empno = v_eno;
   IF SQL%FOUND THEN
      dbms_output.put_line('Employee exists; salary updated');
   ELSE
      dbms_output.put_line('Employee does not exist');
   END IF;
END;

(Note that an implicit cursor attribute such as SQL%FOUND refers to the most recently executed DML statement, so it must be tested after the UPDATE, not before it.)
2. The HRD manager has decided to raise the salary of employees whose job is programmer. Display the number of employees having the job programmer.

DECLARE
   v_count number;
BEGIN
   UPDATE emp SET sal = sal + (sal * 0.5)
   WHERE job = 'PROGRAMMER';
   v_count := SQL%ROWCOUNT;
   IF v_count > 0 THEN
      dbms_output.put_line(v_count || ' employee records updated successfully');
   ELSE
      dbms_output.put_line('No employees working as programmer');
   END IF;
END;

2. Explicit Cursors
When individual records in a table have to be processed inside a PL/SQL code block, a cursor is used. Such a cursor is declared and mapped to an SQL query in the DECLARE section of the PL/SQL block and used within its executable section. A cursor created and used in this way is known as an explicit cursor.
Example:
1. Print the name and job of employees having the job manager or analyst.

DECLARE
   vname emp.ename%type;
   vjob  emp.job%type;
   CURSOR c1 IS SELECT ename, job FROM emp
                WHERE job = 'MANAGER' OR job = 'ANALYST';
BEGIN
   OPEN c1;
   dbms_output.put_line('Name' || ' ' || 'Job');
   LOOP
      FETCH c1 INTO vname, vjob;
      EXIT WHEN c1%NOTFOUND;
      dbms_output.put_line(vname || ' ' || vjob);
   END LOOP;
   IF c1%ROWCOUNT = 0 THEN
      dbms_output.put_line('No records found');
   END IF;
   CLOSE c1;
END;

General Cursor Attributes:
 When the Oracle engine creates an implicit or explicit cursor, cursor control variables are also created to control the execution of the cursor.
 These are a set of four system variables which keep track of the current status of a cursor.
 These cursor variables can be accessed and used in a PL/SQL code block.
Both implicit and explicit cursors have four attributes. They are described below:
Attribute Name  Description
%ISOPEN         Returns TRUE if the cursor is open, FALSE otherwise.
%FOUND          Returns TRUE if a record was fetched successfully, FALSE otherwise.
%NOTFOUND       Returns TRUE if a record was not fetched successfully, FALSE otherwise.
%ROWCOUNT       Returns the number of records processed from the cursor.
Explicit Cursor Management
The steps involved in using an explicit cursor and manipulating data in its active set are (a skeleton combining all six steps is shown after this list):
 Declare a cursor mapped to an SQL select statement that retrieves data for processing.
 Open the cursor.
 Fetch data from the cursor one row at a time into memory variables.
 Process the data held in the memory variables as required, using a loop.
 Exit from the loop after processing is complete.
 Close the cursor.
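A minimal skeleton that combines the six steps (it assumes the same emp table used in the earlier examples):

DECLARE
   vname emp.ename%type;
   vsal  emp.sal%type;
   CURSOR c_emp IS SELECT ename, sal FROM emp;     -- Step 1: declare the cursor
BEGIN
   OPEN c_emp;                                     -- Step 2: open the cursor
   LOOP
      FETCH c_emp INTO vname, vsal;                -- Step 3: fetch one row into memory variables
      EXIT WHEN c_emp%NOTFOUND;                    -- Step 5: exit once every row is processed
      dbms_output.put_line(vname || ' ' || vsal);  -- Step 4: process the fetched row
   END LOOP;
   CLOSE c_emp;                                    -- Step 6: close the cursor
END;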
Cursor Declaration
A cursor is defined in the declarative part of a PL/SQL block by naming the cursor and mapping it to a query. When a cursor is declared, the Oracle engine is informed that a cursor of the said name needs to be opened. The declaration is only an intimation; there is no memory allocation at this point in time.
Syntax:
CURSOR CursorName IS SELECT statement;
Opening a Cursor
Initialization of a cursor takes place via the OPEN statement. This:
 Defines a private SQL area named after the cursor name.
 Executes the query associated with the cursor.
 Retrieves table data and populates the named private SQL area in memory, i.e. creates the Active Data Set.
 Sets the cursor row pointer in the Active Data Set to the first record.
Syntax:
OPEN CursorName;
Fetching Data from a Cursor
A FETCH statement then moves the data held in the Active Data Set into memory variables. The FETCH statement is placed inside a Loop ... End Loop construct, which causes the data to be fetched into the memory variables and processed until all the rows in the Active Data Set are processed. The fetch loop then exits; the exiting of the fetch loop is user controlled.
Syntax:
FETCH CursorName INTO Variable1, Variable2, ...;
Note: There must be a memory variable for each column value of the Active Data Set, and the datatypes must match. These variables are declared in the DECLARE section of the PL/SQL block.
Processing Data
The data held in the memory variables can be processed as desired.
Exiting the Loop
A standard loop structure (Loop ... End Loop) is used to fetch records from the cursor into the memory variables one row at a time.
Closing a Cursor
After the fetch loop exits, the cursor must be closed with the CLOSE statement. The CLOSE statement disables the cursor, and the active set becomes undefined. This releases the memory occupied by the cursor and its Active Data Set, both on the client and on the server.
Syntax:
CLOSE CursorName;
Note: Once a cursor is closed, it can be reopened by firing the OPEN statement again.
CURSOR FOR LOOPS
Another technique commonly used to control the Loop ... End Loop within a PL/SQL block is the FOR variable IN value construct. This is an example of a machine-defined loop exit: when all the values in the FOR construct are exhausted, looping stops.
Syntax:
FOR MemoryVariable IN CursorName
Here, the verb FOR automatically creates a memory variable of the %rowtype. Each record in the opened cursor becomes a value for this memory variable. The FOR verb ensures that a row from the cursor is loaded into the declared memory variable and that the loop executes once for it. This goes on until all the rows of the cursor have been loaded into the memory variable, after which the loop stops.
A cursor FOR loop automatically does the following:
 Implicitly declares its loop index as a %rowtype record.
 Opens the cursor.
 Fetches a row from the cursor for each loop iteration.
 Closes the cursor when all rows have been processed.
The cursor is also closed automatically when an EXIT or GOTO statement is used to leave the loop prematurely, or if an exception is raised inside the loop.
Example:
1. Assign commission = 500 to those employees who have a null commission, using a cursor FOR loop.

DECLARE
   CURSOR c_comm IS SELECT * FROM emp WHERE comm IS NULL;
BEGIN
   FOR emp_rec IN c_comm
   LOOP
      emp_rec.comm := 500;
      UPDATE emp SET comm = emp_rec.comm WHERE empno = emp_rec.empno;
      dbms_output.put_line(emp_rec.ename || ' ' || emp_rec.comm);
   END LOOP;
   COMMIT;
END;

Output:
PL/SQL procedure successfully completed.
(Note that the loop record emp_rec is declared implicitly by the FOR verb, and the cursor is opened and closed automatically; explicit OPEN and CLOSE statements must not be used with a cursor FOR loop.)
The FOR loop is responsible for the following:
 Automatically fetching the data retrieved by the cursor into the emp_rec record.
 Executing the sequence of statements inside the loop once for every row that is fetched.
 Automatically repeating the above steps until the data retrieval process is complete.
 Closing the cursor automatically when all the records in the cursor have been processed. This is because there are no more rows left to load into emp_rec; this situation is sensed by the FOR verb, which causes the loop to exit.
Finally, a COMMIT is fired to make the changes permanent.
PARAMETERIZED CURSORS
In the cursors seen so far, the records that make up the Active Data Set are those which satisfy the conditions set in the WHERE clause of the SELECT statement mapped to the cursor. In other words, the criterion on which the Active Data Set is determined is hard coded and never changes. Commercial applications require that the query which defines the cursor be generic, and that the data retrieved from the table be allowed to change according to need. Oracle recognizes this and permits the creation of parameterized cursors. The contents of a parameterized cursor change depending upon the value passed to its parameter. Since the cursor accepts user-defined values into its parameters, thus changing the result set extracted, it is called a parameterized cursor.
Declaring a Parameterized Cursor
Syntax:
CURSOR CursorName (VariableName Datatype) IS <SELECT statement ...>;
Opening a Parameterized Cursor
Syntax:
OPEN CursorName (Value / Variable / Expression);
Note: The scope of cursor parameters is local to the cursor, which means that they can be referenced only within the query declared in the cursor declaration. Each parameter in the declaration must have a corresponding value in the OPEN statement.
Example: Accept a salary and print the name and salary of employees having a salary less than or equal to the accepted salary.

DECLARE
   CURSOR c1 (c_sal emp.sal%type) IS
      SELECT * FROM emp WHERE sal <= c_sal;
   vsal emp.sal%type;
BEGIN
   vsal := &vsal;
   FOR emp_rec IN c1(vsal)
   LOOP
      dbms_output.put_line(emp_rec.ename || ' ' || emp_rec.sal);
   END LOOP;
END;

(The cursor parameter has been named c_sal, since a parameter name may not contain a dot; the cursor FOR loop opens and closes c1 itself, so no explicit OPEN or CLOSE is required.)
Cursor Variables
Like a cursor, a cursor variable points to the current row in the result set of a multi-row query. But, unlike a cursor, a cursor variable can be opened for any type-compatible query; it is not tied to a specific query. Cursor variables are true PL/SQL variables, to which you can assign new values and which you can pass to subprograms stored in an Oracle database. This gives you more flexibility and a convenient way to centralize data retrieval.
Typically, you open a cursor variable by passing it to a stored procedure that declares a cursor variable as one of its formal parameters. To execute a multi-row query, Oracle opens an unnamed work area that stores processing information. You can access this area through an explicit cursor, which names the work area, or through a cursor variable, which points to the work area.
To create cursor variables, you define a REF CURSOR type, then declare cursor variables of that type:
ref_cursor_type_definition ::=
TYPE type_name IS REF CURSOR
[RETURN { {db_table_name | cursor_name | cursor_variable_name}%ROWTYPE
        | record_name%TYPE
        | record_type_name
        | ref_cursor_type_name }];
ref_cursor_variable_declaration ::=
cursor_variable_name type_name;
Keyword and Parameter Description
cursor_name: An explicit cursor previously declared within the current scope.
cursor_variable_name: A PL/SQL cursor variable previously declared within the current scope.
db_table_name: A database table or view, which must be accessible when the declaration is elaborated.
record_name: A user-defined record previously declared within the current scope.
record_type_name: A user-defined record type that was defined using the datatype specifier RECORD.
REF CURSOR: Cursor variables all have the datatype REF CURSOR.
RETURN: Specifies the datatype of the cursor variable's return value. You can use the %ROWTYPE attribute in the RETURN clause to provide a record type that represents a row in a database table, or a row from a cursor or strongly typed cursor variable. You can use the %TYPE attribute to provide the datatype of a previously declared record.
Types of Cursor Variables
REF CURSOR types can be strong (with a return type) or weak (with no return type). Strong REF CURSOR types are less error-prone, because the PL/SQL compiler lets you associate a strongly typed cursor variable only with queries that return the right set of columns. Weak REF CURSOR types are more flexible, because the compiler lets you associate a weakly typed cursor variable with any query. Because there is no type checking with a weak REF CURSOR, all such types are interchangeable. Instead of creating a new type, you can use the predefined type SYS_REFCURSOR.
The following procedure opens the cursor variable generic_cv for the chosen query (it assumes a weak REF CURSOR type declared earlier, for example TYPE GenericCurTyp IS REF CURSOR;):

PROCEDURE open_cv (generic_cv IN OUT GenericCurTyp, choice NUMBER) IS
BEGIN
   IF choice = 1 THEN
      OPEN generic_cv FOR SELECT * FROM emp;
   ELSIF choice = 2 THEN
      OPEN generic_cv FOR SELECT * FROM dept;
   ELSIF choice = 3 THEN
      OPEN generic_cv FOR SELECT * FROM salgrade;
   END IF;
   ...
END;
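A self-contained sketch of opening and fetching through a cursor variable, using the predefined SYS_REFCURSOR type mentioned above (the emp table from the earlier examples is assumed):

DECLARE
   emp_cv  SYS_REFCURSOR;   -- predefined weak REF CURSOR type
   emp_rec emp%ROWTYPE;
BEGIN
   OPEN emp_cv FOR SELECT * FROM emp;   -- the cursor variable now points to this query's result set
   LOOP
      FETCH emp_cv INTO emp_rec;
      EXIT WHEN emp_cv%NOTFOUND;
      dbms_output.put_line(emp_rec.ename);
   END LOOP;
   CLOSE emp_cv;
END;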
Query Evaluation
THE SYSTEM CATALOG
A relational DBMS maintains information about every table and index that it contains. This descriptive information is itself stored in a collection of special tables called the catalog tables. The catalog tables are also called the data dictionary, the system catalog, or simply the catalog.
Information in the Catalog
The system catalog stores system-wide information, such as the size of the buffer pool and the page size, and the following information about individual tables, indexes, and views:
 For each table:
- Its table name, the file name (or some identifier), and the file structure (e.g. heap file) of the file in which it is stored.
- The attribute name and type of each of its attributes.
- The index name of each index on the table.
- The integrity constraints (e.g. primary key and foreign key constraints) on the table.
 For each index:
- The index name and the structure (e.g. B+ tree) of the index.
- The search key attributes.
 For each view:
- Its view name and definition.
The following statistical information is also commonly stored:
 Cardinality: The number of tuples NTuples(R) for each table R.
 Size: The number of pages NPages(R) for each table R.
 Index Cardinality: The number of distinct key values NKeys(I) for each index I.
 Index Size: The number of pages INPages(I) for each index I. (For a B+ tree index I, we take INPages to be the number of leaf pages.)
 Index Height: The number of nonleaf levels IHeight(I) for each tree index I.
 Index Range: The minimum present key value ILow(I) and the maximum present key value IHigh(I) for each index I.
The catalogs also contain information about users, such as accounting information and authorization information (e.g. Joe User can modify the Reserves table but only read the Sailors table).
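In Oracle, comparable statistics are exposed through data dictionary views; a sketch (assuming optimizer statistics have been gathered on the user's tables):

SELECT table_name, num_rows, blocks FROM user_tables;
SELECT index_name, distinct_keys, leaf_blocks, blevel FROM user_indexes;

Here num_rows and blocks correspond to cardinality and size, while distinct_keys, leaf_blocks and blevel correspond to index cardinality, index size and index height.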
EVALUATION OF RELATIONAL OPERATORS
1. A relational database consists of a collection of tables, each of which is assigned a unique name. A row in a table represents a relationship among a set of values.
2. As a table is a collection of such relationships, there is a close correspondence between the concept of a table and the mathematical concept of a relation.
3. A query language is a language in which a user requests information from the database. Query languages can be categorized as procedural or non-procedural.
4. In a procedural language, the user instructs the system to perform a sequence of operations on the database to compute the desired result.
5. In a non-procedural language, the user describes the desired information without giving a specific procedure for obtaining that information.
6. Relational algebra is a procedural language whose fundamental operations are select, project, union, set difference and Cartesian product.
7. In addition to the fundamental operations there are several other operations, such as set intersection, natural join, division and assignment.
I. SELECT OPERATOR: The select operation selects the tuples (rows) that satisfy a given predicate. The lowercase Greek letter σ is used to denote selection; the argument relation is given in parentheses after σ.
Examples:
1. Select those tuples of the loan relation where the branch is Nerul:
σ branchname="Nerul"(loan)
2. Select all tuples in which the amount of the loan is more than 1200:
σ amount>1200(loan)
II. PROJECT OPERATOR: This operator is used to select only the required attributes (columns). Projection is denoted by the Greek letter Π.
Example:
1. List all loan numbers and loan amounts:
Π loan_no,amount(loan)
III. SET DIFFERENCE OPERATOR: The set difference operator, denoted by "−", allows us to find tuples that are in one relation but not in another.
Example:
1. List all customers of the bank who have an account but not a loan:
Π customer_name(depositor) − Π customer_name(loan)
IV. SET INTERSECTION OPERATOR: Denoted by "∩", it allows us to find tuples that are in both relations.
Example:
1. List all customers of the bank who have both an account and a loan:
Π customer_name(depositor) ∩ Π customer_name(loan)
V. NATURAL JOIN OPERATOR: This operator allows us to combine certain selections and a Cartesian product into one operation. It is denoted by the ⋈ symbol. The natural join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relation schemas, and finally removes duplicate attributes.
Examples:
1. Find all customers who have both a loan and an account at the bank:
Π customer_name(depositor ⋈ loan)
2. Find the names of all branches with customers who have an account in the bank and live in Vashi city:
Π branch_name(σ customer_city="Vashi"(customer ⋈ account ⋈ depositor))
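For comparison, the selection, projection and set-difference examples above can be written in SQL roughly as follows (column names are taken from the relations used above; Oracle's MINUS plays the role of set difference):

SELECT * FROM loan WHERE branchname = 'Nerul';
SELECT loan_no, amount FROM loan;
SELECT customer_name FROM depositor
MINUS
SELECT customer_name FROM loan;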
INTRODUCTION TO QUERY OPTIMIZATION
Query optimization is the process of selecting the most efficient query evaluation plan from among the many plans possible for processing a given query, especially if the query is complex. Users are not expected to write their queries so that they can be processed efficiently; rather, the system is expected to construct a query evaluation plan that minimizes the cost of query evaluation. This is where query optimization comes into play.
One aspect of optimization occurs at the relational algebra level, where the system finds an expression that is equivalent to the given expression but more efficient to execute. Another aspect is selecting a detailed strategy for processing the query, such as choosing the algorithm to use for executing an operation, choosing the specific indices to use, and so on.
It is the job of the query optimizer to come up with a query evaluation plan that computes the same result as the given expression and is the least costly way of generating that result. To find the least costly plan, the query optimizer needs to generate alternative plans that produce the same result as the given expression and choose the least costly one. Query evaluation plan generation involves the following three steps:
i. Generating expressions that are logically equivalent to the given expression.
ii. Annotating the resultant expressions in alternative ways to generate alternative query evaluation plans.
iii. Estimating the cost of each evaluation plan and choosing the least costly one.
Steps i and ii are interleaved in the query optimizer: some expressions are generated and annotated, then some more, and so on. Step iii relies on statistical information about the relations, such as relation sizes and index depths, collected in the background to make a good estimate of the cost of a plan.
Query optimization is one of the most important tasks of a relational DBMS. One of the strengths of relational query languages is the wide variety of ways in which a user can express, and thus the system can evaluate, a query. Although this flexibility makes it easy to write queries, good performance relies greatly on the quality of the query optimizer: a given query can be evaluated in many ways, and the difference in cost between the best and worst plans may be several orders of magnitude. Realistically, we cannot expect to always find the best plan, but we expect to consistently find a plan that is quite good.
Queries are parsed and then presented to a query optimizer, which is responsible for identifying an efficient execution plan. The optimizer generates alternative plans and chooses the plan with the least estimated cost.
(Figure: Query Parsing, Optimization, and Execution.)
The space of plans considered by a typical relational query optimizer can be understood by recognizing that a query is essentially treated as a σ−π−× algebra expression, with the remaining operations (if any, in a given query) carried out on the result of the σ−π−× expression. Optimizing such a relational algebra expression involves two basic steps:
 Enumerating alternative plans for evaluating the expression. Typically, an optimizer considers only a subset of all possible plans, because the number of possible plans is very large.
 Estimating the cost of each enumerated plan and choosing the plan with the lowest estimated cost.
Commercial optimizers: current relational DBMS optimizers are very complex pieces of software, with many closely guarded details, and they typically represent 40 to 50 man-years of development effort.
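In Oracle, the plan the optimizer actually chooses for a query can be inspected with EXPLAIN PLAN; a sketch (the query itself is illustrative):

EXPLAIN PLAN FOR SELECT customer_name FROM depositor WHERE branch_name = 'Nerul';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);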
Q.] Explain decomposition?
If a relation is not in the desired normal form and we wish the relation to be normalized so that some of the anomalies (like insert, update or delete anomalies) can be eliminated, it is necessary to decompose the relation into two or more relations.
The process of decomposition of a relation R into a set of relations R1, R2, ..., Rn is based on identifying attributes and using them as the basis of decomposition, such that R = R1 ∪ R2 ∪ ... ∪ Rn. This is a process of dividing one table into multiple tables using the projection operator. We may decompose tables into vertical segments: vertical fragmentation is done with the help of the projection operator. By taking projections of the original table we can create multiple vertically fragmented tables, as shown below.
Original table:
Eno  Ename   Class
1    Mahesh  BE
2    Yogesh  SE
3    Amit    TE
Vertically decomposed tables:
Eno  Ename
1    Mahesh
2    Yogesh
3    Amit

Eno  Class
1    BE
2    SE
3    TE
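The same vertical decomposition can be performed in SQL using projections; a sketch (the source table name student is an assumption):

CREATE TABLE student_name  AS SELECT eno, ename FROM student;
CREATE TABLE student_class AS SELECT eno, class FROM student;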
Q.] Explain the desirable properties of decomposition:
The main properties of decomposition are as listed below:
a. Lossless-join decomposition
b. Dependency preservation
c. Lack of redundancy (repetition of information)
1) Lossless-join decomposition
It is clear that a decomposition must be lossless, so that we do not lose any information from the relation that is decomposed. A lossless-join decomposition ensures that we can never get a situation where spurious (false) tuples are generated in the relation: for every value of the join attributes there should be a unique tuple in one of the relations. To decompose the following table losslessly, we proceed through the following steps.
Dept_Id  Dname        Stud_Id  Sname      Location
10       Development  1        Sushant    Mahim
20       Teaching     2        Snehal     Vashi
30       HR           3        Pratiksha  Worli
20       Teaching     4        Supraja    Dadar
a. Let R1 and R2 form a decomposition of relation R.
b. Decompose the relation schema Department-Student into:
Department-schema = (Dept_Id, Dname)
Student-schema = (Stud_Id, Dept_Id, Sname, Location)
c. The attributes in common must be a key for one of the relations for the decomposition to be lossless: R1 ∩ R2 ≠ Φ, i.e. the set of common attributes must not be empty.
Note: After a lossless decomposition, you join the decomposed tables on a primary key and a foreign key. For the above table, we have the following decomposed tables:
Student:
Stud_Id  Dept_Id  Sname      Location
1        10       Sushant    Mahim
2        20       Snehal     Vashi
3        30       Pratiksha  Worli
4        20       Supraja    Dadar
Department:
Dept_Id  Dname
10       Development
20       Teaching
30       HR
2) Dependency preservation
Dependency preservation is another important requirement, since a dependency is a very important constraint on the database. No database update should result in an illegal relation being created; hence our design should allow us to check updates without computing natural joins. If X → Y holds, then we know that the two (sets of) attributes are closely related (functionally dependent), and it would be useful if both attributes remain in the same relation so that the dependency can be checked easily. This is done by preserving the functional dependencies within the decomposed relations.
Example:
Student Schema = (Stud_Id, Stud_Name, Dept_Id, Dname, Location, Subj_Id, SubjName)
Student-Department Schema = (Stud_Id → Dept_Id, Dname, Location)
Student-Subject Schema = (Stud_Id → Subj_Id, SubjName)
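Because Dept_Id is common to both decomposed tables (a primary key in Department and a foreign key in Student), the original Department-Student relation can be recovered with a join; a sketch:

SELECT d.dept_id, d.dname, s.stud_id, s.sname, s.location
FROM department d, student s
WHERE d.dept_id = s.dept_id;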
3) Lack of redundancy (repetition of information)
The decomposition that we have done should not suffer from any repetition-of-information problem. For example, if STUDENT and SECTION data are separated into distinct relations, we do not have to repeat the STUDENT data for each SECTION; and if a single SECTION is made up of several STUDENTS, we do not have to repeat the SECTION data for each STUDENT. It is desirable not to have any redundancy in the database. This property may be achieved by the normalization process.
Q.] What is a lossy-join decomposition?
We decomposed a relation intuitively, but we still need a better basis for deciding decompositions, since intuition may not always be correct. A careless decomposition may lead to problems such as loss of information.
Department-Student Schema = (Dept_Id, Dname, Stud_Id, Sname, Location)
Dept_Id  Dname        Stud_Id  Sname      Location
10       Development  1        Sushant    Mahim
20       Teaching     2        Snehal     Vashi
30       HR           3        Pratiksha  Worli
20       Teaching     4        Supraja    Dadar
Suppose we decompose the above relation into two relations, Department and Student, as follows:
Schema 1: Department Schema containing (Dept_Id, Dname)
Dept_Id  Dname
10       Development
20       Teaching
30       HR
Schema 2: Student Schema containing (Stud_Id, Sname, Location)
Stud_Id  Sname      Location
1        Sushant    Mahim
2        Snehal     Vashi
3        Pratiksha  Worli
4        Supraja    Dadar
All the information that was in the relation 'Department-Student' appears to be still available in the decomposed tables, but this is not so. Suppose we need to join the 'Department' and 'Student' tables: since R1 ∩ R2 = Φ, there is no column common between them, and therefore the join is not possible. This decomposition is a lossy-join decomposition. A lossless decomposition is one that guarantees that the join will result in exactly the same relation as was decomposed.
Q.] Explain Functional Dependency:
The concept of functional dependency, given by E. F. Codd, underlies the normalization process and is used to define the various normal forms.
A functional dependency is a type of constraint that exists between attributes of a relation. A functional dependency (FD), denoted by X → Y, determines the values of one set of attributes based on another: it can be read as "Y is functionally dependent on X" or "X determines Y". For example: Ecode → Ename.
An FD X → Y essentially says that if two tuples agree on the values of the attributes in X, they must also agree on the values of the attributes in Y. Functional dependency thus provides a formal mechanism to express constraints between the various attributes of a relation.
For example, in the Employee table below, each employee ID is associated with exactly one name, so the Name column is functionally dependent on the ID column (ID → Name).
Employee Table:
ID      Name
Mah001  Mahesh
...     ...
Q.] Explain the Types of Functional Dependencies
1. Full Functional Dependency: A functional dependency A → B is a full functional dependency if the removal of any attribute from A means that the dependency does not hold any more.
Example:
{Emp_no, Project_no} → Hours, while neither Emp_no → Hours nor Project_no → Hours holds on its own.
In the above example, Hours is fully functionally dependent on the combination of Emp_no and Project_no: the number of hours spent on the project by a particular employee cannot be determined by the project number (Project_no) alone; it needs the employee number (Emp_no) as well.
2. Partial Functional Dependency: A partial dependency means that a non-key column depends on only some of the columns in the composite primary key of a table. An FD A → B is a partial dependency if there is some attribute X ⊂ A that can be removed from A and the dependency will still hold.
Example:
{Emp_no, Project_no} → Ename, that is, Emp_no → Ename
In the above example, Ename is partially dependent on {Emp_no, Project_no}, since the employee name (Ename) can be determined using the employee number (Emp_no) alone, even if Project_no is removed from the determinant.
Note: For a table to be in 2nd Normal Form, there should be no partial dependencies.
Q.] Explain Transitive Dependency
This concept is used when there is redundancy in the database. If changing a non-key column (a column other than a key column) causes a change in another non-key column, you may have a transitive dependency: when one non-key attribute is functionally dependent on another non-key attribute, such a dependency is called a transitive dependency.
Non-Key Attribute → Non-Key Attribute
An FD X → Y in a relation R is a transitive dependency if there is a set of attributes Z that is not a subset of any key of R, and both X → Z and Z → Y hold true.
For example, in EMP_DEPT (Eno, Ename, Dnumber, DeptMgrNo):
Eno → DeptMgrNo is transitive.
The dependency of DeptMgrNo on the key attribute Eno is transitive, as DeptMgrNo depends on Dnumber and Dnumber itself is dependent on Eno:
Eno → Dnumber and Dnumber → DeptMgrNo
Q.] Armstrong's Axioms - Closures of Functional Dependency
Given that A, B and C are sets of attributes in a relation R, one can derive several properties of functional dependencies. Axioms are nothing but rules of inference, which provide a simple technique for reasoning about functional dependencies.
1) Primary Rules
a. Subset Property (Axiom of Reflexivity): If Y is a subset of X, then X → Y.
b. Augmentation (Axiom of Augmentation): If X → Y, then XZ → YZ.
c. Transitivity (Axiom of Transitivity): If X → Y and Y → Z, then X → Z.
2) Secondary Rules (based on the above rules)
a. Union: If X → Y and X → Z, then X → YZ.
b. Decomposition: If X → YZ, then X → Y and X → Z.
c. Pseudo-transitivity: If X → Y and YZ → W, then XZ → W.
Example: Consider the relation R = (A, B, C, D, E, F) having the set of FDs:
A → B, A → C, BC → D, B → E, BC → F
Calculate the following members of the closure:
1) A → E   2) BC → DF   3) AC → D   4) AC → DF
Solution:
1. A → E: As A → B and B → E, using the transitivity rule, A → E.
2. BC → DF: As BC → D ...(i) and BC → F ...(ii), using the union rule on (i) and (ii), BC → DF.
3. AC → D: As A → B ...(i) and BC → D ...(ii), using pseudo-transitivity, AC → D.
4. AC → DF: As A → B and BC → F, pseudo-transitivity gives AC → F. From solution (3), AC → D. Therefore, by the union rule, AC → DF.
Normalization
Normalization is a step-by-step decomposition of complex records into simple records. Normalization results in tables that satisfy certain constraints and are represented in a simple manner. This process is also called canonical synthesis. It is a relational database design process that avoids data redundancy by applying constraints on the data so as to avoid various data anomalies.
For example, if the same information is repeated in multiple tables of a database, there is a chance that these tables will become inconsistent when data is updated, inserted or deleted, which may lead to problems of data integrity. A normalized table is less vulnerable to such data anomalies.
Normalization is thus a process of designing a consistent database by minimizing redundancy and ensuring data integrity through decomposition which is lossless.
Q.] Goals / Importance of Database Normalization
1) Ensures Data Integrity
Data integrity ensures the correctness of data stored within the database. It is achieved by imposing integrity constraints. An integrity constraint is a rule which restricts the values present in the database. There are three kinds of integrity constraints:
(i) Entity constraints: The entity integrity rule states that the value of a primary key can never be a null value. Because a primary key is used to identify a unique row in a relational table, its value must always be specified and should never be unknown. The integrity rule requires that insert, update and delete operations maintain the uniqueness and existence of all primary keys.
(ii) Domain constraints: Only permissible values of an attribute are allowed in a relation.
(iii) Referential integrity constraints: The referential integrity rule states that if a relational table has a foreign key, then every value of the foreign key must either be null or match a value in the relational table (the referenced table) in which that foreign key is a primary key.
2) Prevents Redundancy in Data
A non-normalized database is vulnerable to data anomalies if it stores data redundantly. If data is stored in two locations but updated in only one of them, the data becomes inconsistent. A normalized database stores non-primary-key data in only one location.
3) Avoids Data Anomalies
A non-normalized table can suffer from logical inconsistencies of various types and from data anomalies. A relational database table should be designed in such a way that it avoids all data anomalies.
Q.] Which are the different anomalies related to normalization?
1) Update anomaly: The same information can be present in multiple records of various relations; an update applied to only one table may result in logical inconsistencies. For example, each record in an "Emp_Salary" table might contain an Emp_ID, Ename, Address and Salary. A change of address for a particular employee will then potentially need to be applied to multiple tables, such as the Employee table. If all the records are not updated, some tables may be left in an inconsistent state.
2) Insertion anomaly: There is a possibility that certain facts cannot be recorded at all, or are not yet recorded. For example, consider a table Faculty (Faculty_ID, FName, Subject_Code, Subject, Class). We can add the details of any faculty member who teaches a certain subject in a certain class, but we cannot record the details of a new faculty member who has not yet been assigned any subject or class; the subject and class columns would have to be left empty initially.
3) Deletion anomaly: If data is deleted from one table, all relevant data in other related tables must also be deleted; otherwise a redundancy problem is created. When the deletion of some data from a relation necessitates the deletion of some unrelated data as well, this is called a deletion anomaly. For example, the Faculty table above suffers from this type of anomaly: if a faculty member temporarily ceases to be assigned a subject, we must delete the entire record in which that faculty member appears.
Normal Forms
Normal forms are designed to logically address potential problems, such as inconsistencies and redundancy, in the information stored in the database. A database is said to be in one of the normal forms if it satisfies the rules required by that form as well as the previous forms; it then does not suffer from any of the problems addressed by those forms.
Q.] State and explain the Types of Normal Forms
a. First normal form (1NF)
b. Second normal form (2NF)
c. Third normal form (3NF)
d. Boyce-Codd normal form (BCNF)
e. Fourth normal form (4NF)
f. Fifth normal form (5NF)
1) First Normal Form
The simplest form of normalization; it simplifies each attribute in a relation. This normal form was given by E. F. Codd (1970), with a later version by C. J. Date (2003).
A relation is in 1NF if every row contains exactly one value for each attribute. 1NF states that every attribute included in the relation must have an atomic (simple, indivisible) value, and every attribute in a tuple must have a single value from the domain of that attribute.
In short, the rules of 1NF are:
 Table columns should contain atomic data.
 There should not be any repeating groups of data.
Example:
Consider a table 'Faculty' which holds information about faculty members, the subjects they teach, and the number of hours allotted to each subject.
Faculty:
Faculty code  Faculty Name  Date of Birth  Subject  Hours
100           Yogesh        17/07/64       DSA      16
                                           SS       8
                                           IS       12
101           Amit          24/12/72       MIS      16
                                           PM       8
                                           IS       12
102           Omprakash     03/02/80       PWRC     8
                                           PCOM     8
                                           IP       16
103           Nitin         28/11/66       DT       10
                                           PCOM     8
                                           SS       8
104           Mahesh        01/01/86       DT       10
                                           ADBMS    8
                                           PWRC     8
The above table does not have atomic values in the 'Subject' column; hence it is called an un-normalized table. Inserting, updating and deleting would be a problem in such a table, so it has to be normalized. For the above table to be in first normal form, each row should have atomic values. A 'Sr. No.' column is included in the table to uniquely identify each row.
1NF Table:
Sr. No.  Faculty code  Faculty Name  Date of Birth  Subject  Hours
1        100           Yogesh        17/07/64       DSA      16
2        100           Yogesh        17/07/64       SS       8
3        100           Yogesh        17/07/64       IS       12
4        101           Amit          24/12/72       MIS      16
5        101           Amit          24/12/72       PM       8
6        101           Amit          24/12/72       IS       12
7        102           Omprakash     03/02/80       PWRC     8
8        102           Omprakash     03/02/80       PCOM     8
9        102           Omprakash     03/02/80       IP       16
10       103           Nitin         28/11/66       DT       10
11       103           Nitin         28/11/66       PCOM     8
12       103           Nitin         28/11/66       SS       8
13       104           Mahesh        01/01/86       DT       10
14       104           Mahesh        01/01/86       ADBMS    8
15       104           Mahesh        01/01/86       PWRC     8
This table shows the same data as the previous table, but we have eliminated the repeating groups. Hence the table is now said to be in First Normal Form (1NF). However, we have now introduced redundancy into the table. This can be eliminated using Second Normal Form (2NF).
2) Second Normal Form
This normal form makes use of functional dependency and tries to remove the problem of redundant data that was introduced by 1NF. Before applying 2NF, a relation therefore needs to satisfy the 1NF conditions.
A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the whole primary key of the relation, and not on just a part of it.
In short, 2NF means:
 It should be in 1NF.
 There should not be any partial dependency.
To bring a relation into 2NF:
a. Find and remove attributes that are related to only a part of the key, or not related to the key at all.
b. Group the removed attributes in another table.
c. Assign the new table a key that consists of that part of the old composite key.
d. If a relation is not in 2NF, it can be further normalized into a number of 2NF relations.
Let us consider the 1NF table we obtained after first normalization (shown above). While eliminating the repeating groups, we introduced redundancy into the table: the Faculty code, Name and Date of Birth are repeated, since the same faculty member is multi-skilled. To eliminate this, let us split the table into two parts: one with the non-repeating groups and the other with the repeating groups.
Faculty:
Faculty code  Faculty Name  Date of Birth
100           Yogesh        17/07/64
101           Amit          24/12/72
102           Omprakash     03/02/80
103           Nitin         28/11/66
104           Mahesh        01/01/86
Subject:
Sr. No  Faculty code  Subject  Hours
1       100           DSA      16
2       100           SS       8
3       100           IS       12
4       101           MIS      16
5       101           PM       8
6       101           IS       12
7       102           PWRC     8
8       102           PCOM     8
9       102           IP       16
10      103           DT       10
11      103           PCOM     8
12      103           SS       8
13      104           DT       10
14      104           ADBMS    8
15      104           PWRC     8
Faculty code is the only key needed to identify the faculty name and the date of birth. Hence, Faculty code is the primary key in the first table and a foreign key in the second table. Faculty code is repeated in the Subject table, so we have to take the 'Sr. No' into account to form a composite key for the Subject table: Sr. No and Faculty code together can uniquely identify each row in this table. Hence, the relation is now in Second Normal Form.
3) Third Normal Form
This normal form is used to minimize transitive redundancy. In order to remove the anomalies that remain in Second Normal Form, and to remove transitive dependencies, if any, we have to perform the third normalization.
A relation is in 3NF if it is in 2NF and no non-key attribute of the relation is transitively dependent on the primary key; 3NF prohibits transitive dependencies.
In short, 3NF means:
1. It should be in 2NF.
2. There should not be any transitive dependency.
Example:
Now let us see how to normalize the 'Subject' table obtained after 2NF.
In the 'Subject' table, Hours depends on the Subject, and Subject depends on the Faculty code and Sr. No; but Hours is dependent on neither the Faculty code nor the Sr. No directly. Hence, there exists a transitive dependency between Sr. No, Subject and Hours. If a faculty code is deleted then, due to this transitive dependency, the information regarding the subject and the hours allotted to it will be lost. For a table to be in 3rd Normal Form, transitive dependencies must be eliminated, so we need to decompose the table further to normalize it.
Fac_Sub:
Sr. No  Faculty code  Subject
1       100           DSA
2       100           SS
3       100           IS
4       101           MIS
5       101           PM
6       101           IS
7       102           PWRC
8       102           PCOM
9       102           IP
10      103           DT
11      103           PCOM
12      103           SS
13      104           DT
14      104           ADBMS
15      104           PWRC
Sub_Hrs:
Subject  Hours
DSA      16
SS       8
IS       12
MIS      16
PM       8
PWRC     8
PCOM     8
IP       16
DT       10
ADBMS    8
After decomposing the 'Subject' table, we now have the 'Fac_Sub' and 'Sub_Hrs' tables respectively.
Note: In most cases, third normal form is a sufficient level of decomposition. But some cases require the design to be normalized further, up to the level of 4NF or even 5NF.
4) BCNF Normal Form
BCNF is a more precise form of 3NF. The intention of Boyce-Codd Normal Form (BCNF) is to handle the case, not handled satisfactorily by 3NF, of a relation possessing two or more composite or overlapping candidate keys. A candidate key is a column (or combination of columns) in a table which has the ability to become the primary key. A determinant is any attribute (simple or composite) on which some other attribute is fully functionally dependent: if a → b, then attribute 'a' is a determinant.
A relation R is said to be in BCNF if and only if every determinant is a candidate key.
For example, soldiers are part of one or more units, and each unit is under the control of an officer:
SOLDIERID  OFFICERID  UNITID
1          A          1
2          A          1
3          B          2
First, we identify the dependencies. There is a dependency between (SOLDIERID + OFFICERID) and UNITID, since a soldier and an officer together imply their respective unit; but there is also a dependency between UNITID and OFFICERID:
SOLDIERID → UNITID
UNITID → OFFICERID
SOLDIERID, OFFICERID → UNITID
This last dependency, however, is neither partial (a dependency on part of a prime attribute) nor transitive (a dependency of a non-prime attribute on another non-prime attribute). What we have is a table in which a determinant (UNITID) is not a candidate key. We can convert the above to BCNF by realizing that a better composite key is one of SOLDIERID and UNITID, which exposes the dependency between UNITID and OFFICERID as a partial dependency. This is then resolved by dividing the table, the solution being as follows:
Candidate key (SOLDIERID), with SOLDIERID → UNITID:
SOLDIERID  UNITID
1          1
2          1
3          2
Candidate key (UNITID), with UNITID → OFFICERID:
UNITID  OFFICERID
1       A
2       B
The above tables are now in BCNF.
Q.] Explain Multivalued Dependency and 4th NF
A multivalued dependency is defined as a relationship which follows the cross-product pattern. A multivalued dependency, written X →→ Y, is said to hold for a relation R(X, Y, Z) if, for a given set of values of X, there is an associated set of values of attribute Y, and this set of Y values depends only on the X values and has no dependence on the set of attributes Z.
Multivalued dependencies occur when the presence of one or more rows in a table implies the presence of one or more other rows in that same table. For example, imagine a car company that manufactures many models of car but always makes both red and blue colours of each model. If you have a table that contains the model name, colour and year of each car the company manufactures, there is a multivalued dependency in that table: if there is a row for a certain model name and year in blue, there must also be a similar row corresponding to the red version of that same car.
5) 4th Normal Form
This normal form was given by Ronald Fagin (1977). Fourth Normal Form tries to remove multivalued dependencies among attributes. A relation is said to be in fourth normal form if each table contains no more than one multivalued dependency per key attribute.
A Boyce-Codd normal form relation is in fourth normal form if:
 there is no multivalued dependency in the relation, or
 there are multivalued dependencies, but the attributes which are multivalued dependent on a specific attribute are dependent between themselves, or
 the relation scheme is in BCNF and at least one of its keys consists of a single attribute.
Example:
Seminar  Faculty  Topic
DBP-1    Brown    Database Principles
DAT-2    Brown    Database Advanced Techniques
DBP-1    Brown    Data Modeling Techniques
DBP-1    Robert   Database Principles
DBP-1    Robert   Data Modeling Techniques
DAT-2    Maria    Database Advanced Techniques
In the above example, the same topic is taught in a seminar by more than one faculty member, and each faculty member takes up different topics in the same seminar; hence topic names are repeated several times. This is an example of a multivalued dependency. To eliminate the multivalued dependency, split the table so that no multivalued dependency remains:
Seminar  Topic
DBP-1    Database Principles
DAT-2    Database Advanced Techniques
DBP-1    Data Modeling Techniques

Seminar  Faculty
DBP-1    Brown
DAT-2    Brown
DBP-1    Robert
DAT-2    Maria
Q.] Explain Join Dependency and 5th NF
A table T is subject to a join dependency if T can always be recreated by joining multiple tables, each having a subset of the attributes of T. If one of the tables in the join has all the attributes of the table T, the join dependency is called trivial. The join dependency plays an important role in 5NF normalization, which is also known as project-join normal form.
6) 5th Normal Form
This normal form was given by Ronald Fagin (1979). It decomposes relations further in order to reduce redundancy. A relation is said to be in 5NF if and only if it is in 4NF and every join dependency in it is implied by the candidate keys. Fifth normal form deals with cases where information can be reconstructed from smaller pieces of information that can be maintained with less redundancy; it mainly emphasizes lossless decomposition.
Because every join dependency in a 5NF relation is implied by the keys of the relation, relations that have been decomposed in the previous normal forms can be recombined via natural joins to recreate the original relation. If a relation is in 3NF and each of its keys consists of a single attribute, it is also in 5NF.