Oracle and DB2, Comparison and Compatibility/Process Model

From Wikibooks, open books for an open world
< Oracle and DB2, Comparison and Compatibility
Jump to: navigation, search

Overview[edit]

The Wikibook section on Architecture covers the various logical and physical storage structures used by Oracle and DB2. This section covers how these DBMS’s perform common database operations on these structures. There are a number of concepts fundamental to the operation of a database, the most important being the concept of a transaction.

Transaction.

A transaction is a logical unit of work in a DBMS. It is the combination of the operations the database performs on a set of data (such as Insert, Select, Update and Delete), and the data itself. The purpose of a transaction is to provide isolation between all the operations on the database – the transaction basically says ‘I’m working on this set of data, and until I’m finished, I will control access to it.’ This also gives the DBMS a mechanism to maintain consistency in the database – all the operations in a transaction either work together or fail together, and at the end, the database is left in a known consistent state. Isolation and Consistency are two of the properties that define a database transaction. The others are Atomicity and Durability – and these concepts are collectively known as ACID.

ACID

All databases, independent of vendor, typically have to handle large numbers of concurrent users, and complex operations that may consist of a number of discrete steps – each of which may succeed or fail. Whether an operation is a single step, or a number of steps it is described by a logical unit called a transaction. This requirement is so common that it has been formulated as a set of properties by which a DBMS (or indeed any transactional system) can be evaluated. Once you understand how a database implements these properties, you can see how they guarantee transaction reliability. It is important to stress that although transactions need to be processed reliably, this does not mean that each DBMS (even databases from the same vendor) handle these transactions exactly the same. What reliability means is that once you know how the database handles a transaction you can be sure that it always does it this way.

The acronym ACID stands for Atomicity, Consistency, Isolation and Durability, and these properties are used to evaluate that a DBMS is processing transactions reliably. They were defined by Jim Gray in the 1970s and in 1983, Andreas Reuter and Theo Haerder coined the acronym ACID to describe them. By definition, database transactions must be ACID.

Atomicity.

There are any number of reasons that a transaction, or steps in a transaction might fail. The property of atomicity states that a transaction cannot be subdivided, and that no matter how many steps there may be in the transaction, they all work or they all fail.

Consistency.

At the end of a transaction, the database needs to be in a consistent state. One example of this is if a field is defined as a whole number, consistency could be maintained in this field by either rejecting decimal numbers or rounding them to a whole number – both leave the field in a consistent state. Another example is inter-field dependencies, where one field in a row depends on a field in another row. This is called referential integrity, and there are a number of options to maintain consistency in the database when one of these fields is changed. If, for example, you try to delete a row referenced by another row, you could either stop the transaction and roll it back (atomicity), delete the row and all other rows that reference it, or set the field in a row that refers to the row you are going to delete to a predetermined value (i.e. NULL).

Isolation.

All but the most primitive of computer operations involve many steps, and in multi user systems it is important to make sure that operations being carried out by one user on a set of data do not have that data changed underneath them by another user. What we do in this instance is throw a logical boundary around all the steps that we need to do to complete an operation and call this a transaction. Isolation is the mechanism by which we protect the data involved in this operation. At the simplest level we can just prohibit access to this data for the duration of the transaction. Problems can arise with this strategy through such situations as deadlock (processes a holding resources that process b needs, and process b holding data that process a needs – the so-called deadly embrace), and that simply locking data can degrade concurrency – locking out users defeats the purpose of having multi-user systems. Isolation is generally the most relaxed of the ACID properties, and there are a number of strategies for handling this.

Durable.

A database cannot lose any data, including transactions, and this property is known as durability. Oracle and DB2 implement durability by writing transactions to log files, and taking special steps to ensure that these log files are protected (to the extent of writing transactions simultaneously to different log files on different disks, and then backing these log files up in permanent offline storage).


Summary.

Databases absolutely cannot corrupt or lose data, and the ACID properties outlined above are implemented by such techniques as Locking, Logging, and Transaction Control (each covered in the following sections.) Another requirement is that they go fast.

Before you ever write a line of code against your data, the DBMS has implemented each of these requirements, and in many cases the implementation can be different even among databases from the same vendor. The following sections cover these operational aspects of Oracle and DB2