Oracle and DB2, Comparison and Compatibility/Storage Model/Physical Storage/DB2
In DB2, the hierarchy of storage units (from least inclusive to most inclusive) is Page, Extent, Container, Tablespace, Database.
In DB2, data is stored in a structure called a Page, it is the equivalent of an Oracle Block. The basic page structure in DB2 is that there is a fixed length page header, a variable length page trailer, and the space in between is used for data or free space. The information stored in a page can be data records (rows or indexes) or system information such as a table space spacemap page, which keeps track of the free space in data pages in a table space. Index data and Row data cannot be stored in the same page, and row data will not be split across pages (unlike Oracle that will mix index data and row data in the same block, and will break rows into ‘row pieces’ that may span blocks).
Database page size can vary and is determined by the buffer pool size. Pages sizes can be 4, 8, 16 and 32KB. It is possible to set up a database instance that has mixed page sizes in it – 4KB for system pages, and 16KB for user data page sizes. While this gives flexibility in tuning page sizes for different application workloads (OLTP applications with small fast transactions will benefit from smaller page sizes, wheras OLAP applications doing analytical queries across large data volumes will benefit from a larger page size), there is an administrative overhead for the DBA in maintaining this mixed page size environment. Alternatively, the DBA can set up the database with a uniform page size, where depending on the application requirements, all pages in the database have the same size.
Regardless of how big the pages are, they all have a fixed 68 byte overhead that DB2 uses to describe the page. This header contains information such as the offset to the beginning of contiguous free space in the page, the location of free space within the page (the hole chain) and the total amount of free space in the page. Pointers to the information on the page (i.e. the location of rows stored in the page) are maintained in a chain of two-byte entries at the end of each table page. This is called the ID Map, and each of these entries contains a Relative Byte Address (RBA) pointing to the beginning location of a row. Since the entries in the ID Map are 2 bytes long, the upper limit on the number of elements that can be stored in a page is 255.
Because free space and user data are managed in a non-contiguous way throughout the page, the ID Map entries can change – in the example where a modified row no longer fits in its original location. If a row will not fit back into a page where it was originally, the ID Map RBA will point to a location that contains a 6 byte pointer to a different page and ID map (called an offset pointer).
Free space in each page is managed in pieces called ‘holes’. Large holes (greater than 4 bytes) are chained from the header; small holes (1 to 4 bytes) can be anywhere in the page. Any contiguous free space is immediately in front of the ID map area. Because its location is known, it is not stored in the hole chain, there is an offset in the header that points to the beginning of contiguous free space.
The elements that can be stored in a DB2 page are therefore records (Rows and Indexes), Holes, Overflow Records or pointers to Overflow Records. Each element has an identifying flag and a 6-byte ID Map entry number in its prefix – the ROWID. As mentioned earlier, this ROWID can either be the actual record in the page or a pointer to a ROWID of an Overflow Record in a separate page.
DB2 indexes an element to a page, and its location in the page is stored in the ID Map. This means that DB2 can reorganize elements in a page without updating the index – the index points to the page, but the location of the record referred to by the index is maintained in the ID Map. DB2 can do housekeeping on a page – if a record is too large to fit back where it originally came from, DB2 will de-frag any fragmented free space (holes) and the space freed up by this will become part of the contiguous free space immediately before the ID Map.
Data on a page is a record or an offset pointer and it is useful to visualize how it is organized. Initially the page has a lot of free space and a pointer to its location is kept in the fixed page header. Rows are inserted into a table in first-fit order. The page’s free space map is searched for the first available space large enough to hold the new row. When a row is updated, it will be updated in place unless it is now larger than the space it occupied previously, in which case DB2 will de-frag the page. If there is still insufficient space left on the page the record will ‘overflow’ into another page, and its new location will be stored as a 6 byte offset pointer in the original record location. As the page fills up with information and as the information is modified it becomes fragmented throughout the page. Since contiguous free space is maintained at the end of the table (before the ID Map), it can be thought of as information on the page growing downward from the page header. The page header also keeps track of the free space in the page (holes between page elements), and will chain the larger (more useful) holes together directly beneath the page header. When space begins to run out on the page will move records into the space that was taken up with large holes. Think of records ‘floating’ up the page into spaces that used to be holes. The data will thus shuffle up the page and increase the contiguous free space at the bottom of the page. Small holes may eventually become big holes in this process (and as data is modified on a page), and will become incorporated into the large hole chain.
A pointer is kept to the beginning of contiguous free space, since the end is ‘nibbled’ away by ID Map entries – each new element on a page gets a 2 byte ID Map entry, and this will grow this map upward from the bottom of the page.
An extent in DB2 contains a number of contiguous pages. The page size and extent size in DB2 are set at the time that the tablespace is created and cannot be altered easily (i.e. by an ALTER command), once they are defined they are set for the life of the database. It is inefficient for a DBMS to work on individual pages, the I/O overhead eats into performance. Database I/O works with extents, in this way a number of contiguous pages are fetched from disk in a single I/O. As we will see later, through design considerations that account for data distribution and application needs (access based design), you can arrange that large amounts of related (and hence likely to be needed) data are fetched with a single I/O. Table data cannot be mixed within extents, so one extent is for one table only.
An extent is a multiple of pages, so selection of the page and extent size will have an impact on the type of application processing you are doing. Typically an OLTP environment will have small pages (4k,8k), and a small number of pages per extent: 4,6,16. Decision Support and Data Warehousing applications will have larger pages (16 or 32K), and larger extents (16, 32, 64 or 128). The extent size should be chosen based on the table size and anticipated usage, depending on whether the application is query intensive, transaction intensive or a mixture of both. Smaller tables are handled more efficiently with smaller extents.
The extent size is used by DB2 to determine the size of the prefetch block when data is retrieved. It also determines the number of pages that will be written to a container before skipping to the next container. If new data needs to be written and there is insufficient space, the write will physically allocate a full extent of contiguous pages.
Containers are how DB2 handles the storage of data on the operating systems file system. Data in the database is viewed logically as tablespaces, tables, rows and so on. Each tablespace is stored and managed in one or more containers, and how the tablespace is defined determines how data in the tablespace is handled on disk.
Containers can be managed by the DBA – Database Managed Storage, by the Operating System File System – System Managed Storage, or by the Database – Automatic Storage.
Containers (and how they are managed) are more concerned with the efficient usage of storage than the logical considerations of the database.
Database Managed Storage (DMS) requires space to be allocated in advance, and when more space is required the DBA needs to manage this. In a DMS space, the containers can be in a file under the host file system, or can be created directly on a raw device. If a raw device is used, data is written directly to the disk from the container without any operating system cacheing. While DMS performance is very good, it has an administration overhead – the DBA has to pre-allocate space and manage this space as the dataset grows. DMS is flexible in that the DBA can drop, add and resize containers as needed.
System Managed Storage (SMS) stores containers in a file system directory. Tables and indices are created as files within the directory, and the host operating system is responsible for space allocation (both initially and as the dataset grows or shrinks.) Although this reduces the DBA overhead involved with DMS, the tablespace can become fragmented which will result in overhead when opening and closing multiple files. It is not as easy to manage the size of containers in SMS as it is in DMS.
With DB2 9 tablespaces and container management is increasingly handled by the database itself – Automatic Storage. In Automatic Storage the database has a number of storage paths associated with them, and the tablespaces (and hence containers) in the database are assigned and allocated based on those storage paths by DB2. In this way, you don’t need to explicitly define the containers. They will grow automatically in their directory paths, and if the directory paths are on different disks, the table data will be spread across these disks. DB2 will create new containers if they are needed.Automatic storage gives the best options of DMS (speed and flexibility) and SMS (ease of administration)
A tablespace is a very important structure in DB2. As we saw earlier, tablespaces are made up of a number of containers which map directly to the disk. The containers in the tablespace specify the storage paths or directory structures that will physically hold the data.
Tablespaces contain tables, and hence user data that maps to the user and database applications. When you create a table, it has to go in a tablespace, and you can specify which tablespace to use (if you don’t, the system will default where it puts the table, and it might not be where you need it).
The tablespace is also what is mapped into RAM. For each tablespace, you have to define the page size, and each tablespace must have access to a bufferpool of the same page size, and a system temporary tablespace of the same page size. You have to specify which buffer pool the table space is associated with.
An extent is the unit of I/O for the DBMS, multiple contiguous pages are fetched and written as an extent. Table data cannot be mixed in extents, an extent is for one table only. Extents are stored within containers. Depending on how the containers are set up (either with Automatic Storage, DMS or SMS) there is a path associated with each container to part of the physical file system. If the tablespace (and hence containers) are on a disk array, the extent size should be set to the disk stripe size. In this way we can see that extents striped across multiple containers can be resident on different disks (useful for performance reasons), and the relationship between tables, extents, containers and disks is shown in figure n below
There are three types of tablespace in DB2; Regular, Large, and Temporary.
• Regular tablespace
A regular table space holdsl permanent data such as tables and indexes. It can also hold Large Objects (LOBs), or LOBs can be stored in a large table space. • Large tablespace
A table created in a large table space can be larger than a table in a regular table space. A large table can support more than 255 rows per data page improving space utilization on data pages.
Tablespaces within Temporary are system temporary (for database operations like sorts and joins) and user temporary where the user can define global temporary tables. Tables created in system temporary are managed by the system, once they are no longer needed, the system will delete them. User Temporary tables need to be explicitly managed by the user.
User temporary table space A user temporary table space stores declared global temporary tables. No user temporary table spaces exist when a database is created. At least one user temporary table space should be created to allow definition of declared temporary tables. User temporary table spaces are optional. No user temporary table space is created by default.
In DB2 (like Oracle) A database is comprised of a number of tablespaces.
The size of a database is the sum of the tablespaces it contains, and the containers within each tablespace have storage paths within them that map to the underlying disk system, whether this is managed by the OS, DBMS or DBA.
When a database is first created, three system tablespaces are created
• The Catalog tablespace
The catalog tablespace is a Regular tablespace called SYSCATSPACE and it holds the system catalog tables. There is only one catalog table space per database.
• Large table space
DB2 creates one large table space named USERSPACE1 when a database is created. A large table space stores all permanent data just as a regular table space does, including LOBs. This table space type must be DMS.
• System temporary table space
At least one system temporary must exist per database, and the default named TEMPSPACE1 is created with the database. A system temporary table space stores internal temporary data required during SQL operations such as joins, sorts, index creation, and table reorganization.
At database creation time, the recovery logs for the database are created, and a set of system tables is created in SYSCATSPACE. These system catalog tables contain information about the definitions of the database objects (i.e. tables, views, indexes, and packages), and database security information (i.e. which users can have access to these objects.) These tables are updated during the operation of a database; for example, when a table is created. The information in these tables is available through a series of routines and views.
After the database is created, the user can create additional Large Tablespaces, Regular Tablespaces and User Temporary Tablespaces. By default, no User Temporary Tablespaces are created along with the database.
Tablespaces manage space in the database, as they are created, grow and shrink, the method by with they are managed and the paths to the underlying storage take care of this for you. DBA’s can manage tablespaces for maintenance by taking them offline (i.e. for backup or recovery), however the SYSCATSPACE cannot be taken offline since it contains control information for the database.