PostgreSQL/MVCC

From Wikibooks, open books for an open world
Jump to navigation Jump to search
Configuration PostgreSQL
MVCC
WAL


Multiversion Concurrency Control (MVCC) is a broadly used database technique to accomplish two goals: first, it provides every concurrently running transaction with a consistent state of the data as of the time of the transaction's start - this state is called a snapshot -, and second, it minimizes lock contention and guarantees high performance in a multi-user environment. MVCC implementation is very complex. This short introduction is provides a high-level overview.

Note: On this page, row is used in the conventional sense of relational database systems, and record denotes one or more versions of this row.

The Concept[edit]

The basic idea is, that the modification of a column value does not lead to a modification of the value in the original record. Instead PostgreSQL creates an additional record which contains the changed value. Thus an UPDATE command does not really update values in a record. It creates a new record and leaves the values in the old record unchanged.

This behaviour raises the question, which one of the multiple records of the row shall be delivered to concurrent or later running processes. To resolve the situation PostgreSQL adds two additional hidden system columns to every record. The columns xmin and xmax contain transaction IDs (xid). Those transaction IDs act like a serial data type . They are increased by 1 for every new transaction within the system. Therefore a lower xid indicates an older transaction. xmin contains the xid of the transaction that created the record. Transactions with an xid greater that xmin are younger and may see this record. xmax is initially 0. During write operations it changes to the xid of the transaction, which has created the next version of this record - or has deleted the record. Transactions with an xid greater that xmax are younger and cannot see this record, if xmax differs from 0.

Examples[edit]

Example 1: INSERT

Time xmin xmax column value Transaction #20 Transaction #21
INSERT ...
t1 20 0 1
COMMIT
SELECT ...

The initial INSERT creates one record. xmin contains the xid of this transaction and xmax the value 0 to indicate, that the record is not deleted. The SELECT of transaction #21 sees this record because its xid is greater than xmin.

Example 2: UPDATE

Time xmin xmax column value Transaction #25 Transaction #33
t1 20 0 1
UPDATE ...
t2 20 25 1
25 0 2
COMMIT
SELECT ...

UPDATE stores its own xid in xmax of the already existing record and creates a new record with the new column value in the same way as an INSERT. The SELECT of transaction #33 sees only the record with the column value 2. The first record is invisible as its xmax is smaller (= older) than the transactions xid.

Example 3: DELETE

Time xmin xmax column value Transaction #101 Transaction #120
t2 20 25 1
25 0 2
DELETE ...
t3 20 25 1
25 101 2
COMMIT
SELECT ...

DELETE changes xmax in the last record to its own xid - nothing else is done! Transaction #120 cannot see any of the two records as its xid is heigher than xmax (and xmax is not 0).

The three examples show only the basic operations. The situation can become much more complicated, if the SELECT runs before the COMMIT of the writing transaction, if the writing transaction use multiple write-operations within its transaction context, if the writing transaction aborts, if the number of xids gets exhaused and wraps arround, ... .

Vacuum[edit]

VACUUM reclaims storage occupied by dead tuples. In normal PostgreSQL operation, tuples that are deleted or obsoleted by an update are not physically removed from their table; they remain present until a VACUUM is done. Therefore it's necessary to do VACUUM periodically, especially on frequently-updated tables.

If no additional action is taken, the MVCC technique significantly increases the size of any database that performs a large number of write actions. The number of records continues to grow even if the overall number of rows keeps constant or decreases. It is absolutely necessary to periodically remove such records that are no longer valid. The utility vacuumdb and the non-standard SQL command vacuum remove the invalid records.

Vacuum[1] retrieves the smallest (= oldest) xid from all active transactions. All records with an xmax smaller than this xid cannot been seen by any of the running or future transactions. Therefore it is safe, to physically remove them.

This task is very I/O intensive. Therefore, it should not run constantly, but on a periodical basis.


Configuration PostgreSQL
MVCC
WAL
  1. https://www.postgresql.org/docs/current/static/sql-vacuum.html