PostgreSQL/Architecture Directories

From Wikibooks, open books for an open world
Jump to navigation Jump to search

PostgreSQL organizes durable (persistent) data as well as volatile state information about transactions or replication actions in the file system. Every cluster has its root directory somewhere in the file system. In many cases, the environment variable PGDATA points to this directory. The following graphic uses data, which is the default, as the name of the cluster's root directory.



The cluster's root directory contains many subdirectories and some files, all of which are necessary to store durable as well as temporary information. The root's name can be selected as desired, but the names of its subdirectories and files are constant respectively determined by PostgreSQL. The following paragraphs describe the most important subdirectories and files.

base contains one subdirectory per database. The names of those subdirectories consist of numbers. These are the internal Object Identifiers (OID), which are numbers to identify their definition in the System Catalog.

Within the database-specific subdirectories of base, there are many files: one or more for every heap and index. Again, the filenames consist of numbers. Those files are accompanied by files for the Free Space Maps (suffixed _fsm) and Visibility Maps (suffixed _vm), which contain optimization information. An example for filenames is: 3083, 3083_fsm, 3083_vm.

Another subdirectory is global. It contains files with information about SQL Objects which are not restricted to a certain schema, but known and relevant at the schema level.

In pg_tblspc, there are symbolic links that point to directories that are outside of the root directory tree, e.g. at a different disk. Heap and index files of non-default tablespaces reside there. Those defined within the default tablespace reside in the database-specific subdirectories.

The subdirectory pg_wal contains the WAL files. They arise and grow in parallel with data changes in the cluster and remain as long as they are required for recovery, archiving, or replication.

The subdirectory pg_xact contains information about the status of each transaction: in_progress, committed, aborted, or sub_committed.

In the root directory, there are some files. In many cases, the configuration files of the cluster are stored here. Also, if the instance is up and running, the file postmaster.pid exists here (by default, but other locations are possible). It contains the process ID (pid) of the Postmaster process which has started the instance and controls it.