PostgreSQL/Configuration

From Wikibooks, open books for an open world
Jump to navigation Jump to search
Managing the Instance PostgreSQL
Configuration
MVCC


The main configuration file is postgresql.conf. It is divided into several sections according to different tasks. The second important configuration file is pg_hba.conf, where authentication definitions are stored.

Both files reside in the special directory $PGDATA (Debian/Ubuntu) or in the main directory of the cluster (RedHat).

Some configuration items have a dynamic nature, and will take effect with a simple pg_ctl reload. Others require a restart of the instance pg_ctl restart. The comments in the default configuration files state of the two actions has to be taken.

postgresql.conf[edit]

File Locations[edit]

The value of data_directory defines the location of the cluster's main directory. In the same way the value of hba_file defines the location and the name of the above mentioned pg_hba.conf file (host based authentication file), where rules for authentication are stored - some more details are shown below.

Connections[edit]

In the connections section you define the port number (default: 5432), with which client applications can reach the instance. In addition the maximum number of connections is defined as well as SSL, IP and TCP settings.

Resources[edit]

The main definition in the resources section is the size of shared buffers. It determines, how much space is reserved to "mirror" the content of data files within PostgeSQL's buffers in RAM. The predefined default value of 128 MB is relatively low.

Secondly, there are definitions for the work and the maintenance memory. They determine the RAM sizes for sorts, create index commands, ... . These two RAM areas exist per connection and are used individually by them whereas the shared buffers exist only once for the whole instance and are used concurrently by multiple processes.

Additionally there are some definitions concerning vacuum and background writer processes.

WAL[edit]

In the WAL section there are definitions for the behaviour of the WAL mechanism.

First, you define a WAL level out of the four possibilities minimal, archive, hot_standby, and logical. Depending on the decision, which kind of archiving or replication you want to use, the WAL mechanism will either write only basic information to the WAL files or include additional information. minimal is the basic method which is always required for every crash recovery. archive is necessary for any archiving action, which includes the point-in-time-recovery (PITR) mechanism. hot_standby adds information required to run read-only queries on a standby server. logical adds information necessary to support logical decoding.

Additionally and in correlation to the WAL level archive there are definitions which describe the archive behaviour. Especially the 'archive_command' is essential. It contains a command which copies WAL files to an archive location.

Replication[edit]

If you use replication to a different server, you can define the necessary values for master and standby server in this section. The master reads and pays attention only on the master-definitions and the standby only to the standby-definitions (you can copy this section of 'postgres.conv' directly from master to standby). You must define the WAL level to an appropriate value.

Tuning[edit]

The tuning section defines the relative costs of different operations: sequential disc I/O, random disc I/O, process one row, process one index entry, process one function-call or arithmetic operation, size of effective RAM pages (PostgreSQL + OS) per process which will be available at runtime. These values are used by the query planner during its search for an optimal query execution plan. The values are not real values in sense of milliseconds or number of CPU cycles. They are only a rough guideline for the query planer and relative to each other. The real values are calculated during the query execution may differ significantly.

There is also a subsection concerning costs for the genetic query optimizer, which - in opposite to the standard query optimizer - implements a heuristic searching for optimal plans.

Error Logging[edit]

The error logging section defines the amount, location and format of log messages which are reported in error situations or for debugging purposes.

Statistics[edit]

In the statistics section you can define - among other things - the amount of statistic collection for parsing, planing and execution of queries.

pg_hba.conf[edit]

The pg_hba.conf file (host-based authentication) contains rules for client access to the instance. All connection attempts of clients which do not satisfy these rules are rejected. The rules restrict the connection type, client IP address, database within the cluster, user-name, and authentication method.

There are two main connection types: local connections (local) via sockets and connections via TCP/IP (host). The term local refers to the situation, where a client program resides on the same machine as the instance. The client may override the local connection and use the host connection type by using the TCP/IP address syntax (e.g.: 'localhost:5432') of the cluster.

The client IP address is a single IPv4 or IPv6 address or a masking of a net-segment via a CIDR mask.

The database and the client user name must be given explicitly or may be abbreviated by the key word "ALL".

There are different authentication methods

  • trust: don't ask for any password
  • reject: don't allow any access
  • password: ask for a password
  • md5: same as 'password', but the transfer of the password occurs MD5-encrypted
  • peer: trust the client, if he uses the same database username as his operation system username (only applicable for local connections)

Since the pg_hba.conf records are examined sequentially for each connection attempt, the order of the records is significant. The first match between defined criteria and properties of incoming connection requests hits.


Managing the Instance PostgreSQL
Configuration
MVCC