An alternative to the built-in standby mode described in the previous
sections is to use a restore_command
that polls the archive location.
This was the only option available in versions 8.4 and below. In this
setup, set standby_mode
off, because you are implementing
the polling required for standby operation yourself. See the
pg_standby module for a reference
implementation of this.
Note that in this mode, the server will apply WAL one file at a
time, so if you use the standby server for queries (see Hot Standby),
there is a delay between an action in the master and when the
action becomes visible in the standby, corresponding the time it takes
to fill up the WAL file. archive_timeout
can be used to make that delay
shorter. Also note that you can't combine streaming replication with
this method.
The operations that occur on both primary and standby servers are normal continuous archiving and recovery tasks. The only point of contact between the two database servers is the archive of WAL files that both share: primary writing to the archive, standby reading from the archive. Care must be taken to ensure that WAL archives from separate primary servers do not become mixed together or confused. The archive need not be large if it is only required for standby operation.
The magic that makes the two loosely coupled servers work together is
simply a restore_command
used on the standby that,
when asked for the next WAL file, waits for it to become available from
the primary. The restore_command
is specified in the
recovery.conf
file on the standby server. Normal recovery
processing would request a file from the WAL archive, reporting failure
if the file was unavailable. For standby processing it is normal for
the next WAL file to be unavailable, so the standby must wait for
it to appear. For files ending in
.history
there is no need to wait, and a non-zero return
code must be returned. A waiting restore_command
can be
written as a custom script that loops after polling for the existence of
the next WAL file. There must also be some way to trigger failover, which
should interrupt the restore_command
, break the loop and
return a file-not-found error to the standby server. This ends recovery
and the standby will then come up as a normal server.
Pseudocode for a suitable restore_command
is:
triggered = false; while (!NextWALFileReady() && !triggered) { sleep(100000L); /* wait for ~0.1 sec */ if (CheckForExternalTrigger()) triggered = true; } if (!triggered) CopyWALFileForRecovery();
A working example of a waiting restore_command
is provided
in the pg_standby module. It
should be used as a reference on how to correctly implement the logic
described above. It can also be extended as needed to support specific
configurations and environments.
The method for triggering failover is an important part of planning
and design. One potential option is the restore_command
command. It is executed once for each WAL file, but the process
running the restore_command
is created and dies for
each file, so there is no daemon or server process, and
signals or a signal handler cannot be used. Therefore, the
restore_command
is not suitable to trigger failover.
It is possible to use a simple timeout facility, especially if
used in conjunction with a known archive_timeout
setting on the primary. However, this is somewhat error prone
since a network problem or busy primary server might be sufficient
to initiate failover. A notification mechanism such as the explicit
creation of a trigger file is ideal, if this can be arranged.
The short procedure for configuring a standby server using this alternative method is as follows. For full details of each step, refer to previous sections as noted.
Set up primary and standby systems as nearly identical as possible, including two identical copies of PostgreSQL at the same release level.
Set up continuous archiving from the primary to a WAL archive directory on the standby server. Ensure that archive_mode, archive_command and archive_timeout are set appropriately on the primary (see Section 25.3.1).
Make a base backup of the primary server (see Section 25.3.2), and load this data onto the standby.
Begin recovery on the standby server from the local WAL
archive, using a recovery.conf
that specifies a
restore_command
that waits as described
previously (see Section 25.3.4).
Recovery treats the WAL archive as read-only, so once a WAL file has been copied to the standby system it can be copied to tape at the same time as it is being read by the standby database server. Thus, running a standby server for high availability can be performed at the same time as files are stored for longer term disaster recovery purposes.
For testing purposes, it is possible to run both primary and standby servers on the same system. This does not provide any worthwhile improvement in server robustness, nor would it be described as HA.
It is also possible to implement record-based log shipping using this alternative method, though this requires custom development, and changes will still only become visible to hot standby queries after a full WAL file has been shipped.
An external program can call the pg_walfile_name_offset()
function (see Section 9.26)
to find out the file name and the exact byte offset within it of
the current end of WAL. It can then access the WAL file directly
and copy the data from the last known end of WAL through the current end
over to the standby servers. With this approach, the window for data
loss is the polling cycle time of the copying program, which can be very
small, and there is no wasted bandwidth from forcing partially-used
segment files to be archived. Note that the standby servers'
restore_command
scripts can only deal with whole WAL files,
so the incrementally copied data is not ordinarily made available to
the standby servers. It is of use only when the primary dies —
then the last partial WAL file is fed to the standby before allowing
it to come up. The correct implementation of this process requires
cooperation of the restore_command
script with the data
copying program.
Starting with PostgreSQL version 9.0, you can use streaming replication (see Section 26.2.5) to achieve the same benefits with less effort.