PostgreSQL Hot Standby Steps

- January 04, 2017

First, determine the mount points of the Postgresql data directories and the WAL (Write Ahead Log) directories. For performance, the $PGDATA and pg_xlog directories should be on separate volumes.
In the examples below, they are defined as follows:

$PGDATA is the "master" cluster, located at /pgdata/9.3/data, at (for example) 192.168.100.0
The "slave" (aka "hot standby") will be (for example) located at /pgdata/9.3/data, at (for example) 192.168.100.1
The WAL staging area on the slave will be /pgdata/WAL_Archive. This is where the WAL segments are sent to from the archive_command on the master. Ideally on a separate volume from $PGDATA.

Assumptions:

The master cluster is up and running normally.
There is no slave cluster yet.
You are doing all operations as the "postgres" OS user (not root!)
ssh between both servers is working
You are using at least PostgreSQL 9.2
pg_hba.conf and postgresql.conf are located at $PGDATA/. If not, change the instructions below to match your location.
You are setting up the slave on a separate server.
If the servers are not on the same network (eg. different colocations), use the "-z" flag in the "archive_command" rsync on the master to compress before transferring.
These commands are for Linux, but with some substitution of commands should work on Windows too.

Step 1. On the slave, create the staging directory to hold the master's log files

mkdir -p /pgdata/WAL_Archive
chown postgres:postgres /pgdata/WAL_Archive

Step 2.On the master, edit the $PGDATA/postgresql.conf

wal_level = hot_standby
archive_mode = on ## /pgdata/WAL_Archive is a staging directory on the slave
archive_command = 'rsync -W -az %p postgres@$SLAVE_IP_HERE:/pgdata/WAL_Archive/'
max_wal_senders = 5
wal_keep_segments = 5000   # If you have the room, to help the pg_basebackup
                           # not fail due to WAL segments being removed from master.
                           # For clusters will very little traffic, 100 is probably fine

Step 3.On the master, create the replication role, which will be copied to the slave via pg_basebackup. Set a password for the "replication" role

psql -U postgres -d postgres -c "CREATE USER replication WITH replication ENCRYPTED PASSWORD 'changeme' LOGIN"

Step 4.Modify the master $PGDATA/pg_hba.conf and enable the replication user for the IP of the slave

# TYPE DATABASE USER ADDRESS METHOD
#hostssl    replication     replication     $SLAVE_IP_HERE/32       md5
    host    replication     replication     $SLAVE_IP_HERE/32       md5

Step 5.Restart the master cluster to pick up the changes to the postgresql.conf. This is done as the cluster superuser.
For Example:

pg_ctl -D $PGDATA restart -m fast
## The master cluster MUST be restarted before the pg_basebackup command is executed.

Step 6.On the slave, from $HOME, issue the pg_basebackup command to start setting up the hot standby from the master.

## --host=IP_OF_MASTER  -> The master's IP
## --pgdata=$PGDATA     -> The slave's $PGDATA directory
## --xlog-method=stream -> Opens a second connection to the master to stream the WAL segments rather than pulling them all at the end
## --password will prompt for the replication role's password
## Without compression, "stream" gets the changes via the same method as Streaming Replication time
pg_basebackup --pgdata=$PGDATA --host=IP_OF_MASTER --port=5432 --username=replication --password --xlog-method=stream --format=plain --progress --verbose

## Alternate version with compression, note "--xlog --gzip --format=tar"
#time pg_basebackup --pgdata=$PGDATA --host=IP_OF_MASTER --port=5432 --username=replication --password --xlog --gzip --format=tar --progress --verbose

Step 7.On the slave, after pg_basebackup has successfully completed, edit $PGDATA/postgresql.conf

hot_standby = on #off               # "on" allows queries during recovery
max_standby_archive_delay = 15min   # max delay before canceling queries,
                                    # set to hours if backups will be taken from here
max_standby_streaming_delay = 15min # max delay before canceling queries
hot_standby_feedback = on #off

Step 8.On the slave, create $PGDATA/recovery.conf:

standby_mode = on

## To promote the slave to a live database, issue "touch /tmp/promote_db"
## Warning: If multiple slaves share the same /tmp/ directory,
##          then the trigger file must be named uniquely, else multiple slaves
##          could attempt to be promoted in the presence of the trigger file.
trigger_file = '/tmp/promote_db_slave'

## Host can be the master's IP or hostname
primary_conninfo = 'host=IP_OF_MASTER port=5432 user=replication password=CHANGEME'

## Log the standby WAL segments applied to a standby.log file
## TODO: Add the standby.log to a log rotator
## The paths must be explicitly defined, including the path to pg_archivecleanup
restore_command = 'cp /pgdata/WAL_Archive/%f "%p" 2>>/pgdata/9.3/data/pg_log/standby.log'

## XXX: If multiple slaves share the staging WAL directory,
##      do not use pg_archivecleanup as WAL segments could be removed
##      before being applied to other slaves.
archive_cleanup_command = '/usr/pgsql-9.3/bin/pg_archivecleanup /pgdata/WAL_Archive %r'

## On hot standby clusters, set to 'latest' to switch to the newest timeline in the archive
recovery_target_timeline = 'latest'

Step 9 .The slave should be ready to be started now. Start the cluster on the slave with whatever method works best.

pg_ctl -D $PGDATA start

Step 10.The following are the commands I run when recreating a standby over and over for test. I add them to a script called /root/recreate_standby.sh. Runs as root, but that is not necessary - if you run as postgres, removed the "sudo su - postgres -c" command.

#!/bin/bash

## This script runs on the standby.
## Executed as root, else remove the "sudo - postgres -c" commands.
## Assumes you have a valid recovery.conf saved at
## $PGDATA/../recovery.conf.bkp

export PGDATA=/path/to/data/dir     ## Must be set correctly
export PGPORT=5432
export MASTER=192.168.x.x            ## IP or host entry for the master Postgresql server
export PGBIN=/usr/pgsql-9.3/bin

service postgresql-9.3 stop -m immediate

if [ $? != 0 ]; then
    service postgresql-9.3 start
    echo "Could not shut down PostgreSQL. Aborting."
    exit 1
fi

rm -rf $PGDATA

if [ $? != 0 ]; then
    echo "Could not remove the PostgreSQL $PGDATA dir. Aborting."
    exit 1
fi

## If the replication role is not set to "trust" in the master's
## pg_hba.conf file, the password will need to be passed into the command below,
## and "--no-password" will need to be removed or revised to be "--password"
su - postgres -c "$PGBIN/pg_basebackup --pgdata=$PGDATA --host=$MASTER --port=$PGPORT --username=replication --no-password --xlog-method=stream --format=plain --progress --verbose"

su - postgres -c "cp -p $PGDATA/../recovery.conf.bkp $PGDATA/recovery.conf"

service postgresql-9.3 start
su - postgres -c "$PGBIN/pg_isready -U postgres -p $PGPORT -t2"
while [ $? != 0 ]; do
  echo "Sleep 1 second, check if slave is up yet. If not, sleep again."
  sleep 1;
  su - postgres -c "$PGBIN/pg_isready -U postgres -p $PGPORT -t2"
done
su - postgres -c "$PGBIN/psql -d postgres -U postgres -qXc 'select pg_is_in_recovery() as is_pg_in_recovery'"
exit 0

Search This Blog

TutorialDBA - Support | Training | Consultant

PostgreSQL Hot Standby Steps

Comments

Post a Comment

Popular posts from this blog

PostgreSQL Index

Top 40 Highest Paying URL Shortener Companies - Best URL Shorteners To Earn Fast

VMWARE WORKSTATION 3,4,5,6,7,8,9,10,11,12,14,15...etc LICENSE KEYS COLLECTION

How to CreateYour Own AWS Account Alias?

How to Get Table Size, Database Size, Indexes Size, schema Size, Tablespace Size, column Size in PostgreSQL Database