Drop Down MenusCSS Drop Down MenuPure CSS Dropdown Menu

PostgreSQL Replication

Now today with versions 9.X of PostgreSQL, we call it “streaming replication”, and allows you to setup multiple standby servers using both synchronous and asynchronous slaves.  Any type of multi-master replication, however, still requires that you use one of the other independent replication options.

Configure Simple Master / Slave Streaming Replication in PostgreSQL in 9.X

Inasmuch as I recently had to setup a simple PostgreSQL master slave configuration recently, I thought that I jot down the notes and share them here to help others that needed to perform the same task someday.  These steps assume that you installed your PostgreSQL similar to steps in the previous article:  CentOS Install PostgreSQL.  The slave server will be overwritten.  Any data stored on it will be gone.

Step 1 – Configure SSH Keys for the PostgreSQL User

When the WAL logs get switched out, we typically will want to be able to copy them over to our slave so that we have less risk of losing data in the event of a disaster.  Rather than manually copying these files, we can automate the task.  However, to automate the copying of these files, we need to create an SSH key on the master server that we will copy over to the slave so that later the master can copy its WAL files over after each switch.  Because we already wrote an article on creating ssh keys here at uptimemadeeasy.com, I will refer you to steps 2-5 of the article automate-file-copy-machines-using-rsync-over-ssh to accomplish this first task.  Follow those steps, creating the ssh keys as the postgres user on the master server and then copy the key over to the postgres user’s $HOME/.ssh/authorized_keys file on the slave server.  When all of this is done, come back here and we will begin the next steps.

Step 2 – Create a PostgreSQL Replication User

Now that we have our ssh keys figured out, our next step is to create a userid that the slave servers can use to contact PostgreSQL on this server.  This is pretty simple using the create user syntax:
create user <username> replication encrypted password '<password-here>';
Now, we need to edit the pg_hba.conf file to allow this new replication user access to the master PostgreSQL server.  I did this by adding this line to my file:
host    replication     rep             my.ip.add.ress/32           md5
where my.ip.add.ress is the actual ip address of the slave server.  After editing this pg_hba.conf file, I had to restart PostgreSQL to make it take.

Step 3 – Changes to the Master Server’s PostgreSQL.conf File

Now it is time to progress to the postgresql.conf file configuration.  We have a list of configuration options that we need to change.  I will list all of my changes in one spot and then discuss them a bit below:
listen_addresses = '*'   
wal_level = hot_standby   
wal_keep_segments = 10
archive_mode = on         
archive_command = '/usr/bin/rsync -qa %p postgres@prizm2.1on1.com:/data/pgsql/9.3/archive/%f' 
max_wal_senders = 1             
hot_standby = on
Let’s go through each of these:
listen_addresses lists all of the addresses that you want to have your PostgreSQL listen on.listen_addresses lists all of the addresses that you want to have your PostgreSQL listen on.listen_addresses lists all of the addresses that you want to have your PostgreSQL listen on.listen_addresses lists all of the addresses that you want to have your PostgreSQL listen on.
Once you have your postgresql.conf file set on the master, restart the PostgreSQL server.

 Name
 Description
listen_addresses
Lists all of the addresses that you want to have your PostgreSQL listen on.  You can set this to a comma separated list or use wildcards.  Be careful using wildcards as changing this value will allow different hosts to connect to it.  Using this option as well as a firewall such as iptables is a good idea.
wal_keep_segments
This is the minimum number of 16MB WAL files to retain in the pg_xlog directory.  More files may exist to handle a large checkpoint.  If your slave server frequently loses connection and has to catch up on lots of data, setting this to a higher number can help.  Remember that the you can also set your WAL files to be rsynced to the slave server using the archive_command option.
wal_level
Can be set to one of these options to determine the level of wal logging done:  minimal, archive, or hot_standby.  Must be set to archive or preferably hot_standby for streaming replication as minimal does not log enough information for replication.
archive_mode
Setting archive_mode to on enables the WAL logs.
archive_command
if archive_mode is set to “on” then this shell command is executed whenever a WAL file segment (default is 16MB) is completed.  In the case above, we have it copying with rsync the file to the slave server.  There are two variables that can be used in your shell command:  %f – filename or %p – path.
max_wal_senders
Default is Zero.  Max_wal_senders is the number of concurrent connections allowed   from the slave servers.  wal sender connections count against the total number of connections configured in max_connections.  Be sure to have at least one connection configured in max_wal_senders per configured slave.
hot_standby
default is off.  Determines if you can connect and run queries during times that your server is in archive recovery or standby mode.  This is helpful to replication slaves.

 4 – Changes to the Slave Server (PostgreSQL.conf and recovery.conf Files)

Next, we go to the slave server and configure its postgresql.conf file and the recovery.conf file.

Edit the postgresql.conf file and set:
hot_standby = on
Now, we create the recovery file in the postgresql data directory.  On my repo installed postgresql server, I found a sample recovery file at:  /usr/pgsql-9.3/share/recovery.conf.sample.  This file tells PostgreSQL how to perform either an archive recovery of a database or how to connect to the master server in the event of replication like we are doing.
standby_mode = on
primary_conninfo = ‘host=pg1.uptimemadeeasy.com port=5432 user=rep password=<my rep user’s password>’
trigger_file = ‘/tmp/postgresql.trigger.5432’
standby_modedescription
Parameter
Description
standby_mode
When enabled, standby_mode causes PostgreSQL to work as a standby in a replication configuration
primary_conninfo
Specifies the connection information used to connect to the master server.  Example:  primary_conninfo = ‘host=host.domain.com port=5432 user=userid password = password’
trigger_file
While running as a standby, the slave will continue to restore the XLOG records from the primary server. The trigger file is what is used to “trigger” a server to leave standby mode and to begin being a primary server.  You can configure this option by specifying the trigger file to look for.

 Step 5 – Materialize the Slave
Before we can finalize the replication we need to get an initial copy of the database onto the slave machine.  We can do this pretty easily by running the typical binary backup process (pg_start_backup and pg_stop_backup), but instead of backing up the PostgreSQL files, we are copying them to the slave server using rsync and the ssh key that we generated in step 1 above.
$ psql -c “select pg_start_backup(‘mybackup’);”
$ rsync -cva --inplace --exclude=*pg_xlog* /data/pgsql/9.3/data/ postgres@pg2.1on1.com:/data/pgsql/9.3/data/
$ psql -c "select pg_stop_backup();"

Restart everything and hope for the best!

This is where we are hoping that we have everything working and, of course, everything really should be up and working once we startup the slave server.

Start the Slave Server

# service postgresql-9.3 start
Watch the log file as it starts up to make sure that there are no errors. So, I will tail the log being written today:
# tail -f /data/pgsql/9.3/data/pg_log/postgresql-Tue.log
< 2014-03-24 23:15:47.670 MDT >LOG: started streaming WAL from primary at 5D9/87000000 on timeline 1
< 2014-03-24 23:17:20.233 MDT >LOG: consistent recovery state reached at 5D9/921BBF70
< 2014-03-24 23:17:20.234 MDT >LOG: database system is ready to accept read only connections
Great!  Looks like everything is working.  We can verify it is working by creating a test table and inserting data to see if they make it into the slave.

On the Master:

postgres=# create database joke;

postgres=# \c joke;

joke=# create table joketable ( jokeint int, jokechar varchar(40) );
CREATE TABLE
joke=# insert into joketable values ( 1, '2 Guys walk into a bar...' );
INSERT 0 1

On the Slave:

joke=# select * from joketable;
jokeint | jokechar
---------+---------------------------
1 | 2 Guys walk into a bar...
(1 row)
So, it appears to be working as our database, table, and row data made it to the slave.

Repairing Broken PostgreSQL Streaming Replication

So, I have already broken my PostgreSQL replication.  Don’t stress!  Take a deep breath.  The best thing is that your slave is just a copy of the master’s data.  We can just rebuild the slave.  Here are the steps that I take to repair my slave when it is broken:

First – Shutdown the Slave and Copy the recovery.conf, postgresql.conf, and Any Other File You Wish to Keep, Then Rename the Data Directory

Shutdown the slave:
# service postgresql-9.3 stop
Make a copy of these important files:
$ cd /data/pgsql/9.3/data
$ cp recovery.conf $HOME
$ cp postgresql.conf $HOME
$ mv $ cd /data/pgsql/9.3/data $ cd /data/pgsql/9.3/data-old

Second, Rematerialize Your Slave From the Master’s Copy

On the Master perform these steps again:
$ psql -c “select pg_start_backup(‘mybackup’);”
$ rsync -cva --inplace --exclude=*pg_xlog* /data/pgsql/9.3/data/ postgres@pg2.1on1.com:/data/pgsql/9.3/data/
$ psql -c "select pg_stop_backup();"

Third, Copy Back the Recovery.conf and the Postgresql.conf Files and Restart the Slave

Remember when we backed up these files? We will put them back in place.
$ cp $HOME/recovery.conf   /data/pgsql/9.3/data
$ cp $HOME/postgresql.conf /data/pgsql/9.3/data
Startup the Slave PostgreSQL
# /sbin/service postgresql-9.3 start
Verify the logs to make sure that it is working.

Other Helpful PostgreSQL Streaming Replication Materials

pg_current_xlog_insert_location()# select pg_current_xlog_insert_location();
pg_current_xlog_insert_location
———————————
5DB/93301F20
(1 row)
master
Get the current transaction log insert location
pg_current_xlog_location()# select pg_current_xlog_location();
pg_current_xlog_location
————————–
5DB/93B442A8
(1 row)
master
Get the current transaction log write location.
pg_switch_xlog()# select pg_switch_xlog();
pg_switch_xlog
—————-
5DB/93B6AC60
(1 row)
master
Force PostgreSQL to switch to a new transaction log file.
pg_is_in_recovery()# select pg_is_in_recovery();
pg_is_in_recovery
——————-
t
(1 row)
master(returns f)slave (returns t)
This is a boolean return type.  t for true, f for false.  This can help you identify if you are on the slave or master as the slave will return t.
pg_last_xlog_receive_location# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
——————————-
5DB/B9FC4880
(1 row)
slave
Find the last transaction log location received that was synced to disk through streaming replication.  Will return a null if not on slave or if replication is not enabled.
pg_last_xlog_replay_location# select pg_last_xlog_replay_location();
pg_last_xlog_replay_location
——————————
5DB/B9FD61
slave
Find the last transaction log location replayed during a recovery.  Returns NULL if replication is not enabled.
pg_last_xact_replay_timestamp()# select pg_last_xact_replay_timestamp();
pg_last_xact_replay_timestamp
——————————-
2014-03-25 15:34:29.867547-06
(1 row)
slave
Find the timestamp of the last transaction replayed during the recovery.  Reflects the time from the WAL record commit or abort for the transaction.
pg_is_xlog_replay_paused()# select pg_is_xlog_replay_paused();
pg_is_xlog_replay_paused
————————–
f
(1 row)
slave
This is a boolean that returns true if the replay has been paused.  Let’s you know if your slave has been paused or not.
pg_xlog_replay_pause()# select pg_xlog_replay_pause();
pg_xlog_replay_pause
———————-(1 row)
slave
Will pause a slave if run.  Shows a NULL, but still works –Be careful.
pg_xlog_replay_resume()
slave
Will restart the slave when run.  Shows a NULL, but still works to start.
Here are some other great Links for you that should be useful:

Monitoring PostgreSQL Streaming Replication 

Comments

Popular posts from this blog

PostgreSQL Pgbadger Installation On Linux

PostgreSQL Sequence

How to configure Replication Manager (repmgr) ?

Postgresql maximum size

PostgreSQL pgBadger