High Availability/Scalability Using Haproxy And Pgbouncer In Postgresql
- I have multiple PostgreSQL servers for a web application. Typically one master and multiple slaves in hot standby mode (asynchronous streaming replication).
- I use PGBouncer for connection pooling: one instance installed on each PG server (port 6432) connecting to database on localhost. I use transaction pool mode.
In order to load-balance my read-only connections on slaves, I use HAProxy (v1.5) with a conf more or less like this:
listen pgsql_pool 0.0.0.0:10001
mode tcp
option pgsql-check user ha
balance roundrobin
server master 10.0.0.1:6432 check backup
server slave1 10.0.0.2:6432 check
server slave2 10.0.0.3:6432 check
server slave3 10.0.0.4:6432 check
So, my web application connects to haproxy (port 10001), that load-balance connections on multiple pgbouncer configured on each PG slave.
Here is a representation graph of my current architecture:
This works quite well like this, but I realize that some implements this quite differently: web application connects to a single PGBouncer instance that connects to HAproxy which load-balance over multiple PG servers:
Here is the reason: HAProxy redirects connection to different servers. this results in MAC address change in the database connection. So if PGBouncer is above HAProxy, each time the connections in the pool gets invalidated because of MAC address change.
but
Clients making a large number of DB requests will have to setup a connection with a remote PGBouncer for each request. This is more expensive, than running PgBouncer locally (so the application connects to pgbouncer locally) and pgBouncer maintains a pool of connections with the remote PG server.
So, IMO,
PGBouncer -> HAProxy -> PGServer seems to be better than, HAProxy -> PGBouncer ->
PGServer, especially when the PGBouncer is local to the client application.
Comments
Post a Comment