DB Failover Installation

Overview

This installation guide describes the steps to configure database replication management and failover using RepMgr. It is expected that the DB Installation tasks have been completed before starting this configuration.

This example runs through the configuration of a PostgreSQL replication cluster consisting of three separate nodes: a primary SMS node, a secondary SMS node, and an application services node.

In a production deployment it is expected that more than one application services node will be deployed. Actions relating to the application services node will need to be performed for each instance.

It is assumed that each N2ACD environment will deploy both a primary and secondary SMS node, and N+1 application service nodes in a geographically redundant setup. Only the SMS nodes will be configured to be elected as the primary in the replica set.

Overall Installation Steps

The high-level steps for installing and deploying PostgreSQL database instances are:

  1. Install the RepMgr packages.
  2. Configure PostgreSQL and RepMgr on each node. Additional configuration is made to the primary SMS node.
  3. Deploy each PostgreSQL replica set via RepMgr.
  4. Enable PostgreSQL replica management and failover via RepMgr. If PostgreSQL is installed from the Linux distribution’s main repositories, this is done automatically.

Path and Service Name Usage

Due to the variety of ways that PostgreSQL and RepMgr can be installed, and support for multiple individual deployments of N-Squared software on individual nodes, naming conventions in this configuration page are kept generic.

Software Install

RepMgr Repository Configuration (All Nodes)

RepMgr can be installed from the N-Squared Yum repository, or from the official PostgreSQL Yum repository. If using the official PostgreSQL repository, skip this step.

Add the N-Squared distribution repository alongside the existing N-Squared delivery repository:

sudo nano /etc/yum.repos.d/nsquared.repo

Add the following, replacing the username and password with those provided by N-Squared:

[2ndquadrant-dl-default-release-pg13]
gpgcheck=1
enabled=1
name=2ndquadrant-dl-default-release-pg13 via N-Squared Software Distributions
baseurl=https://username:password@artefacts.nsquared.nz/yum/repo/distributions/el8/2ndquadrant-dl-default-release-pg13

Clean and update:

sudo dnf clean all
sudo dnf update

RepMgr Installation (All Nodes)

Install the RepMgr version matching your PostgreSQL version. E.g. for PostgreSQL 13:

sudo dnf install repmgr13

N-Squared “PGAUX” Installation (Primary & Secondary SMS Nodes)

N-Squared provide several enhancements to the RepMgr package in a separate n2pg-aux package. This is available from the same repository as the N2ACD packages. Install this package on the primary and secondary SMS nodes:

sudo dnf install n2pg-aux

Configuration

Setup PostgreSQL Path (All Nodes)

If PostgreSQL is installed from the official PostgreSQL repository, the PostgreSQL binaries will not be automatically added to the PATH. To make using the PostgreSQL and RepMgr binaries easier, add the following to /etc/profile and restart the current SSH session. E.g.:

PATH=$PATH:/usr/pgsql-13/bin

Log Management

Set up log rotation for the RepMgr service. Logs will be written to: /var/log/repmgr/repmgrd.log

Create a new log rotate configuration. As root:

sudo nano /etc/logrotate.d/repmgrd

Add the content:

/var/log/repmgr/repmgrd.log {
        missingok
        compress
        rotate 52
        maxsize 100M
        weekly
        create 0600 postgres postgres
        postrotate
            /usr/bin/killall -HUP repmgrd
        endscript
}

Where multiple repmgrd instances are configured to manage multiple PostgreSQL clusters, create individual logrotate configuration files referencing individual log files.
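For example, a second repmgrd instance writing to a hypothetical log file /var/log/repmgr/repmgrd-acd.log (both names here are illustrative) could be given its own configuration at /etc/logrotate.d/repmgrd-acd. Note that the postrotate command may need to target the specific instance's PID rather than all repmgrd processes:

/var/log/repmgr/repmgrd-acd.log {
        missingok
        compress
        rotate 52
        maxsize 100M
        weekly
        create 0600 postgres postgres
        postrotate
            /usr/bin/killall -HUP repmgrd
        endscript
}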

RepMgr Configuration (All Nodes)

Create the RepMgr configuration. As root:

sudo nano /etc/repmgr/13/repmgr.conf

Create this file on each node in the N2ACD cluster with the following configuration template, replacing the <...> variables as appropriate for each node:

# Base configuration.
node_id                         = <unique node ID>
node_name                       = '<local nodename>'
location                        = '<geographic location>'
conninfo                        = 'host=<local hostname> user=repmgr dbname=repmgr connect_timeout=5'
data_directory                  = '/var/lib/pgsql/13/data'
log_file                        = '/var/log/repmgr/repmgrd.log'
pg_bindir                       = '/usr/pgsql-13/bin'
use_replication_slots           = true

# Automatic failover configuration.
promote_command                 = '/usr/pgsql-13/bin/repmgr standby promote -f /etc/repmgr/13/repmgr.conf --log-to-file'
follow_command                  = '/usr/pgsql-13/bin/repmgr standby follow -f /etc/repmgr/13/repmgr.conf --log-to-file --upstream-node-id=%n'

failover                        = automatic
priority                        = <100 or 0>
standby_follow_restart          = false
primary_visibility_consensus    = true
standby_disconnect_on_failover  = true

service_start_command           = 'sudo service postgresql-13 start'
service_stop_command            = 'sudo service postgresql-13 stop'
service_restart_command         = 'sudo service postgresql-13 restart'
service_reload_command          = 'sudo service postgresql-13 reload'

reconnect_attempts              = 6
reconnect_interval              = 10
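As a concrete illustration, a filled-in configuration for a primary SMS node might look as follows (the node ID, node name, location, and hostname are hypothetical, chosen to match the example cluster output shown later in this guide; only the variable fields are shown):

node_id                         = 1
node_name                       = 'pg-node-a'
location                        = 'Location A'
conninfo                        = 'host=pg-node-a user=repmgr dbname=repmgr connect_timeout=5'
failover                        = automatic
priority                        = 100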

Where:

RepMgr PostgreSQL Management (All Nodes)

In order for RepMgr to manage PostgreSQL instances, sudo permissions need to be granted. These permissions are restricted to stopping, starting, restarting, and reloading the PostgreSQL DB instances.

Set up sudoers access for the postgres user so that it can manage its own PostgreSQL instance.

sudo nano /etc/sudoers

Add the following content to the bottom of the file:

    Defaults:postgres !requiretty
    postgres ALL=(ALL) NOPASSWD: /usr/sbin/service postgresql-13 stop, \
            /usr/sbin/service postgresql-13 start, \
            /usr/sbin/service postgresql-13 reload, \
            /usr/sbin/service postgresql-13 restart

Note that postgresql-13 is the service name, and may need to be adjusted.

RepMgr PostgreSQL Access (Primary Node Only)

The prerequisite for this step is that the PostgreSQL instance being configured is available and running. If the relevant PostgreSQL instance is not yet configured and started, perform these steps as documented in the DB Node Installation guide.

Configure the PostgreSQL access configuration; this will contain the addresses of all nodes once replication is enabled.

sudo nano /var/lib/pgsql/13/data/pg_hba.conf

In this file, add the following two rows for each node that will be part of the cluster:

host replication repmgr <hostname> trust
host repmgr      repmgr <hostname> trust
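For example, for a two-node cluster using the hypothetical hostnames pg-node-a and pg-node-b (as in the example output later in this guide), the entries would be:

host replication repmgr pg-node-a trust
host repmgr      repmgr pg-node-a trust
host replication repmgr pg-node-b trust
host repmgr      repmgr pg-node-b trust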

Where:

Then configure the replication user. RepMgr will create a schema called repmgr to be used to store its metadata information.

su - postgres
createuser -s repmgr
createdb repmgr -O repmgr

Then connect to the postgres instance as the maintenance user (as the user postgres run psql) and execute:

ALTER USER repmgr SET search_path TO repmgr, "$user", public;

PostgreSQL Replication Configuration (Primary Node Only)

Configure the primary node PostgreSQL instance for replication. Edit the instance configuration file:

sudo nano /var/lib/pgsql/13/data/postgresql.conf

Configure the following options as appropriate:

listen_addresses = localhost,<host1>,<host2>,etc
    Configure as a list of all nodes in the replication set. This is deployed
    automatically to all nodes, as RepMgr/PostgreSQL replicates this file to each
    node as part of the clone step. Alternatively, configure a hostname such as
    postgres-local that resolves to a local IP on each host via the /etc/hosts file.

port = 5432
    The listen port for this DB instance.

password_encryption = scram-sha-256
    Enables a secure password mechanism using encrypted passwords rather than MD5
    hashes. Note: this must be set before any users are created, and must be set
    on duplicated instances, as the default is md5.

max_wal_senders = 10
    Configure based on the number of connected replicas. Rule of thumb:
    ((Number of Nodes) * 2) + 2.

wal_level = replica

hot_standby = on

archive_mode = on

archive_command = 'true'

shared_preload_libraries = 'repmgr'
    Enables the shared preload library that allows the repmgrd daemon process to
    access and manage the PostgreSQL instances.

wal_log_hints = on

max_replication_slots = 4
    The number of active replication slots to allow. Rule of thumb:
    (Number of Nodes) + 1.
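The sizing rules of thumb above can be sketched as a quick shell calculation; for a hypothetical three-node cluster this gives max_wal_senders = 8 and max_replication_slots = 4:

```shell
# Rule-of-thumb sizing for the replication settings, for an example 3-node cluster.
nodes=3

# max_wal_senders: ((Number of Nodes) * 2) + 2
max_wal_senders=$(( nodes * 2 + 2 ))

# max_replication_slots: (Number of Nodes) + 1
max_replication_slots=$(( nodes + 1 ))

echo "max_wal_senders=${max_wal_senders} max_replication_slots=${max_replication_slots}"
```

The values configured above (10 and 4) leave headroom beyond these minimums.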

Primary SMS Node Registration with RepMgr

With all configuration complete, the primary node can now be registered with RepMgr.

su - postgres

Register the instance as the cluster primary:

repmgr -f /etc/repmgr/13/repmgr.conf primary register

Confirm the registration using the cluster show and service status commands:

repmgr -f /etc/repmgr/13/repmgr.conf cluster show

This will generate output, for example:

 ID | Name       | Role    | Status    | Upstream  | Location   | Priority | Timeline | Connection string
----+------------+---------+-----------+-----------+------------+----------+----------+-----------------------------------------------------------------------
 1  | pg-node-a  | primary | * running |           | Location A | 100      | 10       | host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5

Then check the service status:

repmgr -f /etc/repmgr/13/repmgr.conf service status

This will generate output, for example:

 ID | Name       | Role    | Status    | Upstream  | repmgrd | PID    | Paused? | Upstream last seen
----+------------+---------+-----------+-----------+---------+--------+---------+--------------------
 1  | pg-node-a  | primary | * running |           | running | 745513 | no      | n/a

Setup PostgreSQL DB Instance (Secondary SMS Node)

With the primary node configured with RepMgr, the secondary SMS node can be cloned and configured. All PostgreSQL initialization tasks will be performed by RepMgr.

Preparation

Validate that the data directories are configured for syncing:

For example:

[postgres@test-sms-02]$ ls -larth /var/lib/pgsql/13/
drwx------. 20 postgres postgres 4.0K Feb 12 00:00 data

If no data directory exists, it can be created. As the user root:

mkdir /var/lib/pgsql/13/data
chown -R postgres:postgres /var/lib/pgsql/13/data
chmod 700 /var/lib/pgsql/13/data
semanage fcontext -a -t postgresql_db_t "/var/lib/pgsql/13/data(/.*)?"
restorecon -R -v /var/lib/pgsql/13/data

Note: The correct directory (as configured in the repmgr.conf file) should be used in these actions.

Ensure the PostgreSQL service is enabled and correct permissions are granted. As the user root:

systemctl enable postgresql-13
sudo semanage port -a -t postgresql_port_t -p tcp 5432
firewall-cmd --zone=public --add-port=5432/tcp --permanent
firewall-cmd --reload

Use the correct service name for the version of PostgreSQL installed. Note: the service is not started at this stage; RepMgr will start it for us.

RepMgr Clone of Primary Node

Check connectivity to the primary node. As the user postgres:

repmgr -h <primary node> -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone --dry-run

Where:

Perform the database sync from the primary node. As the user postgres:

repmgr -h <primary node> -U repmgr -d repmgr -f /etc/repmgr/13/repmgr.conf standby clone

Start the PostgreSQL instance on the secondary node. As the user root:

service postgresql-13 start

Register the secondary node into the cluster. As the user postgres:

repmgr -f /etc/repmgr/13/repmgr.conf standby register

Check the cluster status; both the primary and secondary nodes should now be visible. As the user postgres:

repmgr -f /etc/repmgr/13/repmgr.conf cluster show

This will give output similar to:

 ID | Name       | Role    | Status    | Upstream  | Location   | Priority | Timeline | Connection string
----+------------+---------+-----------+-----------+------------+----------+----------+-----------------------------------------------------------------------
 1  | pg-node-a  | primary | * running |           | Location A | 100      | 10       | host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5
 3  | pg-node-b  | standby |   running | pg-node-a | Location B | 100      | 10       | host=pg-node-b port=5432 user=repmgr dbname=repmgr connect_timeout=5

Setup PostgreSQL DB Instance (Service Nodes)

Each of the service nodes can be integrated into the cluster by repeating the steps taken for the secondary SMS node, as described in the section above.

RepMgr Daemon Process (All Nodes)

RepMgr includes a daemon process that is responsible for monitoring the cluster health and initiating failover tasks. Once the cluster is initialized, configure the RepMgr daemon process to start.

Enable and start the daemon process:

systemctl enable repmgr13
systemctl start repmgr13

The RepMgr service status can be checked with:

repmgr -f /etc/repmgr/13/repmgr.conf service status

Output logs for the daemon process can be found in /var/log/repmgr.

Enable PostgreSQL Fencing on Primary and Secondary Nodes

To avoid split-brain scenarios where two PostgreSQL nodes in the cluster are both acting as primary, a fencing process is enabled using the n2pg-aux package's scripts.

To enable this, on both the primary SMS and secondary SMS nodes edit the systemd startup script. As root:

systemctl edit postgresql-13 --full

Add the following line to the [Service] section:

ExecStartPre=/usr/share/n2pg-aux/bin/pgsql_fence.pl /usr/pgsql-13/bin/postgres <nodename> 

Where:

In the same service configuration, tie RepMgr and the PostgreSQL instance together. Add the following to the [Unit] section:

After=network-online.target
Before=repmgr13.service
Wants=repmgr13.service

Use the correct service name for the instance of RepMgr you’re using (in this example it is repmgr13).

Tie the RepMgr Service to the PostgreSQL Service

Configure the RepMgr service in a reciprocal manner to how PostgreSQL is tied to RepMgr. As root:

systemctl edit repmgr13 --full

Add the following to the [Unit] section:

After=postgresql-13.service
PartOf=postgresql-13.service

Use the correct service name for the service instance of PostgreSQL you’re using (in this example it is postgresql-13).

In the [Service] section also add:

Restart=on-failure
RestartSec=60

The restart configuration for RepMgr is important, as RepMgr will exit if the primary node is temporarily unavailable. To avoid manual intervention being required to restart RepMgr, it must be configured to restart automatically.
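Putting the pieces above together, the resulting repmgr13 service configuration would contain sections along these lines (using the service names from this example):

[Unit]
After=postgresql-13.service
PartOf=postgresql-13.service

[Service]
Restart=on-failure
RestartSec=60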

PostgreSQL Fencing

With the configuration as described in this document above, PostgreSQL SMS primary/secondary nodes may be fenced to avoid startup in scenarios where a primary node cannot be certain it is safe to do so.

A successfully fenced PostgreSQL instance can be identified by reviewing the systemd log for a PostgreSQL instance that has failed startup:

systemctl status postgresql-13

The log file will print output such as this:

-- Unit postgresql-13.service has begun starting up.
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.928] Running /usr/share/n2pg-aux/bin/pgsql_fence.pl
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.929] Postgres binary file is '/usr/pgsql-13/bin/postgres' and data dir is '/var/lib/pgsql/13/data/'. Force file is /etc/pgsqlchk/pgsql_fence.force
Apr 08 02:05:46 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:46.929] repmgr database name is 'repmgr' and node_name is 'pg-node-a'
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.074] Successfully executed: SELECT pg_is_in_recovery();
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.074] Local database is a primary node. Must check whether to fence it.
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Successfully executed: SELECT node_name, conninfo FROM repmgr.nodes where priority != 0
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Potential primary postgres node connection is: pg-node-b -> host=pg-node-b port=5432 user=repmgr dbname=repmgr connect_timeout=5
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.172] Potential primary postgres node connection is: pg-node-a -> host=pg-node-a port=5432 user=repmgr dbname=repmgr connect_timeout=5
Apr 08 02:05:47 pg-node-a pgsql_fence.pl[597502]: [2025-04-08 02:05:47.192] [pg-node-b] is NOT in recovery (0) - i.e. is primary. Will not allow startup touch '/etc/pgsqlchk/pgsql_fence.force' to force startup
Apr 08 02:05:47 pg-node-a systemd[1]: postgresql-13.service: Control process exited, code=exited status=255
Apr 08 02:05:47 pg-node-a systemd[1]: postgresql-13.service: Failed with result 'exit-code'.
-- Subject: Unit failed

It is possible to force the startup of a PostgreSQL node even if it is fenced. To do so, create an empty file. As root:

touch /etc/pgsqlchk/pgsql_fence.force

Once this file is created, start PostgreSQL. As root:

systemctl start postgresql-13

The startup will show the fencing is skipped:

-- Unit postgresql-13.service has begun starting up.
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.616] Running /usr/share/n2pg-aux/bin/pgsql_fence.pl
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] Postgres binary file is '/usr/pgsql-13/bin/postgres' and data dir is '/var/lib/pgsql/13/data/'. Force file is /etc/pgsqlchk/pgsql_fence.force
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] repmgr database name is 'repmgr' and node_name is 'pg-node-a'
Apr 08 02:06:17 pg-node-a pgsql_fence.pl[597572]: [2025-04-08 02:06:17.617] Determined force file '/etc/pgsqlchk/pgsql_fence.force' exists. Allowing Postgres to run regardless.
Apr 08 02:06:17 pg-node-a postmaster[597580]: 2025-04-08 02:06:17.667 UTC [597580] LOG:  redirecting log output to logging collector process
Apr 08 02:06:17 pg-node-a postmaster[597580]: 2025-04-08 02:06:17.667 UTC [597580] HINT:  Future log output will appear in directory "/var/log/postgresql".
Apr 08 02:06:17 pg-node-a systemd[1]: Started PostgreSQL 13 database server.
-- Subject: Unit postgresql-13.service has finished start-up

Only force startup using this procedure if it is safe to do so. In the above example the secondary SMS node is running as the primary node, so forcing startup of pg-node-a results in two primary nodes running. With a correctly configured HA setup, HAProxy would block access to the backend databases until this is resolved. In general, ensure that the complete state of the cluster is understood before overriding the fencing.

To re-enable fencing, remove the pgsql_fence.force file. As root:

rm /etc/pgsqlchk/pgsql_fence.force