PostgreSQL Bi-Directional Replication
We assume the starting point is two clean systems running Debian Bullseye.
apt update && apt upgrade -y
reboot
sudo -i
apt install postgresql postgresql-13-pglogical
Set up SSL and modify the postgresql.conf file
Change the database_version variable to suit your installation. This block copies the snakeoil private key to a file whose owner and permissions postgres is happy with, points postgresql.conf at it, and appends the settings that pglogical needs.
export database_version=13
sed -i -e 's:snakeoil.key:snakeoil-postgres.key:' /etc/postgresql/$database_version/main/postgresql.conf
cp /etc/ssl/private/ssl-cert-snakeoil.key /etc/ssl/private/ssl-cert-snakeoil-postgres.key
chown postgres:postgres /etc/ssl/private/ssl-cert-snakeoil-postgres.key
chmod 600 /etc/ssl/private/ssl-cert-snakeoil-postgres.key
echo "listen_addresses = '*'" >> /etc/postgresql/$database_version/main/postgresql.conf
echo "shared_preload_libraries = 'pglogical'" >> /etc/postgresql/$database_version/main/postgresql.conf
echo "wal_level = 'logical'" >> /etc/postgresql/$database_version/main/postgresql.conf
echo "track_commit_timestamp = on" >> /etc/postgresql/$database_version/main/postgresql.conf
echo "max_connections = 100" >> /etc/postgresql/$database_version/main/postgresql.conf
echo "max_wal_senders = 10" >> /etc/postgresql/$database_version/main/postgresql.conf
echo "max_replication_slots = 10" >> /etc/postgresql/$database_version/main/postgresql.conf
echo "max_worker_processes = 10" >> /etc/postgresql/$database_version/main/postgresql.conf
echo "pglogical.conflict_resolution = 'last_update_wins'" >> /etc/postgresql/$database_version/main/postgresql.conf
systemctl restart postgresql
You can check that the settings were applied with:
su - postgres
psql
select name, setting from pg_settings where name in ('wal_level','track_commit_timestamp','max_worker_processes','max_replication_slots','max_wal_senders','shared_preload_libraries', 'max_connections');
Set up the pg_hba.conf file
echo "host all all 192.168.30.40/32 trust" >> /etc/postgresql/$database_version/main/pg_hba.conf
echo "hostssl all all 192.168.30.40/32 trust" >> /etc/postgresql/$database_version/main/pg_hba.conf
echo "hostssl replication postgres 192.168.30.40/32 trust" >> /etc/postgresql/$database_version/main/pg_hba.conf
echo "host all all 192.168.30.41/32 trust" >> /etc/postgresql/$database_version/main/pg_hba.conf
echo "hostssl all all 192.168.30.41/32 trust" >> /etc/postgresql/$database_version/main/pg_hba.conf
echo "hostssl replication postgres 192.168.30.41/32 trust" >> /etc/postgresql/$database_version/main/pg_hba.conf
systemctl restart postgresql
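Before creating any pglogical nodes, it's worth confirming that each server can reach the other over the connection string you will use. A quick check from node1 (swap the address to test the other direction):
psql 'host=192.168.30.41 port=5432 dbname=postgres user=postgres sslmode=require' -c 'SELECT 1;'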
On each node, create a user and a database; we will use fusionpbx for both in this example.
su - postgres
createuser fusionpbx
createdb -O fusionpbx fusionpbx
psql
alter user fusionpbx with password 'fusionpbx2022';
Create a table to replicate:
\c fusionpbx
CREATE TABLE table1 (id int primary key, info text, value int);
Only on the first node:
INSERT INTO table1 VALUES (1, 'initial data 1', 1), (2, 'initial data 1', 1), (3, 'initial data 1', 1), (4, 'initial data 1', 1), (5, 'initial data 1', 1);
On node1:
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
node_name := 'node1',
dsn := 'host=192.168.30.40 port=5432 sslmode=require dbname=fusionpbx user=postgres');
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);
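If you want to confirm which replication sets exist on the node, you can query pglogical's catalog (a default install creates the default, default_insert_only, and ddl_sql sets):
SELECT set_name FROM pglogical.replication_set;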
On node2:
CREATE EXTENSION pglogical;
SELECT pglogical.create_node(
node_name := 'node2',
dsn := 'host=192.168.30.41 port=5432 sslmode=require dbname=fusionpbx user=postgres');
SELECT pglogical.create_subscription(
subscription_name := 'node2_sub',
provider_dsn := 'host=192.168.30.40 port=5432 sslmode=require dbname=fusionpbx user=postgres',
replication_sets := ARRAY['default'],
synchronize_data := true,
forward_origins := '{}' );
The step above creates a subscription that starts replicating data from node1 using the create_subscription function. You can use any name you want for subscription_name; provider_dsn is the connection string of the source node, and replication_sets names the replication sets to replicate. In our example we used the 'default' replication set, so that is what appears in the parameter. The synchronize_data argument tells pglogical whether to copy all existing data from the source. In our test it is set to true because we want the five test rows to be copied in addition to any future changes. The forward_origins parameter is important for a bi-directional setup: it controls which changes are forwarded to other nodes, and an empty array tells pglogical not to forward any changes that didn't originate on the provider node.
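To confirm the subscription is working, you can call pglogical's show_subscription_status function on node2; a status of replicating means changes are flowing:
SELECT subscription_name, status FROM pglogical.show_subscription_status();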
Check the results on node2:
fusionpbx=# select * from table1 ;
id | info | value
----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
(5 rows)
This confirms that logical replication has been set up and data is now flowing from node1 to node2. Because we set synchronize_data to true, the five rows that were already present in the test table on node1 were copied over first.
Test unidirectional replication
Now that we have configured replication from node1 to node2, we can test that the changes (inserts, updates, and deletes) made on node1 are replicated to node2.
Start by testing an insert and update combination. In this test, a new row is added and immediately updated on node1. The following is the output on node1:
fusionpbx=# INSERT INTO table1 VALUES (11, 'manual 1', 1);
INSERT 0 1
fusionpbx=# UPDATE table1 SET value=10 WHERE id=11;
UPDATE 1
fusionpbx=# SELECT * FROM table1;
id | info | value
----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
11 | manual 1 | 10
(6 rows)
Now when you check node2, it shows that both INSERT and UPDATE changes were replicated:
fusionpbx=# SELECT * FROM table1;
id | info | value
----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
11 | manual 1 | 10
(6 rows)
Now, let’s delete the row we created on node1 and confirm that it’s removed from node2 as well:
fusionpbx=# DELETE FROM table1 WHERE id=11;
DELETE 1
fusionpbx=# SELECT * FROM table1;
id | info | value
----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
(5 rows)
Check the current state of the test table on node2. It shows that the DELETE change was successfully replicated:
fusionpbx=# SELECT * FROM table1;
id | info | value
----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
(5 rows)
These tests show that all INSERT, UPDATE, and DELETE actions were successfully replicated from node1 to node2.
Test conflicts in unidirectional replication
Because the table on node2 is also open for direct writes, conflicts can arise if the same row is changed on both nodes. When a conflict occurs, pglogical applies the conflict resolution method that has been configured. In this test environment, the last_update_wins option has been set, which means the version of the row with the latest commit timestamp is kept.
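You can verify the active method from either node; SHOW works for pglogical's configuration parameters just like for built-in ones:
show pglogical.conflict_resolution;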
For example, a conflict occurs in the following scenario. First, we insert a row on node2 with id set to 100:
fusionpbx=# INSERT INTO table1 VALUES (100, 'node 2', 200);
INSERT 0 1
fusionpbx=# SELECT * FROM table1;
id | info | value
-----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
100 | node 2 | 200
(6 rows)
Now we insert a row with id set to 100 on node1:
fusionpbx=# INSERT INTO table1 VALUES (100, 'node 1', 100);
INSERT 0 1
fusionpbx=# SELECT * FROM table1;
id | info | value
-----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
100 | node 1 | 100
(6 rows)
If you go back and check the rows on node2, you can see that the row directly inserted on node2 has been overwritten by the row from node1:
fusionpbx=# SELECT * FROM table1;
id | info | value
-----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
100 | node 1 | 100
(6 rows)
This occurred because the commit timestamp of the insert on node1 was newer than the commit timestamp of the insert on node2.
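Because track_commit_timestamp is enabled, you can see the timestamps pglogical compared by asking for the commit timestamp of each row's xmin. An illustrative check on either node:
SELECT id, info, pg_xact_commit_timestamp(xmin) FROM table1 WHERE id = 100;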
Adding bi-directional logical replication
The steps shared in the previous sections allow you to create unidirectional replication from node1 to node2. Now we run additional commands to enable the flow of changes in the reverse direction.
Set up node2
Add the tables to the default replication set, just as you did on node1:
SELECT pglogical.replication_set_add_all_tables('default', ARRAY['public']);
Set up node1
Create a subscription on node1 so that it starts fetching the changes from node2:
SELECT pglogical.create_subscription(
subscription_name := 'node1_sub',
provider_dsn := 'host=192.168.30.41 port=5432 sslmode=require dbname=fusionpbx user=postgres',
replication_sets := ARRAY['default'],
synchronize_data := false,
forward_origins := '{}' );
In the preceding command, the synchronize_data argument has been set to false, unlike before. This is because we don't want to copy any existing data from node2 to node1; both tables already contain the same data.
With this command, the bidirectional replication setup is complete, and now any change on one server is replicated to the other.
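As a sanity check, each node should now also show a walsender serving the other node's subscription. Run this on both servers:
SELECT application_name, state FROM pg_stat_replication;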
Test bi-directional replication
Let’s start by adding a row on both nodes to confirm that bi-directional replication is happening.
Insert a row on node1 as follows:
fusionpbx=# INSERT INTO table1 VALUES (200, 'node 1', 1);
INSERT 0 1
fusionpbx=# SELECT * FROM table1;
id | info | value
-----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
100 | node 1 | 100
200 | node 1 | 1
(7 rows)
Now check the table status on node2 and add a row there as follows:
fusionpbx=# SELECT * FROM table1;
id | info | value
-----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
100 | node 1 | 100
200 | node 1 | 1
(7 rows)
fusionpbx=# INSERT INTO table1 VALUES (201, 'node 2', 1);
INSERT 0 1
fusionpbx=# SELECT * FROM table1;
id | info | value
-----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
100 | node 1 | 100
200 | node 1 | 1
201 | node 2 | 1
(8 rows)
Now go back to node1 to confirm the new row is present:
fusionpbx=# SELECT * FROM table1;
id | info | value
-----+----------------+-------
1 | initial data 1 | 1
2 | initial data 1 | 1
3 | initial data 1 | 1
4 | initial data 1 | 1
5 | initial data 1 | 1
100 | node 1 | 100
200 | node 1 | 1
201 | node 2 | 1
(8 rows)
This confirms that the changes are now flowing in both directions. You can perform additional tests with updates and deletes to test this behavior further.
Monitor logical replication lag
Monitoring logical replication lag is the responsibility of the user. Lag can increase when large amounts of data are changed on the provider node, because the subscriber has to apply those changes after they arrive.
You can check the replication lag status in terms of the number of bytes on each of the servers by running the following query:
SELECT pid, usename, application_name, client_addr, state, sync_state,
       pg_wal_lsn_diff(sent_lsn, write_lsn) as write_lag,
       pg_wal_lsn_diff(write_lsn, flush_lsn) as flush_lag,
       pg_wal_lsn_diff(flush_lsn, replay_lsn) as replay_lag,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) as total_lag
FROM pg_stat_replication
WHERE usename = 'postgres';
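You can also confirm that the replication slots pglogical created are active, and see the oldest WAL position each one still holds:
SELECT slot_name, plugin, active, restart_lsn FROM pg_replication_slots;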
Schema changes
When deploying schema changes while replicating between clusters with pglogical, it's important to note that Data Definition Language (DDL) changes aren't replicated automatically. You can run DDL changes on each node individually, or use the pglogical function replicate_ddl_command to run the schema change locally and replicate it to the other node.
For example, to add a new column named comments to the test table, you can run the following command on any of the hosts so that the same change is applied to both clusters:
SELECT pglogical.replicate_ddl_command('ALTER TABLE public.table1 ADD COLUMN comments TEXT', ARRAY['default']);
This works seamlessly for unidirectional replication, but in a bi-directional setup there is a risk of breaking replication. If data changes are in transit from one node while the other node applies a schema change, the in-transit changes may no longer fit the changed schema, and replication can break. The safe way to make schema changes to tables being written on both nodes is to temporarily pause write traffic.
Limitations
There are some considerations to keep in mind when designing or converting workloads for use with pglogical. We provide a brief description of the most important ones here, but we encourage you to review the complete list of limitations and restrictions in the pglogical documentation.
To replicate updates and deletes on a table, pglogical requires a primary key or another valid replica identity, such as a unique constraint. Without this unique identifier, pglogical can't apply updates and deletes. Additionally, foreign key constraints aren't enforced during the replication process, so any operation that succeeds on the publisher is assumed to be acceptable on the subscriber as well. Because foreign key constraints help maintain the referential integrity of data between child and parent tables, it's important to understand this limitation and ensure that your workload can't cause any integrity violations.
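If a table has no primary key but does have a unique index, you can designate that index as the replica identity. A sketch with hypothetical table and index names (the index must be unique, non-partial, and on NOT NULL columns):
-- hypothetical names; adjust to your schema
ALTER TABLE public.orders REPLICA IDENTITY USING INDEX orders_order_number_key;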
Another important aspect to understand is the handling of sequences. Although pglogical provides a mechanism to sync up sequence numbers, this doesn't happen in real time. A better strategy is to keep the sequences on the two nodes independent, for example by generating only odd numbers on one node and only even numbers on the other. This ensures the two nodes never produce conflicting sequence values.
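A minimal sketch of the odd/even approach, using a hypothetical sequence name:
-- on node1: generates 1, 3, 5, ...
CREATE SEQUENCE table1_id_seq INCREMENT BY 2 START WITH 1;
-- on node2: generates 2, 4, 6, ...
CREATE SEQUENCE table1_id_seq INCREMENT BY 2 START WITH 2;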