Wednesday, January 16

PGLogical 1.1 released with sequence support and more

The new feature version of pglogical is now available. The new release brings support for sequence replication, manually configured parallel subscriptions, replica triggers, numerous usability improvements and bug fixes. Let’s look into the changes in more detail.

Sequences support

As the title of this post says, pglogical now supports sequence replication. While I am sure this is good news for everybody, for me the main importance of this feature is that it delivers a better experience with zero-downtime upgrades. With the first version of pglogical you had to manage sequences manually after the data were transferred, which complicated the whole procedure; now they just get replicated.

PostgreSQL does not provide any infrastructure for capturing sequence changes (not for lack of trying, it just turns out that it’s hard to add support for it). So we use a trick similar to the one used by one of the previous replication projects I worked on. We periodically capture the current state of the sequence and send an update to the replication queue with some additional buffer. This means that in a normal situation the sequence values on the subscriber will be ahead of those on the provider. We try to size the buffer dynamically so that the value is as close as possible but does not fall behind on frequently updated sequences. It is, however, advisable to force a sequence update using the pglogical.synchronize_sequence() function after big data loads or an upgrade.
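As a rough illustration of forcing a sequence update, the call might look like the sketch below. The sequence name public.orders_id_seq is hypothetical, and the assumptions that the function takes the sequence as a regclass and is run on the provider should be checked against the documentation for your version.

```sql
-- Minimal sketch, assuming it is run on the provider after a bulk load.
-- public.orders_id_seq is a hypothetical sequence name.
SELECT pglogical.synchronize_sequence('public.orders_id_seq');

-- Afterwards the subscriber's value should be at or ahead of the provider's.
-- Compare by running this on both nodes:
SELECT last_value FROM public.orders_id_seq;
```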

Other than the above, sequence replication works much the same way as table replication. This includes replication sets, so there are functions for adding sequences to and removing them from replication sets, and subscribers can receive updates for only some sequences.
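A sketch of what managing sequences in replication sets could look like follows. The function names are taken from the pglogical documentation as I know it and should be treated as assumptions for your version; the set name 'default' and the sequence and schema names are placeholders.

```sql
-- Run on the provider. Names below are hypothetical placeholders.

-- Add a single sequence to a replication set.
SELECT pglogical.replication_set_add_sequence('default', 'public.orders_id_seq');

-- Add every sequence in the listed schemas in one call.
SELECT pglogical.replication_set_add_all_sequences('default', ARRAY['public']);

-- Stop replicating a sequence through that set.
SELECT pglogical.replication_set_remove_sequence('default', 'public.orders_id_seq');
```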

Parallel subscriptions

Another major change is support for multiple subscriptions between one pair of nodes. In pglogical 1.0 there was a strict limit of a single subscription between any pair of nodes. You could replicate from one node to multiple nodes or from multiple nodes to one node, but no parallel subscriptions between the same two nodes were allowed. This limitation is now gone, provided that the replication sets of the two subscriptions don’t overlap. This means that users can manually parallelize replication for improved performance.
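The following sketch shows one way such a manual split might be set up: two replication sets that share no tables, each consumed by its own subscription. All set, table and subscription names and the connection string are hypothetical, and the exact function signatures are assumptions to verify against the documentation.

```sql
-- On the provider: two replication sets with no tables in common.
SELECT pglogical.create_replication_set('set_orders');
SELECT pglogical.create_replication_set('set_events');
SELECT pglogical.replication_set_add_table('set_orders', 'public.orders');
SELECT pglogical.replication_set_add_table('set_events', 'public.events');

-- On the subscriber: one subscription per set, both pointing at the same provider.
SELECT pglogical.create_subscription(
    subscription_name := 'sub_orders',
    provider_dsn      := 'host=provider.example dbname=app',
    replication_sets  := ARRAY['set_orders']
);
SELECT pglogical.create_subscription(
    subscription_name := 'sub_events',
    provider_dsn      := 'host=provider.example dbname=app',
    replication_sets  := ARRAY['set_events']
);
```

Each subscription should get its own apply worker, so changes to the two tables can be applied concurrently. The trade-off is that ordering between the two sets is no longer guaranteed.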

Foreign keys and triggers handling on subscriber

PGLogical now runs both the initial data copy process and the ongoing replication process with session_replication_role set to replica. This, plus some other internal changes, has a couple of interesting effects:

  • Foreign keys on the subscriber are no longer checked for validity. The philosophy behind this change is that if the FK check passed on the provider, the subscriber should accept the data as well. This should solve the foreign key issues some people had when replicating just part of the database.
  • The ENABLE REPLICA and ENABLE ALWAYS triggers are now called on the subscriber. This means that you can now do data post-processing on the subscriber, albeit in a somewhat limited fashion.

These might break compatibility on some existing installations.
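To make the trigger behavior concrete, here is a minimal sketch of a subscriber-side trigger that records replicated inserts. The tables public.orders and public.orders_replication_log, the function and the trigger name are all hypothetical. ALTER TABLE ... ENABLE REPLICA TRIGGER is standard PostgreSQL syntax.

```sql
-- Run on the subscriber. All object names here are hypothetical.
CREATE TABLE public.orders_replication_log (
    order_id   bigint,
    applied_at timestamptz DEFAULT now()
);

CREATE FUNCTION public.log_replicated_order() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Record that a replicated row arrived; assumes orders has an id column.
    INSERT INTO public.orders_replication_log (order_id) VALUES (NEW.id);
    RETURN NEW;
END;
$$;

CREATE TRIGGER orders_log_replicated
    AFTER INSERT ON public.orders
    FOR EACH ROW EXECUTE PROCEDURE public.log_replicated_order();

-- By default a trigger does not fire when session_replication_role = replica.
-- Marking it REPLICA makes it fire only in that mode, i.e. for replicated changes.
ALTER TABLE public.orders ENABLE REPLICA TRIGGER orders_log_replicated;
```

Using ENABLE ALWAYS instead would make the trigger fire for both local and replicated changes.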

(In)Compatibility

Speaking of compatibility with existing installations, there are also a few other compatibility-breaking changes in the new version of pglogical. The biggest one is that the create_subscription() function now defaults to not synchronizing the schema. If you wish to synchronize the schema, you can enable it by setting the synchronize_schema parameter to true when calling create_subscription(). Also, the synchronize parameters in the table synchronization functions were renamed to synchronize_data for clarity and consistency with the create_subscription() naming.
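A sketch of opting back in to schema synchronization might look like this. Note one assumption in particular: the pglogical documentation I am aware of spells the parameter synchronize_structure rather than synchronize_schema, so the sketch uses that spelling; check which name your installed version accepts. The subscription name and connection string are hypothetical.

```sql
-- Run on the subscriber. Names and DSN are placeholders.
SELECT pglogical.create_subscription(
    subscription_name     := 'sub_full',
    provider_dsn          := 'host=provider.example dbname=app',
    replication_sets      := ARRAY['default'],
    synchronize_structure := true,  -- assumed parameter name; copy the schema first
    synchronize_data      := true   -- then copy the existing rows
);
```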

Usability improvements

We also tried to make pglogical easier to use and manage. For this reason we added functions for adding and removing node interfaces (connection strings) and for switching a subscription to a different interface. The main use for these is when a server’s IP address has changed, and it’s also a first step towards supporting physical failover. We also modified the replicate_ddl_command() function to optionally accept a list of replication sets for which the given DDL change will be replicated, to ease filtering of structure changes in partially replicated databases. There are also smaller behavior changes that should improve general usability, such as:

  • Better behavior on worker crashes. PGLogical will no longer spam the log with failures; it will give the worker a grace period of 5 seconds before retrying.
  • Improved logging. For example it’s easier to spot which workers have been started and stopped and when it happened.
  • Workers now set application_name on start for easier identification of pglogical processes in pg_stat_activity.
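The management functions described in this section could be used roughly as sketched below. The interface function names come from the pglogical documentation as I recall it and are assumptions to verify; node, interface, subscription and table names are hypothetical, and the application_name filter is a guess at how the workers label themselves.

```sql
-- On the subscriber: register the provider's new address and switch to it.
SELECT pglogical.alter_node_add_interface(
    'provider1', 'provider1_new_ip', 'host=10.0.0.42 dbname=app');
SELECT pglogical.alter_subscription_interface('sub_orders', 'provider1_new_ip');

-- On the provider: replicate a DDL change only to subscribers of one set.
-- Object names inside the command are schema-qualified.
SELECT pglogical.replicate_ddl_command(
    'ALTER TABLE public.orders ADD COLUMN note text',
    ARRAY['set_orders']);

-- On either node: find pglogical workers by their application_name.
SELECT pid, application_name, state
FROM pg_stat_activity
WHERE application_name ILIKE '%pglogical%';
```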

And finally, like any software release, there are some small bug fixes all around.

The download and installation instructions as well as documentation are available at our pglogical project page.
