I've started to write about the tool (pglupgrade) that I developed to perform near-zero downtime automated upgrades of PostgreSQL clusters. In this post, I'll be talking about the tool and discuss its design details.
You can check the first part of the series here: Near-Zero Downtime Automated Upgrades of PostgreSQL Clusters in Cloud (Part I).
The tool is written in Ansible. I have prior experience of working with Ansible, and I currently work with it in 2ndQuadrant as well, which is why it was a comfortable option for me. That being said, you can implement the minimal downtime upgrade logic, which will be explained later in this post, with your favorite automation tool.
Further reading: Blog posts Ansible Loves PostgreSQL , PostgreSQL Planet in Ansible Galaxy and (more…)
Last week, I was at Nordic PGDay 2018 and I had quite a few conversations about the tool that I wrote, namely pglupgrade, to automate PostgreSQL major version upgrades in a replication cluster setup. I was quite happy that it has been heard and some other people in different communities giving talks at meetups and other conferences about near-zero downtime upgrades using logical replication. Given that there is a talk that I gave at PGDAY'17 Russia, PGConf.EU 2017 in Warsaw and lastly at FOSDEM PGDay 2018 in Brussels, I thought it is better to create a blog post to keep this presentation available to the folks who could not make it to any of the conferences aforementioned. If you would like to directly go the talk and skip reading this blog post here is your link: Near-Zero Downtime (more…)
PgBouncer is a popular proxy and pooling layer for Postgres. It's extremely common to reconfigure PgBouncer with repmgr so it always directs connections to the current primary node. It just so happens our emerging Docker stack could use such a component.In our last article, we combined Postgres with repmgr to build a Docker container that could initialize and maintain a Postgres cluster with automated failover capabilities. Yet there was the lingering issue of connecting to the cluster. It's great that Postgres is always online, but how do we connect to whichever node is the primary?While we could write a layer into our application stack to call repmgr cluster show to find the primary before connecting, that's extremely cumbersome. Besides that, there's a better way. Let's alter our stack (more…)
PostgreSQL 10 is getting close to its first beta release and it will include the initial support for logical replication, which is was written primarily by me and committed by my colleague Peter Eisentraut, and is internally based on the work 2ndQuadrant did on pglogical (even though the user interface is somewhat different).
I'd like to share some overview of basics in this blog post.
What's logical replication?
Let me start with briefly mentioning what logical replication is and what's it good for. I expect that most people know the PostgreSQL streaming master-standby replication that has been part of PostgreSQL for years and is commonly used both for high availability and read scaling.
So why add another replication mechanism and why call it logical? Well, the traditional (more…)
A few weeks ago I explained basics of autovacuum tuning. At the end of that post I promised to look into problems with vacuuming soon. Well, it took a bit longer than I planned, but here we go.
To quickly recap, autovacuum is a background process cleaning up dead rows, e.g. old deleted row versions. You can also perform the cleanup manually by running VACUUM, but autovacuum does that automatically depending on the amount of dead rows in the table, at the right moment - not too often but frequently enough to keep the amount of "garbage" under control.
repmgr 3.3 introduces a number of additional options for setting up and managing replication clusters, with particular emphasis on cascading replication support. These changes will also make it easier to set up complex clusters using provisioning scripts.
Additionally there are changes to the repmgr command line utility's logging behaviour which you should take into consideration when running therepmgrd daemon.
repmgr is also tracking developments in the next major PostgreSQL release, 10.0, which will bring a lot of changes to the way PostgreSQL handles replication. At the time of writing, repmgr will work with the current PostgreSQL development code, but this combination is of course not suitable for use in production.
Changes to logging behaviour
Traditionally the repmgr command (more…)
PostgreSQL 9.6 is now out and so is an updated version of pglogical that works with it.
For quick guide on how to upgrade the database with pglogical you can check my post which announced 9.6beta support.
The main change besides the support for 9.6.x release of PostgreSQL is in the way we handle the output plugin and apply plugin. They have now been merged into single code base and single package so that there is no need to track the pglogical_output separately for the users and developers alike.
We fixed several bugs this time and also made upgrades from 9.4 much easier.
Here is a more detailed list of changes:
keepalive is tuned to much smaller values by default so that pglogical will notice network issues earlier
better compatibility when upgrading from PostgreSQL 9.4 (more…)
This is the third and last part of blog articles dedicated to pg_rewind. In the two previous articles we have seen how pg_rewind is useful to fix split-brain events due to mistakes in the switchover procedures, avoiding the need of new base backups. We have also seen that this is true for simple replication clusters, where more standby nodes are involved. In this case, just two nodes can be fixed, and the other ones need a new base backup to be re-synchronised. pg_rewind for PostgreSQL 9.6 is now able to work with complex replication clusters.
Indeed, pg_rewind has been extended so it can view the timeline history graph of an entire HA cluster, like the one mentioned in my previous blog article. It is able to find out the most recent, shared point in the timeline history between (more…)
repmgr 3.2 has recently been released with a number of enhancements, particularly support for 2ndQuadrant's Barman archive management server, additional cluster monitoring functionality and improvements to the standby cloning process.
One aim of this release is to remove the requirement to set up passwordless SSH between servers, which means when using repmgr's standard functionality to clone a standby, this is no longer a prerequisite. However, some advanced operations do require SSH access to be enabled.
repmgr 3.2 can now clone a standby directly from the Barman backup and recovery manager. In particular it is now possible to clone a standby from a Barman archive, rather than directly from a running database server. This means the server is not subjected to the I/O (more…)
In the previous blog article we have seen how pg_rewind works with a simple HA cluster, composed of a master node replicating to a standby. In this context, an eventual switchover involves just two nodes that have to be aligned. But what happens with HA clusters when there are several (also cascading) standbys?
Now, consider a more complicated HA cluster, composed of a master with two standbys, based on PostgreSQL 9.5; similar to what has been made in the first blog article dedicated to pg_rewind, we now create a master node replicating to two standby instances. Let's start with the master:
# Set PATH variable
# This is the directory where we will be working on
# Feel free to change it and the rest of the script
# will adapt itself (more…)