PostgreSQL 9.6 has just been released and most of the postgres users will start asking themselves how to upgrade to the new major version. This post has the intention of showing different procedures for upgrading your PostgreSQL server.
Upgrading to a new major version is a task which has a high ratio of preparation over total execution time. Specifically when skipping a release in the middle, for example, when you jump from version 9.3 to version 9.5.
On the other hand, point release upgrades don’t need as much preparation. Generally, the only requirement is for the postgres service to be restarted. There are no changes to the underlying data structure, so there’s no need to dump and restore. In the worst case scenario you may need to recreate some of your indexes after you’ve finished upgrading the point release.
It’s very wise to always stay on the latest point release, so you don’t stumble over a known (and likely fixed) bug. This means that once the latest version is out, schedule time for the upgrade as soon as possible.
When doing complex tasks like this one, it’s good to consider all possible options to accomplish the final result.
For major release upgrades, there are three possible paths you can take:
Let me explain each one in detail:
This is the simplest of all possible ways to upgrade your cluster’s data structure.
To make it short, the process here requires a logical dump using pg_dump from the old version, and pg_restore on a clean cluster created with the newly installed version.
Key points in favor of using this path are:
Reasons why you should avoid using this option:
On servers with several CPU cores, it’s possible to run pg_dump in parallel using the directory format. Once it has finished, restore in parallel as well, using the -j switch in both pg_dump and pg_restore. But you can’t start the restore process until the whole dump has finished.
These kind of upgrades have been available since version 9.0 to perform in-place upgrades from versions as old as 8.3. They are called “in-place” because they’re done over the same server, and preferably on the same data directory.
Advantages of these kind of upgrades:
There are quite a few disadvantages:
You can also run pg_upgrade without the –link option (in this case pg_upgrade will generate a second copy on disk of your cluster) so you can go back to the old version. But you’ll lose both of the advantages listed above.
The procedure to follow for this method would be like this:
This type of upgrade has been available since the first trigger based replication solutions were around. In other words, since Slony-I was around.
Trigger based replication solutions don’t care which version you have on one side or the other, as it copies the changes using supported SQL commands.
The trigger based replication tools recommended, in the order they appear are:
This should be the preferred way for upgrading. Let’s see the advantages of using this method:
There are some disadvantages:
Since PostgreSQL 9.4 we have logical decoding of the transaction logs. This means that now we can decode the transactions from the WAL and manipulate the output.
A whole new world of possibilities appeared in the replication field. Triggers are no longer necessary to accomplish logical replication!
This basically means you don’t need to save a separate copy of all the inserts, updates and deletes using triggers any more. You can just get those data manipulation operations from the transaction logs, by decoding it. Then, it’s the tool you’re using that has to manipulate the output and send it over to the subscribed downstream node. In this case that tool is pglogical.
So, what does this all mean?
Well, if you are on a version 9.4 or 9.5 of PostgreSQL and you want to upgrade to 9.6, you can perform an online upgrade like the one detailed above, but using pglogical instead of one of the trigger based replication solutions.
I won’t go further into details as there are other blogs on this particular topic, like this one written by Petr Jelinek: PGLogical 1.1 packages for PostgreSQL 9.6beta1
Depending on the schema of your database, the data size, possible down time window, version of the current PostgreSQL installation you might choose one option over another or whatever fits best.
Always remember that for logical replication the tables have to have a Primary Key.
With pglogical that rule isn’t so strict: You can replicate insert-only tables. But for update and deletes, it’s still mandatory to have a Primary Key or a REPLICA IDENTITY on the table.
If you’re in need of help for upgrading your PostgreSQL database, you can check the
2ndQuadrant Upgrade services.