In the previous blog post, I briefly explained how we got the performance numbers published in the pglogical announcement. In this blog post I’d like to discuss the performance limits of logical replication solutions in general, and also how they apply to pglogical.
Firstly, let’s see how physical replication (built into PostgreSQL since version 9.0) works. A somewhat simplified figure of the with two just two nodes looks like this:
Clients execute queries on the master node, the changes are written to a transaction log (WAL) and copied over network to WAL on the standby node. The recovery on the standby process on the standby then reads the changes from WAL and applies them to the data files just like during recovery. If the standby is in “hot_standby” (more…)
A few days ago we released pglogical, a fully open-source logical replication solution for PostgreSQL, that’ll hopefully get included into the PostgreSQL tree in a not-too-distant future. I’m not going to discuss about all the things enabled by logical replication - the pglogical release announcement presents a quite good overview, and Simon also briefly explained the advantages of logical replication in another post a few days ago.
Instead I’d like to talk about one particular aspect mentioned in the announcement - performance comparison with existing solutions. The pglogical page mentions
... preliminary internal testing demonstrating a 5x increase in transaction throughput (OLTP workloads using pgBench ) over other replication methods like slony and londiste3. So let's see where the statement comes from.
In Postgres-XL, sequences are maintained at the Global Transaction Manager (GTM) to ensure that they are assigned non-conflicting values when they are incremented from multiple nodes. This adds significant overhead for a query doing thousands of INSERTs in a table with a serial column, incrementing sequence one at a time and making a network roundtrip to the GTM, for every INSERT.
Shaun Thomas in a recent blog complained about INSERTs running a magnitude slower on Postgres-XL as compared to vanilla PostgreSQL. There is already a way to improve performance for sequences, but it’s clearly not well advertised. I thought this is a good opportunity to explain the facility.
Postgres-XL provides a user-settable GUC called sequence_range. Every backend requests a block of sequence values as (more…)
This another simple test in continuation from last time. We will start with the same lineitem table as in the previous example. We will measure the time it takes to load the same 7.2GB text file repeatedly until the table size grows to about 1TB. We create a baseline with a table that has no indexes built on it. Then repeat with a B-tree index on the l_shipdate DATE column, and again after replacing the B-tree index with a BRIN index.
Our baseline shows that as the table grows the time it takes to insert data also increases. The difference in the time that it takes to insert data when the table is near 1TB compared to when it is empty is about 12 seconds. With the B-tree index in place the difference increases to 84 seconds. Finally the change is only about 15 seconds with (more…)
I have been looking at the new Block Range Indexes (BRIN) being developed for PostgreSQL 9.5. BRIN indexes are designed to provide similar benefits to partitioning, especially for large tables, just without the need to declare partitions. That sounds pretty good but let's look in greater detail to see if it lives up to the hype.
How large? Here's one data point. Using the TPC Benchmark(TM) H provided dbgen we created data for the lineitem table at the 10GB scale factor, which results in a 7.2GB text file.
We're going to compare a couple of basic tasks. The first look will be at the impact of inserting data into a table using the COPY command. We will do a simple experiment of creating a table without any indexes or constraints on it and time how long it takes to load the (more…)
The profiling utility perf that ships with the Linux kernel is extremely useful for examining system-wide and multi-process behaviour - but it does a lot more than the CPU profiling it's often used for. You've probably looked at perf top -az or perf top -u postgres output, but that's only the littlest bit of what it can do. (If you want the TL/DR version, jump down to "User space dynamic probes").
One of the big advantages of perf is that it's non-intrusive. You don't have to attach a debugger and interrupt execution. You don't have to run commands directly under a profiler in a special environment. There's no need to re-start the server to debug a problematic workload, and often no need to re-compile with debug options. This is extremely useful when you're trying to track down (more…)