Monday, June 26

Author: Pavan Deolasee

PGConf India 2017 – An Event to Remember

2ndQuadrant, Pavan's PlanetPostgreSQL
Yet another edition of PGConf India came to conclusion in early March. You may have noticed the change from PGDay to PGConf, which signals a much larger gathering of PostgreSQL enthusiasts, now and in future. What started as a small meet-up of like minded people 4 years back, has now grown into a 2-day conference with a dedicated training day and a multi-track event at the main conference.The keynote this year was delivered by Simon Riggs, a major developer and committer at PostgreSQL. He presented his thoughts on why Persistence is key to PostgreSQL's success. Persistence is why users trust PostgreSQL to manage their critical data and persistence is why PostgreSQL community is able to deliver a solid product, release after release.This year's conference was attended by more th

Looking forward to PGConf India 2017

Pavan's PlanetPostgreSQL, PostgreSQL
It has taken a lot of planning and efforts, but I'm happy to see that PGConf India 2017 is coming along very well and promises to be a great event. This is our third year in a row in Bengaluru, but there are many distinguishing factors that make this year stand out:For the first time, we're having a dedicated training day ahead of the main conference. All seats for the trainings are sold out and we had to turn away many interested folks. For the first time, we're having a multi-track conference. And for the first time, we've crossed 225 conference delegate registrations, and there are still a few days left.When we started planning for the conference, we were a bit nervous about whether we could get a good response to the training programme, but we’re almost stumped by the p

Corruption in CREATE INDEX CONCURRENTLY

Pavan's PlanetPostgreSQL
Two weeks back we discovered an ancient bug in PostgreSQL which may cause index corruption when index is built via CREATE INDEX CONCURRENTLY (aka CIC). All supported releases of PostgreSQL are affected by this bug. The bug could be a source of index corruption reports that we receive from the field once in awhile.Now before we discuss the problem, let me make it absolutely clear that PostgreSQL is known for its gold standards as far as code quality and product stability is concerned. But just like any other software, the database engine can also have unnoticed bugs. But PostgreSQL community takes every known bug very seriously and try to fix it on priority. No wonder that the bug fix went into the scheduled 9.6.2 release (and all other stable releases) at a very short notice, but not b
Explaining CREATE INDEX CONCURRENTLY

Explaining CREATE INDEX CONCURRENTLY

2ndQuadrant, Pavan's PlanetPostgreSQL
This technical blog explains how CREATE INDEX CONCURRENTLY (CIC) works and how it manages to avoid locking the table from updates. A unique distinguishing factor of CIC is that it can build a new index on the table, without blocking it from updates/inserts/deletes.But even before that, let's understand how Heap-Only-Tuple (HOT) works. It was a landmark feature added in PostgreSQL 8.3 to reduce table bloat and improve performance significantly. But the feature also has some implications on the working of CIC. Heap-Only-Tuple (HOT) PostgreSQL uses multi-version concurrency control (MVCC) for transactions. In this technique, when a row is updated, a new version of the row is created and the old version is left unchanged. Each version has creator and destroyer information and transaction...

Postgres-XL Scalability for Loading Data

Pavan's PlanetPostgreSQL
In my last blog, we looked at the benchmark results from bulk load test for a Postgres-XL database cluster. Using a 16-datanode, 2-coordinator cluster, running on EC2 instances, we could easily clock 9M rows/sec or 3TB/hr of ingestion rate. That’s a significant number in itself. In this blog, we’ll see if the ingestion rate is scalable in Postgres-XL. In particular, we’ll try to answer if adding more nodes to the cluster can result in a linear increase in performance.Let’s use the same line item table from the TPC-H benchmark for these tests. We'll increase the cluster size from 16 datanodes to 20 datanodes and then further to 24 datanodes. We'll also repeat the tests with 1, 2 and 3 coordinators respectively. For all these tests, we are using i2.xlarge EC2 instance for a datanode an

Load data in Postgres-XL at over 9M rows/sec

2ndQuadrant, Pavan's PlanetPostgreSQL
We are faced with this question: “What’s the ingestion rate of Postgres-XL?”, and I realised I don’t have a very good answer to that. Since recently we made some good improvements in this area, I was curious to know too. Well, I decided to benchmark. Hardware and Software For the tests, I used a Postgres-XL cluster running on EC2 instances. Since COPY has to go through the coordinator, it seemed reasonable to use a compute-optimised c3.8xlarge instance for running coordinator. Similarly, for datanodes,  storage-optimised i2.xlarge instances are more appropriate. Both these instances have attached SSD disks, though i2.xlarge instance has more storage than the c3.8xlarge instance. So the next question was how to generate data for the test? We’d used TPC-H benchmark for previous perform

How will Postgres-XL exploit the Parallel Query Capabilities of PostgreSQL 9.6?

Pavan's PlanetPostgreSQL
With PostgreSQL 9.6 now out, many of our users and customers are asking us about the plans for Postgres-XL 9.6. The new release of PostgreSQL implements some new features, especially around parallel query processing and many are interested to know when those capabilities will be available in Postgres-XL.Postgres-XL itself is a massively parallel processing system. It has an efficient distributed planner and executor which can distribute work to tens of datanodes, process the result and send it back to the client. It allows the user to either shard or replicate a table for efficient query execution. On top of that, Postgres-XL also provides global consistency of the data, all the time, so that users don’t need to worry about consistency while designing their applications.One major d

Postgres-XL 9.5r1.1 released

2ndQuadrant, Pavan's PlanetPostgreSQL
The Postgres-XL community has released a new version of the software, tagged as 9.5r1.1. This is a minor update to earlier 9.5r1 release, but contains some very important bug fixes. I would encourage everyone to immediately upgrade to this latest release. Being a minor release, you only need to download the latest sources from , compile, install new binaries and restart your servers. Major Improvements This release is fully caught up with the latest PostgreSQL 9.5 minor release i.e. it includes all bug fixes from PostgreSQL 9.5.3 version, also released today. List of important bug fixes in PostgreSQL 9.5.3 can be found here. Apart from that, it contains following important bug fixes:Fix a nasty bug in CLOG and Subtrans Log management which may lead to severe data corruption. Add a...

Postgres-XL 9.5r1 has finally arrived!

2ndQuadrant, Pavan's PlanetPostgreSQL
After months of efforts, I'm pleased that Postgres-XL 9.5r1 is seeing the daylight. It has been tremendous collective efforts by many, both inside and outside 2ndQuadrant. Often it's not visible via commit history or mailing list communications, but I must admit that many folks have contributed in making this grand release. Contributors who wrote code and sent ideas, those who reviewed the code, QA team for testing and enhancing regression coverage, those who did benchmarks to show how we are doing, those who wrote automation to deploy the product and of course the user community for testing the product and providing invaluable feedback. So thank you folks!Over the year, I've periodically wrote about various enhancements we made to the product. But let me summarise some of them here fo...

Postgres-XL 9.5R1Beta2 Released!

Pavan's PlanetPostgreSQL
The Postgres-XL 9.5R1Beta2 release went out yesterday. It's another step forward to have a stable 9.5 release sometime very soon. A few key enhancements from the last beta release are captured in this blog. For the full list, I would recommend to read the release notes. Support for binary data transfer for JDBC and libpq If you'd trouble receiving data in binary format via JDBC or libpq, this release should fix those issues for you. The coordinator now have intelligence to figure out if the client is requesting data in binary format and handle those cases correctly. Pushdown of Append and MergeAppend plans One of the problems reported with the beta1 release was that for inherited tables, coordinator fails to pushdown plans to the remote nodes even when its possible. A simple example wo