Friday, September 22

Planning to succeed

PostgreSQL 9.6 has a lot of good features; many of the changes are in the SQL planner, aiming to improve performance by carefully selecting the right execution plan. The great thing here is that doing less work makes many queries much, much faster than they were before.

First, we are now using Foreign Key data in the planner to improve estimates.

Next, we are combining aggregates to avoid duplicating effort.

We’re also improving the way that GROUP BY estimation occurs.

And we’re using partial indexes for index-only scans in more cases.

And we’ve improved estimates for distinct rows, leading to more accurate planning of hash joins and other plan types.

Congratulations to my 2ndQuadrant colleagues Tomas Vondra and David Rowley for their insight, rigour and persistence in chasing down these issues.

All of these things were planned. They didn’t occur randomly, they were part of a coordinated attack on planning problems with big data as part of the AXLE project.

5 Comments

  • mbanck

    While planner changes usually don’t feature prominently in the release announcement/release notes, a line like “Various planner changes have increased the TPC-H score by XXX %” would be taken notice of I think. Now just somebody will have to rerun those benchmarks ofter Beta or RC…

  • Anon

    As an observer, I have been watching and reading the commitfest emails/threads. Amazing performance gains coming.

    The only items im disappointed in not seeing is the POC cache of getsnapshotdata being pushed to 9.7. This shows huge performance gains for x86 (Intel I believe). I thought thats why you guys now run long Alphas, get it out quicker but also allow longer to fine tune things.
    There also seems be a performance bias towards Power8 in the threads which is unfortunate… I hope they start to benchmark 9.6 and 9.7 against AMD as well when it releases its next chip ZEN and ARM towards end of year, when they make decisions about commits.

    It would be nice to see the POC OR-clause as well, but again pushed to 9.7.

    Was hoping all the incredible performance improvements would be in 9.6 since there isn’t any major killer feature for end user in this release unlike the previous releases (JSONB etc in the past). There still seems like there is time to get this stuff in since the buffer spin lock atomic operations has been committed and in that state for quite a while.. so hope that makes it into 9.6 as well.

    Just by 2 cents :)

  • Anon

    So from a query performance perspective. Where is the next lot of best gains MOST LIKELY going to come from in your opinion?
    Better query plan selection (is it that bad currently?), LLVM, GPU, without having to Scale Out.. that we may see in 9.7 and 9.8?
    and how much more performance gain to you think there is to be had? 10%, 50% best case?

  • Alepaes

    “Better query plan selection (is it that bad currently?)”
    Yes, it’s sometimes too bad, specially in aggregates (i.e. nothing pushing predicates). Oracle and SQL Server have a good work in helping bad wrote querys, what is good in times of web frameworks with data abstraction layer.
    Unfortunately, we will haven’t Transparent Table Partitioning in 9.6. This can bring a monster performance gain, IMHO.

Leave a Reply

Your email address will not be published. Required fields are marked *