Wednesday, August 16

BRIN for PostGIS: my story at the Code Sprint 2016 in Paris

Last week I was pleased to attend Code Sprint 2016 in Paris, organised by the OSGeo Foundation: it has been an opportunity for many GIS users from all over the world to get together, discuss and actively work on many open tasks of some of the major open source frameworks on GIS. PostGIS was obviously one of the projects involved in the Code Sprint.

25114331182_92bc5ebab3_z

Some background information first. A few months ago I started to work on adding BRIN support to the geospatial datatype available in PostGIS: I was able to develop a working prototype on my own.

However, it was thanks to this Code Sprint that I was finally able to complete the job. Code Sprint was the opportunity for me to know Julien and Ronan from Dalibo, who shared the same goals. We worked as a team and shared our skills and knowledge for the whole duration of the event, together we were able to produce a first version of a patch for PostGIS 2.3. The main features introduced with this patch are the definitions of two operator classes that include some operators that can benefit from block range indexing (BRIN), both for 2D and n-D cases. For now, just the “inclusion support” has been implemented: this means that just inclusion and/or intersection operations can benefit from BRINs.

Anyway, the final results surprised all of us: indexing is robust and promising. A simple test made on real geospatial data from the OpenStreet Map dataset has shown that BRINs, compared to GiST indexing, have:

  • execution times of the queries 5 times higher (CON);
  • indexes that are 1000 times smaller and that can be created 100 times faster (PRO).

This feature could be particularly useful in big geospatial data contexts, and this was the original motivation that prompted me to include BRIN support into PostGIS: for instance, start to use relational databases to manage LiDAR surveys. I have already talked about this case, and have also made a presentation at the FOSS4G.EU 2015 conference where I showed the limitation of PostgreSQL and its indexes with point clouds.

paris

On a personal note, I am enthusiastic for the obtained results, but, most importantly, I am extremely happy for the priceless experience. It’s been the first time I’ve taken part to a code sprint, and the atmosphere was really exciting, collaborative and constructive. Code Sprint gave me the opportunity to learn and improve. Moreover, it has been a good opportunity for me to share my knowledge of PostgreSQL with Ronan and Julien. Even though this is the first version of the patch, it is already robust. I must say that our joint everyday experience with PostgreSQL VLDBs allowed us to achieve our goal. I feel it is important that PostgreSQL developers and professional users – like us from 2ndQuadrant, for example – actively participate to the development of PostGIS.

Obviusly, I’d like to thank Julien and Ronan for their work on this patch. But I also want to thank both Àlvaro Herrera (that introduced the BRINs in PostgreSQL) and Emre Hesegeli (that has worked on BRINs too, in particular on the documentation) for the insightful conversations and email/chat exchanges about the topic. Many thanks to Vincent Picavet from OSLandia who invited me to attend, as well as all the staff that organised the event in Paris. “Grazie” also to 2ndQuadrant Italia that is self funding my research and development in this field.

Finally, I am sure you are really curious now to try this new feature! Be patient … we are polishing the patch, and then we will propose it to the PostGIS team (that has already seen some preliminary results in the Code Sprint in Paris). Our goal is to have it included in PostGIS 2.3.

I have already submitted this talk to the 5432…meetus! PostgreSQL conference that will be held in Milan at the end of June and the FOSS4G.NA geospatial conference that will be in Raleigh (North Carolina) in May! I hope I will be able to speak about how BRIN indexing performs with geospatial data in these occasions.

Leave a Reply

Your email address will not be published. Required fields are marked *