In my previous post we looked at various partitioning techniques in PostgreSQL for efficient IoT data management. The basic objective behind time-based partitions is better performance, especially in IoT environments, where the active data is usually the most recent data. New data is usually append-only, and it can grow quickly depending on the frequency of the data points.
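To make the idea of time-based partitions concrete, here is a minimal Python sketch that generates the DDL for monthly range partitions using PostgreSQL 10's declarative partitioning syntax. The table name `iot_readings`, its columns, and the helper functions are hypothetical and chosen only for illustration; they are not taken from the earlier post.

```python
from datetime import date
from typing import List

# Hypothetical parent table for illustration; the column names are assumptions.
PARENT_DDL = (
    "CREATE TABLE iot_readings ("
    "device_id integer NOT NULL, "
    "reading_time timestamptz NOT NULL, "
    "reading numeric"
    ") PARTITION BY RANGE (reading_time);"
)


def next_month(d: date) -> date:
    """Return the first day of the month following d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def monthly_partition_ddl(parent: str, first: date, count: int) -> List[str]:
    """Build CREATE TABLE ... PARTITION OF statements, one per month.

    Range bounds in PostgreSQL are inclusive at the lower end and
    exclusive at the upper end, so consecutive months do not overlap.
    """
    statements = []
    lower = date(first.year, first.month, 1)  # snap to the start of the month
    for _ in range(count):
        upper = next_month(lower)
        name = f"{parent}_{lower:%Y_%m}"
        statements.append(
            f"CREATE TABLE {name} PARTITION OF {parent} "
            f"FOR VALUES FROM ('{lower.isoformat()}') TO ('{upper.isoformat()}');"
        )
        lower = upper
    return statements


if __name__ == "__main__":
    print(PARENT_DDL)
    for stmt in monthly_partition_ddl("iot_readings", date(2018, 12, 15), 3):
        print(stmt)
```

Because recent data lands in the newest partition, queries that filter on a recent time range only need to touch a small number of partitions, which is where the performance benefit comes from.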
Some might ask why multiple write nodes are needed (as is inherent in a BDR cluster) when a single node can effectively handle incoming IoT data using features such as time-based partitioning. Gartner estimated 8.4 billion connected devices in 2017 and expects this number to grow to over 20 billion by 2020. The scale at which connected devices are
I spent a couple of days in São Paulo, Brazil last week, for the top-notch PGConf.Brazil 2018 experience. This year I gave a talk about improvements in the declarative partitioning area in the upcoming PostgreSQL 11 release — a huge step forward from what PostgreSQL 10 offers. We have some new features, some DDL handling enhancements, and some performance improvements, all worth checking out.
I'm told that the organization is going to publish video recordings at some point; for the time being, here are my talk slides.
I'm very happy that they invited me to talk once again in Brazil. I had a great time there, even if they won't allow me to give my talk in Spanish! Like every time I go there, I regret it once it's time to come home, because it's so easy to feel at home with the
This blog continues the discussion from my previous post on scalability for IoT workloads where I discussed how declarative partitioning in PostgreSQL 10 can help achieve scalability. While native declarative partitioning is a good start, the experience of creating and maintaining the same partitions I did in my last post becomes much more fun with pg_partman.
pg_partman is an extension to create and manage both time-based and serial-based table partition sets. Native partitioning in PostgreSQL 10 is supported as of pg_partman v3.0.1. It is important to note that not all the features of trigger-based partitioning are supported in native partitioning yet, but performance in both reads and writes is significantly better. Since Postgres-BDR runs as an extension on PostgreSQL, we can enjoy all features
Here’s a step-by-step guide to installing PostgreSQL on your machine using PGInstaller. PGInstaller supports three modes of installation: Graphical, Unattended, and Text. We’re going to cover all three of them in this guide.
To Install PostgreSQL via Graphical Mode
Download PGInstaller here. PGInstaller is available for PostgreSQL 9.5, 9.6, 10, and 11 (beta).
Click on the executable file to run the installer.
Select your preferred language.
Specify the directory where you want to install PostgreSQL.
Specify the PostgreSQL server port. You can leave this as the default if you’re unsure what to enter.
Specify the data directory in which to initialize the PostgreSQL database.
Create a PostgreSQL user
PostgreSQL administration, configuration, and deployment can be a tough ask while working in an agile environment with strict deadlines. The key to simplifying these operational tasks for PostgreSQL is Ansible, an open source IT automation tool.
To explain how these technologies work together, 2ndQuadrant hosted a webinar on PostgreSQL deployments using Ansible. The webinar was presented by Tom Kincaid, GM North America at 2ndQuadrant, who gave an overview of Ansible and PostgreSQL, covered the best strategies for deployments, and touched on a variety of other topics.
If you weren’t able to make it to the live session, you can now view the recording here.
For any questions, comments, or feedback, please visit our website or send an email to
The video of my presentation below walks you through the major features of the native JSON data type in PostgreSQL 9.3 and beyond.
This presentation covers the following topics:
What is JSON?
How is it available in PostgreSQL?
What's the difference between JSON and JSONB?
Accessing JSON values
Creating JSON from table data
Creating table data from JSON
Crosstabs with JSON
Indexing and JSON
When to use JSON, when to use JSONB, and when neither should be used
The elephant has been the symbol of PostgreSQL for many years now, a reference to its robustness and strength as well as its reputed wisdom. Long may that association continue.
Even after many years of protection, the elephant is being killed by poachers at an incredible rate of 20,000 per year, or approximately 1 elephant will be killed while you read this.
The things we care about can be destroyed if we do nothing.
If the online trade in ivory can be reduced, we can reduce the killing.
Please contribute in some way, and report traders if you see them.
A couple of weeks back, I wrote about how to use Window Functions for time series IoT analytics in Postgres-BDR. This post follows up on IoT time series data and covers the next challenge: Scalability.
‘Internet of Things’ is the new buzzword as we move to a smarter world equipped with more advanced technologies. From transport to the building industry, from smart homes to personal gadgets, it’s not just about gadgets and sensors anymore.
In reality, it is all about data. Not just simple data, but data that grows at an enormous rate. Businesses and application developers in the Internet of Things domain face similar questions today when looking for the best combination of technologies to support them. Without a doubt, the database remains at the core of any such decision.
OmniDB 2.8 introduced support for Postgres-BDR 3.0, the ground-breaking multi-master replication tool for PostgreSQL databases, announced last month at PostgresConf US.
Here we have 2 virtual machines with Postgres-BDR 3.0 installed, and we will use OmniDB to connect to them and set up replication between the machines.
Postgres-BDR 3.0 requires PostgreSQL 10 or later, and the pglogical 3.0 extension must also be installed, because Postgres-BDR 3.0 works on top of pglogical 3.0. Make sure you put the required entries in pg_hba.conf so that both machines can communicate with each other via streaming replication. Then, in postgresql.conf, you should set the following parameters on both machines:
listen_addresses = '*'
client_encoding = utf8
wal_level = 'logical'
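As a quick sanity check before setting up replication, the following Python sketch parses the text of a postgresql.conf file and reports which of the settings listed above are missing or different. It covers only the three parameters visible in this excerpt. The function names are hypothetical, and the parser is deliberately simplified: it assumes `key = value` lines and does not handle a `#` inside a quoted value or the optional form without an equals sign.

```python
from typing import Dict

# Only the settings shown in the excerpt above; a real setup may need more.
REQUIRED = {
    "listen_addresses": "*",
    "client_encoding": "utf8",
    "wal_level": "logical",
}


def parse_conf(text: str) -> Dict[str, str]:
    """Parse simple 'key = value' lines, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip().strip("'")
    return settings


def missing_settings(text: str, required: Dict[str, str] = REQUIRED) -> Dict[str, str]:
    """Return the required settings that are absent or set to another value."""
    actual = parse_conf(text)
    return {
        key: expected
        for key, expected in required.items()
        if actual.get(key, "").lower() != expected.lower()
    }


if __name__ == "__main__":
    sample = (
        "listen_addresses = '*'\n"
        "client_encoding = utf8\n"
        "wal_level = 'replica'  # not sufficient for logical replication\n"
    )
    for key, expected in missing_settings(sample).items():
        print(f"{key} should be set to {expected!r}")
```

Running the same check against the configuration file on each machine is a cheap way to catch a node that was left with default settings.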
If the title of this blog post rings a bell with you, perhaps you were at PG Day in Horwood House in 2014, when I stood up for 5 minutes to make the case for data modelling; a data model is much more than just a diagram. I shouldn’t be, but I am often amazed by the way data models (and the tools we use to manage them) are derided as ‘just pretty pictures’ or ‘documentation’. I’m not going to repeat my lightning talk here (watch it yourself if you want to); instead, I’m going to talk about Data Vault.
Data Vault (DV) is a technique for building scalable data warehouses. Dan Linstedt describes DV as “a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing