Thursday, July 27

Author: Shaun Thomas

PG Phriday: Crazy Correlated Column Crusade

Shaun's PlanetPostgreSQL
For a long time, the Postgres query planner has sported a huge blinking neon blind-spot that frustrated and enraged DBAs throughout the universe to a level just shy of murderous frenzy. How is this even possible? What terrible lurking horror could elicit such a visceral response from probably the most boring and straight-laced people ever to draw breath? What else? Correlated statistics. The Adventure Begins! The Postgres query planner is a cost-estimation engine. Postgres gathers statistics on table contents such as most and least frequent values, value cardinality, rudimentary histograms, and so on. Along with multiple metrics related to hardware responsiveness, it generates multiple viable execution strategies, and chooses the one that probably costs the least resources. So far, (more…)

PG Phriday: Uptown Func

Shaun's PlanetPostgreSQL
One of the coolest things about Postgres functions is that they can return rows as if they were a table. Not only is it possible, but creating a set-returning function in Postgres is almost frighteningly trivial when compared to other database engines. In fact, some Oracle users are probably rolling their eyes and muttering "Just use a cursor!" already. Just hear us out! Postgres functions can return cursors too, but using them afterwards isn't exactly a friendly experience. Cursors are passed by reference, so the function must either accept a parameter to name the cursor, or it generates something wacky like "<unnamed portal 1>". After that, cursors can only be used with FETCH instead of SELECT, greatly limiting their utility. What about views, then? Obviously they can return (more…)

PG Phriday: Postgres on Tap

Shaun's PlanetPostgreSQL
Testing within the context of a database can sometimes seem a bit overwhelming. Not only is it necessary to create a reasonable set of test data to bootstrap multiple tables, but we need some reliable invocation system to compare query, view, and function output to expected results. And that is no small feat. For a very long time, the "easiest" way to test end-user database objects involved building a Makefile that included PGXS. Then we had to create the tests, run the tests, and copy over the output into static expected files. Modifying any test always meant updating the static output file as well. It's great that the tools are there, yet why do we need to be halfway to a Postgres developer in order follow good QA practices with Postgres? Well, maybe we don't. It turns out, (more…)

PG Phriday: Dearly De-parted

Shaun's PlanetPostgreSQL
With the christening of PG Phriday at its new location, it's only fitting that we revisit a frequent boon and bane of organized storage: partitioning. When volume and scale reach a certain point that the very laws of physics become a barrier to efficiency, partitions are a crucial tool to escape the otherwise inevitable pile of unmaintainable bits. It's not feasible to reindex a multi-terabyte table with tens or hundreds of billions of rows. It's excruciating to excise portions for long-term archival, and all but impossible to perform anything but the most cursory maintenance in general. Huge tables are a partition advocate's dream. But until fairly recently, the implementation in Postgres has been... less than savory. While nothing is perfect, Postgres has never actually offered (more…)