Saturday, March 23

Greg’s PlanetPostgreSQL

Planetary alignment

Greg's PlanetPostgreSQL, PostgreSQL
Picking back up this week's theme of where you can publicize your PostgreSQL related project at, you're probably reading this blog entry because it appeared on the Planet PostgreSQL blog aggregator. There are "Planet" feeds around many open-source projects. The Debian and GNOME ones spawned off the Planet software, which now powers a ton of other blogs such as the the well regarded Planet Python. Occasionally you'll find general open-source database news posted both here and on Planet MySQL. And I used to read Planet CentOS back when I used to care if they'd ever release CentOS 6. Planet PostgreSQL has been around since about seven years ago, when Devrim Gündüz first made the service available on one of his servers. Like many good open-source projects, it has some history (more…)

PostGIS In Action

Greg's PlanetPostgreSQL, PostgreSQL
I doubt many people can tell you exactly when the first time they read a map was.  Mine was memorable though.  Circa 3rd grade, I went through the usual battery of standardized tests for the first time, which included map reading.  I did pretty bad, which was odd because it was the only section I bombed like that.  Concerned that perhaps I had some sort of learning problem related to spatial data or visualization, a guidance counselor reviewing my scores quizzed me about that section and what I thought of it.  Told her I thought it was pretty neat, and that I was looking forwarding to learning about these "maps" one day.  Turns out, due to a school change and differences in class order between schools, I had never been shown one before the exam.  For (more…)

Intel SSD, now off the sh..err, shamed list

Greg's PlanetPostgreSQL, PostgreSQL, United States News
I already did the long conference entry here, so just a quick update:  slides from PGEast are posted and next week I'll be at the increasingly misnamed MySQL Conference in Santa Clara, California.One thing I'm known for now is ranting about cheap Solid State Drives and how they suck for database use.  The Reliable Writes wiki page collects up most of the background here.  The situation the last few years has been that every inexpensive drive on the market does not have a safe write cache for database use.  Every customer of mine who has purchased one of Intel's SSD drives for example, either the X25-M or the not-enterprise-at-all X25-E, has suffered at least one massive data corruption loss.In order to make a flash drive safe, you need to have a battery-backup on the (more…)

Spring 2011 PostgreSQL Conferences, US/Canada

Greg's PlanetPostgreSQL, PostgreSQL, United States News
This week's water falling from the skies isn't turning into snow.  And on days when it's clear, my car is covered with tree pollen.  While they means something different to most people, to me these are the signs that the spring conference season is about to start.  There's a conference in North America during each of the next three months containing serious PostgreSQL content, and this year I'm making each of them.PgEast 2011 takes place in New York City starting just over two weeks from now.  If you're nearby, it's not too late to make travel plans.  The location is so easy to reach via train.  The conference venue is right on top of New York's Penn Station, a major train hub.  All the major East Coast cities from Boston to Richmond are easy rides from (more…)

Linux filesystems and PostgreSQL checkpoint benchmarks

Greg's PlanetPostgreSQL, PostgreSQL, United States News
Following up on last month's Tuning Linux for low PostgreSQL Latency, there's now been a giant pile of testing done on two filesystems, three patches, and two sets of kernel tuning parameters run.  The result so far is some interesting new data, and one more committed improvements in this area that are in PostgreSQL 9.1 now (making three total, the other two are monitoring patches).  I'll be speaking about recommended practice next month during one of my talks at PostgreSQL East, and I've submitted something in this area for May's PGCon too.  Here I'll talk a bit more about the dead ends too, while those memories are still fresh. The basic problem here is that the way PostgreSQL uses the operating system cache when writing allows large amounts of data to accumulate.  The result when (more…)

Data generation and hardware quality

Greg's PlanetPostgreSQL, PostgreSQL, United States News
One of the challenges when dealing with a new database design is that you don't know things like how big the tables will end up being until they're actually populated with a fair amount of data.  But if the design has to factor in the eventual scalability concerns, you can't deploy it to obtain that data until the estimation is done.  One way around this is to aggressively prototype things.  Use staging hardware for this purpose that new applications can live on temporarily while sorting details like this out.  You can just factor in that you'll need to move the app and possibly redesign it after a few months, when you have a better idea what data is going to show up in it.The other way to get around this chicken/egg problem is to write a data generator.  Construct (more…)

Hinting at PostgreSQL

Greg's PlanetPostgreSQL, PostgreSQL, United States News
This week's flame war on the pgsql-performance list once again revolves around the fact that PostgreSQL doesn't have the traditional hint syntax available in other databases.  There are a mix of technical and pragmatic reasons behind why that is: Introducing hints is a common source of later problems, because fixing a query place once in a special case isn't a very robust approach.  As your data set grows, and possibly changes distribution as well, the idea you hinted toward when it was small can become an increasingly bad idea.Adding a useful hint interface would complicate the optimizer code, which is difficult enough to maintain as it is.  Part of the reason PostgreSQL works as well as it does running queries is because feel-good code ("we can check off hinting on our (more…)

Tuning Linux for low PostgreSQL latency

Greg's PlanetPostgreSQL, United States News
One of the ugly parts of Linux with PostgreSQL is that the OS will happily cache up to around 5% of memory before getting aggressive about writing it out.  I've just updated a long list of pgbench runs showing how badly that can turn out, even on a server with a modest 16GB of RAM.  Note that I am intentionally trying to introduce the bad situation here, so this is not typical performance.  The workload that pgbench generates is not representative of any real-world workload, it's as write-intensive as it's possible to be.Check out test set 5, which is running a stock development version PostgreSQL 9.1.  Some the pauses where the database is unresponsive during checkpoints, as shown by the max_latency figure there (which is in milliseconds), regularly exceed 40 seconds.& (more…)

Reducing the postgresql.conf, parameter at a time

Greg's PlanetPostgreSQL, PostgreSQL, United States News
One of the more useful bits of PostgreSQL documentation I ever worked on is Tuning Your PostgreSQL Server.  When that was written in the summer of 2008, a few months after the release of PostgreSQL 8.3, it was hard to find any similar guide that was both (relatively) concise and current. Since then, myself and many other PostgreSQL contributors have helped keep that document up to date as changes to PostgreSQL were made. The interesting and helpful trend during that period is that parameters keep disappearing from the set of ones you need to worry about. In PostgreSQL 8.2, there was a long list of parameters you likely needed to adjust for good performance on a PostgreSQL server: shared_buffers, effective_cache_size, checkpoint_segments, autovacuum, max_fsm_pages, (more…)

How not to build PostgreSQL 9.0 extensions on RPM platforms

Greg's PlanetPostgreSQL, PostgreSQL, United Kingdom News, United States News
For a long time, adding packages to RedHat derived Linux systems has been called "RPM Hell", for good reason.  Particularly before the yum utility came about to help, getting RPM to do the right thing has often been a troublesome task.  I was reminded of this again today, while trying to compile a PostgreSQL extension on two nearly identical CentOS systems.PostgreSQL provides an API named PGXS that lets you build server extensions that both leverage the code library of the server and communicate with it.  We use PGXS to install our repmgr utility, and having that well defined API let the program be developed externally from the main server core.  Many popular pieces of PostgreSQL add-ons rely on PGXS to build themselves.  In fact, the contrib modules that come with (more…)