Using the PostgreSQL System Columns

There are a few parts of the PostgreSQL internals that poke out usefully if you look in the right place for them.  One useful set to know about are the System Columns, which you can explicitly request but don’t see by default.  For example:

psql -x -c “SELECT oid,* FROM pg_class LIMIT 1″

There is no column named oid in the pg_class table, but it’s there if you ask for it.  The oid used to be relied on more heavily in PostgreSQL as a way to identify rows.  That’s not true for regular tables anymore, and you really don’t want to start doing that for your own tables.  OIDs are mainly useful now when joining parts of the System Catalog together.  A good example is the Disk Usage query.  If you want to find the namespace a table is in, you need to know you can ask for its OID.  It’s possible to get some of this data out of more portable views like information_schema.tables.  But many of the useful things in this area are PostgreSQL specific.  Sometimes I see people starting with the information_schema views and joining against other tables using its text name fields, such as the listed table_name.  That approach has several edge cases that don’t work out correctly; not handling TOAST columns is a common example.  That makes them more prone to breaking on you later, probably after your system has gone into production, than an OID based join.

There is also a tableoid system column.  As described in the documentation, its main use case is identifying which partition a row come from.  That’s not a great thing to be driving application logic from, but it can be useful for monitoring or troubleshooting purposes.  For example, if you SELECT rows from the parent table in a partitioning inheritance scheme, it’s normally expected that no rows will actually be stored there.  Checking the tableoid is one way to confirm that.  You might confirm that your INSERT/UPDATE trigger is moving rows to the right place using tableoid as well.  It’s possible to do that for each individual partition section, but running a query against the parent will make sure you hit every row in the table.

Another internal column related to uniquely identifying rows is the ctid.  The ctid is a direct pointer to the physical block (using PostgreSQL’s 8K page size) and position of a row.  ctids are a pair of numbers, and the first row will be (0,1).  While this is the fastest way to find a row more than once in the same transaction block, these numbers are not stable in the long term.  Any UPDATE and some maintenance operations will change them.  One thing you can use these for is finding duplicate data in a table.  Let’s say you’re trying to add a unique constraint, but one row in the table is duplicated 3 times, which blocks the unique index from being created.  When rows are identical in every column, you can’t write any simple SELECT statement to uniquely identify them.  That means deleting all of them but one copy requires some annoying and fragile SQL code, combining DELETE with LIMIT and/or OFFSET–which is always scary.  If you use the ctid instead, the implementation will be PostgreSQL specific, but it will also be faster and cleaner.  See Deleting Duplicate Records in a Table for an example of how that can be done.

The other system columns all relate to transaction visibility:  xmin, cmin, xmax, cmax.  When you delete a row in PostgreSQL, it isn’t eliminated from disk immediately.  It’s possible that some other query that’s executing at the same time will still need to see that row, and the transaction isolation in PostgreSQL worries about such things.  If you ever want to learn how that isolation works, the way the Multiversion Concurrency Control (MVCC) implementation is handled, you can watch parts of it happen.  Just open transactions in two different sessions, UPDATE/DELETE in one of them, and then look at those rows in the other.  You can still see them in the session where they weren’t touched, but they’ll be marked to expire in the future via their xmax being set.  To really pull that all together, you also need to know about some of the System Information Functionstxid_current() is the most useful for this sort of learning experience, it provides a reference point for the always increasing system transaction ID.  You can find a more detailed exploration of using these functions and system columns in Bruce’s MVCC Unmasked talk.  The “Routine Maintenance” chapter of my book also shows examples how how MVCC works through the perspective of the system columns.

This Post Has 0 Comments

Leave A Reply