PostgreSQL anti-patterns: read-modify-write cycles

Shaun Thomas’s recent post about client-side loops as an SQL anti-pattern is well worth a read if you’re relatively new to SQL-based application development.

It’s reminded me of another SQL coding anti-pattern that I see quite a lot: the na├»ve read-modify-write cycle. Here I’ll explain what this common development mistake is, how to identify it, and options for how to fix it.

Imagine your code wants to look up a user’s balance, subtract 100 from it if doing so won’t make it negative, and save it.

It’s common to see this written as three steps:

SELECT balance FROM accounts WHERE user_id = 1;
-- in the application, subtract 100 from balance if it's above
-- 100; and, where ? is the new balance:
UPDATE accounts SET balance = ? WHERE user_id =1;

and everything will appear to work fine to the developer. However, this code is critically wrong, and will malfunction as soon as the same user is updated by two different sessions at the same time.

Imagine two concurrent sessions, each subtracting 100 from the user’s balance, starting with an initial value of 300.

Session 1 Session 2
SELECT balance FROM accounts WHERE user_id = 1; (returns 300)  
  SELECT balance FROM accounts WHERE user_id = 1; (also returns 300)
UPDATE balance SET balance = 200 WHERE user_id = 1; (300 – 100 = 200)  
  UPDATE balance SET balance = 200 WHERE user_id = 1; (300 – 100 = 200)

Whoops!. The balance is 200, but you took out 200 from a starting point of 300. So 100 has just vanished.

Most testing and development is done on standalone servers running single sessions, so unless you’re doing rigorous testing this sort of thing often doesn’t get noticed until production, and can be painful to debug. It’s important to know about it so you can code defensively.

Don’t transactions prevent this?

Following along

If you want to try this or any of the other examples in this article, just run the following setup code:

CREATE TABLE accounts (user_id integer primary key, balance integer not null);

INSERT INTO accounts(user_id, balance) VALUES (1, 300);

then run the commands in each session column in separate psql sessions by using two terminal windows**.

I often have people on Stack Overflow ask things to the tune of “Don’t transactions prevent this?”. Unfortunately, while great, transactions aren’t magic secret sauce you can add for easy concurrency. The only way to let you completely ignore concurrency issues is to LOCK TABLE every table you might use before starting the transaction (and even then you have to always lock in the same order to prevent deadlocks).

In a transaction the outcome in this case is exactly the same:

Session 1 Session 2
BEGIN; BEGIN;
SELECT balance FROM accounts WHERE user_id = 1; (returns 300)  
  SELECT balance FROM accounts WHERE user_id = 1; (also returns 300)
UPDATE balance SET balance = 200 WHERE user_id = 1; (300 – 100 = 200)  
  UPDATE balance SET balance = 200 WHERE user_id = 1; (300 – 100 = 200)
COMMIT; COMMIT;

Solutions

Thankfully PostgreSQL (and SQL in general) has a few tools that will help you, and there are some application side options too. Some popular solutions to this problem are:

  • Avoiding the read-modify-write with a calculated update
  • Row level locking with SELECT ... FOR UPDATE
  • Use of SERIALIZABLE transactions
  • Optimistic concurrency control, otherwise known as optimistic locking

Avoiding the read-modify-write cycle

The best solution is often to just do the work in SQL, avoiding the read-modify-write-cycle entirely.

Just write:

Session 1 Session 2
UPDATE accounts SET balance = balance - 100 WHERE user_id = 1; (sets balance=200)  
  UPDATE accounts SET balance = balance - 100 WHERE user_id = 1; (sets balance=100)

This works even if the two statements are in concurrent transactions, because the first takes a lock on the row and the second waits on the lock until the first commits or rolls back. The transaction isolation documentation covers this in more detail under READ COMMITTED.

This option is only viable for simpler cases, though. It isn’t useful if the application needs to do some complex logic based on the current balance to determine if the update should proceed, for example.

This method is usually the simplest and fastest choice where it’s applicable.

Note that in the non-default SERIALIZABLE isolation it prevents the error, but does so differently. See the discussion of SERIALIZABLE below, and the documentation.

Row level locking

The simplest solution to fix an existing broken application with the fewest changes is generally to add row level locking.

Instead of SELECT balance FROM accounts WHERE user_id = 1 write SELECT balance FROM accounts WHERE user_id = 1 FOR UPDATE. This takes a row-level write lock. Any other transaction that tries to UPDATE the row or SELECT ... FOR UPDATE (or FOR SHARE) it will pause until the transaction that holds the lock rolls back or commits.

In the example above and in PostgreSQL's default transaction isolation level the second SELECT would simply not return until the first transaction ran its UPDATE then COMMITed. Then the first transaction would continue, but the SELECT would return 200 instead of 300, so the correct value would be produced.

Session 1 Session 2
BEGIN; BEGIN;
SELECT balance FROM accounts WHERE user_id = 1 FOR UPDATE; (returns 300)  
  SELECT balance FROM accounts WHERE user_id = 1 FOR UPDATE; (gets stuck and waits for transaction 1)
UPDATE balance SET balance = 200 WHERE user_id = 1; (300 - 100 = 200)  
COMMIT;  
  (second transaction's SELECT returns 200)
  UPDATE balance SET balance = 100 WHERE user_id = 1; (200 - 100 = 100)
  COMMIT

The explicit locking chapter in the documentation discusses this in more detail.

Note that this only works if the read-modify-write cycle is contained within a single transaction, as locks only exist for the lifetime of a transaction.

SERIALIZABLE transactions

If the read-modify-write cycle is always contained within a single transaction and if you're running PostgreSQL 9.1 or newer, you can use a SERIALIZABLE transaction instead of explicit SELECT ... FOR UPDATE.

In this case both SELECTs would proceed normally, as would both UPDATEs. The first transaction would COMMIT fine. However, when you went to COMMIT the second transaction it would instead abort with a serialization error. You would then re-run the failed transaction from the beginning*:

Session 1 Session 2
BEGIN ISOLATION LEVEL SERIALIZABLE; BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT balance FROM accounts WHERE user_id = 1; (returns 300)  
  SELECT balance FROM accounts WHERE user_id = 1; (also returns 300)
UPDATE accounts SET balance = 200 WHERE user_id = 1; (300 - 100 = 200)  
  UPDATE accounts SET balance = 200 WHERE user_id = 1; (gets stuck on session1's lock and doesn't proceed)
COMMIT - succeeds, setting balance=200  
  (UPDATE continues, but sees that the row has been changed and aborts with a could not serialize access due to concurrent update error)
  COMMIT converted into forced ROLLBACK, leaving balance unchanged

SERIALIZABLE isolation can force applications to repeat a lot of work if a big transaction aborts or if conflicts are common. It's very useful for complex cases where attempting to use row locking might just cause deadlocks, though.

The documentation on concurrency control and transaction isolation is the best resource for information on serializable isolation. If you're not used to thinking about concurrency issues be prepared to read it a few times and be sure to try out the examples.

Optimistic concurrency control

Optimistic concurrency control (or "optimistic locking") is usually implemented as an application-side method for handling concurrency, often by object relational mapping tools like Hibernate.

In this scheme, all tables have a version column or last-updated timestamp, and all updates have an extra WHERE clause entry that checks to make sure the version column hasn't changed since the row was read. The application checks to see if any rows were affected by the UPDATE and if none were affected, treats it as an error and aborts the transaction.

For this demo you need to add a new column:

ALTER TABLE accounts ADD COLUMN version integer NOT NULL DEFAULT 1;

Then the example above becomes:

Session 1 Session 2
BEGIN; BEGIN;
SELECT balance, version FROM accounts WHERE user_id = 1; (returns 1, 300)  
  SELECT version, balance FROM accounts WHERE user_id = 1; (also returns 1, 300)
COMMIT; COMMIT;
BEGIN; BEGIN;
UPDATE accounts SET balance = 200, version = 2 WHERE user_id = 1 AND version = 1; (300 - 100 = 200. Succeeds, reporting 1 row changed.)  
  UPDATE accounts SET balance = 200, version = 2 WHERE user_id = 1 AND version = 1; (300 - 100 = 200). Blocks on session 1's lock.
COMMIT;  
  (UPDATE returns, matching zero rows because it sees version=2 in the WHERE clause)
  ROLLBACK; because of error detected

Because it's fiddly to code, optimistic concurrency control is usually used via an ORM or query building tool.

Unlike SERIALIZABLE isolation, it works even in autocommit mode or if the statements are in separate transactions. For this reason it's often a good choice for web applications that might have very long user "think time" pauses or where clients might just vanish mid-session, as it doesn't need long-running transactions that can cause performance problems.

Optimistic concurrency control can co-exist with traditional locking based approaches if you use triggers to enforce the optimistic concurrency locking rules; like this. Nonetheless it's generally used instead of other explicit concurrency control methods.

Which to use?

The appropriate choice depends on what you're doing, your scalability requirements, how easy it is to write retry loops to re-run failed transactions, etc.

There is no right answer for everybody. If there was, there'd only be one way to do it, not several.

There is, however, a definitively wrong way, and that's to ignore concurrency issues and expect the database to just take care of them for you.

Future topics

There's lots more to watch out for.

  • "Testing" connections to avoid adding error handling and retry code
  • Home brew queueing implementations
  • Sending email / calling web services etc from the DB
  • The many ways to get upsert (insert-if-exists-update) or insert-if-not-exists wrong
  • ... and plenty more

I'll discuss some of them in later weeks.


* The topic of transaction re-tries brings me to an anti-pattern I want to discuss later: testing connections and writing cannot-afford-to-fail transactions.

** Or if you're really keen you can turn them into automated tests using PostgreSQL's isolationtester, in src/test/isolation

This Post Has 12 Comments

  1. lesovsky says:

    Some sql queries are incorrect, “update balance” but must be “update accounts”, fix it please.

  2. Thormick says:

    A small omission in one of the queries: Under “Row level locking” the second session’s select needs FOR UPDATE as well, otherwise it’ll go right ahead and return the value.

    • craig.ringer says:

      Thanks. I knew I should’ve written isolationtester scripts first, then converted them.

      Writing SQL in HTML tables is a special experience I’d prefer not to repeat. Ever. Again.

  3. Shaun Thomas says:

    Very nice, Craig. This is definitely something I don’t see addressed often.

    This can also be addressed particularly through stored procedures, but that would take a much longer post. :)

  4. Dan says:

    Whats wrong with using the following?

    UPDATE accounts SET balance = balance – 100 WHERE user_id = 1;

    If you wanted the balanced returned after this, you can add ‘RETURNING balance’ on the end (obv the 100 would be a param and not a static number).

    • craig.ringer says:

      As covered under “Avoiding the read-modify-write cycle”, in simple cases that’s just fine, and usually the best choice where it’s possible.

      If your predicate logic is more complex and requires application interaction (say “update the balance only if they have enough to cover the request in their account and they send a correct SMS verification token”), not so much.

      That’s why it’s important to understand all the options. Different concurrency tools are suitable for different problems.

      These are all contrived examples because a real world example would be complex enough that most of the energy would be spent explaining the example, not the problem and solutions.

  5. Lachlan says:

    There is also the option of adding a condition on the UPDATE statement to verify the original known value of the balance so that the record is only modified if the original value has not changed:

    – eg
    UPDATE accounts SET balance = 200 WHERE user_id = 1 AND balance = 300;
    – or
    UPDATE accounts SET balance = balance – 100 WHERE user_id = 1 AND balance = 300;

    The query will return a count of either 0 or 1 records updated so it is simple to confirm whether the update has been executed and (possibly) refresh the client-side copy of the record and retry if no update took place.
    This solution requires no special transaction blocks, isolation level or row-level locking and allows for graceful recovery.

    • craig.ringer says:

      Yep, that’s essentially a special case of row versioning where you test every field instead of just a dedicated version column. It’s supported natively by psqlODBC for example, and works OK for small rows.

      It tends to fall down a bit with bigger rows. It can also be problematic in cases where you care about any change to the row, even if a sequence of changes might return it to the original state you first saw.

  6. Gilles says:

    If you use JDBC connection pooling, I guess you have to follow the “Optimistic concurrency control” way as you cannot guarantee to get the same connection for SELECT and UPDATE. Corrrect ?

    • craig.ringer says:

      With JDBC you usually use a connection pool at a layer above the driver, like DBCP or C3P0, or an app server’s built in pool.

      These give you a connection as a wrapper object until you return it to the pool. You open a transaction, do the work, and close the connection wrapper, which returns the underlying connection to the pool.

      So long as you keep the connection object (and keep the transaction open) across the whole operation then that will be fine.

      If your app returns connections to the pool between a SELECT … FOR UPDATE and subsequent UPDATE then yes, that’s useless, because it’ll do the work in two separate transactions; the row lock taken by the FOR UPDATE is cleared when the transaction is rolled back/committed upon returning the connection to the pool (if you don’t do this yourself). If it wasn’t you’d get stuck when you tried to UPDATE from a different connection, so either way it can’t work.

      Just keep the connection. If you can’t do that – say, if it’s over user think-time or between separate HTTP requests – then yes, you’ll want to use optimistic concurrency control. That’s why it’s so popular with Java ORMs like Hibernate that’re heavily used with web apps.

  7. Very good article. One of the major benefit for using an ORM tool is probably the optimistic locking mechanism with the “transactional write-behind” flushing mechanism coming closely.

    Like you said, SERIALIZABLE gives you the ultimate “pessimistic locking” “lost update” collision detection, but according to Amdahl’s Law the more serial the code the less you can benefit from parallelization.

    My all time favorite is Optimistic Locking with an automatic retry mechanism (http://vladmihalcea.com/2013/11/15/optimistic-locking-retry-with-jpa/).

Leave A Reply