Preamble
Adding FOR UPDATE to a query can introduce transaction anomalies. This article discusses what is going on and what you can do about it. Get ready to learn some PostgreSQL implementation details!

The example
Preparing the situation
There is a table of bills and a table of the items on a bill:
CREATE TABLE bill (
   bill_id bigint PRIMARY KEY,
   total numeric(15,2) DEFAULT 0 NOT NULL
);

CREATE TABLE item (
   item_id bigint PRIMARY KEY,
   bill_id bigint REFERENCES bill NOT NULL,
   amount numeric(15,2) NOT NULL
);
Let’s add some information:
INSERT INTO bill (bill_id, total) VALUES (1, 60.0);

INSERT INTO item (item_id, bill_id, amount)
VALUES (101, 1, 10.0), (102, 1, 20.0), (103, 1, 30.0);
Adding a new item to the bill
Adding a new item would be a simple INSERT, but there is a catch: bill.total should always be the sum of item.amount for all items on that bill. This constraint introduces redundancy into the data model. Usually it is best not to store that sum in the database, but to compute it whenever you query the data (a sketch follows the list below). Still, there are reasons that can justify storing redundant data:
- calculating the sum on the fly is too expensive, for example if the calculation is performed very often or the bills are large
- you want to place a constraint on the sum, for example to guarantee that it is never negative
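To illustrate both points, here is a rough sketch (the CHECK constraint is a hypothetical addition, not part of the example schema above):

-- computing the total on the fly instead of storing it
SELECT bill_id, sum(amount) AS total
FROM item
GROUP BY bill_id;

-- if the total is stored anyway, a constraint can at least keep it from going negative
-- (hypothetical addition to the example schema)
ALTER TABLE bill ADD CONSTRAINT bill_total_not_negative CHECK (total >= 0);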
The proper way to maintain the total would be a trigger, but for simplicity's sake, we will do it explicitly in the transaction that adds the new item:
BEGIN;

INSERT INTO item (item_id, bill_id, amount)
VALUES (104, 1, 40.0);

UPDATE bill SET total = total + 40.0
WHERE bill_id = 1;

COMMIT;
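For reference, the trigger approach mentioned above might look like the following sketch (the function and trigger names are hypothetical, and a real implementation would also have to handle UPDATE and DELETE on item):

-- keep bill.total in sync when an item is inserted (sketch only)
CREATE FUNCTION maintain_bill_total() RETURNS trigger
   LANGUAGE plpgsql AS
$$BEGIN
   UPDATE bill
   SET total = total + NEW.amount
   WHERE bill_id = NEW.bill_id;
   RETURN NEW;
END;$$;

CREATE TRIGGER item_ins_trig
   AFTER INSERT ON item
   FOR EACH ROW EXECUTE FUNCTION maintain_bill_total();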
A concurrent report
We could use the following query to report the bill data:
SELECT bill_id, item.amount, bill.total
FROM bill
   LEFT JOIN item USING (bill_id);
If we run the report before the inserting transaction commits, the following will happen:
 bill_id │ amount │ total 
═════════╪════════╪═══════
       1 │  10.00 │ 60.00
       1 │  20.00 │ 60.00
       1 │  30.00 │ 60.00
(3 rows)
If we run the report after the inserting transaction commits, we get the following:
 bill_id │ amount │ total  
═════════╪════════╪════════
       1 │  10.00 │ 100.00
       1 │  20.00 │ 100.00
       1 │  30.00 │ 100.00
       1 │  40.00 │ 100.00
(4 rows)
Can there be any transaction anomalies in this example?
A transaction anomaly is a race condition in which multiple transactions executed concurrently produce a result that is impossible to achieve with any serialized execution of those transactions. Note that not all serialized execution orders need to produce the same result; in our example above, it makes a difference whether we run the report before or after the inserting transaction.
In our example, concurrent execution will never result in an anomaly. No matter what we do, the report will always be consistent, and the total will match the sum of the items. The explanation is that PostgreSQL takes a snapshot at the beginning of the reporting query. Such a snapshot determines which transactions the query can see and which it cannot. Either the inserting transaction is part of the snapshot or it is not, so the query will see either all or none of its data modifications.
A surprising transaction anomaly with SELECT … FOR UPDATE
We can add FOR UPDATE to the reporting query, for example if we plan to delete bills that have no items. FOR UPDATE takes a row lock on the rows returned by the query, which is useful to keep other transactions from modifying these rows:
SELECT bill_id, item.amount, bill.total
FROM bill
   LEFT JOIN item USING (bill_id)
FOR UPDATE OF bill;
This only locks the rows in bill. Note that it would be an error to try to lock the rows returned from item (the reason for that would be material for another article). Also note that FOR UPDATE is the correct lock if we intend to DELETE, while the weaker FOR NO KEY UPDATE lock would be appropriate if we intend to UPDATE.
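For example, if we only planned to UPDATE the bills afterwards, the query might use the weaker lock like this (a sketch):

SELECT bill_id, item.amount, bill.total
FROM bill
   LEFT JOIN item USING (bill_id)
FOR NO KEY UPDATE OF bill;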
Now the timing is crucial: we run the modified reporting query after the inserting transaction has performed the UPDATE, but before the COMMIT. That causes the reporting query to hang (because the row in bill is locked). After the COMMIT, the lock is released, and we get the following output:
 bill_id │ amount │ total  
═════════╪════════╪════════
       1 │  10.00 │ 100.00
       1 │  20.00 │ 100.00
       1 │  30.00 │ 100.00
(3 rows)
Now we have an inconsistent result: the sum of the parts is no longer equal to the whole. Even though the new item was added before the total was updated, we see the updated total but not the new item.
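To recap the interleaving that produces this result, the two sessions run roughly like this (a sketch of the statements shown above):

-- Session 1: BEGIN;
-- Session 1: INSERT INTO item (item_id, bill_id, amount) VALUES (104, 1, 40.0);
-- Session 1: UPDATE bill SET total = total + 40.0 WHERE bill_id = 1;
-- Session 2: SELECT ... FROM bill LEFT JOIN item USING (bill_id) FOR UPDATE OF bill;
--            (blocks, waiting for the row lock on bill)
-- Session 1: COMMIT;
-- Session 2: the lock is released and the query returns the inconsistent result above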
Explanation for the transaction anomaly
To understand what is happening, we first have to look at how PostgreSQL processes the query. For that, we use EXPLAIN to see the execution plan:
EXPLAIN (COSTS OFF)
SELECT bill_id, item.amount, bill.total
FROM bill
   LEFT JOIN item USING (bill_id)
FOR UPDATE OF bill;

                    QUERY PLAN                     
═══════════════════════════════════════════════════
 LockRows
   ->  Hash Right Join
         Hash Cond: (item.bill_id = bill.bill_id)
         ->  Seq Scan on item
         ->  Hash
               ->  Seq Scan on bill
(6 rows)
As you can see, locking the rows is the last thing that happens. So PostgreSQL first scans bill, which results in the old version of the data (total = 60), because that is what the query's snapshot sees. The query also sees only the three original entries from item. But when PostgreSQL tries to lock the row from bill, it blocks behind the concurrent transaction's row lock. What happens then is described in the documentation (the "second updater" is our SELECT … FOR UPDATE, emphasis mine):
In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). […] If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row.
In other words, the row version that gets locked (and returned) may be one that is normally not visible in the query's snapshot.
PostgreSQL calls this process of re-evaluating the operation with the latest row version "EvalPlanQual" (or "EPQ" for short), after the function that implements it. It is worth reading the README in the source code.
Is that a PostgreSQL bug?
No, it isn't. First of all, it works as designed. The behavior is also sanctioned by the SQL standard: our query was executed with the default transaction isolation level READ COMMITTED, and all that the standard requires is that a READ COMMITTED transaction never sees data from uncommitted transactions. Only on the SERIALIZABLE transaction isolation level does the SQL standard require that you never encounter a transaction anomaly.
Consider the alternatives PostgreSQL would have in this case, and decide whether they would be preferable:
- PostgreSQL could lock the old version of the row. That would make no sense, because a subsequent UPDATE can only modify the most recent version of a row, which would then differ from the version that was locked.
- PostgreSQL could abandon everything, take a new snapshot, and start over from the beginning when the lock is released and the row has changed. With many concurrent transactions, that could mean that a query keeps restarting and never finishes.
- PostgreSQL could throw a serialization failure. The SQL standard explicitly allows that "when [the database] detects the inability to guarantee the serializability of two or more concurrent SQL-transactions". But that would mean you have to be prepared to retry transactions even on the READ COMMITTED isolation level, which would be cumbersome. It is a desirable feature that PostgreSQL does not throw serialization failures in READ COMMITTED transactions!
How can I avoid these transaction anomalies?
Using a higher transaction isolation level is the proper way to prevent transaction anomalies. The SQL standard requires that SERIALIZABLE transactions be free from anomalies (even though Oracle doesn't seem to have gotten the message). In PostgreSQL, however, it is enough to use the REPEATABLE READ isolation level. The reason is that, on the REPEATABLE READ isolation level, PostgreSQL uses the same snapshot for the whole transaction, and if you try to modify a row that was changed by another transaction after the snapshot was taken, you get a serialization failure. Consequently, SELECT ... FOR UPDATE, which should behave consistently with UPDATE, does the same thing:
BEGIN ISOLATION LEVEL REPEATABLE READ;

SELECT bill_id, item.amount, bill.total
FROM bill
   LEFT JOIN item USING (bill_id)
FOR UPDATE OF bill;

ERROR:  could not serialize access due to concurrent update
Then you have to do what you always have to do when you get a serialization failure: roll back and retry the transaction.
ROLLBACK;

BEGIN ISOLATION LEVEL REPEATABLE READ;

SELECT bill_id, item.amount, bill.total
FROM bill
   LEFT JOIN item USING (bill_id)
FOR UPDATE OF bill;

 bill_id │ amount │ total  
═════════╪════════╪════════
       1 │  10.00 │ 100.00
       1 │  20.00 │ 100.00
       1 │  30.00 │ 100.00
       1 │  40.00 │ 100.00
(4 rows)
Unlike SERIALIZABLE, REPEATABLE READ transactions are cheap in PostgreSQL, so this is always a viable solution.
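If you want this behavior throughout, you can even make it the default isolation level (a sketch; "mydb" is a placeholder database name, and the setting can also go into postgresql.conf):

-- make REPEATABLE READ the default for new transactions in this database
ALTER DATABASE mydb SET default_transaction_isolation = 'repeatable read';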
Conclusion
You can get surprising transaction anomalies with SELECT ... FOR UPDATE on the default READ COMMITTED transaction isolation level. The simple way to avoid them is to use a REPEATABLE READ transaction.
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of clouds, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!
Introduction The healthcare sector is undergoing a digital revolution unlike anything seen before. From AI-assisted diagnostics and precision medicine to telehealth platforms and clinical research systems, today’s healthcare organizations rely heavily on massive data ecosystems. Databases power everything — electronic health records (EHRs), patient management systems, revenue cycle applications, insurance claim platforms, imaging archives, and … Continue reading “The Future of Healthcare Databases: Enteros’ GenAI and AI Performance Management at Work”