Preamble
I’ve already covered a few specifics of zheap, an alternative PostgreSQL storage engine. The goal of zheap is to make UPDATE statements more efficient and to keep tables from growing needlessly. When a row is updated, standard PostgreSQL keeps two copies of that row in the same data file. While not harmful as such, in some circumstances this leads to table bloat. zheap instead moves old row versions to an undo log, similar to the way Oracle handles them. The missing piece so far was that this zheap undo was never cleaned up. That is why we created a “discard worker” to take care of the cleanup: it removes undo logs that are no longer needed and thereby minimizes UPDATE-related table bloat.
Getting PostgreSQL and zheap
To see what the discard worker does, you first need a PostgreSQL build with zheap support. To make things simpler for everyone, we created a container that lets you test zheap immediately.
zheap was originally intended to be an extension. However, unlike other extensions, it cannot simply be installed on top of PostgreSQL; it is far too intrusive for that. zheap does need core patches.
Additionally, I want to emphasize that this is a tech preview rather than a production-ready version. We’re delighted to make it available to the public in order to solicit comments and perhaps even some bug reports.
A free Docker container is available. The installation process is quick and straightforward:
[hs@fedora ~]$ docker run --name zHeap -p 5432:5432 \
     -d cybertecpostgresql/zheap
Connect to the newly created instance with psql as usual. The default username and password are both postgres:
Username: postgres
Password: postgres
[hs@fedora ~]$ psql -h localhost -p 5432 -U postgres

UNDO: Cleaning up using the discard worker
Once PostgreSQL with zheap support is operational, we can look at the process table as follows:
[hs@fedora ~]$ ps axf | grep post
...
2403084 ?  Ss  0:00  \_ /home/hs/pgzheap/bin/postgres -D ./dbzheap
2403086 ?  Ss  0:00      \_ postgres: checkpointer
2403087 ?  Ss  0:00      \_ postgres: background writer
2403088 ?  Ss  0:00      \_ postgres: walwriter
2403089 ?  Ss  0:00      \_ postgres: autovacuum launcher
2403090 ?  Ss  0:00      \_ postgres: stats collector
2403091 ?  Ss  0:00      \_ postgres: undo discard worker
2403093 ?  Ss  0:00      \_ postgres: logical replication launcher
...
The discard worker is launched automatically. Let’s take a look at what it does and how the undo log is stored.
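If you want to check for the worker programmatically rather than eyeballing the process table, a minimal Python sketch could look like the following. The `ps` invocation is an assumption about your platform; the process title matches the output shown above:

```python
import subprocess


def find_undo_discard_worker(ps_output: str) -> bool:
    """Return True if a 'postgres: undo discard worker' line is present."""
    return any("undo discard worker" in line
               for line in ps_output.splitlines())


def discard_worker_running() -> bool:
    """Run ps and scan its output for the discard worker process."""
    out = subprocess.run(["ps", "ax"],
                         capture_output=True, text=True).stdout
    return find_undo_discard_worker(out)
```

Parsing the process title is, of course, just a quick check; on a production system you would rather inspect `pg_stat_activity`.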
Before we can create a table and populate it with data, one setting needs to be changed:
test=# SHOW default_table_access_method;
 default_table_access_method
-----------------------------
 heap
(1 row)

test=# SET default_table_access_method TO zheap;
SET
When a new table is created, this setting tells PostgreSQL which storage engine to use by default. Usually that is heap; in this instance, though, the preferred option is zheap. We could add the zheap option to each CREATE TABLE individually. In many circumstances, however, you may want to set it for the duration of the session, as I just did:
test=# CREATE TABLE t_large (id serial, name text);
CREATE TABLE
It doesn’t really matter how the table is set up. Any layout will do.
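For completeness: the per-table alternative mentioned above uses PostgreSQL’s USING clause, which selects the access method for a single table regardless of the session default. The table name here is purely illustrative:

```sql
-- hypothetical example table; USING picks the access method per table
CREATE TABLE t_demo (id serial, name text) USING zheap;
```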
Testing zheap UNDO with INSERT load
Let’s begin loading data:
test=# BEGIN;
BEGIN
test=*# INSERT INTO t_large (name)
        SELECT 'dummy' FROM generate_series(1, 10000000);
INSERT 0 10000000
test=*# \d+
                                      List of relations
 Schema |      Name      |   Type   | Owner | Persistence | Access method |    Size    | Description
--------+----------------+----------+-------+-------------+---------------+------------+-------------
 public | t_large        | table    | hs    | permanent   | zheap         | 193 MB     |
 public | t_large_id_seq | sequence | hs    | permanent   |               | 8192 bytes |
(2 rows)
It is crucial that zheap can roll back in the event of an error, which means a great deal of undo has to be written. Before we commit the transaction, let’s take a look at the undo directories:
[hs@fedora dbzheap]$ ls -l ./pg_undo/ && ls -l ./base/undo/
total 8
-rw-------. 1 hs hs      52 12. Nov 11:47 00000001A84012A0
-rw-------. 1 hs hs      52 12. Nov 11:47 00000001C9F48970
total 432128
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.0065A00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.0065B00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.0065C00000
...
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.007FD00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.007FE00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.007FF00000
As you can see, we’ve created a fairly large number of files in our base/undo directory. Let’s commit and see what happens:
test=*# COMMIT;
COMMIT
Once the new discard worker kicks in, PostgreSQL will recycle (= delete) these logs. Let’s take a closer look:
[hs@fedora dbzheap]$ ls -l ./pg_undo/ && ls -l ./base/undo/
total 8
-rw-------. 1 hs hs      52 12. Nov 11:47 00000001A84012A0
-rw-------. 1 hs hs      52 12. Nov 11:47 00000001C9F48970
total 2048
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.007FF00000
-rw-------. 1 hs hs 1048576 12. Nov 11:47 000000.0080000000
As expected, the files are gone.
Testing zheap UNDO with UPDATE load
The same will happen when we run UPDATE statements:
test=# BEGIN;
BEGIN
test=*# UPDATE t_large SET id = id - 1;
UPDATE 10000000
The crucial point is that a lot of undo has been produced. As you can see, the undo directory is roughly 600 MB in size:
[hs@fedora dbzheap]$ cd base/undo/
[hs@fedora undo]$ du -h
603M	.
Now, let’s commit the transaction:
test=*# COMMIT;
COMMIT
What follows the commit statement is particularly intriguing:
[hs@fedora undo]$ date && du -h
Fr 12. Nov 11:55:50 CET 2021
603M	.
[hs@fedora undo]$ date && du -h
Fr 12. Nov 11:55:57 CET 2021
2,0M	.
That’s it: the zheap UNDO logs have been deleted. Note that the cleanup is NOT part of COMMIT; it is carried out by the discard worker, whose job is also to ensure that the log is not discarded too soon. Keep in mind that yours might not be the only transaction at work, so the cleanup has to be asynchronous (just like VACUUM).
About Enteros
Enteros offers a patented database performance management SaaS platform. It proactively identifies root causes of complex business-impacting database scalability and performance issues across a growing number of clouds, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.