Preamble
I recently encountered an intriguing support case that demonstrates how a problem’s root cause may occasionally be found in the most unlikely of places.
About table bloat
After an UPDATE or DELETE, PostgreSQL keeps the old version of a table row around, so concurrent sessions that are reading the row can still see it and don’t have to wait. But eventually this “garbage” has to be cleaned up, and that is what the autovacuum daemon is supposed to do.
In most cases, you won’t need to worry about that, but occasionally something will go wrong. The table then continues to expand because the old row versions are not deleted. This will not only waste storage space, but it will also make index scans take longer.
To get rid of the bloat, you can use VACUUM (FULL) and other tools like pg_squeeze. But it is important to find and fix the cause of the table bloat so that it does not reappear.
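To see whether a table is accumulating dead row versions in the first place, the statistics views can help. A query along these lines will list suspicious tables (a sketch; the 10000 threshold is an arbitrary example, not a recommendation):

```sql
-- Sketch: user tables with many dead row versions, and when autovacuum
-- last processed them. Adjust the threshold to your workload.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC;
```

A table whose n_dead_tup keeps climbing while last_autovacuum stays empty is a candidate for the kind of investigation described below.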
The problem
A client of mine called me after noticing table bloat in the pg_attribute system catalog, which holds the metadata for table columns.
This can happen when tables or columns are frequently created and dropped. Most of the time, the culprit is temporary tables, which are dropped automatically when a session or transaction ends.
Temporary tables were used extensively by the client. But the problem showed up on only a few of the client’s many Linux database servers.
Searching the cause
I looked over the list of typical reasons for table bloat:
- Database sessions that linger in the “idle in transaction” state. Such sessions prevent autovacuum from removing row versions that became outdated after their transaction began.
- A data modification rate too high for autovacuum to keep up with. The appropriate response in that case is to make autovacuum more aggressive.

Neither applied here; the second possibility in particular was ruled out because it would have caused bloat on all machines, not just a select few.
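For reference, the first of those causes is easy to check for. A query against pg_stat_activity along these lines lists sessions stuck in an open transaction (a sketch):

```sql
-- Sessions that are "idle in transaction" hold back the xmin horizon,
-- so autovacuum cannot remove row versions newer than their snapshot.
SELECT pid, state, xact_start, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;
```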
I next examined the impacted table’s usage information:
dbname=> SELECT * FROM pg_stat_sys_tables
dbname->   WHERE relname = 'pg_attribute';

-[ RECORD 1 ]-------+-------------
relid               | 1249
schemaname          | pg_catalog
relname             | pg_attribute
seq_scan            | 167081
seq_tup_read        | 484738
idx_scan            | 1506941
idx_tup_fetch       | 4163837
n_tup_ins           | 0
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 0
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         |
last_autovacuum     |
last_analyze        |
last_autoanalyze    |
vacuum_count        | 0
autovacuum_count    | 0
analyze_count       | 0
analyze_count       | 0
autoanalyze_count   | 0
This shows that autovacuum has never run on the table. But what’s more interesting is that we can see why it didn’t run:

PostgreSQL never tried to remove any dead tuples (outdated row versions that could be cleaned up), because according to the statistics (n_dead_tup is 0) there aren’t any.
The statistics collector
Suspicion was confirmed when the following message was found in the server logs:
using stale statistics instead of current ones because stats collector is not responding
The statistics collector is the PostgreSQL background process that gathers usage statistics.
After each activity, PostgreSQL backends send statistics about that activity through a UDP socket on localhost that is created at PostgreSQL startup time. The statistics collector reads from the socket and aggregates the collected statistics.
Closing in on the problem
The statistics collector process was running:
  918     1 /usr/pgsql-10/bin/postmaster -D /var/lib/pgsql/10/data/
  947   918 postgres: logger process
  964   918 postgres: checkpointer process
  965   918 postgres: writer process
  966   918 postgres: wal writer process
  967   918 postgres: autovacuum launcher process
  968   918 postgres: stats collector process
  969   918 postgres: bgworker: logical replication launcher
I traced the statistics collector’s execution to see what it was doing and to spot any problems:
# strace -p 968
strace: Process 968 attached
epoll_pwait(3,
The statistics collector was waiting for messages on the UDP socket, but none were coming through.
I examined the UDP socket:
# netstat -u -n -p
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address    Foreign Address   State        PID/Program name
udp6       0      0 ::1:59517        ::1:59517         ESTABLISHED  918/postmaster
Nothing odd so far.
However, I got a different result when I tried the same thing on a machine that wasn’t having the issue:
# netstat -u -n -p
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address    Foreign Address   State        PID/Program name
udp        0      0 127.0.0.1:46031  127.0.0.1:46031   ESTABLISHED  9303/postmaster
It turned out that on all systems that experienced table bloat, the statistics collector socket was created on the IPv6 address for localhost, while all working systems were using the IPv4 address!
However, IPv6 was turned off for the loopback interface on all machines:
# ifconfig lo
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 6897  bytes 2372420 (2.2 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6897  bytes 2372420 (2.2 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
Nailing the bug to the wall
PostgreSQL uses the POSIX function getaddrinfo(3) to resolve localhost.
Because a working statistics collector is essential, PostgreSQL loops through the addresses returned by that call, creating and testing a UDP socket for each, until it finds one that works.
As a result, we can assume that IPv6 was functional at the time PostgreSQL was launched.
After more research, it was found that IPv6 was turned off during the boot process, but there was a race condition:
Sometimes PostgreSQL started after IPv6 had already been disabled, and sometimes before. And it was on the latter machines that the statistics collector stopped working and the tables became bloated!
The problem was fixed by changing the boot order so that IPv6 is always turned off before PostgreSQL starts.
Conclusion
This demonstrates once more that the root cause of a problem may not be where you first expect it. It also shows how valuable solid operating system knowledge is for a database administrator.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.