Preamble
PostgreSQL allows end users to store BLOBs (binary large objects) in the database, and many applications use these functions to manage binary data directly in SQL. Whether or not this is a good idea has been debated at length, and we expect that particular conversation to go on forever. If you are in the “pro BLOB” camp, however, we want to offer some tips on how to handle binary data in PostgreSQL as efficiently as possible.
Loading files into the database
To show how BLOBs work, I have created a simple file that we can import into the database later:
iMac:~ hs$ echo abcdefgh > /tmp/file.txt
To import this file, call the lo_import function (as a superuser) and pass it the name of the file you want to load:
test=# SELECT lo_import('/tmp/file.txt');
 lo_import
-----------
     98425
(1 row)
As you can see, PostgreSQL returns a number (the object ID). Since the filename has “vanished,” you have to keep track of those values yourself if you want to find the object again. A simple table does the job:
test=# CREATE TABLE t_file (name text, oid_number oid);
CREATE TABLE
test=# INSERT INTO t_file
       VALUES ('/tmp/file.txt', lo_import('/tmp/file.txt'))
       RETURNING *;
     name      | oid_number
---------------+------------
 /tmp/file.txt |      98432
(1 row)
Object IDs can be stored in a column of PostgreSQL’s oid data type. If the path is needed later, the filename can simply be kept in a text or varchar column. Note that the large object is entirely independent of the file in the filesystem; saving the filename serves only as a reminder of what was imported.
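As a quick sanity check that only the OID matters, the server-side lo_export function can write a large object back out to a file on the database server. A minimal sketch (the target path /tmp/file_copy.txt is just an example; like lo_import at the SQL level, this requires superuser privileges because the file is written on the server):

test=# SELECT lo_export(oid_number, '/tmp/file_copy.txt') FROM t_file LIMIT 1;

On success lo_export returns 1, and the exported file is byte-for-byte identical to the original, no matter what name we recorded in the table.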
A single file is not much of a challenge, though. Let’s see what happens if we import it a million times:
test=# INSERT INTO t_file
       SELECT '/tmp/file.txt', lo_import('/tmp/file.txt')
       FROM generate_series(1, 1000000);
INSERT 0 1000000
test=# TABLE t_file LIMIT 10;
     name      | oid_number
---------------+------------
 /tmp/file.txt |      98432
 /tmp/file.txt |      98433
 /tmp/file.txt |      98434
 /tmp/file.txt |      98435
 /tmp/file.txt |      98436
 /tmp/file.txt |      98437
 /tmp/file.txt |      98438
 /tmp/file.txt |      98439
 /tmp/file.txt |      98440
 /tmp/file.txt |      98441
(10 rows)
The import went through without a problem and, as the listing shows, every imported copy received its own object ID.
Behind the scenes: How PostgreSQL stores BLOBs
Now that you have seen how to import data, let’s look at how PostgreSQL stores it internally. Behind the scenes, the data ends up in the pg_largeobject system catalog:
test=# \d pg_largeobject
              Table "pg_catalog.pg_largeobject"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 loid   | oid     |           | not null |
 pageno | integer |           | not null |
 data   | bytea   |           | not null |
Indexes:
    "pg_largeobject_loid_pn_index" UNIQUE, btree (loid, pageno)
The loid column contains the object ID we stored in our table, and the actual data is kept in a bytea (byte array) column. Because a large object can be quite big, PostgreSQL splits it into smaller chunks, one row per chunk, numbered by pageno.
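To see the chunking, you can count rows per object ID in pg_largeobject; with the default block size, each chunk holds at most 2 kB of data. A small sketch (superuser access required):

test=# SELECT loid, count(*) AS chunks, sum(length(data)) AS total_bytes
       FROM pg_largeobject
       GROUP BY loid
       ORDER BY loid
       LIMIT 3;

Our nine-byte test file fits into a single chunk per object; a file of several megabytes would be spread across many pageno values.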
bytea fields are easy to display, but in the default output format they are not really human readable (which hardly matters, since we are dealing with binary data anyway). Because my example contains plain text, switching to the “escape” format makes it a little easier to read:
test=# SET bytea_output TO escape;
SET
test=# SELECT * FROM pg_largeobject WHERE loid = 98437 ORDER BY pageno;
 loid  | pageno |     data
-------+--------+--------------
 98437 |      0 | abcdefgh\012
(1 row)
As this example shows, the BLOB interface is essentially a wrapper around the bytea data type; that is where the real magic lies.
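The connection is also visible in plain SQL: since PostgreSQL 9.4 the lo_get function returns the contents of a large object directly as bytea, without going through the client-side large object API. For example, using one of the OIDs from above:

test=# SELECT lo_get(98437);

The result is the same bytea value we just saw in pg_largeobject.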
Removing BLOBs
Let’s move on to deleting BLOBs again. The right way to do this is to call the lo_unlink function, which removes the entry from the system catalog:
test=# SELECT lo_unlink(98432);
 lo_unlink
-----------
         1
(1 row)
Removing a single entry is easy. But let’s try to answer a harder question: what happens when a single transaction removes millions of BLOBs from the system?
Removing millions of BLOBs at a time
Executing the deletion is simple enough, but the outcome is somewhat unexpected:
test=# BEGIN;
BEGIN
test=*# SELECT lo_unlink(oid_number) FROM t_file;
ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.
test=!# ROLLBACK;
ROLLBACK
PostgreSQL cannot execute the transaction because it runs out of shared memory! How is that possible? Each large object has to be locked before it can be deleted, and shared memory can only hold a limited number of such locks.
Please note that row locks are NOT kept in shared memory; this applies only to object-level locks, such as table locks and large object locks.
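If you want to watch the lock table fill up, you can query the pg_locks view from a second session while the big transaction is still running; large object locks typically appear there with locktype “object”. A rough sketch:

test=# SELECT locktype, count(*)
       FROM pg_locks
       GROUP BY locktype
       ORDER BY count(*) DESC;

The count grows with every lo_unlink call until the lock table in shared memory is exhausted.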
That is why we run out of memory. The key question is therefore: how many locks can be stored? Three configuration variables (found in postgresql.conf) hold the answer:
test=# SHOW max_locks_per_transaction;
 max_locks_per_transaction
---------------------------
 64
(1 row)

test=# SHOW max_connections;
 max_connections
-----------------
 100
(1 row)

test=# SHOW max_prepared_transactions;
 max_prepared_transactions
---------------------------
 0
(1 row)
This gives me the following number of lock slots at my disposal:

number_of_locks = max_locks_per_transaction * (max_connections + max_prepared_transactions)

In my case that is 64 * (100 + 0) = 6,400. In other words, there is no way to process these deletions in a single transaction.
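By the way, you can let PostgreSQL do the arithmetic for you with a small query based on current_setting:

test=# SELECT current_setting('max_locks_per_transaction')::int
            * (current_setting('max_connections')::int
               + current_setting('max_prepared_transactions')::int) AS lock_slots;
 lock_slots
------------
       6400
(1 row)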
There are essentially two approaches to this issue:
- Change the PostgreSQL configuration: raising max_locks_per_transaction requires a restart.
- Remove the BLOBs in chunks, committing in between (see the sketch after this list).
These are the only methods we have to remove the BLOBs from the database.
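Here is a minimal sketch of the chunked approach, assuming PostgreSQL 11 or later (which allows COMMIT inside a DO block executed outside an explicit transaction). It unlinks the large objects referenced by t_file in batches of 1,000, committing in between so that no single transaction exceeds the lock limit; OIDs that were already unlinked are skipped:

DO $$
DECLARE
    r       record;
    counter integer := 0;
BEGIN
    -- only touch OIDs that still exist, in case some were unlinked already
    FOR r IN SELECT oid_number
             FROM t_file
             WHERE oid_number IN (SELECT oid FROM pg_largeobject_metadata)
    LOOP
        PERFORM lo_unlink(r.oid_number);
        counter := counter + 1;
        IF counter % 1000 = 0 THEN
            COMMIT;   -- release the locks acquired so far
        END IF;
    END LOOP;
END;
$$;

Afterwards the bookkeeping rows in t_file can be deleted in one go. Alternatively, you can raise the limit (for example with ALTER SYSTEM SET max_locks_per_transaction = 1024;), but that change only takes effect after a restart.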
Dealing with orphan BLOBs
Occasionally, BLOBs are orphaned: the reference to the OID is lost, but the large object itself was never unlinked. For this situation, a command-line tool is available:
iMac:~ hs$ vacuumlo --help
vacuumlo removes unreferenced large objects from databases.

Usage:
  vacuumlo [OPTION]... DBNAME...

Options:
  -l, --limit=LIMIT         commit after removing each LIMIT large objects
  -n, --dry-run             don't remove large objects, just show what would be done
  -v, --verbose             write a lot of progress messages
  -V, --version             output version information, then exit
  -?, --help                show this help, then exit

Connection options:
  -h, --host=HOSTNAME       database server host or socket directory
  -p, --port=PORT           database server port
  -U, --username=USERNAME   user name to connect as
  -w, --no-password         never prompt for password
  -W, --password            force password prompt
vacuumlo searches for large objects that are no longer referenced anywhere and removes them. If your application is buggy or something else goes wrong, this is a good way to clean up a database.
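A typical invocation against the test database used throughout this post might look like this: first a dry run to see what would be removed, then the actual cleanup, committing after every 1,000 removed objects so the lock problem described above does not strike again:

iMac:~ hs$ vacuumlo --dry-run --verbose test
iMac:~ hs$ vacuumlo --limit=1000 test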
Finally …
If you are having trouble with BLOBs, we also offer PostgreSQL consulting and round-the-clock database support. Our highly competent team will be happy to help with your BLOB-related issues.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Introduction Space research depends on simulations that push technology to its limits. From modeling rocket launches to predicting orbital dynamics, these simulations generate massive streams of data. But increasingly, the bottleneck isn’t computing power—it’s the databases that store and process this information. When databases fail, simulations stall, research timelines slip, and millions in funding are … Continue reading “Space research simulations collapsing from DB overload”