Preamble
Sorting is a key component of PostgreSQL performance tuning, yet it is frequently misunderstood or simply ignored. I decided to write this PostgreSQL blog post to show how sorting can be tuned in PostgreSQL.
Creating sample data
To demonstrate how sorting works, I first created a few million rows:
test=# CREATE TABLE t_test (x numeric);
CREATE TABLE
test=# INSERT INTO t_test SELECT random()
         FROM generate_series(1, 5000000);
INSERT 0 5000000
test=# ANALYZE;
ANALYZE
The code creates a table and loads 5 million random values into it. The data loads in just a few seconds.
Sorting data in PostgreSQL
Let's try to sort the data. I'm using the simplest possible statement to keep things clear. As you can see, PostgreSQL cannot fit the data we want to sort in memory, so it must sort the data on disk. A little more than 100 MB of data is written to disk in this case:
test=# explain analyze SELECT * FROM t_test ORDER BY x;
                               QUERY PLAN
--------------------------------------------------------------------------
 Sort  (cost=804270.42..816770.42 rows=5000000 width=11) (actual time=4503.484..6475.222 rows=5000000 loops=1)
   Sort Key: x
   Sort Method: external merge  Disk: 102896kB
   ->  Seq Scan on t_test  (cost=0.00..77028.00 rows=5000000 width=11) (actual time=0.035..282.427 rows=5000000 loops=1)
 Planning time: 0.139 ms
 Execution time: 6606.637 ms
(6 rows)
Why doesn't PostgreSQL sort all the data in memory? The reason is the work_mem parameter, which has a default value of 4 MB:
test=# SHOW work_mem;
 work_mem
----------
 4MB
(1 row)
work_mem tells the server that a maximum of 4 MB may be used per operation (per sort, grouping operation, etc.). Sorting a large amount of data therefore forces PostgreSQL to spill a lot of data to disk, which obviously takes time.
Thankfully, changing work_mem is simple and can even be done at the session level.
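As a minimal sketch of the scopes at which work_mem can be changed (the role name reporting is hypothetical, and the exact values are only examples):

```sql
-- Session level: affects only the current connection.
SET work_mem TO '256MB';

-- Transaction level: reverts automatically at COMMIT / ROLLBACK.
BEGIN;
SET LOCAL work_mem TO '256MB';
COMMIT;

-- Per role: applies to future sessions of that role.
ALTER ROLE reporting SET work_mem TO '256MB';

-- Instance-wide: written to postgresql.auto.conf;
-- takes effect after a configuration reload.
ALTER SYSTEM SET work_mem TO '64MB';
SELECT pg_reload_conf();
```

For the rest of this post, a simple session-level SET is all we need.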
Speeding up sorts in PostgreSQL – using more work_mem
Let's change work_mem for the duration of the current session and see what happens to the previous example.
test=# SET work_mem TO '1 GB';
SET
Using SET is the most convenient way to change work_mem on the fly. Here I've set the parameter to 1 GB, so PostgreSQL now has enough RAM to do the work in memory:
test=# explain analyze SELECT * FROM t_test ORDER BY x;
                               QUERY PLAN
---------------------------------------------------------------------------
 Sort  (cost=633365.42..645865.42 rows=5000000 width=11) (actual time=1794.953..2529.398 rows=5000000 loops=1)
   Sort Key: x
   Sort Method: quicksort  Memory: 430984kB
   ->  Seq Scan on t_test  (cost=0.00..77028.00 rows=5000000 width=11) (actual time=0.075..296.167 rows=5000000 loops=1)
 Planning time: 0.067 ms
 Execution time: 2686.635 ms
(6 rows)
The effect on performance is astounding. Execution time has dropped by about 60%, from 6.6 seconds to roughly 2.7 seconds. As you can see, PostgreSQL now uses "quicksort" instead of "external merge Disk." If you want to speed up and tune sorting in PostgreSQL, there is no way around work_mem: it is the single most important knob you have. The nice thing is that work_mem is not only used to speed up sorts; it also benefits aggregations and other operations.
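To sketch that last point: the same parameter governs hash aggregates, so a grouping query can change plans when more memory is available. This is only an illustration; the exact plans you get depend on your PostgreSQL version and statistics:

```sql
-- With the default 4 MB, the aggregate may spill or the planner
-- may avoid hashing; with 1 GB it can aggregate fully in memory.
SET work_mem TO '4MB';
EXPLAIN SELECT x, count(*) FROM t_test GROUP BY x;

SET work_mem TO '1GB';
EXPLAIN SELECT x, count(*) FROM t_test GROUP BY x;
```

Comparing the two plans is a quick way to see whether a grouping operation is memory-starved.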
Taking care of partial sorts
As of version 10, PostgreSQL uses three different sort methods:
- external merge sort (on disk)
- quicksort
- top-N heapsort
If you only need a few sorted rows, "top-N heapsort" is what PostgreSQL will use. For instance: the top 10, the bottom 10, and so on. The efficient top-N heapsort returns the desired rows almost instantly:
test=# explain analyze SELECT * FROM t_test ORDER BY x LIMIT 10;
                               QUERY PLAN
----------------------------------------------------------------------------------
 Limit  (cost=185076.20..185076.23 rows=10 width=11) (actual time=896.739..896.740 rows=10 loops=1)
   ->  Sort  (cost=185076.20..197576.20 rows=5000000 width=11) (actual time=896.737..896.738 rows=10 loops=1)
         Sort Key: x
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Seq Scan on t_test  (cost=0.00..77028.00 rows=5000000 width=11) (actual time=1.154..282.408 rows=5000000 loops=1)
 Planning time: 0.087 ms
 Execution time: 896.768 ms
(7 rows)
Wow, the query returns in less than one second.
Improving sorting: Consider indexing …
work_mem is a good way to accelerate sorts. However, in many cases it makes sense to avoid sorting in the first place. Indexes are a useful tool to provide the database engine with "sorted input." In fact, a btree is somewhat similar to a sorted list.
Sorting is also necessary when creating indexes (btrees). When the CREATE INDEX command was first introduced, PostgreSQL used work_mem to decide how much memory to use for index creation. This is no longer the case: in current versions of PostgreSQL, the maintenance_work_mem parameter tells DDL commands how much memory to use.
Here's an example:
test=# \timing
Timing is on.
test=# CREATE INDEX idx_x ON t_test (x);
CREATE INDEX
Time: 4648.530 ms (00:04.649)
The default setting for maintenance_work_mem is 64 MB, but this can of course be changed:
test=# SET maintenance_work_mem TO '1 GB';
SET
Time: 0.469 ms
With more memory, index creation will be much faster:
test=# CREATE INDEX idx_x2 ON t_test (x);
CREATE INDEX
Time: 3083.661 ms (00:03.084)
In this situation, CREATE INDEX can sort the data using up to 1 GB of RAM, which is obviously much faster than using the disk. This is particularly helpful if you want to build big indexes.
With the proper index in place, the query runs much faster:
test=# explain analyze SELECT * FROM t_test ORDER BY x LIMIT 10;
                               QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=0.43..0.95 rows=10 width=11) (actual time=0.068..0.087 rows=10 loops=1)
   ->  Index Only Scan using idx_x2 on t_test  (cost=0.43..260132.21 rows=5000000 width=11) (actual time=0.066..0.084 rows=10 loops=1)
         Heap Fetches: 10
 Planning time: 0.130 ms
 Execution time: 0.119 ms
(5 rows)
In this example, the query takes well under a millisecond. If your database frequently sorts large amounts of data, consider using better indexes to speed things up instead of continually increasing work_mem.
Sorting in PostgreSQL and tablespaces
Many people use tablespaces to scale I/O. By default, PostgreSQL uses only a single tablespace, which can easily become a bottleneck. Tablespaces are a good way to add more hardware to a PostgreSQL installation.
Let's suppose you frequently sort a lot of data: administrators can use the temp_tablespaces parameter to control where temporary files are written to disk. Using a separate tablespace for temporary files can speed up sorting.
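A minimal sketch of the idea (the tablespace name and directory are hypothetical; the directory must already exist, be empty, and be owned by the postgres OS user):

```sql
-- Create a tablespace on a fast device and route temporary
-- files (including sort spill files) to it.
CREATE TABLESPACE fast_temp LOCATION '/ssd/pg_temp';
SET temp_tablespaces TO 'fast_temp';
```

temp_tablespaces can also list several tablespaces, in which case PostgreSQL spreads temporary files across them.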
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.