Preamble
I’ve been intrigued by the prospect of using PostgreSQL’s SKYLINE OF operator ever since Hannes Eder first proposed the concept on the PostgreSQL mailing list years ago. So what does a “Skyline query” actually mean? The basic idea is as follows: Consider that you want to take a vacation and are looking for a nice hotel by the water. The problem is that the hotels closer to the beach are overpriced while those farther back are inexpensive but far from the water. What constitutes the ideal compromise?
Exactly this is the topic of this article.
Here’s an illustration:
test=# CREATE TABLE t_hotel ( id serial, name text, price numeric, distance_beach numeric ); CREATE TABLE
The table stores the name of a hotel, the price and the distance to the beach. Let us add a couple of rows manually:
test=# INSERT INTO t_hotel (name, price, distance_beach) VALUES ('ABC Motel', 120, 2.4); INSERT 0 1 test=# INSERT INTO t_hotel (name, price, distance_beach) VALUES ('Crapstone Hotel', 90, 2.2); INSERT 0 1 test=# INSERT INTO t_hotel (name, price, distance_beach) VALUES ('Luxury Watch Spa Hotel', 495, 0.2); INSERT 0 1 test=# INSERT INTO t_hotel (name, price, distance_beach) VALUES ('Nowhere Middle Hotel', 45, 9.6); INSERT 0 1
If we choose our hotels and arrange them according to price, we will see that we are likely to stay in a cheap, subpar hotel that is far from the beach. This is obviously undesirable:
test=# SELECT * FROM t_hotel ORDER BY price; id | name | price | distance_beach ----+------------------------+-------+---------------- 4 | Nowhere Middle Hotel | 45 | 9.6 2 | Crapstone Hotel | 90 | 2.2 1 | ABC Motel | 120 | 2.4 3 | Luxury Watch Spa Hotel | 495 | 0.2 (4 rows)
Even though we will be close to the beach if we sort by distance, we won’t be able to afford it. Unfortunately, none of those questions will actually provide us with a good compromise:
test=# SELECT * FROM t_hotel ORDER BY distance_beach; id | name | price | distance_beach ----+------------------------+-------+---------------- 3 | Luxury Watch Spa Hotel | 495 | 0.2 2 | Crapstone Hotel | 90 | 2.2 1 | ABC Motel | 120 | 2.4 4 | Nowhere Middle Hotel | 45 | 9.6 (4 rows)
More advanced ordering in PostgreSQL
Thankfully, PostgreSQL enables the use of more complex sort criteria. It is tedious to sort by a single column. We want to handle various columns differently in some way. Customers may feel that distance in this situation is not truly linear. It doesn’t really matter if you are 20 or 50 meters from the beach any more. However, it really doesn’t matter if you’re 50 meters or 1 km away. I chose to use the square root of the distance to simplify things while keeping the price the same. The outcome appears much more favorable than before:
test=# SELECT price * sqrt(distance_beach), * FROM t_hotel ORDER BY 1; ?column? | id | name | price | distance_beach --------------+----+------------------------+-------+---------------- 133.491572 | 2 | Crapstone Hotel | 90 | 2.2 139.427400 | 4 | Nowhere Middle Hotel | 45 | 9.6 185.903200 | 1 | ABC Motel | 120 | 2.4 221.37072977 | 3 | Luxury Watch Spa Hotel | 495 | 0.2 (4 rows)
The Crapstone Hotel appears to be the best value in this area. Although it is not the cheapest hotel, it is fairly close and still affordable, so it might be best to reserve that one.
Looking at this tiny PostgreSQL query’s execution plan is where the trouble begins.
test=# explain SELECT price * sqrt(distance_beach), * FROM t_hotel ORDER BY 1; QUERY PLAN ------------------------------------------------------------------ Sort (cost=48.74..50.32 rows=630 width=132) Sort Key: ((price * sqrt(distance_beach))) -> Seq Scan on t_hotel (cost=0.00..19.45 rows=630 width=132) (3 rows)
All the data will be read by PostgreSQL and sorted using our specific criteria. Although this is nice for a small data set, if the amount of data keeps increasing, it will kill us.
Scaling up: Increasing the size of our data set
Let’s experiment by adding 5 million rows to our table to see what happens:
test=# TRUNCATE t_hotel ; TRUNCATE TABLE test=# INSERT INTO t_hotel (price, distance_beach) SELECT 40 + random() * 200, random() * 15 FROM generate_series(1, 5000000); INSERT 0 5000000
Clearly, loading all of this data was not difficult, but observe what happens next:
test=# \timing Timing is on. test=# SELECT price * sqrt(distance_beach), price, distance_beach FROM t_hotel ORDER BY 1 LIMIT 10; ?column? | price | distance_beach ------------------------------+------------------+------------------------- 0.15700293278521447089153382 | 199.127877652645 | 0.000000621657818555832 0.3200968902259212440086465 | 77.0173728093505 | 0.0000172737054526806 0.3452837023672139082940331 | 59.0800635144114 | 0.0000341562554240227 ... (10 rows) Time: 18916.807 ms (00:18.917)
My laptop needed almost 19 seconds to run the query. The majority of users wouldn’t put up with this type of behavior for very long, so we need to find a way to make the runtime better.
Like every other database engine I’m aware of, PostgreSQL lacks the SKYLINE OF operator. However, functional indexes, which PostgreSQL provides, are perfect in this situation:
test=# CREATE FUNCTION rate_hotel(numeric, numeric) RETURNS numeric AS $$ SELECT $1 * sqrt($2) $$ LANGUAGE 'sql' IMMUTABLE; CREATE FUNCTION
Use of an IMMUTABLE function is crucial in this case. We must ensure that the function used to rank the data is completely deterministic and that, given the same input parameters, its output does not change over time.
It’s simple to create the index:
test=# CREATE INDEX idx_fix_hotel ON t_hotel (rate_hotel(price, distance_beach)); CREATE INDEX Time: 22706.882 ms (00:22.707) test=# ANALYZE ; ANALYZE Time: 354.660 ms
Speeding up the query using an index
The runtime of this query is reduced to about 1 millisecond thanks to our new index, which is about 20.000 faster than before. The outcome is identical:
test=# SELECT rate_hotel(price, distance_beach), price, distance_beach FROM t_hotel ORDER BY 1 LIMIT 10; rate_hotel | price | distance_beach ------------------------------+------------------+------------------------- 0.15700293278521447089153382 | 199.127877652645 | 0.000000621657818555832 0.3200968902259212440086465 | 77.0173728093505 | 0.0000172737054526806 0.3452837023672139082940331 | 59.0800635144114 | 0.0000341562554240227 ... (10 rows) Time: 1.024 ms
The execution plan demonstrates that PostgreSQL will go directly to the index and retrieve the required data. There is no need to sort or touch more than a few rows because PostgreSQL indexes return sorted data:
test=# explain SELECT rate_hotel(price, distance_beach), price, distance_beach FROM t_hotel ORDER BY 1 LIMIT 10; QUERY PLAN -------------------------------------------------------------------------- Limit (cost=0.43..1.12 rows=10 width=55) -> Index Scan using idx_fix_hotel on t_hotel (cost=0.43..346214.73 rows=4999993 width=55) (2 rows)
The approach I’ve just demonstrated is effective, simple to use, and appropriate for the majority of real-world situations. Without sacrificing performance, you can make your rating function somewhat sophisticated.
About Enteros
Enteros offers a patented database performance management SaaS platform. It finds the root causes of complex database scalability and performance problems that affect business across a growing number of cloud, RDBMS, NoSQL, and machine learning database platforms.
The views expressed on this blog are those of the author and do not necessarily reflect the opinions of Enteros Inc. This blog may contain links to the content of third-party sites. By providing such links, Enteros Inc. does not adopt, guarantee, approve, or endorse the information, views, or products available on such sites.
Are you interested in writing for Enteros’ Blog? Please send us a pitch!
RELATED POSTS
Why BFSI Leaders Are Turning to Enteros for Database Optimization, AI Ops, and Cloud FinOps Excellence
- 16 April 2026
- Database Performance Management
Introduction The Banking, Financial Services, and Insurance (BFSI) sector is undergoing a massive digital transformation. With the rise of digital banking, real-time payments, fraud detection systems, and AI-driven financial services, organizations are becoming increasingly dependent on high-performance data infrastructure. From managing millions of transactions per second to enabling real-time risk analysis and personalized customer experiences, … Continue reading “Why BFSI Leaders Are Turning to Enteros for Database Optimization, AI Ops, and Cloud FinOps Excellence”
How to Optimize Telecom Sector Growth with Enteros AIOps Platform, Resource Metadata, Hierarchy Metadata, Spot Instances, and RevOps Efficiency
Introduction The telecom sector is at the center of global digital transformation, enabling connectivity for billions of users, businesses, and emerging technologies like IoT, 5G, and edge computing. As demand for high-speed, reliable communication services continues to rise, telecom providers are under immense pressure to scale operations efficiently while maintaining performance and controlling costs. However, … Continue reading “How to Optimize Telecom Sector Growth with Enteros AIOps Platform, Resource Metadata, Hierarchy Metadata, Spot Instances, and RevOps Efficiency”
Who Should Adopt Enteros for Retail Growth Management with AI SQL and Cloud FinOps Efficiency
Introduction The retail sector is evolving at an unprecedented pace, driven by digital transformation, omnichannel experiences, and data-driven decision-making. From global eCommerce giants to mid-sized retail chains, businesses are increasingly relying on cloud infrastructure, databases, and analytics platforms to fuel growth. However, this rapid expansion introduces a fundamental challenge:how to scale efficiently while maintaining performance, … Continue reading “Who Should Adopt Enteros for Retail Growth Management with AI SQL and Cloud FinOps Efficiency”
How to Optimize Technology Sector Growth with Enteros Database Management Platform, Cloud FinOps, and RevOps Efficiency
Introduction The technology sector is at the forefront of innovation, powering digital transformation across industries. From SaaS platforms and cloud-native applications to AI-driven solutions, technology companies are scaling rapidly to meet growing global demand. However, this rapid expansion introduces a critical challenge:how to sustain growth while maintaining high-performance systems, controlling cloud costs, and aligning operations … Continue reading “How to Optimize Technology Sector Growth with Enteros Database Management Platform, Cloud FinOps, and RevOps Efficiency”