Preamble
PostgreSQL’s index scans come up frequently. Because many people are not aware of what the optimizer does when it processes a single query, this post gives a short introduction to the most basic ways of accessing a table. Let us get started with PostgreSQL indexing.
In some ways, indices form the basis of good performance. Without proper indexing, your PostgreSQL database may be in bad shape, and users may complain about slow queries and poor response times. It therefore makes sense to look at what PostgreSQL does when a query filters on a single column.
Preparing some demo data in PostgreSQL
We can use a straightforward table to demonstrate how things work:
test=# CREATE TABLE sampletable (x numeric);
CREATE TABLE
If your table is nearly empty, you will not see an index scan, because consulting the index involves too much overhead; it is cheaper to scan the table directly and discard the rows that don’t match your query.
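A quick way to check this is to repeat the experiment on a tiny table. The following is a minimal sketch; the table `tinytable` and the index `idx_tiny_x` are hypothetical names used only for illustration and are not part of the demo that follows. With default settings, the planner would normally be expected to report a sequential scan here even though a matching index exists.

```sql
-- Hypothetical ten-row table, used only for illustration
CREATE TABLE tinytable (x numeric);
INSERT INTO tinytable SELECT random() * 10000 FROM generate_series(1, 10);
CREATE INDEX idx_tiny_x ON tinytable (x);
ANALYZE tinytable;

-- The whole table fits into a single block, so reading it directly
-- is usually estimated to be cheaper than consulting the index
EXPLAIN SELECT * FROM tinytable WHERE x = 42;

-- Clean up the illustration table
DROP TABLE tinytable;
```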
So, to show how an index really works, we can add 10 million random rows to the table we already made:
test=# INSERT INTO sampletable
         SELECT random() * 10000
         FROM generate_series(1, 10000000);
INSERT 0 10000000
After that, an index is made:
test=# CREATE INDEX idx_x ON sampletable(x);
CREATE INDEX
In the event that autovacuum has not yet caught up after loading so much data, it may be a good idea to create optimizer statistics. The PostgreSQL optimizer needs these numbers to decide if it should use an index or not:
test=# ANALYZE ;
ANALYZE
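To get a feeling for what the optimizer has to work with, the statistics can be inspected through the system catalogs. The queries below are a sketch that assumes the `sampletable` table from above; the exact numbers will differ from system to system because the data is random.

```sql
-- Table-level estimates: number of blocks and (estimated) number of rows
SELECT relpages, reltuples
FROM pg_class
WHERE relname = 'sampletable';

-- Column-level statistics gathered by ANALYZE
SELECT attname, null_frac, n_distinct, correlation
FROM pg_stats
WHERE tablename = 'sampletable';

-- When the table was last analyzed, manually or by autovacuum
SELECT last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'sampletable';
```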
The index created above is a B-tree, which is the default index type. PostgreSQL’s B-trees are Lehman-Yao high-concurrency B-trees (more information will be provided in a subsequent blog post).
Selecting a small subset of data in PostgreSQL
When only a small number of rows is selected, PostgreSQL can query the index directly. Because the index already contains all the columns the query needs, it can even use an “Index Only Scan” in this case:
test=# explain SELECT * FROM sampletable WHERE x = 42353;
                              QUERY PLAN
-----------------------------------------------------------------------
 Index Only Scan using idx_x on sampletable (cost = 0.43; rows = 1; width = 11)
   Index Cond: (x = '42353'::numeric)
(2 rows)
Selecting only a handful of rows through the index is very efficient. However, if more data is selected, scanning both the index and the table becomes too expensive.
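Whether an index-only scan can really avoid touching the table depends on the visibility map, which VACUUM maintains. The sketch below assumes the `sampletable` table and `idx_x` index from above; it executes the query for real and reports how many rows still had to be looked up in the table.

```sql
-- VACUUM updates the visibility map that index-only scans rely on
VACUUM sampletable;

-- EXPLAIN (ANALYZE) actually runs the query; for an index-only scan the
-- output contains a "Heap Fetches" line showing how many rows had to be
-- checked in the table itself
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM sampletable WHERE x = 42353;
```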
PostgreSQL indexing: Selecting a lot of data from a table in PostgreSQL
However, if you select a LOT of data from a table, PostgreSQL will fall back to a sequential scan. In this case, the best course of action is to read the entire table and simply filter out the few rows that don’t match.
The process is as follows:
test=# explain SELECT * FROM sampletable WHERE x < 42353;
                          QUERY PLAN
---------------------------------------------------------------
 Seq Scan on sampletable (cost=0.00..179054.03 rows=9999922 width=11)
   Filter: (x < '42353'::numeric)
(2 rows)
PostgreSQL filters out the rows that don’t match and returns the rest. That is the ideal thing to do in this situation. A sequential scan is therefore not always bad; in fact, there are use cases where it is exactly the right choice.
Still, keep in mind that scanning large tables sequentially too often will take its toll at some point.
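To see why the planner prefers the sequential scan here, it can help to compare its cost estimate with that of the alternative plan. The following sketch, again assuming the `sampletable` table from above, uses the `enable_seqscan` setting. Note that turning it off does not forbid sequential scans; it only makes the planner treat them as extremely expensive, so it is a diagnostic tool rather than a tuning measure.

```sql
-- Plan chosen with default settings
EXPLAIN SELECT * FROM sampletable WHERE x < 42353;

-- Discourage sequential scans for the current session only and
-- compare the estimated cost of the plan the planner picks instead
SET enable_seqscan = off;
EXPLAIN SELECT * FROM sampletable WHERE x < 42353;

-- Restore the default behavior
RESET enable_seqscan;
```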
PostgreSQL: Making use of bitmap scans
If you select only a handful of rows, PostgreSQL will decide on an index scan; if you select the majority of the rows, it will decide to read the table completely. But what if you read too much for an index scan to be efficient but too little for a sequential scan? The solution to the problem is a bitmap scan. The idea behind a bitmap scan is that a single block is only visited once during the scan. It can also be very helpful if you want to use more than one index to scan a single table.
Here is what happens:
test=# explain SELECT * FROM sampletable WHERE x < 423;
                                 QUERY PLAN
----------------------------------------------------------------------------
 Bitmap Heap Scan on sampletable (cost=9313.62..68396.35 rows=402218 width=11)
   Recheck Cond: (x < '423'::numeric)
   -> Bitmap Index Scan on idx_x (cost=0.00..9213.07 rows=402218 width=0)
        Index Cond: (x < '423'::numeric)
(4 rows)
PostgreSQL will first scan the index and build a bitmap of the rows / blocks that will be needed once the index scan is finished. After that, PostgreSQL will use this bitmap to access the table and actually fetch those rows. The beauty is that this mechanism even works if more than one index is used.
Therefore, bitmap scans are a fantastic performance improvement.
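The use of more than one index can be tried out with a second table. The following is a sketch under stated assumptions: `sampletable2`, `idx2_x` and `idx2_y` are hypothetical names that are not part of the demo above, and whether the planner actually combines the two indexes depends on the data, the statistics and the configuration.

```sql
-- Hypothetical two-column table with one index per column
CREATE TABLE sampletable2 (x numeric, y numeric);
INSERT INTO sampletable2
    SELECT random() * 10000, random() * 10000
    FROM generate_series(1, 1000000);
CREATE INDEX idx2_x ON sampletable2 (x);
CREATE INDEX idx2_y ON sampletable2 (y);
ANALYZE sampletable2;

-- Each condition can be served by its own index; the planner may
-- merge the two bitmaps with a BitmapOr node
EXPLAIN SELECT * FROM sampletable2 WHERE x < 50 OR y < 50;

-- With AND, the planner may intersect the bitmaps with a BitmapAnd node
EXPLAIN SELECT * FROM sampletable2 WHERE x < 500 AND y < 500;

-- Clean up the illustration table
DROP TABLE sampletable2;
```

If a BitmapOr or BitmapAnd node shows up in the plan, each index is scanned separately and the resulting bitmaps are combined before the table is read.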