
Blocks and Clustering Factor

In all our previous posts, I skipped over an important detail: blocks. Now it’s time to talk about them.

On disk, data isn’t stored or retrieved in terms of bytes, or even rows. Instead, everything happens in fixed-size blocks: that’s the basic unit of I/O. When Postgres uses an index, the pointers it stores don’t point at rows directly, because disks have no notion of a “row ID”. Instead, each pointer (a tuple identifier, or TID) combines a block number (also called a page number), which tells Postgres which block on disk holds the row, with an offset inside that block that identifies the specific row.
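You can see these pointers directly: Postgres exposes each row's (block, offset) pair through a hidden system column called `ctid`. A quick sketch against our orders table (the column list is illustrative):

```sql
-- ctid is a (block_number, offset) pair giving each row's physical location
SELECT ctid, status
FROM orders
LIMIT 3;
-- ctid values like (0,1), (0,2), (0,3) mean: block 0, row slots 1-3
```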

Why does this matter? Let’s go back to our orders table as an example. Postgres uses an 8KB block size by default. With an average row size of 72 bytes, each block can fit around 113 rows (8192 / 72, ignoring page and tuple header overhead). Now imagine you run a query that needs to retrieve 113 rows. If you’re lucky, all 113 rows are packed into the same block, and Postgres can fetch them with a single read operation. But if you’re unlucky, those 113 rows might be scattered across 113 different blocks, forcing Postgres to issue 113 separate read operations.
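The numbers are easy to check for yourself. The block size is a read-only server setting, and the rows-per-block figure is just integer division (this ignores per-page and per-tuple header overhead, so the real capacity is a bit lower):

```sql
-- block_size is a preset parameter; 8192 on a default build
SELECT current_setting('block_size')::int       AS block_size,
       current_setting('block_size')::int / 72  AS rows_per_block;  -- roughly 113
```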

This idea, how rows for a given index key are spread across blocks, is commonly called the clustering factor. (Postgres itself doesn’t use that name; it tracks the equivalent information as a per-column statistic called correlation. The concept is the same, so we’ll stick with the familiar term.)

We’ve already seen the effects of this in earlier posts. For example, when we queried for CANCELLED orders in our orders table, we knew there were 22,552 rows with that status. But look at what the plan told us:

Bitmap Heap Scan on orders  (cost=227.44..47607.21 rows=20000 width=34) (actual time=5.049..53.531 rows=22552 loops=1)
  Recheck Cond: (status = 'CANCELLED'::text)
  Heap Blocks: exact=19915
  ->  Bitmap Index Scan on idx_orders_status  (cost=0.00..222.44 rows=20000 width=0) (actual time=2.140..2.140 rows=22552 loops=1)
        Index Cond: (status = 'CANCELLED'::text)

Even though there were 22,552 rows, Postgres had to read 19,915 heap blocks to get them all. Why? Because those rows weren’t neatly packed—they were scattered. Some blocks had multiple matching rows, but many blocks held just one.

What does Postgres do with this information? Let’s try a different example, again using our orders table. We know that City 3 has 900,939 rows. Our table, with its 10 million rows at roughly 113 rows per block, spans about 88,500 blocks in total (10,000,000 ÷ 113 ≈ 88,496).
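Rather than estimating, we can also ask Postgres for the actual block count: the planner keeps it in `pg_class.relpages`, refreshed by VACUUM and ANALYZE:

```sql
SELECT relpages,           -- number of blocks the table occupies
       reltuples::bigint   -- the planner's row-count estimate
FROM pg_class
WHERE relname = 'orders';
```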

Now think about where those 900,939 rows for City 3 might be. Remember, Postgres reads data from disk in blocks, not rows. This means the number of blocks that need to be read is a key factor in query performance. Fetching 113 rows from the same block is almost “free” compared to fetching 113 rows from 113 different blocks. When rows are tightly packed into blocks, Postgres can retrieve them with very few read operations. But when they’re scattered across the table, even queries returning a relatively small number of rows can end up doing a huge amount of random I/O.

This difference is why Postgres tracks how well each column’s values line up with the physical order of rows on disk. It measures how “clustered” the rows for an index key are. In the best case, if the table is nicely ordered by city, those 900,939 rows could be packed tightly together across about 7,973 blocks (900,939 divided by 113). In the worst case, they might be completely scattered, with each row sitting in a different block, meaning Postgres would have to touch 900,939 blocks to fetch them.

Postgres uses the clustering factor in its cost calculations. If it’s low, Postgres knows it can follow the index and fetch rows efficiently, often favoring a plain Index Scan. But if it’s high and rows are scattered, it realizes that following the index would involve too many random reads. In that case, Postgres might switch to a Bitmap Index Scan—or even decide that a full table scan is cheaper.
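Under the hood, the statistic Postgres actually stores is called correlation, kept per column in the `pg_stats` view and ranging from -1 to 1. Values near ±1 mean the physical row order tracks the column’s sort order (well clustered); values near 0 mean the rows are scattered. You can inspect it directly:

```sql
-- correlation near ±1: rows physically ordered by this column
-- correlation near 0:  rows scattered across the heap
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'orders'
  AND attname IN ('city', 'status');
```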

Example

Let’s work through an example to see the impact of the clustering factor on the query plan. Here’s the plan when we filter by City 1000, which has 1,026 rows:

Bitmap Heap Scan on orders  (cost=64.73..19598.01 rows=5715 width=41) (actual time=0.264..1.234 rows=1026 loops=1)
  Recheck Cond: (city = 'City 1000'::text)
  Heap Blocks: exact=1018
  ->  Bitmap Index Scan on idx_orders_city  (cost=0.00..63.30 rows=5715 width=0) (actual time=0.123..0.123 rows=1026 loops=1)
        Index Cond: (city = 'City 1000'::text)

Postgres chose a Bitmap Heap Scan, and notice the line Heap Blocks: exact=1018. This tells us the 1,026 rows we’re fetching are spread across 1,018 different blocks: nearly one block per row. Clearly, the rows for City 1000 are not tightly packed, so the clustering factor for city is high.

Now let’s cluster the table on city to improve the physical ordering of rows:

CLUSTER orders USING idx_orders_city;
ANALYZE orders;

Here’s the new query plan for the same filter:

Index Scan using idx_orders_city on orders  (cost=0.43..186.68 rows=6014 width=41) (actual time=0.021..0.121 rows=1026 loops=1)
  Index Cond: (city = 'City 1000'::text)
Planning Time: 0.176 ms
Execution Time: 0.156 ms

Now Postgres uses a plain Index Scan. Since the table is physically ordered by city, Postgres knows that following the index will access the heap pages in sequence, and the extra overhead of the bitmap strategy isn’t worth it anymore.

This new plan doesn’t tell us how many heap blocks Postgres touched, but we can find out with a simple query:

-- ctid is a (block, offset) pair; take the block number and count distinct blocks
SELECT
    COUNT(DISTINCT (ctid::text::point)[0]::int) AS num_blocks
FROM
    orders
WHERE
    city = 'City 1000';

The result is 11. Those same 1,026 rows are now packed into just 11 contiguous blocks. That’s why Postgres can confidently use an Index Scan: it only needs to read 11 blocks sequentially instead of jumping across 1,018 blocks like before.
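The planner’s statistics reflect the same change. After the CLUSTER and ANALYZE above, the correlation figure Postgres keeps for city in the `pg_stats` view (where 1 means the heap order matches the column’s sort order) should sit at or very near 1:

```sql
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'orders'
  AND attname = 'city';
-- expect correlation close to 1 now that the heap is ordered by city
```

Note that CLUSTER is a one-time physical reordering: subsequent inserts and updates are not kept in city order, so the correlation (and the plans that depend on it) will drift over time unless the table is re-clustered.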
