How to Identify Fragmentation in PostgreSQL RDS

Fragmentation is a popular topic for any databases like Oracle and Microsoft SQL server. This post demonstrates how to identify fragmentation in PostgreSQL including AWS PostgreSQL RDS.

Install and Load pgstattuple Module

As super user, install extension pgstattuple as following if not yet.

CREATE EXTENSION pgstattuple;

Check extension details by query pg_extension and pg_available_extensions.

Use pgstattuple Function to List Table Fragmentation Details

The following sql shows table cdr_record is heavily fragmented with 98.71 % free space, and tuple percent is only 9%.

postgres=>   SELECT * FROM pgstattuple('cdr_record');
  table_len  | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space  | free_percent
-------------+-------------+-----------+---------------+------------------+----------------+--------------------+-------------+--------------
 10459709440 |       58988 |   9583432 |          0.09 |             7876 |        1430999 |               0.01 | 10324502496 |        98.71
(1 row)


postgres=> select count(*) from cdr_record;
 count
-------
 58988
(1 row)

pgstattuple output columns:

ColumnTypeDescription
table_lenbigintPhysical relation length in bytes
tuple_countbigintNumber of live tuples
tuple_lenbigintTotal length of live tuples in bytes
tuple_percentfloat8Percentage of live tuples
dead_tuple_countbigintNumber of dead tuples
dead_tuple_lenbigintTotal length of dead tuples in bytes
dead_tuple_percentfloat8Percentage of dead tuples
free_spacebigintTotal free space in bytes
free_percentfloat8Percentage of free space

Use pgstatindex Function to List Index Fragmentation Details

The following sql shows index idx_customers leaf fragmentation is 67.07%.

elms=> SELECT * FROM pgstatindex('idx_customers');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       3 |          3 | 1030766592 |         25480 |            835 |     101582 |           0 |         23408 |            77.48 |              67.07
(1 row)

pgstatindex output columns:

ColumnTypeDescription
versionintegerBtree version number
tree_levelintegerTree level of the root page
index_sizebigintTotal number of pages in index
root_block_nobigintLocation of root block
internal_pagesbigintNumber of “internal” (upper-level) pages
leaf_pagesbigintNumber of leaf pages
empty_pagesbigintNumber of empty pages
deleted_pagesbigintNumber of deleted pages
avg_leaf_densityfloat8Average density of leaf pages
leaf_fragmentationfloat8Leaf page fragmentation

By using both pgstattuple and pgstatindex function, the results are accumulated page by page at one point of time,so it is very helpful for users to determine whether vacuum is required or not.

pg_stat_all_tables can also be used to decide necessity of vacuum, but it depends on table or index statistics, which could be stalled for many reasons.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.