Fragmentation is a popular topic for any databases like Oracle and Microsoft SQL server. This post demonstrates how to identify fragmentation in PostgreSQL including AWS PostgreSQL RDS.
Install and Load pgstattuple Module
As super user, install extension pgstattuple as following if not yet.
CREATE EXTENSION pgstattuple;
Check extension details by query pg_extension and pg_available_extensions.
Use pgstattuple Function to List Table Fragmentation Details
The following sql shows table cdr_record is heavily fragmented with 98.71 % free space, and tuple percent is only 9%.
postgres=> SELECT * FROM pgstattuple('cdr_record');
table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-------------+-------------+-----------+---------------+------------------+----------------+--------------------+-------------+--------------
10459709440 | 58988 | 9583432 | 0.09 | 7876 | 1430999 | 0.01 | 10324502496 | 98.71
(1 row)
postgres=> select count(*) from cdr_record;
count
-------
58988
(1 row)
pgstattuple output columns:
Column | Type | Description |
---|---|---|
table_len | bigint | Physical relation length in bytes |
tuple_count | bigint | Number of live tuples |
tuple_len | bigint | Total length of live tuples in bytes |
tuple_percent | float8 | Percentage of live tuples |
dead_tuple_count | bigint | Number of dead tuples |
dead_tuple_len | bigint | Total length of dead tuples in bytes |
dead_tuple_percent | float8 | Percentage of dead tuples |
free_space | bigint | Total free space in bytes |
free_percent | float8 | Percentage of free space |
Use pgstatindex Function to List Index Fragmentation Details
The following sql shows index idx_customers leaf fragmentation is 67.07%.
elms=> SELECT * FROM pgstatindex('idx_customers');
version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
3 | 3 | 1030766592 | 25480 | 835 | 101582 | 0 | 23408 | 77.48 | 67.07
(1 row)
pgstatindex output columns:
Column | Type | Description |
---|---|---|
version | integer | Btree version number |
tree_level | integer | Tree level of the root page |
index_size | bigint | Total number of pages in index |
root_block_no | bigint | Location of root block |
internal_pages | bigint | Number of “internal” (upper-level) pages |
leaf_pages | bigint | Number of leaf pages |
empty_pages | bigint | Number of empty pages |
deleted_pages | bigint | Number of deleted pages |
avg_leaf_density | float8 | Average density of leaf pages |
leaf_fragmentation | float8 | Leaf page fragmentation |
By using both pgstattuple and pgstatindex function, the results are accumulated page by page at one point of time,so it is very helpful for users to determine whether vacuum is required or not.
pg_stat_all_tables can also be used to decide necessity of vacuum, but it depends on table or index statistics, which could be stalled for many reasons.
2 thoughts on “How to Identify Fragmentation in PostgreSQL RDS”