How to Check and Fix Dead Tuples Not Removed By Vacuum From PostgreSQL Tables

Why Vacuum ?

In PostgreSQL, when rows are updated or deleted from a table, dead rows are created and left behind, vacuum daemon processes will clean those dead tuples ( rows) periodically based on parameter setting for vacuum process. Generally the following four reasons explain why vacuum is needed:

  1. To recover or reuse disk space occupied by updated or deleted rows.
  2. To update data statistics used by the PostgreSQL query planner.
  3. To update the visibility map, which speeds up index-only scans.
  4. To protect against loss of very old data due to transaction ID wraparound or multixact ID wraparound.

Tables’ Dead Tuples Not Removed

The following query result shows the dead tuples are not removed by auto vacuuming for some reasons. But they should have been removed from tables.

 schemaname  |          relname          | n_live_tup | n_dead_tup |    overthread    |              now              |        last_autovacuum        | autovacuum_count
------------+---------------------------+------------+------------+------------------+-------------------------------+-------------------------------+------------------
 pg_catalog | pg_statistic              |       3298 |      48500 |  127.69878883623 | 2020-10-10 03:35:43.573615+00 | 2020-10-10 03:34:11.607386+00 |          1797116
 pg_catalog | pg_attribute              |       1116 |       6643 | 41.1076732673267 | 2020-10-10 03:35:43.573615+00 | 2020-10-10 03:34:28.653988+00 |          1589637
 cdrdb      | users                     |      27316 |      87554 | 31.4761288467069 | 2020-10-10 03:35:43.573615+00 | 2020-10-10 03:35:35.891844+00 |          1717941
 cdrdb      | products                  |        645 |       1689 | 14.7510917030568 | 2020-10-10 03:35:43.573615+00 | 2020-10-10 03:35:32.192822+00 |          1596895
 cdrdb      | price                     |     154492 |     168499 |  10.871464333643 | 2020-10-10 03:35:43.573615+00 | 2020-10-10 03:35:05.049206+00 |          1655072
 cdrdb      | contract_pdf              |     157701 |     165106 | 10.4364700602398 | 2020-10-10 03:35:43.573615+00 | 2020-10-10 03:35:27.945506+00 |          1656837
...
..
.
(27 rows)

Subscribe to get access

Read more of this content when you subscribe today.

PostgreSQL Automatic Vacuuming

Why Vacuum ?

In PostgreSQL, when rows are updated or deleted from a table, dead rows are created and left behind, vacuum daemon processes will clean those dead tuples ( rows) periodically based on parameter setting for vacuum process. Generally the following four reasons explain why vacuum is needed:

  • To recover or reuse disk space occupied by updated or deleted rows.
  • To update data statistics used by the PostgreSQL query planner.
  • To update the visibility map, which speeds up index-only scans.
  • To protect against loss of very old data due to transaction ID wraparound or multixact ID wraparound.

Automatic Vacuuming Parameters

There are following 14 configuration parameters for PostgreSQL 11. For all the details, please refer to  Automatic Vacuuming.

postgres=> select name, setting,unit 
           from pg_settings 
            where name like '%autovacuum%';
                name                 |  setting  | unit
-------------------------------------+-----------+------
 autovacuum                          | on        |
 autovacuum_analyze_scale_factor     | 0.05      |
 autovacuum_analyze_threshold        | 50        |
 autovacuum_freeze_max_age           | 200000000 |
 autovacuum_max_workers              | 3         |
 autovacuum_multixact_freeze_max_age | 400000000 |
 autovacuum_naptime                  | 15        | s
 autovacuum_vacuum_cost_delay        | 20        | ms
 autovacuum_vacuum_cost_limit        | 200       |
 autovacuum_vacuum_scale_factor      | 0.1       |
 autovacuum_vacuum_threshold         | 50        |
 autovacuum_work_mem                 | -1        | kB
 log_autovacuum_min_duration         | 10000     | ms
 rds.force_autovacuum_logging_level  | info      |
(14 rows)

Now we have a look at the two most important parameters of automatic vacuuming.

                name                 |  setting  | unit
-------------------------------------+-----------+------
 autovacuum_vacuum_scale_factor      | 0.1       |
 autovacuum_vacuum_threshold         | 50        |

We still use table TEST as an example to demonstrate how those two parameters work together.

We intentionally modified autovacuum_vacuum_threshold value from default 50 to 5 for easy testing purpose.

postgres=> ALTER TABLE test SET (autovacuum_enabled = true,autovacuum_vacuum_threshold=5);
ALTER TABLE

postgres=> \d+ test
                                   Table "public.test"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 id     | integer |           |          |         | plain   |              |
Options: autovacuum_analyze_threshold=5, autovacuum_enabled=true, autovacuum_vacuum_threshold=5


postgres=> select relname,n_tup_ins , n_tup_upd,n_tup_del,n_live_tup,n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,vacuum_count,autovacuum_count,analyze_count,autoanalyze_count
from pg_stat_all_tables where relname='test'; \gx
 relname | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum |         last_analyze          |       last_autoanalyze        | vacuum_count | autovacuum_count
 | analyze_count | autoanalyze_count
---------+-----------+-----------+-----------+------------+------------+-------------+-----------------+-------------------------------+-------------------------------+--------------+-----------------
-+---------------+-------------------
 test    |        12 |         0 |         0 |         12 |          0 |             |                 | 2020-10-06 03:31:12.055461+00 | 2020-10-06 10:21:34.177395+00 |            0 |                0
 |             1 |                 2
(1 row)


-[ RECORD 1 ]-----+------------------------------
relname           | test
n_tup_ins         | 12
n_tup_upd         | 0
n_tup_del         | 0
n_live_tup        | 12
n_dead_tup        | 0
last_vacuum       |
last_autovacuum   |
last_analyze      | 2020-10-06 03:31:12.055461+00
last_autoanalyze  | 2020-10-06 10:21:34.177395+00
vacuum_count      | 0
autovacuum_count  | 0
analyze_count     | 1
autoanalyze_count | 2


postgres=> delete from test where id>=6;
DELETE 7

postgres=> select relname,n_tup_ins , n_tup_upd,n_tup_del,n_live_tup,n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,vacuum_count,autovacuum_count,analyze_count,autoanalyze_count
from pg_stat_all_tables where relname='test'; \gx
 relname | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | last_vacuum |        last_autovacuum        |         last_analyze          |       last_autoanalyze        | vacuum_count | au
tovacuum_count | analyze_count | autoanalyze_count
---------+-----------+-----------+-----------+------------+------------+-------------+-------------------------------+-------------------------------+-------------------------------+--------------+---
---------------+---------------+-------------------
 test    |        12 |         0 |         7 |          5 |          0 |             | 2020-10-06 10:28:35.355256+00 | 2020-10-06 03:31:12.055461+00 | 2020-10-06 10:28:35.375704+00 |            0 |
             1 |             1 |                 3
(1 row)


-[ RECORD 1 ]-----+------------------------------
relname           | test
n_tup_ins         | 12
n_tup_upd         | 0
n_tup_del         | 7
n_live_tup        | 5
n_dead_tup        | 0
last_vacuum       |
last_autovacuum   | 2020-10-06 10:28:35.355256+00
last_analyze      | 2020-10-06 03:31:12.055461+00
last_autoanalyze  | 2020-10-06 10:28:35.375704+00
vacuum_count      | 0
autovacuum_count  | 1
analyze_count     | 1
autoanalyze_count | 3

postgres=>

After deleted 7 of 12 records from TEST table, it triggers both auto analyzing and auto vacuuming immediately.

It triggers auto analyzing, because deleted records ( 7 ) > autovacuum_analyze_scale_factor( 0.05 ) *table records (12) + autovacuum_analyze_threshold (5 ).

It triggers auto vacuuming, because deleted records ( 7 ) > autovacuum_vacuum_scale_factor( 0.1 ) *table records(12) + autovacuum_vacuum_threshold (5 ).

Statistics is Missing from Tables in PostgreSQL

One of our customer complained about the PostgreSQL database is very slow just after migrated from Oracle database.

We did initial investigation and found very interesting different behaviours between Oracle and PostgreSQL regarding statistics.

As we know, Oracle has a daily overnight job for looking after user tables and indexes statistics. While for PostgreSQL, the way of collecting statistics is in a totally different way.

In PostgreSQL, the following default parameters are defined for collecting statistics.

 select name,setting  from pg_settings
 where name like '%vacuu%ana%';
              name               | setting
---------------------------------+---------
 autovacuum_analyze_scale_factor | 0.05
 autovacuum_analyze_threshold    | 50
(2 rows)

autovacuum_analyze_scale_factor : Specifies a fraction of the table size to add to autovacuum_analyze_threshold when deciding whether to trigger an ANALYZE. The default is 0.05 (5% of table size).

autovacuum_analyze_threshold : Specifies the minimum number of inserted, updated or deleted tuples needed to trigger an ANALYZE in any one table. The default is 50 tuples.

Let’s create a test table, then try to explain how those two parameters works together.

postgres=>create table test ( id integer);
CREATE TABLE

postgres=>select relname,n_tup_ins, n_tup_upd,n_tup_del,n_live_tup,
n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,
vacuum_count,autovacuum_count,analyze_count,autoanalyze_count 
from pg_stat_all_tables 
where relname='test';
 relname | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze
_count
---------+-----------+-----------+-----------+------------+------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+------------
-------
 test    |         0 |         0 |         0 |          0 |          0 |             |                 |              |                  |            0 |                0 |             0 |
     0
(1 row)


For test purpose, we set up parameter autovacuum_analyze_threshold  for table TEST to a smaller number 5, instead of default 50.

postgres=> ALTER TABLE test SET (autovacuum_enabled = true, autovacuum_analyze_threshold=5);
ALTER TABLE
             
 postgres=> \d+ test
                                   Table "public.test"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 id     | integer |           |          |         | plain   |              |
Options: autovacuum_enabled=true, autovacuum_analyze_threshold=5

Insert just four records which should not trigger auto analysing.

postgres=> insert into test values (1);
INSERT 0 1
postgres=> insert into test values (2);
INSERT 0 1
postgres=> insert into test values (3);
INSERT 0 1
postgres=> insert into test values (4);
INSERT 0 1

After a while, check the stats without collected as expected.

postgres=> select relname,n_tup_ins, n_tup_upd,n_tup_del,n_live_tup,
n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,
vacuum_count,autovacuum_count,analyze_count,autoanalyze_count 
from pg_stat_all_tables 
where relname='test';
 relname | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze
_count
---------+-----------+-----------+-----------+------------+------------+-------------+-----------------+--------------+------------------+--------------+------------------+---------------+------------
-------
 test    |         4 |         0 |         0 |          4 |          0 |             |                 |              |                  |            0 |                0 |             0 |
     0
(1 row)

Now insert extra two records of fifth and sixth records, which should trigger auto analysing.

postgres=> insert into test values (5);
INSERT 0 1

postgres=> insert into test values (6);
INSERT 0 1

postgres=> select now();
now
-----------------------------
2020-10-06 03:18:27.418373+00
(1 row)
postgres=> select relname,n_tup_ins, n_tup_upd,n_tup_del,n_live_tup,
n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,
vacuum_count,autovacuum_count,analyze_count,autoanalyze_count 
from pg_stat_all_tables 
where relname='test'; \gx
 relname | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze |       last_autoanalyze        | vacuum_count | autovacuum_count | analyze_count
| autoanalyze_count
---------+-----------+-----------+-----------+------------+------------+-------------+-----------------+--------------+-------------------------------+--------------+------------------+---------------
+-------------------
 test    |         6 |         0 |         0 |          6 |          0 |             |                 |              | 2020-10-06 03:19:15.765914+00 |            0 |                0 |             0
|                 1
(1 row)


-[ RECORD 1 ]-----+------------------------------
relname           | test
n_tup_ins         | 6
n_tup_upd         | 0
n_tup_del         | 0
n_live_tup        | 6
n_dead_tup        | 0
last_vacuum       |
last_autovacuum   |
last_analyze      |
last_autoanalyze  | 2020-10-06 03:19:15.765914+00
vacuum_count      | 0
autovacuum_count  | 0
analyze_count     | 0
autoanalyze_count | 1


postgres=>

As expected, the autoanalyze kicked off immediately just after two records inserted.

Beside the autoanalyze, we can analyze the table manually as below.

postgres=> analyze test;
ANALYZE
postgres=> select relname,n_tup_ins, n_tup_upd,n_tup_del,n_live_tup,
n_dead_tup,last_vacuum,last_autovacuum,last_analyze,last_autoanalyze,
vacuum_count,autovacuum_count,analyze_count,autoanalyze_count 
from pg_stat_all_tables 
where relname='test'; \gx
 relname | n_tup_ins | n_tup_upd | n_tup_del | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum |         last_analyze          |       last_autoanalyze        | vacuum_count | autovacuum_count
 | analyze_count | autoanalyze_count
---------+-----------+-----------+-----------+------------+------------+-------------+-----------------+-------------------------------+-------------------------------+--------------+-----------------
-+---------------+-------------------
 test    |         6 |         0 |         0 |          6 |          0 |             |                 | 2020-10-06 03:31:12.055461+00 | 2020-10-06 03:19:15.765914+00 |            0 |                0
 |             1 |                 1
(1 row)


-[ RECORD 1 ]-----+------------------------------
relname           | test
n_tup_ins         | 6
n_tup_upd         | 0
n_tup_del         | 0
n_live_tup        | 6
n_dead_tup        | 0
last_vacuum       |
last_autovacuum   |
last_analyze      | 2020-10-06 03:31:12.055461+00
last_autoanalyze  | 2020-10-06 03:19:15.765914+00
vacuum_count      | 0
autovacuum_count  | 0
analyze_count     | 1
autoanalyze_count | 1

Let’s go back to the customer’s problem. There are various reasons that can cause statistics missing , like PostgreSQL RDS restored from snapshot, etc.

Since Oracle database was migrated to PostgreSQL, all the statistics were reset to zero. In the beginning, there were not enough transactions to trigger auto-analysing, so many SQL plans were not optimised until auto-analysing kicks off or starts a manual analysing.

In this case, it is a good practice to do a manual analysing immediately after Oracle database migrated to PostgreSQL .

How to Identify Fragmentation in PostgreSQL RDS

Fragmentation is a popular topic for any databases like Oracle and Microsoft SQL server. This post demonstrates how to identify fragmentation in PostgreSQL including AWS PostgreSQL RDS.

Install and Load pgstattuple Module

As super user, install extension pgstattuple as following if not yet.

CREATE EXTENSION pgstattuple;

Check extension details by query pg_extension and pg_available_extensions.

Use pgstattuple Function to List Table Fragmentation Details

The following sql shows table cdr_record is heavily fragmented with 98.71 % free space, and tuple percent is only 9%.

postgres=>   SELECT * FROM pgstattuple('cdr_record');
  table_len  | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space  | free_percent
-------------+-------------+-----------+---------------+------------------+----------------+--------------------+-------------+--------------
 10459709440 |       58988 |   9583432 |          0.09 |             7876 |        1430999 |               0.01 | 10324502496 |        98.71
(1 row)


postgres=> select count(*) from cdr_record;
 count
-------
 58988
(1 row)

pgstattuple output columns:

ColumnTypeDescription
table_lenbigintPhysical relation length in bytes
tuple_countbigintNumber of live tuples
tuple_lenbigintTotal length of live tuples in bytes
tuple_percentfloat8Percentage of live tuples
dead_tuple_countbigintNumber of dead tuples
dead_tuple_lenbigintTotal length of dead tuples in bytes
dead_tuple_percentfloat8Percentage of dead tuples
free_spacebigintTotal free space in bytes
free_percentfloat8Percentage of free space

Use pgstatindex Function to List Index Fragmentation Details

The following sql shows index idx_customers leaf fragmentation is 67.07%.

elms=> SELECT * FROM pgstatindex('idx_customers');
 version | tree_level | index_size | root_block_no | internal_pages | leaf_pages | empty_pages | deleted_pages | avg_leaf_density | leaf_fragmentation
---------+------------+------------+---------------+----------------+------------+-------------+---------------+------------------+--------------------
       3 |          3 | 1030766592 |         25480 |            835 |     101582 |           0 |         23408 |            77.48 |              67.07
(1 row)

pgstatindex output columns:

ColumnTypeDescription
versionintegerBtree version number
tree_levelintegerTree level of the root page
index_sizebigintTotal number of pages in index
root_block_nobigintLocation of root block
internal_pagesbigintNumber of “internal” (upper-level) pages
leaf_pagesbigintNumber of leaf pages
empty_pagesbigintNumber of empty pages
deleted_pagesbigintNumber of deleted pages
avg_leaf_densityfloat8Average density of leaf pages
leaf_fragmentationfloat8Leaf page fragmentation

By using both pgstattuple and pgstatindex function, the results are accumulated page by page at one point of time,so it is very helpful for users to determine whether vacuum is required or not.

pg_stat_all_tables can also be used to decide necessity of vacuum, but it depends on table or index statistics, which could be stalled for many reasons.