Post

Postgres -- Vacuum

The purpose of vacuum:

  1. recover disk space.
  2. Update planner statistics
  3. Prevent transaction id wraparound failure

    • how dd dashboard show low row count after failover

When will autovacuum run?

Autovacuum is controlled by two parameters

  • autovacuum_vacuum_threshold
  • autovacuum_vacuum_scale_factor

When the number of dead tuples go beyond threshold vac_base_thresh + vac_scale_factor * reltuples, autovacuum runs. See the code here. The number of dead tuples can be found inside table pg_stat_all_tables.

1
2
3
4
admincoin=> select relname, n_dead_tup from pg_stat_all_tables where relname = 'event_log' \gx
-[ RECORD 1 ]---------
relname    | event_log
n_dead_tup | 3

Meanwhile, we can check vacuum progress from table pg_stat_progress_vacuum.

Lock needed

Vacuum requires ShareUpdateExclusiveLock, which does not block usual CRUD operations, but this lock type conflicts with itself, so it there is at most one vacuum for a table at a given time. See code. Meanwhile, vacuum blocks ALTER TABLE command.

I understand that it requires ShareUpdateExclusiveLock at the beginning and release it at the end. But during the process, it needs to scan disk blocks, what locks are needed then? Suppose another transaction is locking a tuple, such as update or select for update, what will this vacuum process do?

Ok. Let’s answer the first question. What locks are needed when scanning blocks? When scanning blocks, vacuum needs to hold cleanup lock. The terminolgy is subtle. To be honest, I still do not know whether cleanup lock is a buffer pin or a buffer lock. No matter what, the lock/pin is released quickly after finishing scanning the page. In practice, I do see any real case that we need to worry that this lock/pin is held for two long.

The answer to the second question depends on the version of Postgres. For version <= 14, vacuum skips all buffers that it cannot acquire the cleanup lock immediately. For version >= 15. It waits “infinitely”. This is the changing commit with a long discussion about why the author made this change. From my point of view, waiting “infinitely” seems better because if you simply skip, then we just invoke the vacuum process repeatably without any real progress. Also, in any practical situation, this wait is very short. I tried to simulate a long buffer pin with the help of chatgpt for about 3 hours without any luck. We can tell the difference between the old and new behavior from the vacuum logs. Below is an example log of the old behavior.

1
2
3
4
5
6
7
8
9
10
11
12
INFO:  vacuuming "public.large_table"
INFO:  table "large_table": index scan bypassed: 1 pages from table (0.12% of total) have 1 dead item identifiers
INFO:  table "large_table": found 0 removable, 119 nonremovable row versions in 2 out of 834 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 767
Skipped 1 page due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "pg_toast.pg_toast_16441"
INFO:  table "pg_toast_16441": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 767
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

You see that it skips 1 page due to buffer pins.

Freezing

MVCC in PostgreSQL — 8. Freezing from postgrespro is the best article talking about freezing I can find online. It creates a live experiment that I can follow. The important parameters that control the freezing are below:

  • vacuum_freeze_min_age: the minimum age of the xmin transaction for which a tuple can be frozen
  • vacuum_freeze_table_age: the age of the transaction for which vacuuming ignores the visibility map and looks through all the table pages in order to do freezing
  • autovacuum_freeze_max_age: forced autovacuuming is launched if pg_class.relfrozenxid is older than this.

As said in the article, xid is compareable circularly. code. Freezing prevents transaction id wraparound failure.

Note, during freezing process, only tuples with committed and aborted xmin are validate candidates. See code.

Another note is about the relation between MVCC and vacuum. Long-running transactions can block vacuum process from freezing old tuples. For a tuple to be frozen, PostgreSQL must ensure it’s not visible to ANY open transaction. See code. Here is how cutoffs->OldestXmin calcuated: code. Basically, it itearates over all active connections and find the oldest xid.

How to get the current and oldest transaction id of a table then?

1
2
3
4
5
admincoin=> SELECT relname, age(relfrozenxid) as xmin_age, relfrozenxid as xmin, txid_current() cur_xid from pg_class where relname = 'object';
 relname | xmin_age  |   xmin    |  cur_xid
---------+-----------+-----------+-----------
 object  | 134866017 | 378302674 | 513168691
(1 row)

Logs

log_autovacuum_min_duration controls the threshold of long-running autovacuum actions. The code is here. An example logs is as following

1
2
3
4
5
6
7
8
9
10
11
automatic vacuum of table "admincoin.public.object_column_index": index scans: 0
	pages: 0 removed, 14094016 remain, 978387 scanned (6.94% of total)
	tuples: 0 removed, 823163314 remain, 10252998 are dead but not yet removable
	removable cutoff: 345727619, which was 19811910 XIDs old when operation ended
	frozen: 27 pages from table (0.00% of total) had 372 tuples frozen
	index scan bypassed: 96277 pages from table (0.68% of total) have 1027041 dead item identifiers
	I/O timings: read: 0.000 ms, write: 0.000 ms
	avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
	buffer usage: 1870163 hits, 0 misses, 0 dirtied
	WAL usage: 0 records, 0 full page images, 0 bytes
	system usage: CPU: user: 6.09 s, system: 0.00 s, elapsed: 21.81 s

Find last vacuum time of each table

1
2
3
4
5
SELECT relname AS table_name,
       last_autovacuum,
       last_vacuum
FROM pg_stat_user_tables
WHERE last_autovacuum IS NOT NULL OR last_vacuum IS NOT NULL;
This post is licensed under CC BY 4.0 by the author.