Prune & Compact Concepts Explained¶
Prime Backup involves multiple "housekeeping" concepts (prune / compact / vacuum), each operating at a different layer, triggered at different times, and producing different effects
This page provides a side-by-side overview of these concepts, focusing on their differences to help administrators get a clear picture at a glance
Overview¶
| Concept | Target | Auto-triggered | Manual Command |
|---|---|---|---|
| Backup Prune | Backups | Yes (scheduled job) | !!pb prune |
| Database Prune | Composite operation | No | !!pb database prune |
| Pack Compaction | Pack files | Yes (on chunk deletion / scheduled job) | !!pb database compact_packs |
| SQLite Vacuum | Database file | Yes (scheduled job) | !!pb database vacuum |
| Base Fileset Shrink | Base filesets | Yes (on backup deletion) | (included in Database Prune) |
| Orphan Object Scan | Database objects | No | (included in Database Prune) |
| Unknown File Scan | Storage directories | No | (included in Database Prune) |
Backup Prune¶
Operates at the backup layer, deleting excess old backups according to the retention policy
What It Does¶
Deletes, one at a time, the backups that the configured retention policy no longer requires, and cascades each deletion to release the database objects and physical storage owned exclusively by that backup
Scope¶
Backups are divided into three categories by tag and evaluated separately: regular, scheduled, and temporary
Each category independently applies its own `PruneSetting` configuration; backups carrying the protection tag (`is_protected = true`) are never pruned, regardless of category or retention policy
Retention Decision Process¶
- Use `last`, `hour`, `day`, `week`, `month`, `year` to select a representative backup from each time bucket according to the PBS retention policy and mark it for retention
- Among the backups marked in step 1, further eliminate those that are excessive or expired using `max_amount` (retention cap) and `max_lifetime` (maximum lifetime)
- Delete all backups that were not marked for retention
The detailed deletion decisions are logged to pb_files/logs/prune.log
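The three-step decision process above can be sketched roughly as follows; this is a simplified illustration, not Prime Backup's actual code: the function name `plan_prune`, the bucket definitions in `BUCKET_KEYS`, the choice of the newest backup as each bucket's representative, and the use of `None` for "no limit" are all assumptions made for the example

```python
from datetime import datetime, timedelta

# Hypothetical bucket keys; the real implementation may bucket time differently.
BUCKET_KEYS = {
    "hour": lambda t: (t.year, t.month, t.day, t.hour),
    "day": lambda t: (t.year, t.month, t.day),
    "week": lambda t: tuple(t.isocalendar())[:2],  # (ISO year, ISO week)
    "month": lambda t: (t.year, t.month),
    "year": lambda t: (t.year,),
}


def plan_prune(backups, now, policy, max_amount=None, max_lifetime=None):
    """Return (kept, deleted) lists of backup timestamps, newest first."""
    ordered = sorted(backups, reverse=True)

    # Step 1: mark the newest `last` backups, then one representative
    # (the newest backup) per time bucket for every configured rule.
    keep = set(ordered[: policy.get("last", 0)])
    for name, key in BUCKET_KEYS.items():
        limit = policy.get(name, 0)
        seen = {key(t) for t in keep}  # buckets that already have a representative
        picked = 0
        for t in ordered:
            if picked >= limit:
                break
            if t in keep or key(t) in seen:
                continue
            keep.add(t)
            seen.add(key(t))
            picked += 1

    # Step 2: among the marked backups, drop expired and excess ones.
    kept = sorted(keep, reverse=True)
    if max_lifetime is not None:
        kept = [t for t in kept if now - t <= max_lifetime]
    if max_amount is not None:
        kept = kept[:max_amount]

    # Step 3: everything that is not kept gets deleted.
    deleted = [t for t in ordered if t not in kept]
    return kept, deleted
```

For example, with `policy={"last": 1, "day": 2}` and `max_lifetime=timedelta(days=1)`, a backup that was chosen as a daily representative in step 1 is still deleted in step 2 if it is older than one day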
How to Trigger¶
- Automatic: scheduled job `prune_backup`, triggered according to `prune.interval` (default `6h`) or `prune.crontab`
- Manual: `!!pb prune` (requires permission level 3)
Database Prune¶
Operates across multiple storage layers, performing a full low-level cleanup in one run
What It Does¶
This is a composite command that executes all of the following cleanup steps in order:
- Orphan Object Scan
- Base Fileset Shrink
- Unknown Blob File Scan
- Pack Compaction, using `backup.pack_maintenance_compact_threshold`
- Unknown Pack File Scan
How to Trigger¶
- Manual: `!!pb database prune` (requires permission level 4)
Note
Prime Backup already cleans up the corresponding data promptly during routine operations (such as deleting backups and pruning), so this command rarely needs to be run manually
Pack Compaction¶
Operates at the pack file layer, eliminating "dead space" in pack files by rewriting them
What It Does¶
Pack files are written in an append-only fashion; when chunks are deleted, the byte ranges they occupied are not immediately reclaimed and become dead space
During compaction, pack files whose live ratio falls below the applicable threshold have their live entries rewritten into new pack files, and the old files are then deleted; pack files where all entries are dead are deleted directly
Thresholds¶
- `backup.pack_auto_compact_threshold` (default `0.5`): the minimum live ratio used when triggering immediately; if the live data in the affected pack file falls below this ratio, compaction runs right away
- `backup.pack_maintenance_compact_threshold` (default `0.8`): the minimum live ratio used by maintenance tasks (scheduled jobs and `database prune`); because this threshold is higher, more pack files qualify for compaction
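To make the effect of the thresholds concrete, here is a minimal sketch of how a minimum live ratio could sort pack files into "delete", "rewrite" and "leave alone"; the `PackStats` class and `plan_compaction` function are hypothetical names for illustration and do not reflect Prime Backup's internal API

```python
from dataclasses import dataclass


@dataclass
class PackStats:
    """Hypothetical per-pack statistics used only for this illustration."""
    name: str
    total_bytes: int
    live_bytes: int

    @property
    def live_ratio(self) -> float:
        return self.live_bytes / self.total_bytes if self.total_bytes else 0.0


def plan_compaction(packs, threshold):
    """Split pack names into (delete, rewrite, keep) for a minimum live ratio."""
    delete, rewrite, keep = [], [], []
    for pack in packs:
        if pack.live_bytes == 0:
            delete.append(pack.name)   # nothing live: remove the file outright
        elif pack.live_ratio < threshold:
            rewrite.append(pack.name)  # copy live entries into a new pack file
        else:
            keep.append(pack.name)     # enough live data: leave untouched
    return delete, rewrite, keep
```

Under these assumptions, a pack file with a live ratio of 0.7 is left alone at a threshold of `0.5` but rewritten at `0.8`, and a threshold of `1.0` rewrites every pack file that contains any dead space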
How to Trigger¶
| Trigger Scenario | Details |
|---|---|
| After a chunk is deleted (immediate) | If the affected pack file's live ratio falls below `pack_auto_compact_threshold`, compaction runs immediately |
| Scheduled job `compact_pack` | Default crontab `0 5 * * 0` (every Sunday at 05:00), uses `pack_maintenance_compact_threshold` |
| `!!pb database prune` step 4 | Uses `pack_maintenance_compact_threshold` |
| `!!pb database compact_packs` | Threshold fixed at `1.0`, only skips fully-live pack files (requires permission level 4) |
SQLite Vacuum¶
Operates at the database file layer, defragmenting the SQLite database file itself
What It Does¶
SQLite does not shrink the database file immediately after deleting data, instead leaving holes in place; the VACUUM command rebuilds the database file, eliminating holes and defragmenting it to reduce disk usage
This operation does not modify any backup data or storage objects; it only affects the file size of prime_backup.db itself
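The behaviour described above can be observed with Python's standard `sqlite3` module on a throwaway database; the sketch below is independent of Prime Backup (the table name, row size and the `vacuum_demo` helper are invented for the example) and assumes SQLite's default `auto_vacuum` setting, which is off

```python
import os
import sqlite3
import tempfile


def vacuum_demo(row_count: int = 2000):
    """Return (size_after_delete, size_after_vacuum) in bytes for a scratch database."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.db")
        conn = sqlite3.connect(path)
        try:
            conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload BLOB)")
            conn.executemany(
                "INSERT INTO t (payload) VALUES (?)",
                ((b"x" * 1024,) for _ in range(row_count)),
            )
            conn.commit()

            # Deleting rows frees pages inside the file but does not shrink it.
            conn.execute("DELETE FROM t")
            conn.commit()
            size_after_delete = os.path.getsize(path)

            # VACUUM rebuilds the file without the free pages;
            # it must run outside of an open transaction.
            conn.execute("VACUUM")
            size_after_vacuum = os.path.getsize(path)
        finally:
            conn.close()
    return size_after_delete, size_after_vacuum
```

Calling `vacuum_demo()` returns two sizes, the second of which is expected to be noticeably smaller than the first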
How to Trigger¶
- Automatic: scheduled job `vacuum_sqlite`, default crontab `0 7 * * 0` (every Sunday at 07:00)
- Manual: `!!pb database vacuum` (requires permission level 4)
Base Fileset Shrink¶
Operates at the fileset layer, removing redundant file entries from base filesets
What It Does¶
Filesets use a base + delta structure; a file entry in the base fileset is considered redundant if it has been completely overridden or deleted by all delta filesets that reference it
The shrink operation will:
- Remove these redundant file entries from the base fileset
- Reclassify delta entries originally marked as "override (delta_override)" to "add (delta_add)" so they can survive independently going forward
- Remove delta entries originally marked as "delete (delta_remove)", since the base entry they referenced no longer exists and the delete marker becomes meaningless
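The three effects of a shrink can be illustrated with plain dictionaries; this is a toy model under stated assumptions, not the real data layout: a base fileset is modelled as `{path: content}`, each delta fileset as `{path: (role, content)}`, and `shrink_base_fileset` is a hypothetical helper name

```python
OVERRIDDEN = ("delta_override", "delta_remove")


def shrink_base_fileset(base, deltas):
    """Shrink `base` and its `deltas` in place; return the removed base paths."""
    if not deltas:
        return []  # assumption: a base without any delta is left untouched

    # A base entry is redundant when every delta overrides or removes it.
    redundant = sorted(
        path
        for path in base
        if all(d.get(path, (None, None))[0] in OVERRIDDEN for d in deltas)
    )

    for path in redundant:
        del base[path]
        for delta in deltas:
            role, content = delta[path]
            if role == "delta_override":
                # The override now stands alone, so it becomes a plain add.
                delta[path] = ("delta_add", content)
            else:
                # A delete marker for a missing base entry is meaningless.
                del delta[path]
    return redundant
```

In this model a base entry survives as soon as a single delta still relies on it, which matches the "all delta filesets" condition above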
How to Trigger¶
- Automatic: when deleting a backup, if the base fileset it belongs to is still shared by other backups, a shrink is automatically performed on that base fileset
- Indirect: `!!pb database prune` step 2 (`ShrinkAllBaseFilesetsAction`, scans all base filesets)
Note
The routine backup deletion process already cleans up redundant file entries in filesets in a timely manner, so this scan normally finds no redundant objects
Orphan Object Scan & Delete¶
Operates at the database object layer, scanning and deleting "orphaned" database records that are no longer referenced by any parent object
Source of Orphan Objects¶
The normal deletion process cascades to clean up the corresponding objects, but in rare circumstances (such as unexpected interruptions or concurrency anomalies) orphan objects may be left behind
What Gets Cleaned¶
The following objects are scanned and deleted in order:
| Object | Criteria |
|---|---|
| Orphan Fileset | A fileset not referenced by any backup |
| Orphan File | A file object not referenced by any fileset |
| Orphan Blob | A blob not referenced by any file |
| Orphan Chunk Group | A chunk group not referenced by any blob |
| Orphan Chunk | A chunk not referenced by any chunk group |
| Orphan Binding | Applies to the Blob-ChunkGroup and ChunkGroup-Chunk binding tables; removes rows pointing to non-existent objects |
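The scan order in the table matters because removing an orphan at one layer can orphan the objects beneath it; the following sketch shows that cascade on an in-memory model, where the `LAYERS` list, the nested-dictionary layout and `sweep_orphans` are assumptions for illustration (the real scan works on database tables, and the binding-table cleanup is not modelled here)

```python
# Parent-to-child order, mirroring the table above (bindings omitted).
LAYERS = ["backup", "fileset", "file", "blob", "chunk_group", "chunk"]


def sweep_orphans(objects):
    """Remove orphans layer by layer, top down.

    `objects` maps a layer name to {object id: set of child ids in the next layer}.
    Returns {layer name: sorted list of removed ids}.
    """
    removed = {}
    for parent_layer, child_layer in zip(LAYERS, LAYERS[1:]):
        # Collect every child id still referenced by a surviving parent.
        referenced = set()
        for children in objects[parent_layer].values():
            referenced |= children

        orphans = sorted(set(objects[child_layer]) - referenced)
        for object_id in orphans:
            del objects[child_layer][object_id]
        removed[child_layer] = orphans
    return removed
```

Because each layer is swept only after its parent layer, a file that belonged solely to an orphan fileset is detected in the same pass rather than needing a second run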
How to Trigger¶
- Indirect: `!!pb database prune` step 1
Note
The routine backup deletion process already cleans up the corresponding orphan objects in a timely manner, so this scan normally finds no orphan objects
Unknown File Scan & Delete¶
Operates at the filesystem layer, scanning storage directories and deleting files that have no corresponding database record
What It Does¶
Prime Backup may leave temporary files in storage directories when a restore fails or is unexpectedly interrupted; these files have no database record and are not cleaned up by normal operations
Scan Scope¶
| Storage Directory | Details |
|---|---|
| `pb_files/blobs/` | Scans the direct blob storage directory and deletes files absent from the database |
| `pb_files/packs/` | Scans the pack file storage directory and deletes files absent from the database |
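Conceptually, each scan compares what is on disk with what the database knows about; a minimal sketch of that comparison is shown below, assuming (hypothetically) that a file is identified by its file name and that the set of known names has already been loaded from the database, with `delete_unknown_files` being an invented helper name

```python
from pathlib import Path


def delete_unknown_files(storage_dir: Path, known_names: set) -> list:
    """Delete files under `storage_dir` whose name is not in `known_names`.

    Returns the deleted paths relative to `storage_dir`, sorted.
    """
    deleted = []
    # Materialise the listing first so deletion does not disturb the walk.
    for path in sorted(storage_dir.rglob("*")):
        if path.is_file() and path.name not in known_names:
            path.unlink()
            deleted.append(path.relative_to(storage_dir).as_posix())
    return deleted
```

A leftover such as `ab/abc123.tmp` would be removed by this sketch while `ab/abc123` stays in place, provided `abc123` is among the known names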
How to Trigger¶
- Indirect: `!!pb database prune` steps 3 and 5