Refine physical backup and restore documentation#346
Conversation
Clarify the definition of physical backup and restore processes, update warnings regarding encryption key management, and improve overall readability.
Pull request overview
Refines the “Physical backups and restores” documentation to better explain the physical restore flow and strengthen guidance around encryption key management during restores.
Changes:
- Rewords the physical backup/restore definitions and removes duplicated content.
- Clarifies that the restore involves temporary/intermediate ``mongod`` starts and updates the log-file behavior wording.
- Updates the encryption admonition from "important" to "warning" and expands key-rotation / key-retention guidance.
| During the restore, the ``pbm-agents`` temporarily start the ``mongod`` nodes using the WiredTiger storage options retrieved from the backup's metadata. The logs for these starts are saved to the ``pbm.restore.log`` file inside the ``dbPath``. Upon successful restore, this file is deleted. However, it remains for debugging if the restore were to fail. |
| During the restore, the ``pbm-agents`` start a temporary non-user reachable instance of each ``mongod`` nodes using the WiredTiger storage options retrieved from the backup's metadata. The logs for these "intermediate" starts are saved to the ``pbm.restore.log`` file inside the ``dbPath``. Upon successful restore, these files are deleted. However, they remain for debugging if the restore were to fail. |
Grammar/clarity issues in this sentence make the restore flow hard to follow: "instance of each mongod nodes" mixes singular/plural, and the text says logs are saved to a single "pbm.restore.log" file but later refers to "these files" being deleted. Consider rephrasing to consistently describe whether there is one temporary instance/log per node and use consistent singular/plural (e.g., "each mongod node" and "log file(s)").
| During the restore, the ``pbm-agents`` start a temporary non-user reachable instance of each ``mongod`` nodes using the WiredTiger storage options retrieved from the backup's metadata. The logs for these "intermediate" starts are saved to the ``pbm.restore.log`` file inside the ``dbPath``. Upon successful restore, these files are deleted. However, they remain for debugging if the restore were to fail. |
| During the restore, the ``pbm-agents`` start a temporary, non-user-reachable instance for each ``mongod`` node using the WiredTiger storage options retrieved from the backup's metadata. The logs for these "intermediate" starts are saved to a ``pbm.restore.log`` file inside each node's ``dbPath``. Upon successful restore, these log files are deleted. However, they remain for debugging if the restore fails. |
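The log-file lifecycle described in the suggested wording can be sketched as follows. This is a toy illustration of the documented behavior, not PBM's actual code; the function name `cleanup_restore_log` is hypothetical:

```python
import os
import tempfile

def cleanup_restore_log(db_path: str, restore_ok: bool) -> bool:
    """Sketch of the documented behavior: the per-node pbm.restore.log
    is deleted after a successful restore but kept for debugging when
    the restore fails. Returns True if the log file remains."""
    log_file = os.path.join(db_path, "pbm.restore.log")
    if restore_ok and os.path.exists(log_file):
        os.remove(log_file)          # successful restore: log removed
    return os.path.exists(log_file)  # failed restore: log kept

# Simulate both outcomes in a throwaway dbPath.
with tempfile.TemporaryDirectory() as db_path:
    open(os.path.join(db_path, "pbm.restore.log"), "w").close()
    print(cleanup_restore_log(db_path, restore_ok=True))   # False: deleted
with tempfile.TemporaryDirectory() as db_path:
    open(os.path.join(db_path, "pbm.restore.log"), "w").close()
    print(cleanup_restore_log(db_path, restore_ok=False))  # True: kept
```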
| | Advantages | Disadvantages | |
| | ------------------------------ | ------------------------------- | |
| |- Faster backup and restore speed <br> - Recommended for big, multi-terabyte datasets <br> - No database overhead | - The backup size is bigger than for logical backups due to data fragmentation extra cost of keeping data and indexes in appropriate data structures <br> - Extra manual operations are required after the restore <br> - Point-in-time recovery requires manual operations | Sharded clusters and non-sharded replica sets | |
| |- Faster backup and restore speed <br> - Recommended for big, multi-TB datasets <br> - No overhead at database level | - The backup size could be bigger than for logical backups due to data fragmentation, and the cost of storing the files of each index <br> - Extra manual post-restore steps are required | |
The doc uses both "multi-terabyte" (line 25) and the abbreviated "multi-TB" (line 29) to describe the same scale. Standardizing on one term in this section would improve readability and avoid potential ambiguity for readers unfamiliar with the abbreviation.
| |- Faster backup and restore speed <br> - Recommended for big, multi-TB datasets <br> - No overhead at database level | - The backup size could be bigger than for logical backups due to data fragmentation, and the cost of storing the files of each index <br> - Extra manual post-restore steps are required | |
| |- Faster backup and restore speed <br> - Recommended for big, multi-terabyte datasets <br> - No overhead at database level | - The backup size could be bigger than for logical backups due to data fragmentation, and the cost of storing the files of each index <br> - Extra manual post-restore steps are required | |
| !!! warning |
| Make sure that you know which master encryption key was used and keep it safe, as this key is required for the restore. |
| The master encryption key is not stored as part of the backup. Make sure that you know which encryption key was used at the time of each backup (and back it up separately). If using a key management system, make sure to enable key versioning (Vault does this by default). Do NOT delete old key versions as these are still required for restoring "old" backups that happened before key rotation. |
In this warning, the terminology switches from "master encryption key" to the more generic "encryption key", which can be ambiguous (e.g., readers may not realize you mean the master key and/or its specific version at backup time). Consider using consistent terminology throughout and explicitly calling out that the restore requires the same master key version/ID that was active when the backup was taken.
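The key-retention point the warning makes can be shown with a toy model. This is not PBM's or any KMS's implementation; the dictionaries and the function `key_for_restore` are hypothetical. Each backup records which master-key version was active when it was taken, so deleting an old key version makes every backup taken under it unrestorable:

```python
# Toy model: a versioned KMS keeps every master-key version,
# and each backup is tied to the version active at backup time.
key_versions = {1: "old-master-key", 2: "rotated-master-key"}
backups = {"2024-01-05T12:00:00Z": 1, "2024-06-01T12:00:00Z": 2}

def key_for_restore(backup_name: str) -> str:
    """Return the master-key version required to restore this backup."""
    version = backups[backup_name]
    if version not in key_versions:
        # This is what happens if old key versions are deleted after rotation.
        raise KeyError(f"key version {version} was deleted; backup unrestorable")
    return key_versions[version]

# The pre-rotation backup still needs the pre-rotation key version.
print(key_for_restore("2024-01-05T12:00:00Z"))  # old-master-key
print(key_for_restore("2024-06-01T12:00:00Z"))  # rotated-master-key
```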