modules/manage/partials/whole-cluster-restore.adoc
139 additions & 0 deletions
@@ -53,6 +53,7 @@ By default, Redpanda uploads cluster metadata to object storage periodically. Yo
* xref:reference:cluster-properties.adoc#enable_cluster_metadata_upload_loop[`enable_cluster_metadata_upload_loop`]: Enable metadata uploads. This property is enabled by default and is required for Whole Cluster Restore.
* xref:reference:properties/object-storage-properties.adoc#cloud_storage_cluster_metadata_upload_interval_ms[`cloud_storage_cluster_metadata_upload_interval_ms`]: Set the time interval to wait between metadata uploads.
* xref:reference:cluster-properties.adoc#controller_snapshot_max_age_sec[`controller_snapshot_max_age_sec`]: Maximum amount of time that can pass before Redpanda attempts to take a controller snapshot after a new controller command appears. This property affects how current the uploaded metadata can be.
* xref:reference:properties/object-storage-properties.adoc#cloud_storage_cluster_name[`cloud_storage_cluster_name`]: *Advanced: This is an internal-only configuration and should be enabled only after consulting with Redpanda support.* Specify a custom name for the cluster's metadata in object storage. Use this property when multiple clusters share the same storage bucket (for example, for Whole Cluster Restore).

NOTE: You can monitor the xref:reference:public-metrics-reference.adoc#redpanda_cluster_latest_cluster_metadata_manifest_age[redpanda_cluster_latest_cluster_metadata_manifest_age] metric to track the age of the most recent metadata upload.
When the cluster restore completes successfully, you can redirect your application workload to the new cluster. Make sure to update your application code to use the new addresses of your brokers.

== (Advanced) Restore data when multiple clusters share data

[CAUTION]
====
This is an advanced use case and should be performed only after consulting with Redpanda support.
====

Typically, there is a one-to-one mapping between a Redpanda cluster and its object storage bucket. However, you can also run multiple clusters that share the same bucket. This lets you move tenants between clusters without moving data, because the data remains in the same bucket. For example, you can mount topics to multiple clusters in the same bucket.

Running multiple clusters that share the same storage bucket presents unique challenges during Whole Cluster Restore operations. To manage these challenges, you must first understand how Redpanda uses <<the-role-of-cluster-uuids-in-whole-cluster-restore,UUIDs>> (universally unique identifiers) to identify clusters during Whole Cluster Restore.

=== The role of cluster UUIDs in Whole Cluster Restore

Every time a Redpanda cluster (single-node or multi-node) starts, it is automatically assigned a random UUID. From that moment forward, all entities created by the cluster are identifiable using that cluster UUID. Such entities include:

- Topic data
- Topic metadata
- Whole Cluster Restore manifests
- Controller log snapshots for Whole Cluster Restore
- Consumer offsets for Whole Cluster Restore

However, not all entities _managed_ by the cluster are identifiable using this cluster UUID, because Redpanda can recover a different cluster in place of the existing cluster, or mount topics from different clusters. For a cluster that has been running for some time, your object storage may look like this:

[source,bash]
----
/
+- cluster_metadata/
   +- <uuid-a>/manifests/
   |  +- 0/cluster_manifest.json
   |  +- 1/cluster_manifest.json
   |  +- 2/cluster_manifest.json
   +- <uuid-b>/manifests/
   |  +- 3/cluster_manifest.json
   |  +- 4/cluster_manifest.json
   +- <uuid-c>/manifests/  # Previously active but not restored.
   |  # Still, the manifest number starts at the
   |  # highest found in the bucket plus one.
   |  +- 5/cluster_manifest.json
   |  +- 6/cluster_manifest.json
   +- <uuid-d>/manifests/  # Active cluster (not restored)
      +- 7/cluster_manifest.json
      +- 8/cluster_manifest.json
----

Redpanda's algorithm lists all objects (cluster manifests) in object storage and, during a Whole Cluster Restore, picks the object with the _highest ID available_ rather than selecting by the current cluster UUID. In this case, if you attempt a restore, you recover `/cluster_metadata/<uuid-c>/manifests/6/cluster_manifest.json`, even though the active cluster is `<uuid-d>`.

However, this algorithm does not work if you have multiple clusters sharing the same object storage bucket. For example, your object storage might look like this:

[source,bash]
----
/
+- cluster_metadata/
   +- <uuid-a>/manifests/
   |  +- 0/cluster_manifest.json
   |  +- 1/cluster_manifest.json
   |  +- 2/cluster_manifest.json
   +- <uuid-b>/manifests/
      +- 0/cluster_manifest.json
      +- 1/cluster_manifest.json  # lost cluster
----

Here, if you lose the cluster `uuid-b` and want to recover it, the recovery process selects the metadata for `uuid-a`, which leads to a split-brain or data corruption scenario. For troubleshooting details, see <<resolve-repeated-recovery-failures,Resolve repeated recovery failures>>.
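
The selection rule described above can be sketched in a few lines of Python. This is only an illustration of "pick the highest manifest ID in the bucket", not Redpanda's implementation: the key layout is copied from the listing above, and `pick_highest_manifest`, `manifest_keys`, and `shared_bucket` are hypothetical names introduced for this example.

```python
import re

# Hypothetical key layout, modeled on the bucket listings in this section.
MANIFEST_KEY = re.compile(
    r"^cluster_metadata/(?P<uuid>[^/]+)/manifests/(?P<seq>\d+)/cluster_manifest\.json$"
)


def pick_highest_manifest(keys):
    """Return (sequence_id, uuid, key) for the highest manifest ID, or None."""
    candidates = []
    for key in keys:
        match = MANIFEST_KEY.match(key)
        if match:
            candidates.append((int(match.group("seq")), match.group("uuid"), key))
    # Tuples compare by sequence ID first, so max() finds the highest ID
    # without considering which cluster wrote the manifest.
    return max(candidates, default=None)


def manifest_keys(uuid, ids):
    """Build illustrative object keys for one cluster UUID."""
    return [f"cluster_metadata/{uuid}/manifests/{i}/cluster_manifest.json" for i in ids]


# Two clusters share one bucket; uuid-b is the cluster that was lost.
shared_bucket = manifest_keys("uuid-a", [0, 1, 2]) + manifest_keys("uuid-b", [0, 1])
seq, owner, key = pick_highest_manifest(shared_bucket)
print(seq, owner)  # 2 uuid-a
```

Because `uuid-a` holds the highest ID in the shared bucket, the sketch selects its manifest even though the goal was to recover `uuid-b`.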

=== Configure cluster names for multiple source clusters

To disambiguate cluster metadata from multiple clusters, use the xref:reference:properties/object-storage-properties.adoc#cloud_storage_cluster_name[`cloud_storage_cluster_name`] property (not set by default) to assign a unique name to each cluster that shares the same object storage bucket. The name must be unique within the bucket, be 1-64 characters long, and use only letters, numbers, underscores, and hyphens. Do not change this value after you set it. With cluster names configured, your object storage bucket may look like this:

[source,bash]
----
/
+- cluster_metadata/
|  +- <uuid-a>/manifests/
|  |  +- 0/cluster_manifest.json
|  |  +- 1/cluster_manifest.json
|  |  +- 2/cluster_manifest.json
|  +- <uuid-b>/manifests/
|     +- 0/cluster_manifest.json
|     +- 1/cluster_manifest.json  # lost cluster
+- cluster_name/
   +- rp-foo/uuid/<uuid-a>
   +- rp-qux/uuid/<uuid-b>
----
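
The naming rules for `cloud_storage_cluster_name` (unique within the bucket, 1-64 characters, only letters, numbers, underscores, and hyphens) can be checked before you apply the property. The following is a minimal sketch: `is_valid_cluster_name` is a hypothetical helper, not part of Redpanda or `rpk`, and it assumes that "letters" and "numbers" mean ASCII letters and digits.

```python
import re

# Assumption: "letters" and "numbers" mean ASCII letters and digits.
CLUSTER_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_cluster_name(name, names_in_bucket=()):
    """Check the documented length, character, and uniqueness constraints."""
    if CLUSTER_NAME_RE.fullmatch(name) is None:
        return False
    # The name must not collide with another cluster in the same bucket.
    return name not in names_in_bucket


print(is_valid_cluster_name("rp-qux"))              # True
print(is_valid_cluster_name("rp qux"))              # False
print(is_valid_cluster_name("rp-foo", {"rp-foo"}))  # False
```

The second argument stands in for the names already present under `cluster_name/` in the bucket. How you obtain that list depends on your object storage tooling.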

When you create a new cluster and specify its `cloud_storage_cluster_name` (here, `rp-qux`), your object storage bucket may look like this:

[source,bash]
----
+- cluster_metadata/
|  +- <uuid-a>/manifests/
|  |  +- 0/cluster_manifest.json
|  |  +- 1/cluster_manifest.json
|  |  +- 2/cluster_manifest.json
|  +- <uuid-b>/manifests/
|  |  +- 0/cluster_manifest.json
|  |  +- 1/cluster_manifest.json  # lost cluster
|  +- <uuid-c>/manifests/
|     +- 3/cluster_manifest.json  # new cluster
|     #  ^- next highest sequence number globally
+- cluster_name/
   +- rp-foo/uuid/<uuid-a>
   +- rp-qux/uuid/
      +- <uuid-b>
      +- <uuid-c>  # reference to new cluster
----
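
The comment in the listing above says that the new cluster's first manifest takes the next sequence number across the whole bucket. As a rough sketch of that numbering rule, with `next_manifest_id` as a hypothetical helper and the empty-bucket starting value of 0 assumed from the listings:

```python
def next_manifest_id(existing_ids):
    """Return the highest manifest ID found in the bucket plus one."""
    # Assumption: an empty bucket starts numbering at 0, as in the listings.
    return max(existing_ids, default=-1) + 1


# IDs under <uuid-a> (0, 1, 2) and <uuid-b> (0, 1) from the listing above.
print(next_manifest_id([0, 1, 2, 0, 1]))  # 3
```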

During a Whole Cluster Restore, Redpanda looks for the cluster name specified in `cloud_storage_cluster_name` and considers only manifests associated with that name. In this example, if you start a cluster with `cloud_storage_cluster_name` set to `rp-qux`, Redpanda considers only manifests under `<uuid-b>` and `<uuid-c>`, and ignores `<uuid-a>` entirely.

Redpanda uses this name to organize the cluster metadata within the shared object storage bucket. This ensures that each cluster's data remains distinct and prevents conflicts during recovery operations.
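
To make the name-based filtering concrete, the following sketch narrows a bucket listing to the manifests that belong to one cluster name. It is an illustration under assumptions, not Redpanda's implementation: the key layout is taken from the listing above, and `manifests_for_name` and `bucket` are hypothetical names.

```python
import re

# Hypothetical key layouts, modeled on the bucket listings in this section.
MANIFEST_KEY = re.compile(r"^cluster_metadata/([^/]+)/manifests/(\d+)/cluster_manifest\.json$")
NAME_KEY = re.compile(r"^cluster_name/([^/]+)/uuid/([^/]+)$")


def manifests_for_name(keys, cluster_name):
    """Return the manifest keys owned by UUIDs registered under cluster_name."""
    uuids = set()
    for key in keys:
        match = NAME_KEY.match(key)
        if match and match.group(1) == cluster_name:
            uuids.add(match.group(2))
    selected = []
    for key in keys:
        match = MANIFEST_KEY.match(key)
        if match and match.group(1) in uuids:
            selected.append(key)
    return sorted(selected)


bucket = [
    "cluster_metadata/uuid-a/manifests/2/cluster_manifest.json",
    "cluster_metadata/uuid-b/manifests/1/cluster_manifest.json",
    "cluster_metadata/uuid-c/manifests/3/cluster_manifest.json",
    "cluster_name/rp-foo/uuid/uuid-a",
    "cluster_name/rp-qux/uuid/uuid-b",
    "cluster_name/rp-qux/uuid/uuid-c",
]
for key in manifests_for_name(bucket, "rp-qux"):
    print(key)
```

For `rp-qux`, only the keys under `uuid-b` and `uuid-c` remain, so the manifests written by `uuid-a` never enter the candidate set.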

=== Resolve repeated recovery failures

If recovery fails repeatedly after a cluster is lost and recreated, the automated recovery algorithm may have selected the manifest with the highest sequence number. That manifest might be the most recent one, which contains no data, instead of the original one that contains the data. Your object storage bucket might look like this:

[source,bash]
----
/
+- cluster_metadata/
   +- <uuid-a>/manifests/
   |  +- 0/cluster_manifest.json
   |  +- 1/cluster_manifest.json  # lost cluster
   +- <uuid-b>/manifests/
   |  +- 3/cluster_manifest.json  # lost again (not recovered)
   +- <uuid-d>/manifests/
      +- 7/cluster_manifest.json  # new attempt to recover uuid-b
                                  # it does not have the data
----
In such cases, you can explicitly run a POST request using the Admin API: