Commit 3cd160a

Doc-1601: Specify cluster UUID to restore with Whole Cluster Recovery
1 parent fa837c7

modules/manage/partials/whole-cluster-restore.adoc (1 file changed: +139 -0)

@@ -53,6 +53,7 @@ By default, Redpanda uploads cluster metadata to object storage periodically. Yo

* xref:reference:cluster-properties.adoc#enable_cluster_metadata_upload_loop[`enable_cluster_metadata_upload_loop`]: Enable metadata uploads. This property is enabled by default and is required for Whole Cluster Restore.
* xref:reference:properties/object-storage-properties.adoc#cloud_storage_cluster_metadata_upload_interval_ms[`cloud_storage_cluster_metadata_upload_interval_ms`]: Set the time interval to wait between metadata uploads.
* xref:reference:cluster-properties.adoc#controller_snapshot_max_age_sec[`controller_snapshot_max_age_sec`]: Maximum amount of time that can pass before Redpanda attempts to take a controller snapshot after a new controller command appears. This property affects how current the uploaded metadata can be.
* xref:reference:properties/object-storage-properties.adoc#cloud_storage_cluster_name[`cloud_storage_cluster_name`]: *Advanced: This is an internal-only configuration and should be enabled only after consulting with Redpanda support.* Specify a custom name for the cluster's metadata in object storage. Use this when multiple clusters share the same storage bucket (for example, for Whole Cluster Restore).

NOTE: You can monitor the xref:reference:public-metrics-reference.adoc#redpanda_cluster_latest_cluster_metadata_manifest_age[redpanda_cluster_latest_cluster_metadata_manifest_age] metric to track the age of the most recent metadata upload.
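
These are cluster properties, so they can be changed at runtime; the following is a hypothetical sketch using `rpk cluster config` (the name `rp-qux` is illustrative, and `cloud_storage_cluster_name` should be set only after consulting Redpanda support):

```shell
# Hypothetical sketch: set and verify the custom cluster name with rpk.
# Set cloud_storage_cluster_name only after consulting Redpanda support,
# and before the cluster shares a bucket with another cluster.
rpk cluster config set cloud_storage_cluster_name rp-qux

# Confirm the value Redpanda will use for metadata in object storage.
rpk cluster config get cloud_storage_cluster_name
```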

@@ -225,3 +226,141 @@ NODE CONFIG-VERSION NEEDS-RESTART INVALID UNKNOWN
endif::[]

When the cluster restore is successfully completed, you can redirect your application workload to the new cluster. Make sure to update your application code to use the new addresses of your brokers.

== (Advanced) Restore data when multiple clusters share data

[CAUTION]
====
This is an advanced use case and should be performed only after consulting with Redpanda support.
====

Typically, there is a one-to-one mapping between a Redpanda cluster and its object storage bucket. However, you can also run multiple clusters that share the same bucket. This allows you to move tenants between clusters without moving data, as the data remains in the same bucket. For example, you can mount topics to multiple clusters in the same bucket.

Running multiple clusters that share the same storage bucket presents unique challenges during Whole Cluster Restore operations. To manage these challenges, you must first understand how Redpanda uses <<the-role-of-cluster-uuids-in-whole-cluster-restore,UUIDs>> (universally unique identifiers) to identify clusters during Whole Cluster Restore.

=== The role of cluster UUIDs in Whole Cluster Restore

Every time a Redpanda cluster (single node or more) starts, it is automatically assigned a random UUID. From that moment forward, all entities created by the cluster are identifiable by that cluster UUID. Such entities include:

- Topic data
- Topic metadata
- Whole Cluster Restore manifests
- Controller log snapshots for Whole Cluster Restore
- Consumer offsets for Whole Cluster Restore

However, not all entities _managed_ by the cluster are identifiable by this cluster UUID. In fact, Redpanda can recover a different cluster in place of the existing cluster, or mount topics from different clusters. For a cluster that has been running for some time, your object storage may look like this:

[source,bash]
----
/
+- cluster_metadata/
   +- <uuid-a>/manifests/
   |  +- 0/cluster_manifest.json
   |  +- 1/cluster_manifest.json
   |  +- 2/cluster_manifest.json
   +- <uuid-b>/manifests/
   |  +- 3/cluster_manifest.json
   |  +- 4/cluster_manifest.json
   +- <uuid-c>/manifests/        # Previously active but not restored. Still, the
   |  +- 5/cluster_manifest.json # manifest number starts at the highest found
   |  +- 6/cluster_manifest.json # in the bucket plus one.
   +- <uuid-d>/manifests/        # Active cluster (not restored).
      +- 7/cluster_manifest.json
      +- 8/cluster_manifest.json
----

Redpanda's restore algorithm lists all cluster manifests in object storage and, during a Whole Cluster Restore, picks the manifest with the _highest ID available_, excluding those under the current cluster's UUID. In this case, if you attempt a restore, you recover `/cluster_metadata/<uuid-c>/manifests/6/cluster_manifest.json`, even though the active cluster is `<uuid-d>`.
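
The selection rule can be sketched as a small shell pipeline over a hypothetical bucket listing. The paths and the `current_uuid` value are illustrative, not Redpanda's actual implementation:

```shell
#!/bin/sh
# Sketch of the manifest-selection rule, not Redpanda's actual code:
# among all cluster manifests, pick the highest numeric ID that is not
# under the current (restoring) cluster's UUID.
current_uuid="uuid-d"

# Hypothetical bucket listing (highest manifest per cluster shown).
manifests="cluster_metadata/uuid-a/manifests/2/cluster_manifest.json
cluster_metadata/uuid-b/manifests/4/cluster_manifest.json
cluster_metadata/uuid-c/manifests/6/cluster_manifest.json
cluster_metadata/uuid-d/manifests/8/cluster_manifest.json"

printf '%s\n' "$manifests" \
  | grep -v "/${current_uuid}/" \
  | sort -t/ -k4,4n \
  | tail -n1
# Prints the uuid-c manifest with ID 6.
```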

However, this algorithm does not work if multiple clusters share the same object storage bucket. For example, your object storage might look like this:

[source,bash]
----
/
+- cluster_metadata/
   +- <uuid-a>/manifests/
   |  +- 0/cluster_manifest.json
   |  +- 1/cluster_manifest.json
   |  +- 2/cluster_manifest.json
   +- <uuid-b>/manifests/
      +- 0/cluster_manifest.json
      +- 1/cluster_manifest.json # Lost cluster.
----

Here, if you have lost cluster `uuid-b` and want to recover it, the recovery process selects the metadata for `uuid-a`, which leads to a split-brain or data-corruption scenario. For troubleshooting details, see <<resolve-repeated-recovery-failures,Resolve repeated recovery failures>>.

=== Configure cluster names for multiple source clusters

To disambiguate cluster metadata from multiple clusters, use the xref:reference:properties/object-storage-properties.adoc#cloud_storage_cluster_name[`cloud_storage_cluster_name`] property (unset by default), which lets you assign a unique name to each cluster sharing the same object storage bucket. The name must be unique within the bucket, 1-64 characters long, and contain only letters, numbers, underscores, and hyphens. Do not change this value after you set it. Once set, your object storage bucket may look like this:

[source,bash]
----
/
+- cluster_metadata/
|  +- <uuid-a>/manifests/
|  |  +- 0/cluster_manifest.json
|  |  +- 1/cluster_manifest.json
|  |  +- 2/cluster_manifest.json
|  +- <uuid-b>/manifests/
|     +- 0/cluster_manifest.json
|     +- 1/cluster_manifest.json # Lost cluster.
+- cluster_name/
   +- rp-foo/uuid/<uuid-a>
   +- rp-qux/uuid/<uuid-b>
----

When a new cluster is created, and you have specified its `cloud_storage_cluster_name` (here, `rp-qux`), your object storage bucket may look like this:

[source,bash]
----
/
+- cluster_metadata/
|  +- <uuid-a>/manifests/
|  |  +- 0/cluster_manifest.json
|  |  +- 1/cluster_manifest.json
|  |  +- 2/cluster_manifest.json
|  +- <uuid-b>/manifests/
|  |  +- 0/cluster_manifest.json
|  |  +- 1/cluster_manifest.json # Lost cluster.
|  +- <uuid-c>/manifests/
|     +- 3/cluster_manifest.json # New cluster: next highest sequence number globally.
+- cluster_name/
   +- rp-foo/uuid/<uuid-a>
   +- rp-qux/uuid/
      +- <uuid-b>
      +- <uuid-c> # Reference to new cluster.
----

During a Whole Cluster Restore, Redpanda looks for the cluster name specified in `cloud_storage_cluster_name` and only considers manifests associated with that name. In this example, if you start a cluster with `cloud_storage_cluster_name` set to `rp-qux`, Redpanda only considers manifests under `<uuid-b>` and `<uuid-c>`, ignoring `<uuid-a>` entirely.

Redpanda uses this name to organize the cluster metadata within the shared object storage bucket. This ensures that each cluster's data remains distinct and prevents conflicts during recovery operations.
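
The name-to-UUID resolution described above can be sketched as a shell filter over the `cluster_name/` marker paths. The listed paths are illustrative stand-ins for an object-store listing, not output of any Redpanda tool:

```shell
#!/bin/sh
# Sketch: list the UUIDs that a restore with
# cloud_storage_cluster_name=rp-qux would consider, given the
# cluster_name/<name>/uuid/<uuid> layout shown above.
cluster_name="rp-qux"

# Hypothetical object listing of the cluster_name/ prefix.
markers="cluster_name/rp-foo/uuid/uuid-a
cluster_name/rp-qux/uuid/uuid-b
cluster_name/rp-qux/uuid/uuid-c"

printf '%s\n' "$markers" \
  | grep "^cluster_name/${cluster_name}/uuid/" \
  | awk -F/ '{ print $4 }'
# Prints uuid-b and uuid-c; uuid-a (under rp-foo) is ignored.
```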

=== Resolve repeated recovery failures

If you experience repeated failures after a cluster is lost and recreated, the automated recovery algorithm may have selected the manifest with the highest sequence number, which might be the most recent one with no data, instead of the original one that contains the data. Your object storage bucket might look like this:

[source,bash]
----
/
+- cluster_metadata/
   +- <uuid-a>/manifests/
   |  +- 0/cluster_manifest.json
   |  +- 1/cluster_manifest.json # Lost cluster.
   +- <uuid-b>/manifests/
   |  +- 3/cluster_manifest.json # Lost again (not recovered).
   +- <uuid-d>/manifests/
      +- 7/cluster_manifest.json # New attempt to recover uuid-b;
                                 # it does not have the data.
----

In such cases, you can explicitly specify the cluster UUID to restore by sending a POST request to the Admin API:

[source,bash]
----
curl -X POST \
  --data '{"cluster_uuid_override": "<uuid-a>"}' \
  http://localhost:9644/v1/cloud_storage/automated_recovery
----

For details, see the Admin API reference.
