
Adding multi-cluster support via array of cluster configs based on topology #6116

Open

dridge93 wants to merge 1 commit into ceph:devel from dridge93:multi-cluster-support

Conversation

@dridge93

Adding multi-cluster support via array of cluster configs based on topology

Describe what this PR does

Adds multi-cluster support to Ceph CSI RBD (#5177)

Related issues

Fixes: #5177
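
Example configuration (sketch)

A single StorageClass points at a cluster-topology ConfigMap, and each entry in that ConfigMap maps a clusterID to the zones it serves. The clusterTopologyConfigMap parameter and the clusterTopology/clusterID/zones fields are the ones the driver consumes; the object names, namespace, data key, and zone values below are purely illustrative.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd-multi-cluster                  # illustrative name
provisioner: rbd.csi.ceph.com
parameters:
  clusterTopologyConfigMap: ceph-csi-rbd-cluster-topology
  # other RBD parameters omitted
volumeBindingMode: WaitForFirstConsumer    # topology-aware setups commonly delay binding
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ceph-csi-rbd-cluster-topology
  namespace: ceph-csi                      # illustrative namespace
data:
  cluster-topology.json: |-                # data key name is illustrative
    {
      "clusterTopology": [
        { "clusterID": "cluster-az01", "zones": ["az01"] },
        { "clusterID": "cluster-az02", "zones": ["az02"] }
      ]
    }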


Available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@dridge93 force-pushed the multi-cluster-support branch 2 times, most recently from ef3f0c6 to 035176f on February 25, 2026 20:55
@dridge93
Author

Has anyone had a chance to review this PR?

@github-actions

github-actions Bot commented Apr 9, 2026

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

@github-actions github-actions Bot added the stale label Apr 9, 2026
@mergify
Contributor

mergify Bot commented Apr 9, 2026

This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏

@github-actions github-actions Bot removed the stale label Apr 10, 2026
Adding multi-cluster support via array of cluster configs based on topology

Signed-off-by: Devin Ridge <dridge@globalnoc.iu.edu>
@dridge93 force-pushed the multi-cluster-support branch from 035176f to 933bbff on April 24, 2026 14:44
@akaitux

akaitux commented Apr 28, 2026

@dridge93

I found what looks like a functional bug in the original multi-AZ / multi-cluster topology patch
(933bbff4a9a4747dc593ff7df577dfed95a4884c).

The new helper already returns topology requirements for the clusterTopologyConfigMap path:

func GetClusterTopologiesFromRequest(
    req *csi.CreateVolumeRequest,
) (*[]ClusterTopology, *csi.TopologyRequirement, error)

and inside it:

accessibilityRequirements := req.GetAccessibilityRequirements()
...
return &clusterTopologies, accessibilityRequirements, nil

But in parseVolCreateRequest() that return value is discarded:

rbdVol.TopologyPools, rbdVol.TopologyRequirement, err = util.GetTopologyFromRequest(req)
...
rbdVol.ClusterTopologies, _, err = util.GetClusterTopologiesFromRequest(req)

So when the StorageClass uses only:

parameters:
  clusterTopologyConfigMap: ceph-csi-rbd-cluster-topology

and does not use topologyConstrainedPools, then:

  - rbdVol.ClusterTopologies is populated
  - rbdVol.TopologyRequirement stays nil

Later both CreateVolume() and updateTopologyConstraints() do:

selectedCluster, _, err = util.FindClusterAndTopology(
    rbdVol.ClusterTopologies,
    rbdVol.TopologyRequirement,
)

or:

cluster, topology, err := util.FindClusterAndTopology(
    rbdVol.ClusterTopologies,
    rbdVol.TopologyRequirement,
)

but FindClusterAndTopology() starts with:

if clusterTopologies == nil || accessibilityRequirements == nil {
    return ClusterTopology{}, nil, nil
}

So the cluster selection silently returns empty, and CreateVolume fails with:

no matching cluster found for provided topology requirements

even though the incoming request already contains valid topology like:

"accessibility_requirements": {
  "preferred": [
    {"segments":{"topology.rbd.csi.ceph.com/zone":"az01"}},
    {"segments":{"topology.rbd.csi.ceph.com/zone":"az02"}}
  ]
}

and the live clusterTopologyConfigMap contains matching zones:

{
  "clusterTopology": [
    {
      "clusterID": "...",
      "zones": ["az01"],
      ...
    },
    {
      "clusterID": "...",
      "zones": ["az02"],
      ...
    }
  ]
}

The minimal fix seems to be to preserve the topology requirement returned by GetClusterTopologiesFromRequest() when
rbdVol.TopologyRequirement is still nil.

diff --git a/internal/rbd/controllerserver.go b/internal/rbd/controllerserver.go
index 6ef562373..dd4a3a148 100644
--- a/internal/rbd/controllerserver.go
+++ b/internal/rbd/controllerserver.go
@@ -237,10 +237,14 @@ func (cs *ControllerServer) parseVolCreateRequest(
 	}

 	// store cluster topology information from the request if present
-	rbdVol.ClusterTopologies, _, err = util.GetClusterTopologiesFromRequest(req)
+	var clusterTopologyRequirement *csi.TopologyRequirement
+	rbdVol.ClusterTopologies, clusterTopologyRequirement, err = util.GetClusterTopologiesFromRequest(req)
 	if err != nil {
 		return nil, status.Error(codes.InvalidArgument, err.Error())
 	}
+	if rbdVol.TopologyRequirement == nil {
+		rbdVol.TopologyRequirement = clusterTopologyRequirement
+	}

 	// parse QOS parameters from mutable parameters
 	err = rbdVol.SetQOS(ctx, req.GetMutableParameters())

@akaitux

akaitux commented Apr 28, 2026

@dridge93
Second, there is another gap in the original multi-cluster topology flow: the selected cluster secret is only used on the controller/
provisioning path, but not on the node-stage path.

In CreateVolume(), when req.GetSecrets() is empty, the implementation already falls back to the cluster-specific secret
selected from clusterTopologyConfigMap:

selectedCluster, _, err = util.FindClusterAndTopology(...)
...
rbdVol.ClusterSecretName = selectedCluster.SecretName
...
if len(secrets) == 0 && selectedCluster.SecretName != "" {
    secrets, err = k8s.GetSecret(selectedCluster.SecretName, namespace)
}

and then clusterSecretName is persisted into VolumeContext:

  if rbdVol.ClusterSecretName != "" {
      vol.VolumeContext["clusterSecretName"] = rbdVol.ClusterSecretName
  }

So the controller side already knows which secret belongs to the selected cluster.
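
For illustration only: if provisioning selected the az01 cluster, the node-stage request's volume context then carries something along these lines (the secret name is a placeholder for whatever that cluster's entry references):

"volume_context": {
  "clusterSecretName": "ceph-csi-secret-az01",
  ...
}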

However, on the node path, NodeStageVolume() still requires standard CSI node-stage secrets to be present in the request:

  if req.GetSecrets() == nil || len(req.GetSecrets()) == 0 {
      return status.Error(codes.InvalidArgument, "stage secrets cannot be nil or empty")
  }

and then immediately does:

  cr, err := util.NewUserCredentialsWithMigration(req.GetSecrets())

This means that a StorageClass using only:

  parameters:
    clusterTopologyConfigMap: ceph-csi-rbd-cluster-topology

still fails at node stage unless csi.storage.k8s.io/node-stage-secret-* is also set statically.

That is problematic for the multi-cluster mode, because:

  • cluster selection is dynamic (az01 / az02 / az03)
  • but node-stage-secret-* in the StorageClass is static
  • so a single StorageClass cannot correctly represent per-cluster credentials unless the same credentials are valid
    everywhere (sketched below)
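
To make that concrete: the per-cluster credentials already have a natural home in the topology ConfigMap, which is what selectedCluster.SecretName implies on the controller side. A sketch, assuming the JSON key is secretName and using placeholder secret names:

{
  "clusterTopology": [
    { "clusterID": "cluster-az01", "zones": ["az01"], "secretName": "ceph-csi-secret-az01" },
    { "clusterID": "cluster-az02", "zones": ["az02"], "secretName": "ceph-csi-secret-az02" }
  ]
}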

The minimal fix is:

  1. relax ValidateNodeStageVolumeRequest() so empty req.Secrets is allowed when req.VolumeContext["clusterSecretName"] is
    present
  2. in NodeStageVolume(), if req.GetSecrets() is empty, load the secret from clusterSecretName in the driver namespace and use
    it to build credentials

Example fix:

diff --git a/internal/util/validate.go b/internal/util/validate.go
@@
- if req.GetSecrets() == nil || len(req.GetSecrets()) == 0 {
-     return status.Error(codes.InvalidArgument, "stage secrets cannot be nil or empty")
- }
+ if req.GetSecrets() == nil || len(req.GetSecrets()) == 0 {
+     if req.GetVolumeContext()["clusterSecretName"] == "" {
+         return status.Error(codes.InvalidArgument, "stage secrets cannot be nil or empty")
+     }
+ }

diff --git a/internal/rbd/nodeserver.go b/internal/rbd/nodeserver.go
@@
- cr, err := util.NewUserCredentialsWithMigration(req.GetSecrets())
+ secrets := req.GetSecrets()
+ if len(secrets) == 0 {
+     // fall back to the per-cluster secret recorded in the volume context at provision time
+     clusterSecretName := req.GetVolumeContext()["clusterSecretName"]
+     if clusterSecretName != "" {
+         namespace, nsErr := util.GetPodNamespace()
+         if nsErr != nil {
+             return nil, status.Error(codes.InvalidArgument, nsErr.Error())
+         }
+         clusterSecret, secErr := k8s.GetSecret(clusterSecretName, namespace)
+         if secErr != nil {
+             return nil, status.Error(codes.InvalidArgument, secErr.Error())
+         }
+         secrets = clusterSecret
+     }
+ }
+
+ cr, err := util.NewUserCredentialsWithMigration(secrets)

Without this, the multi-cluster topology flow is only partially implemented:

  • controller/provisioning can use per-cluster secrets
  • node-stage still requires a static StorageClass secret



Development

Successfully merging this pull request may close these issues.

Enhancement: Multi-Ceph Cluster Support for Topology-Aware Volume Provisioning with ConfigMap-Based Management
