feat(clone): add cloneFullCopy parameter for snapshot-based volume cloning #416

Draft
kvaps wants to merge 1 commit into piraeusdatastore:master from kvaps:feat/clone-full-copy

kvaps (Member) commented Feb 27, 2026

Summary

  • Add a new StorageClass parameter linstor.csi.linbit.com/cloneFullCopy
    that changes Clone() to use snapshot + restore instead of LINSTOR rd clone
  • When enabled, clones are created as fully independent copies and, where
    possible, placed on different nodes from the source, avoiding thin pool
    exhaustion on source nodes
  • Migration is best-effort: if no alternative nodes exist, the clone stays
    co-located but remains a full independent copy

Motivation

LINSTOR's built-in rd clone creates thin COW snapshots pinned to the same
nodes as the source volume. This works well for occasional clones, but becomes
problematic when cloning golden images (e.g. VM disk templates) at scale: all
clones accumulate on the source nodes, eventually exhausting the thin pool.

With cloneFullCopy: "true", the CSI driver instead:

  1. Creates a temporary LINSTOR snapshot from the source
  2. Restores it via VolFromSnap (which uses autoplace for new placement)
  3. Migrates clone data away from source nodes (Autoplace with NotPlaceWithRsc)
  4. Waits for DRBD sync to complete on new nodes
  5. Removes the source-node copies

This produces fully independent volumes distributed across the cluster.

Usage

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-distributed
provisioner: linstor.csi.linbit.com
parameters:
  linstor.csi.linbit.com/cloneFullCopy: "true"
  linstor.csi.linbit.com/storagePool: "pool"
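
To trigger a clone through this StorageClass, a PVC can reference an existing PVC as its data source using the standard Kubernetes CSI volume cloning API. The following is a minimal sketch, not part of this PR: the PVC names `golden-image` and `vm-disk-clone` and the 10Gi size are hypothetical, and only the StorageClass name comes from the example above.

```yaml
# Sketch of a PVC that clones an existing PVC via the linstor-distributed class.
# All names and the size below are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vm-disk-clone            # hypothetical name for the new clone
spec:
  storageClassName: linstor-distributed
  dataSource:
    kind: PersistentVolumeClaim
    name: golden-image           # hypothetical source PVC in the same namespace
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi              # assumed size; must be at least the source volume's size
```

With cloneFullCopy set to "true" on the class, a PVC like this would be expected to go through the snapshot + restore path described in Motivation rather than rd clone.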

Test plan

  • Clone a volume with cloneFullCopy: "true" — verify clone lands on different nodes
  • Clone without the parameter — verify original rd clone behavior is unchanged
  • Clone with cloneFullCopy: "true" on a fully saturated cluster (replicas = node count)
    — verify graceful fallback (clone co-located but functional)
  • Restart CSI driver mid-clone — verify leftover snapshot cleanup on retry
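
The baseline and saturated-cluster cases could be set up with two extra StorageClasses along these lines. This is only a sketch: the class names are hypothetical, and it assumes the replica count is set through the driver's `placementCount` parameter, with "3" matching a 3-node test cluster.

```yaml
# Baseline class: cloneFullCopy is not set, so clones should keep using rd clone.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-baseline         # hypothetical name
provisioner: linstor.csi.linbit.com
parameters:
  linstor.csi.linbit.com/storagePool: "pool"
---
# Saturated case: as many replicas as nodes, so no alternative nodes exist
# and the clone is expected to stay co-located with the source.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: linstor-saturated        # hypothetical name
provisioner: linstor.csi.linbit.com
parameters:
  linstor.csi.linbit.com/cloneFullCopy: "true"
  linstor.csi.linbit.com/storagePool: "pool"
  linstor.csi.linbit.com/placementCount: "3"   # assumption: equals the node count of the test cluster
```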

…sed cloning

Add a new StorageClass parameter `cloneFullCopy` that changes the Clone()
behavior to use snapshot + restore instead of LINSTOR's built-in rd clone.

When enabled, CSI Clone creates a temporary LINSTOR snapshot from the source
volume, restores it via VolFromSnap (which uses autoplace for placement),
and then migrates the data away from source nodes using Autoplace with
NotPlaceWithRsc. This produces a fully independent copy on different nodes
rather than a thin COW snapshot pinned to the source nodes.

This is useful when:
- Cloning golden images (e.g. VM disk templates) where 100+ clones from
  the same source would exhaust the thin pool on source nodes
- Users want clone data distributed across the cluster rather than
  co-located with the source

The migration is best-effort: if no alternative nodes are available
(e.g. replicas=3 on a 3-node cluster), the clone remains co-located
with the source but is still a full independent copy.

Usage:
  parameters:
    linstor.csi.linbit.com/cloneFullCopy: "true"

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>