@kvaps kvaps commented Jan 5, 2026

Summary

Different systems may create LUKS2 with different default header sizes (16 MiB vs 32 MiB) depending on system configuration (cryptsetup defaults, /etc/cryptsetup-default.conf, etc.).

This causes issues during toggle-disk operations when adding a disk to a diskless resource:

  • DRBD expects the backing device size based on peer nodes
  • The local LUKS device has less usable space due to a larger header
  • Result: "Low.dev. smaller than requested DRBD-dev. size" error

Example:

  • Node A (original): LUKS offset 32768 sectors (16 MiB), usable size 1073971328 sectors
  • Node B (toggle-disk): LUKS offset 65536 sectors (32 MiB), usable size 1073938560 sectors
  • DRBD cannot attach the smaller backing device
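
The numbers in the example can be checked with a few lines of shell arithmetic. This is only a sketch: the helper names `sectors_to_mib` and `shortfall_sectors` are made up for illustration, and offsets are assumed to be in 512-byte sectors, as in the example.

```shell
# Convert a LUKS data offset given in 512-byte sectors to MiB.
sectors_to_mib() {
  echo $(( $1 * 512 / 1024 / 1024 ))
}

# Difference in usable size (in sectors) between two backing devices.
shortfall_sectors() {
  echo $(( $1 - $2 ))
}

sectors_to_mib 32768                      # Node A offset   -> 16
sectors_to_mib 65536                      # Node B offset   -> 32
shortfall_sectors 1073971328 1073938560   # Node A - Node B -> 32768
```

The 32768-sector shortfall on Node B is exactly the extra 16 MiB of header, which is why DRBD refuses the smaller backing device.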

Solution

Always specify --offset 32768 (16 MiB) when creating LUKS devices to ensure consistent header size across all nodes. This is the minimum standard offset for LUKS2.

Also exclude --offset from additionalOptions to prevent users from accidentally breaking consistency.
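
As a rough illustration of the idea (not the actual change in CryptSetupCommands.java, which is Java), the following shell sketch drops any user-supplied offset from the extra options and pins `--offset 32768`. The function names `filter_offset` and `build_luks_format_cmd` are hypothetical, and the command is only printed, never executed.

```shell
# Print every argument except a user-supplied offset
# ("--offset N", "--offset=N" or "-o N"), one per line.
filter_offset() {
  local skip_next=0 arg
  for arg in "$@"; do
    if [ "$skip_next" -eq 1 ]; then
      skip_next=0
      continue
    fi
    case "$arg" in
      --offset|-o) skip_next=1 ;;   # the value follows as the next argument
      --offset=*) ;;                # the value is attached, drop the whole word
      *) printf '%s\n' "$arg" ;;
    esac
  done
}

# Build (but do not run) a luksFormat command with a fixed 16 MiB data offset.
build_luks_format_cmd() {
  local device=$1 extra
  shift
  extra=$(filter_offset "$@" | tr '\n' ' ')
  echo "cryptsetup luksFormat --type luks2 --offset 32768 ${extra}${device}"
}

build_luks_format_cmd /dev/zvol/data/test --sector-size 4096 --offset 65536
# prints: cryptsetup luksFormat --type luks2 --offset 32768 --sector-size 4096 /dev/zvol/data/test
```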

Different systems may create LUKS2 with different default header sizes
(16 MiB vs 32 MiB) depending on system configuration. This causes
"Low.dev. smaller than requested DRBD-dev. size" errors when performing
toggle-disk operations, as the DRBD device expects a certain size from
peer nodes but the local LUKS device has less usable space due to a
larger header.

Fix by always specifying --offset 32768 (16 MiB) when creating LUKS
devices to ensure consistent header size across all nodes.

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Andrei Kvapil <[email protected]>
@kvaps kvaps marked this pull request as ready for review January 6, 2026 09:13
kvaps added a commit to cozystack/cozystack that referenced this pull request Jan 6, 2026
## What this PR does

Build piraeus-server (linstor-server) from source with custom patches:

- **adjust-on-resfile-change.diff** — Use actual device path in res file
during toggle-disk; fix LUKS data offset
  - Upstream: [#473](LINBIT/linstor-server#473), [#472](LINBIT/linstor-server#472)
- **allow-toggle-disk-retry.diff** — Allow retry and cancellation of
failed toggle-disk operations
  - Upstream: [#475](LINBIT/linstor-server#475)
- **force-metadata-check-on-disk-add.diff** — Create metadata during
toggle-disk from diskless to diskful
  - Upstream: [#474](LINBIT/linstor-server#474)
- **skip-adjust-when-device-inaccessible.diff** — Skip DRBD adjust/res
file regeneration when child layer device is inaccessible
  - Upstream: [#471](LINBIT/linstor-server#471)

Also updates plunger-satellite script and values.yaml for the new build.

### Release note

```release-note
[linstor] Build linstor-server with custom patches for improved disk handling
```

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
  * Added automatic DRBD stall detection and recovery, improving storage
    resync resilience without manual intervention.
  * Introduced configurable container image references via Helm values for
    streamlined deployment.

* **Bug Fixes**
  * Enhanced disk toggle operations with retry and cancellation support
    for better error handling.
  * Improved metadata creation during disk state transitions.
  * Added device accessibility checks to prevent errors when underlying
    storage devices are unavailable.
  * Fixed LUKS encryption header sizing for consistent deployment across
    nodes.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@rp-
Contributor

rp- commented Jan 7, 2026

LGTM, but we can't merge yet: we have to remove RHEL 7 support first, as the old cryptsetup doesn't work with this.

For the other PRs, I'll have to review/check them together with Gabor, who will be back next week.

kvaps added a commit to cozystack/cozystack that referenced this pull request Jan 8, 2026
kvaps added a commit to cozystack/cozystack that referenced this pull request Jan 8, 2026
kvaps added a commit to cozystack/cozystack that referenced this pull request Jan 9, 2026
@ghernadi
Contributor

I don't want to sound mean, but is /etc/cryptsetup-default.conf a hallucination? I could not find anything for that, so if you have some resources for me in that regard, I'd appreciate it.

If not using /etc/cryptsetup-default.conf or anything else that is external to LINSTOR I have to assume that you are using some LINSTOR property like StorDriver/LuksFormatOptions '--luks2-metadata-size 32M' (or --offset ...). That would also fit your proposed solution by simply ignoring such user-defined LUKS2 options in the CryptSetupCommands.java. However, as pointed out earlier --offset is just one way to screw up the size calculation, another would be --luks2-metadata-size, so you would need to ignore both.

If we already assume that all of the LUKS-related options must be set in LINSTOR, a different approach (which honestly I would prefer) is to not ignore those properties but simply take them into account. For example, we have the LuksLayerSizeCalculator, which currently adds a hardcoded 2 MiB of space to the requested user-space for LUKS1-based setups and 16 MiB for LUKS2-based setups. This hardcoded calculation is of course wrong if the user adds parameters like --offset or --luks2-metadata-size (or others that I am currently not aware of). LuksLayerSizeCalculator already has a method called getLuksHeaderSize, which should make it fairly easy to parse the properties, scan for --offset and other options, and return the correct size.
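
A minimal shell sketch of that idea, for illustration only: it is not LINSTOR code, and `luks_header_bytes` is a hypothetical name. It assumes the defaults quoted above (2 MiB for LUKS1, 16 MiB for LUKS2) and that `--offset` is given in 512-byte sectors. It deliberately refuses `--luks2-metadata-size` instead of guessing, because the resulting data offset depends on cryptsetup's layout rules, which are not modelled here.

```shell
# Print the header size in bytes for a LUKS version and a list of format options.
# Usage: luks_header_bytes luks1|luks2 [options...]
luks_header_bytes() {
  local version=$1 prev="" arg
  shift
  local default=$(( 16 * 1024 * 1024 ))     # LUKS2 default quoted above
  if [ "$version" = "luks1" ]; then
    default=$(( 2 * 1024 * 1024 ))          # LUKS1 default quoted above
  fi
  for arg in "$@"; do
    if [ "$prev" = "--offset" ]; then
      echo $(( $arg * 512 ))                # "--offset N" form
      return 0
    fi
    case "$arg" in
      --offset=*)
        echo $(( ${arg#--offset=} * 512 ))  # "--offset=N" form
        return 0 ;;
      --luks2-metadata-size|--luks2-metadata-size=*)
        echo "cannot derive header size from $arg" >&2
        return 1 ;;
    esac
    prev=$arg
  done
  echo "$default"
}

luks_header_bytes luks2 --offset 65536   # prints 33554432 (32 MiB)
```

Failing loudly on options the calculation does not understand keeps the size computation from silently drifting away from what cryptsetup actually writes.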

@kvaps
Author

kvaps commented Jan 21, 2026

@ghernadi you're right, /etc/cryptsetup-default.conf doesn't exist — I apologize for the confusion.

I've investigated the root cause and found it: ZFS zvols report optimal_io_size = 32 MiB, and cryptsetup uses this topology info for automatic data offset alignment.

Here's the evidence from my production cluster:

  1. ZFS zvol topology parameters:
     # cat /sys/block/zd720/queue/optimal_io_size
     33554432
     # cat /sys/block/zd720/queue/minimum_io_size
     16384
     # cat /sys/block/zd720/queue/physical_block_size
     16384
  2. Cryptsetup debug output when formatting without --offset:
     # echo -n "pass" | cryptsetup -v --debug luksFormat --type luks2 /dev/zvol/data/test - 2>&1 | grep -i "topology\|offset"
     # Topology: IO (16384/33554432), offset = 0; Required alignment is 33554432 bytes.
     # Device size 104857600, offset 33554432.
  3. Resulting LUKS header:
     # cryptsetup luksDump /dev/zvol/data/test | grep offset
           offset: 33554432 [bytes]    # = 32 MiB
  4. Same cryptsetup version on loop device (no optimal_io_size):
     # cat /sys/block/loop0/queue/optimal_io_size
     0
     # cryptsetup luksDump /dev/loop0 | grep offset
           offset: 16777216 [bytes]    # = 16 MiB (default)
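
To compare nodes without reading the dump by eye, the data offset can be pulled out of the `luksDump` output with a small filter. This is a sketch: `luks_data_offset` and `offset_matches` are hypothetical helper names, and it assumes the data segment line has the form `offset: <bytes> [bytes]`, as in the output above.

```shell
# Print the data segment offset (bytes) from `cryptsetup luksDump` output on stdin.
# Only lines whose first field is exactly "offset:" are considered.
luks_data_offset() {
  awk '$1 == "offset:" { print $2; exit }'
}

# Succeed only if the offset read from stdin equals the expected byte count.
offset_matches() {
  local expected=$1 actual
  actual=$(luks_data_offset)
  [ "$actual" = "$expected" ]
}

# Sample line copied from the luksDump output above.
printf '      offset: 33554432 [bytes]\n' | luks_data_offset   # prints 33554432
```

On a real device this would be used as `cryptsetup luksDump /dev/zvol/data/test | offset_matches 16777216`.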

So the issue is: when creating LUKS on ZFS zvols without explicit --offset, cryptsetup picks up the 32 MiB optimal_io_size from block device topology. On other storage backends (LVM, loop devices, etc.) where optimal_io_size is 0 or smaller, cryptsetup uses the default 16 MiB.

This causes inconsistent offsets when nodes run different kernel/zfs versions (with different topology handling behavior), or when new replicas are added after system updates that change the optimal_io_size reporting.
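
Both observations above are consistent with a simple model in which the 16 MiB default is rounded up to a multiple of the reported optimal_io_size. The sketch below encodes only that simplified model. It is an assumption for illustration, not cryptsetup's actual alignment algorithm, and the function names are hypothetical.

```shell
# Read optimal_io_size (bytes) for a block device name such as zd720 or loop0.
read_optimal_io_size() {
  cat "/sys/block/$1/queue/optimal_io_size"
}

# Simplified model (assumption): round the 16 MiB default data offset
# up to the next multiple of optimal_io_size; 0 means "no hint".
predict_luks2_offset() {
  local opt_io=$1
  local default=$(( 16 * 1024 * 1024 ))
  if [ "$opt_io" -le 0 ]; then
    echo "$default"
    return 0
  fi
  echo $(( (default + opt_io - 1) / opt_io * opt_io ))
}

predict_luks2_offset 33554432   # zvol case above     -> 33554432
predict_luks2_offset 0          # loop device case    -> 16777216
```

On a live system the two helpers would be combined as `predict_luks2_offset "$(read_optimal_io_size zd720)"` to flag devices likely to end up with a non-default offset.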
