
Replace std::clamp with device-compatible clamp_scalar in clamp backend #98

Open
rahmans1 wants to merge 1 commit into acts-project:main from BNLNPPS:fix/clamp-device-code

Conversation


@rahmans1 rahmans1 commented Apr 9, 2026

Replace std::clamp with device-compatible clamp_scalar in clamp backend

Problem

std::clamp is a constexpr __host__-only function. Calling it from a COVFIE_HOST_DEVICE (__host__ __device__) context causes nvcc to emit warning #20013-D and produces silently incorrect results in device code (all outputs become zero) unless --expt-relaxed-constexpr is passed.

Minimal Demo

Verified on:
GPU: NVIDIA RTX 1000 Ada Generation
Driver: 581.42
CUDA: 13.0

// Minimal demo: why std::clamp fails in covfie device code,
// and why a ternary-based replacement works.
//
// Works (ternary):
//   nvcc -std=c++17 -o clamp_demo clamp_demo.cu && ./clamp_demo
//
// Reproduce the original covfie error (COVFIE_HOST_DEVICE + std::clamp):
//   nvcc -std=c++17 -DUSE_COVFIE_ORIGINAL -o clamp_demo clamp_demo.cu

#include <algorithm>
#include <cstdio>

#ifdef USE_COVFIE_ORIGINAL
// This mirrors what covfie's adjust() did originally:
// COVFIE_HOST_DEVICE (__host__ __device__) annotated, but calls std::clamp
// which is a __host__-only constexpr function.
// Without --expt-relaxed-constexpr, nvcc emits warning #20013-D for the
// std::clamp call, and the resulting device code silently returns zeros.
template<typename S>
__host__ __device__ S clamp_fn(S v, S lo, S hi)
{
    return std::clamp(v, lo, hi);
}
#else
// The fix: ternary uses only native GPU compare/select, no stdlib call.
// Compiles and runs correctly with no special nvcc flags.
template<typename S>
__host__ __device__ S clamp_fn(S v, S lo, S hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}
#endif

__global__ void clamp_kernel(float* out)
{
    out[0] = clamp_fn(1.5f,  0.0f, 1.0f);  // above max → 1.0
    out[1] = clamp_fn(-0.5f, 0.0f, 1.0f);  // below min → 0.0
    out[2] = clamp_fn(0.5f,  0.0f, 1.0f);  // in range  → 0.5
}

int main()
{
    float* d;
    cudaMalloc(&d, 3 * sizeof(float));

    clamp_kernel<<<1, 1>>>(d);
    cudaDeviceSynchronize();

    float h[3];
    cudaMemcpy(h, d, 3 * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d);

    printf("clamp(1.5, 0, 1)  = %.1f  (expected 1.0)\n", h[0]);
    printf("clamp(-0.5, 0, 1) = %.1f  (expected 0.0)\n", h[1]);
    printf("clamp(0.5, 0, 1)  = %.1f  (expected 0.5)\n", h[2]);

    printf("\nHost check (via std::clamp):\n");
    printf("clamp(1.5, 0, 1)  = %.1f\n", std::clamp(1.5f,  0.0f, 1.0f));
    printf("clamp(-0.5, 0, 1) = %.1f\n", std::clamp(-0.5f, 0.0f, 1.0f));
    printf("clamp(0.5, 0, 1)  = %.1f\n", std::clamp(0.5f,  0.0f, 1.0f));

    return 0;
}
nvcc -std=c++17 -DUSE_COVFIE_ORIGINAL -o clamp_demo clamp_demo.cu
clamp_demo.cu(22): warning #20013-D: calling a constexpr __host__ function("clamp") from a __host__ __device__ function("clamp_fn") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
      return std::clamp(v, lo, hi);
             ^
          detected during instantiation of "S clamp_fn(S, S, S) [with S=float]" at line 36

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

clamp_demo.cu(22): warning #20013-D: calling a constexpr __host__ function("const T1 &  ::std::clamp<float> (const T1 &, const T1 &, const T1 &)") from a __host__ __device__ function("clamp_fn<float> ") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

$ ./clamp_demo
clamp(1.5, 0, 1)  = 0.0  (expected 1.0)
clamp(-0.5, 0, 1) = 0.0  (expected 0.0)
clamp(0.5, 0, 1)  = 0.0  (expected 0.5)

Host check (via std::clamp):
clamp(1.5, 0, 1)  = 1.0
clamp(-0.5, 0, 1) = 0.0
clamp(0.5, 0, 1)  = 0.5

Fix

Introduce a COVFIE_HOST_DEVICE static helper clamp_scalar() using the ternary pattern inside non_owning_data_t::adjust(), and update the pack-expansion call accordingly. Remove the now-unused #include <algorithm>.

nvcc -std=c++17 -o clamp_demo clamp_demo.cu && ./clamp_demo
clamp(1.5, 0, 1)  = 1.0  (expected 1.0)
clamp(-0.5, 0, 1) = 0.0  (expected 0.0)
clamp(0.5, 0, 1)  = 0.5  (expected 0.5)

Host check (via std::clamp):
clamp(1.5, 0, 1)  = 1.0
clamp(-0.5, 0, 1) = 0.0
clamp(0.5, 0, 1)  = 0.5

…backend

std::clamp is a constexpr __host__-only function. Calling it from a
COVFIE_HOST_DEVICE (__host__ __device__) context causes nvcc to emit
warning #20013-D and produces silently incorrect results in device code
(all outputs become zero) unless --expt-relaxed-constexpr is passed.

Verified on:
  GPU:    NVIDIA RTX 1000 Ada Generation
  Driver: 581.42
  CUDA:   13.0

Broken (mirrors original adjust() behaviour):

  template<typename S>
  __host__ __device__ S clamp_fn(S v, S lo, S hi)
  {
      return std::clamp(v, lo, hi);  // warning #20013-D; wrong at runtime
  }

  clamp(1.5, 0, 1)  = 0.0  (expected 1.0)  <- wrong
  clamp(-0.5, 0, 1) = 0.0  (expected 0.0)
  clamp(0.5, 0, 1)  = 0.0  (expected 0.5)  <- wrong

Fixed (ternary lowers to native GPU compare/select, no stdlib call):

  template<typename S>
  __host__ __device__ S clamp_fn(S v, S lo, S hi)
  {
      return v < lo ? lo : (v > hi ? hi : v);
  }

  clamp(1.5, 0, 1)  = 1.0  (expected 1.0)  ✓
  clamp(-0.5, 0, 1) = 0.0  (expected 0.0)  ✓
  clamp(0.5, 0, 1)  = 0.5  (expected 0.5)  ✓

Introduce a COVFIE_HOST_DEVICE static helper clamp_scalar() using the
ternary pattern inside non_owning_data_t::adjust(), and update the
pack-expansion call accordingly. Remove the now-unused #include <algorithm>.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

rahmans1 commented Apr 9, 2026

@stephenswat Would you mind giving this a review?
