Replace std::clamp with device-compatible clamp_scalar in clamp backend #98
Open
rahmans1 wants to merge 1 commit into acts-project:main from
…backend
std::clamp is a constexpr __host__-only function. Calling it from a
COVFIE_HOST_DEVICE (__host__ __device__) context causes nvcc to emit
warning #20013-D and produces silently incorrect results in device code
(all outputs become zero) unless --expt-relaxed-constexpr is passed.
Verified on:
GPU: NVIDIA RTX 1000 Ada Generation
Driver: 581.42
CUDA: 13.0
Broken (mirrors original adjust() behaviour):
template<typename S>
__host__ __device__ S clamp_fn(S v, S lo, S hi)
{
    return std::clamp(v, lo, hi); // warning #20013-D; wrong at runtime
}
clamp(1.5, 0, 1) = 0.0 (expected 1.0) <- wrong
clamp(-0.5, 0, 1) = 0.0 (expected 0.0)
clamp(0.5, 0, 1) = 0.0 (expected 0.5) <- wrong
Fixed (ternary lowers to native GPU compare/select, no stdlib call):
template<typename S>
__host__ __device__ S clamp_fn(S v, S lo, S hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}
clamp(1.5, 0, 1) = 1.0 (expected 1.0) ✓
clamp(-0.5, 0, 1) = 0.0 (expected 0.0) ✓
clamp(0.5, 0, 1) = 0.5 (expected 0.5) ✓
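On the host, the ternary form is expected to agree with std::clamp for the same inputs, which can be checked with a small host-only sketch. The names ternary_clamp and agrees_with_std_clamp are hypothetical and not part of covfie, and the check says nothing about device behaviour.

```cpp
#include <algorithm>

// Hypothetical name; same ternary as the fixed clamp_fn above.
template <typename S>
constexpr S ternary_clamp(S v, S lo, S hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

// Host-only comparison against std::clamp (requires C++17).
template <typename S>
bool agrees_with_std_clamp(S v, S lo, S hi)
{
    return ternary_clamp(v, lo, hi) == std::clamp(v, lo, hi);
}
```

Calling agrees_with_std_clamp(1.5, 0.0, 1.0), agrees_with_std_clamp(-0.5, 0.0, 1.0) and agrees_with_std_clamp(0.5, 0.0, 1.0) covers the three cases listed above.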
Introduce a COVFIE_HOST_DEVICE static helper clamp_scalar() using the
ternary pattern inside non_owning_data_t::adjust(), and update the
pack-expansion call accordingly. Remove the now-unused #include <algorithm>.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
@stephenswat Would you mind giving this a review?
Replace std::clamp with device-compatible clamp_scalar in clamp backend
Problem
std::clamp is a constexpr __host__-only function. Calling it from a COVFIE_HOST_DEVICE (__host__ __device__) context causes nvcc to emit warning #20013-D and produces silently incorrect results in device code (all outputs become zero) unless --expt-relaxed-constexpr is passed.
Minimal Demo
Verified on:
GPU: NVIDIA RTX 1000 Ada Generation
Driver: 581.42
CUDA: 13.0
Fix
Introduce a COVFIE_HOST_DEVICE static helper clamp_scalar() using the ternary pattern inside non_owning_data_t::adjust(), and update the pack-expansion call accordingly. Remove the now-unused #include <algorithm>.
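A minimal sketch of how a scalar helper and a pack-expansion call of this kind might fit together is given here. It is not covfie's actual adjust(): SKETCH_HOST_DEVICE is an assumed stand-in for COVFIE_HOST_DEVICE, and coord, clamp_axes_impl and clamp_axes are hypothetical names chosen for illustration. A plain array member is used instead of std::array so that no standard-library function is called from the annotated code.

```cpp
#include <cstddef>
#include <utility>

// Assumed stand-in for COVFIE_HOST_DEVICE; expands to nothing without nvcc.
#if defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

// Clamp one scalar with the ternary pattern, avoiding any stdlib call.
template <typename S>
SKETCH_HOST_DEVICE constexpr S clamp_scalar(S v, S lo, S hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

// Hypothetical fixed-size coordinate type backed by a raw array.
template <typename S, std::size_t N>
struct coord {
    S v[N];
};

// Expand the index pack so each axis is clamped independently.
template <typename S, std::size_t N, std::size_t... Is>
SKETCH_HOST_DEVICE constexpr coord<S, N> clamp_axes_impl(
    const coord<S, N>& c, const coord<S, N>& lo, const coord<S, N>& hi,
    std::index_sequence<Is...>)
{
    return {{clamp_scalar(c.v[Is], lo.v[Is], hi.v[Is])...}};
}

// Entry point: generates the indices 0..N-1 and forwards to the expansion.
template <typename S, std::size_t N>
SKETCH_HOST_DEVICE constexpr coord<S, N> clamp_axes(
    const coord<S, N>& c, const coord<S, N>& lo, const coord<S, N>& hi)
{
    return clamp_axes_impl(c, lo, hi, std::make_index_sequence<N>{});
}
```

With a three-axis coord holding 1.5, -0.5 and 0.5 and bounds of 0 and 1 on every axis, clamp_axes applies clamp_scalar once per axis, matching the scalar cases in the demo.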