feat: add __cuda_arch virtual package #1863
Conversation
Force-pushed from 1aadbac to 80d7361
//! NVIDIA drivers. This is detected via:
//!
//! * CUDA driver library (libcuda) - Standard method
//! * nvidia-smi command - Fallback on musl systems where dynamic library loading is not supported
I am not familiar with musl libc. Is this a platform restriction or a Rust restriction?
musl is an alternative libc implementation that doesn't allow dynamic library loading (dlopen) in statically linked binaries.
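Since `dlopen` is unavailable there, the fallback shells out to `nvidia-smi` instead. A minimal sketch of the parsing half of such a fallback is shown below; note that the exact query field (`compute_cap`) and flags (`--query-gpu=compute_cap --format=csv,noheader`) are an assumption here, not something this PR confirms:

```rust
/// Parse the output of a hypothetical
/// `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` invocation,
/// which would print one "major.minor" value per line (e.g. "8.6").
/// Malformed or empty lines are skipped.
fn parse_compute_caps(output: &str) -> Vec<(u32, u32)> {
    output
        .lines()
        .filter_map(|line| {
            let mut parts = line.trim().splitn(2, '.');
            let major: u32 = parts.next()?.parse().ok()?;
            let minor: u32 = parts.next()?.parse().ok()?;
            Some((major, minor))
        })
        .collect()
}
```

Keeping the parsing separate from the process spawn makes this path unit-testable without NVIDIA hardware.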
///
/// This corresponds to the `__cuda_arch` virtual package. Returns `None` if
/// no CUDA devices are detected or if device enumeration fails.
pub arch_info: Option<CudaArchInfo>,
We will need to have a discussion about which default values make sense. I currently use __cuda_arch=0=None as the fallback value, which you should probably match here.
However, I haven't determined what the correct behavior is if the system has no devices. Use the default values or just don't expose the virtual package?
Also, CEP-26 requires that you print a warning if default/fallback values are used.
Makes sense, I'll implement that as well.
Currently, __cuda is not present if there are no CUDA drivers. The same goes for __win, __unix, etc. What would be the advantage of always exposing the virtual package? Knowing that the installer has the capability of exposing it?
I wasn't sure of the existing behavior of __cuda when the driver is not detected. While there is some value in knowing that the installer is capable of detecting a virtual package but didn't find anything, I think it's better to follow existing conventions and hide the virtual package when a device is not detected or detection fails.
I suppose there could be a warning message to notify the user that the installer is capable of __cuda and __cuda_arch but didn't detect anything? Or maybe that would be annoying to users without NVIDIA hardware.
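The behavior converged on in this exchange could be sketched as follows. All names here are hypothetical, not rattler's actual API; the empty-override-disables convention is also an assumption carried over from how other CONDA_OVERRIDE_* variables commonly behave:

```rust
/// Resolve the __cuda_arch version: an explicit CONDA_OVERRIDE_CUDA_ARCH
/// value wins, a detected minimum compute capability comes next, and when
/// nothing is detected the virtual package is simply not exposed (None),
/// matching how __cuda, __win, __unix, etc. behave today.
fn cuda_arch_version(override_value: Option<&str>, detected: Option<(u32, u32)>) -> Option<String> {
    if let Some(v) = override_value {
        let v = v.trim();
        // Assumption: an empty override explicitly disables the package.
        return if v.is_empty() { None } else { Some(v.to_string()) };
    }
    detected.map(|(major, minor)| format!("{major}.{minor}"))
}
```

Per CEP-26, the caller would additionally print a warning whenever an override or fallback value is used instead of a real detection.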
Thanks for taking a look at this @carterbox, I highly appreciate it. I'll try to follow your implementation closely.
Force-pushed from df0852a to f1e9534
…ection

This commit implements the __cuda_arch virtual package based on the nvidia-virtual-packages reference implementation. The package detects the minimum CUDA compute capability across all detected CUDA devices.

Implementation details:
- Unified CUDA detection that queries both driver version and compute capability in a single library load for efficiency
- Uses the CUDA Driver API (cuDeviceGetCount, cuDeviceGetAttribute, cuDeviceGetName) to enumerate devices
- Returns the minimum compute capability to ensure package compatibility across multi-GPU systems
- Proper C string handling with CStr for device names
- Independent failure modes (version can exist without devices)
- Cached detection for performance

Virtual package integration:
- Added CudaArch struct with version and device_name fields
- Implements the EnvOverride trait with CONDA_OVERRIDE_CUDA_ARCH support
- Supports both 'version' and 'version=build' override formats (e.g., '8.6' or '8.6=NVIDIA GeForce RTX 3090')
- Device name used as the build string to identify which GPU determined the minimum compute capability
- Full integration with the VirtualPackage enum and VirtualPackages struct

Rationale: CUDA compute capability determines which instruction sets and features are available on a GPU. This virtual package enables hardware-specific constraints and multi-variant builds optimized for different GPU architectures while ensuring packages work on all GPUs in a system.

Testing:
- All existing tests pass
- Added comprehensive tests for override parsing
- Test both version-only and version=build formats
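The "minimum compute capability across multi-GPU systems" rule from the commit message is a simple lexicographic minimum over (major, minor) pairs, which in Rust falls out of tuple ordering (this is a sketch, not the PR's actual code):

```rust
/// Pick the minimum compute capability across all detected devices,
/// so a package constrained on __cuda_arch works on every GPU present.
/// Tuples compare lexicographically, so (7, 5) < (8, 0) < (8, 6).
fn min_compute_cap(devices: &[(u32, u32)]) -> Option<(u32, u32)> {
    devices.iter().copied().min()
}
```

On a mixed system with an RTX 3090 (8.6) and a T4 (7.5), this reports 7.5, so solves are valid for both GPUs.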
Force-pushed from f1e9534 to 68d6f14
Update character validation and NVIDIA removal to match the stricter approach from conda-incubator/nvidia-virtual-packages PR #5:
- Change `is_valid_build_string_char()` to allow only alphanumeric characters (a-z, A-Z, 0-9), removing support for underscore, period, and plus
- Update `sanitize_device_name()` to remove "NVIDIA" (case-insensitive) anywhere in the string, not just as a prefix
- Add `remove_nvidia_ci()` helper function for removing all occurrences of NVIDIA from a string
- Remove the now-unused `strip_prefix_ci()` function
- Update all related tests to match the new stricter behavior
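A compact sketch of the stricter sanitization described in this commit (later removed entirely once the CEP fixed the build string to "0"). This is not the PR's actual implementation, and it assumes ASCII device names as produced by the CUDA driver:

```rust
/// Drop every case-insensitive occurrence of "NVIDIA", then keep only
/// ASCII alphanumeric characters (a-z, A-Z, 0-9), per the stricter rules.
fn sanitize_device_name(name: &str) -> String {
    let mut s = name.to_string();
    // Remove all case-insensitive occurrences of "NVIDIA".
    // to_ascii_lowercase preserves byte offsets, so the index is valid in `s`.
    while let Some(pos) = s.to_ascii_lowercase().find("nvidia") {
        s.replace_range(pos..pos + "nvidia".len(), "");
    }
    // Keep only alphanumeric characters; spaces, dashes, etc. are dropped.
    s.chars().filter(|c| c.is_ascii_alphanumeric()).collect()
}
```

For example, "NVIDIA GeForce RTX 3090" would sanitize to "GeForceRTX3090", a valid conda build string.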
Ran this on a Jetson Nano Orin:
Update implementation to follow the official CEP specification for the __cuda_arch virtual package, which explicitly rejects including device names to avoid problematic dependency constraints on specific GPU models.

Breaking changes:
- Remove device_name field from CudaArch and CudaArchInfo structs
- Build string is now always "0" per the CEP standard (not the device name)
- Environment override format simplified to just a version (e.g., "8.6")
- Remove all device name sanitization logic and related functions

Removed functions:
- sanitize_device_name()
- remove_nvidia_ci()
- is_valid_build_string_char()

Key improvements:
- Simpler, more maintainable code
- Follows the official CEP specification
- Prevents problematic GPU model-specific constraints
- Reduces detection overhead (no device name queries)

CEP rationale: Device product names were explicitly rejected because they would encourage packages to constrain on specific GPU models rather than compute capability, which is the proper constraint.

https://claude.ai/code/session_01Aeo9zN7n665RzTuY9tXQhD
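After this change, the record emitted for __cuda_arch carries all of its information in the version, with a constant "0" build string. A hedged sketch, using a hypothetical record type rather than rattler's real one:

```rust
/// Hypothetical stand-in for the repodata record of a virtual package.
struct VirtualPackageRecord {
    name: &'static str,
    version: String,
    build: String,
}

/// Per the CEP described above, the build string is always "0";
/// only the version (minimum compute capability) carries information.
fn cuda_arch_record(major: u32, minor: u32) -> VirtualPackageRecord {
    VirtualPackageRecord {
        name: "__cuda_arch",
        version: format!("{major}.{minor}"),
        build: "0".to_string(),
    }
}
```

So a system whose weakest GPU is compute capability 8.6 would expose something like `__cuda_arch=8.6=0`, and packages constrain on the version only.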
Remove verbose sections and streamline documentation to be more direct. Removed 'Why Device Names Are Excluded' and 'Use Cases' sections as they add unnecessary context for future readers. https://claude.ai/code/session_01Aeo9zN7n665RzTuY9tXQhD
Replace ' - ' with ': ' in documentation for clearer formatting. https://claude.ai/code/session_01Aeo9zN7n665RzTuY9tXQhD
This PR implements the `__cuda_arch` virtual package based on the nvidia-virtual-packages reference implementation. The package detects the minimum CUDA compute capability (SM version) across all CUDA devices on the system, enabling hardware-specific package constraints and multi-variant builds.

@carterbox I wanted to get a head start on this to also make it available in rattler-build so we can experiment with this more easily.
AI Disclosure
Written by Claude Sonnet 4.5, edited and reviewed by me.