
Fabric manager Shared NVSwitch virt model support #166

Draft
mresvanis wants to merge 7 commits into NVIDIA:master from mresvanis:fabric-manager-support

Conversation

@mresvanis

Summary

This PR adds the following changes:

  • Add NVIDIA Fabric Manager integration for multi-GPU NVSwitch-based systems (e.g., DGX/HGX), enabling automatic fabric partition management during device allocation
  • Introduce CGO bindings for libnvfm and a partition manager that coordinates GPU grouping via NVLink fabric partitions
  • Refactor GetPreferredAllocation to prefer devices belonging to the same fabric partition when FM is enabled, falling back to NUMA-based selection otherwise

Related NVIDIA GPU Operator changes: NVIDIA/gpu-driver-container#538 and NVIDIA/gpu-operator#2045

Changes

This change adds optional Fabric Manager support behind the ENABLE_FABRIC_MANAGER environment variable (disabled by default). When enabled, the device plugin:

  1. Connects to the FM daemon over a Unix socket at startup
  2. Uses a PCI-to-module mapping to resolve GPU physical IDs to FM module IDs
  3. Selects preferred allocations that align with FM partition boundaries and NUMA locality
  4. Activates the appropriate fabric partition during Allocate, ensuring NVLink connectivity between allocated GPUs
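Step 2 above resolves each GPU's PCI address to a Fabric Manager module ID before any partition decision is made. A minimal sketch of that lookup, assuming a simple in-memory map keyed by lowercase PCI BDF (the PR's actual mapping file format and types are not shown here):

```go
package main

import (
	"fmt"
	"strings"
)

// pciModuleMapping is an illustrative stand-in for the structure loaded by
// LoadPCIModuleMapping: PCI bus/device/function -> FM module ID.
type pciModuleMapping map[string]uint32

// resolveModuleID normalizes the PCI address before lookup so that
// "0000:3B:00.0" and "0000:3b:00.0" resolve to the same module.
func resolveModuleID(m pciModuleMapping, pciAddr string) (uint32, bool) {
	id, ok := m[strings.ToLower(pciAddr)]
	return id, ok
}

func main() {
	mapping := pciModuleMapping{
		"0000:3b:00.0": 0,
		"0000:5e:00.0": 1,
	}
	if id, ok := resolveModuleID(mapping, "0000:3B:00.0"); ok {
		fmt.Printf("module ID: %d\n", id)
	}
}
```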

New packages:

  • pkg/nvfm -- CGO bindings for the libnvfm shared library
  • pkg/fabricmanager -- High-level FM client, partition manager, and PCI module mapping utilities
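One way to picture the layering of the two packages: pkg/nvfm wraps the raw libnvfm calls, while pkg/fabricmanager exposes a small client interface that the partition manager consumes, which also makes the manager testable with a fake. All names below (FMClient, Partition, ActivatePartition) are illustrative, not the PR's actual API:

```go
package main

import "fmt"

// Partition is a hypothetical view of an NVLink fabric partition: an ID plus
// the FM module IDs of the GPUs it connects.
type Partition struct {
	ID   uint32
	GPUs []uint32
}

// FMClient is the narrow surface the partition manager would depend on,
// backed in production by the CGO bindings in pkg/nvfm.
type FMClient interface {
	ListPartitions() ([]Partition, error)
	ActivatePartition(id uint32) error
}

// fakeClient records activations so unit tests can assert on them without
// a real Fabric Manager daemon.
type fakeClient struct{ activated []uint32 }

func (f *fakeClient) ListPartitions() ([]Partition, error) {
	return []Partition{{ID: 7, GPUs: []uint32{0, 1}}}, nil
}

func (f *fakeClient) ActivatePartition(id uint32) error {
	f.activated = append(f.activated, id)
	return nil
}

func main() {
	c := &fakeClient{}
	parts, _ := c.ListPartitions()
	_ = c.ActivatePartition(parts[0].ID)
	fmt.Println("activated partitions:", c.activated)
}
```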

Test plan

  • Unit tests added for pkg/fabricmanager (client, partition manager) and pkg/device_plugin
  • Verify device plugin starts and operates normally with ENABLE_FABRIC_MANAGER=false (default)
  • Verify FM partition activation on an NVSwitch node with ENABLE_FABRIC_MANAGER=true

Signed-off-by: Michail Resvanis <mresvani@redhat.com>
…hen FM enabled

Extract NUMA-based device selection into a standalone preferDevicesByNUMA
method. When a partition manager is active, GetPreferredAllocation now
delegates to it for FM-aware selection with NUMA locality; otherwise it
falls back to the original NUMA-only logic. Add comprehensive tests for
the FM-aware path covering partition matching, NUMA tie-breaking, error
cases, unavailable GPUs, and must-include device ordering.

Signed-off-by: Michail Resvanis <mresvani@redhat.com>
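The selection policy this commit describes can be sketched as follows: try to satisfy the request from a single fabric partition, and fall back to NUMA-only grouping when no partition fits. The device and partition shapes here are assumptions for illustration, not the PR's types:

```go
package main

import "fmt"

// device is a stand-in for the plugin's device record.
type device struct {
	id        string
	numaNode  int
	partition int // -1 when the device is not in any FM partition
}

// preferDevices picks `size` devices, preferring a set that all belong to
// one fabric partition and falling back to NUMA grouping (the legacy path).
func preferDevices(avail []device, size int) []string {
	byPart := map[int][]string{}
	for _, d := range avail {
		if d.partition >= 0 {
			byPart[d.partition] = append(byPart[d.partition], d.id)
		}
	}
	for _, ids := range byPart {
		if len(ids) >= size {
			return ids[:size]
		}
	}
	// Fallback: NUMA-only selection, mirroring preferDevicesByNUMA.
	byNUMA := map[int][]string{}
	for _, d := range avail {
		byNUMA[d.numaNode] = append(byNUMA[d.numaNode], d.id)
	}
	for _, ids := range byNUMA {
		if len(ids) >= size {
			return ids[:size]
		}
	}
	return nil
}

func main() {
	avail := []device{{"gpu0", 0, 1}, {"gpu1", 1, 1}, {"gpu2", 0, -1}}
	fmt.Println(preferDevices(avail, 2))
}
```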
When the fabric manager is enabled, the Allocate handler now
activates partitions for the requested device IDs before
returning the allocation response, failing the request if the
connection is lost or activation errors out.

Signed-off-by: Michail Resvanis <mresvani@redhat.com>
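The fail-the-request behavior described in this commit can be sketched as a guard at the top of the Allocate handler; activation failure surfaces as an error rather than returning a response with broken NVLink connectivity. The activator interface and error strings are illustrative:

```go
package main

import (
	"errors"
	"fmt"
)

// activator is a stand-in for the partition manager's activation surface.
type activator interface {
	ActivateForDevices(ids []string) error
}

type stubActivator struct{ fail bool }

func (s stubActivator) ActivateForDevices(ids []string) error {
	if s.fail {
		return errors.New("fabric manager connection lost")
	}
	return nil
}

// allocate activates the fabric partition before building the response;
// a nil activator models the ENABLE_FABRIC_MANAGER=false path.
func allocate(a activator, deviceIDs []string) error {
	if a != nil {
		if err := a.ActivateForDevices(deviceIDs); err != nil {
			return fmt.Errorf("fabric partition activation failed: %w", err)
		}
	}
	// ... build and return the AllocateResponse here ...
	return nil
}

func main() {
	err := allocate(stubActivator{fail: true}, []string{"gpu0", "gpu1"})
	fmt.Println("allocate error:", err)
}
```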
@copy-pr-bot

copy-pr-bot bot commented Feb 19, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@alaypatel07

@mresvanis I am interested in reviewing this PR, once it is ready please ping me.


log.Printf("Fabric partition activated successfully for devices: %v", allDeviceIDs)
}


In Allocate, all device IDs from all ContainerRequests are aggregated and passed in one call to ActivateForDevices, and ActivateForDevices requires an exact partition-size match against that full list. FM activation is therefore done once on the union of all container requests, which can reject valid multi-container pod allocations. I am wondering whether union-based activation is over-constraining here: if a pod has multiple GPU-consuming containers (or uses request splitting), allocation can fail even though each container's assignment is valid individually.
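A toy illustration of this concern, with made-up valid partition sizes: union-based activation needs one partition matching the full union, while per-container activation only needs each container's request to match a partition:

```go
package main

import "fmt"

// Hypothetical set of valid fabric partition sizes on some topology.
var partitionSizes = map[int]bool{2: true, 4: true, 8: true}

// unionActivationOK models the PR's current behavior: one activation call
// over the union of all container requests.
func unionActivationOK(containerReqs [][]string) bool {
	union := 0
	for _, r := range containerReqs {
		union += len(r)
	}
	return partitionSizes[union]
}

// perContainerActivationOK models the alternative: activate per container.
func perContainerActivationOK(containerReqs [][]string) bool {
	for _, r := range containerReqs {
		if !partitionSizes[len(r)] {
			return false
		}
	}
	return true
}

func main() {
	// One 2-GPU container plus one 4-GPU container: union size is 6, which
	// matches no partition, yet each container matches one individually.
	reqs := [][]string{{"g0", "g1"}, {"g2", "g3", "g4", "g5"}}
	fmt.Println("union ok:", unionActivationOK(reqs))
	fmt.Println("per-container ok:", perContainerActivationOK(reqs))
}
```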

pciToModule, moduleToPCI, mapErr := fabricmanager.LoadPCIModuleMapping(pciModuleMappingPath)
if mapErr != nil {
log.Printf("WARNING: Failed to load PCI module mapping: %v", mapErr)
log.Print("Falling back to legacy device plugin mode")

This looks like an FM connection lifecycle leak on the startup fallback path: if the connect succeeds but the mapping load fails, the connection is never closed. On this mapping-failure path, fmClient is already connected but is never disconnected. Below, dpi.partitionManager is only assigned in the mapping-success branch, and Stop() only disconnects via dpi.partitionManager, so when the connect succeeds and the mapping fails, the client handle/socket stays open with no retained reference to close it.
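One possible shape of the fix, as a sketch with stand-in types (fmClient, loadMapping, and setupFabricManager are illustrative, not the PR's code): disconnect the already-open client on the mapping-failure path before falling back to legacy mode:

```go
package main

import (
	"errors"
	"fmt"
)

// fmClient stands in for the connected Fabric Manager client.
type fmClient struct{ connected bool }

func (c *fmClient) Disconnect() { c.connected = false }

// loadMapping stands in for fabricmanager.LoadPCIModuleMapping; it always
// fails here to exercise the fallback path.
func loadMapping(path string) (map[string]uint32, error) {
	return nil, errors.New("mapping file not found")
}

// setupFabricManager returns true only when FM mode fully initializes.
// The fix is the Disconnect call on the mapping-failure branch.
func setupFabricManager(c *fmClient, mappingPath string) bool {
	if _, err := loadMapping(mappingPath); err != nil {
		fmt.Printf("WARNING: Failed to load PCI module mapping: %v\n", err)
		c.Disconnect() // release the already-open FM connection before fallback
		fmt.Println("Falling back to legacy device plugin mode")
		return false
	}
	return true
}

func main() {
	c := &fmClient{connected: true} // pretend Connect() already succeeded
	setupFabricManager(c, "some-mapping-path")
	fmt.Println("connected after fallback:", c.connected)
}
```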

