Description
Hello,
At KubeCon I heard about a feature that enables or disables MIG and creates MIG slices based on a pod's requests, similar to the gpu-test4 example. I believe this is exactly what DRA is meant for.
What I expect is that MIG is disabled on the node by default. As soon as a pod requests a MIG device via resourceClaims (for example, mig-1g.10gb), the Kubernetes scheduler should find a MIG-capable node, enable MIG, and create the required slice. Is this dynamic slicing already released? I'm trying to implement it but am running into some issues. Before I share more details, I wanted to know whether this is already doable and, if so, whether there is documentation for it.
So far, I have been able to create MIG slices statically via the GPU Operator's ConfigMap and assign them with a ResourceClaimTemplate and ResourceClaim. However, if I want a different slicing of the GPU, I have to change the ConfigMap manually before pods can use the new MIG slices. I would like the operator to slice the GPU automatically based on pod requests.
GPU used: NVIDIA-H100-PCIe
kube-apiserver version: v1.35.0