Feature availability: Dynamic MIG slicing based on k8s pod requests #894

@Pavel-Okruhlica-SZN

Hello,

At KubeCon I heard about a feature that enables or disables MIG and creates MIG slices based on a pod's requests, something similar to gpu-test4, and I believe DRA is meant for exactly this.
What I expect is that MIG is disabled on a node by default, but as soon as a pod requests a MIG device via resourceClaims (for example mig-1g.10gb), the Kubernetes scheduler finds a MIG-capable node, MIG gets enabled on it, and the required slice is created. Is this dynamic slicing already released? I'm trying to implement it but am running into some issues. Before I share deeper details, I first wanted to know whether this is doable at all today and, if so, whether there is documentation for it.

So far, I have been able to create MIG slices statically via the GPU Operator's ConfigMap and assign those slices with a ResourceClaimTemplate and ResourceClaim. However, if I want a different slicing of the GPU, I have to change the ConfigMap manually before pods can use the new MIG slices. I would like the operator to slice the GPU automatically based on pod requests instead.
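To make the request concrete, here is a rough sketch of the kind of claim I mean, modeled on the gpu-test4 example. The object names, the `mig.nvidia.com` device class, and the `gpu.nvidia.com` attribute path in the CEL expression are assumptions for illustration and may not match a given driver release.

```yaml
# Sketch only: names, device class, and attribute path are assumptions.
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: mig-1g-10gb                  # hypothetical name
spec:
  spec:
    devices:
      requests:
      - name: mig
        exactly:
          deviceClassName: mig.nvidia.com   # assumed device class
          selectors:
          - cel:
              # Assumed attribute path; selects the 1g.10gb MIG profile.
              expression: "device.attributes['gpu.nvidia.com'].profile == '1g.10gb'"
---
apiVersion: v1
kind: Pod
metadata:
  name: mig-consumer                 # hypothetical name
spec:
  containers:
  - name: ctr
    image: ubuntu:22.04
    command: ["nvidia-smi", "-L"]
    resources:
      claims:
      - name: mig                    # refers to the pod-level claim below
  resourceClaims:
  - name: mig
    resourceClaimTemplateName: mig-1g-10gb
```

With static slicing, a claim like this only binds if a matching 1g.10gb slice already exists on the node. The behavior I am asking about is the slice being created on demand when the claim is allocated.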

GPU used: NVIDIA-H100-PCIe
kube-apiserver version: v1.35.0

Labels: question
Status: Backlog