Plugin Fails to Assign MAC Address to Pre-existing Virtual Functions (VFs) on Mellanox NICs #29

@tariromukute

First and foremost, thank you for creating and maintaining this project. It has been a great help in simplifying our SR-IOV and Docker setup.

I've encountered an issue where the SR-IOV CNI plugin does not assign a MAC address to a Virtual Function (VF) if that VF was created manually before the container is started. This behavior appears to be specific to our Mellanox network cards; we have not observed the same issue with Intel NICs.

In our environment, we need to pre-create a specific number of VFs on each of our four physical NICs. If we let the plugin create the maximum number of VFs automatically, some of the NICs fail to initialise, which we believe is due to resource exhaustion. Manually creating a limited number of VFs beforehand is a necessary step to keep the system stable.
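The pre-creation step described above can be sketched as a small helper that caps the request at what the device reports. This is only an illustration: `set_numvfs` and the `SYSFS_NET` override are hypothetical names and not part of the plugin, while `sriov_totalvfs` and `sriov_numvfs` are the standard kernel sysfs attributes.

```shell
#!/bin/sh
# Sketch: request a fixed number of VFs on a PF, capped at the device limit.
# SYSFS_NET is a hypothetical override so the helper can be tried without hardware.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

set_numvfs() {
    pf="$1"
    want="$2"
    dev="$SYSFS_NET/$pf/device"

    # sriov_totalvfs is the maximum number of VFs the device supports.
    total=$(cat "$dev/sriov_totalvfs") || return 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs but $pf supports at most $total" >&2
        return 1
    fi

    # The kernel refuses to change one non-zero VF count to another,
    # so reset to 0 before writing the new value.
    echo 0 > "$dev/sriov_numvfs" || return 1
    echo "$want" > "$dev/sriov_numvfs"
}

# Usage (as root): set_numvfs ens2f0np0 2
```

Resetting to 0 destroys any existing VFs on that PF, so this should run before any container is attached.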

Steps to Reproduce

First, deploy the SR-IOV CNI plugin:

docker compose -f docker-compose-sriov-plugin.yaml up -d

Scenario 1: Plugin-Managed VFs (Working as Expected)

In this scenario, we allow the plugin to create the VFs automatically when the containers are launched.

  1. Create VFs and containers simultaneously:

    docker compose -f docker-compose-sriov.yaml up -d
  2. Observation:
    The `ip link` command shows that the VFs are correctly created and assigned unique MAC addresses.

Click to see `ip link` output

```
7: ens2f0np0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 5c:25:73:8c:e9:f0 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether aa:0a:6b:9e:db:f1 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 1     link/ether 16:c5:f5:47:da:77 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 2     link/ether 1e:fb:c5:b7:73:75 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 3     link/ether aa:da:51:ed:99:90 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 4     link/ether d6:9f:16:3f:bd:88 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 5     link/ether a2:d6:0c:ee:f7:56 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 6     link/ether 3e:4b:c6:47:c6:28 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off, query_rss off
    vf 7     link/ether 56:6c:b3:3f:b2:84 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    altname enp42s0f0np0
8: ens2f1np1: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 5c:25:73:8c:e9:f1 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether fa:c8:e5:49:5e:09 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 1     link/ether 6e:08:da:51:2b:a9 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 2     link/ether d2:fd:47:3e:de:03 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 3     link/ether c6:35:b6:6f:f1:b7 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 4     link/ether c2:a5:a2:bf:5e:a2 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 5     link/ether 86:94:88:07:00:ac brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 6     link/ether d2:47:96:95:ef:db brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off, query_rss off
    vf 7     link/ether 6a:35:2a:52:0f:44 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    altname enp42s0f1np1
9: docker0: mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 96:f1:bf:be:c5:83 brd ff:ff:ff:ff:ff:ff
221: ens2f0v3: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether aa:da:51:ed:99:90 brd ff:ff:ff:ff:ff:ff permaddr f6:90:eb:4c:eb:2f
    altname enp42s0f0v3
227: ens2f0v4: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d6:9f:16:3f:bd:88 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v4
228: ens2f0v2: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 1e:fb:c5:b7:73:75 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v2
229: ens2f0v0: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether aa:0a:6b:9e:db:f1 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v0
230: ens2f0v7: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 56:6c:b3:3f:b2:84 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v7
231: ens2f0v5: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether a2:d6:0c:ee:f7:56 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v5
232: ens2f0v1: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 16:c5:f5:47:da:77 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v1
242: ens2f1v4: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether c2:a5:a2:bf:5e:a2 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v4
243: ens2f1v2: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d2:fd:47:3e:de:03 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v2
244: ens2f1v0: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether fa:c8:e5:49:5e:09 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v0
245: ens2f1v7: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 6a:35:2a:52:0f:44 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v7
246: ens2f1v5: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 86:94:88:07:00:ac brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v5
247: ens2f1v3: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether c6:35:b6:6f:f1:b7 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v3
248: ens2f1v1: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 6e:08:da:51:2b:a9 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v1
```

Scenario 2: Manually Pre-created VFs (Issue Occurs)

In this scenario, we create the VFs on the host before launching the containers.

  1. Manually create two VFs on each physical interface:

    echo 2 > /sys/class/net/ens2f0np0/device/sriov_numvfs
    echo 2 > /sys/class/net/ens2f1np1/device/sriov_numvfs
  2. Create containers and have the plugin attach the existing VFs:

    docker compose -f docker-compose-sriov.yaml up -d
  3. Observation:
    The output of `ip link` shows that the VFs exist, but the per-VF entries listed under each physical interface report a MAC address of 00:00:00:00:00:00. As a result, network connectivity (e.g., ping) between the client and server containers fails.

Click to see `ip link` output

```
7: ens2f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 5c:25:73:8c:e9:f0 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    altname enp42s0f0np0
8: ens2f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 5c:25:73:8c:e9:f1 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
    altname enp42s0f1np1
9: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 96:f1:bf:be:c5:83 brd ff:ff:ff:ff:ff:ff
249: ens2f1v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether fa:c8:e5:49:5e:09 brd ff:ff:ff:ff:ff:ff permaddr 6a:55:0b:fe:e4:6d
    altname enp42s0f1v0
250: ens2f1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 6e:08:da:51:2b:a9 brd ff:ff:ff:ff:ff:ff permaddr e2:15:1d:b9:95:19
    altname enp42s0f1v1
251: ens2f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether aa:0a:6b:9e:db:f1 brd ff:ff:ff:ff:ff:ff permaddr 16:f5:15:15:9f:22
    altname enp42s0f0v0
252: ens2f0v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 16:c5:f5:47:da:77 brd ff:ff:ff:ff:ff:ff permaddr 5a:26:c7:64:85:75
    altname enp42s0f0v1
```
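
For anyone trying to reproduce this, the affected VFs can be picked out of the `ip link` text mechanically. The helper below is a minimal sketch; `find_zero_mac_vfs` is a hypothetical name, and it assumes the plain-text layout shown above (one `vf N link/ether ...` line per VF under each physical interface).

```shell
#!/bin/sh
# Sketch: print "<PF> <VF index>" for every VF whose MAC is all zeros.
# Reads `ip link show` text output on stdin.
find_zero_mac_vfs() {
    awk '
        # Interface header line, e.g. "7: ens2f0np0: <...> mtu 1500 ..."
        /^[0-9]+: / {
            pf = $2
            sub(/:$/, "", pf)   # drop the trailing colon
            sub(/@.*/, "", pf)  # drop any "@parent" suffix
        }
        # Per-VF line, e.g. "vf 0  link/ether 00:00:00:00:00:00 brd ..."
        $1 == "vf" && $3 == "link/ether" && $4 == "00:00:00:00:00:00" {
            print pf, $2
        }
    '
}

# Usage: ip link show | find_zero_mac_vfs
```

Run against the Scenario 2 output, this should list VFs 0 and 1 on both `ens2f0np0` and `ens2f1np1`; against the Scenario 1 output it should print nothing.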

docker-compose-sriov-plugin.yaml

```
version: '2.4'
services:
  sriov-plugin:
    privileged: true
    image: rdma/sriov-plugin:latest
    container_name: sriov-plugin
    restart: always
    volumes:
      - /run/docker/plugins:/run/docker/plugins
      - /etc/docker:/etc/docker
      - /var/run:/var/run
    network_mode: host
```

docker-compose-sriov.yaml

```
version: '2.4'
services:

  client:
    privileged: true # To run ip r
    image: tariromukute/test-tools:latest
    container_name: client
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /sys/fs/bpf:/sys/fs/bpf
      - /sys/kernel/debug/:/sys/kernel/debug/
    command:
      - /bin/bash
      - -c
      - |
        ip r replace 192.168.5.0/24 dev eth0

        sysctl -w net.ipv4.ip_forward=1

        tail -f /dev/null
    networks:
      sriov-net-client:
        ipv4_address: 192.168.4.3

  server:
    privileged: true # To run ip r
    image: tariromukute/test-tools:latest
    container_name: server
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /sys/fs/bpf:/sys/fs/bpf
      - /sys/kernel/debug/:/sys/kernel/debug/
    command:
      - /bin/bash
      - -c
      - |
        ip r replace 192.168.4.0/24 dev eth0

        sysctl -w net.ipv4.ip_forward=1

        tail -f /dev/null
    networks:
      sriov-net-server:
        ipv4_address: 192.168.5.3

networks:
  sriov-net-client:
    name: sriov-net-client
    driver: sriov
    driver_opts:
      netdevice: ens2f0np0
    ipam:
      config:
        - subnet: "192.168.4.0/24"
  sriov-net-server:
    name: sriov-net-server
    driver: sriov
    driver_opts:
      netdevice: ens2f1np1
    ipam:
      config:
        - subnet: "192.168.5.0/24"
```
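
As a possible stopgap for Scenario 2, the MAC addresses could be set from the host between steps 1 and 2 using the standard `ip link set dev <PF> vf <N> mac <ADDR>` command. The sketch below is untested against this setup and is not confirmed to restore connectivity; `gen_mac`, `assign_vf_macs`, and the `DRY_RUN` switch are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch: give each pre-created VF a random locally administered MAC.

# Generate a random unicast, locally administered MAC address.
gen_mac() {
    # Six random bytes as two-digit hex words.
    set -- $(od -An -N6 -tx1 /dev/urandom)
    # Clear the multicast bit and set the locally-administered bit
    # in the first octet.
    first=$(printf '%02x' $(( (0x$1 & 0xfc) | 0x02 )))
    echo "$first:$2:$3:$4:$5:$6"
}

# Assign a MAC to VFs 0..count-1 of the given PF.
# With DRY_RUN set to a non-empty value, the commands are printed, not run.
assign_vf_macs() {
    pf="$1"
    count="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        ${DRY_RUN:+echo} ip link set dev "$pf" vf "$i" mac "$(gen_mac)"
        i=$((i + 1))
    done
}

# Usage (as root), matching the two VFs per PF created in Scenario 2:
#   assign_vf_macs ens2f0np0 2
#   assign_vf_macs ens2f1np1 2
```

Some drivers may only apply an address set this way after the VF driver is rebound, so the result should be checked with `ip link` before starting the containers.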
