Description
First and foremost, thank you for creating and maintaining this project. It has been a great help in simplifying our SR-IOV and Docker setup.
I've encountered an issue where the SR-IOV CNI plugin does not assign a MAC address to a Virtual Function (VF) if that VF was created manually before the container is started. This behavior appears to be specific to our Mellanox network cards; we have not observed the same issue with Intel NICs.
In our environment, we need to pre-create a specific number of VFs on each of our four physical NICs. If we let the plugin create the maximum number of VFs automatically, some of the NICs fail to initialise, which we believe is due to resource exhaustion. Manually creating a limited number of VFs beforehand is a necessary step to keep the system stable.
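The pre-creation step described above could be scripted along the following lines. This is only a minimal sketch, assuming the standard kernel sysfs attributes `sriov_totalvfs` and `sriov_numvfs`; the function name `create_vfs` and the `SYSFS_NET_ROOT` override are hypothetical names introduced here for illustration and are not part of the plugin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pre-create a fixed number of VFs on one physical function.
# SYSFS_NET_ROOT is an assumption added so the function can be pointed at a
# scratch directory; on a real host it defaults to /sys/class/net.
SYSFS_NET_ROOT="${SYSFS_NET_ROOT:-/sys/class/net}"

create_vfs() {
    local pf="$1" want="$2"
    local dev="${SYSFS_NET_ROOT}/${pf}/device"
    local total current

    # Refuse requests above what the device advertises.
    total="$(cat "${dev}/sriov_totalvfs")" || return 1
    if [ "${want}" -gt "${total}" ]; then
        echo "${pf}: requested ${want} VFs, device supports at most ${total}" >&2
        return 1
    fi

    # Nothing to do if the requested count is already configured.
    current="$(cat "${dev}/sriov_numvfs")" || return 1
    if [ "${current}" -eq "${want}" ]; then
        return 0
    fi

    # The kernel rejects changing a non-zero VF count directly, so reset first.
    if [ "${current}" -ne 0 ]; then
        echo 0 > "${dev}/sriov_numvfs" || return 1
    fi
    echo "${want}" > "${dev}/sriov_numvfs"
}

# Example (requires root on a real host):
#   create_vfs ens2f0np0 2
#   create_vfs ens2f1np1 2
```

Checking `sriov_totalvfs` first keeps the request within what the device advertises, although it does not by itself guard against the resource exhaustion we suspect when all four NICs are configured at their maximum.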
Steps to Reproduce
First, deploy the SR-IOV CNI plugin:
docker compose -f docker-compose-sriov-plugin.yaml up -d

Scenario 1: Plugin-Managed VFs (Working as Expected)
In this scenario, we allow the plugin to create the VFs automatically when the containers are launched.
- Create VFs and containers simultaneously:
docker compose -f docker-compose-sriov.yaml up -d
- Observation:
The `ip link` command shows that the VFs are correctly created and assigned unique MAC addresses.
Click to see `ip link` output
```
7: ens2f0np0: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 5c:25:73:8c:e9:f0 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether aa:0a:6b:9e:db:f1 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 1 link/ether 16:c5:f5:47:da:77 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 2 link/ether 1e:fb:c5:b7:73:75 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 3 link/ether aa:da:51:ed:99:90 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 4 link/ether d6:9f:16:3f:bd:88 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 5 link/ether a2:d6:0c:ee:f7:56 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 6 link/ether 3e:4b:c6:47:c6:28 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off, query_rss off
    vf 7 link/ether 56:6c:b3:3f:b2:84 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    altname enp42s0f0np0
8: ens2f1np1: mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 5c:25:73:8c:e9:f1 brd ff:ff:ff:ff:ff:ff
    vf 0 link/ether fa:c8:e5:49:5e:09 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 1 link/ether 6e:08:da:51:2b:a9 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 2 link/ether d2:fd:47:3e:de:03 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 3 link/ether c6:35:b6:6f:f1:b7 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 4 link/ether c2:a5:a2:bf:5e:a2 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 5 link/ether 86:94:88:07:00:ac brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    vf 6 link/ether d2:47:96:95:ef:db brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off, query_rss off
    vf 7 link/ether 6a:35:2a:52:0f:44 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust on, query_rss off
    altname enp42s0f1np1
9: docker0: mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 96:f1:bf:be:c5:83 brd ff:ff:ff:ff:ff:ff
221: ens2f0v3: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether aa:da:51:ed:99:90 brd ff:ff:ff:ff:ff:ff permaddr f6:90:eb:4c:eb:2f
    altname enp42s0f0v3
227: ens2f0v4: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d6:9f:16:3f:bd:88 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v4
228: ens2f0v2: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 1e:fb:c5:b7:73:75 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v2
229: ens2f0v0: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether aa:0a:6b:9e:db:f1 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v0
230: ens2f0v7: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 56:6c:b3:3f:b2:84 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v7
231: ens2f0v5: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether a2:d6:0c:ee:f7:56 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v5
232: ens2f0v1: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 16:c5:f5:47:da:77 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f0v1
242: ens2f1v4: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether c2:a5:a2:bf:5e:a2 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v4
243: ens2f1v2: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether d2:fd:47:3e:de:03 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v2
244: ens2f1v0: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether fa:c8:e5:49:5e:09 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v0
245: ens2f1v7: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 6a:35:2a:52:0f:44 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v7
246: ens2f1v5: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 86:94:88:07:00:ac brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v5
247: ens2f1v3: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether c6:35:b6:6f:f1:b7 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v3
248: ens2f1v1: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 6e:08:da:51:2b:a9 brd ff:ff:ff:ff:ff:ff
    altname enp42s0f1v1
```

Scenario 2: Manually Pre-created VFs (Issue Occurs)
In this scenario, we create the VFs on the host before launching the containers.
- Manually create two VFs on each physical interface:
echo 2 > /sys/class/net/ens2f0np0/device/sriov_numvfs
echo 2 > /sys/class/net/ens2f1np1/device/sriov_numvfs
- Create containers and have the plugin attach the existing VFs:
docker compose -f docker-compose-sriov.yaml up -d
- Observation:
The output of `ip link` shows that the VFs are attached but their MAC addresses are `00:00:00:00:00:00`. As a result, network connectivity (e.g., ping) between the client and server containers fails.
Click to see `ip link` output
7: ens2f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 5c:25:73:8c:e9:f0 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
altname enp42s0f0np0
8: ens2f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether 5c:25:73:8c:e9:f1 brd ff:ff:ff:ff:ff:ff
vf 0 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
vf 1 link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff, spoof checking off, link-state auto, trust off, query_rss off
altname enp42s0f1np1
9: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 96:f1:bf:be:c5:83 brd ff:ff:ff:ff:ff:ff
249: ens2f1v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether fa:c8:e5:49:5e:09 brd ff:ff:ff:ff:ff:ff permaddr 6a:55:0b:fe:e4:6d
altname enp42s0f1v0
250: ens2f1v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 6e:08:da:51:2b:a9 brd ff:ff:ff:ff:ff:ff permaddr e2:15:1d:b9:95:19
altname enp42s0f1v1
251: ens2f0v0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether aa:0a:6b:9e:db:f1 brd ff:ff:ff:ff:ff:ff permaddr 16:f5:15:15:9f:22
altname enp42s0f0v0
252: ens2f0v1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 16:c5:f5:47:da:77 brd ff:ff:ff:ff:ff:ff permaddr 5a:26:c7:64:85:75
altname enp42s0f0v1
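To make the symptom easier to check on other hosts, the zero-MAC VF entries in the output above can be picked out mechanically. The following is a minimal sketch; the function name `zero_mac_vfs` is hypothetical, and it assumes the `vf N link/ether <mac>` line format shown in this report (other iproute2 versions may print the VF lines differently).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the index of every VF whose MAC, as reported on
# the physical function, is all zeros.
# Reads `ip link show dev <pf>` text on stdin, in the format shown above.
zero_mac_vfs() {
    awk '$1 == "vf" && $3 == "link/ether" && $4 == "00:00:00:00:00:00" { print $2 }'
}

# Example on a real host:
#   ip link show dev ens2f0np0 | zero_mac_vfs
```

A possible workaround, which we have not verified on these Mellanox cards, would be to assign an administrative MAC to each reported VF with `ip link set dev ens2f0np0 vf 0 mac <address>` before starting the containers.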
docker-compose-sriov-plugin.yaml
```
version: '2.4'
services:
  sriov-plugin:
    privileged: true
    image: rdma/sriov-plugin:latest
    container_name: sriov-plugin
    restart: always
    volumes:
      - /run/docker/plugins:/run/docker/plugins
      - /etc/docker:/etc/docker
      - /var/run:/var/run
    network_mode: host
```
docker-compose-sriov.yaml
```
version: '2.4'
services:
  client:
    privileged: true # To run ip r
    image: tariromukute/test-tools:latest
    container_name: client
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /sys/fs/bpf:/sys/fs/bpf
      - /sys/kernel/debug/:/sys/kernel/debug/
    command:
      - /bin/bash
      - -c
      - |
        ip r replace 192.168.5.0/24 dev eth0
        sysctl -w net.ipv4.ip_forward=1
        tail -f /dev/null
    networks:
      sriov-net-client:
        ipv4_address: 192.168.4.3
  server:
    privileged: true # To run ip r
    image: tariromukute/test-tools:latest
    container_name: server
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /sys/fs/bpf:/sys/fs/bpf
      - /sys/kernel/debug/:/sys/kernel/debug/
    command:
      - /bin/bash
      - -c
      - |
        ip r replace 192.168.4.0/24 dev eth0
        sysctl -w net.ipv4.ip_forward=1
        tail -f /dev/null
    networks:
      sriov-net-server:
        ipv4_address: 192.168.5.3
networks:
  sriov-net-client:
    name: sriov-net-client
    driver: sriov
    driver_opts:
      netdevice: ens2f0np0
    ipam:
      config:
        - subnet: "192.168.4.0/24"
  sriov-net-server:
    name: sriov-net-server
    driver: sriov
    driver_opts:
      netdevice: ens2f1np1
    ipam:
      config:
        - subnet: "192.168.5.0/24"
```
</details>