MicroVM hangs during SMP boot with ≥16 vCPUs on dual-socket NUMA hosts #5744
Status: Open
Labels: Status: Awaiting author (indicates that an issue or pull request requires author action)
Description
Summary:
When starting a microVM with a higher vCPU count (e.g., 16 vCPUs) on a dual-socket NUMA host, the guest occasionally hangs during SMP initialization.
During the hang, some Firecracker vCPU threads remain blocked in futex_wait_queue, and the guest kernel does not complete bringing up all secondary CPUs.
The issue occurs randomly during VM startup and has been observed when running Firecracker via jailer.
Firecracker Version:
Firecracker v1.13.1
Environment:
Host kernel version: 6.1.23
Guest kernel version: 5.4.116 (the issue also persists with 5.10.245)
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
CPU family: 6
Model: 85
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
Stepping: 4
BogoMIPS: 4200.00
Virtualization features:
Virtualization: VT-x
Caches (sum of all):
L1d: 512 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 16 MiB (16 instances)
L3: 22 MiB (2 instances)
NUMA:
NUMA node(s): 2
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31
Vulnerabilities:
Gather data sampling: Mitigation; Microcode
Itlb multihit: KVM: Mitigation: Split huge pages
L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Mds: Mitigation; Clear CPU buffers; SMT vulnerable
Meltdown: Mitigation; PTI
Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable
Reg file data sampling: Not affected
Retbleed: Mitigation; IBRS
Spec rstack overflow: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; IBRS; IBPB conditional; STIBP conditional; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Srbds: Not affected
Tsx async abort: Mitigation; Clear CPU buffers; SMT vulnerable
VM Configuration:
"machine-config": {
"vcpu_count": 16,
"mem_size_mib": 72817,
"smt": false
}
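To check whether the hang depends on the vCPU count, the fragment above can be generated for several values. This is a minimal sketch, not the reporter's setup: `build_vm_config` is a hypothetical helper, and the kernel path, rootfs path, and boot arguments are placeholders that do not come from this report. Only the `machine-config` values are taken from it.

```python
import json


def build_vm_config(kernel_path, rootfs_path, vcpu_count=16, mem_size_mib=72817):
    """Return a Firecracker config-file dict around the reported machine-config.

    kernel_path, rootfs_path and boot_args are placeholders (assumptions);
    vcpu_count, mem_size_mib and smt default to the values in the report.
    """
    return {
        "boot-source": {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",  # assumed
        },
        "drives": [
            {
                "drive_id": "rootfs",
                "path_on_host": rootfs_path,
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        "machine-config": {
            "vcpu_count": vcpu_count,
            "mem_size_mib": mem_size_mib,
            "smt": False,
        },
    }


if __name__ == "__main__":
    # Emit one config per vCPU count to see where the hang starts to appear.
    for n in (8, 12, 16):
        print(json.dumps(build_vm_config("vmlinux", "rootfs.ext4", vcpu_count=n)))
```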
Steps to Reproduce:
Run Firecracker using jailer on a dual-socket NUMA host.
Configure the microVM with 16 vCPUs.
Boot a Linux guest kernel.
Observe that VM startup occasionally hangs during SMP initialization.
The issue does not occur consistently, but appears randomly when starting the VM.
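Because the hang is intermittent, the steps above can be looped to estimate how often it happens. This is a rough sketch under stated assumptions: the caller supplies the full launch command, the guest serial console reaches that command's stdout, and killing the launched process is enough to stop the VM. `boot_reaches_marker` and `count_hangs` are hypothetical names. The marker is the guest kernel's "smp: Brought up ..." line, which is printed once secondary CPU bring-up has finished.

```python
import subprocess
import sys
import tempfile
import time

DONE_MARKER = b"smp: Brought up"  # printed by the guest once SMP bring-up is done


def boot_reaches_marker(cmd, timeout_s, marker=DONE_MARKER, poll_s=0.1):
    """Run cmd and return True if marker shows up in its output before timeout_s."""
    with tempfile.NamedTemporaryFile() as log:
        proc = subprocess.Popen(
            cmd, stdin=subprocess.DEVNULL, stdout=log, stderr=subprocess.STDOUT
        )
        deadline = time.monotonic() + timeout_s
        try:
            while True:
                # Check exit status before reading so the final output is not missed.
                exited = proc.poll() is not None
                with open(log.name, "rb") as reader:
                    if marker in reader.read():
                        return True
                if exited or time.monotonic() >= deadline:
                    return False
                time.sleep(poll_s)
        finally:
            if proc.poll() is None:
                proc.kill()
            proc.wait()


def count_hangs(cmd, attempts, timeout_s):
    """Return how many of `attempts` boots did not reach the marker in time."""
    return sum(not boot_reaches_marker(cmd, timeout_s) for _ in range(attempts))


if __name__ == "__main__":
    # Usage: python repro_loop.py <launch command and its arguments>
    print(count_hangs(sys.argv[1:], attempts=50, timeout_s=30))
```

The launch command has to be re-runnable; any per-run cleanup the jailer setup needs between attempts is not handled here.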
Guest Kernel Logs:
During the failure the guest kernel stops while bringing up secondary CPUs:
[ 1.745873] smp: Bringing up secondary CPUs ...
[ 1.746812] x86: Booting SMP configuration:
The boot process does not proceed further.
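For triaging saved console logs, the failure signature above (bring-up started, never finished) can be checked mechanically. A small sketch, assuming the log is available as text; `classify_boot_log` and its return labels are hypothetical names chosen for illustration.

```python
def classify_boot_log(log_text):
    """Classify a guest console log by how far SMP bring-up got."""
    started = "smp: Bringing up secondary CPUs" in log_text
    finished = "smp: Brought up" in log_text
    if finished:
        return "smp_complete"
    if started:
        return "stuck_in_smp_bringup"
    return "smp_not_reached"


if __name__ == "__main__":
    sample = (
        "[ 1.745873] smp: Bringing up secondary CPUs ...\n"
        "[ 1.746812] x86: Booting SMP configuration:\n"
    )
    print(classify_boot_log(sample))
```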
Firecracker Thread State:
During the hang, Firecracker shows all vCPU threads created:
PID SPID TTY TIME CMD
2358731 2358731 pts/4 00:00:01 firecracker
2358731 2358743 pts/4 00:00:01 fc_vcpu 0
2358731 2358744 pts/4 00:00:00 fc_vcpu 1
2358731 2358745 pts/4 00:00:00 fc_vcpu 2
2358731 2358746 pts/4 00:00:00 fc_vcpu 3
2358731 2358747 pts/4 00:00:00 fc_vcpu 4
2358731 2358748 pts/4 00:00:00 fc_vcpu 5
2358731 2358749 pts/4 00:00:00 fc_vcpu 6
2358731 2358750 pts/4 00:00:00 fc_vcpu 7
2358731 2358751 pts/4 00:00:00 fc_vcpu 8
2358731 2358752 pts/4 00:00:00 fc_vcpu 9
2358731 2358753 pts/4 00:00:00 fc_vcpu 10
2358731 2358754 pts/4 00:00:00 fc_vcpu 11
2358731 2358755 pts/4 00:00:00 fc_vcpu 12
2358731 2358756 pts/4 00:00:00 fc_vcpu 13
2358731 2358757 pts/4 00:00:00 fc_vcpu 14
2358731 2358758 pts/4 00:00:00 fc_vcpu 15
Stuck vCPU Thread Stack:
Inspecting a stuck vCPU thread shows it blocked in a futex wait:
[<0>] futex_wait_queue+0x60/0x90
[<0>] futex_wait+0x185/0x270
[<0>] do_futex+0x106/0x1b0
[<0>] __x64_sys_futex+0x8e/0x1d0
[<0>] do_syscall_64+0x55/0xb0
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
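To see which vCPU threads are in this state rather than checking one at a time, the same information can be read from procfs for every thread. This is a sketch only: it assumes the threads are named `fc_vcpu N` as in the listing above, and that the caller is allowed to read `/proc/<pid>/task/<tid>/stack` (usually root). `vcpu_thread_states` and `blocked_in_futex` are hypothetical helper names.

```python
import sys
from pathlib import Path


def blocked_in_futex(stack_text):
    """Return True if a kernel stack dump contains a futex wait frame."""
    return any("futex_wait" in line for line in stack_text.splitlines())


def vcpu_thread_states(pid, proc_root="/proc"):
    """Map each fc_vcpu thread name to whether its kernel stack shows a futex wait."""
    states = {}
    task_dir = Path(proc_root, str(pid), "task")
    for task in sorted(task_dir.iterdir(), key=lambda p: int(p.name)):
        name = (task / "comm").read_text().strip()
        if not name.startswith("fc_vcpu"):
            continue
        try:
            stack = (task / "stack").read_text()
        except OSError:
            stack = ""  # unreadable stack (e.g. no permission) counts as not futex
        states[name] = blocked_in_futex(stack)
    return states


if __name__ == "__main__":
    # Usage: python vcpu_stacks.py <firecracker pid>
    for thread, in_futex in vcpu_thread_states(int(sys.argv[1])).items():
        print(thread, "futex_wait" if in_futex else "other")
```

Comparing the set of futex-blocked threads across several hung boots would show whether it is always the same vCPUs or a varying subset.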
Expected Behavior
The microVM should boot normally and the guest OS should complete SMP initialization and start executing the init process.