[Bug] Deadlocks when instrumenting tracing #5807
Description
Describe the bug
When Firecracker is instrumented for tracing with the default clippy-tracing command, there are at least two sources of deadlock.
- Firecracker process hangs while starting up:
a. main_exec() in main.rs → LOGGER.update(config)
b. Logger::update() in logging.rs → acquires the LOGGER mutex → calls open_file_nonblock() when log-path is configured.
c. open_file_nonblock() is instrumented → __Instrument::new() → log::trace!()
d. Logger::log() tries to acquire LOGGER mutex → deadlock
- When resuming from snapshot:
a. Main thread calls resume_vm() → send_event() → sends Resume on channel, sets immediate_exit = 1 → sends RT signal to fc_vcpu
b. fc_vcpu is in paused(), wakes up from recv(), receives Resume, checks immediate_exit = 1 and calls warn!() → Logger::log() → holds LOGGER mutex.
c. The RT signal arrives and interrupts fc_vcpu while it still holds the LOGGER mutex. The signal handler handle_signal is instrumented, so it also tries to acquire the LOGGER mutex on the same thread and deadlocks.
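
Both cases come down to the same thread trying to take a non-reentrant mutex it already holds. The following is a minimal sketch of the startup case, not Firecracker's actual code: the `Logger` struct, its `state` field and the method bodies are hypothetical stand-ins, and `try_lock()` is used in place of `lock()` so the example reports the re-entry instead of hanging.

```rust
use std::sync::{Mutex, TryLockError};

// Hypothetical stand-in for the global logger; not Firecracker's real type.
struct Logger {
    state: Mutex<Vec<String>>,
}

impl Logger {
    // Models Logger::log(): it needs the mutex to write a record.
    // Returns false in the situation where a blocking lock() would hang.
    fn log(&self, msg: &str) -> bool {
        match self.state.try_lock() {
            Ok(mut records) => {
                records.push(msg.to_string());
                true
            }
            // The mutex is already held, here by the calling thread itself.
            Err(TryLockError::WouldBlock) => false,
            Err(TryLockError::Poisoned(_)) => false,
        }
    }

    // Models Logger::update(): holds the mutex while calling a helper.
    fn update(&self) -> bool {
        let _guard = self.state.lock().unwrap();
        self.open_file_nonblock()
    }

    // Models the instrumented helper: the injected tracing code logs on entry.
    fn open_file_nonblock(&self) -> bool {
        self.log("enter open_file_nonblock")
    }
}
```

With a `Logger { state: Mutex::new(Vec::new()) }`, calling `log()` directly succeeds, while `update()` returns false because the nested `log()` finds the mutex already taken. With a blocking `lock()` in `log()`, that nested call would never return.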
To Reproduce
As above.
Expected behaviour
No deadlocks. As a workaround, I had to exclude utils/ and vcpu.rs from the tracing instrumentation.
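
One possible mitigation, short of excluding files from instrumentation, is a per-thread re-entrancy guard that drops nested log calls instead of blocking on the mutex. This is only a sketch under assumptions: `IN_LOGGER`, `RECORDS`, `ReentryGuard`, `log` and `log_then` are hypothetical names, and nothing here is taken from Firecracker's logger.

```rust
use std::cell::Cell;
use std::sync::Mutex;

thread_local! {
    // True while the current thread is inside the logger (hypothetical flag).
    static IN_LOGGER: Cell<bool> = Cell::new(false);
}

// Hypothetical stand-in for the state protected by the LOGGER mutex.
static RECORDS: Mutex<Vec<String>> = Mutex::new(Vec::new());

// Clears the per-thread flag when the outermost log call finishes.
struct ReentryGuard;

impl Drop for ReentryGuard {
    fn drop(&mut self) {
        IN_LOGGER.with(|flag| flag.set(false));
    }
}

// Logs `msg`, then runs `while_locked` with the mutex still held, to simulate
// an instrumented helper being called from inside the logger.
// Returns false if the record was dropped because this thread is already logging.
fn log_then(msg: &str, while_locked: impl FnOnce()) -> bool {
    if IN_LOGGER.with(|flag| flag.replace(true)) {
        return false; // nested call on the same thread: skip instead of blocking
    }
    let _reset = ReentryGuard;
    let mut records = RECORDS.lock().unwrap();
    records.push(msg.to_string());
    while_locked();
    true
}

fn log(msg: &str) -> bool {
    log_then(msg, || {})
}
```

The trade-off is that nested trace records are silently lost. For the signal-handler case, the more conservative option is to keep the handler free of logging altogether, since taking a lock inside a signal handler is not async-signal-safe.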
Environment
- Firecracker version: 1.15.0
- Host and guest kernel versions:
- Rootfs used:
- Architecture:
- Any other relevant software versions:
Checks
- Have you searched the Firecracker Issues database for similar problems?
- Have you read the existing relevant Firecracker documentation?
- Are you certain the bug being reported is a Firecracker issue?