Fix: call playback started in sound device callback (console mode) #4958
chenghao-mou wants to merge 1 commit into main
    self._playback_started_fired = True
    t = _wall_time()
    self._loop.call_soon_threadsafe(lambda: self.on_playback_started(created_at=t))
🟡 Stale on_playback_started can fire after on_playback_finished due to call_soon_threadsafe scheduling
When playback is interrupted, a stale on_playback_started event can fire after on_playback_finished for the same (or even a new) segment, because the event is scheduled asynchronously via call_soon_threadsafe from the sound device thread.
Race condition timeline
1. Sound device thread (under audio_lock): _maybe_mark_playback_started() sets _playback_started_fired = True and calls self._loop.call_soon_threadsafe(lambda: self.on_playback_started(created_at=t)). The callback is now queued on the event loop but not yet executed.
2. Event loop thread: clear_buffer() runs, acquires audio_lock, resets _playback_started_fired = False, then sets _interrupted_ev.
3. Event loop thread: _wait_for_playout resumes and calls self.on_playback_finished(...) at cli.py:245.
4. Event loop thread: The stale on_playback_started callback from step 1 finally executes, after on_playback_finished.
This violates the expected playback_started → playback_finished ordering. If a new segment has started by step 4 and a new listener is registered, the stale event could set first_frame_fut in livekit-agents/livekit/agents/voice/generation.py:367-368 with an incorrect (old) timestamp.
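The ordering in the timeline above can be reproduced in isolation. The following is a minimal sketch, not the code in cli.py: a plain thread stands in for the sound device callback, and the names `_race_scenario`, `audio_callback`, and `run_race_demo` are hypothetical. It only shows that a callback queued with `call_soon_threadsafe` runs after whatever the event loop thread does synchronously before it yields.

```python
import asyncio
import threading


async def _race_scenario() -> list[str]:
    loop = asyncio.get_running_loop()
    events: list[str] = []

    def audio_callback() -> None:
        # Stand-in for the sound device thread: the "started" event is only
        # queued here; it does not run until the event loop gets control back.
        loop.call_soon_threadsafe(events.append, "started")

    worker = threading.Thread(target=audio_callback)
    worker.start()
    worker.join()  # blocks the loop briefly; acceptable for a demo only

    # Stand-in for clear_buffer() followed by on_playback_finished(): this
    # runs on the event loop thread before the queued callback can execute.
    events.append("finished")

    await asyncio.sleep(0.05)  # yield so the queued callback finally runs
    return events


def run_race_demo() -> list[str]:
    return asyncio.run(_race_scenario())
```

Calling `run_race_demo()` yields the events in the inverted order, with "finished" recorded before "started".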
The primary consumer in generation.py has an if not out.first_frame_fut.done() guard, which limits the damage, and this only affects console mode, so the practical impact is limited. A fix would be to add a generation counter or sequence ID so that stale callbacks can be detected and dropped in the lambda.
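To illustrate what the done() guard does and does not cover, here is a small sketch. The handler below is a hypothetical stand-in, not the code in generation.py, and the timestamps are made up. A duplicate event on an already resolved future is ignored, but a stale event that reaches a fresh future first still wins.

```python
import asyncio


def on_playback_started(first_frame_fut: asyncio.Future, created_at: float) -> None:
    # Hypothetical handler mirroring the guard described above: the future
    # is resolved at most once, so later events are silently ignored.
    if not first_frame_fut.done():
        first_frame_fut.set_result(created_at)


async def _guard_demo() -> tuple[float, float]:
    loop = asyncio.get_running_loop()

    # Case 1: the genuine event arrives first, the stale one is ignored.
    fut = loop.create_future()
    on_playback_started(fut, 10.0)
    on_playback_started(fut, 3.0)  # stale duplicate, dropped by the guard

    # Case 2: a stale event from the previous segment reaches the new
    # segment's future first, so the old timestamp is recorded.
    new_fut = loop.create_future()
    on_playback_started(new_fut, 3.0)   # stale, but the future is not done yet
    on_playback_started(new_fut, 20.0)  # genuine event arrives too late

    return fut.result(), new_fut.result()


def run_guard_demo() -> tuple[float, float]:
    return asyncio.run(_guard_demo())
```

In this sketch the second case is the one the review comment warns about: the guard prevents a double set, not a wrong first value.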
Prompt for agents
In livekit-agents/livekit/agents/cli/cli.py, the _maybe_mark_playback_started method at lines 255-261 schedules on_playback_started via call_soon_threadsafe, but the callback can execute after on_playback_finished if an interruption occurs between scheduling and execution.
To fix this, add a generation counter (e.g. self._playback_generation: int = 0) that increments on every reset (in clear_buffer and _wait_for_playout). Capture the current generation in _maybe_mark_playback_started and check it in the scheduled lambda:
def _maybe_mark_playback_started(self) -> None:
    if self._playback_started_fired:
        return
    self._playback_started_fired = True
    t = _wall_time()
    gen = self._playback_generation  # snapshot taken on the sound device thread

    def _fire() -> None:
        # Runs on the event loop; skip if a reset happened after scheduling.
        if self._playback_generation == gen:
            self.on_playback_started(created_at=t)

    self._loop.call_soon_threadsafe(_fire)
Increment self._playback_generation wherever _playback_started_fired is reset to False (lines 199, 252).
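A self-contained sketch of the generation-counter idea follows, so it can be run outside the project. `PlaybackTracker` and everything in it are hypothetical stand-ins for the console audio output, a plain thread plays the role of the sound device callback, and the "finished" event is recorded directly in `clear_buffer` for brevity. It is one possible shape of the fix under those assumptions, not the merged implementation.

```python
import asyncio
import threading
import time


class PlaybackTracker:
    """Hypothetical stand-in for the console audio output."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop
        self._audio_lock = threading.Lock()
        self._playback_started_fired = False
        self._playback_generation = 0
        self.events: list[str] = []

    def maybe_mark_playback_started(self) -> None:
        # Called from the audio thread: snapshot the generation under the lock.
        with self._audio_lock:
            if self._playback_started_fired:
                return
            self._playback_started_fired = True
            created_at = time.time()
            gen = self._playback_generation
        self._loop.call_soon_threadsafe(self._emit_started, gen, created_at)

    def _emit_started(self, gen: int, created_at: float) -> None:
        # Runs on the event loop: drop the event if a reset happened since.
        if gen != self._playback_generation:
            return
        self.events.append(f"started:{gen}")

    def clear_buffer(self) -> None:
        # Called on the event loop: reset the flag and bump the generation.
        with self._audio_lock:
            finished_gen = self._playback_generation
            self._playback_started_fired = False
            self._playback_generation += 1
        self.events.append(f"finished:{finished_gen}")


def _run_on_audio_thread(fn) -> None:
    worker = threading.Thread(target=fn)
    worker.start()
    worker.join()  # blocks the loop briefly; acceptable for a demo only


async def _fixed_scenario() -> list[str]:
    tracker = PlaybackTracker(asyncio.get_running_loop())
    _run_on_audio_thread(tracker.maybe_mark_playback_started)  # segment 0
    tracker.clear_buffer()       # interruption before the callback runs
    await asyncio.sleep(0.05)    # the stale callback executes and is dropped
    _run_on_audio_thread(tracker.maybe_mark_playback_started)  # segment 1
    await asyncio.sleep(0.05)
    return tracker.events


def run_fixed_demo() -> list[str]:
    return asyncio.run(_fixed_scenario())
```

Here the stale "started" event for the interrupted segment never appears, while the next segment still reports its own start.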
Previously, on_playback_started was called when the first frame was captured. Now it is delayed until the sound device callback has actually consumed data. This improves the accuracy of the start timestamp by about 30 to 100 ms.