Fix 2-second systemd notify stalls in cloud-hypervisor#493
Fix 2-second systemd notify stalls in cloud-hypervisor#493rzhikharevich wants to merge 3 commits intomicrovm-nix:mainfrom
Conversation
|
I think |
|
@RaitoBezarius Please help review. |
|
Unfortunately, there is no EOF because cloud-hypervisor doesn't seem to propagate the socket half-close to the host socket, so |
|
|
|
By the way, should I use |
| # -T2 is required because cloud-hypervisor does not handle partial | ||
| # shutdown of the stream, like systemd v256+ does. | ||
| ${pkgs.socat}/bin/socat -T2 UNIX-LISTEN:${vsockPath}_8888,fork UNIX-SENDTO:$NOTIFY_SOCKET & | ||
| VSOCK_NOTIFY_PATH=${vsockPath}_8888 ${vsockNotifyProxy}/bin/vsock-notify-proxy & |
There was a problem hiding this comment.
| VSOCK_NOTIFY_PATH=${vsockPath}_8888 ${vsockNotifyProxy}/bin/vsock-notify-proxy & | |
| ${lib.getExe vsockNotifyProxy} ${vsockPath}_8888 & |
| path = os.environ["VSOCK_NOTIFY_PATH"] | ||
| server = await asyncio.start_unix_server(handle, path=path) |
There was a problem hiding this comment.
| path = os.environ["VSOCK_NOTIFY_PATH"] | |
| server = await asyncio.start_unix_server(handle, path=path) | |
| server = await asyncio.start_unix_server(handle, path=sys.argv[0]) |
We can make this kinda dirty here, as we control all invocations and this fails loudly when no arg is there, os.environ does IIRC return empty string.
| import asyncio | ||
| import os | ||
| import socket |
There was a problem hiding this comment.
glad we do not need anything outside the stdenv.
| # An important caveat is that the message could theoretically be truncated but this should not | ||
| # happen in practice unless the message is large (systemd notifications are small). | ||
| # | ||
| # TODO(rzhikharevich): Ideally, cloud-hypervisor should propagate the half-close, then the |
There was a problem hiding this comment.
| # TODO(rzhikharevich): Ideally, cloud-hypervisor should propagate the half-close, then the | |
| # TODO: Ideally, cloud-hypervisor should propagate the half-close, then the |
|
|
||
|
|
||
| async def handle(reader, writer): | ||
| data = await reader.read(65536) |
There was a problem hiding this comment.
Can we detect with a one line if we read to little and print that to stdout?
There was a problem hiding this comment.
Pull request overview
This PR addresses slow microvm boot times when running cloud-hypervisor under systemd with microvm.vsock.cid enabled, by replacing the previous socat-based notify forwarding (which introduced a 2s stall per notification) with a purpose-built proxy that immediately closes the vsock-stream connection after a single read.
Changes:
- Add a Python-based
vsock-notify-proxythat forwards guest notify messages from the cloud-hypervisor vsock socket to the hostNOTIFY_SOCKET. - Replace the
socat -T2forwarding logic inpreStartwith the new proxy to avoid repeated systemd stalls.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| notify_socket = os.environ["NOTIFY_SOCKET"] | ||
| notify = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) | ||
|
|
||
|
|
||
| async def handle(reader, writer): | ||
| data = await reader.read(65536) | ||
| writer.close() | ||
| if data: | ||
| notify.sendto(data, notify_socket) |
There was a problem hiding this comment.
NOTIFY_SOCKET can legally start with @ to indicate an abstract Unix socket (sd_notify uses this convention). Passing that string directly to sendto() will try to send to a pathname literally starting with @ and will fail with ENOENT in those environments. Consider translating a leading @ to a \0-prefixed address before calling sendto() (or otherwise handling the abstract-namespace case).
There was a problem hiding this comment.
I would also be fine to just error out if notify_socket starts with @.
| data = await reader.read(65536) | ||
| writer.close() | ||
| if data: | ||
| notify.sendto(data, notify_socket) | ||
|
|
||
|
|
There was a problem hiding this comment.
await reader.read(65536) on a stream socket may return a partial message even when the total message is small (stream reads aren’t message-framed). This means the truncation risk isn’t limited to “large” notifications; it can also happen if the write is split across packets/scheduling. To reduce the chance of forwarding partial notifications without reintroducing multi-second stalls, consider draining the socket in a short loop (e.g., read initial data, then keep reading additional immediately-available bytes with a very small timeout / non-blocking drain) before closing the connection.
| data = await reader.read(65536) | |
| writer.close() | |
| if data: | |
| notify.sendto(data, notify_socket) | |
| # Read initial data, then drain any immediately-available extra bytes with a very | |
| # small timeout to reduce the chance of truncating a notification split across | |
| # multiple packets/scheduling slices. | |
| buffer = bytearray() | |
| chunk = await reader.read(65536) | |
| if chunk: | |
| buffer.extend(chunk) | |
| # Drain any additional data that becomes available immediately without | |
| # incurring a multi-second stall. | |
| while True: | |
| try: | |
| chunk = await asyncio.wait_for(reader.read(65536), 0.01) | |
| except asyncio.TimeoutError: | |
| break | |
| if not chunk: | |
| break | |
| buffer.extend(chunk) | |
| writer.close() | |
| if buffer: | |
| notify.sendto(bytes(buffer), notify_socket) |
There was a problem hiding this comment.
I do not do async python programming, so I do not know if that is a good solution or not.
|
|
||
| async def handle(reader, writer): | ||
| data = await reader.read(65536) | ||
| writer.close() |
There was a problem hiding this comment.
After calling writer.close(), it’s generally best to await writer.wait_closed() (or equivalent) to ensure the transport is actually closed and file descriptors are released promptly. This helps avoid accumulating open connections under load and prevents noisy ResourceWarnings in some Python/asyncio versions.
| writer.close() | |
| writer.close() | |
| await writer.wait_closed() |
This is a fix for #474. See the code comment for explanation and caveats.
Relevant parts of: