Skip to content

Fix 2-second systemd notify stalls in cloud-hypervisor#493

Open
rzhikharevich wants to merge 3 commits intomicrovm-nix:mainfrom
rzhikharevich:cloud-hv-notify-proxy-fix
Open

Fix 2-second systemd notify stalls in cloud-hypervisor#493
rzhikharevich wants to merge 3 commits intomicrovm-nix:mainfrom
rzhikharevich:cloud-hv-notify-proxy-fix

Conversation

@rzhikharevich
Copy link
Copy Markdown

@rzhikharevich rzhikharevich commented Mar 27, 2026

This is a fix for #474. See the code comment for explanation and caveats.

Relevant parts of:

@rzhikharevich rzhikharevich changed the title Fix 2-second systemd notify deadlocks Fix 2-second systemd notify stalls Mar 27, 2026
@rzhikharevich rzhikharevich changed the title Fix 2-second systemd notify stalls Fix 2-second systemd notify stalls in cloud-hypervisor Mar 27, 2026
@yorickvP
Copy link
Copy Markdown

I think reader.read(-1) should also work, it reads until EOF

@astro
Copy link
Copy Markdown
Member

astro commented Mar 30, 2026

@RaitoBezarius Please help review.

@rzhikharevich
Copy link
Copy Markdown
Author

Unfortunately, there is no EOF because cloud-hypervisor doesn't seem to propagate the socket half-close to the host socket, so read(-1) doesn't work.

@rzhikharevich
Copy link
Copy Markdown
Author

rzhikharevich commented Mar 31, 2026

read(-1) hangs until systemd decides to kill the unit because of the activation timeout:

roman@nixform ~/D/S/microvm.nix (cloud-hv-notify-proxy-fix) [1]> pgrep -af notify-proxy
322387 /nix/store/pzdalg368npikvpq4ncz2saxnz19v53k-python3-3.13.12/bin/python3.13 /nix/store/lqm440b0y5alm3nb6fj044y7hr5aykxs-vsock-notify-proxy/bin/vsock-notify-proxy
roman@nixform ~/D/S/microvm.nix (cloud-hv-notify-proxy-fix)> sudo strace -p 322387
strace: Process 322387 attached
epoll_wait(5

@rzhikharevich
Copy link
Copy Markdown
Author

By the way, should I use vmHostPackages or pkgs here? I see that some other runners use the former but cloud-hypervisor.nix doesn't seem to.

# -T2 is required because cloud-hypervisor does not handle partial
# shutdown of the stream, like systemd v256+ does.
${pkgs.socat}/bin/socat -T2 UNIX-LISTEN:${vsockPath}_8888,fork UNIX-SENDTO:$NOTIFY_SOCKET &
VSOCK_NOTIFY_PATH=${vsockPath}_8888 ${vsockNotifyProxy}/bin/vsock-notify-proxy &
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
VSOCK_NOTIFY_PATH=${vsockPath}_8888 ${vsockNotifyProxy}/bin/vsock-notify-proxy &
${lib.getExe vsockNotifyProxy} ${vsockPath}_8888 &

Comment on lines +160 to +161
path = os.environ["VSOCK_NOTIFY_PATH"]
server = await asyncio.start_unix_server(handle, path=path)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
path = os.environ["VSOCK_NOTIFY_PATH"]
server = await asyncio.start_unix_server(handle, path=path)
server = await asyncio.start_unix_server(handle, path=sys.argv[0])

We can make this kinda dirty here, as we control all invocations and this fails loudly when no arg is there, os.environ does IIRC return empty string.

Comment on lines +144 to +146
import asyncio
import os
import socket
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

glad we do not need anything outside the stdenv.

# An important caveat is that the message could theoretically be truncated but this should not
# happen in practice unless the message is large (systemd notifications are small).
#
# TODO(rzhikharevich): Ideally, cloud-hypervisor should propagate the half-close, then the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# TODO(rzhikharevich): Ideally, cloud-hypervisor should propagate the half-close, then the
# TODO: Ideally, cloud-hypervisor should propagate the half-close, then the



async def handle(reader, writer):
data = await reader.read(65536)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we detect with a one line if we read to little and print that to stdout?

@SuperSandro2000 SuperSandro2000 requested a review from Copilot March 31, 2026 13:16
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses slow microvm boot times when running cloud-hypervisor under systemd with microvm.vsock.cid enabled, by replacing the previous socat-based notify forwarding (which introduced a 2s stall per notification) with a purpose-built proxy that immediately closes the vsock-stream connection after a single read.

Changes:

  • Add a Python-based vsock-notify-proxy that forwards guest notify messages from the cloud-hypervisor vsock socket to the host NOTIFY_SOCKET.
  • Replace the socat -T2 forwarding logic in preStart with the new proxy to avoid repeated systemd stalls.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +148 to +156
notify_socket = os.environ["NOTIFY_SOCKET"]
notify = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)


async def handle(reader, writer):
data = await reader.read(65536)
writer.close()
if data:
notify.sendto(data, notify_socket)
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTIFY_SOCKET can legally start with @ to indicate an abstract Unix socket (sd_notify uses this convention). Passing that string directly to sendto() will try to send to a pathname literally starting with @ and will fail with ENOENT in those environments. Consider translating a leading @ to a \0-prefixed address before calling sendto() (or otherwise handling the abstract-namespace case).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also be fine to just error out if notify_socket starts with @.

Comment on lines +153 to +158
data = await reader.read(65536)
writer.close()
if data:
notify.sendto(data, notify_socket)


Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

await reader.read(65536) on a stream socket may return a partial message even when the total message is small (stream reads aren’t message-framed). This means the truncation risk isn’t limited to “large” notifications; it can also happen if the write is split across packets/scheduling. To reduce the chance of forwarding partial notifications without reintroducing multi-second stalls, consider draining the socket in a short loop (e.g., read initial data, then keep reading additional immediately-available bytes with a very small timeout / non-blocking drain) before closing the connection.

Suggested change
data = await reader.read(65536)
writer.close()
if data:
notify.sendto(data, notify_socket)
# Read initial data, then drain any immediately-available extra bytes with a very
# small timeout to reduce the chance of truncating a notification split across
# multiple packets/scheduling slices.
buffer = bytearray()
chunk = await reader.read(65536)
if chunk:
buffer.extend(chunk)
# Drain any additional data that becomes available immediately without
# incurring a multi-second stall.
while True:
try:
chunk = await asyncio.wait_for(reader.read(65536), 0.01)
except asyncio.TimeoutError:
break
if not chunk:
break
buffer.extend(chunk)
writer.close()
if buffer:
notify.sendto(bytes(buffer), notify_socket)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not do async python programming, so I do not know if that is a good solution or not.


async def handle(reader, writer):
data = await reader.read(65536)
writer.close()
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After calling writer.close(), it’s generally best to await writer.wait_closed() (or equivalent) to ensure the transport is actually closed and file descriptors are released promptly. This helps avoid accumulating open connections under load and prevents noisy ResourceWarnings in some Python/asyncio versions.

Suggested change
writer.close()
writer.close()
await writer.wait_closed()

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants