Hey folks,
I tried upgrading from 2.5.22 to 2.6.7 using the operator, but the upgrade fails because the operator rolls out the streaming node first. The streaming node seems to need a mixcoord running 2.6.0 or newer, which obviously isn't available yet because the mixcoord hasn't been upgraded. My deployment already uses the consolidated mixcoord as specified in the upgrade docs. As a result, the streaming node never becomes healthy, and I see logs like these:
[2025/12/05 23:00:05.847 +00:00] [WARN] [lazygrpc/conn.go:67] ["async dial failed, wait for retry..."] [error="context deadline exceeded"]
[2025/12/05 23:00:13.446 +00:00] [INFO] [resolver/resolver_with_discoverer.go:157] ["new grpc resolver registered"] [prefix=milvus-cluster/meta/session/mixcoord] [exclusive=true] [semver=">=2.6.0-dev"] [component=grpc-resolver] [scheme=milvus-session] [id=176]
[2025/12/05 23:00:13.446 +00:00] [WARN] [resolver/watch_based_grpc_resolver.go:54] ["fail to update resolver state"] [prefix=milvus-cluster/meta/session/mixcoord] [exclusive=true] [semver=">=2.6.0-dev"] [component=grpc-resolver] [scheme=milvus-session] [id=176] [state="Version: 2116096, Addrs: "] [error="bad resolver state"]
[2025/12/05 23:00:13.646 +00:00] [WARN] [lazygrpc/conn.go:67] ["async dial failed, wait for retry..."] [error="context deadline exceeded"]
[2025/12/05 23:00:16.610 +00:00] [INFO] [resolver/resolver_with_discoverer.go:157] ["new grpc resolver registered"] [component=grpc-resolver] [scheme=channel-assignment] [id=177]
[2025/12/05 23:00:16.610 +00:00] [WARN] [resolver/watch_based_grpc_resolver.go:54] ["fail to update resolver state"] [component=grpc-resolver] [scheme=channel-assignment] [id=177] [state="Version: -1/-1, Addrs: "] [error="bad resolver state"]
[2025/12/05 23:00:16.810 +00:00] [WARN] [lazygrpc/conn.go:67] ["async dial failed, wait for retry..."] [error="context deadline exceeded"]
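My reading of the `semver=">=2.6.0-dev"` field is that the resolver only accepts mixcoord sessions at or above that version, so a 2.5.22 mixcoord would be filtered out and the address list stays empty. Here is a rough sketch of that comparison. The helper names (`parse_version`, `satisfies_minimum`) are mine and the parsing is a simplified stand-in, not the actual Milvus code:

```python
import re


def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH[-prerelease]' into a sortable tuple."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?", text)
    if match is None:
        raise ValueError(f"not a simple semver string: {text!r}")
    major, minor, patch, pre = match.groups()
    # A pre-release sorts before the release with the same numbers,
    # so releases get 1 and pre-releases get 0 in the last slot.
    # Simplification: two different pre-release tags compare as equal.
    return (int(major), int(minor), int(patch), 0 if pre else 1)


def satisfies_minimum(version, minimum):
    """Return True if `version` is at or above `minimum`."""
    return parse_version(version) >= parse_version(minimum)


# Minimum taken from semver=">=2.6.0-dev" in the logs above.
MIN_MIXCOORD = "2.6.0-dev"

print(satisfies_minimum("2.5.22", MIN_MIXCOORD))  # False: old mixcoord is rejected
print(satisfies_minimum("2.6.7", MIN_MIXCOORD))   # True: upgraded mixcoord is accepted
```

If that is roughly what happens, the new streaming node cannot find any usable mixcoord until the mixcoord itself is on 2.6.x.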
Is this an unsupported upgrade path, a bug, or am I doing something wrong?
Later edit: I looked briefly through the operator code and saw that the streaming nodes have no upgrade dependencies, which means they get deployed first. I would expect the mixcoord to be upgraded before the new streamingnode is deployed. Does that sound right to you?
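To make the ordering I have in mind concrete, here is a small sketch using a dependency map and a topological sort. The map and the `rollout_order` helper are hypothetical and only illustrate the idea. They are not taken from the operator's code:

```python
from graphlib import TopologicalSorter


def rollout_order(dependencies):
    """Return components in an order where each one follows its dependencies.

    `dependencies` maps a component to the set of components that must be
    upgraded before it.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical dependency map: streamingnode waits for mixcoord.
expected_dependencies = {
    "mixcoord": set(),
    "streamingnode": {"mixcoord"},
}

order = rollout_order(expected_dependencies)
print(order)  # ['mixcoord', 'streamingnode']
```

With an empty dependency set for the streaming node, nothing forces it to wait, which would match what I see during the upgrade.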
Thanks in advance!