fix(scheduler): remove node from cache when node-selector label is removed#5145
fix(scheduler): remove node from cache when node-selector label is removed#5145brick-pixel wants to merge 3 commits intovolcano-sh:masterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request modifies the SyncNode function in the scheduler cache to ensure that nodes are removed from the cache if they no longer match the configured node selector after a label update. It also includes a new test case to verify this behavior. A critical issue was identified where the current implementation would cause SyncNode to return errors and trigger unnecessary retries for any node in the cluster that does not match the selector, even if it was never in the cache. A suggestion was provided to check for the node's existence in the cache before attempting removal.
| if deleteErr := sc.RemoveNode(nodeName); deleteErr != nil { | ||
| klog.Errorf("Failed to remove node <%s> from cache after label change: %s", nodeName, deleteErr.Error()) | ||
| return deleteErr | ||
| } | ||
| klog.V(3).Infof("Node <%s> no longer matches node selector, removed from cache.", nodeName) |
There was a problem hiding this comment.
The call to sc.RemoveNode(nodeName) returns an error if the node is not present in the cache (see lines 540-542). Since SyncNode is triggered for all node updates, and nodeCanAddCache returns false for any node that does not match the configured selector, this implementation will cause SyncNode to return an error for every non-matching node in the cluster. This results in continuous error logging and unnecessary retries in the nodeQueue. You should check if the node exists in the cache before attempting to remove it.
| if deleteErr := sc.RemoveNode(nodeName); deleteErr != nil { | |
| klog.Errorf("Failed to remove node <%s> from cache after label change: %s", nodeName, deleteErr.Error()) | |
| return deleteErr | |
| } | |
| klog.V(3).Infof("Node <%s> no longer matches node selector, removed from cache.", nodeName) | |
| sc.Mutex.Lock() | |
| _, ok := sc.Nodes[nodeName] | |
| sc.Mutex.Unlock() | |
| if ok { | |
| if deleteErr := sc.RemoveNode(nodeName); deleteErr != nil { | |
| klog.Errorf("Failed to remove node <%s> from cache after label change: %s", nodeName, deleteErr.Error()) | |
| return deleteErr | |
| } | |
| klog.V(3).Infof("Node <%s> no longer matches node selector, removed from cache.", nodeName) | |
| } |
There was a problem hiding this comment.
Pull request overview
This PR fixes a bug where removing a node-selector label from a node does not evict it from the scheduler cache, allowing the scheduler to continue scheduling pods to non-matching nodes. The fix adds a call to RemoveNode() when nodeCanAddCache() returns false in the SyncNode() method, and includes a test to verify this behavior.
Changes:
- Modified
SyncNode()to remove nodes from cache when they no longer match the configured node selector - Added test case
TestSchedulerCache_SyncNode_RemoveOnLabelChangeto verify node removal on label change
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/scheduler/cache/event_handlers.go | Added logic to call RemoveNode() when nodeCanAddCache() returns false |
| pkg/scheduler/cache/event_handlers_test.go | Added test to verify node removal when label is removed |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if deleteErr := sc.RemoveNode(nodeName); deleteErr != nil { | ||
| klog.Errorf("Failed to remove node <%s> from cache after label change: %s", nodeName, deleteErr.Error()) | ||
| return deleteErr | ||
| } |
There was a problem hiding this comment.
The RemoveNode() function returns an error if the node doesn't exist in the cache (line 541). This causes SyncNode() to fail when nodeCanAddCache() returns false for a node that was never added to cache. This breaks the existing test case "Node not added to cache" which expects wantErr: false. The RemoveNode() call should only propagate errors that aren't "node does not exist" errors, or RemoveNode() should be modified to not error when the node is not found.
| if deleteErr := sc.RemoveNode(nodeName); deleteErr != nil { | ||
| klog.Errorf("Failed to remove node <%s> from cache after label change: %s", nodeName, deleteErr.Error()) | ||
| return deleteErr | ||
| } |
There was a problem hiding this comment.
The PR description states that "RemoveNode() is safe to call even if the node is not in the cache (it will simply be a no-op)", but the implementation at line 540-541 returns an error if the node doesn't exist in cache. This is a discrepancy between the PR description and the actual code behavior.
| err = sc.SyncNode("n1") | ||
| assert.NoError(t, err) | ||
| assert.NotContains(t, sc.Nodes, "n1", "node should be removed from cache after label removal") | ||
| } |
There was a problem hiding this comment.
The new test only covers the case where a node is removed from cache after label removal. It should also test the case where a node never matches the selector in the first place, to ensure no regression in existing behavior. This edge case is currently broken by the fix.
| } | |
| } | |
| func TestSchedulerCache_SyncNode_NeverMatchesSelector(t *testing.T) { | |
| // Simulate: node never has the matching label, so it should never be added to cache | |
| n2 := util.BuildNode("n2", nil, map[string]string{}) | |
| sc := NewDefaultMockSchedulerCache("volcano") | |
| sc.nodeSelectorLabels = map[string]sets.Empty{ | |
| "train:true": {}, | |
| } | |
| // Add node without matching label to informer and sync | |
| sc.nodeInformer.Informer().GetIndexer().Add(n2) | |
| err := sc.SyncNode("n2") | |
| assert.NoError(t, err) | |
| assert.NotContains(t, sc.Nodes, "n2", "node without matching label should not be added to cache") | |
| } |
…moved When using --node-selector in volcano scheduler, removing the matching label from a node did not evict the node from the scheduler cache. SyncNode() checked nodeCanAddCache() but only returned nil without calling RemoveNode(), causing the scheduler to continue scheduling pods to nodes that no longer match the selector. Add RemoveNode() call when nodeCanAddCache() returns false, so the node is properly evicted from cache when its labels no longer match the configured node selector. Signed-off-by: brick-pixel <johnpixel128@gmail.com>
c0c29ed to
1906734
Compare
hajnalmt
left a comment
There was a problem hiding this comment.
Thanks for the PR @brick-pixel !
We will need a much more robust implementation for this with e2e verificaton.
Please check my comment here.
This reverts commit 59e1bb7.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
What type of PR is this?
/kind bug
/area scheduling
What this PR does / why we need it
When using
--node-selectorin the volcano scheduler, removing the matching label from a node does not evict the node from the scheduler cache. This causes the scheduler to continue scheduling pods to nodes that no longer match the configured selector.Root Cause
In
SyncNode(), whennodeCanAddCache()returnsfalse(node labels no longer match the selector), the function simply returnsnilwithout callingRemoveNode(). If the node was previously in the cache (from when it had the matching label), it remains cached and eligible for scheduling.Fix
Call
RemoveNode()before returning whennodeCanAddCache()returnsfalse. This ensures the node is evicted from cache when its labels no longer match the configured node selector.RemoveNode()is safe to call even if the node is not in the cache (it will simply be a no-op).Which issue(s) this PR fixes
Fixes #5122
Does this PR introduce a user-facing change?
How Has This Been Tested?
Added
TestSchedulerCache_SyncNode_RemoveOnLabelChangethat: