Implement O(1) PredicateEvaluator for inter pod affinity by x13n · Pull Request #9523 · kubernetes/autoscaler

x13n · 2026-04-20T18:00:11Z

This change introduces a high-performance ClusterSnapshot implementation that replaces traditional O(PodsOnNode) selector matching with incremental indexing, Copy-on-Write (CoW) simulation, and phased evaluation.

Key architectural pillars:

- Incremental Indexing: Leverages the 'fort' pipeline library and StreamingSnapshotStore to update indices reactively as pods and nodes change.
- CoW Simulation: Uses PatchSet-backed BTreeMap structures and slice operations to efficiently share state across simulation forks with O(1) cost.
- Phased Evaluation: Splits computation into a serial 'PreparePod' phase and a parallel 'FastCheckAffinity' phase using bi-directional label indexing.
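
To make the indexing and phased-evaluation pillars concrete, here is a minimal Go sketch. It is not the code from this PR: the types `term`, `pod`, `slot`, `index` and the functions `preparePod`, `fits`, `checkAll` are hypothetical, label selectors are reduced to exact `key=value` matches, and namespaces are ignored. It only illustrates how two counters per topology domain (labels present, labels rejected) let a check cost depend on the incoming pod's terms rather than on the number of pods already placed.

```go
package main

import (
	"fmt"
	"sync"
)

// term is a simplified anti-affinity term: "no pod labelled Key=Value
// may share my topology domain". Real terms use full label selectors.
type term struct{ Key, Value string }

// pod is a hypothetical, stripped-down pod.
type pod struct {
	Labels map[string]string
	Anti   []term
}

// slot addresses one counter: a label pair within a topology domain.
type slot struct{ Domain, Key, Value string }

// index holds both directions of the relationship.
type index struct {
	present   map[slot]int // pods in the domain that carry the label
	forbidden map[slot]int // pods in the domain whose terms reject the label
}

func newIndex() *index {
	return &index{present: map[slot]int{}, forbidden: map[slot]int{}}
}

// add updates the counters incrementally when a pod lands in a domain.
func (ix *index) add(domain string, p pod) {
	for k, v := range p.Labels {
		ix.present[slot{domain, k, v}]++
	}
	for _, t := range p.Anti {
		ix.forbidden[slot{domain, t.Key, t.Value}]++
	}
}

// prepared is the domain-independent work done once per pod.
type prepared struct{ labels, anti []term }

// preparePod is the serial phase: flatten the pod into lookup keys.
func preparePod(p pod) prepared {
	pp := prepared{anti: p.Anti}
	for k, v := range p.Labels {
		pp.labels = append(pp.labels, term{k, v})
	}
	return pp
}

// fits performs only map reads, so concurrent calls are safe as long as
// nothing writes to the index at the same time.
func (ix *index) fits(domain string, pp prepared) bool {
	for _, t := range pp.anti {
		if ix.present[slot{domain, t.Key, t.Value}] > 0 {
			return false // the incoming pod rejects a pod already there
		}
	}
	for _, l := range pp.labels {
		if ix.forbidden[slot{domain, l.Key, l.Value}] > 0 {
			return false // a pod already there rejects the incoming pod
		}
	}
	return true
}

// checkAll is the parallel phase: one goroutine per candidate domain.
func checkAll(ix *index, domains []string, pp prepared) []bool {
	out := make([]bool, len(domains))
	var wg sync.WaitGroup
	for i, d := range domains {
		wg.Add(1)
		go func(i int, d string) {
			defer wg.Done()
			out[i] = ix.fits(d, pp)
		}(i, d)
	}
	wg.Wait()
	return out
}

func main() {
	ix := newIndex()
	ix.add("zone-a", pod{Labels: map[string]string{"app": "web"}})
	web := pod{Labels: map[string]string{"app": "web"}, Anti: []term{{"app", "web"}}}
	fmt.Println(checkAll(ix, []string{"zone-a", "zone-b"}, preparePod(web))) // [false true]
}
```

In this sketch each check touches one map entry per term or label of the incoming pod, so its cost does not grow with the number of pods in the domain. Whether the PR reaches its O(1) bound the same way should be confirmed against the actual diff.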

Other changes:

- Adds support for complex namespace logic and AffinityTerm mapping.
- Integrates natively with StreamingSnapshotStore via event propagation.
- Disables the legacy scheduler plugin when the fast path is enabled.
- Introduces a new BenchmarkRunOnceAffinitySurge benchmark with heavy pod anti-affinity use to measure the improvement.

Performance Gain (BenchmarkRunOnceAffinitySurge - 5,000 nodes, 50,000 pods):

- Before (Default): 164.4s/op
- After (Fast O(1)): 9.36s/op
- Speedup: 17.56x

What type of PR is this?

/kind feature

What this PR does / why we need it:

Pod anti-affinity is the most expensive part of the scheduler logic used by Cluster Autoscaler.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

A large part of this change is AI-generated and needs careful review.

Does this PR introduce a user-facing change?

A new, experimental --fast-predicates-enabled flag can be used to enable an alternative implementation of pod anti-affinity checks.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

k8s-ci-robot · 2026-04-20T18:00:13Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

k8s-ci-robot · 2026-04-20T18:00:18Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~cluster-autoscaler/OWNERS~~ [x13n]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2026-04-20T18:00:19Z

This issue is currently awaiting triage.

If SIG Autoscaling contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot requested a review from BigDarkClown April 20, 2026 18:00

k8s-ci-robot requested a review from elmiko April 20, 2026 18:00

k8s-ci-robot added the labels area/cluster-autoscaler, cncf-cla: yes (the PR's author has signed the CNCF CLA), and needs-triage (the issue or PR lacks a `triage/foo` label and requires one) on Apr 20, 2026

k8s-ci-robot added the labels approved (the PR has been approved by an approver from all required OWNERS files) and size/XXL (the PR changes 1000+ lines, ignoring generated files) on Apr 20, 2026

This was referenced Apr 20, 2026

Implement fast predicate index for cluster-autoscaler simulator #9461 (Closed)

[Scheduling] Add a Caching Mechanism for InterPodAffinity and PodTopologySpread Plugins kubernetes/kubernetes#137654 (Open)

x13n force-pushed the streaming-snapshot branch from c95f36a to a289b06 Compare April 21, 2026 09:33

Reduce memory allocations

94028a8

Reallocate presence/forbidden slices once per fork, not once per pod.
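
The idea behind this commit can be sketched as follows. This is an illustrative Go example under stated assumptions, not the code from commit 94028a8: the `snapshot` type, its `owned` and `copies` fields, and the dense integer label IDs are all hypothetical. It shows a fork sharing its parent's slices and copying them lazily on the first write, so that adding many pods to one fork triggers a single reallocation.

```go
package main

import "fmt"

// snapshot is a hypothetical stand-in for one simulation fork. Counters
// are indexed by a dense label ID that is assumed to be assigned elsewhere.
type snapshot struct {
	presence  []int32
	forbidden []int32
	owned     bool // false while the slices are shared with another fork
	copies    int  // how many times this fork copied its slices
}

func newSnapshot(labels int) *snapshot {
	return &snapshot{
		presence:  make([]int32, labels),
		forbidden: make([]int32, labels),
		owned:     true,
	}
}

// Fork is O(1): the child shares the parent's backing arrays. Both sides
// lose ownership, so whichever writes first makes its own copy.
func (s *snapshot) Fork() *snapshot {
	s.owned = false
	return &snapshot{presence: s.presence, forbidden: s.forbidden}
}

// own copies the shared slices the first time a fork is written to.
func (s *snapshot) own() {
	if s.owned {
		return
	}
	s.presence = append([]int32(nil), s.presence...)
	s.forbidden = append([]int32(nil), s.forbidden...)
	s.owned = true
	s.copies++
}

// AddPod bumps the counters for the pod's labels and rejected labels.
func (s *snapshot) AddPod(labelIDs, rejectedIDs []int) {
	s.own() // copies only on the first write after a Fork, not on every pod
	for _, id := range labelIDs {
		s.presence[id]++
	}
	for _, id := range rejectedIDs {
		s.forbidden[id]++
	}
}

func main() {
	base := newSnapshot(4)
	base.AddPod([]int{0, 1}, nil)
	fork := base.Fork()
	fork.AddPod([]int{1}, []int{3})
	fork.AddPod([]int{2}, nil)
	fmt.Println(base.presence, fork.presence, fork.copies) // [1 1 0 0] [1 2 1 0] 1
}
```

The second `AddPod` on the fork reuses the slices copied by the first one, which is the once-per-fork behaviour the commit message describes. The parent's counters stay untouched throughout.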

x13n wants to merge 2 commits into kubernetes:master from x13n:streaming-snapshot

2 participants
