Skip to content

[Cleanup] Combine Batched and Regular KMeans Impl#2015

Open
tarang-jain wants to merge 16 commits intorapidsai:mainfrom
tarang-jain:combine-batch
Open

[Cleanup] Combine Batched and Regular KMeans Impl#2015
tarang-jain wants to merge 16 commits intorapidsai:mainfrom
tarang-jain:combine-batch

Conversation

@tarang-jain
Copy link
Copy Markdown
Contributor

@tarang-jain tarang-jain commented Apr 10, 2026

Combine batched and regular k-means implementations

  • Unified the batched (host-data) and regular (device-data) k-means fit into a single kmeans_fit template that works with both host and device mdspans via batch_load_iterator
  • Unified the device and host initialization paths in init_centroids — both now use raft::matrix::sample_rows which handles host/device transparently
  • Removed the inertia_check parameter — inertia-based convergence checking now always runs. Zero clustering cost (perfect fit) logs a warning instead of asserting. This is needed because spectral clustering can cause all points to converge on the cluster centroids itself.
  • Added init_size parameter to control how many samples are drawn for KMeansPlusPlus initialization. Defaults to n_samples for device data, min(3 * n_clusters, n_samples) for host data
  • Replaced per-iteration centroid raft::copy with std::swap of buffer pointers
  • For streaming fit, precompute data norms once and cache them: host norms cached to a host buffer on the first iteration and copied back for subsequent iterations. process_batch no longer computes norms internally
  • Replaced raw cudaPointerGetAttributes call with raft::memory_type_from_pointer
  • Updated compute_weight_scale to use raft handle and mdspan-based raft::copy
  • Precompute centroid norms once per Lloyd iteration and pass to minClusterAndDistanceCompute via a new optional precomputed_centroid_norms parameter, avoiding redundant recomputation across batches

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot bot commented Apr 10, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@tarang-jain tarang-jain self-assigned this Apr 10, 2026
@tarang-jain tarang-jain added improvement Improves an existing functionality non-breaking Introduces a non-breaking change cpp labels Apr 10, 2026
@tarang-jain tarang-jain marked this pull request as ready for review April 14, 2026 01:10
@tarang-jain tarang-jain requested review from a team as code owners April 14, 2026 01:10
int batch_centroids;

/** Check inertia during iterations for early convergence. */
/** Deprecated, ignored. Kept for ABI compatibility. */
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably shouldn't be modifying the wording here. And we probably want to use a different struct that breaks ABI, suffixed by the version (26.06).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cpp improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

1 participant