Enhance Bulk Reindexing with custom_import_scope for Faster Partial Indexing #1707

Open
Yasoob01 wants to merge 1 commit into ankane:master from Yasoob01:bulk-reindexing-optimization

Yasoob01 commented Mar 3, 2025

This PR adds custom_import_scope to Searchkick's bulk reindexing so that a job can load only the associations it needs. Previously, reindexing loaded all related records via search_import, even when only a subset was needed, which slowed reindexing and used memory unnecessarily.

With this enhancement, developers can explicitly define which associations to include for a partial reindex, making indexing up to 70% faster in cases where only partial data (e.g., client details) is required.

Key Changes
✅ Added custom_import_scope to limit loaded associations during reindexing.
✅ Ensures backward compatibility with search_import.
✅ Performance boost by avoiding unnecessary data fetches.

Example Usage

1. Full Reindexing (Legacy Approach)
This would load all related data, making reindexing slower.

Searchkick::BulkReindexJob.perform_later(
  class_name: "Vehicle",
  record_ids: [100, 101, 102],
  index_name: "vehicles_index",
  method_name: :make_data
)

Internally, this includes every association present in the import scope, for example: Vehicle.includes(:make, :model, :variant, inventory: [:user, :city, :tasks])

2. Optimized Partial Reindexing (New Approach)

Searchkick::BulkReindexJob.perform_later(
  class_name: "Vehicle",
  record_ids: [100, 101, 102],
  index_name: "vehicles_index",
  method_name: :make_data,
  custom_import_scope: [:make]
)

Internally, this includes only make, for example: Vehicle.includes(:make)
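
To make the intended behavior concrete, here is a minimal sketch of how the scope selection could work. It is not the code from this PR's diff: `FakeRelation` is a hypothetical stand-in for an ActiveRecord relation (so the snippet runs without Rails), and `import_relation` is an illustrative helper name, not a Searchkick method. The assumption is that a non-empty custom_import_scope replaces the search_import scope, and that omitting it falls back to the existing behavior.

```ruby
# Hypothetical stand-in for an ActiveRecord relation. It only records which
# associations would be eager-loaded; it does not touch a database.
class FakeRelation
  attr_reader :included

  def initialize(included = [])
    @included = included
  end

  # Returns a new relation with the given associations added,
  # loosely mimicking ActiveRecord's chainable `includes`.
  def includes(*associations)
    FakeRelation.new(@included + associations.flatten)
  end

  # Stands in for a model-level `search_import` scope.
  def search_import
    includes(:make, :model, :variant, { inventory: [:user, :city, :tasks] })
  end
end

# Illustrative helper (assumed name): choose what to eager-load for a reindex.
def import_relation(relation, custom_import_scope: nil)
  if custom_import_scope && !custom_import_scope.empty?
    # Partial reindex: load only the requested associations.
    relation.includes(*custom_import_scope)
  elsif relation.respond_to?(:search_import)
    # Backward-compatible path: use the full search_import scope.
    relation.search_import
  else
    relation
  end
end

full    = import_relation(FakeRelation.new)
partial = import_relation(FakeRelation.new, custom_import_scope: [:make])

puts full.included.inspect
puts partial.included.inspect
```

Treating an empty array the same as an omitted option keeps existing callers on the search_import path, which is one way to preserve the backward compatibility described above.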

Why This Matters

  1. Faster Reindexing – Avoids loading irrelevant relations.
  2. Lower Memory Usage – Fetches only what's needed.
  3. More Control – Allows selective indexing based on use case.

This update significantly improves performance while ensuring existing functionality remains intact. Let me know if any refinements are needed! 🚀
