MB-66295: avoid redundant calculation of bm25 metrics #2180

Merged
abhinavdangeti merged 4 commits into master from bm25
Apr 16, 2025

@Thejas-bhat Thejas-bhat commented Apr 16, 2025

  • Cache a map from each field to its cardinality value so that we don't create the fieldDict every time.
  • Furthermore, if all the threads have a cache miss, take the write lock so that the fieldDict is created only once, which minimizes garbage.
  • Use a separate mutex so that the contention is localised to the metrics gathering and doesn't affect the other threads' TFR generation.
  • MB-66295: Introduce BM25Reader interface bleve_index_api#68
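
The caching scheme in the bullets above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `cardinalityCache`, `newCardinalityCache`, `get`, and the `compute` callback are hypothetical names, and `compute` stands in for whatever expensive step (such as building the fieldDict) produces the cardinality. The sketch shows a read-locked fast path, a re-check under the write lock so that concurrent misses compute the value only once, and a mutex dedicated to the cache.

```go
package main

import (
	"fmt"
	"sync"
)

// cardinalityCache is a hypothetical stand-in for a per-snapshot cache
// that maps a field name to its cardinality.
type cardinalityCache struct {
	m       sync.RWMutex                // dedicated lock: only metrics gathering contends on it
	cache   map[string]int              // field -> cardinality
	compute func(string) (int, error)   // assumed expensive, e.g. walking a field dictionary
}

func newCardinalityCache(compute func(string) (int, error)) *cardinalityCache {
	return &cardinalityCache{cache: make(map[string]int), compute: compute}
}

// get returns the cached cardinality for field, computing it at most once.
func (c *cardinalityCache) get(field string) (int, error) {
	// Fast path: concurrent readers share the read lock.
	c.m.RLock()
	v, ok := c.cache[field]
	c.m.RUnlock()
	if ok {
		return v, nil
	}

	// Slow path: take the write lock and re-check, because another
	// goroutine may have filled the entry while we were waiting.
	c.m.Lock()
	defer c.m.Unlock()
	if v, ok := c.cache[field]; ok {
		return v, nil
	}
	v, err := c.compute(field)
	if err != nil {
		return 0, err
	}
	c.cache[field] = v
	return v, nil
}

func main() {
	calls := 0 // only touched inside compute, which runs under the write lock
	c := newCardinalityCache(func(field string) (int, error) {
		calls++
		return len(field), nil // placeholder for a real cardinality computation
	})

	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = c.get("title")
		}()
	}
	wg.Wait()
	fmt.Println(calls) // prints 1: eight concurrent lookups, one computation
}
```

Keeping the lock private to the cache, rather than reusing a broader snapshot-level mutex, is what confines contention to metrics gathering in this sketch.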

Comment thread index/scorch/snapshot_index.go
Comment thread index/scorch/snapshot_index.go Outdated
Comment thread index/scorch/snapshot_index.go Outdated
@abhinavdangeti abhinavdangeti added this to the v2.5.1 milestone Apr 16, 2025

@abhinavdangeti abhinavdangeti left a comment

Looks good @Thejas-bhat, nice one.

@abhinavdangeti

Already reviewed by @CascadingRadium , so will move ahead.

@abhinavdangeti abhinavdangeti merged commit ff43c15 into master Apr 16, 2025
9 checks passed
@abhinavdangeti abhinavdangeti deleted the bm25 branch April 16, 2025 15:39

3 participants