Skip to content

Conversation

@lobsterkatie
Copy link
Member

@lobsterkatie lobsterkatie commented Dec 8, 2025

Currently, we only tag our grouping.get_or_create_grouphashes.check_secondary_hash_existence and grouping.get_or_create_grouphashes.get_or_create_grouphash metrics with the cache expiry time in cases where there's a cache hit. When there was only one possible expiry time, that worked fine, because anything that was a cache miss also can be assumed to have used that same expiry period. However, now that we're testing multiple expiry times at once, it's impossible to know which expiry time led to a given miss, which in turn throws off our hit-rate calculations.

To fix that problem, this PR adds the expiry_seconds tag to misses as well. Since by definition misses don't find anything in the cache, this also switches from using a cached expiry value to using the current expiry value. Finally, so that using the current value doesn't pollute results based on hits stored with an old value, it also introduces the use of cache versioning. That way, any time the list of potential cache times changes, no previously-stored values will be found.

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Dec 8, 2025
@codecov
Copy link

codecov bot commented Dec 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #104537      +/-   ##
===========================================
- Coverage   80.63%    80.52%   -0.12%     
===========================================
  Files        9335      9330       -5     
  Lines      402949    400659    -2290     
  Branches    25689     25689              
===========================================
- Hits       324921    322621    -2300     
- Misses      77562     77572      +10     
  Partials      466       466              

@lobsterkatie lobsterkatie force-pushed the kmclb-fix-grouphash-cache-metric-expiry-tagging branch 2 times, most recently from 19a6a8e to 99b8543 Compare December 8, 2025 23:51
@lobsterkatie lobsterkatie force-pushed the kmclb-fix-grouphash-cache-metric-expiry-tagging branch from 99b8543 to 1569e30 Compare December 9, 2025 01:11
@lobsterkatie lobsterkatie marked this pull request as ready for review December 9, 2025 04:18
@lobsterkatie lobsterkatie requested review from a team as code owners December 9, 2025 04:18
Copy link
Contributor

@cvxluo cvxluo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

not blocking: i'm not completely sold on option version being needed. i understand that it likely will make the data collection easier to stop using our old cached values when we change the possible expiry times, but it seems like a complicated solution — i'm not sure why we couldn't just have a single option "expiry_time" and experimentally tweak that up and down to find a good balance between hit rate and cache storage

@lobsterkatie lobsterkatie merged commit d2638a1 into master Dec 9, 2025
67 checks passed
@lobsterkatie lobsterkatie deleted the kmclb-fix-grouphash-cache-metric-expiry-tagging branch December 9, 2025 17:43
@lobsterkatie
Copy link
Member Author

You’re not entirely wrong about the complication. I could have just had an option for version and manually incremented it any time the trial values were changed, too, which would have been simpler. This was a (maybe weird) way of me being a little lazy, LOL. Since this is all temporary anyway, I ended up prioritizing ease of testing/comparison over maintainability, which I probably wouldn't have done otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants