@ilan-gold ilan-gold commented Jan 19, 2026

  1. Larger chunks work better with dask
  2. Reading the categoricals into memory lets us concatenate them without going through dask, at the cost of a memory penalty (negligible in my testing)

From what I see, this should reduce shuffling time by roughly 40%
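
A minimal sketch of the second point, assuming the categorical columns are small enough to hold in memory as pandas `Categorical` objects. This is not the code from `src/annbatch/io.py`; the `parts` list and its cell-type labels are hypothetical. It only illustrates that in-memory categoricals can be concatenated eagerly with `pandas.api.types.union_categoricals`, without building a dask task graph.

```python
import pandas as pd
from pandas.api.types import union_categoricals

# Hypothetical per-dataset categorical columns, already read into memory.
parts = [
    pd.Categorical(["B cell", "T cell", "B cell"]),
    pd.Categorical(["T cell", "NK cell"]),
]

# union_categoricals merges the category sets of all inputs and re-encodes
# the integer codes against the merged categories in one eager pass.
combined = union_categoricals(parts)

# Only the integer codes scale with the number of rows; each distinct
# category label is stored once.
print(len(combined), "values,", len(combined.categories), "categories")
```

The trade-off is the one named in the description: the codes for every row sit in memory at once, which is usually small compared with the data matrix itself.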

codecov bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 94.44444% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 92.45%. Comparing base (627eb08) to head (273e655).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/annbatch/io.py | 94.44% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #117   +/-   ##
=======================================
  Coverage   92.44%   92.45%           
=======================================
  Files           6        6           
  Lines         622      636   +14     
=======================================
+ Hits          575      588   +13     
- Misses         47       48    +1     
| Files with missing lines | Coverage Δ |
|---|---|
| src/annbatch/io.py | 94.71% <94.44%> (-0.12%) ⬇️ |

@ilan-gold ilan-gold force-pushed the ig/load_obs_chunking branch from 4b7d1ee to 5d3bd03 Compare January 21, 2026 14:18
@ilan-gold ilan-gold marked this pull request as ready for review January 21, 2026 15:18
@ilan-gold ilan-gold requested a review from felix0097 January 21, 2026 15:23

@felix0097 felix0097 left a comment

LGTM!

@ilan-gold ilan-gold merged commit 76ba1a1 into main Jan 21, 2026
9 checks passed
@ilan-gold ilan-gold deleted the ig/load_obs_chunking branch January 21, 2026 16:07