Fix `class_weights` background voxel count to use actual array volume size by Copilot · Pull Request #66 · janelia-cellmap/cellmap-data

Copilot · 2026-03-13T18:22:44Z

class_weights was computing total_voxels = sum(fg_counts.values()) — the sum of all foreground voxels across all classes — rather than the actual spatial volume of each class's data array. This made bg = total_voxels - fg incorrect, corrupting the fg/bg imbalance ratio used for weighted sampling.

Changes

CellMapImage.total_voxels — new property; product of the s0 array's spatial shape, scaled to training resolution via the existing _scale_count helper
EmptyImage.total_voxels — returns 0 (no backing array)
CellMapDataset.class_counts — adds "totals_total" key alongside "totals", sourced from each target image's total_voxels
CellMapMultiDataset.class_counts — aggregates "totals_total" across datasets
CellMapMultiDataset.class_weights — computes bg = totals_total[cls] - fg per class instead of summing across all fg counts

# Before (wrong): bg was total fg across all classes minus this class's fg
total_voxels = sum(counts.values())  # sum of fg voxels for all classes
bg = total_voxels - fg

# After (correct): bg is the actual array volume minus this class's fg
bg = max(total_counts[cls] - fg, 0)  # total_counts[cls] = np.prod(array_shape) scaled to training res

📍 Connect Copilot coding agent with Jira, Azure Boards or Linear to delegate work to Copilot in one click without leaving your project management tool.

rhoadesScholar · 2026-03-13T18:41:52Z

@copilot Address the purpose of this PR

Copilot

Copilot wasn't able to review any files in this pull request.

codecov · 2026-03-13T18:56:40Z

Codecov Report

❌ Patch coverage is 81.25000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.22%. Comparing base (ce2a150) to head (ef33153).
⚠️ Report is 9 commits behind head on rewrite.

Files with missing lines	Patch %	Lines
src/cellmap_data/image.py	69.23%	4 Missing ⚠️
src/cellmap_data/dataset.py	80.00%	1 Missing ⚠️
src/cellmap_data/empty_image.py	66.66%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           rewrite      #66      +/-   ##
===========================================
+ Coverage    80.81%   81.22%   +0.41%     
===========================================
  Files           23       23              
  Lines         1538     1561      +23     
===========================================
+ Hits          1243     1268      +25     
+ Misses         295      293       -2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Co-authored-by: rhoadesScholar <[email protected]>

Copilot

Pull request overview

This PR updates how class voxel statistics are computed across CellMapImage → CellMapDataset → CellMapMultiDataset, adding explicit “total voxel” counts so class sampling weights can use true background volume rather than deriving totals from foreground counts.

Changes:

Add CellMapImage.total_voxels and propagate per-class total-volume counts via a new totals_total key in class_counts.
Update CellMapMultiDataset.class_weights to compute background as totals_total - totals per class.
Extend unit tests to validate the new keys and the corrected weight calculation.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`src/cellmap_data/image.py`	Adds `total_voxels` for computing per-image volume at training resolution.
`src/cellmap_data/empty_image.py`	Adds a `total_voxels` stub for API compatibility with `EmptyImage`.
`src/cellmap_data/dataset.py`	Extends `class_counts` to include `totals_total` per class.
`src/cellmap_data/multidataset.py`	Aggregates `totals_total` and fixes `class_weights` background calculation.
`tests/test_image.py`	Adds a test asserting `total_voxels` equals array size in the fixture.
`tests/test_multidataset.py`	Adds tests for `totals_total` and for corrected background/weight math.

src/cellmap_data/image.py

src/cellmap_data/dataset.py

Co-authored-by: Copilot Autofix powered by AI <[email protected]>

src/cellmap_data/dataset.py

Co-authored-by: rhoadesScholar <[email protected]>

Copilot · 2026-03-13T19:20:02Z

@copilot Address the purpose of this PR

The PR's purpose is to fix total_voxels to reflect the actual data volume size (spatial shape) for each class, rather than using the sum of all foreground voxel counts. CellMapImage.total_voxels is now derived from the cached bounding_box (world-space extent divided by training-resolution voxel size) so no extra zarr I/O is needed. CellMapDataset.class_counts exposes a "totals_total" key, and class_weights computes bg = totals_total[cls] - fg per class. Committed in 0d4bf5a.

Copilot

Pull request overview

This PR refines how class statistics are computed for sampling by distinguishing per-class foreground voxel counts from the total voxel volume of the underlying label arrays, ensuring background-vs-foreground ratios are based on the real volume size rather than inferred totals.

Changes:

Extend class_counts to return both per-class foreground totals (totals) and per-class total volume (totals_total) across CellMapDataset and CellMapMultiDataset.
Update CellMapMultiDataset.class_weights to compute background voxels using totals_total - totals.
Add CellMapImage.total_voxels (and EmptyImage.total_voxels) and introduce targeted regression tests for the new behavior.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
`src/cellmap_data/image.py`	Adds `total_voxels` derived from `bounding_box` and training scale.
`src/cellmap_data/empty_image.py`	Provides `total_voxels` for empty sources (returns 0).
`src/cellmap_data/dataset.py`	Expands `class_counts` to include `totals_total` and caches the result.
`src/cellmap_data/multidataset.py`	Aggregates `totals_total` across datasets and fixes `class_weights` background calculation.
`tests/test_image.py`	Adds a test validating `total_voxels` equals array size for a simple fixture.
`tests/test_multidataset.py`	Adds tests asserting `totals_total` and class weight math use the true volume size.

src/cellmap_data/image.py

Co-authored-by: Copilot Autofix powered by AI <[email protected]>

Initial plan

057f91c

Copilot AI assigned Copilot and rhoadesScholar Mar 13, 2026

Copilot started work on behalf of rhoadesScholar March 13, 2026 18:42 View session

Copilot AI requested a review from rhoadesScholar March 13, 2026 18:47

Copilot finished work on behalf of rhoadesScholar March 13, 2026 18:47

Copilot started work on behalf of rhoadesScholar March 13, 2026 18:49 View session

rhoadesScholar requested a review from Copilot March 13, 2026 18:54

Copilot AI reviewed Mar 13, 2026

View reviewed changes

Fix total_voxels to use actual data volume spatial shape

ab1bf11

Co-authored-by: rhoadesScholar <[email protected]>

Copilot AI changed the title ~~[WIP] [WIP] Address feedback on voxel count calculation in rewrite PR~~ Fix class_weights background voxel count to use actual array volume size Mar 13, 2026

Copilot finished work on behalf of rhoadesScholar March 13, 2026 18:59

rhoadesScholar requested a review from Copilot March 13, 2026 19:00

Copilot started reviewing on behalf of rhoadesScholar March 13, 2026 19:01 View session

Copilot AI reviewed Mar 13, 2026

View reviewed changes

src/cellmap_data/image.py Outdated Show resolved Hide resolved

src/cellmap_data/dataset.py Show resolved Hide resolved

Potential fix for pull request finding

b2db1b6

Co-authored-by: Copilot Autofix powered by AI <[email protected]>

rhoadesScholar requested changes Mar 13, 2026

View reviewed changes

src/cellmap_data/dataset.py Outdated Show resolved Hide resolved

Copilot started work on behalf of rhoadesScholar March 13, 2026 19:11 View session

rhoadesScholar added 2 commits March 13, 2026 15:11

Update src/cellmap_data/dataset.py

357f2e0

Merge branch 'rewrite' into copilot/sub-pr-65

71e1832

rhoadesScholar self-requested a review March 13, 2026 19:12

rhoadesScholar approved these changes Mar 13, 2026

View reviewed changes

black format

9fa8953

rhoadesScholar marked this pull request as ready for review March 13, 2026 19:13

Improve total_voxels to use cached bounding_box instead of reopening s0

0d4bf5a

Co-authored-by: rhoadesScholar <[email protected]>

Copilot AI requested a review from rhoadesScholar March 13, 2026 19:20

Copilot finished work on behalf of rhoadesScholar March 13, 2026 19:20

rhoadesScholar requested a review from Copilot March 13, 2026 19:29

Copilot started reviewing on behalf of rhoadesScholar March 13, 2026 19:30 View session

Copilot AI reviewed Mar 13, 2026

View reviewed changes

src/cellmap_data/image.py Outdated Show resolved Hide resolved

Potential fix for pull request finding

ef33153

Co-authored-by: Copilot Autofix powered by AI <[email protected]>

rhoadesScholar merged commit e3f395d into rewrite Mar 17, 2026
12 checks passed

rhoadesScholar deleted the copilot/sub-pr-65 branch March 17, 2026 18:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix `class_weights` background voxel count to use actual array volume size#66

Fix `class_weights` background voxel count to use actual array volume size#66
rhoadesScholar merged 8 commits intorewritefrom
copilot/sub-pr-65

Copilot AI commented Mar 13, 2026 •

edited

Loading

Uh oh!

rhoadesScholar commented Mar 13, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

codecov bot commented Mar 13, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI commented Mar 13, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Copilot AI commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Uh oh!

rhoadesScholar commented Mar 13, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI commented Mar 13, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Mar 13, 2026 •

edited

Loading

codecov bot commented Mar 13, 2026 •

edited

Loading