Fix `class_weights` background voxel count to use actual array volume size #66
rhoadesScholar merged 8 commits into `rewrite`
@copilot Address the purpose of this PR
Codecov Report
❌ Patch coverage is
Additional details and impacted files
| Coverage Diff | rewrite | #66 | +/- |
|---|---|---|---|
| Coverage | 80.81% | 81.22% | +0.41% |
| Files | 23 | 23 | |
| Lines | 1538 | 1561 | +23 |
| Hits | 1243 | 1268 | +25 |
| Misses | 295 | 293 | -2 |
Co-authored-by: rhoadesScholar <[email protected]>
Pull request overview
This PR updates how class voxel statistics are computed across CellMapImage → CellMapDataset → CellMapMultiDataset, adding explicit “total voxel” counts so class sampling weights can use true background volume rather than deriving totals from foreground counts.
Changes:
- Add `CellMapImage.total_voxels` and propagate per-class total-volume counts via a new `totals_total` key in `class_counts`.
- Update `CellMapMultiDataset.class_weights` to compute background as `totals_total - totals` per class.
- Extend unit tests to validate the new keys and the corrected weight calculation.
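The corrected background calculation can be sketched with plain dictionaries standing in for the real `class_counts` output. The function name `background_counts`, the class names, and the numbers are hypothetical; only the `totals` and `totals_total` keys come from the PR.

```python
def background_counts(class_counts):
    """Return per-class background voxels as totals_total - totals."""
    totals = class_counts["totals"]              # foreground voxels per class
    totals_total = class_counts["totals_total"]  # full volume per class
    return {cls: totals_total[cls] - fg for cls, fg in totals.items()}


# Hypothetical counts for two classes annotated over 1000-voxel volumes.
counts = {
    "totals": {"mito": 100, "er": 50},
    "totals_total": {"mito": 1000, "er": 1000},
}
bg = background_counts(counts)
print(bg)
```

Each class gets its own background count because each class is compared against the volume of its own label array, not against a total shared across classes.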
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/cellmap_data/image.py` | Adds `total_voxels` for computing per-image volume at training resolution. |
| `src/cellmap_data/empty_image.py` | Adds a `total_voxels` stub for API compatibility with `EmptyImage`. |
| `src/cellmap_data/dataset.py` | Extends `class_counts` to include `totals_total` per class. |
| `src/cellmap_data/multidataset.py` | Aggregates `totals_total` and fixes `class_weights` background calculation. |
| `tests/test_image.py` | Adds a test asserting `total_voxels` equals array size in the fixture. |
| `tests/test_multidataset.py` | Adds tests for `totals_total` and for corrected background/weight math. |
Co-authored-by: Copilot Autofix powered by AI <[email protected]>
Co-authored-by: rhoadesScholar <[email protected]>
The PR's purpose is to fix |
Pull request overview
This PR refines how class statistics are computed for sampling by distinguishing per-class foreground voxel counts from the total voxel volume of the underlying label arrays, ensuring background-vs-foreground ratios are based on the real volume size rather than inferred totals.
Changes:
- Extend `class_counts` to return both per-class foreground totals (`totals`) and per-class total volume (`totals_total`) across `CellMapDataset` and `CellMapMultiDataset`.
- Update `CellMapMultiDataset.class_weights` to compute background voxels using `totals_total - totals`.
- Add `CellMapImage.total_voxels` (and `EmptyImage.total_voxels`) and introduce targeted regression tests for the new behavior.
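To illustrate how the two keys might be combined across datasets, here is a minimal sketch that sums `totals` and `totals_total` per class. It assumes each dataset exposes its counts as a dictionary of dictionaries; `aggregate_class_counts` and the sample data are hypothetical and are not the library's actual implementation.

```python
from collections import Counter


def aggregate_class_counts(per_dataset_counts):
    """Sum per-class foreground and total-volume counts over datasets."""
    agg = {"totals": Counter(), "totals_total": Counter()}
    for counts in per_dataset_counts:
        for key, running in agg.items():
            # Counter.update adds values for matching keys instead of replacing them.
            running.update(counts.get(key, {}))
    return {key: dict(running) for key, running in agg.items()}


# Hypothetical per-dataset counts; the second dataset also annotates "er".
ds_a = {"totals": {"mito": 100}, "totals_total": {"mito": 1000}}
ds_b = {"totals": {"mito": 20, "er": 5}, "totals_total": {"mito": 500, "er": 500}}
merged = aggregate_class_counts([ds_a, ds_b])
print(merged)
```

A class that is missing from one dataset contributes nothing from that dataset, so its aggregated volume reflects only the arrays where it was actually annotated.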
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/cellmap_data/image.py` | Adds `total_voxels` derived from `bounding_box` and training scale. |
| `src/cellmap_data/empty_image.py` | Provides `total_voxels` for empty sources (returns 0). |
| `src/cellmap_data/dataset.py` | Expands `class_counts` to include `totals_total` and caches the result. |
| `src/cellmap_data/multidataset.py` | Aggregates `totals_total` across datasets and fixes `class_weights` background calculation. |
| `tests/test_image.py` | Adds a test validating `total_voxels` equals array size for a simple fixture. |
| `tests/test_multidataset.py` | Adds tests asserting `totals_total` and class weight math use the true volume size. |
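As a rough picture of what a volume count "at training resolution" could mean, the sketch below counts voxels at the stored resolution and divides by the volume ratio between a training voxel and a stored voxel. This is an assumption made for illustration: the function signature, the voxel-size arguments, and the rounding are hypothetical, and the library's own helper may scale differently.

```python
import math


def total_voxels(shape, stored_voxel_size, training_voxel_size):
    """Voxel count of an array of `shape`, re-expressed at training resolution."""
    n_stored = math.prod(shape)
    # Number of stored voxels covered by one training voxel (assumed scaling rule).
    factor = math.prod(
        t / s for s, t in zip(stored_voxel_size, training_voxel_size)
    )
    return int(round(n_stored / factor))


# A 64^3 array stored at 4 nm and trained at 8 nm: each training voxel spans 2x2x2.
print(total_voxels((64, 64, 64), (4, 4, 4), (8, 8, 8)))
# Same resolution: the count is simply the array size.
print(total_voxels((64, 64, 64), (4, 4, 4), (4, 4, 4)))
```

When stored and training resolutions match, the result equals the array size, which is the situation the `tests/test_image.py` row describes for its fixture.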
Co-authored-by: Copilot Autofix powered by AI <[email protected]>
`class_weights` was computing `total_voxels = sum(fg_counts.values())`, the sum of all foreground voxels across all classes, rather than the actual spatial volume of each class's data array. This made `bg = total_voxels - fg` incorrect, corrupting the fg/bg imbalance ratio used for weighted sampling.
Changes
- `CellMapImage.total_voxels`: new property; product of the s0 array's spatial shape, scaled to training resolution via the existing `_scale_count` helper
- `EmptyImage.total_voxels`: returns `0` (no backing array)
- `CellMapDataset.class_counts`: adds a `"totals_total"` key alongside `"totals"`, sourced from each target image's `total_voxels`
- `CellMapMultiDataset.class_counts`: aggregates `"totals_total"` across datasets
- `CellMapMultiDataset.class_weights`: computes `bg = totals_total[cls] - fg` per class instead of summing across all fg counts
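A small numeric comparison shows how far the old formula can drift from the fixed one. The counts are hypothetical, and the `bg / fg` ratio printed at the end is only an illustration of the imbalance, since the exact weight formula is not shown in this PR description.

```python
fg = {"mito": 100, "er": 50}          # hypothetical foreground counts ("totals")
volume = {"mito": 1000, "er": 1000}   # hypothetical array volumes ("totals_total")

# Old behaviour: the "total" was the sum of foreground counts across classes.
old_total = sum(fg.values())
old_bg = {cls: old_total - n for cls, n in fg.items()}

# Fixed behaviour: background is measured against each class's own volume.
new_bg = {cls: volume[cls] - n for cls, n in fg.items()}

for cls, n in fg.items():
    print(cls, "old bg/fg:", old_bg[cls] / n, "new bg/fg:", new_bg[cls] / n)
```

In this toy case the old formula reports 50 background voxels for `mito`, while the fixed one reports 900, so any sampling weight derived from the fg/bg balance would change substantially.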