
Add mask support for training on foreground-only regions #220

Closed
balazsthomay wants to merge 1 commit into pierotofy:main from balazsthomay:feature/mask-support

Conversation

@balazsthomay

@balazsthomay balazsthomay commented Dec 30, 2025

Summary

Add mask support for training on foreground-only regions. This enables users to exclude background pixels during training, so only the region of interest contributes to the loss. Useful for object-centric reconstruction, cleaner outputs for compositing, and handling complex/cluttered backgrounds.

Changes

  • CLI: Add --mask-dir option to specify a directory containing binary mask images
  • Camera struct: Add maskPath, mask tensor, and maskPyramids cache fields
  • Mask I/O: Add imreadMask(), maskToTensor(), tensorToMask() functions in cv_utils
  • Mask loading: Load masks in loadImage() with automatic resizing to match image dimensions
  • Mask pyramid: Add getMask() with nearest-neighbor downscaling (like image pyramids)
  • L1 loss: Modify to compute weighted mean over masked pixels only
  • SSIM loss: Apply post-masking to SSIM map before averaging
  • Training loop: Pass masks through to mainLoss() for both training and validation
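The mask-pyramid entry above depends on nearest-neighbor downscaling so that scaled masks stay strictly binary. A minimal plain-Python sketch of that choice (the PR's actual code operates on libtorch tensors; the function and names here are illustrative only):

```python
def downscale_nearest(mask, h, w, new_h, new_w):
    # mask: flat row-major list of h*w binary values (0 or 1).
    # Nearest-neighbor sampling copies an existing value, so the result
    # stays binary; bilinear would blur mask edges into fractional weights.
    out = []
    for y in range(new_h):
        sy = min(int(y * h / new_h), h - 1)      # nearest source row
        for x in range(new_w):
            sx = min(int(x * w / new_w), w - 1)  # nearest source column
            out.append(mask[sy * w + sx])
    return out
```

Downscaling a binary mask this way at each pyramid level mirrors how the image pyramids are cached, just with a mode that cannot introduce gray values.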

Usage

./opensplat /path/to/project --mask-dir /path/to/masks -n 30000

Masks should be binary images (white=include, black=exclude) with filenames matching the source images (e.g., images/001.jpg → masks/001.png).
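Matching by filename stem (ignoring the extension) could look like the following sketch — a hypothetical helper for illustration, not the PR's actual C++ code:

```python
from pathlib import Path

def find_mask(image_path, mask_dir):
    # Pair a mask with an image by filename stem, ignoring extension:
    # images/001.jpg pairs with masks/001.png, masks/001.jpg, etc.
    stem = Path(image_path).stem
    for candidate in sorted(Path(mask_dir).iterdir()):
        if candidate.is_file() and candidate.stem == stem:
            return candidate
    return None  # no mask for this image
```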

Testing

Build Verification

  • CMake configuration succeeds: cmake -B build .
  • Build completes: cmake --build build -j8
  • Binary runs: ./build/opensplat --help

Functional Testing

  • Tested with sample dataset (407 images + masks)
  • Output PLY file is valid
  • Loss values differ with masks vs without masks

Platform Testing (if applicable)

  • CUDA build tested
  • Metal/MPS build tested
  • CPU-only build tested

Breaking Changes

None - this is an additive feature. Existing workflows without --mask-dir continue to work unchanged.

Add --mask-dir option to specify a directory containing binary mask images
(0=exclude, 1=include). Masks are matched to images by filename stem and
applied during loss computation (L1 and SSIM) so only foreground pixels
contribute to training.

Features:
- Binary mask loading with automatic resizing to match image dimensions
- Mask pyramid caching (like image pyramids) using nearest-neighbor interpolation
- Masked L1 loss: weighted mean over foreground pixels only
- Masked SSIM: post-masking of SSIM map before averaging
- Validation loss also respects masks
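The post-masking above (as opposed to masking the inputs before computing SSIM) amounts to a weighted average of the per-pixel SSIM map. A plain-Python sketch with illustrative names — the actual code works on libtorch tensors:

```python
def masked_ssim_mean(ssim_map, mask, eps=1e-8):
    # "Post-masking": the SSIM map is computed as usual, then weighted
    # by the binary mask so only foreground pixels enter the average.
    weighted = sum(s * m for s, m in zip(ssim_map, mask))
    return weighted / (sum(mask) + eps)  # eps guards an all-zero mask
```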
@pierotofy
Owner

Thanks for the PR! Did you use LLMs to generate this? It mostly looks very good, but can you explain how you arrived at:

    if (mask.numel() > 0){
        // Expand mask from [H,W,1] to [H,W,3] for broadcasting
        torch::Tensor expandedMask = mask.expand_as(diff);
        // Masked mean: sum of masked values / count of masked pixels
        torch::Tensor maskedDiff = diff * expandedMask;
        return maskedDiff.sum() / (expandedMask.sum() + 1e-8f);
    }

?

@balazsthomay
Author

Yes, I used Claude to help me, sorry I didn't mention that.

The mask is greyscale but the image is RGB, so I just copy the mask value across the 3 channels.

Then, I wanted to measure how wrong/different our rendered image is, but only for the white/foreground pixels.

Without a mask I'd add up all the errors and divide by the total number of pixels. But with a mask:

  1. Zero out the errors for pixels we don't care about (multiply by the mask)
  2. Add up what's left (only foreground errors)
  3. Divide by how many foreground pixels there are (not total pixels)

and we get the average error across all foreground pixels that mainLoss() can use.

The + 1e-8f is just a safety net so we don't divide by zero if someone passes in an all-black mask.
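Those three steps amount to a weighted mean. A plain-Python model of the arithmetic (the real code operates on libtorch tensors; the function and names here are illustrative):

```python
def masked_l1(rendered, target, mask, eps=1e-8):
    # rendered/target: lists of (r, g, b) pixels; mask: one 0/1 value per pixel.
    total = 0.0   # sum of |error| over foreground samples (steps 1 and 2)
    count = 0.0   # number of foreground samples (step 3)
    for pred, gt, m in zip(rendered, target, mask):
        for c in range(3):                  # mask value broadcast over RGB
            total += abs(pred[c] - gt[c]) * m
            count += m
    return total / (count + eps)            # eps guards an all-black mask
```

Dividing by the foreground count rather than the total pixel count keeps the loss scale comparable across images with different amounts of background.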

The two failed tests (Docker CUDA) couldn't complete, not sure why.

@pierotofy
Owner

The thing is, Claude has no idea how this works.

[image attachment]

You can confirm this with some careful testing.

@pierotofy pierotofy closed this Jan 2, 2026
@JorisGoosen JorisGoosen mentioned this pull request Jan 28, 2026
