Add mask support for training on foreground-only regions #220
balazsthomay wants to merge 1 commit into pierotofy:main from
Conversation
Add --mask-dir option to specify a directory containing binary mask images (0=exclude, 1=include). Masks are matched to images by filename stem and applied during loss computation (L1 and SSIM) so only foreground pixels contribute to training.
Features:
- Binary mask loading with automatic resizing to match image dimensions
- Mask pyramid caching (like image pyramids) using nearest-neighbor interpolation
- Masked L1 loss: weighted mean over foreground pixels only
- Masked SSIM: post-masking of SSIM map before averaging
- Validation loss also respects masks
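To illustrate the mask pyramid idea, here is a minimal sketch in plain C++ of nearest-neighbor downscaling with a per-factor cache. It is not the PR's code: `Mask`, `downscaleNearest`, and `MaskPyramid` are hypothetical names, and keying the cache by an integer downscale factor is an assumption. Nearest-neighbor sampling copies one source pixel per output pixel, so a 0/1 mask stays 0/1 at every level.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical container for a binary mask (not a type from the PR).
struct Mask {
    int width;
    int height;
    std::vector<std::uint8_t> data;  // row-major, 0 = exclude, 1 = include
};

// Nearest-neighbor downscale by an integer factor: each output pixel copies
// exactly one source pixel, so no intermediate (non-binary) values appear.
Mask downscaleNearest(const Mask& src, int factor) {
    Mask dst;
    dst.width = src.width / factor;
    dst.height = src.height / factor;
    dst.data.resize(static_cast<std::size_t>(dst.width) * dst.height);
    for (int y = 0; y < dst.height; ++y) {
        for (int x = 0; x < dst.width; ++x) {
            const int sx = x * factor;  // source column sampled for this output pixel
            const int sy = y * factor;  // source row sampled for this output pixel
            dst.data[static_cast<std::size_t>(y) * dst.width + x] =
                src.data[static_cast<std::size_t>(sy) * src.width + sx];
        }
    }
    return dst;
}

// Hypothetical cache: each downscaled level is computed once and reused.
class MaskPyramid {
public:
    explicit MaskPyramid(Mask full) : full_(std::move(full)) {}

    const Mask& get(int factor) {
        if (factor <= 1) return full_;  // full resolution needs no resampling
        auto it = cache_.find(factor);
        if (it == cache_.end()) {
            it = cache_.emplace(factor, downscaleNearest(full_, factor)).first;
        }
        return it->second;
    }

private:
    Mask full_;
    std::unordered_map<int, Mask> cache_;
};
```

Calling `get(2)` twice on the same `MaskPyramid` returns the same cached level, so the downscale cost is paid once per factor rather than once per training step.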
Thanks for the PR! Did you use LLMs to generate this? It mostly looks very good, but can you explain how you arrived at: ?
Yes, I used Claude to help me, sorry I didn't mention that. The mask is greyscale but the image is RGB, so I just copy the mask value over 3 channels. Then I wanted to measure how wrong/different our rendered image is, but only for the white/foreground pixels. Without a mask I'd add up all the errors and divide by the total number of pixels. But with a mask:
and we get the average error across all foreground pixels that mainLoss() can use. The + 1e-8f is just a safety net so we don't divide by zero if someone passes in an all-black mask. The two failed tests (Docker CUDA) couldn't complete, not sure why.
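The masked average described in this reply can be sketched in plain C++ as follows. This is an illustration under assumptions, not the PR's implementation: `maskedL1` is a hypothetical name, images are assumed to be interleaved RGB floats, and the mask holds one value per pixel (1 = foreground, 0 = background). Reusing the same mask value for all three channels is equivalent to copying the mask over 3 channels.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): mean absolute error over
// foreground pixels only.
// rendered, target: interleaved RGB floats, size = numPixels * 3
// mask: one float per pixel, 1 = foreground, 0 = background
float maskedL1(const std::vector<float>& rendered,
               const std::vector<float>& target,
               const std::vector<float>& mask) {
    const std::size_t channels = 3;
    float weightedError = 0.0f;  // sum of |rendered - target| on foreground values
    float weightSum = 0.0f;      // count of foreground values (pixels * channels)
    for (std::size_t p = 0; p < mask.size(); ++p) {
        for (std::size_t c = 0; c < channels; ++c) {
            const std::size_t i = p * channels + c;
            // The per-pixel mask value is applied to each of the 3 channels.
            weightedError += std::fabs(rendered[i] - target[i]) * mask[p];
            weightSum += mask[p];
        }
    }
    // The small epsilon keeps the division defined for an all-black mask,
    // in which case the numerator is 0 and the result is 0.
    return weightedError / (weightSum + 1e-8f);
}
```

Dividing by the mask sum rather than the total pixel count means a large error on a background pixel has no effect on the result, and the loss does not shrink just because the foreground is small.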

Summary
Add mask support for training on foreground-only regions. This enables users to exclude background pixels during training, so only the region of interest contributes to the loss. Useful for object-centric reconstruction, cleaner outputs for compositing, and handling complex/cluttered backgrounds.
Changes
- --mask-dir option to specify a directory containing binary mask images
- maskPath, mask tensor, and maskPyramids cache fields
- imreadMask(), maskToTensor(), tensorToMask() functions in cv_utils
- Mask loading in loadImage() with automatic resizing to match image dimensions
- getMask() with nearest-neighbor downscaling (like image pyramids)
- Masked loss in mainLoss() for both training and validation
Usage
Masks should be binary images (white=include, black=exclude) with filenames matching the source images (e.g., images/001.jpg → masks/001.png).
Testing
Build Verification
- cmake -B build .
- cmake --build build -j8
- ./build/opensplat --help
Functional Testing
Platform Testing (if applicable)
Breaking Changes
None - this is an additive feature. Existing workflows without --mask-dir continue to work unchanged.