
Quantize NafNet deblurring: ORT pipeline + validation script + robust handling#299

Open
Hariom-Nagar211 wants to merge 2 commits into opencv:main from Hariom-Nagar211:feat/nafnet-quantization-validation

Conversation

@Hariom-Nagar211

Summary

Adds ONNX Runtime static quantization support for deblurring_nafnet, and a validation script to compare FP32 vs INT8 outputs with PSNR/SSIM and timing.
Improves quantization robustness and memory handling across the tools.
Changes

tools/quantize/quantize-ort.py
- Adds a `deblurring_nafnet` entry to the central `models` dict.
- Uses `QuantFormat.QOperator` + `op_types_to_quantize=['Conv', 'MatMul']` for better CPU EP performance, with an automatic fallback to QDQ Conv-only when needed.
- Optional pre-processing (skipped if `onnxruntime-extensions` is missing).
- Lazy `DataReader` creation to avoid importing all datasets at module import.
- MinMax calibration; inputs resized to 512x512; `max_samples=1` to reduce memory.
- Auto-excludes Conv nodes without bias initializers to avoid `bias=None` errors.

tools/quantize/transform.py
- Makes `HandAlign` lazy-load its palm detector so unrelated quantization runs don't load the Mediapipe ONNX at import time.

models/deblurring_nafnet/validate_quantization.py
- New: runs the FP32 and INT8 models via ORT, reports timing, PSNR, and SSIM, and shows side-by-side results.
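The "QOperator first, QDQ Conv-only fallback" strategy described above can be sketched as follows. This is an illustrative outline, not the PR's actual code; the `quantize_with_fallback` name and the `quantize_fn` hook are invented here so the control flow can be exercised without a real model.

```python
def quantize_with_fallback(model_fp32, model_int8, data_reader, quantize_fn=None):
    """Try QOperator Conv/MatMul; if ORT rejects the graph, retry as QDQ Conv-only."""
    if quantize_fn is None:
        # Default path: delegate to onnxruntime's static quantization API.
        from onnxruntime.quantization import QuantFormat, quantize_static

        def quantize_fn(fmt, op_types):
            quantize_static(
                model_fp32, model_int8, data_reader,
                quant_format=QuantFormat.QOperator if fmt == 'qoperator'
                             else QuantFormat.QDQ,
                op_types_to_quantize=op_types,
            )

    attempts = [('qoperator', ['Conv', 'MatMul']),  # fast path on the CPU EP
                ('qdq', ['Conv'])]                  # robust fallback
    last_err = None
    for fmt, op_types in attempts:
        try:
            quantize_fn(fmt, op_types)
            return fmt  # report which format succeeded
        except Exception as e:
            last_err = e
    raise last_err
```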
Usage

Quantize NafNet:

```
cd tools/quantize
python quantize-ort.py deblurring_nafnet
```

Output: `models/deblurring_nafnet/deblurring_nafnet_2025may_int8.onnx`

Validate:

```
cd models/deblurring_nafnet
python validate_quantization.py --input example_outputs/licenseplate_motion.jpg --model_fp32 deblurring_nafnet_2025may.onnx --model_int8 deblurring_nafnet_2025may_int8.onnx
```
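Conceptually, the validation step times one ORT session per model and compares outputs. A minimal sketch of that idea, assuming NCHW float32 inputs (function names here are illustrative, PSNR only; the actual script also reports SSIM and shows side-by-side images):

```python
import time
import numpy as np

def psnr(a, b, data_range=255.0):
    """Peak signal-to-noise ratio between two images, in dB."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(data_range ** 2 / mse)

def run_timed(model_path, blob):
    """Run a single inference via ONNX Runtime and return (output, seconds)."""
    import onnxruntime as ort  # imported lazily so psnr() works without ORT
    sess = ort.InferenceSession(model_path, providers=['CPUExecutionProvider'])
    input_name = sess.get_inputs()[0].name
    t0 = time.perf_counter()
    out = sess.run(None, {input_name: blob})[0]
    return out, time.perf_counter() - t0
```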
Notes on performance

- On CPUExecutionProvider, INT8 speed-ups are not guaranteed unless the provider fuses/accelerates int8 kernels.
- For acceleration, consider provider-specific EPs such as OpenVINOExecutionProvider (CPU) or DmlExecutionProvider (GPU), if available.
- The script quantizes to QOperator Conv/MatMul by default and falls back to QDQ Conv-only for robustness.
Known limitations

- Quantization depends on ORT support for the model's ops and shapes; the script includes automatic mitigations (input resize, sample limit, node excludes).
- Generated ONNX files and output images are not committed.

Screenshot:
Screenshot 2025-09-22 213940


@jay7-tech jay7-tech left a comment

Went through the changes. A few things worth resolving before merge:

1. HandAlign lazy-load is incomplete

The PR description says `HandAlign` is lazy-loaded to avoid importing the palm detector on unrelated runs. But `HandAlign.__init__` still initializes the palm detector at construction time, not inside `__call__`. The real problem is that the `models = dict(...)` block sits at module level, so every `Quantize` object (including `mp_handpose` with its `HandAlign`) gets instantiated on import, regardless of what the user passes on the command line. The fix is to defer `models` construction inside `if __name__ == '__main__':`, or to make `DataReader` construction lazy until `.run()` is called.
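One way to implement this fix is to register zero-argument factories instead of instances, so only the pipeline selected on the command line is ever constructed. A minimal sketch with a stand-in `Quantize` class (not the repo's real one; `BUILT` exists only to make construction observable):

```python
BUILT = []  # records which pipelines were actually constructed

class Quantize:
    """Stand-in with a notionally expensive constructor (model loading etc.)."""
    def __init__(self, name):
        self.name = name
        BUILT.append(name)  # imagine heavy ONNX/Mediapipe loading here

# Factories, not instances: nothing heavy runs at module import time.
MODEL_FACTORIES = {
    'deblurring_nafnet': lambda: Quantize('deblurring_nafnet'),
    'mp_handpose': lambda: Quantize('mp_handpose'),  # HandAlign untouched until chosen
}

def get_quantizer(key):
    # Only the selected factory runs; every other entry stays unconstructed.
    return MODEL_FACTORIES[key]()
```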

2. calibration_image_dir='path/to/dataset' in the existing mp_palmdet and mp_handpose entries (in main)

This is unrelated to this PR but is a live crash in main right now: `python quantize-ort.py mp_palmdet` immediately throws `FileNotFoundError: 'path/to/dataset'`. It should be fixed alongside this PR, since this PR touches the same file.

3. max_samples=1 for NafNet calibration

MinMax calibration on a single sample produces scales tuned to one image's dynamic range, which can cause severe accuracy loss on diverse inputs. At least 64 samples would give representative ranges; failing that, the tradeoff should at least be documented.
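The sample cap the reviewer is discussing can be exposed as a constructor argument on the calibration reader. A sketch following the shape of onnxruntime's CalibrationDataReader protocol (`get_next` returns a feed dict, then `None` when exhausted); data loading is stubbed out and the class name is invented here:

```python
class CappedDataReader:
    """Calibration reader that yields at most max_samples preprocessed blobs."""

    def __init__(self, samples, input_name='input', max_samples=64):
        # samples: preprocessed input blobs (e.g. 1x3x512x512 float32 arrays)
        self._it = iter(samples[:max_samples])
        self._input_name = input_name

    def get_next(self):
        batch = next(self._it, None)
        return None if batch is None else {self._input_name: batch}
```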

The validate_quantization.py PSNR/SSIM comparison script is a great addition.
