Optimize qr_parse processing #239

Draft
vmdocua wants to merge 23 commits into master from enh-qr-parse-optimization
Conversation

@vmdocua
Collaborator

@vmdocua vmdocua commented Apr 8, 2026

Summary

The initial qr_parse implementation is slow: it processes QR codes from Full HD video at only 10–20 FPS, which is insufficient. In some cases, processing time exceeds the video duration.

This PR focuses on improving performance and related areas.

Tasks

  • Measure and investigate bottlenecks in QR processing to better understand the optimization roadmap.
  • Review ideas and benchmark results from this PR and from "GPU power QR detection/decoding" (#216). Improve processing speed by at least 10×, ideally up to 100×.
  • Abstract video frame processing to support different backends (e.g., OpenCV, FFmpeg), selectable via the CLI --video-decoder option.
  • Abstract QR scanning/decoding and provide alternative implementations alongside the existing pyzbar one, configurable via the CLI --qr-decoder option.
  • Add configurable parallel processing for QR decoding, along with appropriate CLI options.
  • Add optional processing using qdet and CUDA (if available).
  • Add tests for the new functionality and for the qr_parse tool, improving code coverage to 80%+.
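One way the decoder abstraction and the parallel-decoding tasks could fit together is sketched below. Everything in it is hypothetical: the `QrDecoder` alias, the `DECODERS` registry, the `register` decorator, the `decode_frames` helper, the `dummy` backend, and the `--workers` option are illustrative names, not part of reprostim. Only the `--qr-decoder` option name comes from the task list. A real backend would wrap `pyzbar.decode` or another library instead of the stand-in used here.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List

# Hypothetical interface: a decoder takes one frame and returns decoded payloads.
Frame = bytes  # stand-in for an image array so the sketch has no dependencies
QrDecoder = Callable[[Frame], List[bytes]]

DECODERS: Dict[str, QrDecoder] = {}


def register(name: str) -> Callable[[QrDecoder], QrDecoder]:
    """Register a decoder backend under a CLI-selectable name."""
    def wrap(fn: QrDecoder) -> QrDecoder:
        DECODERS[name] = fn
        return fn
    return wrap


@register("dummy")
def dummy_decoder(frame: Frame) -> List[bytes]:
    # Stand-in backend: treats a non-empty frame as its own payload.
    return [frame] if frame else []


def decode_frames(
    frames: Iterable[Frame], decoder: QrDecoder, workers: int = 1
) -> List[List[bytes]]:
    """Decode frames, optionally in parallel, preserving frame order."""
    if workers <= 1:
        return [decoder(f) for f in frames]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(pool.map(decoder, frames))


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="qr-parse-sketch")
    parser.add_argument("--qr-decoder", choices=sorted(DECODERS), default="dummy")
    parser.add_argument("--workers", type=int, default=1)  # hypothetical option
    return parser


args = build_parser().parse_args(["--qr-decoder", "dummy", "--workers", "2"])
results = decode_frames([b"a", b"", b"c"], DECODERS[args.qr_decoder], args.workers)
print(results)
```

Whether threads or processes are the better fit depends on whether the chosen backend releases the GIL while decoding; a process pool may be needed for backends that do not.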

Related

Closes #216

@vmdocua vmdocua self-assigned this Apr 8, 2026
@vmdocua vmdocua added labels Apr 8, 2026: enhancement (New feature or request), patch (Increment the patch version when merged), tests (Add or improve existing tests), performance (Improve performance of an existing feature)
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a specification document for the QR parse tool, detailing performance benchmarks and optimization opportunities. A critical security vulnerability was identified regarding the use of eval() on QR code data, which poses an arbitrary code execution risk and should be replaced with ast.literal_eval(). Additionally, while the document correctly identifies performance bottlenecks, the corresponding code changes to implement the recommended cv2.cvtColor optimization are currently missing from the repository.

code found, enabling downstream tools to correlate stimulus events with video timestamps.

The core loop reads frames from an MKV/video file using OpenCV (`cv2.VideoCapture`), converts each
frame to grayscale, and attempts QR code detection via `pyzbar`. Detected codes are deduplicated
security-critical critical

While reviewing the QR code detection logic, a critical security vulnerability was identified in src/reprostim/qr/qr_parse.py on line 529. The use of eval() on data from a QR code is extremely dangerous and can lead to arbitrary code execution.

Vulnerable Code:

# src/reprostim/qr/qr_parse.py:529
data = eval(eval(str(cod[0].data)).decode("utf-8"))

This complex expression is effectively equivalent to eval(cod[0].data.decode('utf-8')), which executes the string content of the QR code as Python code.
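The equivalence can be checked in isolation. The sketch below uses a made-up payload (the `frame` and `time` keys are assumptions, not the actual QR schema) and substitutes `ast.literal_eval` for the inner `eval` so that nothing is executed: calling `str()` on a bytes object produces its `repr`, and evaluating that `repr` simply yields the original bytes again.

```python
import ast

raw = b"{'frame': 12, 'time': 3.5}"  # hypothetical QR payload bytes

as_text = str(raw)                      # the repr of the bytes object, starting with b
round_trip = ast.literal_eval(as_text)  # evaluating the repr yields the bytes again

print(round_trip == raw)
print(round_trip.decode("utf-8"))
```

The inner step is therefore a no-op round trip, and the outer `eval` in the original line is what actually executes the decoded QR text.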

Recommendation:
Replace eval() with ast.literal_eval() for safe evaluation of Python literals. This will parse basic Python data structures without executing arbitrary code.

import ast
# ...

try:
    data = ast.literal_eval(cod[0].data.decode('utf-8'))
except (ValueError, SyntaxError):
    logger.error("Failed to parse QR code data.")
    continue

This vulnerability needs to be addressed with high priority.
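To illustrate the difference in behavior, here is a minimal sketch of a parsing helper built on the recommendation above. The function name `parse_qr_payload` and both sample payloads are hypothetical; the real payload schema in `qr_parse.py` may differ.

```python
import ast
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def parse_qr_payload(raw: bytes) -> Optional[Any]:
    """Parse QR payload bytes as a Python literal; return None if invalid."""
    try:
        return ast.literal_eval(raw.decode("utf-8"))
    except (ValueError, SyntaxError, UnicodeDecodeError):
        # literal_eval rejects anything that is not a plain literal,
        # including function calls and attribute access.
        logger.error("Failed to parse QR code data.")
        return None


ok = parse_qr_payload(b"{'frame': 12}")               # a plain dict literal
bad = parse_qr_payload(b"__import__('os').getcwd()")  # a call expression
print(ok, bad)
```

Note that this helper cannot distinguish a failed parse from a payload that is literally `None`; a caller that needs that distinction would have to re-raise or return a sentinel instead.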

- **`np.mean` for grayscale is the biggest non-decode bottleneck.** It runs at 34.2 fps vs.
329.7 fps for `cv2.cvtColor`, nearly a 10× difference for a comparable grayscale image.
- **`pyzbar.decode` dominates end-to-end cost** regardless of grayscale method, but switching
grayscale conversion still nearly doubles overall throughput (23.7 → 46.1 fps).
critical

The analysis correctly identifies np.mean as a bottleneck. However, the code changes to implement the cv2.cvtColor optimization are missing from this pull request. The file src/reprostim/qr/qr_parse.py still contains the slow np.mean implementation on line 516.

To achieve the performance gains described in this document, the code must be updated to use cv2.cvtColor for grayscale conversion.

@codecov

codecov bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 25.00000% with 78 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.22%. Comparing base (aa85977) to head (1b0a0ba).
⚠️ Report is 58 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/reprostim/qr/qr_parse.py | 28.88% | 64 Missing ⚠️ |
| src/reprostim/cli/cmd_qr_parse.py | 0.00% | 13 Missing ⚠️ |
| src/reprostim/cli/entrypoint.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           master     #239       +/-   ##
===========================================
+ Coverage    2.45%   40.22%   +37.77%     
===========================================
  Files          26       28        +2     
  Lines        3505     4579     +1074     
  Branches      506      541       +35     
===========================================
+ Hits           86     1842     +1756     
+ Misses       3419     2704      -715     
- Partials        0       33       +33     

vmdocua added 21 commits April 8, 2026 23:18
… spec with default qr-parse performance for 0.7.28 build, #239.
Labels

enhancement (New feature or request), patch (Increment the patch version when merged), performance (Improve performance of an existing feature), tests (Add or improve existing tests)

Development

Successfully merging this pull request may close these issues.

GPU power QR detection/decoding

1 participant