Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a specification document for the QR parse tool, detailing performance benchmarks and optimization opportunities. A critical security vulnerability was identified regarding the use of eval() on QR code data, which poses an arbitrary code execution risk and should be replaced with ast.literal_eval(). Additionally, while the document correctly identifies performance bottlenecks, the corresponding code changes to implement the recommended cv2.cvtColor optimization are currently missing from the repository.
| code found, enabling downstream tools to correlate stimulus events with video timestamps. | ||
|
|
||
| The core loop reads frames from an MKV/video file using OpenCV (`cv2.VideoCapture`), converts each | ||
| frame to grayscale, and attempts QR code detection via `pyzbar`. Detected codes are deduplicated |
There was a problem hiding this comment.
While reviewing the QR code detection logic, a critical security vulnerability was identified in src/reprostim/qr/qr_parse.py on line 529. The use of eval() on data from a QR code is extremely dangerous and can lead to arbitrary code execution.
Vulnerable Code:
# src/reprostim/qr/qr_parse.py:529
data = eval(eval(str(cod[0].data)).decode("utf-8"))This complex expression is effectively equivalent to eval(cod[0].data.decode('utf-8')), which executes the string content of the QR code as Python code.
Recommendation:
Replace eval() with ast.literal_eval() for safe evaluation of Python literals. This will parse basic Python data structures without executing arbitrary code.
import ast
# ...
try:
data = ast.literal_eval(cod[0].data.decode('utf-8'))
except (ValueError, SyntaxError):
logger.error("Failed to parse QR code data.")
continueThis vulnerability needs to be addressed with high priority.
| - **`np.mean` for grayscale is the biggest non-decode bottleneck.** It runs at 34.2 fps vs. | ||
| 329.7 fps for `cv2.cvtColor` — nearly a 10× difference for the same result. | ||
| - **`pyzbar.decode` dominates end-to-end cost** regardless of grayscale method, but switching | ||
| grayscale conversion still doubles overall throughput (23.7 → 46.1 fps). |
There was a problem hiding this comment.
The analysis correctly identifies np.mean as a bottleneck. However, the code changes to implement the cv2.cvtColor optimization are missing from this pull request. The file src/reprostim/qr/qr_parse.py still contains the slow np.mean implementation on line 516.
To achieve the performance gains described in this document, the code must be updated to use cv2.cvtColor for grayscale conversion.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #239 +/- ##
===========================================
+ Coverage 2.45% 40.22% +37.77%
===========================================
Files 26 28 +2
Lines 3505 4579 +1074
Branches 506 541 +35
===========================================
+ Hits 86 1842 +1756
+ Misses 3419 2704 -715
- Partials 0 33 +33 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
Initial
qr_parseimplementation is quite slow. It processes QR codes from Full HD video at only 10–20 FPS, which is insufficient. In some cases, the processing time even exceeds the video duration.This PR focuses on improving performance and related areas.
Tasks
--video-decoderoption.pyzbarone, configurable via CLI--qr-decoderoption.qr_parsetool, improving code coverage to 80%+.Related
Closes #216