Stegoveritas integration with image transformation and GIF frame extraction#185

Closed
aradhyacp wants to merge 7 commits into Zeecka:main from aradhyacp:analyzer/stegoveritas

@aradhyacp
Collaborator

Summary

  • Added stegoveritas analyzer with image extraction and UI display support (copy extracted images into the submission output directory and emit /image/... URLs).
  • Preserve animated GIFs for -extract_frames while still converting RGBA/LA/P static images to RGB to avoid Pillow filter errors.
  • Keep Python 3.14 while restoring binwalk compatibility by removing the pip binwalk entrypoint and adding a minimal imp shim for legacy binwalk imports.
  • Register stegoveritas archives in worker downloads.
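
The first bullet (copying extracted images and emitting URLs) could look roughly like the sketch below. This is an illustration only, not the analyzer code from this PR: the function name `publish_images`, the `stegoveritas_` filename prefix, the set of image suffixes, and the `url_prefix` parameter are all assumptions, and the real `/image/...` URL layout is not spelled out here.

```python
import shutil
from pathlib import Path

# Assumed set of extensions to treat as displayable images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}


def publish_images(results_dir: Path, output_dir: Path, url_prefix: str) -> list[str]:
    """Copy images found under results_dir into output_dir and return their URLs.

    url_prefix is a hypothetical stand-in for the app's /image/... route.
    """
    urls = []
    for src in sorted(results_dir.rglob("*")):
        # Skip directories and non-image files (logs, carved data, ...).
        if not src.is_file() or src.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Prefix the name so copies do not clash with other analyzers' files.
        dest = output_dir / f"stegoveritas_{src.name}"
        shutil.copy2(src, dest)
        urls.append(f"{url_prefix.rstrip('/')}/{dest.name}")
    return urls
```

Files with the same name in different subfolders would overwrite each other in this sketch; a real implementation would need to disambiguate them.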

Why

Stegoveritas introduced two regressions:

  1. Python 3.14 no longer ships imp (the module was removed in Python 3.12), but binwalk’s Python module still imports it (triggered by stegoveritas).
  2. GIF conversion to RGB collapsed animation frames, breaking -extract_frames.

Implementation Details

  • Dockerfile
    • Remove /usr/local/bin/binwalk so apt binwalk is used instead of the pip wrapper.
    • Add a small imp compatibility shim at /usr/local/lib/python3.14/imp.py to satisfy binwalk’s legacy imports.
  • Stegoveritas analyzer
    • Preserve animated GIFs (skip RGB conversion when is_animated is true).
    • Convert static RGBA/LA/P images to RGB to avoid Pillow filter errors.
    • Collect stegoveritas output images from the results/ folder, copy into the submission root, and emit URLs for UI rendering.
    • Continue generating a downloadable archive for the full results folder.
  • Config
    • Add stegoveritas to WORKER_FILES so archives are downloadable.
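
As a rough sketch of the GIF/RGB rule described for the analyzer (not the actual code in this PR), the decision can be isolated in a small predicate. The names `needs_rgb_conversion` and `prepare_for_stegoveritas` are hypothetical; `is_animated`, `mode`, and `convert("RGB")` are Pillow attributes and methods, and Pillow is a third-party dependency imported lazily here.

```python
# Modes named in this PR as triggering Pillow filter errors.
CONVERT_MODES = {"RGBA", "LA", "P"}


def needs_rgb_conversion(img) -> bool:
    """Return True only for static images whose mode should become RGB."""
    # Animated images are left untouched so every frame stays available
    # to -extract_frames. Not every Pillow image has is_animated.
    if getattr(img, "is_animated", False):
        return False
    return img.mode in CONVERT_MODES


def prepare_for_stegoveritas(src_path, dst_path):
    """Write an RGB copy to dst_path when needed; return the path to analyze."""
    from PIL import Image  # Pillow, imported lazily

    with Image.open(src_path) as img:
        if needs_rgb_conversion(img):
            img.convert("RGB").save(dst_path)
            return dst_path
    return src_path
```

Keeping the predicate separate from the file handling makes the rule easy to unit-test without real image files.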

Challenges & How We Solved Them

  • Binwalk import failure on Python 3.14 (imp removal): Added a minimal imp.load_source shim so binwalk’s plugin loader works without downgrading Python.
  • Pip binwalk shadowing apt binwalk: Removed /usr/local/bin/binwalk after installing stegoveritas so the system binary is used consistently.
  • -extract_frames returning nothing on GIFs: Stopped converting animated GIFs to RGB so stegoveritas can access all frames.
  • UI not showing stegoveritas images: Copied extracted images into the submission output directory and emitted /image/ URLs.
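
For reference, a minimal `imp.load_source` replacement built on importlib might look like the following. This is a sketch of the general technique, assuming only `load_source` is needed; the shim actually added at /usr/local/lib/python3.14/imp.py may differ.

```python
import importlib.machinery
import importlib.util
import sys


def load_source(name, pathname):
    """Load a Python source file as module `name`, like the old imp.load_source."""
    # SourceFileLoader is passed explicitly so files without a .py suffix also load.
    loader = importlib.machinery.SourceFileLoader(name, pathname)
    spec = importlib.util.spec_from_file_location(name, pathname, loader=loader)
    module = importlib.util.module_from_spec(spec)
    # Register before executing, so the module is importable by name while it runs.
    sys.modules[name] = module
    loader.exec_module(module)
    return module
```

Saved as a module named `imp` on the interpreter's search path, this would satisfy `import imp; imp.load_source(...)` calls, but no other part of the old `imp` API.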

Testing

  • Local: docker compose -f compose.dev.yml up --build -d
  • Uploaded static PNG/JPEG and animated GIFs:
    • Verified stegoveritas image outputs render in UI.
    • Verified GIFs produce frame outputs.
    • Confirmed archives download successfully.

Screenshots

Original GIF

The original GIF used for testing:

[Image: stewie-stewie-griffin]


Extracted Frames

Frames extracted from the GIF:

[Image: Extracted frames]

JPG Used

The JPG image used for comparison/testing:

[Image: aperisolve issue]


Image Transformation

Result after image transformation:

[Image: Image transformation result]

Binwalk Verification

binwalk output confirming expected behavior:

[Image: Binwalk output]

@aradhyacp
Collaborator Author

fixes #159

aradhyacp linked an issue on Feb 10, 2026 that may be closed by this pull request
@aradhyacp
Collaborator Author

@Zeecka any updates? :)

@Zeecka
Owner

Zeecka commented Feb 15, 2026

Sorry, I missed that PR. I've got a few comments:

  • Can you explain why you're removing /usr/local/bin/binwalk and creating imp.py? Is the regression due to Python 3.14 or to stegoveritas? Where does the new binwalk come from?

  • I wanted to keep decomposer instead of using the -imageTransform option. If we keep this option, we need to remove decomposer and add second-level titles for each channel/layer, as decomposer does.

@aradhyacp
Collaborator Author

  1. Why I removed /usr/local/bin/binwalk and added imp.py - root cause and provenance
  • What happened
    • Installing stegoveritas pulled a pip-packaged binwalk as a dependency. pip places that CLI at /usr/local/bin/binwalk and its Python package lives in site‑packages.
    • That pip binwalk (the “new” binwalk in this branch) behaves differently from the OS/apt binwalk (/usr/bin/binwalk). The pip package imports legacy stdlib module imp and uses imp.load_source.
  • Why the regression appeared
    • The regression was triggered by adding stegoveritas (it brought in the pip binwalk). The pip binwalk imports imp, which no longer exists in Python 3.14, and that causes the ModuleNotFoundError. On main there’s no stegoveritas, so the system/apt binwalk is used and everything works even under 3.14.
    • So it’s a combination: binwalk’s Python package relies on removed APIs (binwalk bug/legacy code) and stegoveritas caused that package to be present and to shadow the working system binary.
  • Why I removed /usr/local/bin/binwalk
    • Removing the pip entrypoint ensures the runtime resolves to the apt/system binwalk at /usr/bin/binwalk, which is the known-good binary that worked on main and doesn’t trigger the imp import.
  • Why I added imp.py
    • As a safety/backstop for any remaining code paths in site-packages that import imp, I added a minimal compatibility shim (imp.py) implementing load_source via importlib. This prevents immediate crashes where the pip package still imports imp on 3.14 without forcing a Python downgrade.
  • Where the new binwalk came from
    • It came from pip (installed as a dependency of stegoveritas). The apt binwalk remains in /usr/bin; the pip wrapper is /usr/local/bin/binwalk.
  2. Decomposer vs stegoveritas -imageTransform and recommended approach
  • Difference
    • Decomposer produces structured outputs grouped by channel/layer with titles and clear grouping for each channel.
    • stegoveritas -imageTransform produces many transformed images but does not structure them the same way (no second-level headings per channel), so results can feel noisy or duplicate decomposer outputs.
  • Options
    • A) Remove -imageTransform from stegoveritas and keep decomposer
    • B) Keep -imageTransform and post-process stegoveritas output to replicate decomposer’s channel/layer grouping (preserves both tool outputs).
    • C) Keep both and dedupe/merge outputs in analyzer post-processing.
  • Recommendation
    Go with Option B and keep both tools, because stegoveritas is also used to extract frames from GIFs. This requires an extra step that groups the images and displays all of them under their group.

I recommend merging the PR now (to restore functionality and fixes), then follow up in a small PR to implement the grouping of images of stegoveritas.
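
A possible shape for the Option B grouping step is sketched below. It assumes that channel/bit-plane outputs carry a `_<Channel>_<bit>` token in their filename (for example `x.png_Red_0.png`); that naming pattern, the `group_by_channel` helper, and the "Other" bucket are assumptions for illustration and should be checked against real stegoveritas output.

```python
import re
from collections import defaultdict

# Assumed naming pattern: "..._Red_0.png", "..._alpha_7.png", etc.
CHANNEL_RE = re.compile(r"_(red|green|blue|alpha)_(\d+)\.[^.]+$", re.IGNORECASE)


def group_by_channel(filenames):
    """Map a second-level title (channel name or 'Other') to its image files."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        match = CHANNEL_RE.search(name)
        # Files that do not match the channel pattern share one bucket.
        title = match.group(1).capitalize() if match else "Other"
        groups[title].append(name)
    return dict(groups)
```

Each key of the returned dict could then be rendered as a second-level title in the UI, with the listed images displayed underneath it.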

@Zeecka
Owner

Zeecka commented Feb 16, 2026

Ok, go for Option B before merging please.

Also, can you please provide a time benchmark of stegoveritas vs. decomposer on large images to ensure we're not losing CPU power?

Thanks
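
One way such a benchmark could be run is a small wall-clock harness like the sketch below. The `benchmark` helper is hypothetical, and how each analyzer is invoked is left to the caller, since neither call site is shown in this thread.

```python
import statistics
import time


def benchmark(func, repeats=5):
    """Call func `repeats` times and report min and median wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return {"min": min(timings), "median": statistics.median(timings)}
```

For example, `benchmark(lambda: run_stegoveritas(path))` and `benchmark(lambda: run_decomposer(path))` on the same large image would give comparable numbers, where `run_stegoveritas` and `run_decomposer` stand in for whatever functions wrap the two analyzers. This measures elapsed time, not CPU time consumed by child processes.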

@Zeecka
Owner

Zeecka commented Feb 20, 2026

Needs to be rebased, since MR #186 has been merged.

@aradhyacp
Collaborator Author

Sure, I will look into it.

@aradhyacp aradhyacp closed this Feb 28, 2026

Development

Successfully merging this pull request may close these issues.

[Analyzer] "stegoVeritas" Analyzer

2 participants