Fix gowitness URL correlation failures by aconite33 · Pull Request #2985 · blacklanternsecurity/bbot

aconite33 · 2026-03-23T19:51:44Z

Summary

Fixes #2984 (Gowitness fails to correlate screenshots/network logs/technologies due to URL mismatch)
PR #2974 (Fix gowitness bug) partially addressed gowitness KeyError crashes but missed the screenshot section and didn't fix the root cause: a URL mismatch between the input and the gowitness DB
Gowitness may record URLs with a different scheme and/or port than the input (e.g. http://host:443/ instead of https://host/, or http://host:443/ instead of http://host/ after redirect). This causes event_dict lookups to fail.
Add _url_key() that produces a scheme-and-port-agnostic key (hostname + path) for correlation, used both when building event_dict and when looking up from the DB
Use .get() with graceful fallback for the screenshot section (missed by PR #2974)
Separate stdin_urls list to preserve original URLs sent to gowitness while using normalized keys in event_dict

Test plan

All 4 existing gowitness tests pass
Verified against live scan with targets that trigger scheme/port mismatches (CDN-fronted hosts redirecting HTTP→HTTPS)

- Add submodule auto-filter: disable submodules whose max severity/confidence is below configured thresholds (avoids running expensive submodules for nothing)
- Create baddns.yml base preset (CNAME, MX, TXT) and baddns-heavy.yml (all submodules)
- Rename spider-intense→spider-heavy, baddns-intense→baddns-heavy
- Fix baddns_zone default min_severity to INFORMATIONAL (NSEC/zonetransfer need it)
- Update kitchen-sink.yml, remove stale enable_references v1.x config
- Fix baddns_zone NSEC test (bad.dns→bad.com for tldextract compatibility)
- Fix baddns_direct test (updated signature matcher for baddns 2.0)
- Update all preset warning messages and docs references
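
The submodule auto-filter idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: the severity ordering, the function name `enabled_submodules`, and the per-submodule maximum severities are hypothetical and are not taken from baddns or bbot. Confidence filtering would presumably follow the same pattern.

```python
# Hypothetical severity scale, lowest to highest (an assumption).
SEVERITY_ORDER = ["INFORMATIONAL", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def enabled_submodules(max_severity_by_submodule, min_severity):
    """Keep only submodules that can emit a finding at or above min_severity."""
    threshold = SEVERITY_ORDER.index(min_severity)
    return [
        name
        for name, max_sev in max_severity_by_submodule.items()
        if SEVERITY_ORDER.index(max_sev) >= threshold
    ]


# Made-up maximum severities, for illustration only.
max_severities = {
    "CNAME": "HIGH",
    "NSEC": "INFORMATIONAL",
    "zonetransfer": "INFORMATIONAL",
}

print(enabled_submodules(max_severities, "MEDIUM"))
print(enabled_submodules(max_severities, "INFORMATIONAL"))
```

Under these assumed values, a MEDIUM threshold would disable NSEC and zonetransfer entirely, which is the kind of situation the INFORMATIONAL default for baddns_zone avoids.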

The radixtarget 4.x migration introduced a Rust-backed PyRadixTarget type that cannot be pickled. Since the web engine passes BBOTTarget (which contains RadixTarget) to a subprocess via SpawnProcess, every module that makes HTTP requests was failing with:

"cannot pickle 'builtins.PyRadixTarget' object"

This affected telerik, reflected_parameters, azure_tenant, emailformat, dnsbrute_mutations, and many others.

Fix: add __getstate__/__setstate__ to BaseTarget so the RadixTarget is excluded from pickling and reconstructed from event_seeds on the other side.

Additionally, fix gowitness handle_batch URL correlation:
- Normalize event_dict keys with clean_url() so they match the normalized DB URLs during lookup (fixes the root cause of KeyError crashes like the one partially addressed by PR #2974)
- Use .get() instead of bare dict access for the screenshot section, which PR #2974 missed (it only fixed network_logs and technologies)
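
For readers unfamiliar with the __getstate__/__setstate__ technique named in the pickle fix, here is a generic sketch. It does not reproduce BaseTarget: the classes `NativeIndex` and `Target` are hypothetical stand-ins, and a `threading.Lock` is used only to make the inner object unpicklable in the way a native extension type might be.

```python
import pickle
import threading


class NativeIndex:
    """Hypothetical stand-in for an unpicklable native object."""

    def __init__(self, seeds):
        self._lock = threading.Lock()  # locks cannot be pickled
        self.seeds = set(seeds)


class Target:
    """Hypothetical container that rebuilds its index after unpickling."""

    def __init__(self, seeds):
        self.event_seeds = list(seeds)
        self.index = NativeIndex(self.event_seeds)

    def __getstate__(self):
        # Copy the instance dict and drop the member that cannot be pickled.
        state = self.__dict__.copy()
        state.pop("index", None)
        return state

    def __setstate__(self, state):
        # Restore plain attributes, then rebuild the index from the seeds.
        self.__dict__.update(state)
        self.index = NativeIndex(self.event_seeds)


original = Target(["evilcorp.com", "10.0.0.0/8"])
clone = pickle.loads(pickle.dumps(original))
print(clone.index.seeds == original.index.seeds)
```

The design choice is to serialize only the data needed to reconstruct the expensive object, instead of the object itself.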

Gowitness may change both the scheme and port of a URL it records in its database (e.g. recording http://host:443/ for an input of http://host/ when the server redirects from port 80 to HTTPS on port 443). This caused KeyError crashes and later correlation warnings. Use hostname + path as the event_dict key, ignoring scheme and port entirely, so lookups succeed regardless of how gowitness transforms the URL. Also use .get() with graceful fallback for any remaining edge cases.

github-actions · 2026-03-23T20:44:58Z

📊 Performance Benchmark Report

Comparing 3.0 (baseline) vs mtr/gowitness_fix (current)

📈 Detailed Results (All Benchmarks)

📋 Complete results for all benchmarks - includes both significant and insignificant changes

🧪 Test Name	📏 Base	📏 Current	📈 Change	🎯 Status
Bloom Filter Dns Mutation Tracking Performance	`4.26ms`	`4.24ms`	-0.6% ⚪	✅
Bloom Filter Large Scale Dns Brute Force	`17.46ms`	`17.38ms`	-0.4% ⚪	✅
Large Closest Match Lookup	`352.11ms`	`347.91ms`	-1.2% ⚪	✅
Realistic Closest Match Workload	`190.17ms`	`185.54ms`	-2.4% ⚪	✅
Event Memory Medium Scan	`1769 B/event`	`1768 B/event`	-0.0% ⚪	✅
Event Memory Large Scan	`1757 B/event`	`1757 B/event`	+0.0% ⚪	✅
Event Validation Full Scan Startup Small Batch	`406.31ms`	`406.73ms`	+0.1% ⚪	✅
Event Validation Full Scan Startup Large Batch	`583.40ms`	`577.97ms`	-0.9% ⚪	✅
Make Event Autodetection Small	`30.72ms`	`30.50ms`	-0.7% ⚪	✅
Make Event Autodetection Large	`312.72ms`	`312.11ms`	-0.2% ⚪	✅
Make Event Explicit Types	`13.67ms`	`13.73ms`	+0.5% ⚪	✅
Excavate Single Thread Small	`3.953s`	`3.988s`	+0.9% ⚪	✅
Excavate Single Thread Large	`9.730s`	`9.457s`	-2.8% ⚪	✅
Excavate Parallel Tasks Small	`4.133s`	`4.131s`	-0.0% ⚪	✅
Excavate Parallel Tasks Large	`7.246s`	`7.196s`	-0.7% ⚪	✅
Is Ip Performance	`3.15ms`	`3.15ms`	-0.2% ⚪	✅
Make Ip Type Performance	`11.52ms`	`11.42ms`	-0.8% ⚪	✅
Mixed Ip Operations	`4.51ms`	`4.49ms`	-0.4% ⚪	✅
Typical Queue Shuffle	`62.45µs`	`60.40µs`	-3.3% ⚪	✅
Priority Queue Shuffle	`703.54µs`	`685.87µs`	-2.5% ⚪	✅

🎯 Performance Summary

✅ No significant performance changes detected (all changes <10%)

🐍 Python Version 3.11.15

codecov · 2026-03-23T21:36:30Z

Codecov Report

❌ Patch coverage is 89.28571% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 91%. Comparing base (de31418) to head (05e5393).
⚠️ Report is 26 commits behind head on 3.0.

Files with missing lines	Patch %	Lines
bbot/modules/gowitness.py	82%	6 Missing ⚠️

Additional details and impacted files

@@          Coverage Diff          @@
##             3.0   #2985   +/-   ##
=====================================
- Coverage     91%     91%   -0%     
=====================================
  Files        436     436           
  Lines      36918   36959   +41     
=====================================
+ Hits       33567   33587   +20     
- Misses      3351    3372   +21

Use tiered lookup: exact raw URL match first, then fall back to the loose hostname+path key. This correctly handles both multi-port URLs (e.g. :80 and :443 on the same host) and gowitness scheme/port transformations from redirects.
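
A rough sketch of the tiered lookup described above, assuming two dictionaries: one keyed by the raw input URL and one keyed by hostname + path. The function names `loose_key` and `resolve_parent`, the dictionary layout, and the sample hosts are assumptions for illustration and are not the signatures used in gowitness.py.

```python
from urllib.parse import urlsplit


def loose_key(url: str) -> str:
    """Hostname + path, ignoring scheme and port (hypothetical helper)."""
    parts = urlsplit(url)
    return f"{(parts.hostname or '').lower()}{parts.path or '/'}"


def resolve_parent(recorded_url, exact, loose):
    """Tier 1: exact raw URL match. Tier 2: loose hostname + path match."""
    parent = exact.get(recorded_url)
    if parent is None:
        parent = loose.get(loose_key(recorded_url))
    return parent


inputs = ["http://multi:80/", "https://multi:443/", "https://cdn/"]
exact = {u: u for u in inputs}
loose = {}
for u in inputs:
    # Both "multi" URLs share one loose key; the first one registered wins.
    loose.setdefault(loose_key(u), u)

# Multi-port host: the exact tier keeps the two ports apart.
print(resolve_parent("https://multi:443/", exact, loose))
# Redirect-transformed URL: exact tier misses, loose tier recovers the input.
print(resolve_parent("http://cdn:443/", exact, loose))
```

Checking the exact tier first means the loose key's collisions only matter for URLs that were transformed, which is the case the loose key exists to handle.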

Upgrade pinned gowitness version from 3.0.5 to 3.1.1. Replace the unit test for _resolve_parent with a real integration test that runs gowitness against two ports (HTTP :8888 and HTTPS :9999) and verifies both get correctly correlated WEBSCREENSHOT events with the right parent attribution.

liquidsec and others added 19 commits March 2, 2026 14:38

Merge branch '3.0' into badsecrets-baddns-major-version-compat

deeb2aa

Merge remote-tracking branch 'origin/3.0' into badsecrets-baddns-majo…

401d68c

…r-version-compat # Conflicts: # bbot/modules/baddns_direct.py # bbot/modules/badsecrets.py # docs/modules/lightfuzz.md # docs/scanning/presets_list.md

Merge branch '3.0' into badsecrets-baddns-major-version-compat

8d4d3e5

Merge branch '3.0' into badsecrets-baddns-major-version-compat

16918a6

ruff format baddns_direct.py

7470af1

Pass BBOT_IO_API_KEY to test runner for asndb API access

101b8fb

Fix asndb singleton not being reset after cleanup

9fb13f2

Reset the global asndb_client after cleanup so subsequent ASNDB() calls create a fresh client instead of returning a closed one.

Merge branch '3.0' into badsecrets-baddns-major-version-compat

15c4af6

Fix severity levels to use INFO, add AST test for baddns max severity…

6f09580

…/confidence sync, add baddns dev dep

Merge branch '3.0' into badsecrets-baddns-major-version-compat

3e2772b

Remove kreuzberg/pypdfium2 from pyproject.toml (belongs in module dep…

8b1e1ca

…s_pip only)

Merge pull request #2933 from blacklanternsecurity/badsecrets-baddns-…

ba17420

…major-version-compat baddns 2.0.0 / badsecrets 1.0.0 compatibility

The fix uses the same .get() + None check + continue pattern from PR #…

8c60053

…2974. This prevents the crash and logs a warning instead of aborting the entire batch.

Fix BBOTTarget pickle support for engine subprocess

58a4083

Drop local pickle fix in favor of PR #2983

60939bb

Remove our __getstate__/__setstate__ from BaseTarget; the upstream fix-target-pickle branch handles this more cleanly (explicit state, direct acl_mode reading, ScanBlacklist override, and a test).

Merge remote-tracking branch 'origin/fix-target-pickle' into mtr/gowi…

18b4553

…tness_fix

aconite33 force-pushed the mtr/gowitness_fix branch from bee7518 to 18b4553 on March 23, 2026 19:53

aconite33 and others added 3 commits March 23, 2026 13:54

ruff format gowitness.py

c7d20d7

Merge branch '3.0' into mtr/gowitness_fix

4b85f51

Merge branch '3.0' into mtr/gowitness_fix

731f707

aconite33 added 3 commits March 23, 2026 20:35

Add unit test for gowitness _resolve_parent tiered lookup

2caf180

liquidsec approved these changes Mar 24, 2026

TheTechromancer approved these changes Mar 24, 2026

TheTechromancer merged commit b7c0604 into 3.0 Mar 24, 2026
16 checks passed

liquidsec mentioned this pull request Mar 25, 2026

Gowitness parent_url KeyError #2613

Closed

TheTechromancer merged 25 commits into 3.0 from mtr/gowitness_fix

3 participants
