Fix gowitness URL correlation failures #2985
Merged
TheTechromancer merged 25 commits into 3.0 on Mar 24, 2026
- Add submodule auto-filter: disable submodules whose max severity/confidence is below configured thresholds (avoids running expensive submodules for nothing)
- Create baddns.yml base preset (CNAME, MX, TXT) and baddns-heavy.yml (all submodules)
- Rename spider-intense→spider-heavy, baddns-intense→baddns-heavy
- Fix baddns_zone default min_severity to INFORMATIONAL (NSEC/zonetransfer need it)
- Update kitchen-sink.yml, remove stale enable_references v1.x config
- Fix baddns_zone NSEC test (bad.dns→bad.com for tldextract compatibility)
- Fix baddns_direct test (updated signature matcher for baddns 2.0)
- Update all preset warning messages and docs references
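The auto-filter idea in the first bullet can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Severity` enum, the `SUBMODULE_MAX_SEVERITY` table, its values, and `filter_submodules()` are all hypothetical names and placeholder data chosen for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFORMATIONAL = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical table: the highest severity each submodule could ever report.
# The values are placeholders for illustration, not baddns's real ratings.
SUBMODULE_MAX_SEVERITY = {
    "CNAME": Severity.HIGH,
    "MX": Severity.MEDIUM,
    "NSEC": Severity.INFORMATIONAL,
}


def filter_submodules(enabled, min_severity):
    """Drop submodules that can never reach the configured threshold."""
    return [
        name
        for name in enabled
        if SUBMODULE_MAX_SEVERITY[name] >= min_severity
    ]


kept = filter_submodules(["CNAME", "MX", "NSEC"], Severity.MEDIUM)
```

With a threshold of MEDIUM, the NSEC entry is skipped before any work is done, which is also why a module that depends on informational findings needs its default threshold set to INFORMATIONAL.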
…r-version-compat # Conflicts: # bbot/modules/baddns_direct.py # bbot/modules/badsecrets.py # docs/modules/lightfuzz.md # docs/scanning/presets_list.md
Reset the global asndb_client after cleanup so subsequent ASNDB() calls create a fresh client instead of returning a closed one.
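A minimal sketch of the reset-after-cleanup pattern this commit describes. `FakeClient`, `get_client()`, and `cleanup()` are hypothetical stand-ins; the real asndb client and its API are not shown in this PR text.

```python
import asyncio


class FakeClient:
    """Stand-in for a network client that becomes unusable once closed."""

    def __init__(self):
        self.closed = False

    async def aclose(self):
        self.closed = True


_client = None  # module-level cached client


def get_client():
    """Return the cached client, creating one on first use."""
    global _client
    if _client is None:
        _client = FakeClient()
    return _client


async def cleanup():
    """Close the cached client and clear the cache."""
    global _client
    if _client is not None:
        await _client.aclose()
        # Without this reset, the next get_client() call would hand back
        # the closed instance.
        _client = None


first = get_client()
asyncio.run(cleanup())
second = get_client()
```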
…/confidence sync, add baddns dev dep
…major-version-compat baddns 2.0.0 / badsecrets 1.0.0 compatibility
…2974. This prevents the crash and logs a warning instead of aborting the entire batch.
The radixtarget 4.x migration introduced a Rust-backed PyRadixTarget type that cannot be pickled. Since the web engine passes BBOTTarget (which contains RadixTarget) to a subprocess via SpawnProcess, every module that makes HTTP requests was failing with:

"cannot pickle 'builtins.PyRadixTarget' object"

This affected telerik, reflected_parameters, azure_tenant, emailformat, dnsbrute_mutations, and many others.

Fix: add __getstate__/__setstate__ to BaseTarget so the RadixTarget is excluded from pickling and reconstructed from event_seeds on the other side.

Additionally, fix gowitness handle_batch URL correlation:
- Normalize event_dict keys with clean_url() so they match the normalized DB URLs during lookup (fixes the root cause of KeyError crashes like the one partially addressed by PR #2974)
- Use .get() instead of bare dict access for the screenshot section, which PR #2974 missed (it only fixed network_logs and technologies)
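The pickling approach described above can be sketched roughly like this. It is an illustration under stated assumptions, not BBOT's BaseTarget: the `Target` class, its `seeds` attribute, and `_build_index()` are hypothetical, and a `threading.Lock` stands in for the unpicklable Rust-backed object.

```python
import pickle
import threading


class Target:
    """Hypothetical target holding an unpicklable index built from its seeds."""

    def __init__(self, seeds):
        self.seeds = list(seeds)
        self._index = self._build_index()

    def _build_index(self):
        # The lock makes this structure unpicklable, standing in for a
        # native extension object.
        return {"lock": threading.Lock(), "hosts": set(self.seeds)}

    def __getstate__(self):
        # Copy the instance state and leave out the unpicklable member.
        state = self.__dict__.copy()
        state.pop("_index", None)
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then rebuild the index from the seeds.
        self.__dict__.update(state)
        self._index = self._build_index()


original = Target(["evilcorp.com", "www.evilcorp.com"])
restored = pickle.loads(pickle.dumps(original))
```

The object survives the round trip because only the seeds cross the process boundary; the index is recreated on the receiving side.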
Gowitness may change both the scheme and port of a URL it records in its database (e.g. recording http://host:443/ for an input of http://host/ when the server redirects from port 80 to HTTPS on port 443). This caused KeyError crashes and later correlation warnings. Use hostname + path as the event_dict key, ignoring scheme and port entirely, so lookups succeed regardless of how gowitness transforms the URL. Also use .get() with graceful fallback for any remaining edge cases.
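A hostname + path key of this kind might look like the sketch below. The function name `url_key` and its exact normalization (lowercased host, empty path treated as "/") are assumptions for illustration; the PR's own helper may differ.

```python
from urllib.parse import urlsplit


def url_key(url):
    """Build a correlation key that ignores scheme and port."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"  # treat a missing path as the root
    return f"{host}{path}"


# The input URL and the URL gowitness recorded after a redirect
# map to the same key.
input_key = url_key("http://host/")
recorded_key = url_key("http://host:443/")
```

The trade-off is that two inputs on the same host and path but different ports collapse to one key, so this key alone cannot tell them apart.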
Remove our __getstate__/__setstate__ from BaseTarget; the upstream fix-target-pickle branch handles this more cleanly (explicit state, direct acl_mode reading, ScanBlacklist override, and a test).
bee7518 to 18b4553
📊 Performance Benchmark Report
🎯 Performance Summary
✅ No significant performance changes detected (all changes <10%)
🐍 Python Version 3.11.15
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@           Coverage Diff           @@
##             3.0   #2985   +/-   ##
=====================================
- Coverage     91%     91%    -0%
=====================================
  Files        436     436
  Lines      36918   36959    +41
=====================================
+ Hits       33567   33587    +20
- Misses      3351    3372    +21
Use tiered lookup: exact raw URL match first, then fall back to the loose hostname+path key. This correctly handles both multi-port URLs (e.g. :80 and :443 on the same host) and gowitness scheme/port transformations from redirects.
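The tiered lookup can be sketched as two indexes consulted in order. This is an assumption-laden illustration: `loose_key()`, `build_indexes()`, and `resolve()` are hypothetical names, and plain strings stand in for the parent events.

```python
from urllib.parse import urlsplit


def loose_key(url):
    """Scheme-and-port-agnostic key: lowercased hostname plus path."""
    parts = urlsplit(url)
    return f"{(parts.hostname or '').lower()}{parts.path or '/'}"


def build_indexes(events_by_url):
    """Build an exact index and a loose fallback index from input URLs."""
    exact = dict(events_by_url)
    loose = {}
    for url, event in events_by_url.items():
        # First writer wins, so a later URL never silently replaces an earlier one.
        loose.setdefault(loose_key(url), event)
    return exact, loose


def resolve(db_url, exact, loose):
    """Tier 1: exact URL match. Tier 2: loose hostname+path match."""
    if db_url in exact:
        return exact[db_url]
    return loose.get(loose_key(db_url))


exact, loose = build_indexes({
    "http://host:8888/": "event-http",
    "https://host:9999/": "event-https",
    "http://other/": "event-other",
})
by_exact = resolve("https://host:9999/", exact, loose)   # multi-port case
by_loose = resolve("http://other:443/", exact, loose)    # redirect case
missing = resolve("http://unknown/", exact, loose)
```

Checking the exact tier first keeps the two ports on the same host distinct, and the loose tier only comes into play when gowitness has rewritten the URL.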
Upgrade pinned gowitness version from 3.0.5 to 3.1.1. Replace the unit test for _resolve_parent with a real integration test that runs gowitness against two ports (HTTP :8888 and HTTPS :9999) and verifies both get correctly correlated WEBSCREENSHOT events with the right parent attribution.
liquidsec approved these changes Mar 24, 2026
TheTechromancer approved these changes Mar 24, 2026
Summary
- PR #2974 (Fix gowitness bug) addressed KeyError crashes but missed the screenshot section and didn't fix the root cause (URL mismatch between input and gowitness DB)
- Gowitness can record a different scheme or port than the input URL (e.g. http://host:443/ instead of https://host/, or http://host:443/ instead of http://host/ after redirect). This causes event_dict lookups to fail.
- Add _url_key() that produces a scheme-and-port-agnostic key (hostname + path) for correlation, used both when building event_dict and when looking up from the DB
- Use .get() with graceful fallback for the screenshot section (missed by PR #2974)
- Add a stdin_urls list to preserve original URLs sent to gowitness while using normalized keys in event_dict

Test plan