source_domain was silently failing on both azure_tenant and oauth due to __slots__. Adding it properly enables the cross-module domain context handoff that oauth relies on.
Codecov Report
❌ Patch coverage is
@@         Coverage Diff          @@
##            3.0   #2986   +/-   ##
=====================================
- Coverage    91%     91%    -0%
=====================================
  Files       436     436
  Lines     36960   37038    +78
=====================================
+ Hits      33587   33640    +53
- Misses     3373    3398    +25
Add a .url property on BaseEvent that returns the URL string for any
event type via parsed_url.geturl(). Works uniformly across URL,
URL_UNVERIFIED, HTTP_RESPONSE, FINDING, WEB_PARAMETER, TECHNOLOGY,
STORAGE_BUCKET, etc. Returns empty string for non-URL events.
Also fix WEB_PARAMETER.sanitize_data to call super() so parsed_url
gets set (was silently broken).
Replace all event.data["url"], event.data.get("url"), and
type-checking patterns across 34 files to use event.url instead.
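A minimal sketch of the property this commit describes, assuming a simplified stand-in class. `SketchEvent` and its constructor are hypothetical and are not bbot's BaseEvent; only `parsed_url.geturl()` and the empty-string fallback come from the commit message.

```python
from urllib.parse import urlparse


class SketchEvent:
    """Hypothetical stand-in for BaseEvent; not bbot's actual class."""

    def __init__(self, event_type, data):
        self.type = event_type
        self.data = data
        # Assumption: parsed_url is set only when the data dict carries a URL.
        raw_url = data.get("url", "") if isinstance(data, dict) else ""
        self.parsed_url = urlparse(raw_url) if raw_url else None

    @property
    def url(self):
        # Uniform accessor: the URL string for URL-bearing events, "" otherwise.
        if self.parsed_url is None:
            return ""
        return self.parsed_url.geturl()


url_event = SketchEvent("URL", {"url": "https://example.com/login?next=/"})
finding = SketchEvent("FINDING", {"url": "https://example.com/admin", "description": "exposed panel"})
dns_event = SketchEvent("DNS_NAME", "example.com")
```

With an accessor like this, callers can read `event.url` without branching on the event type.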
bbot/core/event/base.py
class URL_UNVERIFIED(BaseEvent):
    _status_code_regex = re.compile(r"^status-(\d{1,3})$")

    __slots__ = ["_http_title"]
we need to be careful not to override the base class
slots with inheritance are additive, not overriding. Shouldn't cause any problems. This may go away anyway though.
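The additive behavior can be checked with a small standalone sketch. `Base` and `Child` are illustrative names, not bbot classes. It also shows that assigning an undeclared attribute raises AttributeError when every class in the hierarchy defines `__slots__`; such a failure only goes unnoticed if the caller catches the exception.

```python
class Base:
    __slots__ = ["data", "source_domain"]


class Child(Base):
    # Declares one extra slot; the slots inherited from Base remain usable.
    __slots__ = ["_http_title"]


event = Child()
event.data = {"url": "https://example.com"}  # slot declared on Base
event.source_domain = "example.com"          # slot declared on Base
event._http_title = "Example Domain"         # slot declared on Child

# No class in the hierarchy provides an instance __dict__, so a name that
# is not declared in any __slots__ list cannot be assigned.
try:
    event.undeclared = 1
    undeclared_allowed = True
except AttributeError:
    undeclared_allowed = False
```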
# Cross-module communication
"source_domain",
what is this for?
It's for cross-module communication between azure_tenant and oauth. When azure_tenant discovers a federated login URL, it stamps source_domain on the event so that oauth knows which original domain the URL was discovered from (since the URL itself is on a different domain like login.microsoftonline.com). The oauth module then uses it for scope checks and finding descriptions.
- Rename "noisy" flag to "loud" across all modules, CLI, tests, docs
- Restore "safe" flag: explicitly added to every module that isn't loud or invasive (130 modules)
- Add test: every scan module must have safe, loud, and/or invasive; safe is mutually exclusive with loud/invasive
- Regenerate docs tables
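The flag rule from the new test can be sketched as a standalone check. This is an illustration of the rule as stated in the commit message, not the test that was added; `check_flags`, the module names, and the "passive"/"active" flags in the sample data are assumptions.

```python
def check_flags(module_name, flags):
    """Return a list of problems with a module's flags (empty if valid)."""
    flags = set(flags)
    problems = []
    # Rule 1: at least one of safe, loud, invasive must be present.
    if not flags & {"safe", "loud", "invasive"}:
        problems.append(f"{module_name}: needs safe, loud, and/or invasive")
    # Rule 2: safe is mutually exclusive with loud and invasive.
    if "safe" in flags and flags & {"loud", "invasive"}:
        problems.append(f"{module_name}: safe cannot be combined with loud/invasive")
    return problems


# Hypothetical modules used only to exercise the rule.
modules = {
    "mod_passive": ["passive", "safe"],
    "mod_fuzzer": ["active", "loud", "invasive"],
    "mod_unflagged": ["active"],
    "mod_conflict": ["safe", "loud"],
}
report = {name: check_flags(name, flags) for name, flags in modules.items()}
```

Here `mod_unflagged` and `mod_conflict` each yield one problem, while the other two pass.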
Each preset YAML and doc section now explicitly lists what submodules, companion modules, POST behavior, WAF handling, and other settings are enabled, making the progression from light→default→heavy→max clear.
One last thing, regarding cloud providers:
The gist of "host_metadata": {
    "spacex.com": {
        "whois": {
            "org": "SPACEX INDUSTRIES"
        }
    },
    "104.18.26.217": {
        "cloud_providers": ["microsoft", "github"],
        "asns": ["AS1234"],
        "orgs": ["Microsoft"]
    },
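As a rough illustration of how a consumer might read a structure shaped like the fragment above, here is a sketch that looks up cloud providers by host. `cloud_providers_for` is a hypothetical helper, and the dict simply mirrors the fragment's shape as quoted in this comment.

```python
host_metadata = {
    "spacex.com": {"whois": {"org": "SPACEX INDUSTRIES"}},
    "104.18.26.217": {
        "cloud_providers": ["microsoft", "github"],
        "asns": ["AS1234"],
        "orgs": ["Microsoft"],
    },
}


def cloud_providers_for(metadata, host):
    # Hypothetical helper: unknown hosts and missing keys yield an empty list.
    return list(metadata.get(host, {}).get("cloud_providers", []))
```

Keying by host string means one event can carry metadata for both a domain and the IPs it resolves to.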
event.data is now a dict; use event.url to get the URL string for regex matching
- ffuf_shortnames: use event.parent.url instead of event.parent.data (now a dict)
- Update test monkeypatches that mutate event.data["url"] to also update parsed_url, since modules now read event.url (backed by parsed_url)
…vent
The monkeypatch was setting parsed_url to a file:// URL, which has no hostname, causing a make_ip_type(None) crash in trufflehog. Also harden DictHostEvent._host() to tolerate host-less parsed_url schemes.
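The failure mode is easy to reproduce with the standard library: `urlparse` returns `hostname=None` for a `file://` URL. The sketch below shows one way to guard against that. `host_of` is a hypothetical helper and is not bbot's `make_ip_type` or `DictHostEvent._host()`.

```python
import ipaddress
from urllib.parse import urlparse


def host_of(url):
    """Hypothetical helper: return an IP address object, a hostname string, or None."""
    hostname = urlparse(url).hostname  # None for URLs such as file:///tmp/repo
    if hostname is None:
        # Host-less scheme: report "no host" instead of passing None along.
        return None
    try:
        return ipaddress.ip_address(hostname)
    except ValueError:
        # Not an IP literal, so treat it as a DNS name.
        return hostname
```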
will close #2771
- Add cleanup() to mongo output module (only output module missing one)
- Await client.aclose() instead of unawaited client.close() in mongo test
- Replace blocking time.sleep() with async sleep in mongo and elastic tests
- Await docker stop process completion in mongo, rabbitmq, kafka, elastic, nats tests
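The unawaited-close problem can be demonstrated with a small sketch. `FakeClient` is a hypothetical stand-in for an async client and does not reproduce pymongo's API; the point is only that calling a coroutine function without `await` creates a coroutine object and runs none of its body.

```python
import asyncio


class FakeClient:
    """Hypothetical async client used only for illustration."""

    def __init__(self):
        self.closed = False

    async def close(self):
        await asyncio.sleep(0)  # non-blocking pause, unlike time.sleep()
        self.closed = True


async def main():
    forgotten = FakeClient()
    pending = forgotten.close()          # coroutine object created; body has not run
    closed_without_await = forgotten.closed
    pending.close()                      # discard explicitly to avoid a RuntimeWarning

    awaited = FakeClient()
    await awaited.close()                # body runs to completion
    return closed_without_await, awaited.closed


result = asyncio.run(main())
```

The first client is still open after the un-awaited call, while the awaited call actually closes the second one. The same reasoning applies to `time.sleep()` inside a coroutine, which blocks the whole event loop, whereas `await asyncio.sleep()` yields control.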
Closes #2959
Summary
URL Events as Structured Data
- .data is now {"url": "https://..."} instead of a bare string
- .url property on BaseEvent: returns the URL string for any event type, empty string for non-URL events
- event.data["url"], type-checking like event.data if event.type == "URL" else event.data["url"]
- event.url instead of direct data access
host_metadata
- host_metadata field on BaseEvent: a dict keyed by host string (IP or domain)
- {"104.18.26.217": {"cloud_providers": {"cloudflare": {"types": ["waf"], "match": "ip"}}}}
Cloud Tag Cleanup
- host_metadata with structured cloud provider info
- (cloudflare, amazon) + type (cloud, cdn, waf)
- cloud-microsoft, microsoft-ip, microsoft-domain, microsoft-cname
Slot Scoping
- web_spider_distance, parsed_url, url_extension, num_redirects
- _data_path to DictPathEvent, envelopes to WEB_PARAMETER
Flag Renames
- aggressive -> loud: "generates a large amount of network traffic"
- deadly -> invasive: "intrusive or potentially destructive"
- safe flag: explicitly on every module that is not loud or invasive
- web-basic -> web, web-thorough -> web-heavy
Preset Renames
- web-basic.yml -> web.yml, web-thorough.yml -> web-heavy.yml
- spider-intense.yml -> spider-heavy.yml, baddns-intense.yml -> baddns-heavy.yml
- nuclei-intense.yml -> nuclei-heavy.yml
- lightfuzz-medium.yml -> lightfuzz.yml, lightfuzz-superheavy.yml -> lightfuzz-max.yml
Deadly Gate -> Console Warnings
- --allow-deadly CLI argument
Tag Cleanup (high-cardinality dynamic tags)
- ip-{ip} tags: IP data in resolved_hosts attribute
- http-title-{title} tags: http_title as a proper event attribute
- status-{code}, distance-{n}, extension-{ext}, {rdtype}-record
Fix: Docker-based test hangs (pre-existing bug)
- cleanup() method: AsyncMongoClient was never closed, leaving 5+ pymongo background tasks (kill_cursors, server_monitor, server_rtt, poll_cancellation) orphaned on the session-scoped event loop
- client.close() without await: the returned coroutine was silently discarded, so the client was never actually closed
- time.sleep() in async context in mongo and elastic tests
- (docker stop) to await process completion before returning
Tests and Docs