Problems
1. Compact format redundantly prefixes every line with ERROR
The severity prefix carries no information when the issue type already encodes it. TaintedSql is always an error; PossiblyInvalidArgument is self-evidently not info. Removing ERROR from CompactReport::create() makes the output shorter and no less informative. Non-error findings could retain their severity prefix (INFO, WARNING) since those are not the default and carry real signal.
2. Taint findings expose no chain — making triage impossible without re-running Psalm
Current compact output for taint issues:
app/Http/Controllers/Auth/LoginController.php:45:35 TaintedHeader: Detected tainted header
app/Http/Controllers/Admin/ReportController.php:94:23 TaintedSql: Detected tainted SQL
To determine if a finding is a real vulnerability or a false positive, you must either re-run without --output-format to get the full default trace, or open each flagged file and manually trace the data flow.
Proposed changes
1. Drop ERROR prefix from compact format
// Before
ERROR app/Http/Controllers/Auth/LoginController.php:45:35 TaintedHeader: Detected tainted header
// After
app/Http/Controllers/Auth/LoginController.php:45:35 TaintedHeader: Detected tainted header
2. Add source→sink chain as a second line for taint findings
When taint_trace !== null, emit a second indented line showing the data flow in actual PHP expressions. Non-taint findings remain single-line.
Line 1 — location and type:
{file}:{line}:{col} {TaintType} [{N}|direct]
Line 2 — source→sink chain (app-code only, stubs stripped):
{source_expr}[@{line}|@[{OtherFile.php}:{line}]] → [{$var} →]* {SinkClass}::{method}()@{line}
Examples
Direct — classifiable immediately, no file access needed:
app/Http/Controllers/Auth/LoginController.php:45:35 TaintedHeader [direct]
$request->input('redirect_url')@45 → Redirector::to()@45
Via local variable — assignment obvious from name:
app/Http/Controllers/PaymentController.php:34:35 TaintedHeader [2]
$request->input('redirect_url')@28 → $nextStepUrl → Redirector::to()@34
Cross-function, same file — line jump implies the call:
app/Http/Controllers/Admin/NotificationController.php:133:24 TaintedCallable [2]
$request->input('notification')@104 → $notificationFqcn → new $notificationFqcn@133
SQL column injection — stub hops (explode, array-destructuring) stripped:
app/Http/Controllers/Admin/ArticleController.php:94:23 TaintedSql [3]
$request->input('sort_by')@82 → $sortBy → $sortByColumn → Builder::orderBy()@94
Cross-file — source in a different controller, visible without grep:
app/Services/ReferralPerformanceReportBuilder.php:94:23 TaintedSql [3]
$request->input('sort_by')@[AdminReferralController.php:50] → $this->sortBy → Builder::orderBy()@94
DB-stored source — false positive detectable without opening any file:
app/Notifications/WelcomeCoach.php:34:18 TaintedHeader [2]
$member->email → MailMessage::cc()@34
SSRF:
app/Http/Controllers/Admin/PreviewController.php:21:30 TaintedSSRF [direct]
$request->input('url')@17 → Http::get()@21
Chain design decisions
Use actual PHP expressions — no invented notation
The taint_trace[1]->snippet already contains the real PHP call that introduces taint (e.g. $request->input('sort_by')). This is used directly — no abbreviation or taxonomy layer needed. The PHP expression itself tells you everything:
- $request->input(...): live request parameter, investigate immediately
- $member->email: model attribute loaded from DB, likely false positive
- session(...): session-stored value, context-dependent
Source expression and @line
- Extracted from taint_trace[1]->snippet (first app-code node), not from taint_trace[0]->label (the unlocated stub)
- @line included when source is in the same file as the sink: $request->input('sort_by')@82
- @[OtherFile.php:line] when source is in a different file: $request->input('sort_by')@[AdminReferralController.php:50]
- @line omitted when source location is unknown (e.g. model attribute with no static call site)
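The three location rules above can be sketched as a small helper. This is a minimal illustration, not Psalm code: formatSource() is a hypothetical name, plain arrays stand in for trace nodes, and the snippet field is assumed to already hold the extracted call expression rather than the whole source line.

```php
<?php
// Hypothetical helper: formats the source node of a taint chain.
// $source is a plain array standing in for a trace node (assumption).
function formatSource(array $source, string $sinkFilePath): string
{
    $expr = trim($source['snippet']);
    $line = $source['line_from'] ?? 0;
    $file = $source['file_path'] ?? '';

    if ($line <= 0 || $file === '') {
        return $expr; // location unknown: omit the @line suffix
    }
    if ($file === $sinkFilePath) {
        return $expr . '@' . $line; // same file as the sink
    }
    // different file: short file name plus line
    return $expr . '@[' . basename($file) . ':' . $line . ']';
}

echo formatSource(
    ['snippet' => "\$request->input('sort_by')", 'file_path' => '/app/A.php', 'line_from' => 82],
    '/app/A.php'
), "\n";
```

Comparing full paths rather than short names avoids treating two files with the same basename in different directories as one file.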
Intermediate nodes — variable names only, no line numbers
Line numbers on intermediate hops add noise without triage value. The variable name is the signal: $sortBy → $sortByColumn shows simple manipulation; $validated or $allowedValue in the chain flags a possible runtime escape that Psalm couldn't model. Only source and sink carry @line.
Cross-file marker
When taint crosses a file boundary, prefix the first node in the new file with [ShortFileName.php]. Subsequent nodes in the same file need no prefix.
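A rough sketch of the boundary rule, under stated assumptions: markFileBoundaries() and the name/file_path array keys are hypothetical, and the single space between the marker and the node is a guess, since the text does not fix the spacing.

```php
<?php
// Hypothetical helper: prefixes the first node after each file change
// with [ShortFileName.php]; later nodes in the same file stay bare.
function markFileBoundaries(array $nodes, string $startFile): array
{
    $current = $startFile;
    $out = [];
    foreach ($nodes as $node) {
        if ($node['file_path'] !== $current) {
            $current = $node['file_path'];
            // separator after the marker is an assumption
            $out[] = '[' . basename($current) . '] ' . $node['name'];
        } else {
            $out[] = $node['name'];
        }
    }
    return $out;
}
```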
Sink shown as ShortClass::method()@line
Not just the tainted variable name. Builder::orderBy() vs Builder::whereRaw() have different severity; the method name makes this visible without reading code.
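One possible way to produce the ShortClass::method()@line form is sketched below. formatSink() is a hypothetical helper, and it assumes the sink is available as a fully qualified Class::method string; how that string is obtained from the trace label is left open.

```php
<?php
// Hypothetical helper: shortens a fully qualified method id and appends the line.
function formatSink(string $methodId, int $line): string
{
    $parts = explode('::', $methodId, 2);
    $class = $parts[0];

    // keep only the part after the last namespace separator
    $pos = strrpos($class, '\\');
    $short = $pos === false ? $class : substr($class, $pos + 1);

    $call = isset($parts[1]) ? $short . '::' . $parts[1] . '()' : $short;
    return $call . '@' . $line;
}

echo formatSink('Illuminate\\Database\\Query\\Builder::orderBy', 94), "\n";
// prints "Builder::orderBy()@94"
```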
Stripping rules
Strip from the chain:
- Nodes where file_path contains /vendor/ (eliminates enormous framework/queue serialization chains)
- Nodes where line_from === 0: stubs (same logic SarifReport already uses)
- Psalm-synthetic internal labels: variable-use, arrayvalue-fetch, coalesce, concat (graph plumbing with no PHP equivalent)
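The three stripping rules could be expressed as a single predicate. This is a sketch only: isDisplayable() and stripChain() are hypothetical names, plain arrays stand in for trace nodes, and exact-match comparison against the synthetic labels is an assumption about how those labels appear in the trace. It uses str_contains(), so it needs PHP 8.0 or later.

```php
<?php
// Labels listed in the stripping rules; exact matching is an assumption.
const SYNTHETIC_LABELS = ['variable-use', 'arrayvalue-fetch', 'coalesce', 'concat'];

// Hypothetical predicate: true when a node should stay in the displayed chain.
function isDisplayable(array $node): bool
{
    if (str_contains($node['file_path'] ?? '', '/vendor/')) {
        return false; // framework / dependency code
    }
    if (($node['line_from'] ?? 0) === 0) {
        return false; // stub with no source location
    }
    return !in_array($node['label'] ?? '', SYNTHETIC_LABELS, true);
}

// Keeps only displayable nodes and reindexes the result.
function stripChain(array $trace): array
{
    return array_values(array_filter($trace, 'isDisplayable'));
}
```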
Hop count
Count of displayed app-code segments minus 1. direct when the chain collapses to source → sink with nothing between.
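Read literally, the rule maps the number of displayed nodes (source, intermediates, sink) to a label as in the sketch below. hopLabel() is a hypothetical name; the mapping follows the local-variable and SQL examples above (three nodes give [2], four give [3]).

```php
<?php
// Hypothetical helper: hop label from the number of displayed chain nodes.
function hopLabel(int $displayedNodes): string
{
    $hops = $displayedNodes - 1;
    // source → sink with nothing between is a single hop, shown as "direct"
    return $hops <= 1 ? '[direct]' : '[' . $hops . ']';
}
```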
Why this matters
Token efficiency for AI-assisted triage. Full JSON output for a 7-step taint chain is ~800 tokens. The two-line format above is ~35 tokens — over 20× smaller. For codebases with 30–200 taint findings, this is the difference between fitting a full security scan in a context window or not.
False positive resolution without file access.
| Chain signal | Classification |
|---|---|
| $request->input(...) + [direct] | Real: investigate immediately |
| $request->input(...) + [N] + no validation-named var | Real: verify chain |
| $request->input(...) + [N] + $validated/$allowed* in chain | Possible false positive |
| $member->prop or $model->attr as source | Likely false positive: DB-stored |
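The table can be read as a small decision procedure. The sketch below is a deliberately crude heuristic for illustration: classifyChain() is a hypothetical name, any $x->prop source is treated as a model attribute, and only the $validated / $allowed* name patterns from the table are checked. It uses str_starts_with(), so it needs PHP 8.0 or later.

```php
<?php
// Hypothetical classifier following the table row by row.
// $source is the source expression without its @line suffix;
// $intermediates is the list of variable names between source and sink.
function classifyChain(string $source, array $intermediates): string
{
    if (str_starts_with($source, '$request->')) {
        if ($intermediates === []) {
            return 'real: investigate immediately';
        }
        foreach ($intermediates as $var) {
            // validation-named variables hint at a check Psalm could not model
            if (preg_match('/^\$(validated|allowed)/i', $var) === 1) {
                return 'possible false positive';
            }
        }
        return 'real: verify chain';
    }
    // bare property access such as $member->email (assumed DB-stored)
    if (preg_match('/^\$\w+->\w+$/', $source) === 1) {
        return 'likely false positive: DB-stored';
    }
    return 'unclassified';
}
```

Name-based matching will misfire on variables that merely look validated, so the result is a triage hint rather than a verdict.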
Token comparison
| Format | Tokens/finding | Triage without opening files |
|---|---|---|
| Current compact | ~20 | ~10% |
| This two-line format | ~35 | ~90% |
| Full JSON | ~800 | 100% |
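To put the per-finding estimates in context, the sketch below multiplies them out for the 30 to 200 finding range mentioned earlier. The per-finding figures are the approximations from the table, not measurements, and totalTokens() is a hypothetical helper.

```php
<?php
// Approximate tokens per finding, taken from the table above.
const TOKENS_PER_FINDING = ['compact' => 20, 'two-line' => 35, 'json' => 800];

// Hypothetical helper: total tokens for a scan with $findings taint findings.
function totalTokens(int $findings, string $format): int
{
    return $findings * TOKENS_PER_FINDING[$format];
}

foreach ([30, 200] as $findings) {
    foreach (array_keys(TOKENS_PER_FINDING) as $format) {
        printf("%3d findings, %-8s: %6d tokens\n", $findings, $format, totalTokens($findings, $format));
    }
}
```

At 200 findings this gives roughly 160,000 tokens for full JSON against roughly 7,000 for the two-line format.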
Implementation notes
Change 1 (CompactReport::create()): remove the $severity . ' ' prefix for REPORT_ERROR findings. Retain severity prefix only for non-error findings.
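A minimal sketch of Change 1, written as standalone functions rather than as a patch to CompactReport::create(). The REPORT_ERROR constant here is a local stand-in with an assumed value of 'error', and formatLine() is a hypothetical helper.

```php
<?php
// Local stand-in for the error severity value (assumption).
const REPORT_ERROR = 'error';

// Errors get no prefix; other severities keep theirs in upper case.
function severityPrefix(string $severity): string
{
    return $severity === REPORT_ERROR ? '' : strtoupper($severity) . ' ';
}

// Hypothetical helper: one compact line for a non-taint finding.
function formatLine(string $severity, string $location, string $type, string $message): string
{
    return severityPrefix($severity) . $location . ' ' . $type . ': ' . $message;
}

echo formatLine('error', 'app/Foo.php:45:35', 'TaintedHeader', 'Detected tainted header'), "\n";
echo formatLine('info', 'app/Foo.php:12:9', 'PossiblyInvalidArgument', 'Example message'), "\n";
```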
Change 2 (CompactReport::create()): when $issue_data->taint_trace !== null:
- Source: extract call expression from taint_trace[1]->snippet; append @line or @[File.php:line] depending on whether source file matches sink file
- Intermediates: iterate taint_trace, skip vendor/stub/synthetic nodes, collect $variableName; inject [ShortFile.php] prefix on file boundary transitions
- Sink: short class name + ::method() from last trace step label + @line
- Hop count: count of displayed nodes minus 1; emit direct when the hop count is 1 (source → sink only)
- Emit two lines: {file}:{line}:{col} {TaintType} [{N}|direct]\n {chain}\n
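The final assembly step might look like the sketch below. It assumes the chain nodes have already been filtered and formatted into strings; renderTaintFinding() is a hypothetical name, and the two-space indent on the second line is an assumption, since the template only says the line is indented.

```php
<?php
// Hypothetical helper: renders the two-line compact form for one taint finding.
// $chain is the list of already formatted nodes: source, intermediates, sink.
function renderTaintFinding(string $location, string $taintType, array $chain): string
{
    $hops = count($chain) - 1;
    $label = $hops <= 1 ? 'direct' : (string) $hops;

    return sprintf(
        "%s %s [%s]\n  %s\n",
        $location,
        $taintType,
        $label,
        implode(' → ', $chain)
    );
}

echo renderTaintFinding(
    'app/Http/Controllers/PaymentController.php:34:35',
    'TaintedHeader',
    ["\$request->input('redirect_url')@28", '$nextStepUrl', 'Redirector::to()@34']
);
```

Keeping the renderer free of trace logic means the filtering and formatting steps can be tested on their own.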
Future: entry_path_type for reliable source classification
Distinguishing $request->input() from $member->email by pattern-matching on the snippet is sufficient for most cases but fragile. Populating entry_path_type on the taint source node would enable a reliable taxonomy (request_input, db_attribute, session, http_response). However, the entry path is often unresolvable statically for framework-heavy stacks, where the full call path goes index.php → Kernel → Router → Controller through vendor paths. Left as a future enhancement.
Related
- SarifReport already filters stub trace steps via $trace->line_from > 0; the same logic applies here
- JsonReport emits the full taint_trace; this adds a compact projection of the same data