feat(bias-audit): enhanced PDF report with template architecture #3734

Merged
gorkem-bwl merged 9 commits into develop from feat/bias-audit-report-v2
Apr 14, 2026
@gorkem-bwl (Contributor)

Summary

Complete overhaul of the bias audit PDF report generator. The report went from a raw data dump (5 pages) to an actionable compliance artifact (9 pages) with a template architecture for multi-framework support.

Depends on: #3725 (merge that first, then retarget this to develop)

New report sections

  • VerifyWise logo on cover page
  • Overall assessment verdict (green/amber/red) based on impact ratio severity
  • Scope section (in scope / out of scope)
  • Auto-evaluated compliance checklist against LL144 requirements
  • Impact ratio bar charts with threshold line
  • Results tables sorted worst-first with per-flag explanations
  • Recommended actions with legal citations
  • Regulatory context explaining LL144 requirements
  • Glossary of key terms (AEDT, impact ratio, 4/5ths rule, etc.)
  • Rewritten conclusion with actual findings summary

Architecture

Template pattern for multi-framework support:

  • report_templates/base.py — abstract base with default verdict/flag logic
  • report_templates/helpers.py — shared data extraction utilities
  • report_templates/ll144.py — NYC Local Law 144 content
  • report_templates/generic.py — fallback for custom frameworks
  • report_generator.py — layout engine calling template methods

Adding a new framework requires only creating a new template file and registering it in __init__.py. Currently supports 17 bias audit presets — LL144 gets the full template, all others fall back to generic.
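The template-plus-registry arrangement described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the class names `BiasAuditReportTemplate`, `LL144Template`, `GenericTemplate` and the function name `get_template()` come from the commit messages. The preset key `"nyc_ll144"`, the `_TEMPLATES` dict, and the reduced method set are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class BiasAuditReportTemplate(ABC):
    """Framework-specific report content; the layout engine only calls these methods."""

    @abstractmethod
    def glossary(self):
        """Return a dict of term -> definition for the glossary section."""

    def checklist(self, audit):
        # Assumed default: no framework mandate, so no checklist items.
        return []


class GenericTemplate(BiasAuditReportTemplate):
    def glossary(self):
        return {
            "Impact ratio": "A group's selection rate divided by the "
                            "selection rate of the most selected group.",
        }


class LL144Template(BiasAuditReportTemplate):
    def glossary(self):
        return {
            "AEDT": "Automated employment decision tool.",
            "Impact ratio": "A category's selection rate divided by the "
                            "selection rate of the most selected category.",
        }


# Hypothetical registry: preset name -> template class.
_TEMPLATES = {"nyc_ll144": LL144Template}


def get_template(preset_name):
    """Resolve a preset name to a template instance, falling back to generic."""
    return _TEMPLATES.get(preset_name, GenericTemplate)()
```

Under this sketch, supporting another framework means adding one subclass and one `_TEMPLATES` entry; any unregistered preset name resolves to `GenericTemplate`.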

@gorkem-bwl gorkem-bwl requested a review from gorkemcetin April 13, 2026 18:14
Base automatically changed from feat/eval-model-inventory-link to develop April 14, 2026 03:31
@gorkem-bwl gorkem-bwl added this to the 2.3 milestone Apr 14, 2026
## Changes
- Create report_templates directory under bias_audit engine
- Add BiasAuditReportTemplate ABC with 13 abstract methods for
  framework-specific PDF report content (verdict, checklist, scope,
  glossary, regulatory context, etc.)
- Add template registry with get_template() that resolves preset
  names to template instances (LL144 or generic fallback)

## Benefits
- Enables multiple compliance frameworks (NYC LL144, EU AI Act)
  to provide their own report content without modifying the layout engine
- Clean separation between report layout and framework-specific content

## Changes
- Create LL144Template implementing BiasAuditReportTemplate base class
- Implement all required methods: verdict, scope_in/out, checklist,
  required_categories, threshold/flag explanations, recommended_actions,
  regulatory_context, glossary, conclusion_summary, additional_limitations
- Add module-level helpers for impact ratio scanning, category extraction,
  group counting, and category detection

## Details
The template provides all LL144-specific content including:
- Three-tier verdict system (green/amber/red) based on 4/5ths rule thresholds
- Six-item compliance checklist covering sex, race, intersectional analysis,
  auditor independence, and timeliness requirements
- Legal citations (NYC Admin Code §§ 20-870–20-874, 6 RCNY § 5-300/5-303,
  42 U.S.C. § 2000e-2(k), EEOC Uniform Guidelines § 60-3.4.D)
- Seven-term glossary with official AEDT definition
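To make the three-tier verdict concrete, here is a hedged sketch of how a verdict could be derived from impact ratios. The 0.80 threshold is the 4/5ths rule; the amber/red cut-off of 0.70 and both function names are assumptions for illustration and are not taken from this PR.

```python
FOUR_FIFTHS = 0.80  # 4/5ths rule threshold
AMBER_FLOOR = 0.70  # assumed amber/red cut-off, not taken from the PR


def impact_ratios(selection_rates):
    """Divide each group's selection rate by the highest group's rate."""
    best = max(selection_rates.values())
    if best == 0:
        # Nobody was selected in any group; avoid dividing by zero.
        return {group: 0.0 for group in selection_rates}
    return {group: rate / best for group, rate in selection_rates.items()}


def verdict(ratios):
    """Map the lowest impact ratio to green, amber, or red."""
    worst = min(ratios.values())
    if worst >= FOUR_FIFTHS:
        return "green"
    if worst >= AMBER_FLOOR:
        return "amber"
    return "red"
```

For example, selection rates of 0.50, 0.45 and 0.30 give a lowest impact ratio of 0.60, which this sketch reports as red.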

Implement GenericTemplate(BiasAuditReportTemplate), used when the
compliance framework preset is unknown or custom. It provides the same
verdict logic (green/amber/red) as LL144 but without framework-specific
legal references, checklists, or required categories.

## Changes
- Three duplicated module-level helpers (_min_impact_ratio,
  _category_names_from_tables, _count_evaluated_groups)
- Simplified scope_in with metric label derived from underscored key
- Two-item scope_out, two-term glossary, no regulatory_context
- Empty checklist and required_categories (no framework mandate)
- Generic threshold and flag explanations without 4/5ths rule refs
- Concise recommended_actions and conclusion_summary

## Changes
- Integrate template system: resolve preset name to template instance
  and thread it through all section functions
- Add logo to cover page (falls back to spacer if file missing)
- Add new sections: overall assessment verdict table, scope (in/out),
  compliance checklist, impact ratio bar charts, recommended actions,
  regulatory context, glossary, and conclusion with limitations
- Modify existing sections: executive summary now includes impact ratio
  charts; methodology uses template.threshold_explanation(); results
  sorts rows by impact ratio ascending and appends flag explanations
- Delete _limitations() (replaced by _conclusion() with template support)
- Add new color constants (verdict green/amber/red, check pass/warn/info)
  and paragraph styles for verdict, checklist, actions, and glossary
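The worst-first ordering of results rows could look something like the sketch below. The `category_name` key appears in the commit messages; the `impact_ratio` key, the handling of rows without a ratio, and both function names are assumptions for illustration.

```python
def sort_worst_first(rows):
    """Order result rows by impact ratio ascending; rows without a ratio go last."""
    def key(row):
        ratio = row.get("impact_ratio")
        # False sorts before True, so rows that have a ratio come first.
        return (ratio is None, ratio if ratio is not None else 0.0)
    return sorted(rows, key=key)


def flagged_categories(rows, threshold=0.80):
    """Names of categories below the threshold, worst first."""
    return [
        row["category_name"]
        for row in sort_worst_first(rows)
        if row.get("impact_ratio") is not None and row["impact_ratio"] < threshold
    ]
```

Sorting on a `(is_missing, ratio)` tuple keeps excluded groups at the bottom of the table without needing a sentinel value for the ratio.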

## Benefits
- Framework-specific content (LL144, generic, future frameworks) is now
  driven by template classes rather than hardcoded in the generator
- Report is significantly more detailed: verdict summary, scope, checklist,
  bar charts, regulatory context, glossary, and structured conclusion
- Function signature generate_pdf_report(audit) is unchanged

- Use category_name, not group, for row names
- Use total_applicants, not total_records, for record counts
- Use auditorIndependence, not auditorType, for the independence check
- intersectional category_key filtering
- Use cross, not dimensions, for intersectional config

- Replace logo with proper VerifyWise wordmark, preserve aspect ratio
- Reduce table font size from 9pt to 8pt to prevent overflow
- Reduce table cell padding for tighter layout
- Skip flag explanations for intersectional tables (too verbose)
- Truncate long group names with ellipsis
- Wider first column for intersectional tables
- Shorten "Excluded (<threshold)" to just "Excluded"

- Extract shared helpers into report_templates/helpers.py
- Move verdict() and flag_explanation() to base class as defaults
- Remove duplicated code from ll144.py and generic.py
- Pass styles to _key_value_table instead of recreating on each call
- Fix checklist status docstring (pass/warning/info, not met/not_met)
- Standardize HTML markup: inline only (no <p> tags in Paragraph)
- Remove unused Optional import from base.py

- New colorado_sb169.json preset with 4 BIFSG race categories (White, Hispanic, Black, Asian/Pacific Islander)
- SB169Template overrides flag_explanation with rate-difference framing (percentage points vs White reference)
- Threshold 0.95 as conservative flagging heuristic; actual rate differences shown in flag text
- Colorado-specific checklist: ECDIS documentation, second-level testing, governance framework, DOI annual filing
- Regulatory context cites C.R.S. § 10-3-1104.9 and Reg 10-1-1
- Glossary covers ECDIS, BIFSG, first-level and second-level testing
- Limitations note that BIFSG is not implemented; the insurer provides race-coded data
- Add metric_label() method to base class; SB169 overrides to remove 4/5ths rule language
- Update _cover_page to use template.metric_label() instead of hardcoded labels
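
The rate-difference framing used for SB169 flags can be illustrated with a short sketch. It assumes the 0.95 threshold is applied to each group's rate relative to the White reference rate, and that the flag text reports the gap in percentage points. The function name, the input shape, and the rounding are hypothetical.

```python
def rate_difference_flags(rates, reference="White", threshold=0.95):
    """Return (group, difference in percentage points) for flagged groups.

    Assumption: a group is flagged when its rate divided by the reference
    rate falls below `threshold`.
    """
    ref = rates[reference]
    flags = []
    for group, rate in rates.items():
        if group == reference or ref == 0:
            continue
        if rate / ref < threshold:
            # Express the gap as percentage points versus the reference.
            flags.append((group, round((rate - ref) * 100, 1)))
    return flags
```

With rates of 0.40 (White), 0.39 (Hispanic) and 0.30 (Black), only the last group falls below the 0.95 heuristic in this sketch, and it is reported as 10 percentage points below the reference.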

LL144 output is byte-identical after changes (verified at 28786 bytes).
@gorkem-bwl gorkem-bwl force-pushed the feat/bias-audit-report-v2 branch from 6fd78f5 to bbd2128 Compare April 14, 2026 03:33
@gorkem-bwl gorkem-bwl merged commit 1fc79ee into develop Apr 14, 2026
@gorkem-bwl gorkem-bwl deleted the feat/bias-audit-report-v2 branch April 14, 2026 04:11