feat(bias-audit): enhanced PDF report with template architecture by gorkem-bwl · Pull Request #3734 · verifywise-ai/verifywise

gorkem-bwl · 2026-04-13T17:56:52Z

Summary

Complete overhaul of the bias audit PDF report generator. The report went from a raw data dump (5 pages) to an actionable compliance artifact (9 pages) with a template architecture for multi-framework support.

Depends on: #3725 (merge that first, then retarget this to develop)

New report sections

VerifyWise logo on cover page
Overall assessment verdict (green/amber/red) based on impact ratio severity
Scope section (in scope / out of scope)
Auto-evaluated compliance checklist against LL144 requirements
Impact ratio bar charts with threshold line
Results tables sorted worst-first with per-flag explanations
Recommended actions with legal citations
Regulatory context explaining LL144 requirements
Glossary of key terms (AEDT, impact ratio, 4/5ths rule, etc.)
Rewritten conclusion with actual findings summary

Architecture

Template pattern for multi-framework support:

report_templates/base.py — abstract base with default verdict/flag logic
report_templates/helpers.py — shared data extraction utilities
report_templates/ll144.py — NYC Local Law 144 content
report_templates/generic.py — fallback for custom frameworks
report_generator.py — layout engine calling template methods

Adding a new framework requires only creating a new template file and registering it in __init__.py. Currently supports 17 bias audit presets — LL144 gets the full template, all others fall back to generic.
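The registration step can be sketched as follows. This is a minimal illustration, not the code from this PR: only the names `BiasAuditReportTemplate`, `LL144Template`, `GenericTemplate`, and `get_template()` come from the commit messages. The `_REGISTRY` dict, the `"nyc_ll144"` preset key, the placeholder text, and the reduction to a single `regulatory_context()` method are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class BiasAuditReportTemplate(ABC):
    """Framework-specific report content; the layout engine only calls these methods."""

    @abstractmethod
    def regulatory_context(self) -> str:
        """Return the regulatory-context text for the report."""


class LL144Template(BiasAuditReportTemplate):
    def regulatory_context(self) -> str:
        # Placeholder text; the real template carries the LL144 content.
        return "LL144-specific regulatory context goes here."


class GenericTemplate(BiasAuditReportTemplate):
    def regulatory_context(self) -> str:
        # The generic fallback has no framework-specific regulatory context.
        return ""


# Hypothetical registry: preset name -> template class.
_REGISTRY = {"nyc_ll144": LL144Template}


def get_template(preset_name):
    """Resolve a preset name to a template instance, falling back to generic."""
    template_cls = _REGISTRY.get(preset_name, GenericTemplate)
    return template_cls()
```

Under this sketch, supporting another framework means adding one class and one `_REGISTRY` entry; the layout engine is untouched because it only sees the base-class interface.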

## Changes
- Create report_templates directory under bias_audit engine
- Add BiasAuditReportTemplate ABC with 13 abstract methods for framework-specific PDF report content (verdict, checklist, scope, glossary, regulatory context, etc.)
- Add template registry with get_template() that resolves preset names to template instances (LL144 or generic fallback)

## Benefits
- Enables multiple compliance frameworks (NYC LL144, EU AI Act) to provide their own report content without modifying the layout engine
- Clean separation between report layout and framework-specific content

## Changes
- Create LL144Template implementing BiasAuditReportTemplate base class
- Implement all required methods: verdict, scope_in/out, checklist, required_categories, threshold/flag explanations, recommended_actions, regulatory_context, glossary, conclusion_summary, additional_limitations
- Add module-level helpers for impact ratio scanning, category extraction, group counting, and category detection

## Details
The template provides all LL144-specific content including:
- Three-tier verdict system (green/amber/red) based on 4/5ths rule thresholds
- Six-item compliance checklist covering sex, race, intersectional analysis, auditor independence, and timeliness requirements
- Legal citations (NYC Admin Code §§ 20-870–20-874, 6 RCNY § 5-300/5-303, 42 U.S.C. § 2000e-2(k), EEOC Uniform Guidelines § 60-3.4.D)
- Seven-term glossary with official AEDT definition
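For readers unfamiliar with the 4/5ths rule mentioned in this commit, the underlying arithmetic is short. The sketch below computes each category's impact ratio as its selection rate divided by the highest selection rate, then flags ratios under 0.8. The function names and the input shape are hypothetical. The cutoffs that separate green, amber, and red in `LL144Template` are not stated in this PR, so only the pass/flag boundary is shown.

```python
FOUR_FIFTHS_THRESHOLD = 0.8


def impact_ratios(selection_rates):
    """Map each category to its selection rate divided by the highest rate."""
    best = max(selection_rates.values())
    if best <= 0:
        raise ValueError("at least one category needs a positive selection rate")
    return {name: rate / best for name, rate in selection_rates.items()}


def flagged_categories(ratios, threshold=FOUR_FIFTHS_THRESHOLD):
    """Return the names of categories whose impact ratio is below the threshold."""
    return sorted(name for name, ratio in ratios.items() if ratio < threshold)


# Illustrative numbers only, not data from a real audit.
rates = {"Male": 0.50, "Female": 0.35}
ratios = impact_ratios(rates)
flags = flagged_categories(ratios)
```

With these illustrative rates, the category with the highest rate gets a ratio of 1.0 and the other gets roughly 0.7, which falls under the 0.8 threshold and is therefore flagged.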

Implements GenericTemplate(BiasAuditReportTemplate) used when the compliance framework preset is unknown or custom. Provides the same verdict logic (green/amber/red) as LL144 but without framework-specific legal references, checklists, or required categories.

## Changes
- Three duplicated module-level helpers (_min_impact_ratio, _category_names_from_tables, _count_evaluated_groups)
- Simplified scope_in with metric label derived from underscored key
- Two-item scope_out, two-term glossary, no regulatory_context
- Empty checklist and required_categories (no framework mandate)
- Generic threshold and flag explanations without 4/5ths rule refs
- Concise recommended_actions and conclusion_summary

## Changes
- Integrate template system: resolve preset name to template instance and thread it through all section functions
- Add logo to cover page (falls back to spacer if file missing)
- Add new sections: overall assessment verdict table, scope (in/out), compliance checklist, impact ratio bar charts, recommended actions, regulatory context, glossary, and conclusion with limitations
- Modify existing sections: executive summary now includes impact ratio charts; methodology uses template.threshold_explanation(); results sorts rows by impact ratio ascending and appends flag explanations
- Delete _limitations() (replaced by _conclusion() with template support)
- Add new color constants (verdict green/amber/red, check pass/warn/info) and paragraph styles for verdict, checklist, actions, and glossary

## Benefits
- Framework-specific content (LL144, generic, future frameworks) is now driven by template classes rather than hardcoded in the generator
- Report is significantly more detailed: verdict summary, scope, checklist, bar charts, regulatory context, glossary, and structured conclusion
- Function signature generate_pdf_report(audit) is unchanged
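The worst-first ordering of the results tables amounts to an ascending sort on impact ratio. A minimal sketch follows, assuming each row is a dict with `category_name` and `impact_ratio` keys. The `impact_ratio` key, the `sort_worst_first` name, and the choice to place rows without a ratio (for example excluded groups) at the end are assumptions for illustration, not details confirmed by this PR.

```python
def sort_worst_first(rows):
    """Order rows by ascending impact ratio; rows with no ratio go last."""
    return sorted(
        rows,
        # False sorts before True, so rows that have a ratio come first.
        key=lambda row: (row["impact_ratio"] is None, row["impact_ratio"] or 0.0),
    )


# Illustrative rows only.
rows = [
    {"category_name": "A", "impact_ratio": 1.00},
    {"category_name": "B", "impact_ratio": 0.62},
    {"category_name": "C", "impact_ratio": None},
]
ordered_names = [row["category_name"] for row in sort_worst_first(rows)]
```

Because `sorted` is stable, rows with equal ratios keep their original relative order.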

- category_name not group for row names
- total_applicants not total_records for record counts
- auditorIndependence not auditorType for independence check
- intersectional category_key filtering
- cross not dimensions for intersectional config

- Replace logo with proper VerifyWise wordmark, preserve aspect ratio
- Reduce table font size from 9pt to 8pt to prevent overflow
- Reduce table cell padding for tighter layout
- Skip flag explanations for intersectional tables (too verbose)
- Truncate long group names with ellipsis
- Wider first column for intersectional tables
- Shorten "Excluded (<threshold)" to just "Excluded"
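Truncating long group names with an ellipsis could look like the helper below. This is a sketch only: the name `truncate_label` and the 30-character default are assumptions, and the actual cutoff used in the report is not stated in this PR.

```python
def truncate_label(text, max_chars=30):
    """Shorten text to at most max_chars characters, ending with an ellipsis."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    if len(text) <= max_chars:
        return text
    # Reserve one character for the ellipsis and drop trailing spaces before it.
    return text[: max_chars - 1].rstrip() + "…"
```

Intersectional group names are the likely candidates for truncation, since they combine several category labels into one string.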

- Extract shared helpers into report_templates/helpers.py
- Move verdict() and flag_explanation() to base class as defaults
- Remove duplicated code from ll144.py and generic.py
- Pass styles to _key_value_table instead of recreating on each call
- Fix checklist status docstring (pass/warning/info, not met/not_met)
- Standardize HTML markup: inline only (no <p> tags in Paragraph)
- Remove unused Optional import from base.py

- New colorado_sb169.json preset with 4 BIFSG race categories (White, Hispanic, Black, Asian/Pacific Islander)
- SB169Template overrides flag_explanation with rate-difference framing (percentage points vs White reference)
- Threshold 0.95 as conservative flagging heuristic; actual rate differences shown in flag text
- Colorado-specific checklist: ECDIS documentation, second-level testing, governance framework, DOI annual filing
- Regulatory context cites C.R.S. § 10-3-1104.9 and Reg 10-1-1
- Glossary covers ECDIS, BIFSG, first-level and second-level testing
- Limitations note BIFSG not implemented; insurer provides race-coded data
- Add metric_label() method to base class; SB169 overrides to remove 4/5ths rule language
- Update _cover_page to use template.metric_label() instead of hardcoded labels

LL144 output is byte-identical after changes (verified at 28786 bytes).
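The difference between the default ratio framing and the SB169 rate-difference framing can be sketched as a base-class default plus an override. The class names, method signature, and message wording below are hypothetical; only the idea of overriding `flag_explanation` to report percentage points against the White reference group comes from the commit message.

```python
class BaseTemplateSketch:
    """Stand-in for the base template's default flag text (ratio framing)."""

    def flag_explanation(self, group, rate, reference_rate):
        ratio = rate / reference_rate
        return f"{group}: impact ratio {ratio:.2f} is below the threshold."


class SB169TemplateSketch(BaseTemplateSketch):
    """Override that reports the gap in percentage points instead of a ratio."""

    def flag_explanation(self, group, rate, reference_rate):
        diff_pp = (rate - reference_rate) * 100  # percentage points
        direction = "lower" if diff_pp < 0 else "higher"
        return (
            f"{group}: rate is {abs(diff_pp):.1f} percentage points "
            f"{direction} than the White reference group."
        )


# Illustrative rates only: 60% for the group, 70% for the reference group.
base_text = BaseTemplateSketch().flag_explanation("Black", 0.60, 0.70)
sb169_text = SB169TemplateSketch().flag_explanation("Black", 0.60, 0.70)
```

The same pair of rates yields a ratio of about 0.86 in the default framing and a 10.0 percentage point gap in the override, which is why the layout engine can stay unchanged while the wording differs per framework.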

gorkem-bwl requested a review from gorkemcetin April 13, 2026 18:14

Base automatically changed from feat/eval-model-inventory-link to develop April 14, 2026 03:31

gorkem-bwl added this to the 2.3 milestone Apr 14, 2026

gorkem-bwl added 9 commits April 13, 2026 23:33

docs(bias-audit): add design spec and implementation plan for report v2

f160d89

gorkem-bwl force-pushed the feat/bias-audit-report-v2 branch from 6fd78f5 to bbd2128 Compare April 14, 2026 03:33

gorkem-bwl merged commit 1fc79ee into develop Apr 14, 2026

gorkem-bwl deleted the feat/bias-audit-report-v2 branch April 14, 2026 04:11

gorkem-bwl merged 9 commits into develop from feat/bias-audit-report-v2
