Link LLM evals to model inventory #3725

Merged
gorkem-bwl merged 12 commits into develop from feat/eval-model-inventory-link
Apr 14, 2026

Conversation

@gorkem-bwl
Contributor

Summary

  • Adds optional "Link to model inventory" dropdown on both experiment and bias audit creation modals
  • New "Evaluations" tab on model inventory page showing all linked experiments and bias audits with risk nudge banner
  • "Linked/Unlinked" indicator column on experiment and bias audit list tables
  • Alembic migration adds nullable model_inventory_id column to llm_evals_experiments and llm_evals_bias_audits
  • New GET /api/modelInventory/evaluations endpoint (Servers reads from llm_evals_* tables, no cross-service writes)
  • EvalServer CRUD and API layer updated to accept and return model_inventory_id

Design decisions

  • Optional only — dropdown defaults to "None", zero friction for users who don't want to link
  • User-driven risk awareness — flagged evals show a warning banner ("Consider adding to risk register"), no auto-creation of risks
  • No cross-service writes — EvalServer stores the FK, Servers reads llm_evals_* tables directly (same Postgres)
  • No FK constraint — application-level validation only; graceful "Model removed" handling if a linked model is deleted
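
The "No FK constraint" decision means the UI has to tolerate a stored `model_inventory_id` that no longer resolves. A minimal sketch of that lookup, assuming hypothetical `ModelInventoryRecord` and `LinkedEval` shapes (only `model_inventory_id` and the "Model removed" label come from this PR):

```typescript
// Hypothetical shapes for illustration; only model_inventory_id is from the PR.
interface ModelInventoryRecord {
  id: number;
  name: string;
}

interface LinkedEval {
  id: string;
  model_inventory_id: number | null;
}

// Resolve the label to show for an eval's linked model.
// Without a database FK, the referenced model may have been deleted,
// so a missing lookup falls back to "Model removed" instead of throwing.
function resolveLinkedModelLabel(
  evalRow: LinkedEval,
  models: Map<number, ModelInventoryRecord>,
): string {
  if (evalRow.model_inventory_id === null) return "Unlinked";
  const model = models.get(evalRow.model_inventory_id);
  return model ? model.name : "Model removed";
}

// Application-level validation: accept "no link" or an ID that exists right now.
function isValidModelLink(
  modelInventoryId: number | null,
  models: Map<number, ModelInventoryRecord>,
): boolean {
  return modelInventoryId === null || models.has(modelInventoryId);
}
```

The trade-off is that deleting a model never fails because of linked evals, at the cost of handling dangling IDs on every read.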

Spec for linking LLM evaluations (experiments + bias audits) to model
inventory records via an optional dropdown at eval creation time. Includes
evaluations tab on model detail page and risk nudge for flagged results.

9 tasks covering: Alembic migration, EvalServer CRUD + API updates,
frontend dropdowns on both experiment and bias audit modals, Servers
endpoint for model evaluations, Evaluations tab on model detail page
with risk nudge banner, and unlinked indicators on eval lists.

## Changes
- bias_audits.py: add model_inventory_id param to create_bias_audit, include in INSERT, all SELECT/RETURNING clauses, and _row_to_dict (as modelInventoryId)
- evaluation_logs.py: add model_inventory_id param to create_experiment, include in INSERT/RETURNING, get_experiment_by_id SELECT/return dict, and get_experiments SELECT/return dict
## Changes
- Added model_inventory_id field to Experiment interface and createExperiment method in evaluationLogsService.ts
- Added getAllEntities import to NewExperimentModal for fetching model inventory
- Added modelInventories and selectedModelInventoryId state variables
- Added useEffect to fetch model inventories when modal opens
- Added optional "Link to model inventory" dropdown in Step 1 (model config section)
- Included model_inventory_id in the experiment creation payload
- Reset selectedModelInventoryId on form reset

## Benefits
- Users can now link experiments to a specific model in their inventory
- Dropdown is optional — defaults to None if not selected
- Non-critical fetch failure is silently handled (dropdown stays empty)
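
A rough sketch of how the optional dropdown value could be turned into the `model_inventory_id` payload field. The empty-string sentinel for "None" and the `buildExperimentPayload` helper are assumptions for illustration, not the modal's actual code:

```typescript
// Assumed sentinel for the "None" option; the real modal may use another value.
const NONE_VALUE = "";

// Select components often emit string values, so normalise with String()
// before comparing, then parse to a positive integer ID or fall back to null.
function toModelInventoryId(selected: string | number): number | null {
  const raw = String(selected);
  if (raw === NONE_VALUE) return null;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : null;
}

// Hypothetical payload builder: model_inventory_id stays null unless the user picked a model.
function buildExperimentPayload(name: string, selected: string | number) {
  return { name, model_inventory_id: toModelInventoryId(selected) };
}
```

Sending `null` rather than omitting the field keeps the "unlinked" case explicit on the wire; either choice would work with a nullable column.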
## Changes
- Added `modelInventoryId?: number` to `CreateBiasAuditConfig` interface in biasAuditService.ts
- Imported `getAllEntities` from entity.repository in NewBiasAuditModal
- Added state for model inventories list and selected model inventory ID
- Added useEffect to fetch model inventories when modal opens
- Added model inventory link Select dropdown in Step 2 (system info step)
- Included `modelInventoryId` in the submit config
- Reset `selectedModelInventoryId` on modal close
## Changes
- Create Servers/utils/modelEvaluations.utils.ts: raw SQL queries for
  llm_evals_experiments and llm_evals_bias_audits filtered by
  model_inventory_id + organization_id
- Create Servers/controllers/modelEvaluations.ctrl.ts: thin controller
  with logProcessing/logSuccess/logFailure and org-scoped auth
- Modify Servers/routes/modelInventory.route.ts: register
  GET /:id/evaluations before /:id to prevent param shadowing

## Benefits
- Enables the model inventory page to show linked evaluations and bias
  audits in a single authenticated, org-isolated request
## Changes
- Added getAllLinkedEvaluations() util joining llm_evals_experiments and llm_evals_bias_audits with model_inventories
- Added getAllModelEvaluations controller following existing logProcessing/logSuccess/logFailure pattern
- Added GET /modelInventory/evaluations route before /:id to avoid Express ID collision
- Created modelEvaluations.repository.ts with ModelEvaluation and ModelEvaluationsResponse types
- Created ModelEvaluationsTab component with combined sorted table, flagged-risk banner, and empty state
- Integrated Evaluations tab into ModelInventory/index.tsx (TabBar, URL detection, isBuiltInTab, content render)
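
To illustrate why the static `/evaluations` route is registered before `/:id`, here is a simplified first-match router. It is a toy model of first-match routing rather than Express itself, and `getModelInventoryById` is a hypothetical handler name:

```typescript
type Route = { pattern: string; handler: string };

// Toy first-match router: a ":name" segment matches any single path segment,
// and the first route whose segments all match wins.
function matchRoute(routes: Route[], path: string): string | undefined {
  const parts = path.split("/").filter(Boolean);
  for (const route of routes) {
    const segs = route.pattern.split("/").filter(Boolean);
    if (segs.length !== parts.length) continue;
    if (segs.every((seg, i) => seg.startsWith(":") || seg === parts[i])) {
      return route.handler;
    }
  }
  return undefined;
}

const evaluationsRoute: Route = { pattern: "/evaluations", handler: "getAllModelEvaluations" };
const byIdRoute: Route = { pattern: "/:id", handler: "getModelInventoryById" }; // hypothetical name

// Static route first: "/evaluations" reaches the evaluations handler.
const staticFirst: Route[] = [evaluationsRoute, byIdRoute];
// Param route first: "/evaluations" is captured as id = "evaluations".
const paramFirst: Route[] = [byIdRoute, evaluationsRoute];
```

With `paramFirst`, a request for `/evaluations` would be handled as a lookup for a model whose ID is the string "evaluations", which is the collision the route ordering avoids.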

## Benefits
- Shows all linked LLM experiments and bias audits in one place on the Model Inventory page
- Flags evaluations with failed metrics or flagged bias groups with an amber warning banner
- Follows existing tab pattern (model-risks, evidence-hub) — no new patterns introduced
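
A minimal sketch of the combined table and flagged-risk banner logic. The `EvalRow` shape and its `failedMetrics` / `flaggedGroups` counters are assumptions; the actual `ModelEvaluation` type in modelEvaluations.repository.ts may differ:

```typescript
// Hypothetical row shape; field names are assumptions for illustration.
interface EvalRow {
  kind: "experiment" | "bias_audit";
  name: string;
  createdAt: string; // ISO 8601 timestamp
  failedMetrics: number; // count of failed metrics (experiments)
  flaggedGroups: number; // count of flagged bias groups (bias audits)
}

// An evaluation is flagged if it has any failed metric or flagged bias group.
function isFlagged(row: EvalRow): boolean {
  return row.failedMetrics > 0 || row.flaggedGroups > 0;
}

// Merge both lists into one table, newest first. Copies before sorting
// so the input arrays are not mutated.
function combineNewestFirst(experiments: EvalRow[], biasAudits: EvalRow[]): EvalRow[] {
  return [...experiments, ...biasAudits].sort(
    (a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt),
  );
}

// Returns the banner text, or null when nothing is flagged (no banner shown).
function riskBannerText(rows: EvalRow[]): string | null {
  const flagged = rows.filter(isFlagged).length;
  if (flagged === 0) return null;
  return `${flagged} flagged evaluation(s). Consider adding to risk register.`;
}
```

The banner only nudges; nothing here creates a risk record, in line with the "no auto-creation of risks" design decision.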
## Changes
- Added LINKED MODEL column to experiments table in ProjectExperiments.tsx
- Added LINKED MODEL column to bias audits table in BiasAuditsList.tsx
- Extended IEvaluationRow interface with linkedModel field
- Extended BiasAuditSummary interface with modelInventoryId field
- Updated EvaluationTable column widths to accommodate new column
- Rendered green "Linked" badge or grey "Unlinked" label based on model_inventory_id / modelInventoryId presence

## Benefits
- Users can quickly see at a glance whether an experiment or bias audit is linked to a model inventory record
- Consistent visual indicator across both eval list views
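
The badge decision can be sketched as a small pure function that accepts either field spelling (`model_inventory_id` on experiments, `modelInventoryId` on bias audits). The `LinkBadge` type and its `tone` values are assumptions for illustration:

```typescript
// Hypothetical badge descriptor; "tone" would map to palette tokens in the UI.
type LinkBadge = { label: "Linked" | "Unlinked"; tone: "success" | "neutral" };

interface MaybeLinkedRow {
  model_inventory_id?: number | null; // snake_case, as on experiments
  modelInventoryId?: number | null; // camelCase, as on bias audits
}

// A row counts as linked when either field holds a non-null ID.
function linkBadge(row: MaybeLinkedRow): LinkBadge {
  const id = row.model_inventory_id ?? row.modelInventoryId ?? null;
  return id !== null
    ? { label: "Linked", tone: "success" }
    : { label: "Unlinked", tone: "neutral" };
}
```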

Fix Select onChange type mismatch in NewBiasAuditModal and
NewExperimentModal by removing explicit type annotation and
using String() cast for value comparison.

- Use Promise.all for parallel DB queries in modelEvaluations.utils.ts
- Use palette.status.success tokens instead of hardcoded hex colors
- Use apiServices from networkServices instead of raw customAxios
- Remove redundant JSDoc comments that restate the function name
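
For reference, the `Promise.all` change might look roughly like the following. The injected `QueryFn`, the `$1`/`$2` placeholder style, and the function name are assumptions; only the table names and the `model_inventory_id` + `organization_id` filter come from this PR:

```typescript
// Hypothetical query function injected by the caller (e.g. a thin DB wrapper).
type QueryFn = (sql: string, params: unknown[]) => Promise<Record<string, unknown>[]>;

// Run both reads concurrently instead of awaiting them one after the other.
// The two queries are independent, so total latency is roughly the slower of the two.
async function getEvaluationsForModel(
  query: QueryFn,
  modelInventoryId: number,
  organizationId: number,
) {
  const params = [modelInventoryId, organizationId];
  const [experiments, biasAudits] = await Promise.all([
    query(
      "SELECT * FROM llm_evals_experiments WHERE model_inventory_id = $1 AND organization_id = $2",
      params,
    ),
    query(
      "SELECT * FROM llm_evals_bias_audits WHERE model_inventory_id = $1 AND organization_id = $2",
      params,
    ),
  ]);
  return { experiments, biasAudits };
}
```

Note that `Promise.all` rejects as soon as either query fails, so the caller gets no partial result on error; that suits a single org-scoped response.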
@gorkem-bwl gorkem-bwl requested a review from gorkemcetin April 13, 2026 03:32
@gorkem-bwl gorkem-bwl added this to the 2.3 milestone Apr 13, 2026
Collaborator

@MuhammadKhalilzadeh MuhammadKhalilzadeh left a comment


Code-wise, the files look good to me, @gorkem-bwl.
@HarshP4585, would you please run this branch to check functionality and behavior, and also the migrations?
Thank you.

@MuhammadKhalilzadeh MuhammadKhalilzadeh requested review from HarshP4585 and removed request for gorkemcetin April 13, 2026 07:15
@gorkem-bwl gorkem-bwl merged commit 90e5ced into develop Apr 14, 2026
6 of 7 checks passed
@gorkem-bwl gorkem-bwl deleted the feat/eval-model-inventory-link branch April 14, 2026 03:31