[OPIK-4845] [FE] Playground Redesign PR 3: Top Bar + Run on Dataset Modal + Per-Prompt Run #5750

Open
miguelgrc wants to merge 3 commits into miguelg/OPIK-4655-playground-redesign-full from miguelg/OPIK-4845-playground-top-bar-run-modal-per-prompt
Conversation

@miguelgrc
Contributor

@miguelgrc miguelgrc commented Mar 19, 2026

Summary

Restructure the playground page with a new top bar, "Run on dataset" modal dialog, per-prompt run/stop, and redesigned output display. This is the third PR in the playground redesign series, building on PR 1 (OPIK-4843).

Details

Top Bar (PlaygroundHeader)

  • Extracted from PlaygroundPrompts into standalone component
  • Experiment chip: dataset name + version, pencil to edit config, X to leave experiment mode with confirmation dialog (links to Experiments page)
  • "Test on dataset" button opens RunOnDatasetDialog
  • Run all / Re-run / Stop all buttons with inline hotkey display (⇧ ⏎)
  • Ghost Reset button separated by vertical separator
  • Full-width gray background extending beyond content area via CSS variable system

Run on Dataset Modal (RunOnDatasetDialog)

  • Dataset selector with version support (DatasetVersionSelectBox)
  • Metrics combobox (MetricSelector rewrite): purple RemovableTag chips, overflow handling (+N), smooth horizontal scroll on expand, delete without toggle
  • Filters button for dataset column filtering
  • Experiment name prefix: auto-fills as DatasetName-YYYY-MM-DD, persisted in Zustand store, restored on reopen
  • Validation: disabled when prompts invalid, experiment running, or dataset empty — tooltip explains reason
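The prefix auto-fill and the disabled-state rules above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `buildDefaultPrefix`, `getRunDisabledReason`, the `RunValidationInput` shape, the tooltip wording, and the use of UTC for the date are all assumptions made for the example.

```typescript
// Hypothetical helpers; names and messages are illustrative, not taken from the PR.

// Format a date as YYYY-MM-DD. Using UTC is an assumption made for determinism.
const formatDate = (date: Date): string => date.toISOString().slice(0, 10);

// Default experiment name prefix in the DatasetName-YYYY-MM-DD shape.
const buildDefaultPrefix = (datasetName: string, date: Date): string =>
  `${datasetName}-${formatDate(date)}`;

interface RunValidationInput {
  promptsValid: boolean;
  experimentRunning: boolean;
  datasetItemCount: number;
}

// Returns the tooltip text for a disabled Run button, or null when running is allowed.
const getRunDisabledReason = (input: RunValidationInput): string | null => {
  if (!input.promptsValid) return "Fix the invalid prompts before running";
  if (input.experimentRunning) return "An experiment is already running";
  if (input.datasetItemCount === 0) return "The selected dataset is empty";
  return null;
};
```

Returning the reason as a string (or `null`) keeps the disabled flag and the tooltip text derived from a single check, so they cannot drift apart.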

Per-Prompt Run (non-experiment mode)

  • isRunning: boolean → isRunningMap: Record<string, boolean> in store
  • runSingle / stopSingle in useActionButtonActions
  • Per-prompt Run/Stop button in output header strip with validation tooltip
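To illustrate the move from a single flag to a per-prompt map, here is a small framework-free sketch. The function names (`setPromptRunning`, `selectIsPromptRunning`, `selectIsAnyRunning`) are hypothetical stand-ins for store actions and selectors; the actual Zustand store in the PR may be shaped differently.

```typescript
// Sketch only: plain functions standing in for store actions and selectors.
type IsRunningMap = Record<string, boolean>;

// Immutable update, the shape a store setter would typically use.
const setPromptRunning = (
  map: IsRunningMap,
  promptId: string,
  running: boolean,
): IsRunningMap => ({ ...map, [promptId]: running });

// Per-prompt selector; prompts that never ran are treated as not running.
const selectIsPromptRunning = (map: IsRunningMap, promptId: string): boolean =>
  map[promptId] ?? false;

// Derived flag that a "Stop all" button could rely on.
const selectIsAnyRunning = (map: IsRunningMap): boolean =>
  Object.values(map).some(Boolean);
```

Keeping the map keyed by prompt id means a single prompt can start or stop without touching the state of its neighbours, while a global "any running" flag is still cheap to derive.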

Output Redesign (PlaygroundPromptOutput)

  • Colored square + "Output A" label matching prompt colors
  • Provider icon + pretty model label (resolved via useLLMProviderModelsData)
  • Clock icon + duration, Coins icon + token count from output.usage
  • Model/provider/usage captured at generation time, not current prompt state
  • Empty state: "No runs yet" with colored icon per prompt
  • White background, border-r matching prompt columns, no card wrapper
  • Value + usage cleared on re-run for immediate loading animation
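The "captured at generation time" and "cleared on re-run" behaviour can be pictured with the sketch below. The field names follow the usage field listed under Store Changes (duration, totalTokens, model, provider); everything else, including the helper names, the nullable shape, and the assumption that duration is in milliseconds, is illustrative.

```typescript
// Assumed shapes; only the four usage field names come from the PR description.
interface OutputUsage {
  duration: number; // assumed to be milliseconds
  totalTokens: number;
  model: string;
  provider: string;
}

interface PlaygroundOutput {
  value: string | null;
  usage: OutputUsage | null;
}

// On (re-)run, clear both fields so the UI can show the loading state at once.
const startRun = (): PlaygroundOutput => ({ value: null, usage: null });

// On completion, store a snapshot of the usage so later prompt edits do not alter it.
const finishRun = (value: string, usage: OutputUsage): PlaygroundOutput => ({
  value,
  usage: { ...usage },
});

// Hypothetical display helper for the duration label.
const formatDuration = (ms: number): string => `${(ms / 1000).toFixed(1)}s`;
```

Because `finishRun` copies the usage object, switching the prompt to another model afterwards leaves the label on the existing output unchanged.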

Experiment Mode Layout

  • Conditional layout: experiment mode separates prompts row from full-width outputs
  • Add variant strip only spans prompts area in experiment mode
  • PlaygroundExperimentOutputActions: progress bar OR experiment results header (mutually exclusive strip)
  • Experiment results header links to comparison page, displays stored prefix
  • Colored dot indicators on output table column headers (PlaygroundOutputColumnHeader)
  • Full-width borderless table via PageBodyStickyTableWrapper

Architecture & Refactoring

  • useActionButtonActions lifted to PlaygroundPage (eliminates fragile actionsRef pattern)
  • Deferred run pattern (pendingRun state) for dialog-triggered runs with onDeferredRunAll callback
  • Keyboard shortcut (Shift+Enter) in PlaygroundPage, run only (no stop)
  • transformDataColumnFilters extracted to lib/filters.ts
  • File reorganization: MetricSelector, useActionButtonActions, usePromptDatasetItemCombination moved to PlaygroundPage/ level
  • PlaygroundOutput → PlaygroundPromptOutput, PlaygroundOutputActions → PlaygroundExperimentOutputActions
  • experimentNamePrefix added to Zustand store
  • ConfirmDialog.description accepts ReactNode (backward compatible)
  • HotkeyDisplay xs size variant
  • DatasetVersionSelectBox: version right-aligned in trigger
  • Shared constants: DEFAULT_LOADED_DATASETS, MAX_VERSIONS_TO_FETCH
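The deferred run pattern listed above (a run requested from the dialog that waits until dataset items are available) might look something like this outside of React. `createDeferredRunner` and its method names are hypothetical, and the rule "run once at least one item has loaded" is an assumption; the PR implements this with component state (`pendingRun`) and an `onDeferredRunAll` callback.

```typescript
// Hypothetical, framework-free model of the pendingRun idea.
const createDeferredRunner = (runAll: () => void) => {
  let pendingRun = false;

  return {
    // Called by the dialog: remember the request instead of running immediately.
    requestDeferredRun: (): void => {
      pendingRun = true;
    },
    // Called whenever dataset items finish loading.
    onItemsLoaded: (itemCount: number): void => {
      // Assumed rule: only fire when a run is pending and there is something to run on.
      if (!pendingRun || itemCount === 0) return;
      pendingRun = false;
      runAll();
    },
  };
};
```

Resetting the flag before calling `runAll` ensures that a later reload of the same items does not trigger a second run.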

Store Changes (PlaygroundStore)

  • isRunning → isRunningMap with per-prompt selectors (useIsPromptRunning)
  • usage field on PlaygroundOutput: duration, totalTokens, model, provider
  • experimentNamePrefix state with setter
  • Removed pendingRunAll (replaced with local component state)

Issues

Change checklist

  • I have reviewed my own code
  • Lint and type checks pass locally
  • All frontend tests pass (979/979)
  • No new dependencies added
  • Changes are scoped to playground page components

Testing

  • Non-experiment mode: per-prompt Run/Stop, colored output labels with model/duration/tokens, Shift+Enter runs all
  • "Test on dataset" → modal opens, select dataset (version shown), select metrics (combobox with tags), prefix auto-fills
  • Run experiment → chip appears immediately, progress bar shows, results table displays with colored column dots
  • Re-run → same dataset/metrics, experiment results header updates
  • Click experiment chip → modal reopens with current config, prefix preserved
  • Click X on chip → confirmation dialog with Experiments link → leave clears state
  • Reset → clears everything, fresh prompt
  • Disabled states: Run button shows tooltip when prompts invalid, dialog Run button disabled when running
  • Layout: header extends full width, table full width in experiment mode, add variant strip scoped to prompts in experiment mode
  • Keyboard: Shift+Enter runs (doesn't stop), works in both modes

Documentation

No documentation changes required. This is a UI restructuring of existing playground functionality.

@miguelgrc miguelgrc requested a review from a team as a code owner March 19, 2026 17:34
…odal + Per-Prompt Run

Restructure the playground page with a new top bar, "Run on dataset" modal dialog,
per-prompt run/stop, and redesigned output display.

Top Bar (PlaygroundHeader):
- Extract header from PlaygroundPrompts into standalone PlaygroundHeader component
- Add experiment chip (dataset name + version, pencil to edit, X to leave)
- "Test on dataset" button opens RunOnDatasetDialog
- Run all / Re-run / Stop all with inline hotkey display (Shift+Enter)
- Ghost Reset button separated by vertical separator
- Full-width gray background extending beyond content area
- "Leave experiment mode?" confirmation with link to Experiments page

Run on Dataset Modal (RunOnDatasetDialog):
- Dataset selector with version support (DatasetVersionSelectBox)
- Metrics combobox (MetricSelector rewrite) with purple RemovableTag chips,
  overflow handling (+N), smooth scroll on expand, delete without toggle
- Filters button for dataset column filtering
- Experiment name prefix auto-fill (DatasetName-YYYY-MM-DD), persisted in store
- Validation: disabled when prompts invalid, running, or dataset empty
- Tooltips on disabled Run button explaining the reason

Per-Prompt Run (non-experiment mode):
- isRunning: boolean -> isRunningMap: Record<string, boolean> in store
- runSingle/stopSingle in useActionButtonActions
- Per-prompt Run/Stop button in output header strip
- Per-prompt validation with tooltip (no model, empty messages)

Output Redesign (PlaygroundPromptOutput):
- Colored square + "Output A" label matching prompt colors
- Provider icon + pretty model label (from useLLMProviderModelsData)
- Clock icon + duration, Coins icon + token count (from output.usage)
- Model/provider/usage captured at generation time, not from current prompt
- Empty state: "No runs yet" with colored icon
- White background, no card wrapper, border-r matching prompt columns
- Value + usage cleared on re-run for immediate loading animation

Experiment Mode Layout:
- Conditional layout: experiment mode separates prompts from outputs
- Table takes full screen width in experiment mode
- Add variant strip only spans prompts area (not outputs)
- PlaygroundExperimentOutputActions: progress bar OR experiment results header
- Experiment results header with link to comparison page, using stored prefix
- Colored dot indicators on output table column headers

Architecture & Refactoring:
- useActionButtonActions lifted to PlaygroundPage (single instance)
- Deferred run pattern (pendingRun state) for dialog-triggered runs
- onRunAll + onDeferredRunAll callbacks (immediate vs wait-for-items)
- Keyboard shortcut (Shift+Enter) in PlaygroundPage, run only (no stop)
- transformDataColumnFilters extracted to lib/filters.ts
- File reorganization: MetricSelector, useActionButtonActions,
  usePromptDatasetItemCombination moved to PlaygroundPage level
- PlaygroundOutput renamed to PlaygroundPromptOutput
- PlaygroundOutputActions renamed to PlaygroundExperimentOutputActions
- experimentNamePrefix added to Zustand store
- ConfirmDialog.description changed from string to ReactNode
- HotkeyDisplay xs size variant added
- DatasetVersionSelectBox: version name/hash right-aligned in trigger
- DEFAULT_LOADED_DATASETS imported from shared source
- MAX_VERSIONS_TO_FETCH exported from useValidatedDatasetVersion

Store Changes (PlaygroundStore):
- isRunning -> isRunningMap with per-prompt selectors
- Added usage field (duration, totalTokens, model, provider) to PlaygroundOutput
- Added experimentNamePrefix state
- Removed pendingRunAll (replaced with local state)
@miguelgrc miguelgrc force-pushed the miguelg/OPIK-4845-playground-top-bar-run-modal-per-prompt branch from 4d7f6e3 to eab4d2c Compare March 19, 2026 17:36
@miguelgrc miguelgrc added the test-environment Deploy Opik adhoc environment label Mar 19, 2026
@github-actions
Contributor

🔄 Test environment deployment process has started

Phase 1: Deploying base version 1.10.40-4499 (from main branch) if environment doesn't exist
Phase 2: Building new images from PR branch miguelg/OPIK-4845-playground-top-bar-run-modal-per-prompt
Phase 3: Will deploy newly built version after build completes

Comment on lines +260 to +271
/**
 * Transform data column filters from "data.columnName" format to backend format.
 * Converts field="data.columnName" to field="data" with key="columnName".
 * Used for dataset item filtering in the playground.
 */
export const transformDataColumnFilters = (filters: Filter[]): Filter[] => {
  const dataFieldPrefix = `${COLUMN_DATA_ID}.`;
  return filters.map((filter) => {
    if (filter.field.startsWith(dataFieldPrefix)) {
      const columnKey = filter.field.slice(dataFieldPrefix.length);
      return { ...filter, field: COLUMN_DATA_ID, key: columnKey };
    }
Contributor

transformDataColumnFilters in the shared filters.ts rewrites data.columnName → field="data", key="columnName" and is dataset-specific. Should we move it next to DatasetItemPage so filters.ts stays generic?

Finding type: Avoid tight coupling | Severity: 🟠 Medium


Prompt for AI Agents:

In apps/opik-frontend/src/lib/filters.ts around lines 260 to 273, the
transformDataColumnFilters function is dataset-page specific and couples the shared
filters utility to DatasetItemPage behavior. Remove this function from filters.ts and
its local imports (COLUMN_DATA_ID if only used by that function). Create a new helper
file next to the dataset page (e.g.
apps/opik-frontend/src/pages/dataset/DatasetItemPage.utils.ts or inside the
DatasetItemPage component file) containing transformDataColumnFilters with the same
logic and export it. Update all call sites that previously imported
transformDataColumnFilters from lib/filters.ts to import it from the new
dataset-specific helper, run TypeScript/ESLint fixes, and ensure tests/build pass.

Contributor Author

This is only used from playground pages, so I'd say it makes more sense to have it here

Contributor

Got it, that context makes sense—I'll keep your reasoning in mind and save this to memory once the PR is merged.
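For readers following this thread, the transformation under discussion can be tried in isolation with the sketch below. The diff excerpt in the review is cut off, so the pass-through `return filter;` branch is an assumption, and the `Filter` type and the `COLUMN_DATA_ID` value of `"data"` are simplified stand-ins for the project's own definitions.

```typescript
// Simplified stand-ins for the project's constant and type (assumptions).
const COLUMN_DATA_ID = "data";

interface Filter {
  field: string;
  key?: string;
  operator: string;
  value: string;
}

const transformDataColumnFilters = (filters: Filter[]): Filter[] => {
  const dataFieldPrefix = `${COLUMN_DATA_ID}.`;
  return filters.map((filter) => {
    if (filter.field.startsWith(dataFieldPrefix)) {
      // "data.question" becomes field "data" with key "question".
      const columnKey = filter.field.slice(dataFieldPrefix.length);
      return { ...filter, field: COLUMN_DATA_ID, key: columnKey };
    }
    // Assumed: filters on other fields pass through untouched.
    return filter;
  });
};

const transformed = transformDataColumnFilters([
  { field: "data.question", operator: "contains", value: "refund" },
  { field: "id", operator: "=", value: "123" },
]);
```

In this example the first filter is rewritten to target the `data` field with key `question`, while the `id` filter is left as it was.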

@CometActions
Collaborator

Test environment is now available!

To configure additional environment variables for your environment, run [Deploy Opik AdHoc Environment workflow](https://github.com/comet-ml/comet-deployment/actions/workflows/deploy_opik_adhoc_env.yaml)

Access Information

The deployment has completed successfully and the version has been verified.

Extract shared dataset filter column builder to lib/filters.ts to
deduplicate the pattern across RunOnDatasetDialog and dataset pages.
Accepts optional includeId parameter for dataset pages that need an
ID filter column.
Comment on lines +265 to +273
export const buildDatasetFilterColumns = (
  datasetColumns: DatasetItemColumn[],
  includeId = false,
) => {
  const dataFilterColumns = datasetColumns.map((c) => ({
    id: `${COLUMN_DATA_ID}.${c.name}`,
    label: c.name,
    type: COLUMN_TYPE.string,
  }));
Contributor

buildDatasetFilterColumns hardcodes type: COLUMN_TYPE.string; should we map column.types via mapDynamicColumnTypesToColumnType so filters use correct types/operators?

type: mapDynamicColumnTypesToColumnType(column.types)

Finding type: Type Inconsistency | Severity: 🔴 High


Prompt for AI Agents:

In apps/opik-frontend/src/lib/filters.ts around lines 265 to 273, the
buildDatasetFilterColumns function hardcodes type: COLUMN_TYPE.string for every dataset
column. Change it to derive the column type from the dataset column's metadata by
calling mapDynamicColumnTypesToColumnType(c.types) (falling back to COLUMN_TYPE.string
if c.types is missing/empty) and assign that result to the type property for each mapped
column. Keep the id/label shape and the tags entry unchanged.

Contributor

Commit 184c990 addressed this comment by switching buildDatasetFilterColumns to use mapDynamicColumnTypesToColumnType for each column’s type instead of hardcoding COLUMN_TYPE.string, ensuring the filters now derive their types from the column metadata.

Use mapDynamicColumnTypesToColumnType instead of hardcoded COLUMN_TYPE.string
so filter operators match the actual column data types.
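A rough sketch of the idea behind this change is shown here. `mapColumnTypesSketch` is a hypothetical stand-in for `mapDynamicColumnTypesToColumnType`, whose real mapping rules are not shown in this PR; the `COLUMN_TYPE` values, the `DatasetItemColumn` shape, and the raw type strings are likewise assumptions, and the `includeId` parameter is left out for brevity.

```typescript
// Simplified stand-ins for project constants and types (assumptions).
const COLUMN_DATA_ID = "data";
const COLUMN_TYPE = {
  string: "string",
  number: "number",
  dictionary: "dictionary",
} as const;
type ColumnType = (typeof COLUMN_TYPE)[keyof typeof COLUMN_TYPE];

interface DatasetItemColumn {
  name: string;
  types?: string[];
}

// Hypothetical mapping; falls back to string when no type information exists.
const mapColumnTypesSketch = (types: string[] = []): ColumnType => {
  if (types.includes("object")) return COLUMN_TYPE.dictionary;
  if (types.includes("number")) return COLUMN_TYPE.number;
  return COLUMN_TYPE.string;
};

// Derive each filter column's type from the column metadata instead of hardcoding it.
const buildDatasetFilterColumns = (datasetColumns: DatasetItemColumn[]) =>
  datasetColumns.map((c) => ({
    id: `${COLUMN_DATA_ID}.${c.name}`,
    label: c.name,
    type: mapColumnTypesSketch(c.types),
  }));
```

With a per-column type in place, a filter UI can offer numeric comparison operators for numeric columns rather than treating every column as text.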
@miguelgrc miguelgrc added test-environment Deploy Opik adhoc environment and removed test-environment Deploy Opik adhoc environment labels Mar 19, 2026
@github-actions
Contributor

🔄 Test environment deployment process has started

Phase 1: Deploying base version 1.10.40-4499 (from main branch) if environment doesn't exist
Phase 2: Building new images from PR branch miguelg/OPIK-4845-playground-top-bar-run-modal-per-prompt
Phase 3: Will deploy newly built version after build completes

@CometActions
Collaborator

Test environment is now available!

To configure additional environment variables for your environment, run [Deploy Opik AdHoc Environment workflow](https://github.com/comet-ml/comet-deployment/actions/workflows/deploy_opik_adhoc_env.yaml)

Access Information

The deployment has completed successfully and the version has been verified.

Labels

Frontend test-environment Deploy Opik adhoc environment typescript *.ts *.tsx

2 participants