[BOUNTY #1010] Bulk Trace Annotation UI#5732

Open

zhaog100 wants to merge 1 commit intocomet-ml:mainfrom

zhaog100:feat/bulk-trace-annotation

zhaog100 commented Mar 19, 2026

/claim #1010

Summary

Implemented bulk trace annotation - select multiple traces and annotate at once.

Changes

BulkAnnotateDialog component with multi-select
useTraceBatchFeedbackScoreSetMutation API hook
Integrated bulk action button in TracesActionsPanel


          feat: bulk trace annotation - select and annotate multiple traces at …

c98f613

…once

- Add BulkAnnotateDialog component with multi-select support
- Add useTraceBatchFeedbackScoreSetMutation API hook
- Integrate bulk action button in TracesActionsPanel
- Supports selecting multiple traces and applying annotation in batch

Closes comet-ml#1010
/claim comet-ml#1010

zhaog100 requested a review from a team as a code owner

March 19, 2026 02:51

algora-pbc bot added the 🙋 Bounty claim label

algora-pbc bot mentioned this pull request

[FR]: UI - Ability to Annotate multiple traces at once within a Project #1010

Open

github-actions bot added dependencies Frontend typescript labels

baz-reviewer bot reviewed

View reviewed changes

apps/opik-frontend/src/api/traces/useTraceBatchFeedbackScoreSetMutation.ts

Comment on lines +33 to +41

+                  onError: (error: AxiosError) => {
+                    const message =
+                      (error.response?.data as { message?: string })?.message ||
+                      error.message;
+                    toast({
+                      title: "Error",
+                      description: message,
+                      variant: "destructive",
+                    });

Contributor

baz-reviewer bot Mar 19, 2026

message can be empty leaving the toast blank — should we add an operation-specific fallback like "Unable to update batch trace feedback scores right now; please try again later."?

(error.response?.data as { message?: string })?.message || error.message => ... || "Unable to update batch trace feedback scores right now; please try again later."

_{Finding type: Provide comprehensive error messages | Severity: 🟢 Low}

Want Baz to fix this for you? Activate Fixer

Other fix methods

Prompt for AI Agents:

In apps/opik-frontend/src/api/traces/useTraceBatchFeedbackScoreSetMutation.ts around
lines 33 to 41, the onError handler computes message as (error.response?.data as {
message?: string })?.message || error.message which can be empty, leaving the toast
without a description. Change the onError logic to compute a finalMessage that falls
back to a clear, operation-specific string (for example: "Unable to update batch trace
feedback scores right now; please try again later.") when both the backend message and
error.message are falsy, and pass finalMessage to toast.description. Make this a
minimal, self-contained change so the toast always shows a readable, actionable message.

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +1 to +12

+              import React, { useCallback, useEffect, useMemo, useState } from "react";
+              import { MessageSquarePlus } from "lucide-react";
+              import { Button } from "@/components/ui/button";
+              import {
+                Dialog,
+                DialogContent,
+                DialogDescription,
+                DialogFooter,
+                DialogHeader,
+                DialogTitle,
+              } from "@/components/ui/dialog";
+              import useAppStore from "@/store/AppStore";

Contributor

baz-reviewer bot Mar 19, 2026

Lodash imports appear after internal @/... imports but .agents/skills/opik-frontend/code-quality.md requires React → external → internal, should we move the lodash imports above the internal imports?

_{Finding type: AI Coding Guidelines | Severity: 🟢 Low}

Want Baz to fix this for you? Activate Fixer

Other fix methods

Prompt for AI Agents:

In
apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx
around lines 1-33, the external lodash imports (sortBy, isNumber) are placed after
internal @/... imports which violates the repository rule React → external →
internal. Reorder the import block so React remains first, then move all external
imports (e.g. lucide-react, lodash/sortBy, lodash/isNumber) immediately after the React
import, and then list the internal @/... imports. Preserve the original relative
ordering within each group and keep other lines unchanged.

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +49 to +60

+                const workspaceName = useAppStore((state) => state.activeWorkspaceName);
+                const { toast } = useToast();
+                const { data: feedbackDefinitionsData } = useFeedbackDefinitionsList({
+                  workspaceName,
+                  page: 1,
+                  size: 1000,
+                });
+                const feedbackDefinitions = useMemo(
+                  () => sortBy(feedbackDefinitionsData?.content || [], "name"),
+                  [feedbackDefinitionsData],

Contributor

baz-reviewer bot Mar 19, 2026

feedbackDefinitions is memoized before state, should we move the useState calls above the useMemo to follow the Opik Frontend hook-order guideline?

_{Finding type: AI Coding Guidelines | Severity: 🟢 Low}

Want Baz to fix this for you? Activate Fixer

Other fix methods

Prompt for AI Agents:

In
apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx
around lines 49 to 71, the feedbackDefinitions useMemo is declared before the component
local useState hooks (selectedDefinitionId, value, reason, booleanValue), which violates
the project's hook ordering guideline. Refactor by moving the useState declarations for
selectedDefinitionId, value, reason, and booleanValue so they appear immediately after
the workspaceName and toast hooks and before the feedbackDefinitions useMemo; keep the
useMemo that computes selectedDefinition (which depends on feedbackDefinitions and
selectedDefinitionId) after feedbackDefinitions and update dependency arrays if
necessary. Ensure no logic or initial values change and run TypeScript checks.

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +52 to +56

+                const { data: feedbackDefinitionsData } = useFeedbackDefinitionsList({
+                  workspaceName,
+                  page: 1,
+                  size: 1000,
+                });

Contributor

baz-reviewer bot Mar 19, 2026

useFeedbackDefinitionsList hardcodes pagination page: 1, size: 1000 — should we move these into named constants or a feedbackDefinitionsQuery config like FEEDBACK_DEFINITIONS_PAGE / PAGE_SIZE?

_{Finding type: Avoid hardcoded configuration values | Severity: 🟢 Low}

Want Baz to fix this for you? Activate Fixer

Other fix methods

Prompt for AI Agents:

In
apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx
around lines 52 to 56, the call to useFeedbackDefinitionsList uses hardcoded pagination
values (page: 1, size: 1000). Replace these literals with named constants or a small
config object (for example FEEDBACK_DEFINITIONS_PAGE and FEEDBACK_DEFINITIONS_PAGE_SIZE,
or feedbackDefinitionsQuery = { page, size }) declared near the top of the file (below
imports) or imported from a shared config module, and update the hook call to use those
constants. Ensure types are preserved and run a quick search/replace to update any
related usage if you move constants to a shared file.

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +96 to +101

+                  const numericValue = Number(value);
+                  if (
+                    selectedDefinition.type !== FEEDBACK_DEFINITION_TYPE.boolean &&
+                    !isNumericFeedbackScoreValid(numericValue, selectedDefinition)
+                  ) {
+                    return;

Contributor

baz-reviewer bot Mar 19, 2026

In handleSubmit we're passing args to isNumericFeedbackScoreValid in the wrong order — should we call isNumericFeedbackScoreValid(selectedDefinition.details, numericValue) and only invoke it for numeric definitions?

_{Finding type: Logical Bugs | Severity: 🔴 High}

Want Baz to fix this for you? Activate Fixer

Other fix methods

Prompt for AI Agents:

In
apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx
around lines 96 to 101, the handleSubmit function calls
isNumericFeedbackScoreValid(numericValue, selectedDefinition) which is wrong because the
helper expects (details, value) and this makes validation always fail. Change the logic
to only invoke isNumericFeedbackScoreValid for numerical feedback definitions and pass
selectedDefinition.details as the first argument and numericValue as the second (for
example: if (selectedDefinition.type === FEEDBACK_DEFINITION_TYPE.numerical &&
!isNumericFeedbackScoreValid(selectedDefinition.details, numericValue)) return;). This
ensures correct range validation and allows submits for valid numeric/categorical cases.

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +151 to +156

+                const isSubmitDisabled =
+                  !selectedDefinitionId ||
+                  (selectedDefinition?.type !== FEEDBACK_DEFINITION_TYPE.boolean &&
+                    (!isNumber(Number(value)) ||
+                      !isNumericFeedbackScoreValid(selectedDefinition.details, Number(value))));

Contributor

baz-reviewer bot Mar 19, 2026

isSubmitDisabled applies numeric-range validation to categorical definitions; should we restrict that check to numeric definitions and pass their {min,max}?

_{Finding type: Logical Bugs | Severity: 🔴 High}

Want Baz to fix this for you? Activate Fixer

Other fix methods

Prompt for AI Agents:

In
apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx
around lines 151 to 156, the isSubmitDisabled logic applies numeric range validation for
every non-boolean definition, causing categorical definitions (which have
details.categories) to be treated as numeric and always fail. Change the condition so
that numeric range validation (calling isNumericFeedbackScoreValid and passing
selectedDefinition.details as {min,max}) only runs when selectedDefinition.type ===
FEEDBACK_DEFINITION_TYPE.numerical; for categorical definitions require that a
category/value is selected (non-empty) instead. Ensure boolean, categorical, and
numerical cases are handled explicitly so the submit button is not incorrectly disabled
for categorical selections.

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +199 to +203

+                            <Label>Value</Label>
+                            {selectedDefinition.type ===
+                            FEEDBACK_DEFINITION_TYPE.boolean ? (
+                              <ToggleGroup
+                                type="single"

Contributor

baz-reviewer bot Mar 19, 2026

The value input in BulkAnnotateDialog duplicates the same definition-type branching that already exists in TraceDetailsPanel/TraceAnnotateViewer/AnnotateRow.tsx (see lines 194‑327). Both components do identical work: they inspect selectedDefinition.type, render a boolean ToggleGroup, categorical Select, or numeric input, and enforce the same isNumericFeedbackScoreValid guard. Any future change to how we render a definition (new validation, labeling, UI tweaks) now has to be made in two places, which is a maintenance and drift risk. Can we extract a shared FeedbackDefinitionValueInput/hook (taking a FeedbackDefinition, current value, and change handler) and reuse it in both components so that the type-specific rendering/validation is defined in one place?

_{Finding type: Code Dedup and Conventions | Severity: 🟢 Low}

Want Baz to fix this for you? Activate Fixer

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +263 to +274

+                              <Select
+                                value={isNumber(Number(value)) ? String(value) : ""}
+                                onValueChange={(val) => setValue(val)}
+                              >
+                                <SelectTrigger>
+                                  <SelectValue placeholder="Select category..." />
+                                </SelectTrigger>
+                                <SelectContent>
+                                  {Object.entries(
+                                    (selectedDefinition.details as { categories: Record<string, number> }).categories
+                                  ).map(([label]) => (
+                                    <SelectItem key={label} value={label}>

Contributor

baz-reviewer bot Mar 19, 2026

Categorical Select stores the label via setValue but handleSubmit does Number(value) so labels become NaN — should we map the selected label to its numeric score from feedbackDefinition.details.categories before storing/sending?
setValue(label) => setValue(feedbackDefinition.details.categories.find(c => c.label === label)?.value)

_{Finding type: Logical Bugs | Severity: 🔴 High}

Want Baz to fix this for you? Activate Fixer

Other fix methods

Prompt for AI Agents:

In
apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx
around lines 263 to 275 (the categorical Select) and handleSubmit around lines 93 to
111: the Select currently calls setValue(val) with the human-readable category label,
but handleSubmit casts value to Number(value) which yields NaN for labels. Change the
Select onValueChange to look up the numeric value from
selectedDefinition.details.categories for the chosen label and store that numeric value
(as a number or numeric string) instead of the label; alternatively store both label and
numeric value in state. Also update isSubmitDisabled and handleSubmit to use this
numeric value (or perform the same label->number lookup at validation/submit time) so
the payload sends the category's numeric score rather than NaN.

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +271 to +275

+                                  {Object.entries(
+                                    (selectedDefinition.details as { categories: Record<string, number> }).categories
+                                  ).map(([label]) => (
+                                    <SelectItem key={label} value={label}>
+                                      {label}

Contributor

baz-reviewer bot Mar 19, 2026

Select stores the label as value so Number(value) can be NaN for labels like Stable — should we store/send the category's numeric value via categories[label]?

_{Finding type: prefer direct React patterns | Severity: 🔴 High}

Want Baz to fix this for you? Activate Fixer

Other fix methods

Prompt for AI Agents:

In
apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx
around lines 271 to 275 (the categorical Select rendering and value handling) and the
related submit logic in handleSubmit, fix the categorical value handling so the
component stores/sends the category's numeric value instead of the label string.
Specifically, when rendering the Select for FEEDBACK_DEFINITION_TYPE.categorical, set
each SelectItem value to the category's numeric value (String(number)) while keeping the
label as the displayed text, or alternatively, in the Select onValueChange translate the
selected label to categories[label] and call setValue with that numeric value. Ensure
isSubmitDisabled and handleSubmit continue to treat value as a numeric string
(Number(value) should produce a valid number) and that the displayed option text remains
the human-readable label.

apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx

Comment on lines +272 to +276

+                                    (selectedDefinition.details as { categories: Record<string, number> }).categories
+                                  ).map(([label]) => (
+                                    <SelectItem key={label} value={label}>
+                                      {label}
+                                    </SelectItem>

Contributor

baz-reviewer bot Mar 19, 2026

uses the category label instead of the numeric score — should we set the value to the numeric score (e.g. value={String(score)}) while still rendering the label? Finding type: Component responsibility | Severity: 🔴 High Want Baz to fix this for you? Activate Fixer Other fix methods Prompt for AI Agents: In apps/opik-frontend/src/components/pages-shared/traces/BulkAnnotateDialog/BulkAnnotateDialog.tsx around lines 272 to 276, the categorical Select currently uses the category label as the SelectItem value which leads to Number(value) === NaN. Change the Select mapping to iterate Object.entries((selectedDefinition.details as { categories: Record<string, number> }).categories) and for each [label, score] set SelectItem key={label} value={String(score)} but render the label as the item text. Also ensure the Select value binding (line ~264) and onValueChange continue to set the component value state to that numeric string so downstream isNumber(Number(value)), isNumericFeedbackScoreValid and handleSubmit receive a valid numeric value.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🙋 Bounty claim dependencies Frontend typescript