Skip to content

Conversation

@CodingOnStar
Copy link
Contributor

Summary

  • Refactor three complex components (DocumentList, EmbeddingDetail, Metadata) by extracting reusable hooks and sub-components
  • Reduce cyclomatic complexity and improve code maintainability
  • Add comprehensive test coverage for all extracted modules

Changes

  • DocumentList: Extract sorting, selection, and action hooks; split into table row, sort header components
  • EmbeddingDetail: Extract status polling hook and mutation hooks; split UI into smaller components
  • Metadata: Restructure into DocTypeSelector, FieldInfo, MetadataFieldList components with editor/save hooks

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran make lint and make type-check (backend) and cd web && npx lint-staged (frontend) to appease the lint gods

CodingOnStar and others added 3 commits January 29, 2026 17:01
Refactor the EmbeddingDetail component to improve maintainability:
- Extract useEmbeddingStatus hook with React Query for status polling
- Extract usePauseIndexing and useResumeIndexing mutation hooks
- Split UI into StatusHeader, ProgressBar, SegmentProgress, RuleDetail components
- Add comprehensive tests with 98.57% statement coverage

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Extract reusable hooks and sub-components from the DocumentList component
to reduce complexity and improve testability:

- useDocumentSort: handles sorting state and logic
- useDocumentSelection: manages document selection state
- useDocumentActions: handles batch actions with React Query mutations
- DocumentTableRow: extracted table row component
- SortHeader: reusable sort header component
- DocumentSourceIcon: data source type icon renderer
- renderTdValue: utility for displaying formatted values

The main component remains in list.tsx with document-list/index.tsx
providing a re-export for backwards compatibility.

Test coverage: 156 tests covering all extracted modules.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
This commit refactors the metadata components within the document detail section by extracting and organizing them into separate files for better maintainability and readability. The following components were created or modified:

- DocTypeSelector: A new component for selecting document types.
- FieldInfo: A new component for displaying and editing field information.
- IconButton: A new component for rendering document type icons.
- MetadataFieldList: A new component for rendering a list of metadata fields.
- TypeIcon: A new component for displaying icons based on document types.
- Utility functions for mapping options.

Additionally, comprehensive tests were added for each new component to ensure functionality and reliability.
Copilot AI review requested due to automatic review settings January 30, 2026 02:15
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Jan 30, 2026
@github-actions github-actions bot added the web This relates to changes on the web. label Jan 30, 2026
@dosubot dosubot bot added the refactor label Jan 30, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @CodingOnStar, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the maintainability and testability of core web components by breaking down monolithic structures into a modular architecture. By extracting reusable logic into custom hooks and UI into focused sub-components, the codebase becomes easier to understand, debug, and extend. This refactoring also introduces comprehensive unit tests for the new modules, ensuring the reliability of these critical parts of the application.

Highlights

  • Modular Component Extraction: Three complex web components (DocumentList, EmbeddingDetail, and Metadata) have been refactored by extracting their logic into smaller, reusable sub-components and custom hooks. This improves code organization and reduces cyclomatic complexity.
  • Comprehensive Test Coverage: Extensive test files have been added for all newly extracted components and hooks, ensuring robust functionality and maintainability of the refactored modules.
  • Improved Document List Management: The DocumentList component now utilizes dedicated hooks for managing selection (useDocumentSelection), sorting (useDocumentSort), and batch actions (useDocumentActions), along with new UI components for table rows (DocumentTableRow) and sort headers (SortHeader).
  • Streamlined Embedding Detail Display: The EmbeddingDetail component has been modularized with new components for progress display (ProgressBar, SegmentProgress), rule details (RuleDetail), and status headers (StatusHeader), backed by dedicated hooks for status polling and mutations (useEmbeddingStatus, usePauseIndexing, useResumeIndexing).
  • Enhanced Metadata Editing Experience: The Metadata component now features a structured approach to editing with DocTypeSelector for document type management, MetadataFieldList for displaying and editing fields, and hooks (useMetadataEditor, useMetadataSave) to handle the editing and saving logic.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This refactor modularizes three complex dataset document components (Metadata, EmbeddingDetail, DocumentList) into smaller hooks and sub-components, and adds focused test coverage around their behaviors. The changes reduce component complexity, centralize shared logic (sorting, selection, indexing status, metadata editing/saving), and strengthen confidence via extensive Vitest/RTL tests.

Changes:

  • Extracted metadata editing logic into useMetadataEditor/useMetadataSave and UI sub-components (doc type selector, field rows, icons), simplifying metadata/index.tsx and adding targeted unit tests.
  • Extracted embedding status polling, pause/resume mutations, and presentational pieces (status header, progress bar, rule details) into dedicated hooks/components under embedding/, with comprehensive hook and component tests.
  • Extracted document list concerns into a document-list module (sorting, selection, batch actions, row rendering, shared cell renderers), and added thorough tests for hooks and render logic, along with adjusting eslint suppressions for the new structure.

Reviewed changes

Copilot reviewed 53 out of 53 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
web/eslint-suppressions.json Updates eslint suppression paths to point from old monolithic metadata component to the new use-metadata-editor hook file and removes obsolete suppressions for the legacy document list/metadata component.
web/app/components/datasets/documents/detail/metadata/index.tsx Refactors Metadata into a thin container using useMetadataEditor/useMetadataSave, and composes doc-type selector and metadata-field list components instead of inline logic.
web/app/components/datasets/documents/detail/metadata/hooks/use-metadata-save.ts Introduces a hook encapsulating the save-metadata API call, toast notifications, loading state, and success/error handling.
web/app/components/datasets/documents/detail/metadata/hooks/use-metadata-save.spec.ts Adds unit tests validating useMetadataSave behavior: correct API parameters, precedence of metadataParams.documentType, loading flag transitions, callbacks, and success/error toast paths.
web/app/components/datasets/documents/detail/metadata/hooks/use-metadata-editor.ts Introduces a hook to manage metadata editor state (doc type normalization, edit mode, temp doc type, field updates, reset behavior) with side effects responding to docDetail changes.
web/app/components/datasets/documents/detail/metadata/hooks/use-metadata-editor.spec.ts Provides exhaustive tests for useMetadataEditor initial state, doc-type confirmation/cancel, resetting, metadata updates, selector behavior, and response to changing docDetail.
web/app/components/datasets/documents/detail/metadata/hooks/index.ts Centralizes exports for metadata hooks and the MetadataState type to simplify imports.
web/app/components/datasets/documents/detail/metadata/components/utils.ts Extracts a utility to convert a key→label map into a select options array, shared across metadata components.
web/app/components/datasets/documents/detail/metadata/components/utils.spec.ts Tests map2Options for normal, empty, single-entry, special-character, and ordering behavior.
web/app/components/datasets/documents/detail/metadata/components/type-icon.tsx Extracts a memoized TypeIcon component rendering document-type icons based on CSS module class names.
web/app/components/datasets/documents/detail/metadata/components/type-icon.spec.tsx Verifies TypeIcon rendering, class composition (base and icon-specific), handling of extra classes, and memoization behavior.
web/app/components/datasets/documents/detail/metadata/components/metadata-field-list.tsx Encapsulates rendering of a metadata section (e.g., book, originInfo, technicalParameters) with mapping to label, input type, select options, and value display.
web/app/components/datasets/documents/detail/metadata/components/metadata-field-list.spec.tsx Thoroughly tests MetadataFieldList for different main sections, select fields, edit vs read-only modes, custom render functions, category maps, and memoization.
web/app/components/datasets/documents/detail/metadata/components/index.ts Re-exports all metadata UI components and utilities (doc-type selector, field info, icon button, metadata list, type icon, options mapper).
web/app/components/datasets/documents/detail/metadata/components/icon-button.tsx Extracts a memoized icon-button for choosing doc types, using useMetadataMap and tooltip for localized labels.
web/app/components/datasets/documents/detail/metadata/components/icon-button.spec.tsx Tests IconButton for rendering, button semantics, checked styling, icon class composition, tooltip integration, different doc types, hover behavior, and memoization.
web/app/components/datasets/documents/detail/metadata/components/field-info.tsx Extracts a reusable field row component handling label alignment, text wrapping heuristics, and three input modes (input, textarea, select) with i18n-aware placeholders.
web/app/components/datasets/documents/detail/metadata/components/field-info.spec.tsx Exercises FieldInfo for display vs edit modes, each input type’s change handling, placeholder/defaults, icon rendering, alignment classes, and default props.
web/app/components/datasets/documents/detail/metadata/components/doc-type-selector.tsx Extracts the document-type selection UI (first-time vs change flows, radio group, confirmation/cancel actions) using the new IconButton component.
web/app/components/datasets/documents/detail/metadata/components/doc-type-selector.spec.tsx Tests DocTypeSelector across first-time selection and change flows, button enable/disable logic, radio interaction, callbacks, and memoization.
web/app/components/datasets/documents/detail/embedding/index.tsx Refactors EmbeddingDetail to use useEmbeddingStatus for polling, usePauseIndexing/useResumeIndexing for mutations, and composes status/progress/rule-detail components instead of embedding the logic inline.
web/app/components/datasets/documents/detail/embedding/index.spec.tsx Adds integration-style tests around EmbeddingDetail covering indexing status transitions, pause/resume interactions, dataset/document ID resolution, rule-detail rendering, callback behavior, and skeleton rendering.
web/app/components/datasets/documents/detail/embedding/hooks/use-embedding-status.ts Implements a react-query–based hook for polling indexing status, computing derived flags and percentage, exposing invalidation/reset helpers, and pause/resume mutations plus a utility invalidator.
web/app/components/datasets/documents/detail/embedding/hooks/use-embedding-status.spec.tsx Provides comprehensive tests for isEmbeddingStatus, isTerminalStatus, calculatePercent, useEmbeddingStatus, pause/resume hooks, and the invalidate helper.
web/app/components/datasets/documents/detail/embedding/hooks/index.ts Re-exports embedding hooks and the EmbeddingStatusType alias for easier consumption.
web/app/components/datasets/documents/detail/embedding/components/status-header.tsx Adds a memoized header component that renders the current indexing state label and pause/resume buttons with appropriate icons, loading/disabled states, and translations.
web/app/components/datasets/documents/detail/embedding/components/status-header.spec.tsx Validates StatusHeader rendering, status text resolution, spinner visibility, button presence/enabled state, callbacks, and styling across different state combinations.
web/app/components/datasets/documents/detail/embedding/components/segment-progress.tsx Separates the “segments X/Y · Z%” line into a memoized component that formats counts and percent using i18n.
web/app/components/datasets/documents/detail/embedding/components/segment-progress.spec.tsx Tests SegmentProgress for correct layout classes, display of segments/percent across normal, zero, undefined, large, and decimal cases.
web/app/components/datasets/documents/detail/embedding/components/rule-detail.tsx Extracts the embedding rule details section (segmentation mode, lengths, text-cleaning rules, index mode, retrieval settings) into a dedicated presentational component.
web/app/components/datasets/documents/detail/embedding/components/rule-detail.spec.tsx Verifies RuleDetail across modes, rule name mapping, index mode variants, retrieval method variants, empty/missing data, and general layout.
web/app/components/datasets/documents/detail/embedding/components/progress-bar.tsx Adds a memoized progress bar component that styles its container and fill differently based on embedding state and percent complete.
web/app/components/datasets/documents/detail/embedding/components/progress-bar.spec.tsx Tests ProgressBar width calculation, background/fill classes for different embedding/paused/error combinations, and edge-case percentages.
web/app/components/datasets/documents/detail/embedding/components/index.ts Re-exports embedding UI components (progress bar, rule detail, segment progress, status header) for centralized imports.
web/app/components/datasets/documents/components/document-list/index.tsx Introduces a small compatibility shim that re-exports the existing list.tsx DocumentList and the new renderTdValue helper under a document-list subdirectory.
web/app/components/datasets/documents/components/document-list/index.spec.tsx Adds high-level tests for the DocumentList integration, including rendering, selection behavior, sorting interactions, batch actions, navigation, rename/metadata modals, and edge cases.
web/app/components/datasets/documents/components/document-list/hooks/use-document-sort.ts Introduces a sorting/filtering hook that applies status filters and client-side sorts by name, word count, hit count, or created time, and attempts to reset sort state when remoteSortValue changes.
web/app/components/datasets/documents/components/document-list/hooks/use-document-sort.spec.ts Thoroughly tests useDocumentSort’s initial state, sort toggling, each sort field, status filtering behavior, remote sort reset behavior, and edge cases with missing values and empty arrays.
web/app/components/datasets/documents/components/document-list/hooks/use-document-selection.ts Extracts selection logic (all/some selection flags, toggling, error-document detection, downloadable IDs, clearing) into a reusable hook.
web/app/components/datasets/documents/components/document-list/hooks/use-document-selection.spec.ts Tests useDocumentSelection across all/some/none-selected states, select-all and single-select behaviors, error-document detection, downloadability, and clearing.
web/app/components/datasets/documents/components/document-list/hooks/use-document-actions.ts Extracts batch action behavior (archive/summary/enable/disable/delete, retry-index, bulk download ZIP) into a hook that wires up service mutations, toast notifications, and selection clearing.
web/app/components/datasets/documents/components/document-list/hooks/use-document-actions.spec.tsx Verifies useDocumentActions for each action type, correct mutation params, success-path callbacks, download behavior including in-progress guard, and error-path handling.
web/app/components/datasets/documents/components/document-list/hooks/index.ts Re-exports document list hooks and sort types to simplify consumer imports.
web/app/components/datasets/documents/components/document-list/components/utils.tsx Adds a small helper renderTdValue to consistently render table cell values with appropriate typography and “empty” styling.
web/app/components/datasets/documents/components/document-list/components/utils.spec.tsx Tests renderTdValue rendering for different value types, null handling, empty-style toggling, and a range of edge-case values.
web/app/components/datasets/documents/components/document-list/components/sort-header.tsx Extracts a memoized sortable table header cell that shows an arrow icon and visual state based on current sort field/order.
web/app/components/datasets/documents/components/document-list/components/sort-header.spec.tsx Validates SortHeader rendering, active/inactive styling, rotation behavior for ascending/descending, interaction callbacks, and support for all sort fields.
web/app/components/datasets/documents/components/document-list/components/index.ts Re-exports document list row/column components and helpers for simpler imports.
web/app/components/datasets/documents/components/document-list/components/document-table-row.tsx Extracts per-row rendering for the document table, handling navigation, selection checkbox, file/source icon, rename affordance, chunking mode, counts, timestamps, status, and operations menu.
web/app/components/datasets/documents/components/document-list/components/document-table-row.spec.tsx Exhaustively tests DocumentTableRow for rendering, index display, selection behavior, navigation, number formatting, summary/operations integration, various data source types, and edge cases.
web/app/components/datasets/documents/components/document-list/components/document-source-icon.tsx Pulls out logic to choose and render the correct icon for each document’s data source (local file, online document, online drive, web crawl, or unknown).
web/app/components/datasets/documents/components/document-list/components/document-source-icon.spec.tsx Tests DocumentSourceIcon across all supported data source types, legacy vs pipeline sources, filename/extension edge cases, unknown types, and memoization.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a significant and well-executed refactoring of the DocumentList, EmbeddingDetail, and Metadata components. Extracting complex logic into modular hooks and smaller, focused sub-components greatly improves code organization, maintainability, and testability. The addition of comprehensive tests for the new modules is also a valuable contribution. Overall, this is an excellent improvement to the codebase. I have a couple of minor suggestions to further refine the code.

CodingOnStar added 2 commits January 30, 2026 10:53
This commit refactors the document actions and metadata components to enhance code clarity and maintainability. Key changes include:

- Simplified the  hook by removing unnecessary type assertions.
- Streamlined the rendering logic in the  component by consolidating JSX elements.
- Updated dependencies in the  hook to ensure proper reactivity.

These adjustments aim to improve the overall structure and performance of the document management features.
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Jan 30, 2026
@CodingOnStar CodingOnStar force-pushed the refactor/datasets-complex-components branch from f05b924 to a90c056 Compare January 30, 2026 03:36
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jan 30, 2026
CodingOnStar added 4 commits January 30, 2026 13:29
This commit modifies the test files for the CreateAppModal component to replace exact string matches with regular expressions for button role queries. This change enhances the robustness of the tests by allowing for more flexible matching of button names, ensuring they remain valid even if the text changes slightly in the future. Additionally, type assertions were improved in the CreateAppModal tests to enhance type safety.
…nents across various components to clean up the linting configuration.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

refactor size:XXL This PR changes 1000+ lines, ignoring generated files. web This relates to changes on the web.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants