
feat(datasets): Add info operation to lerobot-edit-dataset command #2917

Open
masato-ka wants to merge 3 commits into huggingface:main from masato-ka:feat/show-dataset-info


@masato-ka
Contributor

@masato-ka masato-ka commented Feb 7, 2026

Type / Scope

Type: Feature
Scope: datasets / CLI tools

Summary / Motivation

This PR adds a new info operation to the lerobot-edit-dataset command that displays comprehensive dataset information without modifying the dataset. This helps users quickly understand their dataset's structure and contents.

The current dataset tools let users manipulate datasets (delete, split, merge, etc.) but offer no convenient way to inspect a dataset from the command line. The info operation fills that gap.

Related issues

#2326 - This contribution adds a new dataset tool as requested in the "Call for Contributions: Expanding Dataset Tools in LeRobotDataset" issue.

What changed

New info operation

  • Added InfoConfig dataclass with show_features parameter
  • Added handle_info() function that displays:
    • Repository ID
    • Total episodes
    • Total tasks
    • Total frames (with actual count)
    • Average frames per episode
    • Average episode duration (in seconds)
    • FPS
    • Dataset size (in MB)
    • Feature details (optional, controlled by --operation.show_features)
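
The derived values in the list above (average frames per episode, average episode duration, size in MB) can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: only InfoConfig and show_features are named in the description, while the `type` field, `directory_size_mb`, `summarize`, the output format, and the sample numbers are all hypothetical. It also assumes the average duration is derived as average frames divided by FPS, and that 1 MB means 1024 * 1024 bytes.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Mapping


@dataclass
class InfoConfig:
    # Only `show_features` is named in the PR description; `type` is an assumption.
    type: str = "info"
    show_features: bool = False


def directory_size_mb(root: Path) -> float:
    """Sum the sizes of all regular files under `root`, in MB (1 MB = 1024 * 1024 bytes)."""
    total_bytes = sum(p.stat().st_size for p in root.rglob("*") if p.is_file())
    return total_bytes / (1024 * 1024)


def summarize(
    repo_id: str,
    total_episodes: int,
    total_tasks: int,
    total_frames: int,
    fps: float,
    size_mb: float,
    features: Mapping[str, Mapping[str, Any]],
    cfg: InfoConfig,
) -> str:
    """Build a read-only text summary; nothing here writes to the dataset."""
    # Guard against empty datasets and a zero FPS to avoid division by zero.
    avg_frames = total_frames / total_episodes if total_episodes else 0.0
    avg_duration_s = avg_frames / fps if fps else 0.0
    lines = [
        f"Repository ID: {repo_id}",
        f"Total episodes: {total_episodes}",
        f"Total tasks: {total_tasks}",
        f"Total frames: {total_frames}",
        f"Average frames per episode: {avg_frames:.1f}",
        f"Average episode duration: {avg_duration_s:.2f} s",
        f"FPS: {fps}",
        f"Dataset size: {size_mb:.2f} MB",
    ]
    if cfg.show_features:
        # Feature details are optional, mirroring the show_features flag.
        lines.append("Features:")
        for name, spec in features.items():
            lines.append(f"  {name}: dtype={spec.get('dtype')}, shape={spec.get('shape')}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up numbers for illustration only; not taken from any real dataset.
    demo_features = {"action": {"dtype": "float32", "shape": (2,)}}
    print(
        summarize(
            "example/dataset", 4, 2, 200, 10, 1.5,
            demo_features, InfoConfig(show_features=True),
        )
    )
```

Keeping the summary as a pure function of already-loaded metadata is one way to make the read-only property easy to check, since no dataset object is passed in to be mutated.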

Documentation

  • Updated docs/source/using_dataset_tools.mdx with usage examples for the new info operation

Usage examples

# Show basic dataset information
lerobot-edit-dataset \
    --repo_id lerobot/pusht_image \
    --operation.type info

# Show dataset information with feature details
lerobot-edit-dataset \
    --repo_id lerobot/pusht_image \
    --operation.type info \
    --operation.show_features true

Manual verification

Tested locally with the following datasets:

  • lerobot/pusht_image
  • Local custom datasets

Reviewer notes

  • This is a read-only operation that does not modify the dataset
  • The implementation follows the existing pattern of other operations in the script
  • The show_features flag allows users to optionally view detailed feature information

🤖 Generated with Claude Code

@github-actions github-actions bot added the documentation label (Improvements or fixes to the project’s docs) Feb 7, 2026
