## Motivation

During hands-on testing of the Grove installation process, several critical usability issues were discovered that would block new users from successfully deploying Grove. Additionally, the README was too verbose and didn't quickly communicate the core value proposition to developers evaluating the project.

## Changes Made

### installation.md - Fixed Critical Blockers

**Working Directory Confusion**
- Added explicit "Navigate to operator directory" instructions
- Impact: Users can now follow the guide linearly without trial-and-error

**KUBECONFIG Setup Broken**
- kind-up script has a bug and doesn't export KUBECONFIG properly
- Added manual workaround using `kind get kubeconfig`
- Impact: Users can now successfully deploy after creating kind cluster

**Wrong Resource Names**
- Fixed: simple1-0-pcsg → simple1-0-sga (actual resource name)
- Impact: Scaling examples now work as documented

**Added Troubleshooting Section**
- Covers deployment issues, runtime issues, and community resources
- Impact: Users can self-serve when encountering common issues

### README.md - Refocused on Problem → Solution → Action

**Shortened from ~80 lines to ~40 lines of core content**

New structure:
1. Problem First: What's broken in K8s for AI inference
2. Solution: Grove's one-liner positioning
3. Quick Start: 4 commands to deploy in 5 minutes
4. What Grove Solves: Table mapping scenarios to capabilities
5. How It Works: Simplified concept explanations

Roadmap simplified to Q4 2025 / Q1 2026 (removed specific outdated dates)

Impact: Users understand value prop in 30 seconds and can start immediately

### quickstart.md - New 10-Minute Tutorial

- Explains the 4-component example architecture
- Step-by-step deployment with expected outputs
- Demonstrates both PCSG and PCS scaling
- Includes hierarchy visualization
- Kind-specific troubleshooting tips

Impact: New users get immediate success experience in 10 minutes

## Testing Performed

All changes validated through fresh kind cluster deployment on macOS, following installation.md step-by-step, and verifying all examples work.

Co-authored-by: Claude <noreply@anthropic.com>
Introduces a proposal for kubectl-grove, a kubectl plugin providing:
- Arborist TUI for hierarchical resource navigation
- Topology visualization command
- Status, health, and diagnostics commands
- Lifecycle management commands (rollout, scale, etc.)

Fixes ai-dynamo#373

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author
|
@gflarity for viz |
What type of PR is this?
/kind feature
/kind documentation
What this PR does / why we need it:
Introduces GREP-373, a proposal for `kubectl-grove`, a kubectl plugin that provides a rich interaction layer for Grove workloads on Kubernetes.

The CLI bridges the gap between raw `kubectl` commands and the complex, hierarchical nature of Grove resources (PodCliqueSets, PodGangs, PodCliques, Pods), offering both command-line tools and an interactive Terminal User Interface (TUI) called Arborist.

Proposed Features

Critical (Must Have)
- Arborist TUI (`kubectl grove tui`) - Hierarchical navigation with real-time refresh and embedded topology view
- Topology visualization (`kubectl grove topology`) - Rack → Node → Pod visualization with GPU allocation bars

High Priority
- `kubectl grove status` - PodCliqueSet status with progress visualization
- `kubectl grove health` - Gang-aware health dashboard
- `kubectl grove diagnostics` - Comprehensive diagnostic data collection

Medium Priority
- Lifecycle management commands: `rollout`, `scale`, `update`, `restart`, `apply`
- `kubectl grove metrics` - Live metrics from pod endpoints

Which issue(s) this PR fixes:
Fixes #373
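
For reviewers who want a concrete picture of the proposed Rack → Node → Pod view with GPU allocation bars, here is a rough illustrative sketch. It is not taken from the GREP and is not the plugin's implementation. Python is used only for brevity, and every name in it (`Rack`, `Node`, `Pod`, `gpu_bar`, `render_topology`, the bar style, the sample resource names) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    gpus: int  # GPUs requested by this pod (hypothetical field)


@dataclass
class Node:
    name: str
    gpu_capacity: int  # total GPUs on the node (hypothetical field)
    pods: list[Pod] = field(default_factory=list)


@dataclass
class Rack:
    name: str
    nodes: list[Node] = field(default_factory=list)


def gpu_bar(used: int, capacity: int, width: int = 8) -> str:
    """Render a fixed-width allocation bar followed by used/capacity."""
    # Guard against zero capacity and clamp over-allocation to a full bar.
    filled = 0 if capacity <= 0 else min(width, round(width * used / capacity))
    return f"[{'#' * filled}{'-' * (width - filled)}] {used}/{capacity}"


def render_topology(racks: list[Rack]) -> str:
    """Render racks, their nodes (with GPU bars), and the pods on each node."""
    lines = []
    for rack in racks:
        lines.append(rack.name)
        for node in rack.nodes:
            used = sum(pod.gpus for pod in node.pods)
            lines.append(f"  {node.name} {gpu_bar(used, node.gpu_capacity)}")
            for pod in node.pods:
                lines.append(f"    {pod.name} ({pod.gpus} GPU)")
    return "\n".join(lines)


if __name__ == "__main__":
    # Sample data is invented for illustration only.
    demo = [Rack("rack-a", [Node("node-1", 8, [Pod("example-pod-0", 2)])])]
    print(render_topology(demo))
```

The sketch keeps the data model and the rendering separate, so the same tree could feed either a one-shot command or an embedded TUI pane. How the real plugin would discover rack and GPU information is left open here.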
Special notes for your reviewer:
This is a proposal document (GREP) following the template from #362. Looking for feedback on:
Does this PR introduce an API change?
Additional documentation:
🤖 Generated with Claude Code