Simplifies CI change detection and concurrency groups#5466
Simplifies CI change detection and concurrency groups#5466myurasov-nv wants to merge 15 commits intoisaac-sim:developfrom
Conversation
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR simplifies CI workflows by removing redundant aggregator gate jobs (Docker Tests Gate and Installation Tests Gate), unifying concurrency keys to use github.ref, and adding improved change-detection reporting via GitHub job summaries. The core assumption is that branch protection rules already handle skipped jobs correctly without explicit gate jobs.
Architecture Impact
Self-contained CI/workflow changes. No impact on Isaac Lab runtime code. The changes affect:
build.yaml— Docker test orchestrationinstall-ci.yml— Installation test orchestration- 5 other workflows — concurrency key standardization
- Removes
test_required_ci_gates.py— regression test for the gate jobs
The key architectural question is whether GitHub branch protection correctly treats jobs that skip via if: as passing. This is a GitHub Actions behavior dependency, not a code correctness issue.
Implementation Verdict
Ship it — The changes are correct and well-reasoned. Minor improvements possible.
Test Coverage
The PR removes test_required_ci_gates.py which validated the gate job structure. The PR description claims the test is "now-redundant" because the gate jobs themselves are removed. This is logically consistent — the test was specifically for the gate jobs' structure, not for the change-detection logic itself.
🟡 Gap: There's no test coverage for the new decide() helper function's behavior (GitHub step summary formatting, fail-safe handling). However, this is bash in a workflow file, which is typically tested via integration/manual verification rather than unit tests.
CI Status
No CI checks available yet — cannot verify the workflows function correctly in practice.
Findings
🔵 Improvement: .github/workflows/build.yaml:118-121 — Fail-safe path could mask persistent API issues
The fail-safe that defaults to running tests when gh api fails is good for resilience, but emits only a ::warning::. If the GitHub API is persistently broken (e.g., token scope issue), PRs will silently run all tests forever without clear diagnosis. Consider adding a counter or distinct message if this happens repeatedly.
if ! changed_files="$(gh api --paginate "repos/$REPO/pulls/$PR_NUMBER/files" --jq '.[].filename')"; then
echo "::warning::Could not list changed files; defaulting to running tests"🔵 Improvement: .github/workflows/build.yaml:92 — Duplicated path_regex definition
The path_regex in build.yaml and install-ci.yml are different (correctly so), but the decide() helper function is duplicated verbatim across both files (lines 94-115 in build.yaml, lines 53-74 in install-ci.yml). Consider extracting this to a composite action to reduce duplication and ensure consistent behavior.
🔵 Improvement: .github/workflows/labeler.yml:7-9 — Concurrency removal is correct but undocumented
The labeler workflow runs on pull_request_target which has different security/triggering semantics than pull_request. Removing concurrency here is fine (labeling is idempotent and fast), but the PR description doesn't explain why this specific workflow's concurrency was "redundant" while others just had their keys changed.
🟡 Warning: .github/workflows/build.yaml:57 and others — Concurrency key change may affect workflow cancellation
# Before:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
# After:
group: ${{ github.workflow }}-${{ github.ref }}For PRs, github.ref is refs/pull/<number>/merge, which is already PR-unique as claimed. However, for workflow_dispatch events, the github.ref is the branch name. If someone manually triggers build.yaml twice on the same branch in quick succession (e.g., debugging), the second will cancel the first. This is likely intended behavior, but worth noting since the original code explicitly handled the distinction.
🔵 Improvement: .github/workflows/install-ci.yml:85-90 — PYTEST_EXTRA_ARGS array construction
PYTEST_EXTRA_ARGS=(-sv)
if [ -n "$TEST_FILTER" ]; then
PYTEST_EXTRA_ARGS+=(-k "$TEST_FILTER")
fiThis bash array is correctly constructed, but note that if TEST_FILTER contains spaces or special characters, it will be passed as a single argument (correct behavior for -k). This is fine.
🔵 Improvement: Deleted test_required_ci_gates.py — Consider documenting the branch protection assumption
The deleted test file (test_required_ci_gates.py) served as documentation that the gate jobs were load-bearing for branch protection. With its removal, there's no in-repo documentation explaining why the gate jobs aren't needed. Consider adding a comment in build.yaml or CONTRIBUTING.md explaining that branch protection treats if:-skipped jobs as passing, which is the critical assumption enabling this simplification.
Greptile SummaryThis PR simplifies CI infrastructure by removing redundant aggregator gate jobs ( Confidence Score: 5/5Safe to merge; changes are purely CI/infra with no product code impact and correct reasoning throughout. All concurrency key changes are semantically equivalent for their trigger events, gate-job removal relies on documented GitHub branch-protection behaviour for skipped checks, and the enhanced change-detection script handles error paths gracefully. No P0 or P1 findings identified. No files require special attention beyond the already-discussed deletion of test_required_ci_gates.py. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
PR[Pull Request / Push / workflow_dispatch] --> Changes
subgraph changes["changes job (Detect Changes)"]
direction TB
C1{EVENT_NAME == pull_request?}
C1 -- No --> D_true_np["decide true\n(non-PR event)"]
C1 -- Yes --> API[gh api: list changed files]
API -- API error --> D_failsafe["decide true\n(fail-safe)"]
API -- OK --> AnyMatch{any_match\nagainst patterns?}
AnyMatch -- Yes --> D_true["decide true\n(relevant paths changed)"]
AnyMatch -- No --> D_false["decide false\n(no relevant paths changed)"]
D_true --> OUT_true["run_docker_tests=true\n-> GITHUB_STEP_SUMMARY"]
D_failsafe --> OUT_true
D_true_np --> OUT_true
D_false --> OUT_false["run_docker_tests=false\n-> GITHUB_STEP_SUMMARY"]
end
Changes --> Build
Changes --> TestJobs
subgraph Build["build / build-curobo\n(if: run_docker_tests == 'true')"]
end
subgraph TestJobs["test-* matrix jobs\n(if: run_docker_tests == 'true')"]
end
Build --> BranchProtection["Branch Protection\n(skipped = green ✓)"]
TestJobs --> BranchProtection
OUT_false --> BranchProtection
Reviews (2): Last reviewed commit: "Display detected changes as a path-patte..." | Re-trigger Greptile |
|
Dear 🤖 Isaac Lab Review Bot! On the fail-safe diagnostics: the |
|
Dear 🤖 Isaac Lab Review Bot! On the duplicated |
|
Dear 🤖 Isaac Lab Review Bot! On the labeler concurrency removal: the labeler runs on |
|
Dear 🤖 Isaac Lab Review Bot! On |
|
Dear 🤖 Isaac Lab Review Bot! On documenting the branch protection assumption: it is already inline in |
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
The new commit (95f4169) adds documentation comments to build.yaml explaining that skipped jobs satisfy branch protection requirements. This directly addresses my previous suggestion about documenting the branch protection assumption.
Follow-up Assessment
Previous concern addressed: The new comment block at lines 83-91 in build.yaml now explains why Docker test jobs can skip without gate jobs:
# Docker test jobs run only when paths in `path_regex` change. Otherwise
# they skip via `if:` and report green to branch protection, which is
# why we don't use a workflow-level `paths:` filter (a not-triggered
# required check would block the PR forever).Similar documentation was added to install-ci.yml (lines 46-50). This is exactly what was suggested.
Implementation Verdict
Ship it — The remaining suggestions from the previous review (composite action extraction, persistent API failure handling) are nice-to-haves, not blockers. The core change is correct and now well-documented.
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This follow-up commit (f74a8b4) further refines the change detection logic by improving the GitHub job summary output and making the patterns more readable with tab-separated descriptions. The implementation is clean.
Follow-up Assessment
Previous concerns remain addressed. The documentation explaining that skipped jobs satisfy branch protection is still present and correct.
New changes look good: The refactored render_table() and decide() functions produce cleaner job summaries with pattern descriptions. The $'...\t...' syntax for tab-separated pattern/description pairs is a nice readability improvement.
Implementation Verdict
Ship it — The CI simplification is correct, well-documented, and the new summary formatting is a usability improvement.
CI cleanup that drops redundant aggregator gate jobs (skipped jobs already satisfy branch protection), fixes reporting of change-detection in the GitHub job summary, simplifies concurrency keys back to github.ref (run # was redundant).
Type of change
Checklist
./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there