Skip to content

Simplifies CI change detection and concurrency groups#5466

Open
myurasov-nv wants to merge 15 commits intoisaac-sim:developfrom
myurasov-nv:my-ci-cleanup
Open

Simplifies CI change detection and concurrency groups#5466
myurasov-nv wants to merge 15 commits intoisaac-sim:developfrom
myurasov-nv:my-ci-cleanup

Conversation

@myurasov-nv
Copy link
Copy Markdown
Member

@myurasov-nv myurasov-nv commented May 1, 2026

CI cleanup that drops redundant aggregator gate jobs (skipped jobs already satisfy branch protection), fixes reporting of change-detection in the GitHub job summary, simplifies concurrency keys back to github.ref (run # was redundant).

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@myurasov-nv myurasov-nv changed the title Fixes and simplifies change detection reporting Simplify CI required-check gating and concurrency May 1, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR simplifies CI workflows by removing redundant aggregator gate jobs (Docker Tests Gate and Installation Tests Gate), unifying concurrency keys to use github.ref, and adding improved change-detection reporting via GitHub job summaries. The core assumption is that branch protection rules already handle skipped jobs correctly without explicit gate jobs.

Architecture Impact

Self-contained CI/workflow changes. No impact on Isaac Lab runtime code. The changes affect:

  • build.yaml — Docker test orchestration
  • install-ci.yml — Installation test orchestration
  • 5 other workflows — concurrency key standardization
  • Removes test_required_ci_gates.py — regression test for the gate jobs

The key architectural question is whether GitHub branch protection correctly treats jobs that skip via if: as passing. This is a GitHub Actions behavior dependency, not a code correctness issue.

Implementation Verdict

Ship it — The changes are correct and well-reasoned. Minor improvements possible.

Test Coverage

The PR removes test_required_ci_gates.py which validated the gate job structure. The PR description claims the test is "now-redundant" because the gate jobs themselves are removed. This is logically consistent — the test was specifically for the gate jobs' structure, not for the change-detection logic itself.

🟡 Gap: There's no test coverage for the new decide() helper function's behavior (GitHub step summary formatting, fail-safe handling). However, this is bash in a workflow file, which is typically tested via integration/manual verification rather than unit tests.

CI Status

No CI checks available yet — cannot verify the workflows function correctly in practice.

Findings

🔵 Improvement: .github/workflows/build.yaml:118-121 — Fail-safe path could mask persistent API issues
The fail-safe that defaults to running tests when gh api fails is good for resilience, but emits only a ::warning::. If the GitHub API is persistently broken (e.g., token scope issue), PRs will silently run all tests forever without clear diagnosis. Consider adding a counter or distinct message if this happens repeatedly.

if ! changed_files="$(gh api --paginate "repos/$REPO/pulls/$PR_NUMBER/files" --jq '.[].filename')"; then
  echo "::warning::Could not list changed files; defaulting to running tests"

🔵 Improvement: .github/workflows/build.yaml:92 — Duplicated path_regex definition
The path_regex in build.yaml and install-ci.yml are different (correctly so), but the decide() helper function is duplicated verbatim across both files (lines 94-115 in build.yaml, lines 53-74 in install-ci.yml). Consider extracting this to a composite action to reduce duplication and ensure consistent behavior.

🔵 Improvement: .github/workflows/labeler.yml:7-9 — Concurrency removal is correct but undocumented
The labeler workflow runs on pull_request_target which has different security/triggering semantics than pull_request. Removing concurrency here is fine (labeling is idempotent and fast), but the PR description doesn't explain why this specific workflow's concurrency was "redundant" while others just had their keys changed.

🟡 Warning: .github/workflows/build.yaml:57 and others — Concurrency key change may affect workflow cancellation

# Before:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
# After:
group: ${{ github.workflow }}-${{ github.ref }}

For PRs, github.ref is refs/pull/<number>/merge, which is already PR-unique as claimed. However, for workflow_dispatch events, the github.ref is the branch name. If someone manually triggers build.yaml twice on the same branch in quick succession (e.g., debugging), the second will cancel the first. This is likely intended behavior, but worth noting since the original code explicitly handled the distinction.

🔵 Improvement: .github/workflows/install-ci.yml:85-90 — PYTEST_EXTRA_ARGS array construction

PYTEST_EXTRA_ARGS=(-sv)
if [ -n "$TEST_FILTER" ]; then
  PYTEST_EXTRA_ARGS+=(-k "$TEST_FILTER")
fi

This bash array is correctly constructed, but note that if TEST_FILTER contains spaces or special characters, it will be passed as a single argument (correct behavior for -k). This is fine.

🔵 Improvement: Deleted test_required_ci_gates.py — Consider documenting the branch protection assumption
The deleted test file (test_required_ci_gates.py) served as documentation that the gate jobs were load-bearing for branch protection. With its removal, there's no in-repo documentation explaining why the gate jobs aren't needed. Consider adding a comment in build.yaml or CONTRIBUTING.md explaining that branch protection treats if:-skipped jobs as passing, which is the critical assumption enabling this simplification.

@myurasov-nv myurasov-nv marked this pull request as ready for review May 1, 2026 01:34
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 1, 2026

Greptile Summary

This PR simplifies CI infrastructure by removing redundant aggregator gate jobs (docker-tests-gate, installation-tests-gate) from build.yaml and install-ci.yml, since GitHub branch protection already treats skipped jobs as passing. It also standardises all concurrency keys to ${{ github.workflow }}-${{ github.ref }} (valid because github.ref is refs/pull/{N}/merge for pull_request events, giving per-PR uniqueness), and expands the change-detection step with clearer reporting and a fail-safe API-error path.

Confidence Score: 5/5

Safe to merge; changes are purely CI/infra with no product code impact and correct reasoning throughout.

All concurrency key changes are semantically equivalent for their trigger events, gate-job removal relies on documented GitHub branch-protection behaviour for skipped checks, and the enhanced change-detection script handles error paths gracefully. No P0 or P1 findings identified.

No files require special attention beyond the already-discussed deletion of test_required_ci_gates.py.

Important Files Changed

Filename Overview
.github/workflows/build.yaml Concurrency simplified to github.ref; change-detection script expanded with render_table/any_match/decide helpers and a GitHub Step Summary; docker-tests-gate aggregator job removed.
.github/workflows/install-ci.yml Same refactor as build.yaml: concurrency simplified, change-detection helpers added, installation-tests-gate aggregator removed.
.github/workflows/pre-commit.yaml Concurrency changed from github.event.pull_request.number to github.ref; for pull_request events github.ref is refs/pull/{N}/merge, so each PR still gets its own unique group.
.github/workflows/labeler.yml Concurrency block removed entirely; labeling is idempotent and short, so the lack of cancellation has negligible impact.
.github/workflows/test_required_ci_gates.py Deleted; previously validated structural invariants of the gate jobs that no longer exist. No replacement tests for the new invariants (no paths: filter, build if-condition, changes output presence).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    PR[Pull Request / Push / workflow_dispatch] --> Changes

    subgraph changes["changes job (Detect Changes)"]
        direction TB
        C1{EVENT_NAME == pull_request?}
        C1 -- No --> D_true_np["decide true\n(non-PR event)"]
        C1 -- Yes --> API[gh api: list changed files]
        API -- API error --> D_failsafe["decide true\n(fail-safe)"]
        API -- OK --> AnyMatch{any_match\nagainst patterns?}
        AnyMatch -- Yes --> D_true["decide true\n(relevant paths changed)"]
        AnyMatch -- No --> D_false["decide false\n(no relevant paths changed)"]
        D_true --> OUT_true["run_docker_tests=true\n-> GITHUB_STEP_SUMMARY"]
        D_failsafe --> OUT_true
        D_true_np --> OUT_true
        D_false --> OUT_false["run_docker_tests=false\n-> GITHUB_STEP_SUMMARY"]
    end

    Changes --> Build
    Changes --> TestJobs

    subgraph Build["build / build-curobo\n(if: run_docker_tests == 'true')"]
    end

    subgraph TestJobs["test-* matrix jobs\n(if: run_docker_tests == 'true')"]
    end

    Build --> BranchProtection["Branch Protection\n(skipped = green ✓)"]
    TestJobs --> BranchProtection
    OUT_false --> BranchProtection
Loading

Reviews (2): Last reviewed commit: "Display detected changes as a path-patte..." | Re-trigger Greptile

Comment thread .github/workflows/labeler.yml
@myurasov-nv
Copy link
Copy Markdown
Member Author

Dear 🤖 Isaac Lab Review Bot! On the fail-safe diagnostics: the ::warning:: already surfaces in the PR check UI, and a persistent failure becomes a recurring warning across every PR, which is itself the signal. A counter would need cross-run state. Leaving as is.

@myurasov-nv
Copy link
Copy Markdown
Member Author

Dear 🤖 Isaac Lab Review Bot! On the duplicated decide() helper: noted as a follow-up. Two callers do not yet justify a composite action. If a third workflow needs the same gating, that is the right time to extract.

@myurasov-nv
Copy link
Copy Markdown
Member Author

Dear 🤖 Isaac Lab Review Bot! On the labeler concurrency removal: the labeler runs on pull_request_target, is idempotent, and fast. Cancellation could leave a PR partially labeled. Pre #5429 there was no concurrency block on this workflow; the change is a revert to that default.

@myurasov-nv
Copy link
Copy Markdown
Member Author

Dear 🤖 Isaac Lab Review Bot! On workflow_dispatch concurrency: this is a revert to the pre #5429 behavior (${{ github.ref }}). Two manual dispatches on the same branch cancelling each other has been the long standing behavior of this workflow.

@myurasov-nv
Copy link
Copy Markdown
Member Author

Dear 🤖 Isaac Lab Review Bot! On documenting the branch protection assumption: it is already inline in build.yaml (lines 88 to 91) and install-ci.yml (lines 49 to 52): "they skip via if: and report green to branch protection".

Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

The new commit (95f4169) adds documentation comments to build.yaml explaining that skipped jobs satisfy branch protection requirements. This directly addresses my previous suggestion about documenting the branch protection assumption.

Follow-up Assessment

Previous concern addressed: The new comment block at lines 83-91 in build.yaml now explains why Docker test jobs can skip without gate jobs:

# Docker test jobs run only when paths in `path_regex` change. Otherwise
# they skip via `if:` and report green to branch protection, which is
# why we don't use a workflow-level `paths:` filter (a not-triggered
# required check would block the PR forever).

Similar documentation was added to install-ci.yml (lines 46-50). This is exactly what was suggested.

Implementation Verdict

Ship it — The remaining suggestions from the previous review (composite action extraction, persistent API failure handling) are nice-to-haves, not blockers. The core change is correct and now well-documented.

@myurasov-nv
Copy link
Copy Markdown
Member Author

@greptile-apps

@myurasov-nv myurasov-nv changed the title Simplify CI required-check gating and concurrency Simplify CI change detection/gating and concurrency groups May 1, 2026
@myurasov-nv myurasov-nv changed the title Simplify CI change detection/gating and concurrency groups Fixes CI change detection/gating and concurrency groups May 1, 2026
@myurasov-nv myurasov-nv changed the title Fixes CI change detection/gating and concurrency groups Fixes CI change detection/gating and simplifies concurrency groups May 1, 2026
@myurasov-nv myurasov-nv changed the title Fixes CI change detection/gating and simplifies concurrency groups Fixes CI change detection/gating and concurrency groups May 1, 2026
@myurasov-nv myurasov-nv changed the title Fixes CI change detection/gating and concurrency groups Simplifies CI change detection/gating and concurrency groups May 1, 2026
@myurasov-nv myurasov-nv changed the title Simplifies CI change detection/gating and concurrency groups Simplifies CI change detection and concurrency groups May 1, 2026
Comment thread .github/workflows/build.yaml Outdated
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This follow-up commit (f74a8b4) further refines the change detection logic by improving the GitHub job summary output and making the patterns more readable with tab-separated descriptions. The implementation is clean.

Follow-up Assessment

Previous concerns remain addressed. The documentation explaining that skipped jobs satisfy branch protection is still present and correct.

New changes look good: The refactored render_table() and decide() functions produce cleaner job summaries with pattern descriptions. The $'...\t...' syntax for tab-separated pattern/description pairs is a nice readability improvement.

Implementation Verdict

Ship it — The CI simplification is correct, well-documented, and the new summary formatting is a usability improvement.

@myurasov-nv myurasov-nv requested a review from kellyguo11 May 3, 2026 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants