ci: add PR hygiene automation (linked issue check + stale PR cleanup)#521
ci: add PR hygiene automation (linked issue check + stale PR cleanup)#521andreatgretel merged 11 commits intomainfrom
Conversation
Add two workflows to enforce contribution quality and clean up abandoned PRs: - pr-linked-issue.yml: required status check that validates external PRs reference a triaged issue. Collaborators bypass. Re-triggers automatically when a maintainer adds the `triaged` label to the linked issue. - pr-stale.yml: daily cron that reminds authors of failing checks after 7/14 days of inactivity and auto-closes after 14/28 days (external/collaborator). Respects `keep-open` label. New labels created: `triaged`, `task`, `keep-open`. Closes #518 Signed-off-by: Andrea Manoel <[email protected]>
Add a weekly scheduled workflow that uses Claude to triage all open issues and PRs, producing a combined dashboard report on a pinned tracking issue. - New recipe (.agents/recipes/issue-triage/) classifies issues, checks staleness, cross-references merged PRs, detects duplicates, and flags PR health problems (missing linked issues, failing checks, orphaned PRs) - New workflow (.github/workflows/agentic-ci-issue-triage.yml) runs every Monday 10:00 UTC on the agentic-ci runner, with manual dispatch support - pr-stale.yml now adds needs-attention label to linked issues when a PR is auto-closed, bridging the two workflows via labels
c9fb5ed to
3fea4ff
Compare
SummaryPR #521 adds three GitHub Actions workflows and a triage recipe to enforce contribution quality for the DataDesigner repository:
Total: 873 additions, 1 deletion across 6 files. All new files except the CONTRIBUTING.md edit. FindingsCriticalNo critical issues found. High
Medium
Low
VerdictApprove with suggestions. This is a well-structured PR that adds meaningful contribution-quality guardrails. The workflow design follows established patterns in the repo ( Must-fix before merge:
Strongly recommended:
The remaining findings are minor improvements that can be addressed in follow-up work. |
Greptile SummaryAdds three GitHub Actions workflows to enforce PR hygiene: a linked-issue check (
|
| Filename | Overview |
|---|---|
| .agents/recipes/issue-triage/recipe.md | New agent recipe defining data-gathering, classification, and report-generation instructions for the weekly triage workflow; logic and format are consistent with the workflow that invokes it. |
| .github/workflows/agentic-ci-issue-triage.yml | New weekly agentic triage workflow with a correct primary posting path; the fallback comment-lookup query lacks --paginate and will silently target the wrong comment after ~30 runs. |
| .github/workflows/pr-linked-issue.yml | New linked-issue check using pull_request_target safely (no checkout); previously flagged bugs (wrong status output, PR-number multiline output) confirmed fixed in prior commits. |
| .github/workflows/pr-stale.yml | New stale-PR cleanup workflow; reminder/close logic is sound, but the unpaginated commits API call can return a stale "last push" timestamp for PRs with >30 commits, potentially triggering false reminders. |
| CONTRIBUTING.md | Documents linked-issue requirement and stale-PR policy accurately matching the new workflow behavior. |
| plans/518/pr-hygiene-plan.md | Design document for the PR hygiene work; no executable content, consistent with the implemented workflows. |
Sequence Diagram
sequenceDiagram
participant C as Contributor
participant GH as GitHub
participant LIC as pr-linked-issue.yml
participant Stale as pr-stale.yml
participant Triage as agentic-ci-issue-triage.yml
participant TI as Tracking Issue
C->>GH: Opens PR (fork)
GH->>LIC: pull_request_target [opened]
LIC->>GH: Check author permissions (collaborator API)
LIC->>GH: Parse body for Fixes/Closes/Resolves #N
LIC->>GH: Validate issue exists + has triaged label
LIC->>C: Post failure comment (or pass silently)
Note over GH,LIC: Maintainer labels issue as triaged
GH->>LIC: issues [labeled: triaged]
LIC->>GH: Find open PRs referencing issue
LIC->>GH: Edit PR body (hidden timestamp triggers edited event)
GH->>LIC: pull_request_target [edited]
LIC->>C: Check passes, delete failure comment
Note over Stale: Daily cron 09:00 UTC
GH->>Stale: schedule / workflow_dispatch
Stale->>GH: List open PRs with failing checks
Stale->>C: Post reminder (7d external / 14d collab)
Stale->>GH: Close PR + add needs-attention to linked issue (14d / 28d)
Note over Triage: Weekly Monday 10:00 UTC
GH->>Triage: schedule / workflow_dispatch
Triage->>GH: Fetch all issues + open/merged PRs
Triage->>TI: Edit existing triage comment (or post new)
Note over TI: Tracks needs-attention labels from Stale workflow
Prompt To Fix All With AI
This is a comment left during a code review.
Path: .github/workflows/agentic-ci-issue-triage.yml
Line: 111-113
Comment:
**Fallback comment lookup breaks after ~30 weekly runs**
`gh api` without `--paginate` returns only the first page (30 items). GitHub's issue comments API returns comments in ascending (oldest-first) order, so once 30+ weekly triage comments have accumulated, `last` selects the 30th-oldest comment rather than the most recent one. The fallback step would then either silently overwrite a ~7-month-old comment or skip the update entirely (if that old comment happened to contain today's date string), leaving the tracking issue without a current report.
```suggestion
EXISTING=$(gh api "repos/${{ github.repository }}/issues/${TRACKING_ISSUE}/comments" \
--paginate \
--jq "[.[] | select(.user.login == \"github-actions[bot]\") | select(.body | contains(\"${MARKER}\"))] | last | .id" \
2>/dev/null || echo "")
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: .github/workflows/pr-stale.yml
Line: 89-90
Comment:
**Unpaginated commits API gives stale "last push" for PRs with 30+ commits**
`gh api .../commits` without `--paginate` returns the first page (30 commits), ordered oldest-first. For a PR with more than 30 commits, `last` gets the 30th-oldest commit's timestamp rather than the actual most recent push. A contributor who has been actively pushing to a long-lived PR with many commits could have their inactivity clock based on a commit from weeks ago, triggering a false stale reminder or even premature auto-close.
```suggestion
LAST_PUSH=$(gh api "repos/${REPO}/pulls/${PR_NUM}/commits?per_page=100" \
--jq 'last | .commit.committer.date' 2>/dev/null || echo "")
```
For PRs exceeding 100 commits `--paginate` would be needed; `per_page=100` covers the vast majority of cases with a single request.
How can I resolve this? If you propose a fix, please make it concise.Reviews (4): Last reviewed commit: "Merge branch 'main' into andreatgretel/f..." | Re-trigger Greptile
- pr-linked-issue: fix comment gate so failure comments are posted - pr-stale: upgrade issues permission to write for labeling - pr-stale: compare reminder timestamp against last activity so push/comment actually resets the stale timer
PR bodies with backticks or unmatched quotes would break the gh pr edit --body "$NEW_BODY" call. Write to a temp file and use --body-file instead.
jq outputs newline-separated numbers but GITHUB_OUTPUT only preserves the first line. Convert to space-separated so the for loop processes all matching PRs.
- Move attacker-influenced values (${{ user.login }}, step outputs)
from expression interpolation in run: blocks to env vars
- Replace echo "$PR_BODY" | grep with write-to-file + grep-file
to avoid shell expansion of untrusted PR body content
- Same treatment for PR body handling in retrigger and stale jobs
Remove peter-evans/find-comment and peter-evans/create-or-update-comment third-party action dependencies. Replace with gh api calls for finding, creating, updating, and deleting bot comments. Eliminates supply chain risk from unpinned third-party actions.
PR Review: #521 — ci: add PR hygiene automation (linked issue check + stale PR cleanup)SummaryThis PR adds three interconnected GitHub Actions workflows to enforce contribution quality and clean up abandoned PRs:
Also includes a triage recipe ( Files changed: 6 | +888 / -1 | Commits: 10 The security posture is strong. The commit history shows iterative hardening: FindingsCriticalNo critical issues found. HighNo high-severity issues found. Medium1. The existing uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4The new triage workflow uses the unpinned mutable tag: uses: actions/checkout@v4For consistency and supply chain safety, pin to the same SHA. 2. The REMINDER_JSON=$(gh api "repos/${REPO}/issues/${PR_NUM}/comments" --jq "..." ...)
LAST_COMMENT=$(gh api "repos/${REPO}/issues/${PR_NUM}/comments" --jq "..." ...)On PRs with 30+ comments, this could miss the reminder comment (leading to duplicate reminders or premature closure) or the latest author comment (causing false stale classification). Adding The same concern applies to the comment-lookup in 3. The stale check iterates over all open PRs and makes 3-4 API calls per PR (collaborator permission, PR checks, commits endpoint, comments endpoint). With the Recommendation: Consider adding Low4. The fallback step updates an existing comment with: REPORT=$(cat /tmp/issue-triage-report.md)
gh api -X PATCH "repos/.../issues/comments/${EXISTING}" -f body="$REPORT"For very large reports or reports with shell-special characters, this could be fragile. The new-comment path already uses jq -n --rawfile body /tmp/issue-triage-report.md '{"body": $body}' \
| gh api -X PATCH "repos/.../issues/comments/${EXISTING}" --input -5. The plan frontmatter has 6. The recipe frontmatter declares permissions:
issues: write # for posting to the tracking issue, not modifying triaged itemsInfo7. Several 8. macOS Date parsing includes What looks good
VerdictApprove with suggestions. The core logic is correct, security posture is strong, and the architecture is well-designed with good escape hatches. The iterative hardening across the commit history shows careful attention to security. Recommended before merge:
The remaining findings are minor improvements that can be addressed in follow-up work. |
Summary
Add GitHub Actions workflows to enforce contribution quality, clean up
abandoned PRs, and run weekly agentic triage of the full issue+PR landscape.
Closes #518
Changes
Added
pr-linked-issue.yml- linked issue check with two jobs:check(PR events) - validates non-collaborator PRs reference a triaged issue viaFixes #N. Collaborators bypass. Uses the collaborator API pattern fromagentic-ci-pr-review.ymland peter-evans actions for idempotent comment management.retrigger(issue labeled) - when a maintainer addstriagedto an issue, finds open PRs referencing it and edits a hidden HTML comment in their body to re-trigger the check.pr-stale.yml- daily cron (09:00 UTC) that scans open PRs with failing checks. Posts a reminder after 7 days (non-collaborator) or 14 days (collaborator) of inactivity, auto-closes after 14/28 days.keep-openlabel as escape hatch. On auto-close, addsneeds-attentionlabel to the linked issue so the triage workflow picks it up.agentic-ci-issue-triage.yml- weekly scheduled workflow (Monday 10:00 UTC) that uses Claude on theagentic-cirunner to triage all open issues and PRs. Produces a combined dashboard report posted to a pinned tracking issue..agents/recipes/issue-triage/recipe.md- recipe prompt for the triage agent. Classifies issues (bug/feature/chore/discussion), checks staleness (14/30 day thresholds), cross-references merged PRs to detect resolved issues, flags duplicates, and checks PR health (missing linked issues, failing checks, orphaned PRs, duplicate fixes).plans/518/pr-hygiene-plan.md- standalone plantriaged,task,keep-open,needs-attentionChanged
CONTRIBUTING.md- documented linked issue requirement, stale PR policy, and auto-retrigger behaviorWhat changes for external contributors
New contributors opening a PR against this repo will see the following flow:
triagedlabel.Fixes #123,Closes #123, orResolves #123.triagedlabel. If not, the check blocks and a bot comment explains what to fix.triagedlabel to the linked issue. No action needed from the contributor.keep-openlabel (maintainer-only) prevents auto-close.Collaborators (write/admin) bypass the linked issue check entirely and get longer stale thresholds (14-day reminder, 28-day auto-close).
How the workflows connect
Attention areas
pr-linked-issue.ymlusespull_request_target(notpull_request) so the workflow token can post comments on fork PRs. Safe because no checkout or code execution from the PR branch occurs.agentic-ci-issue-triage.ymlrequires two repo-level settings:AGENTIC_CI_MODELvariable (already set for PR review)ISSUE_TRIAGE_TRACKING_ISSUEvariable (number of the pinned tracking issue)Post-merge
Linked Issue Checkas a required status check in branch protection formainISSUE_TRIAGE_TRACKING_ISSUErepo variableDraft: pinned tracking issue body
Repository Triage Dashboard
Automated weekly triage of open issues and pull requests, posted by the
Repository Triage workflow.
How it works
Every Monday at 10:00 UTC, an agent scans all open issues and PRs and posts
a report as a comment below. The comment is edited in place each run - there
is always exactly one report reflecting the latest state.
The report covers:
Labels
needs-attentionkeep-opentriagedManual trigger
Run the workflow on-demand from the Actions tab
using "Run workflow".
Testing
Manual validation:
triagedto linked issue, verify re-check passes automaticallyworkflow_dispatch, verify correct PR targetingkeep-openlabel prevents auto-closeworkflow_dispatch, verify report posted to tracking issueneeds-attentionlabel added to linked issue on stale PR auto-closeDescription updated with AI