Deploying elastic-fruit-runner with
Latest commit: b82f4a2
Status: ✅ Deploy successful!
Preview URL: https://d912673e.elastic-fruit-runner.pages.dev
Branch Preview URL: https://feat-tart-integration-tests.elastic-fruit-runner.pages.dev
3a115bb to
660b772
💡 Codex Review
The integration matrix now runs entirely on
elastic-fruit-runner/test/integration/steps_test.go Lines 722 to 726 in 3a115bb
VM cleanup is only performed in this explicit scenario step, so if any earlier Tart step fails (pull/clone/start/ip/exec), the remaining steps are skipped and stop/delete never runs. Because these tests run on persistent self-hosted machines, leaked VMs can accumulate and interfere with subsequent jobs. Add unconditional teardown in
Codecov Report
❌ Patch coverage is
... and 1 file with indirect coverage changes
|
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 45b7b78937
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Scenario: list and cleanup orphaned VMs
  Given a tart manager
  When I clone a VM with a random name
Pull image before cloning in orphan-cleanup scenario
The "list and cleanup orphaned VMs" scenario clones immediately without first ensuring the test image is present locally, so on a clean runner (or when this scenario is run in isolation) Clone can fail before any cleanup assertions run. This makes the scenario depend on the "pull, clone, start, exec, and cleanup a VM" scenario having run first instead of being self-contained.
	"failed": {},
	"Succeeded": {},
	"Failed": {},
}
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 90f9e155d2
_ = state.tartMgr.Stop(context.Background(), name)
_ = state.tartMgr.Delete(context.Background(), name)
Propagate tart cleanup failures instead of discarding them
This step ignores both Stop and Delete errors, so the scenario can still pass even when cleanup fails for prefixed VMs (especially stale orphans from prior runs). Because the next assertion only verifies the current cloned VM name, failures on other matched VMs are silently masked, which can leave orphaned Tart VMs behind and hide regressions in the cleanup path.
Add BDD scenarios for tart manager operations with nested virtualization:
- Pull, clone, start, IP discovery, SSH exec, stop, delete
- List and cleanup orphaned VMs with prefix matching
Uses ghcr.io/cirruslabs/macos-tahoe-base:latest as a smaller test image (configurable via EFR_TEST_TART_IMAGE). Tart steps auto-skip via godog.ErrPending when tart binary is not available.
CI matrix gains a "Tart" job running on self-hosted macOS arm64 runner with nested virtualization enabled.
Use per-matrix runs-on so PAT/GitHub App tests run on Linux arm64 while Tart tests run on macOS arm64 (required for Apple Virtualization.framework and nested virtualization).
… API
GitHub Actions API sends job results in lowercase ("succeeded", "failed")
but the mapping code expected Title Case ("Succeeded", "Failed"), causing
all completed jobs to show as FAIL in the dashboard (issue #68).
90f9e15 to
6953359
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6953359dfb
})

sc.Step(`^I pull the VM image$`, func() error {
	image := envOrDefault("EFR_TEST_TART_IMAGE", "ghcr.io/cirruslabs/macos-tahoe-base:latest")
Pin Tart test image to a fixed tag or digest
Defaulting EFR_TEST_TART_IMAGE to ghcr.io/cirruslabs/macos-tahoe-base:latest makes this integration suite non-deterministic, and the workflow does not override that variable for the Tart job. When the upstream latest tag moves, lifecycle behavior (boot timing, SSH defaults, image contents) can change and break CI without any repository change, so this should use a pinned image tag or digest.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e741951162
+ case "succeeded":
  	return controlplanev1.JobResult_JOB_RESULT_SUCCESS
- case "Failed":
+ case "failed":
Accept legacy capitalized job result strings
This switch now only maps lowercase values, but existing jobs.db records written by prior versions used "Succeeded"/"Failed". After upgrade, those rows resolve to JOB_RESULT_UNSPECIFIED, and the dashboard path treats completed unknown results as failures, so historical successful jobs are misreported until old rows roll off. Please keep backward-compatible mappings (or migrate stored rows) in toProtoJobResult to avoid data corruption in the UI during upgrades.
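A backward-compatible mapping could normalize case before switching, so legacy and current rows resolve to the same value. This is a minimal sketch under stated assumptions: `JobResult` and `toJobResult` are hypothetical stand-ins for the generated `controlplanev1.JobResult` enum and `toProtoJobResult`, and only the two result strings mentioned in this thread are handled.

```go
package main

import (
	"fmt"
	"strings"
)

// JobResult is a hypothetical stand-in for the generated
// controlplanev1.JobResult enum.
type JobResult int

const (
	JobResultUnspecified JobResult = iota
	JobResultSuccess
	JobResultFailure
)

// toJobResult maps a stored result string to the enum, ignoring case so
// that both legacy rows ("Succeeded") and current rows ("succeeded") resolve.
func toJobResult(s string) JobResult {
	switch strings.ToLower(s) {
	case "succeeded":
		return JobResultSuccess
	case "failed":
		return JobResultFailure
	default:
		return JobResultUnspecified
	}
}

func main() {
	for _, s := range []string{"succeeded", "Succeeded", "failed", "Failed", "canceled"} {
		fmt.Printf("%q -> %d\n", s, toJobResult(s))
	}
}
```

Normalizing at read time avoids a data migration; the alternative the review mentions, rewriting stored rows, would keep the switch strict but needs a one-off upgrade step.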
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 453b2a90da
if binpath.Lookup("tart") == "tart" {
	return ctx, fmt.Errorf("tart binary not found in PATH; install via `brew install cirruslabs/cli/tart`")
Skip Tart scenarios when binary is unavailable
The "a tart manager" step hard-fails with a regular error when tart is missing, which fails the scenario (and often the full integration run) in environments that do not have Tart installed. Because the integration harness runs all feature files by default when EFR_TEST_FEATURES is unset, non-Tart environments cannot run the rest of the suite; Tart-only scenarios should be skipped cleanly instead.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ccced30ccb
	return ctx, fmt.Errorf("tart binary not found in PATH; install via `brew install cirruslabs/cli/tart`")
}
state.tartMgr = tart.NewManager()
state.tartPrefix = "efr-tart-test"
Scope Tart cleanup to this test run
Setting state.tartPrefix to the fixed value "efr-tart-test" makes both cleanup paths delete any VM on the host that matches that shared prefix, not just VMs created by the current scenario. On shared/self-hosted machines (or overlapping local runs), one run can remove another run’s VM and cause nondeterministic failures or unintended data loss. Use a per-run/per-scenario unique prefix (while keeping it stable within that run) so cleanup only targets resources created by the current execution.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ead520d3b0
security list-keychains -d user -s "$tart_keychain"
security default-keychain -s "$tart_keychain"
security login-keychain -s "$tart_keychain"
Restore keychain defaults after Tart setup
This step mutates persistent user keychain state on a self-hosted macOS runner (list-keychains -s, default-keychain -s, and login-keychain -s) but never restores it, so later jobs can inherit a broken search/default/login keychain configuration (including a temp keychain under RUNNER_TEMP that may be removed after the job). On shared runners this can cause unrelated workflows to fail keychain-backed auth/signing operations.
sudo defaults write /Library/Preferences/SystemConfiguration/com.apple.InternetSharing.default.plist bootpd -dict DHCPLeaseTimeSecs -int 600
sudo rm -f /var/db/dhcpd_leases
Revert host DHCP tweaks after Tart job
The workflow writes /Library/Preferences/SystemConfiguration/com.apple.InternetSharing.default.plist and deletes /var/db/dhcpd_leases with sudo on a self-hosted machine, but there is no cleanup step to restore prior host networking state. Because this is a persistent host-level change, subsequent jobs on the same runner can observe altered DHCP behavior and flaky VM networking unrelated to this test run.
- cmd := exec.CommandContext(ctx, binpath.Lookup("tart"), "ip", name, "--wait", "60")
- out, err := cmd.Output()
+ cmd := exec.CommandContext(ctx, binpath.Lookup("tart"), "ip", name, "--wait", strconv.Itoa(ipAddressWaitSeconds))
+ out, err := cmd.CombinedOutput()
[P2] CombinedOutput() mixes stderr into the parsed IP
Switching from cmd.Output() to cmd.CombinedOutput() while still parsing the result with strings.TrimSpace(string(out)) at line 146 means any stderr output from tart ip --wait (progress, warnings, future log lines) gets concatenated with the IP and breaks net.ParseIP. The intent here was clearly to surface stderr in the error message, but doing so via CombinedOutput regresses the success path.
Suggested fix: capture stderr separately and keep Output() for the parsed value:
var stderr bytes.Buffer
cmd.Stderr = &stderr
out, err := cmd.Output()
if err != nil {
	err = fmt.Errorf("tart ip %s: %w\n%s", name, err, stderr.Bytes())
	...
}
Summary
ghcr.io/cirruslabs/macos-tahoe-base:latest as smaller test image (no Xcode)
tart binary is not available
Test plan