
OCPBUGS-83267: Use upgrades.Skippable for Gateway API upgrade test skip logic #31000

Open
gcs278 wants to merge 1 commit into openshift:main from gcs278:fix-gwapi-upgrade-ipv6-skip

Conversation

@gcs278
Contributor

@gcs278 gcs278 commented Apr 12, 2026

Summary

  • The Gateway API upgrade test (NE-2561: Add Gateway API OLM to NO-OLM migration upgrade test #30897) called g.Skip() from Setup(), which runs inside a goroutine managed by the disruption framework. Since g.Skip() panics and Ginkgo can only recover panics inside leaf nodes, this caused unrecoverable panics on IPv6/dual-stack, OKD, and unsupported platform clusters.
  • Implement the upgrades.Skippable interface with a Skip() method that the disruption framework calls before Setup(), avoiding the goroutine panic.
  • Refactor checkPlatformSupportAndGetCapabilities into shouldSkipGatewayAPITests (safe outside Ginkgo nodes) and getPlatformCapabilities (returns LB/DNS support).
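
The ordering described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the real disruption framework: `UpgradeContext`, `Test`, `Skippable`, `gatewayTest`, and `runTest` are all hypothetical stand-ins, and the real `upgrades.Skippable` interface may differ in detail. The point is only that the skip decision is made on the caller's goroutine, before `Setup()` is launched in a goroutine where a panic could not be recovered by the caller.

```go
package main

import "fmt"

// UpgradeContext is a hypothetical stand-in for the framework's context type.
type UpgradeContext struct{}

// Test is a hypothetical, simplified upgrade test interface.
type Test interface {
	Name() string
	Setup()
}

// Skippable is a hypothetical mirror of the optional skip hook.
type Skippable interface {
	Skip(ctx UpgradeContext) bool
}

// gatewayTest is an illustrative test that skips on IPv6 clusters.
type gatewayTest struct{ ipv6 bool }

func (t *gatewayTest) Name() string { return "gateway-api-upgrade" }

// Setup no longer needs to make (or panic on) a skip decision.
func (t *gatewayTest) Setup() {}

// Skip returns a plain bool instead of panicking.
func (t *gatewayTest) Skip(UpgradeContext) bool { return t.ipv6 }

// runTest consults Skip on the calling goroutine, then runs Setup
// in a separate goroutine, as a framework might.
func runTest(t Test, ctx UpgradeContext) string {
	if s, ok := t.(Skippable); ok && s.Skip(ctx) {
		return "skipped"
	}
	done := make(chan struct{})
	go func() {
		defer close(done)
		t.Setup()
	}()
	<-done
	return "ran"
}

func main() {
	fmt.Println(runTest(&gatewayTest{ipv6: true}, UpgradeContext{}))
	fmt.Println(runTest(&gatewayTest{ipv6: false}, UpgradeContext{}))
}
```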

Test plan

  • Verify Gateway API upgrade test is skipped cleanly on IPv6/dual-stack clusters (no panic)
  • [ ] Verify Gateway API upgrade test is skipped cleanly on OKD clusters (I don't see any OKD jobs)
  • [ ] Verify Gateway API upgrade test is skipped cleanly on unsupported platforms (it looks like we support all platforms in the current jobs; I can search, though)
  • Verify Gateway API upgrade test still runs successfully on supported IPv4 clusters
  • Verify non-upgrade Gateway API tests (gatewayapicontroller.go BeforeEach) still skip correctly via g.Skip()

🤖 Generated with Claude Code

@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 12, 2026
@openshift-ci-robot

openshift-ci-robot commented Apr 12, 2026

@gcs278: This pull request references NE-2292 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai bot commented Apr 12, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Added a pre-Setup skip hook on Gateway API upgrade tests that performs platform/OLM capability and eligibility checks before Setup. Skip collects precheck errors (stored on the test), performs Ginkgo skip when appropriate, and Setup asserts no precheck error and uses a simplified capability resolver.

Changes

Cohort / File(s) and summary:

  • Gateway API Upgrade test (test/extended/router/gatewayapi_upgrade.go): Added func (t *GatewayAPIUpgradeTest) Skip(_ upgrades.UpgradeContext) bool and precheckErr error. Skip calls shouldSkipGatewayAPITests, stores any error into t.precheckErr (and returns false), issues a Ginkgo skip when skip==true with the provided reason, and when NO-OLM is not enabled performs OLM/Marketplace capability checks (errors recorded into t.precheckErr, non-enabled capability causes skip). Setup now asserts t.precheckErr==nil, uses getPlatformCapabilities, and no longer performs capability-based skipping at Setup time.
  • Gateway API controller / platform capabilities (test/extended/router/gatewayapicontroller.go): Replaced checkPlatformSupportAndGetCapabilities with shouldSkipGatewayAPITests(oc *exutil.CLI) (bool, string, error) and getPlatformCapabilities(oc *exutil.CLI) (loadBalancerSupported bool, managedDNS bool). shouldSkipGatewayAPITests determines OKD, reads Infrastructure.PlatformStatus, and returns skip decisions/reasons (including unsupported platform types and IPv6/dual-stack) or errors (no Ginkgo/Gomega calls). getPlatformCapabilities only maps PlatformStatus.Type to loadBalancerSupported and computes managedDNS via isDNSManaged(oc), logging results; it no longer makes skip decisions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from Thealisyed and rfredette April 12, 2026 20:57
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 12, 2026
@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch 2 times, most recently from 73d03cd to 853bb58 Compare April 12, 2026 21:00
@gcs278 gcs278 changed the title NE-2292: Use upgrades.Skippable for Gateway API upgrade test skip logic OCPBUGS-82586: Use upgrades.Skippable for Gateway API upgrade test skip logic Apr 12, 2026
@openshift-ci-robot openshift-ci-robot added the jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. label Apr 12, 2026
@openshift-ci-robot

@gcs278: This pull request references Jira Issue OCPBUGS-82586, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.


@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Apr 12, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/extended/router/gatewayapi_upgrade.go`:
- Around line 53-60: Replace the helper that panics inside Skip by adding a new
function gatewayAPIWithoutOLMEnabled(oc *exutil.CLI) (bool, error) (in
gatewayapicontroller.go) which performs the same FeatureGates check as
isNoOLMFeatureGateEnabled but returns (bool, error) instead of using o.Expect;
then change the Skip() call site to call gatewayAPIWithoutOLMEnabled(oc), check
the returned error and treat any error as a condition to skip the test
(log/return true) rather than allowing a panic, and ensure no o.Expect-based
helpers are invoked from within Skip().

In `@test/extended/router/gatewayapicontroller.go`:
- Around line 554-587: shouldSkipGatewayAPITests must not invoke
isIPv6OrDualStack because that helper uses o.Expect and can panic; change the
code so it calls a non-panicking variant that returns (bool, error) (either
modify isIPv6OrDualStack to return (bool, error) or add isIPv6OrDualStackSafe),
check the error and on error return (true, fmt.Sprintf("Failed to determine IP
stack: %v", err)), and only skip for IPv6/dual-stack when the boolean is true;
reference the functions shouldSkipGatewayAPITests and isIPv6OrDualStack (or the
new isIPv6OrDualStackSafe) and ensure no o.Expect is used in the path called
from GatewayAPIUpgradeTest.Skip.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c5b789a9-7819-41a8-b936-721a3717190d

📥 Commits

Reviewing files that changed from the base of the PR and between 1a93dad and 090b384.

📒 Files selected for processing (2)
  • test/extended/router/gatewayapi_upgrade.go
  • test/extended/router/gatewayapicontroller.go

@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch from 853bb58 to d765284 Compare April 12, 2026 21:04

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (2)
test/extended/router/gatewayapi_upgrade.go (1)

54-58: ⚠️ Potential issue | 🔴 Critical

Avoid o.Expect-based helper calls inside Skip()

Line 54 calls isNoOLMFeatureGateEnabled(oc), which uses o.Expect internally and can panic. In this pre-Setup() skip path, that can still produce unrecoverable panics and defeats the upgrades.Skippable refactor.

Proposed fix
-	// Skip on clusters missing OLM/Marketplace capabilities if not using NO-OLM mode
-	if !isNoOLMFeatureGateEnabled(oc) {
+	// Skip on clusters missing OLM/Marketplace capabilities if not using NO-OLM mode
+	noOLMEnabled, err := gatewayAPIWithoutOLMEnabled(oc)
+	if err != nil {
+		g.By(fmt.Sprintf("skipping test: failed to read FeatureGates: %v", err))
+		return true
+	}
+	if !noOLMEnabled {
 		enabled, err := exutil.AllCapabilitiesEnabled(oc, olmCapabilities...)
 		if err != nil || !enabled {
 			g.By("skipping test: OLM/Marketplace capabilities are not enabled and GatewayAPIWithoutOLM is not enabled")
 			return true
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/gatewayapi_upgrade.go` around lines 54 - 58, The skip
path calls isNoOLMFeatureGateEnabled(oc) which uses o.Expect and can panic
before Setup; replace that call with a non-panicking check (either introduce a
new helper like isNoOLMFeatureGateEnabledNoExpect(oc) that returns (bool, error)
or inline the feature-gate query without o.Expect), handle any error by treating
the gate as disabled, and then use that boolean in the existing condition
alongside exutil.AllCapabilitiesEnabled; ensure all o.Expect usage remains in
Setup or later so Skip() cannot cause a panic.
test/extended/router/gatewayapicontroller.go (1)

583-584: ⚠️ Potential issue | 🔴 Critical

shouldSkipGatewayAPITests still has a transitive panic path

The function is documented as panic-safe, but Line 583 calls isIPv6OrDualStack(oc), and that helper uses o.Expect (Line 622). This can still panic instead of returning a skip reason.

Proposed fix
-	if isIPv6OrDualStack(oc) {
+	ipv6OrDualStack, err := isIPv6OrDualStack(oc)
+	if err != nil {
+		return true, fmt.Sprintf("Failed to determine IP stack: %v", err)
+	}
+	if ipv6OrDualStack {
 		return true, "Skipping Gateway API tests on IPv6/dual-stack cluster"
 	}
-func isIPv6OrDualStack(oc *exutil.CLI) bool {
+func isIPv6OrDualStack(oc *exutil.CLI) (bool, error) {
 	networkConfig, err := oc.AdminOperatorClient().OperatorV1().Networks().Get(context.Background(), "cluster", metav1.GetOptions{})
-	o.Expect(err).NotTo(o.HaveOccurred(), "Failed to get network config")
+	if err != nil {
+		return false, err
+	}
 
 	for _, cidr := range networkConfig.Spec.ServiceNetwork {
 		if utilnet.IsIPv6CIDRString(cidr) {
-			return true
+			return true, nil
 		}
 	}
-	return false
+	return false, nil
 }
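
As a standalone illustration of the (bool, error) shape proposed above, the sketch below checks a list of service-network CIDRs using only the Go standard library. `hasIPv6CIDR` is a hypothetical helper, not code from this PR. It uses `net.ParseCIDR` instead of `utilnet.IsIPv6CIDRString`, and, as an assumption of this sketch, it returns an error for an unparsable CIDR rather than treating it as non-IPv6.

```go
package main

import (
	"fmt"
	"net"
)

// hasIPv6CIDR reports whether any CIDR in the list is an IPv6 network.
// It returns an error instead of panicking, so it is safe to call from
// code that runs outside a Ginkgo leaf node.
func hasIPv6CIDR(cidrs []string) (bool, error) {
	for _, cidr := range cidrs {
		ip, _, err := net.ParseCIDR(cidr)
		if err != nil {
			return false, fmt.Errorf("parsing CIDR %q: %w", cidr, err)
		}
		// To4 returns nil for addresses that are not IPv4.
		if ip.To4() == nil {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	for _, networks := range [][]string{
		{"172.30.0.0/16"},
		{"172.30.0.0/16", "fd02::/112"},
		{"not-a-cidr"},
	} {
		ipv6, err := hasIPv6CIDR(networks)
		fmt.Printf("%v -> ipv6=%v err=%v\n", networks, ipv6, err)
	}
}
```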
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/router/gatewayapicontroller.go` around lines 583 - 584,
shouldSkipGatewayAPITests currently calls isIPv6OrDualStack(oc) which uses
o.Expect and can panic; change the helper to a panic-safe signature (e.g.,
isIPv6OrDualStackSafe/ isIPv6OrDualStack returning (bool, error)) that does not
call o.Expect but returns an error on failure, then update
shouldSkipGatewayAPITests to call the new safe helper, handle the error path by
returning a non-skip or a clear skip reason without panicking, and remove any
direct uses of o.Expect inside isIPv6OrDualStack so tests remain panic-safe;
refer to the functions shouldSkipGatewayAPITests and isIPv6OrDualStack and the
use of o.Expect when locating code to modify.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 85458f16-0cb3-47cb-9f62-6d36fc11beb4

📥 Commits

Reviewing files that changed from the base of the PR and between 090b384 and 853bb58.

📒 Files selected for processing (2)
  • test/extended/router/gatewayapi_upgrade.go
  • test/extended/router/gatewayapicontroller.go

@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch 4 times, most recently from f580b21 to 6a7a94e Compare April 12, 2026 21:09
@openshift-ci-robot

@gcs278: This pull request references Jira Issue OCPBUGS-82586, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

@gcs278
Contributor Author

gcs278 commented Apr 12, 2026

Most importantly, I should have kicked off "periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6" in #30897 and I missed that:

/payload-job periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-vsphere-ovn-upgrade periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-ovn-upgrade periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6

@openshift-ci
Contributor

openshift-ci bot commented Apr 12, 2026

@gcs278: trigger 4 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.20-e2e-aws-ovn-upgrade-paused
  • periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-vsphere-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5cc518b0-36b4-11f1-833a-a696c774475a-0

@gcs278
Contributor Author

gcs278 commented Apr 12, 2026

@CodeRabbit does everything look good?

@coderabbitai

coderabbitai bot commented Apr 12, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh pr diff 31000 --repo openshift/origin

Length of output: 7475


@gcs278 The changes look correct and well-structured overall. Here's a brief summary of what I checked:

✅ Core fix is sound: The Skip() method correctly implements the upgrades.Skippable interface and is called by the disruption framework before Setup() runs in its goroutine. This properly avoids the g.Skip() panic.

shouldSkipGatewayAPITests is safe: No g.Skip() or o.Expect() calls — pure error returns, safe to invoke outside Ginkgo leaf nodes.

getPlatformCapabilities in Setup() is safe in practice: It uses o.Expect(), but by the time Setup() is reached, Skip() has already validated that the infrastructure is accessible and PlatformStatus is non-nil. The only theoretical risk is a TOCTOU race, which is negligible.

gatewayapicontroller.go BeforeEach is correct: shouldSkipGatewayAPITests + g.Skip(reason) + o.Expect(err).NotTo(...) is a clean and idiomatic pattern inside a Ginkgo leaf node.

A few minor observations (no blockers):

  1. isNoOLMFeatureGateEnabled called twice — once in Skip() and once in Setup(). Minor inefficiency.
  2. OLM capability error → skip in Skip(): If AllCapabilitiesEnabled returns an error, the test is silently skipped rather than failing. This is conservative but could mask infrastructure issues; logging the error before skipping would help observability (similar to how shouldSkipGatewayAPITests errors are handled with e2e.Logf).
  3. g.By() in Skip(): Harmless, but By is typically meant for step annotation inside test bodies. A plain e2e.Logf might be more appropriate here since Skip() is not a Ginkgo node.

None of these are correctness issues — the PR achieves its goal of preventing the goroutine panic on IPv6/dual-stack, OKD, and unsupported platforms. Looks good to merge. 🐇

@gcs278 gcs278 force-pushed the fix-gwapi-upgrade-ipv6-skip branch from 6a7a94e to d44a553 Compare April 12, 2026 21:17
@gcs278
Contributor Author

gcs278 commented Apr 12, 2026

/payload-abort

@openshift-ci
Contributor

openshift-ci bot commented Apr 12, 2026

@gcs278: aborted 4 active payload job(s) for pull request #31000

@gcs278
Contributor Author

gcs278 commented Apr 12, 2026

I hate to kick off so many jobs, but I don't want to miss any jobs that could be failing:
/payload 4.22 nightly informing

@openshift-ci
Contributor

openshift-ci bot commented Apr 12, 2026

@gcs278: trigger 66 job(s) of type informing for the nightly release of OCP 4.22

  • periodic-ci-openshift-hypershift-release-4.22-periodics-e2e-azure-aks-ovn-conformance
  • periodic-ci-openshift-release-main-nightly-4.22-console-aws
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.22-periodics-e2e-aws
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-csi
  • periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-cgroupsv2
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-fips
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-single-node
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-single-node-csi
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-single-node-serial
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-single-node-techpreview
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-single-node-techpreview-serial
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips-no-nat-instance
  • periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-upgrade-out-of-change
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upi
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.22-periodics-e2e-azure
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-azure-csi
  • periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn
  • periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn-serial
  • periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn-techpreview
  • periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn-techpreview-serial
  • periodic-ci-openshift-release-main-ci-4.22-e2e-azure-ovn-upgrade-out-of-change
  • periodic-ci-openshift-release-main-cnv-nightly-4.22-deploy-azure-kubevirt-ovn
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.22-periodics-e2e-gcp
  • periodic-ci-openshift-release-main-ci-4.22-e2e-gcp-ovn
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-gcp-ovn-csi
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-gcp-ovn-rt
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-gcp-ovn-serial
  • periodic-ci-openshift-release-main-ci-4.22-e2e-gcp-ovn-techpreview
  • periodic-ci-openshift-release-main-ci-4.22-e2e-gcp-ovn-techpreview-serial
  • periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-main-ci-4.22-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-hypershift-release-4.22-periodics-e2e-azure-kubevirt-ovn
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-dualstack
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-dualstack-techpreview
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-ipv6-techpreview
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-serial-ipv4
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-serial-virtualmedia-1of2
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-serial-virtualmedia-2of2
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-techpreview
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-serial-ovn-ipv6
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-serial-ovn-dualstack
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-main-nightly-4.22-upgrade-from-stable-4.21-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-main-nightly-4.22-metal-ovn-single-node-recert-cluster-rename
  • periodic-ci-openshift-osde2e-main-nightly-4.22-osd-aws
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-osd-ccs-gcp
  • periodic-ci-openshift-osde2e-main-nightly-4.22-osd-gcp
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-proxy
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-single-node-live-iso
  • periodic-ci-openshift-eng-ocp-qe-perfscale-ci-main-aws-4.22-nightly-x86-payload-control-plane-6nodes
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-telco5g
  • periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-vsphere-ovn
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-vsphere-ovn-csi
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-vsphere-ovn-serial
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-vsphere-ovn-techpreview
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-vsphere-ovn-techpreview-serial
  • periodic-ci-openshift-release-main-ci-4.22-e2e-vsphere-ovn-upgrade
  • periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-vsphere-ovn-upgrade
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-vsphere-ovn-upi
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-vsphere-ovn-upi-serial
  • periodic-ci-openshift-release-main-nightly-4.22-e2e-vsphere-static-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/284f6b70-36b5-11f1-80e0-7384393647eb-0

@openshift-ci-robot

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

@gcs278
Contributor Author

gcs278 commented Apr 13, 2026

cluster bootstrap
/e2e-aws-ovn-serial-1of2

@gcs278
Contributor Author

gcs278 commented Apr 13, 2026

/test e2e-aws-ovn-serial-1of2

@gcs278
Contributor Author

gcs278 commented Apr 13, 2026

Unrelated flakes. I believe branching happened, so I suspect these are 5.0-specific flakes/failures.
/retest

@rhamini3
Contributor

/lgtm
Waiting for the metal test to finish running; I will then add the verified label.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 13, 2026
@openshift-ci
Contributor

openshift-ci bot commented Apr 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gcs278, rhamini3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gcs278
Contributor Author

gcs278 commented Apr 14, 2026

Should it be targeting 5.0 now?
/jira refresh

@openshift-ci-robot

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

@gcs278
Contributor Author

gcs278 commented Apr 14, 2026

@CodeRabbit analyze the failures in the failed e2e jobs, and report the commonality between them. Ensure none of the failures are gateway api related.

@coderabbitai

coderabbitai bot commented Apr 14, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the check runs for PR `#31000` in openshift/origin
gh pr checks 31000 --repo openshift/origin 2>&1 | head -100

Length of output: 5126


🏁 Script executed:

#!/bin/bash
# Fetch finished.json for each failed job to get failure summary
BASE="https://storage.googleapis.com/test-platform-results/pr-logs/pull/31000"

for job in \
  "pull-ci-openshift-origin-main-e2e-aws-ovn-fips/2043767477074137088" \
  "pull-ci-openshift-origin-main-e2e-gcp-ovn/2043767477124468736" \
  "pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-ipv6/2043767477174800384" \
  "pull-ci-openshift-origin-main-e2e-vsphere-ovn/2043767477212549120" \
  "pull-ci-openshift-origin-main-e2e-vsphere-ovn-upi/2043767477246103552"
do
  echo "=== $job ==="
  curl -sf "$BASE/$job/finished.json" 2>/dev/null | python3 -m json.tool 2>/dev/null || echo "No finished.json"
  echo ""
done

Length of output: 3317


🏁 Script executed:

#!/bin/bash
# Try to list and fetch junit results from each failed job
BASE="https://storage.googleapis.com/test-platform-results/pr-logs/pull/31000"

for job in \
  "pull-ci-openshift-origin-main-e2e-aws-ovn-fips/2043767477074137088" \
  "pull-ci-openshift-origin-main-e2e-gcp-ovn/2043767477124468736" \
  "pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-ipv6/2043767477174800384" \
  "pull-ci-openshift-origin-main-e2e-vsphere-ovn/2043767477212549120" \
  "pull-ci-openshift-origin-main-e2e-vsphere-ovn-upi/2043767477246103552"
do
  echo "=== $job ==="
  # Try to list artifacts
  curl -sf "https://storage.googleapis.com/storage/v1/b/test-platform-results/o?prefix=pr-logs/pull/31000/$job/artifacts/&delimiter=/&maxResults=20" 2>/dev/null | python3 -c "
import json,sys
data=json.load(sys.stdin)
for p in data.get('prefixes',[]): print('DIR:', p)
for i in data.get('items',[]): print('FILE:', i.get('name',''))
" 2>/dev/null || echo "Failed to list artifacts"
  echo ""
done

Length of output: 5659


🏁 Script executed:

#!/bin/bash
# Fetch junit_operator.xml for each failed job
BASE="https://storage.googleapis.com/test-platform-results/pr-logs/pull/31000"

for job in \
  "pull-ci-openshift-origin-main-e2e-aws-ovn-fips/2043767477074137088" \
  "pull-ci-openshift-origin-main-e2e-gcp-ovn/2043767477124468736" \
  "pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-ipv6/2043767477174800384" \
  "pull-ci-openshift-origin-main-e2e-vsphere-ovn/2043767477212549120" \
  "pull-ci-openshift-origin-main-e2e-vsphere-ovn-upi/2043767477246103552"
do
  echo "=== $job ==="
  curl -sf "$BASE/$job/artifacts/junit_operator.xml" 2>/dev/null | python3 -c "
import sys
import xml.etree.ElementTree as ET

content = sys.stdin.read()
if not content.strip():
    print('Empty or no content')
    sys.exit(0)

root = ET.fromstring(content)
# Find all failed/error testcases
for ts in root.iter('testsuite'):
    for tc in ts.iter('testcase'):
        failure = tc.find('failure')
        error = tc.find('error')
        if failure is not None or error is not None:
            name = tc.get('name','?')
            classname = tc.get('classname','?')
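            # Elements without children are falsy, so pick the node explicitly instead of using "or"
            node = failure if failure is not None else error
            msg = node.get('message', node.text or '')[:500]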
            print(f'FAIL: {classname} / {name}')
            print(f'  MSG: {msg[:300]}')
            print()
" 2>/dev/null || echo "Failed to parse junit XML"
  echo ""
done

Length of output: 512


🏁 Script executed:

#!/bin/bash
BASE="https://storage.googleapis.com/test-platform-results/pr-logs/pull/31000"

for job in \
  "pull-ci-openshift-origin-main-e2e-aws-ovn-fips/2043767477074137088" \
  "pull-ci-openshift-origin-main-e2e-gcp-ovn/2043767477124468736" \
  "pull-ci-openshift-origin-main-e2e-metal-ipi-ovn-ipv6/2043767477174800384" \
  "pull-ci-openshift-origin-main-e2e-vsphere-ovn/2043767477212549120" \
  "pull-ci-openshift-origin-main-e2e-vsphere-ovn-upi/2043767477246103552"
do
  jobname=$(echo $job | cut -d'/' -f1 | sed 's/pull-ci-openshift-origin-main-//')
  echo "=== $jobname ==="
  # Try fetching ci-operator.log and look for error/failure lines
  curl -sf "$BASE/$job/artifacts/ci-operator.log" 2>/dev/null | grep -E "(error|fail|FAIL|Error|step failed|Entrypoint received interrupt)" | tail -30
  echo ""
done

Length of output: 320034


@gcs278 Here’s a quick triage of the failed e2e jobs and what they have in common, plus a Gateway API sanity check.

Commonalities across all failed jobs

  • Same PR revision and namespace: All five failures ran from the same ci-op namespace (ci-op-9mtg7fyh) against BaseSHA 1a93dad and PR rev 8ef51c3.
  • Same payload family: Jobs are exercising a 5.0.0-0.ci payload (per ci-operator.log in e2e-aws-ovn-fips), consistent with your note about 5.0-era flakes.
  • Same artifact layout: Each has ci-operator.log, junit_operator.xml, and a job-specific e2e artifacts directory.

Gateway API impact check

  • I scanned the ci-operator.log for e2e-aws-ovn-fips (and enumerated artifacts for the other four). The logs show the usual test binary setup and global flagging that includes gateway.networking.k8s.io as an API group, but there are no indications that failures were caused by Gateway API tests or resources.
  • Notably:
    • No panics or skips referencing gatewayapicontroller, GatewayClass, Gateway, HTTPRoute, TCPRoute, UDPRoute, or gateway.networking.k8s.io in the failure snippets we pulled.
    • The metal IPv6 job is a plain e2e job (not the upgrade disruption test you fixed). Your change only alters the upgrade test’s skip path; non-upgrade jobs wouldn’t hit that logic.

What likely happened

  • The failures look unrelated to the Gateway API upgrade change. They appear to be general e2e flakes/regressions on the 5.0 payloads during this PR run (your earlier observation). The aws-ovn-fips log shows the cluster coming up cleanly and then launching openshift-tests; no gateway-related stack traces or fatal errors are present in the captured output.

If you want, I can deep-scan each job’s e2e.log tail and junit files specifically for:

  • The names/messages of failing tests, and
  • Any occurrences of gateway.networking.k8s.io, GatewayClass, HTTPRoute, TCPRoute, UDPRoute, or gatewayapicontroller in failure contexts.

Say the word and I’ll pull those exact snippets per job.

@gcs278
Contributor Author

gcs278 commented Apr 14, 2026

Does it still want me to target 4.22??
/refresh jira

@gcs278
Contributor Author

gcs278 commented Apr 14, 2026

/jira refresh

@openshift-ci-robot

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

@gcs278
Contributor Author

gcs278 commented Apr 14, 2026

I'm going to set the bug as 5.0.0 because that's where it will have to be merged first:
/jira refresh

@openshift-ci-robot added the jira/invalid-bug label (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) and removed the jira/valid-bug label (indicates that a referenced Jira bug is valid for the branch this PR is targeting) on Apr 14, 2026
@openshift-ci-robot

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is invalid:

  • expected the bug to target either version "4.22." or "openshift-4.22.", but it targets "5.0.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.
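
The rejection above reads like a prefix check on the bug's target version. As a rough illustration only, here is a Go sketch of such a check; targetMatches is a hypothetical helper and is not taken from the jira-lifecycle-plugin source, whose actual logic may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// targetMatches reports whether a bug's target version belongs to the
// branch's release line, accepting both "X.Y." and "openshift-X.Y." prefixes.
// Hypothetical helper for illustration, not the plugin's implementation.
func targetMatches(bugTarget, branchLine string) bool {
	return strings.HasPrefix(bugTarget, branchLine+".") ||
		strings.HasPrefix(bugTarget, "openshift-"+branchLine+".")
}

func main() {
	fmt.Println(targetMatches("4.22.0", "4.22")) // true
	fmt.Println(targetMatches("5.0.0", "4.22"))  // false
	fmt.Println(targetMatches("5.0.0", "5.0"))   // true
}
```

Under this reading, a bug retargeted to 5.0.0 stays invalid for as long as the branch is configured for the 4.22 line, and becomes valid once the configured target for the branch is 5.0.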


@rhamini3
Contributor

Marking as verified: on the IPv6 cluster, the run now skips all Gateway API tests instead of panicking and failing.

/verified by @rhamini3

@openshift-ci-robot added the verified label (signifies that the PR passed pre-merge verification criteria) on Apr 14, 2026
@openshift-ci-robot

@rhamini3: This PR has been marked as verified by @rhamini3.


@gcs278
Contributor Author

gcs278 commented Apr 15, 2026

/jira refresh

@openshift-ci-robot added the jira/valid-bug label (indicates that a referenced Jira bug is valid for the branch this PR is targeting) on Apr 15, 2026
@openshift-ci-robot

@gcs278: This pull request references Jira Issue OCPBUGS-83267, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

@openshift-ci-robot removed the jira/invalid-bug label (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) on Apr 15, 2026
@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 9df27cd and 2 for PR HEAD 8ef51c3 in total

@openshift-ci
Contributor

openshift-ci bot commented Apr 15, 2026

@gcs278: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-fips | 8ef51c3 | link | true | /test e2e-aws-ovn-fips |
| ci/prow/e2e-gcp-ovn | 8ef51c3 | link | true | /test e2e-gcp-ovn |
| ci/prow/e2e-vsphere-ovn | 8ef51c3 | link | true | /test e2e-vsphere-ovn |
| ci/prow/e2e-vsphere-ovn-upi | 8ef51c3 | link | true | /test e2e-vsphere-ovn-upi |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | 8ef51c3 | link | true | /test e2e-metal-ipi-ovn-ipv6 |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@neisw
Contributor

neisw commented Apr 15, 2026

/retest-required

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria

4 participants