Skip to content

feat: support port-based routing for Gateway API routes#2703

Open
johannes-engler-mw wants to merge 20 commits intoapache:masterfrom
johannes-engler-mw:feat/2639
Open

feat: support port-based routing for Gateway API routes#2703
johannes-engler-mw wants to merge 20 commits intoapache:masterfrom
johannes-engler-mw:feat/2639

Conversation

@johannes-engler-mw
Copy link
Copy Markdown

Type of change:

  • Bugfix
  • New feature provided
  • Improve performance
  • Backport patches
  • Documentation
  • Refactor
  • Chore
  • CI/CD or Tests

What this PR does / why we need it:

This change implements support for port-based routing in HTTPRoute and GRPCRoute by leveraging the 'server_port' variable in APISIX. When a route targets a specific Gateway listener (via 'sectionName' or matching), the translator now adds a condition to ensure the route only matches traffic on that specific port.

Key changes:

  • Update translator to add 'server_port' matching variables to routes.
  • Enhance controller to track all matching listeners for a route.
  • Update E2E test framework to support multi-port APISIX and in-cluster testing.
  • Add comprehensive E2E tests for port-based routing scenarios.

It fixes #2639.

Pre-submission checklist:

  • Did you explain what problem does this PR solve? Or what new features have been added?
  • Have you added corresponding test cases?
  • Have you modified the corresponding document?
  • Is this PR backward compatible?

johannes-engler-mw and others added 5 commits January 16, 2026 09:11
This change implements support for port-based routing in HTTPRoute and GRPCRoute
by leveraging the 'server_port' variable in APISIX. When a route targets a
specific Gateway listener (via 'sectionName' or matching), the translator now
adds a condition to ensure the route only matches traffic on that specific port.

Key changes:
- Update translator to add 'server_port' matching variables to routes.
- Enhance controller to track all matching listeners for a route.
- Update E2E test framework to support multi-port APISIX and in-cluster testing.
- Add comprehensive E2E tests for port-based routing scenarios.
Reflect the new port-based routing capability in Gateway API documentation.
- Updated concepts/gateway-api.md to mark listener port as partially supported.
- Updated reference/example.md to explain that port is used for matching.
This fixes a linting error flagged by errcheck where the return value of
fmt.Sscanf was being ignored.
This ensures the alpine/curl image is present locally before attempting to
load it into the Kind cluster, preventing build failures in CI.
- Add comprehensive unit tests for addServerPortVars function
- Fix two-port test to use expectedPortList + ElementsMatch for non-deterministic map ordering
- Consolidate redundant single-port tests (removed HTTP/HTTPS duplicates)
- Fix image tag mismatch: alpine/curl:flattened -> alpine/curl:latest
- Fix comment typo: port 9100 -> port 9081

Co-Authored-By: Claude <noreply@anthropic.com>
@johannes-engler-mw
Copy link
Copy Markdown
Author

Hi there,

While reviewing the feat/2639 changes, I noticed a bug in the status propagation logic that affects all route controllers.

Issue

The acceptStatus variable is shared across all parent gateways in the reconciliation loop. If one gateway fails processing, all parent references are marked as "Not Accepted", which violates the Gateway API specification that requires per-parent status tracking.

Example from httproute_controller.go:163-184:

acceptStatus := ResourceStatus{
    status: true,
    msg:    "Route is accepted",
}
// ...
for _, gateway := range gateways {
    if err := ProcessGatewayProxy(...); err != nil {
        acceptStatus.status = false  // Overwrites status for ALL parents
        acceptStatus.msg = err.Error()
    }
}

Affected Controllers

This pattern exists in all 5 route controllers:

  • httproute_controller.go
  • grpcroute_controller.go
  • tcproute_controller.go
  • tlsroute_controller.go
  • udproute_controller.go

Impact

  • HTTPRoute (highest impact): The feat/2639 port-based routing feature encourages multi-listener/multi-gateway topologies, making partial failures more likely
  • Other route types: Lower impact in practice, but still a spec violation

Proposed Fix

Track status per-gateway using a map:

gatewayStatuses := make(map[string]ResourceStatus)
for _, gateway := range gateways {
    if err := ProcessGatewayProxy(...); err != nil {
        gatewayStatuses[gateway.Gateway.Name] = ResourceStatus{status: false, msg: err.Error()}
    } else {
        gatewayStatuses[gateway.Gateway.Name] = ResourceStatus{status: true, msg: "Route is accepted"}
    }
}
// Use per-gateway status when setting parent status conditions

Question

How would you like to proceed?

  1. Fix only HTTPRoute as part of feat/2639 (since it's most impacted by the new feature)
  2. Fix all route controllers in a separate PR
  3. Fix all route controllers as part of feat/2639

Happy to submit a fix either way.

@johannes-engler-mw
Copy link
Copy Markdown
Author

Hi again,

I just realized this PR probably adds a breaking change:

The Change
In internal/controller/utils.go, I updated ParseRouteParentRefs to align with the Gateway API Specification (https://gateway-api.sigs.k8s.io/api-types/httproute/#attaching-to-gateways). Specifically, if a Route's ParentRef does
not specify a sectionName, the spec requires it to attach to all compatible listeners.

My implementation now does exactly this:

1 // internal/controller/utils.go
2
3 // Always add to the list of matched listeners
4 matchedListeners = append(matchedListeners, listener)
5 matched = true
6
7 // Only break if sectionName was explicitly specified
8 if sectionNameSpecified {
9     break
10 }

The Risk
Previously, the controller would stop after the first match. This change means that if a user has a Gateway with multiple listeners (e.g., public-80 and internal-8080) and a Route without a sectionName, that Route will now
automatically be exposed on both listeners.

This is technically "correct" per the spec, but for users relying on the old "first match only" behavior, this could inadvertently expose internal services to public ports upon upgrade.

Proposal
I believe we should address this to avoid surprised users. I see two main options:

  1. Feature Flag: I can add a flag like enable_legacy_listener_matching (defaulting to false or true depending on our safety preference) to let users opt-in to the old behavior.
  2. Documentation/Release Note: We explicitly document this as a breaking change required for Gateway API conformance.

I'm leaning towards option 1 for safety, or at least a very prominent warning. Let me know your preference and I can update the PR accordingly.

@johannes-engler-mw
Copy link
Copy Markdown
Author

Anyone got time to review my PR and answer my questions?

@johannes-engler-mw
Copy link
Copy Markdown
Author

Any update?

@johannes-engler-mw
Copy link
Copy Markdown
Author

@ronething @AlinsRan Can someone of you please have a look?

Copy link
Copy Markdown
Contributor

@Baoyuantop Baoyuantop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. addServerPortVars should only be injected when a route explicitly specifies a sectionName (or has multiple listeners with different ports), not for all routes.
  2. GRPCRoute translator lacks the addServerPortVars call, resulting in incomplete functionality.

- inject server_port vars only for explicit sectionName or multi-port listeners

- add server_port parity for GRPCRoute translation

- use matched listeners for HTTPRoute hostname filtering and avoid zero-value listener fallback

- add translator/controller unit tests and extend HTTPRoute/GRPCRoute e2e coverage
@johannes-engler-mw
Copy link
Copy Markdown
Author

@Baoyuantop Thanks for the feedback, I have added your suggestions.

@AlinsRan
Copy link
Copy Markdown
Contributor

From a Gateway API perspective, Gateway.spec.listeners[].port represents the infrastructure-facing listener configuration of a Gateway instance. Conceptually, it describes which ports the Gateway exposes externally, rather than acting as a route-level filtering condition.

Mapping listener.port directly to APISIX server_port as a route match condition is technically feasible and pragmatic, especially in the current single-instance deployment model. However, this approach effectively interprets the listener port as a data plane route constraint, which slightly shifts the abstraction boundary:

  • Gateway API intent: infrastructure declaration (listener capability)
  • Current implementation: route-level match constraint

This works well in a shared-instance model (one APISIX instance serving multiple listeners), but may introduce constraints if we later want to:

  • Instantiate multiple Gateway instances mapped to independent APISIX deployments
  • Dynamically manage listener ports at the infrastructure level
  • Decouple Gateway lifecycle from route-level matching logic

To balance practicality and future extensibility, I suggest making this behavior configurable (e.g., via a feature gate or controller configuration option).

For example:

  • enableListenerPortMatch: true → map listener.port to server_port match condition (current PR behavior)
  • enableListenerPortMatch: false → do not inject server_port match; treat listener.port purely as infrastructure-level metadata

This would allow:

  1. Backward-compatible and pragmatic default behavior
  2. Flexibility for future multi-instance or infra-driven Gateway implementations
  3. Clear separation between infrastructure semantics and route matching semantics

@johannes-engler-mw
Copy link
Copy Markdown
Author

Thanks @AlinsRan for the feedback on making this behavior configurable. I've evaluated two approaches and would like to propose a mode enum configuration rather than a simple boolean toggle.

Proposed: listener_port_match_mode (enum: auto | explicit | off)

Instead of a binary on/off, a mode enum provides three distinct behaviors:

Mode Behavior
auto (default) Inject server_port vars when route explicitly targets a listener (sectionName or port) or when multiple listener ports are matched. This is the current behavior and maintains backward compatibility.
explicit Inject server_port vars only when route explicitly targets a listener. No heuristic injection for multiple ports.
off Never inject server_port vars. Listener ports are treated as infrastructure-level metadata only.

Why enum over boolean?

  1. Gateway API semantics: The spec distinguishes between implicit attachment (route attaches to gateway, all listeners) and explicit targeting
    (route specifies sectionName or port). A boolean conflates these — explicit mode preserves this distinction while off provides the pure
    infrastructure semantics.

  2. Avoids future breaking changes: If we ship a boolean and later need the explicit behavior, we'd have to break the config schema or add a
    second field.

  3. Minimal additional complexity: The difference is essentially a switch statement, a type definition, and a validation check. The config
    wiring is nearly identical to a boolean approach.

Behavior matrix

Scenario auto explicit off
Explicit sectionName targeting inject inject no injection (warning)
Explicit port targeting inject inject no injection (warning)
Multiple listener ports, no explicit targeting inject no injection no injection
Single listener port, no explicit targeting no injection no injection no injection

Config example

listener_port_match_mode: auto  # Options: auto, explicit, off. Default: auto.

The default auto maintains full backward compatibility. Users who need infrastructure-only semantics can set off, and those who want only explicit intent-based injection can use explicit.

Are you comfortable with this approach, or would you prefer the simpler boolean toggle (enable_listener_port_match: true/false)?

BR,
Johannes

…tRefs

When an HTTPRoute/GRPCRoute has multiple parentRefs pointing at the same
Gateway (e.g. two sectionNames, or one with and one without), the listeners
collected from each RouteParentRefContext can overlap. The controllers were
blindly appending them all, producing duplicate entries in tctx.Listeners and
consequently duplicate hostnames in GRPC route translation.

Add appendUniqueListeners helper that deduplicates by listener Name (unique
within a Gateway per Gateway API spec) and use it in both httproute_controller
and grpcroute_controller instead of plain append.
@Baoyuantop Baoyuantop requested a review from AlinsRan February 24, 2026 06:53
@AlinsRan
Copy link
Copy Markdown
Contributor

谢谢@AlinsRan感谢您就如何使此行为可配置提出的反馈意见。我评估了两种方案,并建议采用模式枚举配置,而不是简单的布尔切换。

建议:(listener_port_match_mode枚举:auto| explicit| off

模式枚举不是简单的二元开/关,而是提供三种不同的行为:

模式 行为
auto(默认) server_port当路由显式指向某个监听器(sectionName``or port匹配到多个监听器端口时,注入变量。这是当前的行为,并保持了向后兼容性。
explicit 当路由明确指向监听器时才注入server_port变量。不支持多端口启发式注入。****
off 切勿注入server_port变量。监听端口仅作为基础设施级别的元数据处理。

为什么选择枚举类型而不是布尔类型?

  1. 网关 API 语义:规范区分了隐式连接(路由连接到网关,所有监听器)和显式目标
    (路由指定目标sectionName或监听器port)。布尔值可以混淆这两种explicit方式——mode 保留了这种区别,而 trueoff提供了纯粹的
    基础架构语义。
  2. 避免将来出现破坏性变更:如果我们发布了一个布尔值,以后又需要该explicit行为,我们就必须破坏配置模式或添加
    第二个字段。
  3. 增加的复杂度极小:区别主要在于一条switch语句、一个类型定义和一个验证检查。配置
    方式与布尔方法几乎完全相同。

行为矩阵

设想 auto explicit off
明确sectionName目标 注入 注入 禁止注射(警告)
明确port目标 注入 注入 禁止注射(警告)
多个监听端口,无明确目标 注入 无需注射 无需注射
单个监听端口,无显式目标 无需注射 无需注射 无需注射

配置示例

listener_port_match_mode: auto  # Options: auto, explicit, off. Default: auto.

默认的自动模式保持完全向后兼容性。只需要基础架构语义的用户可以取消设置,而只需要显式的基于意图的注入的用户可以启用显式模式。

您觉得这种方法可以接受吗?还是更喜欢简单的布尔切换(enable_listener_port_match: true/false)?

BR, 约翰内斯

Very good.

Extract backend-processing block into a new helper method
translateBackendsToUpstreams to bring TranslateHTTPRoute's cyclomatic
complexity from 31 down to 9, fixing the gocyclo lint failure.
Introduces a configurable listener_port_match_mode setting (auto/explicit/off)
that controls when server_port route vars are injected from Gateway listener
ports, allowing operators to tune port-based routing behaviour without
redeploying.

- auto (default): inject when parentRefs explicitly target a listener via
  sectionName or port, or when multiple listener ports need disambiguation;
  preserves existing behaviour
- explicit: inject only on explicit listener targeting (sectionName/port)
- off: never inject server_port vars

Also fixes hasExplicitListenerTarget to skip non-Gateway parentRefs (e.g.
GAMMA Service mesh refs), preventing unintended server_port injection in
mixed Gateway + Service parentRef routes.
@johannes-engler-mw
Copy link
Copy Markdown
Author

Hi @AlinsRan, i have implemented the feature as discussed. Please review the implementation.

@Baoyuantop
Copy link
Copy Markdown
Contributor

Please resolve the code conflict.

Keep the refactored translateBackendsToUpstreams call from feat/2639
instead of the inline backend-processing block from origin/master.
@johannes-engler-mw
Copy link
Copy Markdown
Author

@Baoyuantop Done!

@johannes-engler-mw
Copy link
Copy Markdown
Author

I have set a higher timeout of 30s for the failing e2e test. Please rerun the workflow.

@johannes-engler-mw
Copy link
Copy Markdown
Author

I have fixed the linting issue, please rerun the workflow.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements port-based routing for Gateway API HTTPRoute/GRPCRoute by injecting APISIX server_port route vars based on matched Gateway listener ports, with a new controller configuration switch to control injection behavior.

Changes:

  • Add listener_port_match_mode config (auto / explicit / off) and propagate it through manager → provider → translator.
  • Track all matched Gateway listeners per ParentRef and inject server_port vars into generated APISIX routes where needed.
  • Expand E2E scaffolding/manifests and add E2E + unit tests covering multi-listener port routing scenarios.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/e2e/scaffold/scaffold.go Adds a helper to create httpexpect clients against arbitrary APISIX service ports.
test/e2e/scaffold/k8s.go Adds in-cluster curl runner for testing server_port behavior without port-forward artifacts.
test/e2e/scaffold/grpc.go Adds gRPC request helper that can target specific service ports via port-forward tunnel.
test/e2e/gatewayapi/httproute.go Adds comprehensive E2E coverage for HTTPRoute sectionName/port-based listener targeting across multiple ports.
test/e2e/gatewayapi/grpcroute.go Adds E2E coverage for GRPCRoute sectionName/port-based listener targeting across multiple ports.
test/e2e/framework/manifests/ingress.yaml Makes ADC exec timeout configurable via E2E_EXEC_ADC_TIMEOUT.
test/e2e/framework/manifests/apisix.yaml Configures APISIX to listen on multiple HTTP ports and exposes the extra service port for E2E.
internal/provider/options.go Adds provider option for ListenerPortMatchMode.
internal/provider/apisix/provider.go Passes port-match mode into translator initialization.
internal/manager/run.go Wires controller config ListenerPortMatchMode into provider options.
internal/controller/utils_hostname_test.go Adds unit tests for listener/hostname selection helpers used by Gateway API reconciliation.
internal/controller/utils.go Tracks multiple matched listeners per ParentRef and updates hostname helper logic accordingly.
internal/controller/httproute_controller.go Populates translation context with all matched listeners for HTTPRoute reconciliation.
internal/controller/grpcroute_controller.go Populates translation context with all matched listeners for GRPCRoute reconciliation.
internal/controller/context.go Extends RouteParentRefContext to carry multiple matched listeners.
internal/controller/config/types.go Introduces ListenerPortMatchMode type and constants.
internal/controller/config/config_test.go Adds validation/defaulting tests for listener_port_match_mode.
internal/controller/config/config.go Defaults/validates listener_port_match_mode.
internal/adc/translator/translator.go Adds mode normalization and decision logic for whether to inject server_port vars.
internal/adc/translator/httproute_test.go Adds unit tests verifying server_port injection behavior for HTTPRoute across modes.
internal/adc/translator/httproute.go Adds server_port vars injection and refactors backend→upstream translation into a helper.
internal/adc/translator/grpcroute_test.go Adds unit tests verifying server_port injection behavior for GRPCRoute across modes.
internal/adc/translator/grpcroute.go Adds server_port vars injection for GRPCRoute.
internal/adc/translator/annotations_test.go Updates translator initialization in tests and adds unit tests for server_port var helpers.
docs/en/latest/reference/example.md Updates docs to reflect listener port routing behavior and config knob.
docs/en/latest/reference/configuration-file.md Documents listener_port_match_mode.
docs/en/latest/concepts/gateway-api.md Updates Gateway API support matrix entry for listener port behavior.
config/samples/config.yaml Adds sample listener_port_match_mode configuration.
Makefile Ensures alpine/curl image is available for in-cluster curl E2E helper.
.github/workflows/benchmark-test.yml Increases ADC timeout in benchmark workflow via E2E_EXEC_ADC_TIMEOUT.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

BaseURL: u.String(),
Client: &http.Client{
Transport: &http.Transport{TLSClientConfig: &tls.Config{InsecureSkipVerify: true}},
Timeout: 3 * time.Second,
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewAPISIXClientForPort creates an httpexpect client without the redirect policy used by the other APISIX clients (they set CheckRedirect to http.ErrUseLastResponse). This means tests using this helper may silently follow redirects while other helpers don’t, leading to inconsistent assertions. Consider reusing the same http.Client settings (Transport + CheckRedirect) as NewAPISIXClient/NewAPISIXClientOnTCPPort for parity.

Suggested change
Timeout: 3 * time.Second,
Timeout: 3 * time.Second,
CheckRedirect: func(req *http.Request, via []*http.Request) error {
return http.ErrUseLastResponse
},

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented


t.AttachBackendTrafficPolicyToUpstream(backend.BackendRef, tctx.BackendTrafficPolicies, upstream)
upstream.Nodes = upNodes
upstream.Scheme = appProtocolToUpstreamScheme(protocol)
Copy link

Copilot AI Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In translateBackendsToUpstreams, the upstream scheme is being set unconditionally from the Service/AppProtocol (upstream.Scheme = appProtocolToUpstreamScheme(protocol)). This overwrites any scheme that may have been set by BackendTrafficPolicy (AttachBackendTrafficPolicyToUpstream sets upstream.Scheme), changing behavior compared to the previous logic which only filled scheme when it was empty. Preserve policy-configured scheme by only setting scheme when it’s not already set (or use cmp.Or/empty-check).

Suggested change
upstream.Scheme = appProtocolToUpstreamScheme(protocol)
if upstream.Scheme == "" {
upstream.Scheme = appProtocolToUpstreamScheme(protocol)
}

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1182 to +1196
// appendUniqueListeners appends listeners to the target slice, skipping any
// listener whose Name already exists in the target. Listener names are unique
// within a Gateway per the Gateway API specification.
func appendUniqueListeners(target []gatewayv1.Listener, source ...gatewayv1.Listener) []gatewayv1.Listener {
seen := make(map[gatewayv1.SectionName]struct{}, len(target))
for _, l := range target {
seen[l.Name] = struct{}{}
}
for _, l := range source {
if _, exists := seen[l.Name]; !exists {
seen[l.Name] = struct{}{}
target = append(target, l)
}
}
return target
Copy link

Copilot AI Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

appendUniqueListeners de-dupes solely by listener Name. Listener names are only guaranteed unique within a single Gateway, but a Route can reference multiple Gateways via multiple parentRefs; two different Gateways can both have a listener named "http" on different ports. In that case the current de-dupe will drop one listener and its port, causing incorrect server_port var injection (missing ports) and potentially wrong routing. Consider either (a) not de-duping listeners at all when building tctx.Listeners (since downstream logic already de-dupes ports via a map), or (b) de-dupe on a composite key that can’t collide across Gateways (e.g., gateway namespace/name + listener name, or at least listener name+port+protocol).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented

@Baoyuantop Baoyuantop requested a review from Copilot March 11, 2026 14:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 30 out of 30 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +329 to +331
// Create new tunnel for custom port
serviceName := s.dataplaneService.Name
tunnel := k8s.NewTunnel(s.kubectlOptions, k8s.ResourceTypeService, serviceName, 0, port)
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring says it 'uses existing tunnels if available, otherwise creates a new one', but custom ports always create a brand-new tunnel and there’s no caching/reuse on subsequent calls. Either adjust the comment to reflect the actual behavior, or add a simple per-port tunnel cache on the Scaffold to avoid repeated port-forward setup in tests that call this helper multiple times.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented

@Baoyuantop
Copy link
Copy Markdown
Contributor

Hi @johannes-engler-mw, please fix the failed CI

@johannes-engler-mw
Copy link
Copy Markdown
Author

Done, please retry CI.

@johannes-engler-mw
Copy link
Copy Markdown
Author

Fixed the failing e2e tests, it wasnt a regression but more an test config problem.

Please rerun CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: sectionname in parentrefs is ignored

4 participants