Skip to content

[SPARK-56275] Set appProtocol on spark-connect service port#599

Open
j1wonpark wants to merge 1 commit intoapache:mainfrom
j1wonpark:add-app-protocol-to-services
Open

[SPARK-56275] Set appProtocol on spark-connect service port#599
j1wonpark wants to merge 1 commit intoapache:mainfrom
j1wonpark:add-app-protocol-to-services

Conversation

@j1wonpark
Copy link
Copy Markdown
Contributor

@j1wonpark j1wonpark commented Mar 28, 2026

What changes were proposed in this pull request?

Set appProtocol to "grpc" on the Spark Connect driver service port (spark-connect) in SparkAppResourceSpec. Ports that already have an appProtocol set by the user are left unchanged.

Why are the changes needed?

Without appProtocol: grpc, service meshes (e.g. Istio, Linkerd) treat gRPC traffic as opaque TCP. This prevents L7-aware load balancing, protocol-specific retries, and proper gRPC health checking for Spark Connect sessions.

This change is limited to the spark-connect port because gRPC is unambiguous regardless of SSL configuration. Other ports (HTTP-based) are left untouched since their protocol depends on spark.ssl.* settings and will be addressed separately.

Does this PR introduce any user-facing change?

No. The appProtocol field is advisory and has no effect in environments without a service mesh.

How was this patch tested?

  • testSparkConnectAppProtocol: verifies the spark-connect port receives appProtocol: "grpc" and other ports remain unset.
  • testSparkConnectPreservesExistingAppProtocol: verifies that a pre-existing appProtocol value is not overwritten.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.6

Copy link
Copy Markdown
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't reference a 3rd party library inside ASF project because we are independent projects and respect their IPs. We don't want to copy and paste from someone-else's effort, @j1wonpark .

See also: kubeflow/spark-operator#2823

@j1wonpark
Copy link
Copy Markdown
Contributor Author

@dongjoon-hyun
Thank you for the feedback. I’ve removed the reference to the third-party project from the PR description.

Copy link
Copy Markdown
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot hard-cord http naively because Apache Spark has a security feature like spark.ssl.ui.enabled=true (and corresponding configurations) for both SparkApp and SparkCluster CRDs.

@dongjoon-hyun
Copy link
Copy Markdown
Member

Unfortunately, this PR seems to be generated by the hallucination . Have you use this code in your company production, @j1wonpark ?

@j1wonpark
Copy link
Copy Markdown
Contributor Author

j1wonpark commented Mar 30, 2026

Unfortunately, this PR seems to be generated by the hallucination . Have you use this code in your company production, @j1wonpark ?

Thank you for the feedback. You're right that hardcoding http is incorrect when SSL is enabled — I overlooked that Spark can handle SSL internally via spark.ssl.*.enabled configurations.

We haven't used this in production yet. We're currently testing the operator with Istio, and the primary motivation was appProtocol: grpc on the spark-connect port. Without it, Istio treats gRPC as plain TCP, which breaks L7 load balancing.

I'd like to revise this PR to only set appProtocol on ports where the value is unambiguous regardless of SSL settings: tcp for RPC/block-manager ports and grpc for spark-connect. HTTP-based ports would be left unset. Does that sound reasonable?

@dongjoon-hyun
Copy link
Copy Markdown
Member

Shall we focus on spark-connect in this PR, @j1wonpark ? It's good enough contribution as your first commit. 😄 We can consider the others later independently.

@j1wonpark j1wonpark changed the title [SPARK-56275] Set appProtocol on Kubernetes service ports [SPARK-56275] Set appProtocol on spark-connect service port Mar 31, 2026
@j1wonpark j1wonpark force-pushed the add-app-protocol-to-services branch from b9c1e43 to 89b4afa Compare March 31, 2026 05:32
@j1wonpark
Copy link
Copy Markdown
Contributor Author

@dongjoon-hyun Thank you for the guidance! Narrowed the scope to only set appProtocol: "grpc" on the spark-connect port. All other changes have been removed.

@j1wonpark j1wonpark requested a review from dongjoon-hyun March 31, 2026 06:48
Copy link
Copy Markdown
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @j1wonpark .

For this one, I reviewed again but I'd like to recommend you to make a PR to the main Apache Spark repository for DriverServiceFeatureStep class instead of here. You can revise the JIRA issue, SPARK-56275, as a subtask of SPARK-54137 .

Here is the original code.

https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/DriverServiceFeatureStep.scala

I can help you there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants