feat(client): reload in-cluster CA bundle on rotation (rustls-tls) by chrnorm · Pull Request #1962 · kube-rs/kube

chrnorm · 2026-03-17T15:49:30Z

Motivation

Config::incluster() reads /var/run/secrets/kubernetes.io/serviceaccount/ca.crt once at startup and bakes the bytes into a RootCertStore. After the cluster CA rotates, new TLS handshakes fail with cert errors until the process restarts. The projected service account volume already swaps the file in place — kube just never re-reads it.

TokenFile already solves the symmetric problem for the sibling token file in the same projected volume (re-reads every 60s). This PR adds the same treatment for ca.crt.

Closes #1953. Related client-go issue: kubernetes/kubernetes#119483.

Solution

Config.root_cert_file: Option<PathBuf> — new field, set automatically by Config::incluster(). Takes precedence over root_cert for server cert verification when set.
ReloadingVerifier — a ServerCertVerifier that rebuilds an inner WebPkiServerVerifier on a ~60s timer. On reload failure it keeps serving with the stale roots rather than failing closed.
rustls-tls only — openssl-tls path is unchanged.
Config is now #[non_exhaustive] — per review feedback on the issue, so future field additions don't break downstream struct literals again. Users who were constructing Config { ... } directly need to switch to Config::new() + field mutation (already recommended by the existing docs).
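
The reload behaviour described for ReloadingVerifier can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `Reloading`, its `load` closure, and the choice to reset the timer after a failed reload are all assumptions made for the example. The real type wraps a rustls WebPkiServerVerifier and reads ca.crt rather than calling a generic closure.

```rust
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for ReloadingVerifier: caches a value produced by
/// `load` and rebuilds it once `interval` has elapsed since the last attempt.
pub struct Reloading<T, F> {
    inner: RwLock<(Arc<T>, Instant)>,
    load: F,
    interval: Duration,
}

impl<T, F> Reloading<T, F>
where
    F: Fn() -> Result<T, String>,
{
    /// The initial load must succeed; later failures are tolerated.
    pub fn new(load: F, interval: Duration) -> Result<Self, String> {
        let initial = load()?;
        Ok(Self {
            inner: RwLock::new((Arc::new(initial), Instant::now())),
            load,
            interval,
        })
    }

    /// Returns the cached value, reloading it first if it is older than `interval`.
    pub fn current(&self) -> Arc<T> {
        {
            // Fast path: a shared read lock is enough while the value is fresh.
            let guard = self.inner.read().unwrap_or_else(|e| e.into_inner());
            if guard.1.elapsed() < self.interval {
                return guard.0.clone();
            }
        }
        let mut guard = self.inner.write().unwrap_or_else(|e| e.into_inner());
        // Re-check under the write lock: another thread may have reloaded already.
        if guard.1.elapsed() < self.interval {
            return guard.0.clone();
        }
        // On failure the stale value is kept instead of returning an error.
        if let Ok(fresh) = (self.load)() {
            guard.0 = Arc::new(fresh);
        }
        // Assumption: the timer is reset either way, so a broken source is
        // retried once per interval rather than on every call.
        guard.1 = Instant::now();
        guard.0.clone()
    }
}
```

Callers would invoke `current()` on every handshake and delegate to the returned value. The common case takes only the read lock, so concurrent handshakes do not serialize on each other.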

Config::incluster() reads /var/run/secrets/kubernetes.io/serviceaccount/ca.crt once and bakes the bytes into a RootCertStore. After CA rotation, new TLS handshakes fail until the process restarts.

TokenFile already re-reads the sibling token file in that same projected volume every 60s. This adds the symmetric piece for ca.crt:

- Config.root_cert_file: Option<PathBuf>, set by Config::incluster()
- ReloadingVerifier: ServerCertVerifier that rebuilds an inner WebPkiServerVerifier on a 60s timer, keeps stale roots on reload failure
- rustls-tls only; openssl-tls unchanged

Config is now #[non_exhaustive] so this field addition (and future ones) doesn't break downstream struct literals again.

Closes kube-rs#1953

Signed-off-by: Chris Norman <[email protected]>

codecov · 2026-03-17T16:02:37Z

Codecov Report

❌ Patch coverage is 78.12500% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.4%. Comparing base (288053e) to head (bc0e7db).
⚠️ Report is 9 commits behind head on main.

Files with missing lines	Patch %	Lines
kube-client/src/client/tls.rs	81.5%	10 Missing ⚠️
kube-client/src/client/config_ext.rs	57.2%	3 Missing ⚠️
kube-client/src/config/mod.rs	66.7%	1 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##            main   #1962     +/-   ##
=======================================
+ Coverage   76.4%   76.4%   +0.1%     
=======================================
  Files         89      89             
  Lines       8540    8602     +62     
=======================================
+ Hits        6520    6568     +48     
- Misses      2020    2034     +14

Files with missing lines	Coverage Δ
kube-client/src/config/incluster_config.rs	`67.5% <ø> (ø)`
kube-client/src/config/mod.rs	`54.6% <66.7%> (-0.1%)`	⬇️
kube-client/src/client/config_ext.rs	`52.5% <57.2%> (-0.1%)`	⬇️
kube-client/src/client/tls.rs	`86.1% <81.5%> (-3.2%)`	⬇️

... and 2 files with indirect coverage changes

doxxx93 · 2026-03-19T09:17:16Z

+                let guard = self.inner.read().unwrap();
+                if guard.1.elapsed() < Self::RELOAD_INTERVAL {
+                    return guard.0.clone();
+                }
+            }
+            let mut guard = self.inner.write().unwrap();


nit: Consider using .unwrap_or_else(|e| e.into_inner()) instead of .unwrap() on both the read lock (L153) and write lock (L158).

Realistically there is no panic path inside the write-lock critical section, so poisoning is extremely unlikely. But this verifier sits on the critical path of every TLS handshake — if the lock were ever poisoned:

.unwrap() → panic → process crash

.unwrap_or_else(|e| e.into_inner()) → falls back to stale roots → still serves during CA overlap period, and even after overlap it fails with a TLS error (retryable) rather than a panic (process death)

Two-line change, zero cost on the happy path, and consistent with the "keep stale on failure" policy already applied to file-reload errors in L162-166.
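
The difference between the two lock-handling styles can be seen in a small standalone sketch. The helper names `poison` and `read_tolerant` are made up for this illustration, and the `String` payload stands in for the verifier state. Only the `unwrap_or_else(|e| e.into_inner())` idiom comes from the suggestion above.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Poisons the lock by panicking in another thread while it holds the write guard.
pub fn poison(lock: &Arc<RwLock<String>>) {
    let l = Arc::clone(lock);
    // join() returns Err because the thread panicked; that is expected here.
    let _ = thread::spawn(move || {
        let _guard = l.write().unwrap();
        panic!("simulated panic while holding the write lock");
    })
    .join();
}

/// Reads the value even if the lock is poisoned, by taking the guard out of
/// the PoisonError instead of unwrapping (which would panic).
pub fn read_tolerant(lock: &RwLock<String>) -> String {
    lock.read().unwrap_or_else(|e| e.into_inner()).clone()
}
```

After `poison(&lock)`, a plain `lock.read().unwrap()` would panic, while `read_tolerant(&lock)` still returns the last value written before the panic.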

doxxx93

Clean approach — rustls's ServerCertVerifier trait makes this much simpler than the equivalent client-go fix (kubernetes/kubernetes#119483, which took ~2.5 years to land). The double-check pattern in current() is correct, the fail-open policy on reload errors mirrors TokenFile, and the test coverage hits the key scenarios.

Left one minor nit on lock poison handling (L153/L158), but not blocking — the realistic chance of triggering it is near zero.

doxxx93 reviewed Mar 19, 2026

doxxx93 approved these changes Mar 19, 2026

chrnorm wants to merge 1 commit into kube-rs:main from chrnorm:incluster-ca-reload
