Prefer most recent start time on duplicate Pod metrics#1778
dippynark wants to merge 1 commit into kubernetes-sigs:master
Conversation
/assign @RainbowMango

/assign @stevehipwell

/unassign @RainbowMango
What this PR does / why we need it:
There are situations where Metrics Server can scrape duplicate Pod metrics from different nodes (e.g. when kubelet metrics have become stale on a particular node due to a garbage collection bug). In this case Metrics Server ignores the duplicate:
metrics-server/pkg/scraper/scraper.go, lines 171 to 174 at 78192ed
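The referenced check is not reproduced here; for orientation, a minimal sketch of order-dependent "first result wins" deduplication, using hypothetical names (`podKey`, `podMetrics`, `mergeFirstWins`) rather than the actual metrics-server types:

```go
package dedup

// podKey and podMetrics are hypothetical stand-ins for the identifiers
// and per-pod data that metrics-server aggregates across nodes; they are
// not the actual metrics-server types.
type podKey struct{ namespace, name string }

type podMetrics struct {
	key      podKey
	cpuNanos uint64
}

// mergeFirstWins keeps the first copy of each pod's metrics it sees and
// silently drops later duplicates, so which copy survives depends on the
// order in which the parallel per-node scrapes happen to complete.
func mergeFirstWins(scraped []podMetrics) map[podKey]podMetrics {
	out := make(map[podKey]podMetrics, len(scraped))
	for _, pm := range scraped {
		if _, found := out[pm.key]; found {
			continue // duplicate of a pod already scraped from another node: ignored
		}
		out[pm.key] = pm
	}
	return out
}
```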
However, because Metrics Server scrapes Pod metrics from each node in parallel, which copy of the metrics ends up being used for calculating utilisation is fairly random and can change from scrape to scrape due to small differences in request latency. This can make utilisation swing dramatically between scrapes (e.g. for CPU, we report a utilisation of 0 if we scrape the stale node first twice in a row, or no metrics at all if we scrape the current node and then the stale node, due to this PR).
Instead, this PR uses the metrics corresponding to the Pod with the latest container start time.
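A minimal sketch of the idea, again with hypothetical types and names (`mergeLatestStartWins`, `startTime`) rather than the PR's actual change:

```go
package dedup

import "time"

// podKey and podMetrics are hypothetical stand-ins, not the actual
// metrics-server types; startTime is the latest container start time
// reported for the pod in this scrape result.
type podKey struct{ namespace, name string }

type podMetrics struct {
	key       podKey
	startTime time.Time
	cpuNanos  uint64
}

// mergeLatestStartWins deduplicates metrics for the same pod scraped from
// different nodes, preferring the copy whose containers started most
// recently, on the assumption that it comes from the node currently
// running the pod rather than from stale kubelet data.
func mergeLatestStartWins(scraped []podMetrics) map[podKey]podMetrics {
	out := make(map[podKey]podMetrics, len(scraped))
	for _, pm := range scraped {
		existing, found := out[pm.key]
		if !found || pm.startTime.After(existing.startTime) {
			out[pm.key] = pm
		}
	}
	return out
}
```

With this rule the outcome is deterministic regardless of scrape order: scraping the stale node before or after the current node yields the same surviving metrics.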
I am running this PR in our dev environment, where it allows Metrics Server to carry on providing accurate metrics despite kubelet serving stale ones.