Add nydusd_image_info metric mapping daemons to served images#739

Open
Fricounet wants to merge 1 commit into containerd:main from DataDog:fricounet/upstream/nydusd-image-info

@Fricounet Fricounet commented Apr 16, 2026

Overview

Currently, the Prometheus metrics offer no way to map a daemon ID to an image reference, which makes debugging cumbersome.
Add a new Prometheus gauge metric nydusd_image_info that maps nydus daemon IDs
to the image references they serve. For dedicated daemons this is a 1:1 mapping;
for shared daemons it is 1:N.

Related Issues

None.

Change Details

  • New nydusd_image_info gauge with labels {daemon_id, image_ref}, set to 1
    for each active daemon-to-image association.
  • Metric is collected in AddRafsInstance and deleted in RemoveRafsInstance.
  • DestroyDaemon now calls RemoveRafsInstance for any remaining RAFS instances
    before UmountRafsInstances, covering error/forced-shutdown paths where
    individual Umount calls were skipped.
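
The lifecycle described above can be modelled with a small sketch. This is not the code from this PR and it does not use the Prometheus client library; it is a standard-library-only illustration of an info-style gauge keyed by {daemon_id, image_ref}. The names ImageInfo, Add, Remove, and RemoveDaemon are hypothetical and stand in for whatever AddRafsInstance, RemoveRafsInstance, and DestroyDaemon actually call.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// imageInfoKey is one label pair of the info-style gauge (hypothetical type).
type imageInfoKey struct {
	DaemonID string
	ImageRef string
}

// ImageInfo models a gauge whose value is always 1: a series exists
// for every active daemon-to-image association and is absent otherwise.
type ImageInfo struct {
	mu     sync.Mutex
	series map[imageInfoKey]struct{}
}

func NewImageInfo() *ImageInfo {
	return &ImageInfo{series: make(map[imageInfoKey]struct{})}
}

// Add records that daemonID serves imageRef (assumed to run when a RAFS instance is added).
func (m *ImageInfo) Add(daemonID, imageRef string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.series[imageInfoKey{daemonID, imageRef}] = struct{}{}
}

// Remove drops one association and reports whether it existed.
func (m *ImageInfo) Remove(daemonID, imageRef string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	k := imageInfoKey{daemonID, imageRef}
	_, ok := m.series[k]
	delete(m.series, k)
	return ok
}

// RemoveDaemon drops every remaining association of a daemon, modelling the
// destroy path where individual removals were skipped. It returns the count removed.
func (m *ImageInfo) RemoveDaemon(daemonID string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	n := 0
	for k := range m.series {
		if k.DaemonID == daemonID {
			delete(m.series, k)
			n++
		}
	}
	return n
}

// Render prints the series in a Prometheus-like text form, sorted for stable output.
// %q is only an approximation of Prometheus label escaping.
func (m *ImageInfo) Render() string {
	m.mu.Lock()
	defer m.mu.Unlock()
	lines := make([]string, 0, len(m.series))
	for k := range m.series {
		lines = append(lines, fmt.Sprintf("nydusd_image_info{daemon_id=%q,image_ref=%q} 1", k.DaemonID, k.ImageRef))
	}
	sort.Strings(lines)
	return strings.Join(lines, "\n")
}

func main() {
	m := NewImageInfo()
	m.Add("shared", "localhost:5000/a:nydus") // shared daemon: 1:N
	m.Add("shared", "localhost:5000/b:nydus")
	m.Add("dedicated", "localhost:5000/c:nydus") // dedicated daemon: 1:1
	m.Remove("shared", "localhost:5000/a:nydus")
	m.RemoveDaemon("shared") // forced shutdown cleans up what is left
	fmt.Println(m.Render())
}
```

Keeping the value fixed at 1 and deleting the series on removal (rather than setting it to 0) means a query only ever returns associations that are currently active.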

Test Results

Tested locally, metrics are added as expected and cleaned up when the image is removed:

$ curl -s http://localhost:9110/v1/metrics | grep image_info
# HELP nydusd_image_info Mapping of nydus daemon to served image references.
# TYPE nydusd_image_info gauge
nydusd_image_info{daemon_id="d7gal69jjoq0eer34190",image_ref="localhost:5000/my-image:nydus"} 1
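
For debugging, output like the above can be turned back into a daemon-to-images map. The following is a minimal sketch, assuming the label order and the value 1 shown in the sample; ParseImageInfo is a hypothetical helper, not part of this PR, and a real consumer would more likely query Prometheus or use a proper exposition-format parser.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// imageInfoLine matches one nydusd_image_info sample in the form shown in the
// test output (labels in the order daemon_id, image_ref; value 1).
var imageInfoLine = regexp.MustCompile(`^nydusd_image_info\{daemon_id="([^"]*)",image_ref="([^"]*)"\}\s+1$`)

// ParseImageInfo maps each daemon ID to the image references it serves.
// Comment lines (# HELP, # TYPE) and unrelated metrics are skipped.
func ParseImageInfo(metrics string) map[string][]string {
	out := make(map[string][]string)
	sc := bufio.NewScanner(strings.NewReader(metrics))
	for sc.Scan() {
		m := imageInfoLine.FindStringSubmatch(strings.TrimSpace(sc.Text()))
		if m == nil {
			continue
		}
		out[m[1]] = append(out[m[1]], m[2])
	}
	return out
}

func main() {
	sample := `# HELP nydusd_image_info Mapping of nydus daemon to served image references.
# TYPE nydusd_image_info gauge
nydusd_image_info{daemon_id="d7gal69jjoq0eer34190",image_ref="localhost:5000/my-image:nydus"} 1`
	for id, refs := range ParseImageInfo(sample) {
		fmt.Printf("%s -> %v\n", id, refs)
	}
	// prints: d7gal69jjoq0eer34190 -> [localhost:5000/my-image:nydus]
}
```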

Change Type

  • Bug Fix
  • Feature Addition
  • Documentation Update
  • Code Refactoring
  • Performance Improvement
  • Other (please describe)

Self-Checklist

  • I have run a code style check and addressed any warnings/errors.
  • I have added appropriate comments to my code (if applicable).
  • I have updated the documentation (if applicable).
  • I have written appropriate unit tests.

Add a new gauge metric mapping daemon IDs to served image
references. For dedicated daemons this is 1:1, for shared
daemons it is 1:N.

The metric is collected on AddRafsInstance and deleted on
RemoveRafsInstance. DestroyDaemon now also cleans up any
remaining RAFS instances to handle error/forced-shutdown
paths where individual Umount calls were skipped.

codecov bot commented Apr 16, 2026

Codecov Report

❌ Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 24.42%. Comparing base (fc330cc) to head (07e2aad).
⚠️ Report is 23 commits behind head on main.

Files with missing lines              Patch %   Lines
pkg/metrics/collector/daemon.go       0.00%     4 Missing ⚠️
pkg/daemon/daemon.go                  0.00%     3 Missing ⚠️
pkg/manager/manager.go                0.00%     2 Missing ⚠️
pkg/metrics/collector/collector.go    0.00%     2 Missing ⚠️
pkg/metrics/registry/registry.go      0.00%     1 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #739      +/-   ##
==========================================
+ Coverage   22.02%   24.42%   +2.39%     
==========================================
  Files         130      132       +2     
  Lines       11931    12249     +318     
==========================================
+ Hits         2628     2992     +364     
+ Misses       8960     8893      -67     
- Partials      343      364      +21     
Files with missing lines              Coverage Δ
pkg/metrics/registry/registry.go      0.00% <0.00%> (ø)
pkg/manager/manager.go                0.00% <0.00%> (ø)
pkg/metrics/collector/collector.go    0.00% <0.00%> (ø)
pkg/daemon/daemon.go                  8.04% <0.00%> (+8.04%) ⬆️
pkg/metrics/collector/daemon.go       0.00% <0.00%> (ø)

... and 15 files with indirect coverage changes
