
feat: move mock images to official repo #852

Open
anirudh240 wants to merge 5 commits into volcano-sh:main from anirudh240:feat/move-mock-images-v2

Conversation

@anirudh240

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
Move the vllm-mock and sglang-mock source code to hack/mock-images and add a workflow to build and push them to ghcr.io/volcano-sh. Update all image references from personal registries accordingly.
Which issue(s) this PR fixes:
Fixes #849

Special notes for your reviewer:
The vllm-mock source was extracted directly from the existing ghcr.io/yaozengzeng/vllm-mock:latest image since the source repo was private. The behavior is unchanged.
Does this PR introduce a user-facing change?:


Copilot AI review requested due to automatic review settings April 1, 2026 00:40
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hzxuzhonghu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request migrates the vllm-mock image to the volcano-sh registry across documentation, examples, and E2E tests. It also introduces new mock server implementations for SGLang and vLLM to support runtime verification and Prometheus metrics simulation. The review feedback identifies a missing /health endpoint required for Kubernetes readiness probes, a copy-paste error in the DEPLOYMENT_NAME environment variable initialization, and incorrect assertions in the test suite that do not match the mock server's output.

Comment on lines +148 to +149



high

The mock server is missing a /health endpoint. This is required because the readiness probes in the deployment templates (e.g., pkg/model-booster-controller/convert/templates/vllm.yaml) expect a /health endpoint on port 8000. Without this, the pods will fail to become ready.

Suggested change
@app.route('/health')
def health():
return "ok", 200

overrides = {}

MODEL_NAME = os.getenv('MODEL_NAME', 'deepseek-r1-1-5b')
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')

medium

The DEPLOYMENT_NAME variable is incorrectly initialized using the MODEL_NAME environment variable. This appears to be a copy-paste error and will cause the mock to query the wrong deployment for replica counts if MODEL_NAME is set.

Suggested change
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME', 'deepseek-r1')

# the metrics and results in lots of meaningless requests that we do not want to log.
def disable_endpoint_logs():
"""Disable logs for requests to specific endpoints."""
disabled_endpoints = ('/', '/healthz', '/metrics', '/v1/models')

medium

Add /health to the list of disabled endpoints to suppress logs for health checks, ensuring consistency with the endpoint used by Kubernetes probes.

Suggested change
disabled_endpoints = ('/', '/healthz', '/metrics', '/v1/models')
disabled_endpoints = ('/', '/health', '/healthz', '/metrics', '/v1/models')

Comment on lines +23 to +29
self.assertIn(
f'vllm:request_success_total{{finished_reason="stop",model_name="llama2-70b"}} {expected_total / replica}',
data)
self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}',
data)
self.assertIn(
f'vllm:avg_generation_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}', data)

medium

The test assertions are incorrect. They expect labels (finished_reason) and a model name (llama2-70b) that are not produced by the current implementation in app.py. Furthermore, the expected values do not match the default behavior where replicas defaults to 1, resulting in a value of 100.0 instead of 33.33.

Suggested change
self.assertIn(
f'vllm:request_success_total{{finished_reason="stop",model_name="llama2-70b"}} {expected_total / replica}',
data)
self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}',
data)
self.assertIn(
f'vllm:avg_generation_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}', data)
# assert metric value
self.assertIn(
f'vllm:request_success_total{{model_name="deepseek-r1-1-5b"}} 100.0',
data)
self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="deepseek-r1-1-5b"}} 100.0',
data)
self.assertIn(
f'vllm:avg_generation_throughput_toks_per_s{{model_name="deepseek-r1-1-5b"}} 100.0', data)

Contributor

Copilot AI left a comment

Pull request overview

This PR moves vLLM and SGLang mock image source code from personal registries to the official volcano-sh repository, enabling centralized maintenance and easier future updates. The mock images are used for end-to-end testing without deploying actual LLM servers.

Changes:

  • Added vllm-mock source code (hack/mock-images/vllm-mock/) with Python Flask application, simulator, tests, and Dockerfile
  • Added sglang-mock source code (hack/mock-images/sglang-mock/) with Python Flask application and Dockerfile
  • Updated all image references across YAML test files and examples from personal registries to ghcr.io/volcano-sh/vllm-mock:latest and ghcr.io/volcano-sh/sglang-mock:latest
  • Added GitHub Actions workflow to automatically build and push mock images to GHCR

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
hack/mock-images/vllm-mock/ Complete vLLM mock server implementation with Flask app, simulator, tests, and Docker configuration
hack/mock-images/sglang-mock/ SGLang mock server implementation with Flask app and Docker configuration
.github/workflows/build-mock-images.yml Automated workflow to build and push images to ghcr.io/volcano-sh
test/e2e/router/testdata/*.yaml Updated all vLLM mock image references to official registry
examples/kthena-router/*.yaml Updated all vLLM mock image references to official registry
test/e2e/controller-manager/model_booster_test.go Updated mock image references in test fixtures
docs/kthena/**/*.yaml Updated mock image references across all documentation versions


branches:
- main
paths:
- kthena/hack/mock-images/**
Copilot AI Apr 1, 2026

The workflow file references kthena/hack/mock-images/ in both the paths trigger and the context fields, but the actual mock image files are located at hack/mock-images/, not under a kthena/ subdirectory. This mismatch will cause the workflow trigger to never fire when mock image files are changed, and the build context paths will be incorrect. The paths should be hack/mock-images/** for both the trigger and the build contexts should be hack/mock-images/vllm-mock and hack/mock-images/sglang-mock.

overrides = {}

MODEL_NAME = os.getenv('MODEL_NAME', 'deepseek-r1-1-5b')
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
Copilot AI Apr 1, 2026

The variable DEPLOYMENT_NAME is being set using the MODEL_NAME environment variable instead of DEPLOYMENT_NAME. This is inconsistent and will cause incorrect deployment name lookups. It should use os.getenv('DEPLOYMENT_NAME', 'deepseek-r1') to get the correct deployment name.

Suggested change
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME', 'deepseek-r1')

{
"name": "request_prompt_tokens",
"type": "histogram",
"description": "Histogram of number of prefill tokens processed..",
Copilot AI Apr 1, 2026

Typo: Double period at end of metric description. Should be a single period.

{
"name": "request_generation_tokens",
"type": "histogram",
"description": "Histogram of number of generation tokens processed..",
Copilot AI Apr 1, 2026

Typo: Double period at end of metric description. Should be a single period.

# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
Copilot AI Apr 1, 2026

The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.

Suggested change
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.

# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
Copilot AI Apr 1, 2026

The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.

Suggested change
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.

# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
Copilot AI Apr 1, 2026

The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.

Suggested change
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.

# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
Copilot AI Apr 1, 2026

The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.

Suggested change
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.

- name: Build and push vllm-mock
uses: docker/build-push-action@v5
with:
context: kthena/hack/mock-images/vllm-mock
Copilot AI Apr 1, 2026

The build context path is incorrect. It should be hack/mock-images/vllm-mock instead of kthena/hack/mock-images/vllm-mock, as the actual files are located at hack/mock-images/, not under a kthena/ subdirectory.

- name: Build and push sglang-mock
uses: docker/build-push-action@v5
with:
context: kthena/hack/mock-images/sglang-mock
Copilot AI Apr 1, 2026

The build context path is incorrect. It should be hack/mock-images/sglang-mock instead of kthena/hack/mock-images/sglang-mock, as the actual files are located at hack/mock-images/, not under a kthena/ subdirectory.

@FAUST-BENCHOU
Contributor

/hold
we need to discuss this with the community first

@hzxuzhonghu
Member

I think this idea is good, but the image should be multi-arch.

And since the mocks originated from vllm/sglang, we should keep their copyright.

@anirudh240
Author

yes, I kinda messed up anyway

@FAUST-BENCHOU
Contributor

also aibrix's copyright, I think

@@ -0,0 +1,36 @@
import threading
Member

We should support specifying model name through env, ref: https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/LLM-Mock-ds1.5b.yaml#L26

And it should also be reflected in the exposed metrics.
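The request above can be sketched as follows. This is a minimal illustration, not the actual app.py: it assumes a `render_metrics` helper (a hypothetical name) that reads the model name from the `MODEL_NAME` environment variable and reflects it in the Prometheus label, using the metric names quoted elsewhere in this review.

```python
import os

def render_metrics(success_total: float) -> str:
    """Render a tiny Prometheus exposition snippet whose model_name label
    comes from the MODEL_NAME environment variable (default per the mock)."""
    model_name = os.getenv("MODEL_NAME", "deepseek-r1-1-5b")
    lines = [
        f'vllm:request_success_total{{model_name="{model_name}"}} {success_total}',
        f'vllm:avg_prompt_throughput_toks_per_s{{model_name="{model_name}"}} {success_total}',
    ]
    return "\n".join(lines) + "\n"
```

In the real mock the same env-derived name would be used both by the request handlers and by the /metrics endpoint, so the label always matches the configured model.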

@YaoZengzeng
Member

Also make sure all tests and checks pass first.

Copilot AI review requested due to automatic review settings April 1, 2026 13:36
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 38 out of 38 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (5)

pytest.ini:1

  • With testpaths = python, the newly added tests under hack/mock-images/** (e.g., hack/mock-images/vllm-mock/test_app.py) will not be collected by default, so the mock server behavior isn't actually covered in CI. If these tests are intended to run in this repo, include hack/mock-images in testpaths (or remove this restriction and rely on standard discovery) so they are executed.
    python/kthena/pyproject.toml:1
  • Pytest configuration is now duplicated in two places (pytest.ini and [tool.pytest.ini_options]), which can drift and confuse tooling (especially when options differ later). Prefer keeping pytest configuration in a single location (either root pytest.ini or pyproject.toml) to avoid split-brain configuration.
    hack/mock-images/vllm-mock/test_app.py:1
  • This import depends on the current working directory / PYTHONPATH being set so that app.py is importable as a top-level module. When running tests from repo root (typical CI), this will likely fail because hack/mock-images/vllm-mock is not on sys.path and the directory name mock-images cannot be imported as a package. Consider adjusting the test to add the local directory to sys.path (or restructure the mock image code into an importable module) so the test can run reliably.
    hack/mock-images/vllm-mock/test_app.py:1
  • The test is brittle because it asserts exact float string formatting from /metrics (which can vary, e.g., 33.333333333333336 vs 33.33333333333333) and it uses a raw print that will spam CI logs on failures. Prefer parsing the Prometheus exposition format (or using regex with a numeric tolerance) and avoid unconditional prints (use assertions with helpful messages or logging only on failure).
    hack/mock-images/vllm-mock/test_app.py:1
  • The test is brittle because it asserts exact float string formatting from /metrics (which can vary, e.g., 33.333333333333336 vs 33.33333333333333) and it uses a raw print that will spam CI logs on failures. Prefer parsing the Prometheus exposition format (or using regex with a numeric tolerance) and avoid unconditional prints (use assertions with helpful messages or logging only on failure).
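The "numeric tolerance" suggestion in the last two bullets can be sketched like this: instead of asserting on the exact float string in the /metrics text, extract the sample value with a regex and compare numerically. `metric_value` is a hypothetical helper name, not part of the PR.

```python
import math
import re

def metric_value(exposition: str, metric: str, model_name: str) -> float:
    """Extract one sample value from Prometheus text exposition output."""
    prefix = f'{metric}{{model_name="{model_name}"}} '
    match = re.search(re.escape(prefix) + r"([0-9.eE+-]+)", exposition)
    if match is None:
        raise AssertionError(f"metric {metric} not found in exposition")
    return float(match.group(1))

# Example: the exact rendering of 100/3 may vary, but the numeric check is stable.
data = 'vllm:request_success_total{model_name="deepseek-r1-1-5b"} 33.333333333333336\n'
value = metric_value(data, "vllm:request_success_total", "deepseek-r1-1-5b")
assert math.isclose(value, 100.0 / 3, rel_tol=1e-9)
```

A fuller version could parse the whole exposition format, but even this regex-plus-`math.isclose` pattern removes the dependence on Python's float formatting.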


total = overrides.get("total", 100.0)
model_name = overrides.get("model_name", MODEL_NAME)
# calculate metrics with potential overrides
success_total = overrides.get("success_total", total / replicas)
Copilot AI Apr 1, 2026

success_total can raise a ZeroDivisionError when the Deployment is scaled to replicas = 0 (a valid state in Kubernetes). Align it with the throughput metrics by guarding replicas > 0 (or defaulting to 1) so /metrics remains scrapeable during scale-to-zero.

Suggested change
success_total = overrides.get("success_total", total / replicas)
success_total = overrides.get("success_total", total / replicas if replicas > 0 else 0)

try:
from kubernetes import client, config
except Exception as e:
print(f"Failed to import kubernetes, skip: {e}")
Copilot AI Apr 1, 2026

If the kubernetes import fails, client/config are not defined, but /metrics later unconditionally references client.AppsV1Api() and config.load_incluster_config(), which will crash with NameError. Either (a) make the import failure fatal in this mock image, or (b) set client = None/config = None and branch in /metrics and __main__ to skip k8s calls when unavailable.

Suggested change
print(f"Failed to import kubernetes, skip: {e}")
print(f"Failed to import kubernetes, exiting: {e}", file=sys.stderr)
sys.exit(1)

Comment on lines +339 to +341
@app.route('/set_metrics', methods=['POST'])
def set_metrics():
global overrides
Copilot AI Apr 1, 2026

/set_metrics allows arbitrary metric overrides without authentication, which enables anyone who can reach the service to manipulate metrics and potentially break e2e assertions/scrapes. If the mock server supports an --api_key, consider protecting this endpoint with @auth.login_required (or restricting it behind a separate debug flag) to avoid unintended exposure.
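The "separate debug flag" option mentioned above can be sketched with a stdlib-only decorator (in the real app this would wrap the Flask view, or `@auth.login_required` from Flask-HTTPAuth could be used instead). `debug_only` and the `ENABLE_SET_METRICS` variable are illustrative names, not part of the PR.

```python
import os
from functools import wraps

def debug_only(view):
    """Refuse the request unless ENABLE_SET_METRICS=true is set in the env."""
    @wraps(view)
    def wrapper(*args, **kwargs):
        if os.getenv("ENABLE_SET_METRICS", "false").lower() != "true":
            return ("forbidden", 403)  # Flask-style (body, status) tuple
        return view(*args, **kwargs)
    return wrapper

@debug_only
def set_metrics():
    # stand-in for the real handler that mutates the global overrides dict
    return ("ok", 200)
```

Defaulting the flag to off means a plain deployment of the mock cannot have its metrics rewritten by anything that can reach the service.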

Comment on lines +1 to +3
flask
Flask-HTTPAuth
kubernetes
Copilot AI Apr 1, 2026

These dependencies are unpinned, which makes the image build non-reproducible and can cause sudden breakages when upstream releases change behavior (notably kubernetes and Flask-related packages). Pin versions (or at least constrain major versions) to keep ghcr.io/volcano-sh/vllm-mock:latest stable across rebuilds.

Suggested change
flask
Flask-HTTPAuth
kubernetes
flask>=2.0,<3.0
Flask-HTTPAuth>=4.0,<5.0
kubernetes>=28.0.0,<29.0.0

Comment on lines +28 to +33
- name: Build and push vllm-mock
uses: docker/build-push-action@v5
with:
context: hack/mock-images/vllm-mock
push: true
tags: ghcr.io/volcano-sh/vllm-mock:latest
Copilot AI Apr 1, 2026

Only pushing the :latest tag makes it hard to correlate a deployed image back to a specific commit and to roll back safely. Consider also pushing an immutable tag (e.g., :sha-${{ github.sha }} and/or a semver/date tag) alongside :latest so consumers can pin and debugging is easier.

move vllm-mock and sglang-mock source code to hack/mock-images
and add a workflow to build and push them to ghcr.io/volcano-sh.
update all image references from personal registries accordingly

Signed-off-by: Anirudh <2410030013@klh.edu.in>
@anirudh240 force-pushed the feat/move-mock-images-v2 branch from 222a6fd to eb92204 (April 1, 2026 13:46)
Signed-off-by: Anirudh <2410030013@klh.edu.in>
Copilot AI review requested due to automatic review settings April 1, 2026 13:49
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 38 out of 38 changed files in this pull request and generated 10 comments.

Comments suppressed due to low confidence (3)

hack/mock-images/vllm-mock/test_app.py:1

  • This test hard-codes replica = 3, but /metrics falls back to DEFAULT_REPLICAS when it can’t read the Deployment (which will typically be 1 in unit tests), so the expected value will be wrong and the test will fail. Make the test set DEFAULT_REPLICAS=3 (env var) or call /set_metrics to force deterministic values before asserting.
    hack/mock-images/vllm-mock/test_app.py:1
  • This test hard-codes replica = 3, but /metrics falls back to DEFAULT_REPLICAS when it can’t read the Deployment (which will typically be 1 in unit tests), so the expected value will be wrong and the test will fail. Make the test set DEFAULT_REPLICAS=3 (env var) or call /set_metrics to force deterministic values before asserting.
    python/kthena/pyproject.toml:1
  • This introduces pytest discovery configuration in pyproject.toml while a repo-level pytest.ini is also added. Having two places to define testpaths makes test discovery dependent on where pytest is invoked from. Consider consolidating to a single configuration source (preferably root-level for the repo), and ensure it includes any intended tests under hack/mock-images/** (otherwise hack/mock-images/vllm-mock/test_app.py won’t be collected by default).
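The determinism point in the first two suppressed comments can be illustrated with a small helper in the spirit of app.py's replica lookup (names are hypothetical, not the PR's code): when the Deployment lookup fails, the value silently falls back to `DEFAULT_REPLICAS`, so a test that assumes 3 replicas must pin that env var (or use /set_metrics) before asserting.

```python
import os

def compute_replicas(lookup=None) -> int:
    """Return the replica count from a k8s lookup, falling back to the
    DEFAULT_REPLICAS env var (default 1) when no lookup is available or it fails."""
    default = int(os.getenv("DEFAULT_REPLICAS", "1"))
    if lookup is None:
        return default
    try:
        return lookup()
    except Exception:
        # mirrors the mock's behavior of swallowing k8s API errors
        return default

# In a unit test, pin the fallback before exercising /metrics:
os.environ["DEFAULT_REPLICAS"] = "3"
```

Without that pin, a unit-test environment (no cluster, lookup raises) computes metrics with replicas = 1, and an assertion written for `total / 3` fails.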


@@ -0,0 +1,3 @@
{
"huggingface_token": "your huggingface token"
Copilot AI Apr 1, 2026

Committing a config.json meant for sensitive values (even with a placeholder) makes it easy to accidentally add real secrets later and also bakes this file into the built image via COPY . .. Prefer reading the token from an environment variable (or a mounted secret) and commit only a config.example.json (or remove the file entirely from the image build context).

Suggested change
"huggingface_token": "your huggingface token"
"note": "Configure your Hugging Face token via the HUGGINGFACE_TOKEN environment variable; do not store tokens in this file."

Comment on lines +28 to +32

try:
from kubernetes import client, config
except Exception as e:
print(f"Failed to import kubernetes, skip: {e}")
Copilot AI Apr 1, 2026

If the kubernetes import fails, client/config are undefined and /metrics will raise NameError (caught and silently masked), making the behavior dependent on an import side-effect. Consider explicitly setting client = None on import failure and guarding AppsV1Api() usage (or failing fast at startup if k8s support is required).

Suggested change
try:
from kubernetes import client, config
except Exception as e:
print(f"Failed to import kubernetes, skip: {e}")
try:
from kubernetes import client, config
except Exception as e:
print(f"Failed to import kubernetes, skip: {e}")
client = None
config = None

Comment on lines +477 to +487
@app.route('/metrics')
def metrics():
# get deployment information
try:
apps_v1 = client.AppsV1Api()
resp = apps_v1.read_namespaced_deployment(DEPLOYMENT_NAME, NAMESPACE)
replicas = resp.spec.replicas if resp.spec.replicas is not None else 1
except Exception as e:
#print(f"Failed to get deployment information: {DEPLOYMENT_NAME=} {NAMESPACE=} error={str(e)}")
#print(f"Due to the failure, replicas {DEFAULT_REPLICAS} will be used to calculate metrics")
replicas = DEFAULT_REPLICAS
Copilot AI Apr 1, 2026

If the kubernetes import fails, client/config are undefined and /metrics will raise NameError (caught and silently masked), making the behavior dependent on an import side-effect. Consider explicitly setting client = None on import failure and guarding AppsV1Api() usage (or failing fast at startup if k8s support is required).

Comment on lines +352 to +356
@app.route('/set_metrics', methods=['POST'])
def set_metrics():
global overrides
# Get JSON data from the request
data = request.json
Copilot AI Apr 1, 2026

The /set_metrics endpoint allows unauthenticated mutation of global metrics state. Even for a mock server, this can be abused if the service is reachable in-cluster or externally. Consider requiring @auth.login_required (consistent with the other endpoints) or gating this endpoint behind an explicit env flag (disabled by default).

Comment on lines +18 to +23
from flask import Response
from random import randint
import os

try:
from flask import Flask
Copilot AI Apr 1, 2026

The fallback try/except won’t work as intended because from flask import Response will already raise ImportError before reaching the guarded import. Move Flask/Response imports into the same guarded block (or remove the redundant check entirely).

Suggested change
from flask import Response
from random import randint
import os
try:
from flask import Flask
from random import randint
import os
try:
from flask import Flask, Response

Comment on lines +18 to +25
from flask import Response
from random import randint
import os

try:
from flask import Flask
except ImportError:
raise ImportError("flask is required. Run: pip install flask")
Copilot AI Apr 1, 2026

The fallback try/except won’t work as intended because from flask import Response will already raise ImportError before reaching the guarded import. Move Flask/Response imports into the same guarded block (or remove the redundant check entirely).

Suggested change
from flask import Response
from random import randint
import os
try:
from flask import Flask
except ImportError:
raise ImportError("flask is required. Run: pip install flask")
from random import randint
import os
try:
from flask import Flask, Response
except ImportError as e:
raise ImportError("flask is required. Run: pip install flask") from e

with:
context: hack/mock-images/sglang-mock
push: true
tags: ghcr.io/volcano-sh/sglang-mock:latest
Copilot AI Apr 1, 2026

Publishing only the :latest tag makes rollbacks and provenance difficult (you can’t reliably map a running image back to a commit). Consider also tagging with an immutable identifier (e.g., :${{ github.sha }}) and optionally keeping :latest as a moving alias.

#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/kthena/tree/main/hack/mock-images/vllm-mock`.
# Move the image to kthena registry once it's public.
Copilot AI Apr 1, 2026

The note on line 6 is now outdated/confusing since the image is already in the volcano-sh GHCR namespace and the PR adds an official build/push workflow. Update or remove the comment to reflect the current publishing location/source of truth.

Suggested change
# Move the image to kthena registry once it's public.
# The image is published in the `volcano-sh` GHCR namespace by the official kthena build/publish workflow.

"deepseek-r1-7b": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
}

# Polifill the necessary arguments.
Copilot AI Apr 1, 2026

Correct the typo 'Polifill' to 'Polyfill'.

Suggested change
# Polifill the necessary arguments.
# Polyfill the necessary arguments.


def read_configs(file_path):
"""
Reads a JSON file that store sensitive information.
Copilot AI Apr 1, 2026

Fix grammar in the docstring: 'store' -> 'stores'.

Suggested change
Reads a JSON file that store sensitive information.
Reads a JSON file that stores sensitive information.

Signed-off-by: Anirudh <2410030013@klh.edu.in>

Successfully merging this pull request may close these issues.

Move vLLM mock and sglang mock in kthena or volcano's repo

6 participants