
feat: move mock images to official repo #852

Open
anirudh240 wants to merge 5 commits into volcano-sh:main from anirudh240:feat/move-mock-images-v2

Conversation

@anirudh240

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
Move the vllm-mock and sglang-mock source code to hack/mock-images and add a workflow to build and push them to ghcr.io/volcano-sh. Update all image references from personal registries accordingly.
Which issue(s) this PR fixes:
Fixes #849

Special notes for your reviewer:
The vllm-mock source was extracted directly from the existing ghcr.io/yaozengzeng/vllm-mock:latest image since the source repo was private. The behavior is unchanged.
Does this PR introduce a user-facing change?:


Copilot AI review requested due to automatic review settings April 1, 2026 00:40
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign hzxuzhonghu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request migrates the vllm-mock image to the volcano-sh registry across documentation, examples, and E2E tests. It also introduces new mock server implementations for SGLang and vLLM to support runtime verification and Prometheus metrics simulation. The review feedback identifies a missing /health endpoint required for Kubernetes readiness probes, a copy-paste error in the DEPLOYMENT_NAME environment variable initialization, and incorrect assertions in the test suite that do not match the mock server's output.

Comment on lines +148 to +149



high

The mock server is missing a /health endpoint. This is required because the readiness probes in the deployment templates (e.g., pkg/model-booster-controller/convert/templates/vllm.yaml) expect a /health endpoint on port 8000. Without this, the pods will fail to become ready.

Suggested change
@app.route('/health')
def health():
return "ok", 200

overrides = {}

MODEL_NAME = os.getenv('MODEL_NAME', 'deepseek-r1-1-5b')
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')

medium

The DEPLOYMENT_NAME variable is incorrectly initialized using the MODEL_NAME environment variable. This appears to be a copy-paste error and will cause the mock to query the wrong deployment for replica counts if MODEL_NAME is set.

Suggested change
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME', 'deepseek-r1')

# the metrics and results in lots of meaningless requests that we do not want to log.
def disable_endpoint_logs():
"""Disable logs for requests to specific endpoints."""
disabled_endpoints = ('/', '/healthz', '/metrics', '/v1/models')

medium

Add /health to the list of disabled endpoints to suppress logs for health checks, ensuring consistency with the endpoint used by Kubernetes probes.

Suggested change
disabled_endpoints = ('/', '/healthz', '/metrics', '/v1/models')
disabled_endpoints = ('/', '/health', '/healthz', '/metrics', '/v1/models')

Comment on lines +23 to +29
self.assertIn(
f'vllm:request_success_total{{finished_reason="stop",model_name="llama2-70b"}} {expected_total / replica}',
data)
self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}',
data)
self.assertIn(
f'vllm:avg_generation_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}', data)

medium

The test assertions are incorrect. They expect labels (finished_reason) and a model name (llama2-70b) that are not produced by the current implementation in app.py. Furthermore, the expected values do not match the default behavior where replicas defaults to 1, resulting in a value of 100.0 instead of 33.33.

Suggested change
self.assertIn(
f'vllm:request_success_total{{finished_reason="stop",model_name="llama2-70b"}} {expected_total / replica}',
data)
self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}',
data)
self.assertIn(
f'vllm:avg_generation_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}', data)
# assert metric value
self.assertIn(
f'vllm:request_success_total{{model_name="deepseek-r1-1-5b"}} 100.0',
data)
self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="deepseek-r1-1-5b"}} 100.0',
data)
self.assertIn(
f'vllm:avg_generation_throughput_toks_per_s{{model_name="deepseek-r1-1-5b"}} 100.0', data)

Contributor

Copilot AI left a comment

Pull request overview

This PR moves vLLM and SGLang mock image source code from personal registries to the official volcano-sh repository, enabling centralized maintenance and easier future updates. The mock images are used for end-to-end testing without deploying actual LLM servers.

Changes:

  • Added vllm-mock source code (hack/mock-images/vllm-mock/) with Python Flask application, simulator, tests, and Dockerfile
  • Added sglang-mock source code (hack/mock-images/sglang-mock/) with Python Flask application and Dockerfile
  • Updated all image references across YAML test files and examples from personal registries to ghcr.io/volcano-sh/vllm-mock:latest and ghcr.io/volcano-sh/sglang-mock:latest
  • Added GitHub Actions workflow to automatically build and push mock images to GHCR

Reviewed changes

Copilot reviewed 36 out of 36 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
hack/mock-images/vllm-mock/ Complete vLLM mock server implementation with Flask app, simulator, tests, and Docker configuration
hack/mock-images/sglang-mock/ SGLang mock server implementation with Flask app and Docker configuration
.github/workflows/build-mock-images.yml Automated workflow to build and push images to ghcr.io/volcano-sh
test/e2e/router/testdata/*.yaml Updated all vLLM mock image references to official registry
examples/kthena-router/*.yaml Updated all vLLM mock image references to official registry
test/e2e/controller-manager/model_booster_test.go Updated mock image references in test fixtures
docs/kthena/**/*.yaml Updated mock image references across all documentation versions


branches:
- main
paths:
- kthena/hack/mock-images/**
Copilot AI Apr 1, 2026

The workflow file references kthena/hack/mock-images/ in both the paths trigger and the context fields, but the actual mock image files are located at hack/mock-images/, not under a kthena/ subdirectory. This mismatch will cause the workflow trigger to never fire when mock image files are changed, and the build context paths will be incorrect. The paths should be hack/mock-images/** for both the trigger and the build contexts should be hack/mock-images/vllm-mock and hack/mock-images/sglang-mock.

overrides = {}

MODEL_NAME = os.getenv('MODEL_NAME', 'deepseek-r1-1-5b')
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
Copilot AI Apr 1, 2026

The variable DEPLOYMENT_NAME is being set using the MODEL_NAME environment variable instead of DEPLOYMENT_NAME. This is inconsistent and will cause incorrect deployment name lookups. It should use os.getenv('DEPLOYMENT_NAME', 'deepseek-r1') to get the correct deployment name.

Suggested change
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME', 'deepseek-r1')

{
"name": "request_prompt_tokens",
"type": "histogram",
"description": "Histogram of number of prefill tokens processed..",
Copilot AI Apr 1, 2026

Typo: Double period at end of metric description. Should be a single period.

{
"name": "request_generation_tokens",
"type": "histogram",
"description": "Histogram of number of generation tokens processed..",
Copilot AI Apr 1, 2026

Typo: Double period at end of metric description. Should be a single period.

# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
Copilot AI Apr 1, 2026

The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.

Suggested change
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.

# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
Copilot AI Apr 1, 2026

The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.

Suggested change
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.

# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
Copilot AI Apr 1, 2026

The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.

Suggested change
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.

# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
Copilot AI Apr 1, 2026

The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.

Suggested change
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.

- name: Build and push vllm-mock
uses: docker/build-push-action@v5
with:
context: kthena/hack/mock-images/vllm-mock
Copilot AI Apr 1, 2026

The build context path is incorrect. It should be hack/mock-images/vllm-mock instead of kthena/hack/mock-images/vllm-mock, as the actual files are located at hack/mock-images/, not under a kthena/ subdirectory.

- name: Build and push sglang-mock
uses: docker/build-push-action@v5
with:
context: kthena/hack/mock-images/sglang-mock
Copilot AI Apr 1, 2026

The build context path is incorrect. It should be hack/mock-images/sglang-mock instead of kthena/hack/mock-images/sglang-mock, as the actual files are located at hack/mock-images/, not under a kthena/ subdirectory.

@FAUST-BENCHOU
Contributor

/hold
we need to discuss this with the community first

@hzxuzhonghu
Member

I think this idea is good, but the image should be multi-arch.

And since the mocks originated from vllm/sglang, we should keep their copyright.

@anirudh240
Author

yes, I kinda messed up anyway

@FAUST-BENCHOU
Contributor

also aibrix's copyright, I think

@@ -0,0 +1,36 @@
import threading
Member

We should support specifying model name through env, ref: https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/LLM-Mock-ds1.5b.yaml#L26

And it should also be reflected in the exposed metrics.
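The request above can be sketched as follows. This is a minimal illustration, not the actual app.py: it assumes a `render_metrics` helper (a hypothetical name) that reads the model name from the `MODEL_NAME` environment variable and reflects it in the Prometheus label, using the metric names quoted elsewhere in this review.

```python
import os

def render_metrics(success_total: float) -> str:
    """Render a tiny Prometheus exposition snippet whose model_name label
    comes from the MODEL_NAME environment variable (default per the mock)."""
    model_name = os.getenv("MODEL_NAME", "deepseek-r1-1-5b")
    lines = [
        f'vllm:request_success_total{{model_name="{model_name}"}} {success_total}',
        f'vllm:avg_prompt_throughput_toks_per_s{{model_name="{model_name}"}} {success_total}',
    ]
    return "\n".join(lines) + "\n"
```

In the real mock the same env-derived name would be used both by the request handlers and by the /metrics endpoint, so the label always matches the configured model.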

@YaoZengzeng
Member

Also make sure all tests and checks pass first.

Copilot AI review requested due to automatic review settings April 1, 2026 13:36
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 38 out of 38 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (5)

pytest.ini:1

  • With testpaths = python, the newly added tests under hack/mock-images/** (e.g., hack/mock-images/vllm-mock/test_app.py) will not be collected by default, so the mock server behavior isn't actually covered in CI. If these tests are intended to run in this repo, include hack/mock-images in testpaths (or remove this restriction and rely on standard discovery) so they are executed.
    python/kthena/pyproject.toml:1
  • Pytest configuration is now duplicated in two places (pytest.ini and [tool.pytest.ini_options]), which can drift and confuse tooling (especially when options differ later). Prefer keeping pytest configuration in a single location (either root pytest.ini or pyproject.toml) to avoid split-brain configuration.
    hack/mock-images/vllm-mock/test_app.py:1
  • This import depends on the current working directory / PYTHONPATH being set so that app.py is importable as a top-level module. When running tests from repo root (typical CI), this will likely fail because hack/mock-images/vllm-mock is not on sys.path and the directory name mock-images cannot be imported as a package. Consider adjusting the test to add the local directory to sys.path (or restructure the mock image code into an importable module) so the test can run reliably.
    hack/mock-images/vllm-mock/test_app.py:1
  • The test is brittle because it asserts exact float string formatting from /metrics (which can vary, e.g., 33.333333333333336 vs 33.33333333333333) and it uses a raw print that will spam CI logs on failures. Prefer parsing the Prometheus exposition format (or using regex with a numeric tolerance) and avoid unconditional prints (use assertions with helpful messages or logging only on failure).
    hack/mock-images/vllm-mock/test_app.py:1
  • The test is brittle because it asserts exact float string formatting from /metrics (which can vary, e.g., 33.333333333333336 vs 33.33333333333333) and it uses a raw print that will spam CI logs on failures. Prefer parsing the Prometheus exposition format (or using regex with a numeric tolerance) and avoid unconditional prints (use assertions with helpful messages or logging only on failure).
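The "numeric tolerance" suggestion in the last two bullets can be sketched like this: instead of asserting on the exact float string in the /metrics text, extract the sample value with a regex and compare numerically. `metric_value` is a hypothetical helper name, not part of the PR.

```python
import math
import re

def metric_value(exposition: str, metric: str, model_name: str) -> float:
    """Extract one sample value from Prometheus text exposition output."""
    prefix = f'{metric}{{model_name="{model_name}"}} '
    match = re.search(re.escape(prefix) + r"([0-9.eE+-]+)", exposition)
    if match is None:
        raise AssertionError(f"metric {metric} not found in exposition")
    return float(match.group(1))

# Example: the exact rendering of 100/3 may vary, but the numeric check is stable.
data = 'vllm:request_success_total{model_name="deepseek-r1-1-5b"} 33.333333333333336\n'
value = metric_value(data, "vllm:request_success_total", "deepseek-r1-1-5b")
assert math.isclose(value, 100.0 / 3, rel_tol=1e-9)
```

A fuller version could parse the whole exposition format, but even this regex-plus-`math.isclose` pattern removes the dependence on Python's float formatting.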


total = overrides.get("total", 100.0)
model_name = overrides.get("model_name", MODEL_NAME)
# calculate metrics with potential overrides
success_total = overrides.get("success_total", total / replicas)
Copilot AI Apr 1, 2026

success_total can raise a ZeroDivisionError when the Deployment is scaled to replicas = 0 (a valid state in Kubernetes). Align it with the throughput metrics by guarding replicas > 0 (or defaulting to 1) so /metrics remains scrapeable during scale-to-zero.

Suggested change
success_total = overrides.get("success_total", total / replicas)
success_total = overrides.get("success_total", total / replicas if replicas > 0 else 0)

try:
from kubernetes import client, config
except Exception as e:
print(f"Failed to import kubernetes, skip: {e}")
Copilot AI Apr 1, 2026

If the kubernetes import fails, client/config are not defined, but /metrics later unconditionally references client.AppsV1Api() and config.load_incluster_config(), which will crash with NameError. Either (a) make the import failure fatal in this mock image, or (b) set client = None/config = None and branch in /metrics and __main__ to skip k8s calls when unavailable.

Suggested change
print(f"Failed to import kubernetes, skip: {e}")
print(f"Failed to import kubernetes, exiting: {e}", file=sys.stderr)
sys.exit(1)

Comment on lines +339 to +341
@app.route('/set_metrics', methods=['POST'])
def set_metrics():
global overrides
Copilot AI Apr 1, 2026

/set_metrics allows arbitrary metric overrides without authentication, which enables anyone who can reach the service to manipulate metrics and potentially break e2e assertions/scrapes. If the mock server supports an --api_key, consider protecting this endpoint with @auth.login_required (or restricting it behind a separate debug flag) to avoid unintended exposure.
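The "separate debug flag" option mentioned above can be sketched with a stdlib-only decorator (in the real app this would wrap the Flask view, or `@auth.login_required` from Flask-HTTPAuth could be used instead). `debug_only` and the `ENABLE_SET_METRICS` variable are illustrative names, not part of the PR.

```python
import os
from functools import wraps

def debug_only(view):
    """Refuse the request unless ENABLE_SET_METRICS=true is set in the env."""
    @wraps(view)
    def wrapper(*args, **kwargs):
        if os.getenv("ENABLE_SET_METRICS", "false").lower() != "true":
            return ("forbidden", 403)  # Flask-style (body, status) tuple
        return view(*args, **kwargs)
    return wrapper

@debug_only
def set_metrics():
    # stand-in for the real handler that mutates the global overrides dict
    return ("ok", 200)
```

Defaulting the flag to off means a plain deployment of the mock cannot have its metrics rewritten by anything that can reach the service.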

Comment on lines +1 to +3
flask
Flask-HTTPAuth
kubernetes
Copilot AI Apr 1, 2026

These dependencies are unpinned, which makes the image build non-reproducible and can cause sudden breakages when upstream releases change behavior (notably kubernetes and Flask-related packages). Pin versions (or at least constrain major versions) to keep ghcr.io/volcano-sh/vllm-mock:latest stable across rebuilds.

Suggested change
flask
Flask-HTTPAuth
kubernetes
flask>=2.0,<3.0
Flask-HTTPAuth>=4.0,<5.0
kubernetes>=28.0.0,<29.0.0

Comment on lines +28 to +33
- name: Build and push vllm-mock
uses: docker/build-push-action@v5
with:
context: hack/mock-images/vllm-mock
push: true
tags: ghcr.io/volcano-sh/vllm-mock:latest
Copilot AI Apr 1, 2026

Only pushing the :latest tag makes it hard to correlate a deployed image back to a specific commit and to roll back safely. Consider also pushing an immutable tag (e.g., :sha-${{ github.sha }} and/or a semver/date tag) alongside :latest so consumers can pin and debugging is easier.

move vllm-mock and sglang-mock source code to hack/mock-images
and add a workflow to build and push them to ghcr.io/volcano-sh.
update all image references from personal registries accordingly

Signed-off-by: Anirudh <2410030013@klh.edu.in>
@anirudh240 force-pushed the feat/move-mock-images-v2 branch from 222a6fd to eb92204 (April 1, 2026 13:46)
Signed-off-by: Anirudh <2410030013@klh.edu.in>
Copilot AI review requested due to automatic review settings April 1, 2026 13:49
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 38 out of 38 changed files in this pull request and generated 10 comments.

Comments suppressed due to low confidence (3)

hack/mock-images/vllm-mock/test_app.py:1

  • This test hard-codes replica = 3, but /metrics falls back to DEFAULT_REPLICAS when it can’t read the Deployment (which will typically be 1 in unit tests), so the expected value will be wrong and the test will fail. Make the test set DEFAULT_REPLICAS=3 (env var) or call /set_metrics to force deterministic values before asserting.
    hack/mock-images/vllm-mock/test_app.py:1
  • This test hard-codes replica = 3, but /metrics falls back to DEFAULT_REPLICAS when it can’t read the Deployment (which will typically be 1 in unit tests), so the expected value will be wrong and the test will fail. Make the test set DEFAULT_REPLICAS=3 (env var) or call /set_metrics to force deterministic values before asserting.
    python/kthena/pyproject.toml:1
  • This introduces pytest discovery configuration in pyproject.toml while a repo-level pytest.ini is also added. Having two places to define testpaths makes test discovery dependent on where pytest is invoked from. Consider consolidating to a single configuration source (preferably root-level for the repo), and ensure it includes any intended tests under hack/mock-images/** (otherwise hack/mock-images/vllm-mock/test_app.py won’t be collected by default).
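The determinism point in the first two suppressed comments can be illustrated with a small helper in the spirit of app.py's replica lookup (names are hypothetical, not the PR's code): when the Deployment lookup fails, the value silently falls back to `DEFAULT_REPLICAS`, so a test that assumes 3 replicas must pin that env var (or use /set_metrics) before asserting.

```python
import os

def compute_replicas(lookup=None) -> int:
    """Return the replica count from a k8s lookup, falling back to the
    DEFAULT_REPLICAS env var (default 1) when no lookup is available or it fails."""
    default = int(os.getenv("DEFAULT_REPLICAS", "1"))
    if lookup is None:
        return default
    try:
        return lookup()
    except Exception:
        # mirrors the mock's behavior of swallowing k8s API errors
        return default

# In a unit test, pin the fallback before exercising /metrics:
os.environ["DEFAULT_REPLICAS"] = "3"
```

Without that pin, a unit-test environment (no cluster, lookup raises) computes metrics with replicas = 1, and an assertion written for `total / 3` fails.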


@@ -0,0 +1,3 @@
{
"huggingface_token": "your huggingface token"
Copilot AI Apr 1, 2026

Committing a config.json meant for sensitive values (even with a placeholder) makes it easy to accidentally add real secrets later and also bakes this file into the built image via COPY . .. Prefer reading the token from an environment variable (or a mounted secret) and commit only a config.example.json (or remove the file entirely from the image build context).

Suggested change
"huggingface_token": "your huggingface token"
"note": "Configure your Hugging Face token via the HUGGINGFACE_TOKEN environment variable; do not store tokens in this file."

Comment on lines +28 to +32

try:
from kubernetes import client, config
except Exception as e:
print(f"Failed to import kubernetes, skip: {e}")
Copilot AI Apr 1, 2026

If the kubernetes import fails, client/config are undefined and /metrics will raise NameError (caught and silently masked), making the behavior dependent on an import side-effect. Consider explicitly setting client = None on import failure and guarding AppsV1Api() usage (or failing fast at startup if k8s support is required).

Suggested change
try:
from kubernetes import client, config
except Exception as e:
print(f"Failed to import kubernetes, skip: {e}")
try:
from kubernetes import client, config
except Exception as e:
print(f"Failed to import kubernetes, skip: {e}")
client = None
config = None

Comment on lines +477 to +487
@app.route('/metrics')
def metrics():
# get deployment information
try:
apps_v1 = client.AppsV1Api()
resp = apps_v1.read_namespaced_deployment(DEPLOYMENT_NAME, NAMESPACE)
replicas = resp.spec.replicas if resp.spec.replicas is not None else 1
except Exception as e:
#print(f"Failed to get deployment information: {DEPLOYMENT_NAME=} {NAMESPACE=} error={str(e)}")
#print(f"Due to the failure, replicas {DEFAULT_REPLICAS} will be used to calculate metrics")
replicas = DEFAULT_REPLICAS
Copilot AI Apr 1, 2026

If the kubernetes import fails, client/config are undefined and /metrics will raise NameError (caught and silently masked), making the behavior dependent on an import side-effect. Consider explicitly setting client = None on import failure and guarding AppsV1Api() usage (or failing fast at startup if k8s support is required).

Comment on lines +352 to +356
@app.route('/set_metrics', methods=['POST'])
def set_metrics():
global overrides
# Get JSON data from the request
data = request.json
Copilot AI Apr 1, 2026

The /set_metrics endpoint allows unauthenticated mutation of global metrics state. Even for a mock server, this can be abused if the service is reachable in-cluster or externally. Consider requiring @auth.login_required (consistent with the other endpoints) or gating this endpoint behind an explicit env flag (disabled by default).

Comment on lines +18 to +23
from flask import Response
from random import randint
import os

try:
from flask import Flask
Copilot AI Apr 1, 2026

The fallback try/except won’t work as intended because from flask import Response will already raise ImportError before reaching the guarded import. Move Flask/Response imports into the same guarded block (or remove the redundant check entirely).

Suggested change
from flask import Response
from random import randint
import os
try:
from flask import Flask
from random import randint
import os
try:
from flask import Flask, Response

Comment on lines +18 to +25
from flask import Response
from random import randint
import os

try:
from flask import Flask
except ImportError:
raise ImportError("flask is required. Run: pip install flask")
Copilot AI Apr 1, 2026

The fallback try/except won’t work as intended because from flask import Response will already raise ImportError before reaching the guarded import. Move Flask/Response imports into the same guarded block (or remove the redundant check entirely).

Suggested change
from flask import Response
from random import randint
import os
try:
from flask import Flask
except ImportError:
raise ImportError("flask is required. Run: pip install flask")
from random import randint
import os
try:
from flask import Flask, Response
except ImportError as e:
raise ImportError("flask is required. Run: pip install flask") from e

with:
context: hack/mock-images/sglang-mock
push: true
tags: ghcr.io/volcano-sh/sglang-mock:latest
Copilot AI Apr 1, 2026

Publishing only the :latest tag makes rollbacks and provenance difficult (you can’t reliably map a running image back to a commit). Consider also tagging with an immutable identifier (e.g., :${{ github.sha }}) and optionally keeping :latest as a moving alias.

#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/kthena/tree/main/hack/mock-images/vllm-mock`.
# Move the image to kthena registry once it's public.
Copilot AI Apr 1, 2026

The note on line 6 is now outdated/confusing since the image is already in the volcano-sh GHCR namespace and the PR adds an official build/push workflow. Update or remove the comment to reflect the current publishing location/source of truth.

Suggested change
# Move the image to kthena registry once it's public.
# The image is published in the `volcano-sh` GHCR namespace by the official kthena build/publish workflow.

"deepseek-r1-7b": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
}

# Polifill the necessary arguments.
Copilot AI Apr 1, 2026

Correct the typo 'Polifill' to 'Polyfill'.

Suggested change
# Polifill the necessary arguments.
# Polyfill the necessary arguments.


def read_configs(file_path):
"""
Reads a JSON file that store sensitive information.
Copilot AI Apr 1, 2026

Fix grammar in the docstring: 'store' -> 'stores'.

Suggested change
Reads a JSON file that store sensitive information.
Reads a JSON file that stores sensitive information.

Signed-off-by: Anirudh <2410030013@klh.edu.in>

Successfully merging this pull request may close these issues.

Move vLLM mock and sglang mock in kthena or volcano's repo

6 participants