feat: move mock images to official repo #852
anirudh240 wants to merge 5 commits into volcano-sh:main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. Needs approval from an approver in each of the affected files; approvers can indicate their approval by writing the appropriate command in a comment. The full list of commands accepted by this bot can be found here.
Code Review
This pull request migrates the vllm-mock image to the volcano-sh registry across documentation, examples, and E2E tests. It also introduces new mock server implementations for SGLang and vLLM to support runtime verification and Prometheus metrics simulation. The review feedback identifies a missing /health endpoint required for Kubernetes readiness probes, a copy-paste error in the DEPLOYMENT_NAME environment variable initialization, and incorrect assertions in the test suite that do not match the mock server's output.
The mock server is missing a /health endpoint. This is required because the readiness probes in the deployment templates (e.g., pkg/model-booster-controller/convert/templates/vllm.yaml) expect a /health endpoint on port 8000. Without this, the pods will fail to become ready.
```python
@app.route('/health')
def health():
    return "ok", 200
```
hack/mock-images/vllm-mock/app.py (outdated)

```python
overrides = {}

MODEL_NAME = os.getenv('MODEL_NAME', 'deepseek-r1-1-5b')
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
```
The DEPLOYMENT_NAME variable is incorrectly initialized using the MODEL_NAME environment variable. This appears to be a copy-paste error and will cause the mock to query the wrong deployment for replica counts if MODEL_NAME is set.
```diff
-DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
+DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME', 'deepseek-r1')
```
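A quick sketch of why this matters (the function name is illustrative; the dict stands in for real environment variables read via `os.getenv`):

```python
def resolve_deployment_name(env):
    """Mirror the buggy vs. intended initialization of DEPLOYMENT_NAME."""
    buggy = env.get('MODEL_NAME', 'deepseek-r1')        # copy-paste error
    fixed = env.get('DEPLOYMENT_NAME', 'deepseek-r1')   # intended lookup
    return buggy, fixed

# With only MODEL_NAME set, the buggy line silently targets the wrong Deployment:
buggy, fixed = resolve_deployment_name({'MODEL_NAME': 'llama2-70b'})
```

With the fix, the deployment name only changes when `DEPLOYMENT_NAME` itself is set.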
```python
# the metrics and results in lots of meaningless requests that we do not want to log.
def disable_endpoint_logs():
    """Disable logs for requests to specific endpoints."""
    disabled_endpoints = ('/', '/healthz', '/metrics', '/v1/models')
```
Add /health to the list of disabled endpoints to suppress logs for health checks, ensuring consistency with the endpoint used by Kubernetes probes.
```diff
-disabled_endpoints = ('/', '/healthz', '/metrics', '/v1/models')
+disabled_endpoints = ('/', '/health', '/healthz', '/metrics', '/v1/models')
```
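One common way such suppression is implemented is a `logging.Filter` attached to the access logger (a sketch; `werkzeug` is Flask's default access logger name, and the endpoint tuple matches the suggestion above):

```python
import logging

DISABLED_ENDPOINTS = ('/', '/health', '/healthz', '/metrics', '/v1/models')

class EndpointFilter(logging.Filter):
    """Drop access-log records for probe and scrape endpoints."""
    def filter(self, record):
        message = record.getMessage()
        # Keep the record only if it does not hit a disabled endpoint.
        return not any(f'"GET {ep} ' in message for ep in DISABLED_ENDPOINTS)

# Attach to werkzeug's access logger so health checks stop spamming logs.
logging.getLogger('werkzeug').addFilter(EndpointFilter())
```

The trailing space in `"GET {ep} ` keeps `/` from also matching `/v1/...` paths.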
```python
        self.assertIn(
            f'vllm:request_success_total{{finished_reason="stop",model_name="llama2-70b"}} {expected_total / replica}',
            data)
        self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}',
                      data)
        self.assertIn(
            f'vllm:avg_generation_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}', data)
```
The test assertions are incorrect. They expect labels (finished_reason) and a model name (llama2-70b) that are not produced by the current implementation in app.py. Furthermore, the expected values do not match the default behavior where replicas defaults to 1, resulting in a value of 100.0 instead of 33.33.
```diff
-        self.assertIn(
-            f'vllm:request_success_total{{finished_reason="stop",model_name="llama2-70b"}} {expected_total / replica}',
-            data)
-        self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}',
-                      data)
-        self.assertIn(
-            f'vllm:avg_generation_throughput_toks_per_s{{model_name="llama2-70b"}} {expected_total / replica}', data)
+        # assert metric value
+        self.assertIn(
+            f'vllm:request_success_total{{model_name="deepseek-r1-1-5b"}} 100.0',
+            data)
+        self.assertIn(f'vllm:avg_prompt_throughput_toks_per_s{{model_name="deepseek-r1-1-5b"}} 100.0',
+                      data)
+        self.assertIn(
+            f'vllm:avg_generation_throughput_toks_per_s{{model_name="deepseek-r1-1-5b"}} 100.0', data)
```
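The corrected values follow from the mock's default scaling, where each value is `total / replicas` and `replicas` defaults to 1. A minimal sketch (function names are illustrative, not taken from the mock):

```python
def metric_value(total=100.0, replicas=1):
    """Default path in the mock: the counter is split across replicas."""
    return total / replicas

def exposition_line(metric, model_name, value):
    # Prometheus exposition line in the shape the assertions expect.
    return f'{metric}{{model_name="{model_name}"}} {value}'

line = exposition_line('vllm:request_success_total', 'deepseek-r1-1-5b', metric_value())
```

With the defaults (`total=100.0`, `replicas=1`) the rendered value is `100.0`, not `33.33`.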
Pull request overview
This PR moves vLLM and SGLang mock image source code from personal registries to the official volcano-sh repository, enabling centralized maintenance and easier future updates. The mock images are used for end-to-end testing without deploying actual LLM servers.
Changes:
- Added vllm-mock source code (`hack/mock-images/vllm-mock/`) with Python Flask application, simulator, tests, and Dockerfile
- Added sglang-mock source code (`hack/mock-images/sglang-mock/`) with Python Flask application and Dockerfile
- Updated all image references across YAML test files and examples from personal registries to `ghcr.io/volcano-sh/vllm-mock:latest` and `ghcr.io/volcano-sh/sglang-mock:latest`
- Added GitHub Actions workflow to automatically build and push mock images to GHCR
Reviewed changes
Copilot reviewed 36 out of 36 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| hack/mock-images/vllm-mock/ | Complete vLLM mock server implementation with Flask app, simulator, tests, and Docker configuration |
| hack/mock-images/sglang-mock/ | SGLang mock server implementation with Flask app and Docker configuration |
| .github/workflows/build-mock-images.yml | Automated workflow to build and push images to ghcr.io/volcano-sh |
| test/e2e/router/testdata/*.yaml | Updated all vLLM mock image references to official registry |
| examples/kthena-router/*.yaml | Updated all vLLM mock image references to official registry |
| test/e2e/controller-manager/model_booster_test.go | Updated mock image references in test fixtures |
| docs/kthena/**/*.yaml | Updated mock image references across all documentation versions |
```yaml
branches:
  - main
paths:
  - kthena/hack/mock-images/**
```
The workflow file references kthena/hack/mock-images/ in both the paths trigger and the context fields, but the actual mock image files are located at hack/mock-images/, not under a kthena/ subdirectory. This mismatch will cause the workflow trigger to never fire when mock image files are changed, and the build context paths will be incorrect. The paths should be hack/mock-images/** for both the trigger and the build contexts should be hack/mock-images/vllm-mock and hack/mock-images/sglang-mock.
hack/mock-images/vllm-mock/app.py (outdated)

```python
overrides = {}

MODEL_NAME = os.getenv('MODEL_NAME', 'deepseek-r1-1-5b')
DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
```
The variable DEPLOYMENT_NAME is being set using the MODEL_NAME environment variable instead of DEPLOYMENT_NAME. This is inconsistent and will cause incorrect deployment name lookups. It should use os.getenv('DEPLOYMENT_NAME', 'deepseek-r1') to get the correct deployment name.
```diff
-DEPLOYMENT_NAME = os.getenv('MODEL_NAME', 'deepseek-r1')
+DEPLOYMENT_NAME = os.getenv('DEPLOYMENT_NAME', 'deepseek-r1')
```
hack/mock-images/vllm-mock/app.py (outdated)

```python
{
    "name": "request_prompt_tokens",
    "type": "histogram",
    "description": "Histogram of number of prefill tokens processed..",
```
Typo: Double period at end of metric description. Should be a single period.
hack/mock-images/vllm-mock/app.py (outdated)

```python
{
    "name": "request_generation_tokens",
    "type": "histogram",
    "description": "Histogram of number of generation tokens processed..",
```
Typo: Double period at end of metric description. Should be a single period.
examples/kthena-router/LLM-Mock.yaml (outdated)

```yaml
# You can use this mock server to test the inference router without deploying a real LLM server.
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
```
The path in this comment is incorrect. The vllm-mock source code is located at hack/mock-images/vllm-mock, not kthena/hack/mock-images/vllm-mock. The GitHub URL reference should be https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock without the kthena/ prefix.
```diff
-# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/kthena/hack/mock-images/vllm-mock`.
+# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/volcano/tree/main/hack/mock-images/vllm-mock`.
```
(The same incorrect-path comment and suggested fix were repeated on the other example YAML files.)
```yaml
- name: Build and push vllm-mock
  uses: docker/build-push-action@v5
  with:
    context: kthena/hack/mock-images/vllm-mock
```
The build context path is incorrect. It should be hack/mock-images/vllm-mock instead of kthena/hack/mock-images/vllm-mock, as the actual files are located at hack/mock-images/, not under a kthena/ subdirectory.
```yaml
- name: Build and push sglang-mock
  uses: docker/build-push-action@v5
  with:
    context: kthena/hack/mock-images/sglang-mock
```
The build context path is incorrect. It should be hack/mock-images/sglang-mock instead of kthena/hack/mock-images/sglang-mock, as the actual files are located at hack/mock-images/, not under a kthena/ subdirectory.
/hold

I think this idea is good, but the image should be multi-arch. And since the mock originated from vLLM/SGLang, it should keep their copyright.

yes, i kinda messed up anyway

also aibrix's copyright i think
```diff
@@ -0,0 +1,36 @@
+import threading
```
We should support specifying model name through env, ref: https://github.com/volcano-sh/kthena/blob/main/examples/kthena-router/LLM-Mock-ds1.5b.yaml#L26
And it should also be reflected in the exposure metrics.
Also make sure all tests and checks pass first.
Pull request overview
Copilot reviewed 38 out of 38 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (5)
pytest.ini:1
- With `testpaths = python`, the newly added tests under `hack/mock-images/**` (e.g., `hack/mock-images/vllm-mock/test_app.py`) will not be collected by default, so the mock server behavior isn't actually covered in CI. If these tests are intended to run in this repo, include `hack/mock-images` in `testpaths` (or remove this restriction and rely on standard discovery) so they are executed.

python/kthena/pyproject.toml:1
- Pytest configuration is now duplicated in two places (`pytest.ini` and `[tool.pytest.ini_options]`), which can drift and confuse tooling (especially when options differ later). Prefer keeping pytest configuration in a single location (either root `pytest.ini` or `pyproject.toml`) to avoid split-brain configuration.

hack/mock-images/vllm-mock/test_app.py:1
- This import depends on the current working directory / `PYTHONPATH` being set so that `app.py` is importable as a top-level module. When running tests from the repo root (typical CI), this will likely fail because `hack/mock-images/vllm-mock` is not on `sys.path` and the directory name `mock-images` cannot be imported as a package. Consider adjusting the test to add the local directory to `sys.path` (or restructure the mock image code into an importable module) so the test can run reliably.

hack/mock-images/vllm-mock/test_app.py:1
- The test is brittle because it asserts exact float string formatting from `/metrics` (which can vary, e.g., `33.333333333333336` vs `33.33333333333333`) and it uses a raw `print` that will spam CI logs on failures. Prefer parsing the Prometheus exposition format (or using regex with a numeric tolerance) and avoid unconditional prints (use assertions with helpful messages or logging only on failure).

hack/mock-images/vllm-mock/test_app.py:1
- The test is brittle because it asserts exact float string formatting from `/metrics` (which can vary, e.g., `33.333333333333336` vs `33.33333333333333`) and it uses a raw `print` that will spam CI logs on failures. Prefer parsing the Prometheus exposition format (or using regex with a numeric tolerance) and avoid unconditional prints (use assertions with helpful messages or logging only on failure).
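A tolerance-based check along those lines might look like this (a sketch; the regex handles only simple single-line samples, not the full exposition syntax):

```python
import math
import re

def parse_metrics(text):
    """Parse simple Prometheus exposition lines into {(name, labels): value}."""
    samples = {}
    for line in text.splitlines():
        m = re.match(r'([A-Za-z_:][\w:]*)(\{[^}]*\})?\s+([-+.\deE]+)$', line.strip())
        if m:
            samples[(m.group(1), m.group(2) or '')] = float(m.group(3))
    return samples

body = 'vllm:avg_prompt_throughput_toks_per_s{model_name="deepseek-r1-1-5b"} 33.333333333333336'
value = parse_metrics(body)[('vllm:avg_prompt_throughput_toks_per_s',
                             '{model_name="deepseek-r1-1-5b"}')]
# Compare numerically instead of matching the exact float string.
assert math.isclose(value, 100 / 3, rel_tol=1e-9)
```

This way the assertion survives differences in float formatting between Python versions or metric implementations.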
```python
    total = overrides.get("total", 100.0)
    model_name = overrides.get("model_name", MODEL_NAME)
    # calculate metrics with potential overrides
    success_total = overrides.get("success_total", total / replicas)
```
success_total can raise a ZeroDivisionError when the Deployment is scaled to replicas = 0 (a valid state in Kubernetes). Align it with the throughput metrics by guarding replicas > 0 (or defaulting to 1) so /metrics remains scrapeable during scale-to-zero.
```diff
-success_total = overrides.get("success_total", total / replicas)
+success_total = overrides.get("success_total", total / replicas if replicas > 0 else 0)
```
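A scale-to-zero-safe variant of that computation could be sketched as follows (the function name is illustrative; it mirrors the override pattern above):

```python
def success_total(overrides, total, replicas):
    """Compute the metric, staying scrapeable when replicas == 0."""
    if "success_total" in overrides:
        return overrides["success_total"]
    return total / replicas if replicas > 0 else 0.0

# replicas=0 (scale-to-zero) no longer raises ZeroDivisionError:
values = [success_total({}, 100.0, r) for r in (0, 1, 4)]
```

An explicit override still wins, so `/set_metrics` behavior is unchanged.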
```python
try:
    from kubernetes import client, config
except Exception as e:
    print(f"Failed to import kubernetes, skip: {e}")
```
If the kubernetes import fails, client/config are not defined, but /metrics later unconditionally references client.AppsV1Api() and config.load_incluster_config(), which will crash with NameError. Either (a) make the import failure fatal in this mock image, or (b) set client = None/config = None and branch in /metrics and __main__ to skip k8s calls when unavailable.
```diff
-print(f"Failed to import kubernetes, skip: {e}")
+print(f"Failed to import kubernetes, exiting: {e}", file=sys.stderr)
+sys.exit(1)
```
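Option (b) from the comment, binding the names to `None` and branching later, could be sketched like this (the Deployment/namespace arguments are placeholders; the fallback path is the point):

```python
try:
    from kubernetes import client, config  # optional dependency in unit tests
except Exception as e:
    print(f"kubernetes client unavailable, falling back: {e}")
    client = None
    config = None

def get_replicas(deployment, namespace, default_replicas=1):
    """Read spec.replicas from the Deployment, or fall back gracefully."""
    if client is None:  # import failed: skip k8s calls entirely
        return default_replicas
    try:
        resp = client.AppsV1Api().read_namespaced_deployment(deployment, namespace)
        return resp.spec.replicas if resp.spec.replicas is not None else 1
    except Exception:
        return default_replicas
```

Either way `/metrics` keeps serving with `default_replicas` instead of crashing with a `NameError`.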
```python
@app.route('/set_metrics', methods=['POST'])
def set_metrics():
    global overrides
```
/set_metrics allows arbitrary metric overrides without authentication, which enables anyone who can reach the service to manipulate metrics and potentially break e2e assertions/scrapes. If the mock server supports an --api_key, consider protecting this endpoint with @auth.login_required (or restricting it behind a separate debug flag) to avoid unintended exposure.
```
flask
Flask-HTTPAuth
kubernetes
```
These dependencies are unpinned, which makes the image build non-reproducible and can cause sudden breakages when upstream releases change behavior (notably kubernetes and Flask-related packages). Pin versions (or at least constrain major versions) to keep ghcr.io/volcano-sh/vllm-mock:latest stable across rebuilds.
```diff
-flask
-Flask-HTTPAuth
-kubernetes
+flask>=2.0,<3.0
+Flask-HTTPAuth>=4.0,<5.0
+kubernetes>=28.0.0,<29.0.0
```
```yaml
- name: Build and push vllm-mock
  uses: docker/build-push-action@v5
  with:
    context: hack/mock-images/vllm-mock
    push: true
    tags: ghcr.io/volcano-sh/vllm-mock:latest
```
Only pushing the :latest tag makes it hard to correlate a deployed image back to a specific commit and to roll back safely. Consider also pushing an immutable tag (e.g., :sha-${{ github.sha }} and/or a semver/date tag) alongside :latest so consumers can pin and debugging is easier.
move vllm-mock and sglang-mock source code to hack/mock-images and add a workflow to build and push them to ghcr.io/volcano-sh. update all image references from personal registries accordingly

Signed-off-by: Anirudh <2410030013@klh.edu.in>
Force-pushed from 222a6fd to eb92204
Pull request overview
Copilot reviewed 38 out of 38 changed files in this pull request and generated 10 comments.
Comments suppressed due to low confidence (3)
hack/mock-images/vllm-mock/test_app.py:1
- This test hard-codes `replica = 3`, but `/metrics` falls back to `DEFAULT_REPLICAS` when it can't read the Deployment (which will typically be 1 in unit tests), so the expected value will be wrong and the test will fail. Make the test set `DEFAULT_REPLICAS=3` (env var) or call `/set_metrics` to force deterministic values before asserting.

hack/mock-images/vllm-mock/test_app.py:1
- This test hard-codes `replica = 3`, but `/metrics` falls back to `DEFAULT_REPLICAS` when it can't read the Deployment (which will typically be 1 in unit tests), so the expected value will be wrong and the test will fail. Make the test set `DEFAULT_REPLICAS=3` (env var) or call `/set_metrics` to force deterministic values before asserting.

python/kthena/pyproject.toml:1
- This introduces pytest discovery configuration in `pyproject.toml` while a repo-level `pytest.ini` is also added. Having two places to define `testpaths` makes test discovery dependent on where pytest is invoked from. Consider consolidating to a single configuration source (preferably root-level for the repo), and ensure it includes any intended tests under `hack/mock-images/**` (otherwise `hack/mock-images/vllm-mock/test_app.py` won't be collected by default).
```diff
@@ -0,0 +1,3 @@
+{
+    "huggingface_token": "your huggingface token"
```
Committing a config.json meant for sensitive values (even with a placeholder) makes it easy to accidentally add real secrets later and also bakes this file into the built image via COPY . .. Prefer reading the token from an environment variable (or a mounted secret) and commit only a config.example.json (or remove the file entirely from the image build context).
| "huggingface_token": "your huggingface token" | |
| "note": "Configure your Hugging Face token via the HUGGINGFACE_TOKEN environment variable; do not store tokens in this file." |
```python
try:
    from kubernetes import client, config
except Exception as e:
    print(f"Failed to import kubernetes, skip: {e}")
```
If the kubernetes import fails, client/config are undefined and /metrics will raise NameError (caught and silently masked), making the behavior dependent on an import side-effect. Consider explicitly setting client = None on import failure and guarding AppsV1Api() usage (or failing fast at startup if k8s support is required).
```diff
 try:
     from kubernetes import client, config
 except Exception as e:
     print(f"Failed to import kubernetes, skip: {e}")
+    client = None
+    config = None
```
```python
@app.route('/metrics')
def metrics():
    # get deployment information
    try:
        apps_v1 = client.AppsV1Api()
        resp = apps_v1.read_namespaced_deployment(DEPLOYMENT_NAME, NAMESPACE)
        replicas = resp.spec.replicas if resp.spec.replicas is not None else 1
    except Exception as e:
        #print(f"Failed to get deployment information: {DEPLOYMENT_NAME=} {NAMESPACE=} error={str(e)}")
        #print(f"Due to the failure, replicas {DEFAULT_REPLICAS} will be used to calculate metrics")
        replicas = DEFAULT_REPLICAS
```
If the kubernetes import fails, client/config are undefined and /metrics will raise NameError (caught and silently masked), making the behavior dependent on an import side-effect. Consider explicitly setting client = None on import failure and guarding AppsV1Api() usage (or failing fast at startup if k8s support is required).
```python
@app.route('/set_metrics', methods=['POST'])
def set_metrics():
    global overrides
    # Get JSON data from the request
    data = request.json
```
The /set_metrics endpoint allows unauthenticated mutation of global metrics state. Even for a mock server, this can be abused if the service is reachable in-cluster or externally. Consider requiring @auth.login_required (consistent with the other endpoints) or gating this endpoint behind an explicit env flag (disabled by default).
```python
from flask import Response
from random import randint
import os

try:
    from flask import Flask
```
The fallback try/except won’t work as intended because from flask import Response will already raise ImportError before reaching the guarded import. Move Flask/Response imports into the same guarded block (or remove the redundant check entirely).
```diff
-from flask import Response
-from random import randint
-import os
-try:
-    from flask import Flask
+from random import randint
+import os
+try:
+    from flask import Flask, Response
```
```python
from flask import Response
from random import randint
import os

try:
    from flask import Flask
except ImportError:
    raise ImportError("flask is required. Run: pip install flask")
```
The fallback try/except won’t work as intended because from flask import Response will already raise ImportError before reaching the guarded import. Move Flask/Response imports into the same guarded block (or remove the redundant check entirely).
```diff
-from flask import Response
-from random import randint
-import os
-try:
-    from flask import Flask
-except ImportError:
-    raise ImportError("flask is required. Run: pip install flask")
+from random import randint
+import os
+try:
+    from flask import Flask, Response
+except ImportError as e:
+    raise ImportError("flask is required. Run: pip install flask") from e
```
```yaml
with:
  context: hack/mock-images/sglang-mock
  push: true
  tags: ghcr.io/volcano-sh/sglang-mock:latest
```
Publishing only the :latest tag makes rollbacks and provenance difficult (you can’t reliably map a running image back to a commit). Consider also tagging with an immutable identifier (e.g., :${{ github.sha }}) and optionally keeping :latest as a moving alias.
```yaml
#
# NOTE: `ghcr.io/yaozengzeng/vllm-mock:latest` is built based on `https://github.com/YaoZengzeng/aibrix/tree/vllm-mock`.
# NOTE: `ghcr.io/volcano-sh/vllm-mock:latest` is built based on `https://github.com/volcano-sh/kthena/tree/main/hack/mock-images/vllm-mock`.
# Move the image to kthena registry once it's public.
```
The note on line 6 is now outdated/confusing since the image is already in the volcano-sh GHCR namespace and the PR adds an official build/push workflow. Update or remove the comment to reflect the current publishing location/source of truth.
```diff
-# Move the image to kthena registry once it's public.
+# The image is published in the `volcano-sh` GHCR namespace by the official kthena build/publish workflow.
```
hack/mock-images/vllm-mock/app.py (outdated)

```python
    "deepseek-r1-7b": "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
}

# Polifill the necessary arguments.
```
Correct the typo 'Polifill' to 'Polyfill'.
```diff
-# Polifill the necessary arguments.
+# Polyfill the necessary arguments.
```
hack/mock-images/vllm-mock/app.py (outdated)

```python
def read_configs(file_path):
    """
    Reads a JSON file that store sensitive information.
```
Fix grammar in the docstring: 'store' -> 'stores'.
```diff
-Reads a JSON file that store sensitive information.
+Reads a JSON file that stores sensitive information.
```
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Move vllm-mock and sglang-mock source code to hack/mock-images and add a workflow to build and push them to ghcr.io/volcano-sh. Update all image references from personal registries accordingly.
Which issue(s) this PR fixes:
Fixes #849
Special notes for your reviewer:
The vllm-mock source was extracted directly from the existing ghcr.io/yaozengzeng/vllm-mock:latest image since the source repo was private. The behavior is unchanged.
Does this PR introduce a user-facing change?: