
feat: add submodule of worker-vllm, updated fastapi endpoints #6

Open
velaraptor-runpod wants to merge 11 commits into main from feat/update-vllm

Conversation

@velaraptor-runpod

No description provided.


@TimPietruskyRunPod TimPietruskyRunPod left a comment


A few items to address before merge — mostly small fixes. The core rewrite using vLLM's native serving classes is solid.

handler_lb.py Outdated
):
from vllm.entrypoints.openai.protocol import ResponsesResponse
from vllm.entrypoints.openai.engine.protocol import ErrorResponse


retrieve_responses and cancel_responses import ResponsesResponse from vllm.entrypoints.openai.protocol, but create_responses imports it from vllm.entrypoints.openai.responses.protocol. These should be consistent — which is the correct path in vLLM 0.16.0?
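One hedged way to keep the three handlers on a single path is to resolve the protocol module once at module level and reuse it everywhere. This is only a sketch: the two candidate paths below are the ones that appear in this diff, and which of them actually exists in vLLM 0.16.0 is exactly the open question of this comment.

```python
import importlib


def resolve_protocol_module(candidate_paths):
    """Return the first importable module from candidate_paths, else None.

    Resolving the module once and reusing it in create_responses,
    retrieve_responses, and cancel_responses means the three handlers
    cannot drift onto different import paths again.
    """
    for path in candidate_paths:
        try:
            return importlib.import_module(path)
        except ImportError:
            continue
    return None


# The two candidate paths seen in this diff (preference order is a guess):
# protocol = resolve_protocol_module([
#     "vllm.entrypoints.openai.protocol",
#     "vllm.entrypoints.openai.responses.protocol",
# ])
# ResponsesResponse = protocol.ResponsesResponse
```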


if not body.get("stream"):
return JSONResponse(response.model_dump())


The chat and completion handlers check body.get("stream") on the raw dict to decide streaming vs non-streaming. The responses and messages handlers instead check the response type (isinstance). Consider making these consistent — checking the response type is more robust since it follows what vLLM actually returned.
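A minimal sketch of the isinstance-based dispatch being suggested. The ChatCompletionResponse class here is a hypothetical stand-in for vLLM's actual response types, and the returned strings stand in for the handler's two code paths:

```python
from dataclasses import dataclass


@dataclass
class ChatCompletionResponse:
    # Hypothetical placeholder for vLLM's non-streaming response class.
    id: str


def dispatch(response):
    """Branch on what vLLM actually returned, not on the raw request dict.

    A concrete response object means non-streaming; anything else (an
    async generator in the real handler) is treated as a stream.
    """
    if isinstance(response, ChatCompletionResponse):
        return "json"
    return "stream"
```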

@@ -0,0 +1,64 @@
{

This file is named tests_json — should it be tests.json?

curl -X POST "https://your-endpoint-id.api.runpod.ai/v1/completions" \
-H "Authorization: Bearer YOUR_RUNPOD_API_KEY" \
curl -X POST "https://<endpoint-id>.api.runpod.ai/v1/chat/completions" \
-H "Authorization: Bearer $RUNPOD_API_KEY" \

Is this a real endpoint ID? If so it should probably be replaced with a placeholder like <endpoint-id> to match the examples above. If there's a reason to keep it (e.g. a public demo endpoint), happy to leave it.

### Core (from worker-vllm)

| Variable | Required | Description | Default |
|----------|----------|-------------|---------|

Typo: `RunP[d` should be `RunPod`.


| Path | Method | Description |
|------|--------|-------------|

MAX_CONCURRENCY default here is 300, but hub.json sets it to 10 with a note about keeping it lower for load balancing. These should be consistent.
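One way to avoid this kind of drift is to read the value in exactly one place with one shared default, and let the docs cite that function. A sketch only: the default of 10 follows hub.json's load-balancing note, and the variable name is taken from the docs under review.

```python
import os


def get_max_concurrency(env=None):
    """Read MAX_CONCURRENCY once, with a single shared default.

    Keeping the default here, and documenting it from this function,
    prevents the README and hub.json from disagreeing about the value.
    """
    env = os.environ if env is None else env
    return int(env.get("MAX_CONCURRENCY", "10"))
```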
