Several k8s app manifests and docker compose files in the Examples repo support vLLM:
GenAIExamples$ find -iname '*vllm*.yaml'
./EdgeCraftRAG/docker_compose/intel/gpu/arc/compose_vllm.yaml
./ChatQnA/docker_compose/intel/cpu/xeon/compose_vllm.yaml
./ChatQnA/docker_compose/intel/hpu/gaudi/compose_vllm.yaml
./ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna-vllm-remote-inference.yaml
./ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna-vllm.yaml
./WorkflowExecAgent/docker_compose/intel/cpu/xeon/compose_vllm.yaml
And the k8s ones specify vLLM options in a configMap:
https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna-vllm.yaml#L178
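For reference, the Examples manifests pass vLLM options through a ConfigMap along these lines (an illustrative sketch only; the key names and values below are placeholders, not the exact contents of chatqna-vllm.yaml):

```yaml
# Sketch of a vLLM ConfigMap as used in the Examples manifests (placeholder keys/values)
apiVersion: v1
kind: ConfigMap
metadata:
  name: chatqna-vllm-config
data:
  # Model served by the vLLM container
  LLM_MODEL_ID: "meta-llama/Meta-Llama-3-8B-Instruct"
  # Example vLLM runtime tuning options
  MAX_MODEL_LEN: "4096"
  BLOCK_SIZE: "128"
```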
However, such vLLM support is missing from the application Helm charts here (GenAIInfra), and vLLM options are missing from the vLLM configMap in the Helm charts:
https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/common/vllm/templates/configmap.yaml
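One way to close that gap would be to render the options from chart values in the vLLM ConfigMap template, roughly along these lines (a sketch only; the `vllmOptions` value and the `vllm.fullname` helper are assumptions, not the chart's current schema):

```yaml
# Sketch: helm-charts/common/vllm/templates/configmap.yaml with templated vLLM options
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "vllm.fullname" . }}-config
data:
  LLM_MODEL_ID: {{ .Values.LLM_MODEL_ID | quote }}
  # Render any additional vLLM options supplied via values.yaml (hypothetical value name)
  {{- range $key, $value := .Values.vllmOptions }}
  {{ $key }}: {{ $value | quote }}
  {{- end }}
```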
Some of those options are specified with the ExtraCmdArgs chart value in a couple of other charts:
- https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/common/agent/values.yaml#L17
- https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/common/llm-uservice/ci-vllm-gaudi-values.yaml
The latter uses those args only in the CI values file.
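For comparison, the ExtraCmdArgs route looks roughly like this in a values file (the nesting and flags below are illustrative, not the actual contents of those charts):

```yaml
# Sketch of passing vLLM CLI flags via a chart value (illustrative flags)
vllm:
  extraCmdArgs:
    - "--max-model-len"
    - "4096"
    - "--enforce-eager"
```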
=> I think:
- vLLM options should be fixed to be specified only in the configMap
- vLLM support should be added to the app Helm charts, for the same apps as used in the Examples repo
- Examples repo manifests should be generated from the Helm charts (see the sketch below)
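The last point could be automated with `helm template`, e.g. as a CI step (a hypothetical sketch; chart path, values file, and output path are assumptions):

```yaml
# Hypothetical workflow step that regenerates an Examples manifest from the Helm chart
- name: Render ChatQnA vLLM manifest from Helm chart
  run: |
    helm template chatqna helm-charts/chatqna \
      -f helm-charts/chatqna/gaudi-vllm-values.yaml \
      > GenAIExamples/ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna-vllm.yaml
```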