
Sync vLLM support from Examples repo k8s manifests to Helm charts #608

@eero-t

Description


Several k8s application manifests and Docker Compose files in the Examples repo support vLLM:

GenAIExamples$ find -iname '*vllm*.yaml'
./EdgeCraftRAG/docker_compose/intel/gpu/arc/compose_vllm.yaml
./ChatQnA/docker_compose/intel/cpu/xeon/compose_vllm.yaml
./ChatQnA/docker_compose/intel/hpu/gaudi/compose_vllm.yaml
./ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna-vllm-remote-inference.yaml
./ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna-vllm.yaml
./WorkflowExecAgent/docker_compose/intel/cpu/xeon/compose_vllm.yaml

The k8s manifests specify the vLLM options in a configMap:
https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna-vllm.yaml#L178
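
For context, such a configMap carries the vLLM settings as environment variables for the serving container, roughly like the sketch below (the metadata name and the data keys/values here are illustrative placeholders, not copied from the linked file):

apiVersion: v1
kind: ConfigMap
metadata:
  name: chatqna-vllm-config   # hypothetical name, for illustration only
data:
  # vLLM options exposed to the service as environment variables;
  # see the linked manifest for the actual entries
  LLM_MODEL_ID: "meta-llama/Meta-Llama-3-8B-Instruct"
  VLLM_SKIP_WARMUP: "true"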

However, this vLLM support is missing from the application Helm charts here, and the vLLM options are missing from the vLLM chart's configMap template:
https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/common/vllm/templates/configmap.yaml
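
One way to close that gap would be to render such options from chart values in the configMap template, along these lines (a minimal sketch; the vllmOptions values key is an assumed name, not something the chart defines today, and the fullname helper is assumed to match the chart's existing one):

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "vllm.fullname" . }}-config
data:
  # each entry under the (hypothetical) vllmOptions value becomes one configMap key
{{- range $key, $value := .Values.vllmOptions }}
  {{ $key }}: {{ $value | quote }}
{{- end }}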

Some of those options are specified with the extraCmdArgs chart value in a couple of other charts:

The latter uses those args only in the CI values file.
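
For comparison, the extraCmdArgs route passes options on the vLLM command line instead of through the configMap, via a values file entry like this (the flags and values are illustrative):

# illustrative values file entry; these args are appended to the vLLM server command line
extraCmdArgs: ["--block-size", "128", "--max-num-seqs", "256"]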


=> I think:

  • vLLM options should be fixed so that they are specified only in the configMap
  • vLLM support should be added to the app Helm charts, for the same apps as in the Examples repo
  • Examples repo manifests should be generated from the Helm charts (see the sketch after this list)
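
For the last point, generation could be a matter of running helm template against each app chart with the matching values file and committing the output to the Examples repo, e.g. (the chart path and values file name below are hypothetical):

GenAIInfra$ helm template chatqna helm-charts/chatqna \
    -f helm-charts/chatqna/gaudi-vllm-values.yaml \
    > ../GenAIExamples/ChatQnA/kubernetes/intel/hpu/gaudi/manifest/chatqna-vllm.yaml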
