You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+15Lines changed: 15 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -80,6 +80,7 @@ To build an image with the model baked in, you must specify the following docker
80
80
-`WORKER_CUDA_VERSION`: `12.1.0` (`12.1.0` is recommended for optimal performance).
81
81
-`TOKENIZER_NAME`: Tokenizer repository if you would like to use a different tokenizer than the one that comes with the model. (default: `None`, which uses the model's tokenizer)
82
82
-`TOKENIZER_REVISION`: Tokenizer revision to load (default: `main`).
83
+
-`VLLM_NIGHTLY`: Set to `true` to replace the pinned vLLM release with the latest nightly build and the latest `transformers` from source. Useful for testing unreleased vLLM features. (default: `false`)
83
84
84
85
For the remaining settings, you may apply them as environment variables when running the container. Supported environment variables are listed in the [Environment Variables](#environment-variables) section.
85
86
@@ -89,6 +90,20 @@ For the remaining settings, you may apply them as environment variables when run
If the model you would like to deploy is private or gated, you will need to include it during build time as a Docker secret, which will protect it from being exposed in the image and on DockerHub.
0 commit comments