Labels
bug (Something isn't working)
Description
The bundled versions of CUDA and PyTorch are too old to run on Blackwell GPUs (e.g. the RTX 5060).
I am trying to update this while V4 is under development. I have successfully updated other containers (such as ebook2audiobook), so I know this will work in Docker.
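One way to confirm the mismatch from inside the container is to compare the GPU's compute capability with the architectures the installed PyTorch build was compiled for. The following is only a minimal sketch: `arch_supported` and `report` are hypothetical helper names, it assumes `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()` behave as documented in the installed build, and it deliberately ignores PTX (`compute_XY`) entries.

```python
import re


def arch_supported(capability, arch_list):
    """Return True if any compiled sm_XY target can run on the device.

    Simplifying assumption: a binary built for sm_XY runs on a device with
    the same major version X and a minor version >= Y. PTX entries
    (compute_XY) are skipped, so this can under-report support.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"sm_(\d+)(\d)[a-z]?", arch)
        if not match:
            continue  # not an sm_XY entry (e.g. compute_90)
        built_major, built_minor = int(match.group(1)), int(match.group(2))
        if built_major == major and built_minor <= minor:
            return True
    return False


def report():
    """Describe the first CUDA device and whether this torch build targets it."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "torch cannot see a CUDA device"
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    ok = arch_supported(capability, arch_list)
    return (
        f"device sm_{capability[0]}{capability[1]}, "
        f"build targets {arch_list}, supported={ok}"
    )


if __name__ == "__main__":
    print(report())
```

Running this with the container's Python should show whether the build lists a target matching the card. To my knowledge a consumer Blackwell card reports compute capability 12.0, so a build without an `sm_120` entry would come back as unsupported.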
To Reproduce
Steps to reproduce the behavior:
- Install whishper
- Click on 'GPU'
- Scroll down
- See error
Expected behavior
After updating the CUDA and PyTorch libraries, I expect GPU transcription to work.
Environment
- OS: Linux
- Browser: All browsers
- Version: Latest Version
- Hosting: All types (local host, reverse proxy etc)
Attempted Update
I have updated the Dockerfile. The container builds and CPU transcription still works. GPU transcription, however, pins the CPU: the python process sits at 100% for as long as you let it run. A file that takes about 20 seconds on the CPU with the tiny model never completes on the GPU.
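Because the GPU job never finishes, it can help to run a small GPU probe under a hard timeout, so a hang shows up as a clear "timed out" result rather than an endless spin. This is a hedged sketch, not part of whishper: `run_with_timeout` and `GPU_PROBE` are hypothetical names, and the probe only assumes that PyTorch is importable inside the container.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd and return (finished, output); finished is False on timeout."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return False, ""
    return True, result.stdout + result.stderr


# Hypothetical probe: a small matrix multiply on the GPU via PyTorch.
GPU_PROBE = (
    "import torch; "
    "x = torch.rand(256, 256, device='cuda'); "
    "print((x @ x).sum().item())"
)

if __name__ == "__main__":
    finished, output = run_with_timeout([sys.executable, "-c", GPU_PROBE], 60)
    print("finished" if finished else "timed out after 60 s")
    print(output)
```

If the probe finishes quickly but transcription still hangs, the problem is more likely in the transcription stack than in PyTorch's CUDA support. If the probe itself times out or prints a CUDA error, the PyTorch build or the driver is the first thing to check.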
# YT-DLP Download and setup
FROM --platform=$BUILDPLATFORM golang:bookworm AS ytdlp_cache
ARG TARGETOS
ARG TARGETARCH
RUN apt update && apt install -y wget
RUN wget https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -O /usr/local/bin/yt-dlp
RUN chmod a+rx /usr/local/bin/yt-dlp
# Backend setup
FROM devopsworks/golang-upx:latest as backend-builder
ENV DEBIAN_FRONTEND noninteractive
WORKDIR /app
COPY ./backend /app
RUN go mod tidy
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o whishper . && \
    upx whishper
RUN chmod a+rx whishper
# Frontend setup
FROM node:alpine as frontend
ENV PNPM_HOME="/pnpm"
ENV PATH="$PNPM_HOME:$PATH"
RUN corepack enable && corepack prepare [email protected] --activate
COPY ./frontend /app
WORKDIR /app
FROM frontend AS frontend-prod-deps
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --prod --frozen-lockfile
FROM frontend AS frontend-build
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --frozen-lockfile
ENV BODY_SIZE_LIMIT=0
RUN pnpm run build
# Base container
FROM nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04 as base
ENV PYTHON_VERSION=3.10
RUN export DEBIAN_FRONTEND=noninteractive \
    && apt-get -qq update \
    && apt-get -qq install --no-install-recommends \
    python${PYTHON_VERSION} \
    python${PYTHON_VERSION}-venv \
    python3-pip \
    ffmpeg \
    curl wget xz-utils \
    ffmpeg nginx supervisor \
    libcublas11 libcudnn8 libcudnn8-dev \
    && rm -rf /var/lib/apt/lists/*
RUN ln -s -f /usr/bin/python${PYTHON_VERSION} /usr/bin/python3 && \
    ln -s -f /usr/bin/python${PYTHON_VERSION} /usr/bin/python && \
    ln -s -f /usr/bin/pip3 /usr/bin/pip
# Python service setup
COPY ./transcription-api /app/transcription
WORKDIR /app/transcription
RUN pip install -r requirements.txt && \
    pip install python-multipart && \
    # pip3 install torch --index-url https://download.pytorch.org/whl/cu128
    pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
# Node.js service setup
RUN wget https://nodejs.org/dist/v20.9.0/node-v20.9.0-linux-x64.tar.xz && \
    tar -xf node-v20.9.0-linux-x64.tar.xz && \
    mv node-v20.9.0-linux-x64 /usr/local/lib/node && \
    rm node-v20.9.0-linux-x64.tar.xz
ENV PATH="/usr/local/lib/node/bin:${PATH}"
ENV BODY_SIZE_LIMIT=0
COPY ./frontend /app/frontend
COPY --from=frontend-build /app/build /app/frontend
COPY --from=frontend-prod-deps /app/node_modules /app/frontend/node_modules
# Golang service setup
COPY --from=backend-builder /app/whishper /bin/whishper
RUN chmod a+rx /bin/whishper
COPY --from=ytdlp_cache /usr/local/bin/yt-dlp /bin/yt-dlp
# Nginx setup
COPY ./nginx.conf /etc/nginx/nginx.conf
# Set workdir and entrypoint
WORKDIR /app
RUN mkdir /app/uploads
# Cleanup to make the image smaller
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* /usr/share/doc/* ~/.cache /var/cache
COPY ./supervisord.conf /etc/supervisor/conf.d/supervisord.conf
ENTRYPOINT ["supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]
# Expose ports for each service and Nginx
EXPOSE 8080 3000 5000 80
Docker Compose File
version: "3.9"
services:
  mongo:
    image: mongo
    env_file:
      - .env
    restart: unless-stopped
    volumes:
      - ./whishper_data/db_data:/data/db
      - ./whishper_data/db_data/logs/:/var/log/mongodb/
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${DB_USER:-whishper}
      MONGO_INITDB_ROOT_PASSWORD: ${DB_PASS:-whishper}
    expose:
      - 27017
    command: ['--logpath', '/var/log/mongodb/mongod.log']
  translate:
    container_name: whisper-libretranslate
    image: libretranslate/libretranslate:latest-cuda
    restart: unless-stopped
    volumes:
      - ./whishper_data/libretranslate/data:/home/libretranslate/.local/share
      - ./whishper_data/libretranslate/cache:/home/libretranslate/.local/cache
    env_file:
      - .env
    user: root
    tty: true
    environment:
      LT_DISABLE_WEB_UI: True
      LT_LOAD_ONLY: ${LT_LOAD_ONLY:-en,fr,es}
      LT_UPDATE_MODELS: True
    expose:
      - 5000
    networks:
      default:
        aliases:
          - translate
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  whishper:
    pull_policy: always
    image: containers.example.com/pluja/whispher:cuda128
    env_file:
      - .env
    volumes:
      - ./whishper_data/uploads:/app/uploads
      - ./whishper_data/logs:/var/log/whishper
    container_name: whishper
    restart: unless-stopped
    networks:
      default:
        aliases:
          - whishper
    ports:
      - 8082:80
    depends_on:
      - mongo
      - translate
    environment:
      PUBLIC_INTERNAL_API_HOST: "http://127.0.0.1:80"
      PUBLIC_API_HOST: https://transcribe.example.com
      PUBLIC_TRANSLATION_API_HOST: ""
      # PUBLIC_API_HOST: ${WHISHPER_HOST:-}
      PUBLIC_WHISHPER_PROFILE: gpu
      WHISPER_MODELS_DIR: /app/models
      UPLOAD_DIR: /app/uploads
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]