Skip to content

[BUG] Cannot use Blackwell GPU #163

@stratus-ss

Description

@stratus-ss

Description

The version of cuda and torch are too outdated to run on blackwell GPUs (i.e. RTX 5060)

I am trying to update this while V4 is under development. I have successfully updated other containers (such as ebook2audiobook so I know this will work in docker)

To Reproduce

Steps to reproduce the behavior:

  1. Install whishper
  2. Click on 'GPU'
  3. Scroll down
  4. See error
Image

Expected behavior

After updating the cuda and pytorch libraries I expect the GPU transcription to work.

Environment

  • OS: Linux
  • Browser: All browsers
  • Version: Latest Version
  • Hosting: All types (local host, reverse proxy etc)

Attempted Update

I have updated the Dockerfile. The container builds and the CPU transcription still works. The GPU transcription spins the CPU with the python process taking 100% for as long as you let it. A file that takes about 20 seconds on CPU with the tiny model, never completes with the GPU

# YT-DLP Download and setup
FROM --platform=$BUILDPLATFORM golang:bookworm AS ytdlp_cache
ARG TARGETOS
ARG TARGETARCH
RUN apt update && apt install -y wget
RUN wget https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -O /usr/local/bin/yt-dlp
RUN chmod a+rx /usr/local/bin/yt-dlp

# Backend setup
FROM devopsworks/golang-upx:latest as backend-builder

ENV DEBIAN_FRONTEND noninteractive
WORKDIR /app
COPY ./backend /app
RUN go mod tidy
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o whishper . && \
    upx whishper
RUN chmod a+rx whishper

# Frontend setup
FROM node:alpine as frontend
ENV PNPM_HOME="/pnpm"
ENV PATH="$PNPM_HOME:$PATH"
RUN corepack enable && corepack prepare [email protected] --activate
COPY ./frontend /app
WORKDIR /app

FROM frontend AS frontend-prod-deps
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --prod --frozen-lockfile

FROM frontend AS frontend-build
RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --frozen-lockfile
ENV BODY_SIZE_LIMIT=0
RUN pnpm run build

# Base container
FROM nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04 as base
ENV PYTHON_VERSION=3.10

RUN export DEBIAN_FRONTEND=noninteractive \
    && apt-get -qq update \
    && apt-get -qq install --no-install-recommends \
    python${PYTHON_VERSION} \
    python${PYTHON_VERSION}-venv \
    python3-pip \
    ffmpeg \
    curl wget xz-utils \
    ffmpeg nginx supervisor \
    libcublas11 libcudnn8 libcudnn8-dev \
    && rm -rf /var/lib/apt/lists/* 

RUN ln -s -f /usr/bin/python${PYTHON_VERSION} /usr/bin/python3 && \
    ln -s -f /usr/bin/python${PYTHON_VERSION} /usr/bin/python && \
    ln -s -f /usr/bin/pip3 /usr/bin/pip

# Python service setup
COPY ./transcription-api /app/transcription
WORKDIR /app/transcription
RUN pip install -r requirements.txt && \
    pip install python-multipart && \
#    pip3 install torch --index-url https://download.pytorch.org/whl/cu128
    pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128 

# Node.js service setup
RUN wget https://nodejs.org/dist/v20.9.0/node-v20.9.0-linux-x64.tar.xz && \
    tar -xf node-v20.9.0-linux-x64.tar.xz && \
    mv node-v20.9.0-linux-x64 /usr/local/lib/node && \
    rm node-v20.9.0-linux-x64.tar.xz
ENV PATH="/usr/local/lib/node/bin:${PATH}"

ENV BODY_SIZE_LIMIT=0
COPY ./frontend /app/frontend
COPY --from=frontend-build /app/build /app/frontend
COPY --from=frontend-prod-deps /app/node_modules /app/frontend/node_modules

# Golang service setup
COPY --from=backend-builder /app/whishper /bin/whishper 
RUN chmod a+rx /bin/whishper
COPY --from=ytdlp_cache /usr/local/bin/yt-dlp /bin/yt-dlp

# Nginx setup
COPY ./nginx.conf /etc/nginx/nginx.conf

# Set workdir and entrypoint
WORKDIR /app
RUN mkdir /app/uploads


# Cleanup to make the image smaller
RUN apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* /usr/share/doc/* ~/.cache /var/cache

COPY ./supervisord.conf /etc/supervisor/conf.d/supervisord.conf
ENTRYPOINT ["supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"]

# Expose ports for each service and Nginx
EXPOSE 8080 3000 5000 80

Docker Compose File

version: "3.9"

services:
  mongo:
    image: mongo
    env_file:
      - .env
    restart: unless-stopped
    volumes:
      - ./whishper_data/db_data:/data/db
      - ./whishper_data/db_data/logs/:/var/log/mongodb/
    environment:
      MONGO_INITDB_ROOT_USERNAME: ${DB_USER:-whishper}
      MONGO_INITDB_ROOT_PASSWORD: ${DB_PASS:-whishper}
    expose:
      - 27017
    command: ['--logpath', '/var/log/mongodb/mongod.log']

  translate:
    container_name: whisper-libretranslate
    image: libretranslate/libretranslate:latest-cuda
    restart: unless-stopped
    volumes:
      - ./whishper_data/libretranslate/data:/home/libretranslate/.local/share
      - ./whishper_data/libretranslate/cache:/home/libretranslate/.local/cache
    env_file:
      - .env
    user: root
    tty: true
    environment:
      LT_DISABLE_WEB_UI: True
      LT_LOAD_ONLY: ${LT_LOAD_ONLY:-en,fr,es}
      LT_UPDATE_MODELS: True
    expose:
      - 5000
    networks:
      default:
        aliases:
          - translate
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]

  whishper:
    pull_policy: always
   image: containers.example.com/pluja/whispher:cuda128

    env_file:
      - .env
    volumes:
      - ./whishper_data/uploads:/app/uploads
      - ./whishper_data/logs:/var/log/whishper
    container_name: whishper
    restart: unless-stopped
    networks:
      default:
        aliases:
          - whishper
    ports:
      - 8082:80
    depends_on:
      - mongo
      - translate
    environment:
      PUBLIC_INTERNAL_API_HOST: "http://127.0.0.1:80"
      PUBLIC_API_HOST: https://transcribe.example.com
      PUBLIC_TRANSLATION_API_HOST: ""
#      PUBLIC_API_HOST: ${WHISHPER_HOST:-}
      PUBLIC_WHISHPER_PROFILE: gpu
      WHISPER_MODELS_DIR: /app/models
      UPLOAD_DIR: /app/uploads
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions