pywhispercpp -> faster-whisper (VAD), subtitle length, language detection #28

Open
michal-lozowski wants to merge 1 commit into uni-halle:main from michal-lozowski:faster-whisper-and-utils
@michal-lozowski

SUMMARY:

Moved from pywhispercpp to faster-whisper, which provides VAD (voice activity detection) but has no built-in output formatters (txt/srt/vtt).
Implemented: subtitle line-length capping, multi-point language detection, CPU/GPU toggle via config, VAD visualization, and output formatters.

MODIFIED FILES:

Dockerfile

  • Added CPU/GPU toggle blocks (comment/uncomment)

docker-compose.yaml

  • Added CPU/GPU toggle blocks (comment/uncomment)

src/app.py

  • New /vad endpoint for VAD segment data
  • Output formatters (see src/utils/formatters.py)

src/core/Transcriber.py

  • Rewritten for faster-whisper API
  • Multi-point language detection (samples at 20%, 40%, 60%, 80%)
  • VAD log capture from faster-whisper
  • Word-level timestamps for subtitle splitting
  • Configurable device/compute_type from env vars
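
The multi-point language detection listed above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `sample_windows` and `pick_language`, the 30-second window, and the "sum the probabilities" voting rule are all assumptions. The actual detection call on each window is left out.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

# Sample positions named in the PR description (20%, 40%, 60%, 80%).
SAMPLE_FRACTIONS = (0.2, 0.4, 0.6, 0.8)


def sample_windows(
    duration_s: float,
    window_s: float = 30.0,  # assumed window length, not taken from the PR
    fractions: Iterable[float] = SAMPLE_FRACTIONS,
) -> List[Tuple[float, float]]:
    """Return (start, end) windows in seconds, clamped to the audio length."""
    windows = []
    for fraction in fractions:
        # Keep the whole window inside the file where possible.
        start = max(0.0, min(duration_s * fraction, duration_s - window_s))
        end = min(duration_s, start + window_s)
        windows.append((start, end))
    return windows


def pick_language(votes: Iterable[Tuple[str, float]]) -> str:
    """Pick the language with the highest summed probability (assumed rule)."""
    totals = defaultdict(float)
    for language, probability in votes:
        totals[language] += probability
    if not totals:
        raise ValueError("no language votes supplied")
    return max(totals, key=totals.get)
```

Each window would be passed to the model's language detector, and the resulting `(language, probability)` pairs fed to `pick_language`. Sampling several points is meant to avoid misdetection when a recording opens with silence, music, or a different language.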

src/core/TsApi.py

  • WhisperModel init for faster-whisper
  • Reads device/compute_type from env vars
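
Reading the device and compute type from the environment might look like the sketch below. The variable names `WHISPER_DEVICE` and `WHISPER_COMPUTE_TYPE` and the fallback defaults are hypothetical; the PR description does not state which names or defaults are used.

```python
import os
from typing import Dict, Mapping, Optional

ALLOWED_DEVICES = {"cpu", "cuda", "auto"}


def whisper_model_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build device/compute_type keyword arguments from environment variables."""
    env = os.environ if env is None else env
    # Hypothetical variable name; defaults to CPU when unset.
    device = env.get("WHISPER_DEVICE", "cpu").lower()
    if device not in ALLOWED_DEVICES:
        raise ValueError(f"unsupported device: {device!r}")
    # Assumed defaults: float16 on GPU, int8 elsewhere.
    default_compute = "float16" if device == "cuda" else "int8"
    compute_type = env.get("WHISPER_COMPUTE_TYPE", default_compute)
    return {"device": device, "compute_type": compute_type}


# Possible usage, assuming faster-whisper is installed:
#   from faster_whisper import WhisperModel
#   model = WhisperModel("small", **whisper_model_kwargs())
```

Keeping the lookup in one small function lets Transcriber.py and TsApi.py share the same defaults instead of each parsing the environment separately.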

NEW FILES:

src/utils/formatters.py

  • Custom SRT/VTT/TXT/CSV/JSON formatters
  • Replaces pywhispercpp's built-in formatters
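
For orientation, SRT and VTT formatters of this kind can be quite small. The sketch below is illustrative only and is not the contents of src/utils/formatters.py; the function names and the `(start, end, text)` tuple shape are assumptions.

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def format_timestamp(seconds: float, decimal_marker: str = ",") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT)."""
    total_ms = max(0, round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render numbered SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text.strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments: Iterable[Segment]) -> str:
    """Render a WebVTT file: header, then one cue per segment."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}")
        lines.append(text.strip())
        lines.append("")
    return "\n".join(lines)
```

The only difference between the two timestamp styles is the decimal marker: SRT uses a comma and WebVTT uses a period.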

src/utils/subtitle_splitter.py

  • Splits long subtitles using word-level timestamps
  • Splits at punctuation/spaces, max 80 chars per line
  • Trims subtitles longer than 30 s (a stopgap countermeasure against Whisper hallucinations)
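
A rough sketch of the splitting idea follows. It is not the code in src/utils/subtitle_splitter.py: the `Word` and `Cue` types are hypothetical, and "trimming" is interpreted here as clamping a cue's end time to 30 seconds after its start, which is an assumption.

```python
from typing import List, NamedTuple, Sequence


class Word(NamedTuple):
    start: float
    end: float
    text: str


class Cue(NamedTuple):
    start: float
    end: float
    text: str


PUNCTUATION = (".", ",", "!", "?", ";", ":")


def _text(words: Sequence[Word]) -> str:
    return " ".join(w.text.strip() for w in words)


def _make_cue(words: Sequence[Word], max_duration_s: float) -> Cue:
    start = words[0].start
    # Clamp overly long cues (assumed meaning of "trim").
    end = min(words[-1].end, start + max_duration_s)
    return Cue(start, end, _text(words))


def split_words(words: Sequence[Word], max_chars: int = 80,
                max_duration_s: float = 30.0) -> List[Cue]:
    """Greedily group words into cues of at most max_chars characters."""
    cues: List[Cue] = []
    current: List[Word] = []
    for word in words:
        if current and len(_text(current + [word])) > max_chars:
            # Prefer cutting after the last word that ends in punctuation.
            cut = len(current)
            for i in range(len(current) - 1, -1, -1):
                if current[i].text.strip().endswith(PUNCTUATION):
                    cut = i + 1
                    break
            cues.append(_make_cue(current[:cut], max_duration_s))
            rest = current[cut:]
            # Flush the remainder too if it cannot absorb the new word.
            if rest and len(_text(rest + [word])) > max_chars:
                cues.append(_make_cue(rest, max_duration_s))
                rest = []
            current = rest + [word]
        else:
            current.append(word)
    if current:
        cues.append(_make_cue(current, max_duration_s))
    return cues
```

Because each cue takes its start and end from its own first and last word, the split cues keep accurate timing. A single word longer than `max_chars` is left as its own cue, since it cannot be split at a space.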

src/utils/vad_capture.py

  • Captures VAD segments from faster-whisper logs
  • Generates VAD timeline CSVs
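
One way to capture segments from log output is a custom `logging.Handler`, sketched below. The log-line shape matched here (`[MM:SS.mmm -> MM:SS.mmm]`, with optional hours) and the idea of attaching the handler to a logger named `faster_whisper` are assumptions about the library's debug output, not confirmed by this PR. The CSV column names are likewise illustrative.

```python
import csv
import io
import logging
import re
from typing import List, Sequence, Tuple

# Assumed shape of a segment in the log message: "[00:01.500 -> 00:04.000]".
_STAMP = r"(?:\d+:)?\d+:\d+\.\d+"
_SPAN = re.compile(rf"\[({_STAMP}) -> ({_STAMP})\]")


def _to_seconds(stamp: str) -> float:
    """Convert "HH:MM:SS.mmm" or "MM:SS.mmm" to seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


class SegmentCapture(logging.Handler):
    """Collect (start, end) pairs found in any log record it receives."""

    def __init__(self) -> None:
        super().__init__(level=logging.DEBUG)
        self.segments: List[Tuple[float, float]] = []

    def emit(self, record: logging.LogRecord) -> None:
        for start, end in _SPAN.findall(record.getMessage()):
            self.segments.append((_to_seconds(start), _to_seconds(end)))


def segments_to_csv(segments: Sequence[Tuple[float, float]]) -> str:
    """Write a simple timeline CSV with one row per speech segment."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["start_s", "end_s", "duration_s"])
    for start, end in segments:
        writer.writerow([f"{start:.3f}", f"{end:.3f}", f"{end - start:.3f}"])
    return buffer.getvalue()


# Possible usage (logger name is an assumption):
#   capture = SegmentCapture()
#   logger = logging.getLogger("faster_whisper")
#   logger.setLevel(logging.DEBUG)
#   logger.addHandler(capture)
```

Parsing log text is fragile, since the message format can change between library releases, so a capture like this is best treated as a diagnostic aid.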

visualize_vad.py

  • Renders the VAD timeline as a bar chart

QUICKSTART.md

  • Curl commands
  • CPU/GPU switching instructions
