Medical conversational AI system combining MultiMeditron LLM with speech capabilities for voice-based medical interactions.
MediTalk integrates multiple AI services to enable natural voice conversations with medical language models:
- Medical LLM: MultiMeditron for medical question answering
- Speech Recognition: Whisper for audio transcription
- Speech Synthesis: Multiple TTS models (Orpheus, Bark, CSM, Qwen3-Omni)
- Web Interface: Streamlit-based conversation UI
- Benchmarking: Comprehensive evaluation suite for TTS and ASR models
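Conceptually, a voice turn flows audio → Whisper (ASR) → MultiMeditron (LLM) → a TTS backend → audio playback, orchestrated by the Controller. The sketch below illustrates that flow against the local ports listed in the service table further down; the endpoint paths (`/transcribe`, `/generate`, `/synthesize`) are hypothetical placeholders, not the services' documented APIs — see each service's README for the real routes.

```python
import requests

# Conceptual round trip. The endpoint paths below are hypothetical
# placeholders -- see each service's README for its actual API.

# 1. Transcribe the user's audio (Whisper service).
with open("inputs/question.wav", "rb") as f:
    text = requests.post(
        "http://localhost:8000/transcribe", files={"audio": f}
    ).json()["text"]

# 2. Ask the medical LLM (MultiMeditron service).
answer = requests.post(
    "http://localhost:5003/generate", json={"prompt": text}
).json()["answer"]

# 3. Synthesize the answer (Orpheus service) and save it.
speech = requests.post(
    "http://localhost:5005/synthesize", json={"text": answer}
).content
with open("outputs/answer.wav", "wb") as f:
    f.write(speech)
```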
Prerequisites:
- HuggingFace token (create one at https://huggingface.co/settings/tokens)
- Model access:
- meta-llama/Meta-Llama-3.1-8B-Instruct
- ClosedMeditron/Mulimeditron-End2End-CLIP-medical (request access from the EPFL LiGHT lab)
- canopylabs/orpheus-3b-0.1-ft
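To confirm your token works and that you have been granted access to the gated repos before launching anything, a quick check with `huggingface_hub` (`pip install huggingface_hub`) looks like this; the repo IDs are the ones listed above:

```python
import os

from huggingface_hub import HfApi

api = HfApi(token=os.environ.get("HUGGINGFACE_TOKEN"))
print("logged in as:", api.whoami()["name"])

# model_info() raises (e.g. GatedRepoError) if the token
# lacks access to a gated or private repo.
for repo in [
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    "ClosedMeditron/Mulimeditron-End2End-CLIP-medical",
    "canopylabs/orpheus-3b-0.1-ft",
]:
    api.model_info(repo)
    print("access OK:", repo)
```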
1. Configure environment:

   Create a `.env` file:

   ```
   HUGGINGFACE_TOKEN=your_token
   MULTIMEDITRON_HF_TOKEN=your_token
   MULTIMEDITRON_MODEL=ClosedMeditron/Mulimeditron-End2End-CLIP-medical
   ```

2. Setup (first time only):

   ```bash
   ./scripts/setup-local.sh
   ```

3. Start services:

   ```bash
   ./scripts/start-local.sh
   ```

4. Access the interface:

   Open http://localhost:8503 in your browser.

5. Stop services:

   ```bash
   ./scripts/stop-local.sh
   ```

MediTalk consists of multiple microservices. Each service has its own README with detailed setup instructions for individual API usage.
| Service | Port | Description | README |
|---|---|---|---|
| Controller | 8000 | Orchestrates all services | Link |
| WebUI | 8501 | Streamlit interface | Link |
| MultiMeditron | 5003 | Medical LLM | Link |
| Whisper | 8000 | Speech-to-text | Link |
| Orpheus | 5005 | Neural TTS | Link |
| Bark | 5008 | Multilingual TTS | Link |
| CSM | 5004 | Conversational TTS | Link |
| Qwen3-Omni | 5006 | Multimodal TTS | Link |
| NISQA | 8006 | Speech quality assessment | Link |
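A quick way to see which of these services are actually listening (independent of the health-check script described below) is to probe the ports from the table; a minimal sketch using only the standard library:

```python
import socket

# Ports taken from the service table above.
# (Whisper is listed on port 8000, the same as the controller.)
SERVICES = {
    "controller": 8000,
    "webui": 8501,
    "multimeditron": 5003,
    "csm": 5004,
    "orpheus": 5005,
    "qwen3-omni": 5006,
    "bark": 5008,
    "nisqa": 8006,
}

for name, port in SERVICES.items():
    with socket.socket() as s:
        s.settimeout(1.0)
        status = "up" if s.connect_ex(("localhost", port)) == 0 else "down"
    print(f"{name:15s} :{port}  {status}")
```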
MediTalk includes comprehensive benchmarking suites for evaluating model performance.
Evaluate text-to-speech models on intelligibility, quality, and speed.
```bash
cd benchmark/tts
./run_benchmark.sh
```

See benchmark/tts/README.md for details.
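For context on what such a benchmark typically measures: intelligibility is commonly scored by transcribing the synthesized audio with an ASR model and computing word error rate against the input text. A minimal sketch of that idea using `openai-whisper` and `jiwer` (not necessarily what `run_benchmark.sh` does internally; the file paths are hypothetical):

```python
import whisper  # pip install openai-whisper
from jiwer import wer  # pip install jiwer

# Text that was fed to the TTS model, and the audio it produced.
reference = "the patient reports chest pain and shortness of breath"
tts_audio = "outputs/sample.wav"  # hypothetical path

# Transcribe the synthesized speech, then score it against the input text.
asr = whisper.load_model("base")
hypothesis = asr.transcribe(tts_audio)["text"].lower().strip()

print("WER:", wer(reference, hypothesis))  # lower is more intelligible
```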
Evaluate speech recognition accuracy across different Whisper model sizes.
```bash
cd benchmark/whisper
./run_benchmark.sh
```

See benchmark/whisper/README.md for details.
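The core of such a comparison is transcribing the same audio with each model size and measuring both accuracy (WER) and wall-clock time; a rough sketch under those assumptions (see benchmark/whisper/README.md for what the actual suite runs; paths and reference text are hypothetical):

```python
import time

import whisper
from jiwer import wer

audio = "inputs/sample.wav"            # hypothetical test clip
reference = "ground truth transcript"  # hypothetical reference text

for size in ["tiny", "base", "small", "medium"]:
    model = whisper.load_model(size)
    start = time.perf_counter()
    hypothesis = model.transcribe(audio)["text"].lower().strip()
    elapsed = time.perf_counter() - start
    print(f"{size:7s} WER={wer(reference, hypothesis):.3f} time={elapsed:.1f}s")
```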
Project structure:

```
MediTalk/
│
├── services/               # Microservices
│   ├── controller/         # Service orchestration
│   ├── webui/              # Web interface
│   ├── modelMultiMeditron/ # Medical LLM
│   ├── modelWhisper/       # ASR
│   ├── modelOrpheus/       # TTS
│   ├── modelBark/          # TTS
│   ├── modelCSM/           # TTS (conversational)
│   ├── modelQwen3Omni/     # TTS (conversational)
│   └── modelNisqa/         # Quality assessment (MOS)
│
├── benchmark/              # Evaluation suites
│   ├── tts/                # TTS benchmark
│   └── whisper/            # ASR benchmark
│
├── scripts/                # Management scripts
│
├── data/                   # Datasets (download, storage, preprocessing)
│
├── inputs/                 # Input files
│
├── outputs/                # Generated files
│
└── logs/                   # Service logs
```
Check service health:

```bash
./scripts/health-check.sh
```

View logs:

```bash
tail -f logs/controller.log
tail -f logs/modelOrpheus.log
```

Monitor GPU usage:

```bash
./scripts/monitor-gpus.sh
```

Service won't start:

```bash
tail -f logs/<service>.log
```

Check for errors, missing dependencies, or invalid tokens.

Missing ffmpeg:

```bash
sudo apt-get update && sudo apt-get install -y ffmpeg
./scripts/restart.sh
```

Model loading fails:
- Verify the HuggingFace token in `.env`
- Check disk space (models are large)
- Review service logs in the `logs/` directory
Note: First run may take several minutes as models are downloaded and cached.
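To warm the cache ahead of time, or to debug a failing download in isolation, you can pre-fetch a model with `huggingface_hub` — a minimal sketch, assuming the token configured in `.env`:

```python
import os

from huggingface_hub import snapshot_download

# Pre-download a model into the local HF cache so service startup is faster.
path = snapshot_download(
    repo_id="canopylabs/orpheus-3b-0.1-ft",
    token=os.environ.get("HUGGINGFACE_TOKEN"),
)
print("cached at:", path)
```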
For deployment on the EPFL RCP cluster, refer to the LiGHT RCP Documentation.
Acknowledgments:
- MultiMeditron - EPFL LiGHT Lab
- Orpheus - Canopy Labs
- Bark - Suno AI
- Whisper - OpenAI
- Qwen3-Omni - Alibaba Cloud
- NISQA - TU Berlin
Semester Project | Nicolas Teissier | LiGHT Laboratory | EPFL