aleapc/speechcoach-backend

speechcoach-backend

Express + TypeScript backend for Speech Coach G2. Provides real speech-to-text via OpenAI Whisper or Deepgram, plus speech quality analysis (WPM, filler words, pauses, word count).

Setup

npm install
cp .env.example .env
# Edit .env and set OPENAI_API_KEY or DEEPGRAM_API_KEY
npm run dev

The server listens on the port set by PORT (default 8787).

Without an API key, transcription runs in mock mode and returns a placeholder string so the glasses client can still exercise the full flow.

Endpoints

Method  Path                    Purpose
GET     /health                 Liveness + provider info
POST    /transcribe             One-shot PCM -> text + metrics
POST    /session                Create session, returns { id }
POST    /session/:id/audio      Append raw PCM to session
GET     /session/:id/stream     SSE stream of live metrics
POST    /session/:id/finalize   Finalize session, returns summary
GET     /session/:id            Fetch current metrics
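
The session endpoints in the table above suggest a create -> append -> finalize flow. The following is a minimal client sketch of that flow, not code from this repository: `runSession`, `FetchLike`, and `ResponseLike` are hypothetical names, and the shape of the finalize summary is not documented here, so it is left as `unknown`. The fetch function is injected so the flow can be tried against a stub.

```typescript
// Minimal response/fetch shapes (hypothetical) so the flow can run against a stub.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: Uint8Array },
) => Promise<ResponseLike>;

// Hypothetical helper: create a session, append PCM chunks, then finalize it.
async function runSession(
  baseUrl: string,
  chunks: Uint8Array[],
  fetchFn: FetchLike,
): Promise<unknown> {
  const post = async (path: string, body?: Uint8Array): Promise<ResponseLike> => {
    // Audio bodies are sent as raw bytes with application/octet-stream.
    const init = body
      ? { method: "POST", headers: { "Content-Type": "application/octet-stream" }, body }
      : { method: "POST" };
    const res = await fetchFn(`${baseUrl}${path}`, init);
    if (!res.ok) throw new Error(`POST ${path} failed with status ${res.status}`);
    return res;
  };

  // POST /session returns { id } according to the endpoint table.
  const created = (await (await post("/session")).json()) as { id: string };

  // Append each chunk in order; the response body of /audio is not relied on.
  for (const chunk of chunks) {
    await post(`/session/${created.id}/audio`, chunk);
  }

  // The summary's fields are not documented here, so the result stays unknown.
  return (await post(`/session/${created.id}/finalize`)).json();
}
```

Injecting the fetch function keeps the sketch independent of any particular runtime; a thin wrapper around the platform's own `fetch` could be passed in for real use.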

All audio endpoints accept raw PCM (16 kHz, signed 16-bit LE, mono) with Content-Type: application/octet-stream.
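
To make the audio format concrete: at 16 kHz, 16-bit, mono, one second of audio is 32,000 bytes. The sketch below shows that arithmetic and one way a client might convert floating-point samples into this layout. Both function names are hypothetical and are not part of this backend.

```typescript
// Format constants from the line above: 16 kHz, signed 16-bit LE, mono.
const SAMPLE_RATE_HZ = 16_000;
const BYTES_PER_SAMPLE = 2;

// Hypothetical helper: duration in milliseconds of a raw PCM buffer of this format.
function pcmDurationMs(byteLength: number): number {
  if (byteLength % BYTES_PER_SAMPLE !== 0) {
    throw new RangeError("PCM byte length must be a multiple of 2");
  }
  return (byteLength / BYTES_PER_SAMPLE / SAMPLE_RATE_HZ) * 1000;
}

// Hypothetical helper: convert float samples in [-1, 1] to signed 16-bit LE bytes.
function floatToPcm16(samples: Float32Array): Uint8Array {
  const out = new Uint8Array(samples.length * BYTES_PER_SAMPLE);
  const view = new DataView(out.buffer);
  samples.forEach((sample, i) => {
    // Clamp first so out-of-range input cannot wrap around.
    const clamped = Math.max(-1, Math.min(1, sample));
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * BYTES_PER_SAMPLE, Math.round(scaled), true); // true = little-endian
  });
  return out;
}
```

For example, `pcmDurationMs(32000)` evaluates to `1000`, which is a quick sanity check that a buffer really is in the expected format before it is posted.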

SSE message shape

{
  "type": "partial" | "final" | "metrics" | "error",
  "transcript": "...",
  "metrics": {
    "wpm": 128,
    "fillerWords": 5,
    "pauseCount": 12,
    "avgPauseMs": 320,
    "wordCount": 212,
    "fillerBreakdown": { "um": 3, "like": 2 }
  },
  "elapsedMs": 15230
}
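
On the client side, the shape above can be written as a TypeScript type with a small validating parser. This is a sketch under assumptions: the README does not say which fields are present for each message type, so everything except `type` is marked optional here, and the names `SseMessage`, `Metrics`, and `parseSseData` are hypothetical.

```typescript
interface Metrics {
  wpm: number;
  fillerWords: number;
  pauseCount: number;
  avgPauseMs: number;
  wordCount: number;
  fillerBreakdown: Record<string, number>;
}

type MessageType = "partial" | "final" | "metrics" | "error";

// Assumption: only `type` is guaranteed; the other fields may be absent.
interface SseMessage {
  type: MessageType;
  transcript?: string;
  metrics?: Metrics;
  elapsedMs?: number;
}

const MESSAGE_TYPES: readonly string[] = ["partial", "final", "metrics", "error"];

// Hypothetical helper: parse the `data` payload of one SSE event.
// Only the `type` discriminator is validated; other fields are trusted as-is.
function parseSseData(data: string): SseMessage {
  const parsed: unknown = JSON.parse(data);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("SSE payload is not a JSON object");
  }
  const type = (parsed as { type?: unknown }).type;
  if (typeof type !== "string" || !MESSAGE_TYPES.includes(type)) {
    throw new Error(`Unknown message type: ${String(type)}`);
  }
  return parsed as SseMessage;
}
```

In a browser, a handler such as `source.onmessage = (e) => parseSseData(e.data)` on an `EventSource` opened against `/session/:id/stream` would be one way to use it, assuming the server sends unnamed (default) events.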

Providers

Priority order on startup:

  1. OPENAI_API_KEY -> OpenAI Whisper (whisper-1)
  2. DEEPGRAM_API_KEY -> Deepgram nova-2 with filler-word detection
  3. neither -> mock mode (placeholder transcript)
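
The priority order above can be sketched as a small pure function. This is an illustration, not the repository's actual code: `selectProvider` and the `Provider` labels are hypothetical, and treating an empty string as "not set" is an assumption.

```typescript
type Provider = "openai" | "deepgram" | "mock";

// Hypothetical helper mirroring the documented priority order.
// Assumption: an empty-string key counts as unset.
function selectProvider(env: Record<string, string | undefined>): Provider {
  if (env["OPENAI_API_KEY"]) return "openai"; // 1. OpenAI Whisper wins if its key is set
  if (env["DEEPGRAM_API_KEY"]) return "deepgram"; // 2. otherwise Deepgram
  return "mock"; // 3. neither key: placeholder transcript
}
```

In Node this could be called as `selectProvider(process.env)`. Note that when both keys are set, the order implies OpenAI Whisper is chosen and the Deepgram key is ignored.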

Scripts

  • npm run dev - nodemon + tsx hot reload
  • npm start - run once via tsx
  • npm run build - tsc emit to dist/
  • npm run typecheck - type-check only

