Is your feature request related to a problem? Please describe.
I'm currently running both Hex and SamScribe simultaneously to handle different transcription needs. Hex is perfect for quick voice-to-text dictation with its hotkey-activated workflow, but it lacks persistent meeting recording capabilities. When I need to transcribe Zoom/Teams meetings with speaker identification and save transcripts for future reference, I have to switch to SamScribe. This creates friction in my workflow and requires downloading the same FluidAudio/Parakeet transcription models twice (once per app). Additionally, Hex's current "press-and-hold" or "double-tap" modes aren't designed for continuous, multi-speaker meeting capture where I need the recording to persist after the session ends.
Describe the solution you'd like
I'd like to propose adding an optional "Meeting Mode" to Hex that integrates continuous recording capabilities inspired by SamScribe (MIT licensed, also uses FluidAudio). This would add:
- Continuous Recording Toggle: A mode switch in the UI to enable persistent recording sessions (not just hotkey-activated bursts)
- Per-Process Audio Capture: Integration with ScreenCaptureKit to isolate audio from specific applications (Zoom, Teams, Chrome tabs) rather than just microphone input
- Speaker Diarization: Leverage FluidAudio's existing OfflineDiarizerManager/LSEENDDiarizer capabilities to automatically identify and separate different speakers
- Persistent Storage: Save full meeting transcripts with speaker labels using SwiftData, accessible via a sidebar history view (extending the existing HistoryFeature)
- Cross-Session Speaker Recognition: Voice embeddings to remember speaker identities across multiple meetings (name "Bob" once, recognize automatically next time)
- Editable Transcripts: Inline editing for both speaker names and transcription text, similar to Hex's current text editing workflow
The key is maintaining Hex's elegant simplicity while adding this as an optional mode; users who only want hotkey dictation shouldn't see any added UI complexity.
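To make the cross-session speaker recognition idea concrete, here is a rough sketch of how matching could work, assuming the diarizer yields one fixed-length voice embedding per speaker. `SpeakerRegistry`, `enroll`, `identify`, and the 0.7 threshold are hypothetical names and values chosen for illustration; they are not FluidAudio or SamScribe APIs, and a real cutoff would need tuning against the actual embedding model.

```swift
import Foundation

/// Hypothetical store of known speakers (illustration only, not a FluidAudio type).
struct SpeakerRegistry {
    /// One reference embedding per user-assigned speaker name.
    private(set) var known: [String: [Float]] = [:]
    /// Assumed similarity cutoff; would need tuning for a real embedding model.
    var threshold: Float = 0.7

    /// Stores (or replaces) the reference embedding for a named speaker.
    mutating func enroll(name: String, embedding: [Float]) {
        known[name] = embedding
    }

    /// Returns the best-matching known speaker, or nil if none clears the threshold.
    func identify(_ embedding: [Float]) -> String? {
        var best: (name: String, score: Float)? = nil
        for (name, reference) in known {
            guard let score = cosineSimilarity(embedding, reference) else { continue }
            if score >= threshold, score > (best?.score ?? -Float.infinity) {
                best = (name: name, score: score)
            }
        }
        return best?.name
    }
}

/// Cosine similarity of two equal-length vectors; nil for mismatched or zero vectors.
func cosineSimilarity(_ a: [Float], _ b: [Float]) -> Float? {
    guard a.count == b.count, !a.isEmpty else { return nil }
    var dot: Float = 0
    var normA: Float = 0
    var normB: Float = 0
    for i in a.indices {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    guard normA > 0, normB > 0 else { return nil }
    return dot / (normA.squareRoot() * normB.squareRoot())
}
```

In this sketch, naming a speaker "Bob" once would call `enroll`, and each later meeting would call `identify` on every new speaker embedding, falling back to a generic label when it returns nil.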
Describe alternatives you've considered
- Continue using both apps separately: This works but wastes storage (duplicate models), requires context switching, and creates fragmented transcript history across two different databases.
- Fork Hex entirely: Possible since both are MIT licensed, but this fragments the community and loses upstream improvements. I'd prefer to contribute back to the main project if the architecture aligns.
- Request SamScribe add hotkey dictation: SamScribe is designed around meeting workflows; adding Hex's "paste at cursor" magic would require significant architectural changes to their recording-centric model.
- Build a new hybrid app: Overkill when both existing apps share the same core technologies (Swift, SwiftUI, FluidAudio, TCA) and MIT licenses.
- Use Hex's current HistoryFeature with manual workarounds: Tried saving individual transcription snippets, but this loses speaker context, meeting continuity, and per-process audio isolation.
Additional context
- Technical feasibility: Both Hex and SamScribe use FluidAudio for transcription, meaning they share the same underlying ASR engine (Parakeet TDT v3) and speaker diarization capabilities. A unified app would eliminate duplicate model downloads (roughly 500 MB to 1 GB saved).
- Architecture alignment: Hex already uses Swift Composable Architecture (TCA) and has a HistoryFeature managing transcription history. SamScribe's SwiftData-based persistence and speaker management could integrate cleanly as a new feature slice.
- License compatibility: Both projects are MIT licensed, so code sharing/reference is legally unencumbered.
- Use case priority: This targets users who need both quick dictation (Hex's strength) AND meeting documentation (SamScribe's strength) without maintaining two separate transcription pipelines.
- Reference implementation: SamScribe demonstrates how this works with ScreenCaptureKit, speaker embeddings, and persistent storage, all using the same FluidAudio stack Hex already depends on.
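For discussion, here is a rough idea of what the stored transcript data could look like. This is only a sketch with hypothetical types (`Speaker`, `Segment`, `MeetingTranscript`); it is not Hex's or SamScribe's actual schema, and it uses plain `Codable` structs as a stand-in for SwiftData models so it stays self-contained. Segments point at speakers by ID, so renaming a speaker once updates every line they spoke.

```swift
import Foundation

/// Hypothetical model types for illustration; not an existing Hex or SamScribe schema.
struct Speaker: Codable, Identifiable {
    let id: UUID
    var name: String
}

struct Segment: Codable {
    let speakerID: UUID
    var text: String          // editable transcription text
    let start: TimeInterval   // seconds from the start of the meeting
}

struct MeetingTranscript: Codable {
    var title: String
    var speakers: [Speaker] = []
    var segments: [Segment] = []

    /// Renames a speaker once; segments pick up the new name through the shared ID.
    mutating func rename(speaker id: UUID, to newName: String) {
        guard let index = speakers.firstIndex(where: { $0.id == id }) else { return }
        speakers[index].name = newName
    }

    /// Renders "Name: text" lines in chronological order.
    func rendered() -> String {
        let names = Dictionary(
            speakers.map { ($0.id, $0.name) },
            uniquingKeysWith: { first, _ in first }
        )
        let ordered = segments.sorted { $0.start < $1.start }
        let lines = ordered.map { segment -> String in
            let name = names[segment.speakerID] ?? "Unknown"
            return "\(name): \(segment.text)"
        }
        return lines.joined(separator: "\n")
    }
}
```

Keeping names out of the segments is the main design choice here: inline edits to a speaker name touch a single record, and a matched voice embedding only needs to resolve to a speaker ID.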