feat: Settings → Hardware compute device selector (CPU / GPU) #139
CrisAlejo26 wants to merge 5 commits into amicalhq:main from
Conversation
Lets users pick which CPU or GPU runs local Whisper models from a new
Settings → Hardware tab. Auto (the current behaviour) is the default; CPU
forces the CPU binary; picking a specific GPU forwards its index to
whisper.cpp's `whisper_context_params.gpu_device`.
Also folds in the fixes that were required to build the CUDA whisper
binary on Windows with CUDA 13.x; they are prerequisites for the GPU
path to be exercised at all.
Whisper wrapper
- addon.cpp: read `gpu_device` option and propagate it.
- index.ts: expose `gpu`, `gpuDevice`, `flashAttn`, `threads` and
`preferredBackend`; actually forward them to `binding.init`.
- loader.ts: accept a `preferredBackend` / `WHISPER_NATIVE_BACKEND` env
override; cache the binding per backend.
- CMakeLists.txt: pass `/Zc:preprocessor` to MSVC (required by CCCL in
CUDA 13) and iterate CMAKE_JS_INC natively so paths containing spaces
resolve correctly.
- bin/build-addon.js: detect VS 2022 Preview alongside the other
editions when probing for lib.exe.
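The backend-resolution behaviour described for loader.ts can be sketched roughly like this (illustrative names and a stubbed loader; the real `loader.ts` resolves and `require()`s platform-specific `.node` files):

```typescript
type Backend = "auto" | "cpu" | "metal" | "openblas" | "cuda" | "vulkan";

const ALLOWED = new Set<string>(["auto", "cpu", "metal", "openblas", "cuda", "vulkan"]);

// Explicit option wins, then the WHISPER_NATIVE_BACKEND env override, else "auto".
// Unknown values fall back to "auto" with a warning (per the review nitpick below).
function resolvePreferredBackend(
  preferred?: string,
  env: Record<string, string | undefined> = {}, // real code would read process.env
): Backend {
  const raw = preferred ?? env.WHISPER_NATIVE_BACKEND;
  if (raw === undefined) return "auto";
  if (!ALLOWED.has(raw)) {
    console.warn(`unknown whisper backend "${raw}", falling back to "auto"`);
    return "auto";
  }
  return raw as Backend;
}

// Cache one binding per backend so switching backends loads a fresh addon
// while repeated calls for the same backend reuse the existing one.
const bindingCache = new Map<Backend, { backend: Backend }>();

function loadBinding(preferred?: string): { backend: Backend } {
  const backend = resolvePreferredBackend(preferred);
  let binding = bindingCache.get(backend);
  if (binding === undefined) {
    binding = { backend }; // stand-in for require()-ing the native module
    bindingCache.set(backend, binding);
  }
  return binding;
}
```

The per-backend cache is what lets a settings change from CPU to CUDA load a different native addon without restarting the worker process.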
Desktop app
- db/schema.ts: new `compute` section on AppSettingsData.
- services/settings-service.ts: get/set compute settings with a
`compute-changed` event.
- services/hardware-detection-service.ts (new): enumerate GPUs via the
existing `systeminformation` dep with a one-shot cache.
- pipeline/providers/transcription/whisper-provider.ts: accept
SettingsService, read compute settings, forward them to the worker
fork on each initializeModel call and dispose the worker when they
change.
- pipeline/providers/transcription/whisper-worker-fork.ts: accept
WhisperInitOptions and pass them through to the wrapper constructor.
- services/transcription-service.ts: hand SettingsService to the
WhisperProvider.
- trpc: new `hardware.getSnapshot` query and
`settings.{get,set}ComputeSettings` procedures.
- renderer: new `/settings/hardware` route and page with the selector,
detected hardware list and refresh button; sidebar entry and title
mapping added.
- i18n: `en` and `es` locales for the new strings (ja / zh-TW deferred
to native translators).
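For reference, the selector value and the persisted `compute` shape map onto each other with a couple of tiny helpers. This is a simplified sketch: `toStored` mirrors the renderer helper of the same name (without the threads-preserving fix discussed in review), and `toSelector` is a hypothetical inverse added here for illustration:

```typescript
type DeviceValue = "auto" | "cpu" | `gpu:${number}`;

interface ComputeSettings {
  device: "auto" | "cpu" | "gpu";
  gpuDevice?: number;
}

// Selector string -> stored settings ("gpu:1" -> { device: "gpu", gpuDevice: 1 }).
function toStored(value: DeviceValue): ComputeSettings {
  if (value === "auto" || value === "cpu") return { device: value };
  const index = Number(value.slice("gpu:".length));
  return { device: "gpu", gpuDevice: Number.isFinite(index) ? index : 0 };
}

// Stored settings -> selector string, defaulting a missing gpuDevice to 0.
function toSelector(settings: ComputeSettings): DeviceValue {
  if (settings.device !== "gpu") return settings.device;
  return `gpu:${settings.gpuDevice ?? 0}`;
}
```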
Docs
- docs/hardware-selection.md – user-facing guide for the new tab.
- docs/building-whisper-cuda-windows.md – developer guide covering the
  build-script fixes above.
- docs/CHANGELOG-compute-device-selector.md – release notes with the
  full file list and migration / testing notes.
- README link added.
Tested on Windows 11 + RTX 4080 Laptop + CUDA 13.2: rebuilt the wrapper
with the CMake fixes, copied the two cuBLAS DLLs next to whisper.node,
launched the app, dictated a sentence. Log confirmed
`whisper_native_binding: cuda`; GPU peaked at 100% while CPU stayed
low. macOS (Metal) and Linux (Vulkan) paths were not re-tested on this
branch; they rely on the existing loader fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
No actionable comments were generated in the recent review.
Recent review info
Run configuration: defaults. Review profile: CHILL. Plan: Pro. Run ID:
Files selected for processing (5)
Files skipped from review due to trivial changes (2)
Files skipped from review as they are similar to previous changes (3)
Walkthrough
Adds hardware detection and a compute-device selector for local Whisper: new settings persistence and tRPC endpoints, UI for Auto/CPU/GPU selection, a detection service, wiring into the transcription provider/worker to pass GPU/backend/threads, native loader/backend selection, build fixes, and documentation.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant UI as Renderer (HardwareSection)
    participant API as tRPC
    participant SettingsSvc as SettingsService
    participant HWSvc as HardwareDetectionService
    participant TransSvc as TranscriptionService
    participant Provider as WhisperProvider
    participant Worker as WhisperWorker
    User->>UI: select compute device (e.g. "gpu:0")
    UI->>API: settings.setComputeSettings({device:"gpu", gpuDevice:0})
    API->>SettingsSvc: setComputeSettings(...)
    SettingsSvc->>SettingsSvc: persist & emit "compute-changed"
    SettingsSvc-->>API: success
    API-->>UI: success
    SettingsSvc->>TransSvc: compute-changed
    TransSvc->>TransSvc: acquire mutexes
    TransSvc->>Provider: reset() / dispose()
    Provider->>Worker: dispose() (free instance)
    Worker-->>Provider: disposed
    TransSvc->>TransSvc: release mutexes
    User->>UI: request transcription
    UI->>API: transcribe(audio)
    API->>TransSvc: transcribe(...)
    TransSvc->>Provider: initializeModel()
    Provider->>SettingsSvc: getComputeSettings()
    SettingsSvc-->>Provider: {device:"gpu", gpuDevice:0}
    Provider->>Worker: initializeModel([modelPath, initOptions])
    Worker->>Worker: load binding (preferred backend), init Whisper with gpu_device/threads
    Worker-->>Provider: ready
    Provider-->>TransSvc: ready for transcription
```
Estimated code review effort: 4 (Complex), ~45 minutes.
Pre-merge checks: 2 passed, 1 failed (warning).
@CrisAlejo26 Any reason for this change? Vulkan is already working for Windows users with NVIDIA GPUs.
Actionable comments posted: 11
Nitpick comments (2)
packages/whisper-wrapper/src/loader.ts (1)
56-72: Silent fallback when `WHISPER_NATIVE_BACKEND` holds an unknown value.
If a user sets `WHISPER_NATIVE_BACKEND=cudaa` (typo) or any other string outside the allowed set, the env var is silently ignored and the loader falls through to `"auto"`. A one-line `console.warn` when the env var is set but unrecognized would save a lot of debugging time when someone reports "I set the env var and it still picked CPU".
Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/whisper-wrapper/src/loader.ts` around lines 56 - 72, resolvePreferredBackend currently silently ignores unknown WHISPER_NATIVE_BACKEND values; update the function (resolvePreferredBackend) to detect when process.env.WHISPER_NATIVE_BACKEND is set but not one of the allowed values ("auto","cpu","metal","openblas","cuda","vulkan") and emit a single-line warning (e.g., console.warn) that includes the raw env value and that it will fall back to "auto"; keep the existing return behavior so unrecognized values still return "auto".

packages/whisper-wrapper/addon/addon.cpp (1)
269-281: LGTM – optional propagation is the right call.
Guarding the assignment behind `gpu_device_set` preserves the whisper.cpp default when callers don't explicitly pick a device, which is important for older/non-CUDA builds that treat the field differently. The type-check pattern (`raw.As<Napi::Number>()` without `IsNumber()`) matches the other options in this file, so no new inconsistency is introduced.
One small follow-up worth considering: add a sanity log (or clamp) if `gpu_device` is negative or larger than the number of available devices, so misconfiguration surfaces clearly instead of silently falling back to device 0 inside whisper.cpp.
Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/whisper-wrapper/addon/addon.cpp` around lines 269 - 281, Add a sanity check before assigning cparams.gpu_device: when gpu_device_set is true, query the available GPU count (e.g., via cudaGetDeviceCount or platform API), and if gpu_device is negative or >= device_count either log a clear warning and skip setting cparams.gpu_device (to preserve whisper.cpp default) or clamp gpu_device into the valid [0, device_count-1] range before assignment; reference the variables gpu_device, gpu_device_set and the struct field cparams.gpu_device (created from whisper_context_default_params/whisper_context_params) when implementing the check and warning.
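The suggested range check is simple to express in isolation. A sketch of the validation logic in TypeScript terms (hypothetical helper; the actual fix would be C++ inside addon.cpp, with the device count queried from the active backend):

```typescript
// Returns the device index to pass through, or undefined to keep the
// whisper.cpp default when the request is out of range.
function sanitizeGpuDevice(requested: number, deviceCount: number): number | undefined {
  if (!Number.isInteger(requested) || requested < 0 || requested >= deviceCount) {
    console.warn(
      `gpu_device ${requested} outside [0, ${deviceCount - 1}]; keeping backend default`,
    );
    return undefined;
  }
  return requested;
}
```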
Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/desktop/src/pipeline/providers/transcription/whisper-worker-fork.ts`:
- Around line 9-14: The WhisperInitOptions interface needs a flashAttn?: boolean
field and that property must be included in the equality check in
sameInitOptions() and forwarded into the new Whisper(...) constructor call so
the worker doesn't drop the option; stop coalescing opts.gpu to true for "auto"
behaviorβpass opts through unchanged (do not derive gpu: true when undefined);
and change the initialization flow around whisperInstance and cache globals so
you only assign whisperInstance and set any cache/global state after
whisper.load() completes successfully (and ensure you clear/cleanup any
partially created instance if load() throws) to avoid leaving an invalid
instance in the cache.
In `@apps/desktop/src/renderer/main/pages/settings/hardware/index.tsx`:
- Around line 33-40: The toStored function currently returns only
device/gpuDevice which causes setComputeSettings to overwrite and drop any
existing threads value; update toStored to merge the existing compute settings'
threads into the returned object by reading the current compute state (or
accepting the previous compute object) and including threads when present, so
setComputeSettings preserves threads; apply the same fix to the other occurrence
of toStored used in the component (the second call around the GPU-selector
handling) so both branches retain threads when saving.
- Around line 54-57: The current useQuery is called with undefined so calling
refetchSnapshot() reuses undefined and hardwareService.detect(input?.refresh ??
false) keeps getting false; update the refetch calls (refetchSnapshot and the
other refetch at lines 155-159) to pass an explicit input object that sets
refresh: true (i.e., call refetch({ refresh: true })) so the
backend/hardwareService.detect receives true and forces re-detection; ensure the
query input type (api.hardware.getSnapshot.useQuery) supports a { refresh:
boolean } argument when you change the refetch invocation.
- Around line 69-72: The GPU-detection logic in hasAnyGpuBackend incorrectly
treats "openblas" as a GPU; update the check that inspects
snapshot?.availableBackends in hasAnyGpuBackend to either (a) explicitly check
for known GPU backend names (e.g., "cuda", "metal", "vulkan", "rocm") or (b)
exclude both "cpu" and "openblas" instead of only "cpu", so openblas is treated
as CPU-acceleration; apply the same change to the duplicate backend-check logic
used elsewhere in this file (the similar block around the other
availableBackends usage).
In `@apps/desktop/src/services/hardware-detection-service.ts`:
- Around line 49-51: The current dedicated flag uses only vramMB (dedicated:
typeof c.vram === "number" ? c.vram >= 2048 : false) which misclassifies
integrated GPUs; update the logic that sets dedicated (near vramMB,
vendorCategory, and categorizeVendor) to combine vendorCategory with vramMB so
that GPUs from "intel" and "apple" (as returned by categorizeVendor) are treated
as non-dedicated regardless of modest VRAM, or require a much larger VRAM
threshold for those vendors (e.g., only mark dedicated when vendorCategory not
in ["intel","apple"] and vramMB exceeds a higher cutoff). Ensure you modify the
dedicated assignment only, referencing vramMB, vendorCategory, and
categorizeVendor.
- Around line 60-66: The catch block in detect() is caching a permanent failure
by assigning this.snapshot = { gpus: [], cpuThreads: 0, cpuModel: "Unknown" } so
subsequent calls short-circuit; instead, log the error with
logger.main.error(error) but do NOT overwrite this.snapshot on transient
si.graphics()/si.cpu() failures β either return the previous snapshot or
null/undefined to force future detect() calls to retry, and only update
this.snapshot inside the successful path (the code that builds the snapshot from
si.graphics()/si.cpu()). Ensure inFlight is still cleared in finally so retries
can proceed.
In `@apps/desktop/src/services/transcription-service.ts`:
- Line 69: The WhisperProvider is currently constructed with settingsService
which registers its own compute-changed listener that can dispose the worker
outside the transcriptionMutex and race with active calls (affecting
processStreamingChunk, finalizeSession, retryTranscription). Fix by removing
direct settingsService subscription from WhisperProvider and route
compute-device change events through TranscriptionService (or add a
serialization hook on WhisperProvider) so that any worker reset/dispose runs
under transcriptionMutex or is deferred until no session is active; update the
constructor call in TranscriptionService where WhisperProvider is instantiated
and ensure compute-change handling is executed inside the transcriptionMutex (or
call a provided provider.resetSafely() that acquires the mutex) instead of
letting WhisperProvider dispose the worker asynchronously.
In `@apps/desktop/src/trpc/routers/hardware.ts`:
- Around line 13-37: In detectAvailableBackends(), remove the unused
destructuring "const { platform, arch } = process;" and delete the unreachable
statements "void arch;" and "void platform;" that appear after the return; keep
the rest of the logic (the BACKENDS loop, available.push("cpu"), and the de-dup
return Array.from(new Set(available))) unchanged so behavior is preserved.
In `@docs/building-whisper-cuda-windows.md`:
- Around line 48-62: The Markdown fenced code blocks showing build outputs (the
block containing "packages/whisper-wrapper/native/win32-x64-cuda/whisper.node"
and the block showing the error from "cccl/std/__cccl/preprocessor.h(23): fatal
error C1189") need explicit language tags to satisfy markdownlint; update both
fences to use ```text instead of bare ``` so the blocks around the package paths
and the MSVC/cl.exe preprocessor error are annotated as text.
In `@packages/whisper-wrapper/src/index.ts`:
- Around line 39-42: If opts.gpuDevice is provided but opts.gpu is undefined,
the constructed initOpts may set gpu_device without enabling GPU; update the
initOpts construction so that when opts.gpuDevice !== undefined you also set
initOpts.gpu = true (in addition to initOpts.gpu_device = opts.gpuDevice).
Modify the logic around initOpts (the block that sets model, gpu, gpu_device,
flash_attn) so gpu is set to true whenever gpu_device is set, ensuring the addon
(cparams.use_gpu) is enabled when a specific GPU is selected.
- Around line 55-68: The code currently calls loadBinding() inside full() and
free(), which can return a different native addon than the one that created
this.ctx; instead, capture and store the binding used when the Whisper instance
is constructed (e.g. set this.binding = loadBinding(...) in the constructor
where ctx and preferredBackend are chosen) and replace internal calls to
loadBinding() in full() and free() with this.binding so full(this.ctx, ...) and
binding.free(this.ctx) always use the same native addon that created this.ctx.
Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
Review info
Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 528096ad-3ddd-4145-b6b7-88915f691927
Files selected for processing (25)
- README.md
- apps/desktop/src/db/schema.ts
- apps/desktop/src/i18n/locales/en.json
- apps/desktop/src/i18n/locales/es.json
- apps/desktop/src/pipeline/providers/transcription/whisper-provider.ts
- apps/desktop/src/pipeline/providers/transcription/whisper-worker-fork.ts
- apps/desktop/src/renderer/main/lib/settings-navigation.ts
- apps/desktop/src/renderer/main/pages/settings/hardware/index.tsx
- apps/desktop/src/renderer/main/routeTree.gen.ts
- apps/desktop/src/renderer/main/routes/settings/hardware.tsx
- apps/desktop/src/renderer/main/routes/settings/route.tsx
- apps/desktop/src/services/hardware-detection-service.ts
- apps/desktop/src/services/settings-service.ts
- apps/desktop/src/services/transcription-service.ts
- apps/desktop/src/trpc/router.ts
- apps/desktop/src/trpc/routers/hardware.ts
- apps/desktop/src/trpc/routers/settings.ts
- docs/CHANGELOG-compute-device-selector.md
- docs/building-whisper-cuda-windows.md
- docs/hardware-selection.md
- packages/whisper-wrapper/addon/CMakeLists.txt
- packages/whisper-wrapper/addon/addon.cpp
- packages/whisper-wrapper/bin/build-addon.js
- packages/whisper-wrapper/src/index.ts
- packages/whisper-wrapper/src/loader.ts
```ts
function toStored(value: DeviceValue): {
  device: "auto" | "cpu" | "gpu";
  gpuDevice?: number;
} {
  if (value === "auto" || value === "cpu") return { device: value };
  const index = Number(value.slice("gpu:".length));
  return { device: "gpu", gpuDevice: Number.isFinite(index) ? index : 0 };
}
```
Preserve existing thread settings when changing devices.
`setComputeSettings` replaces the compute section, but `toStored()` only returns `device`/`gpuDevice`. Any persisted `threads` value is dropped as soon as the user changes the selector.
Proposed fix
```diff
 function toStored(value: DeviceValue): {
   device: "auto" | "cpu" | "gpu";
   gpuDevice?: number;
+  threads?: number;
-} {
-  if (value === "auto" || value === "cpu") return { device: value };
+}, compute?: { threads?: number }) {
+  const withThreads = compute?.threads ? { threads: compute.threads } : {};
+  if (value === "auto" || value === "cpu") {
+    return { device: value, ...withThreads };
+  }
   const index = Number(value.slice("gpu:".length));
-  return { device: "gpu", gpuDevice: Number.isFinite(index) ? index : 0 };
+  return {
+    device: "gpu",
+    gpuDevice: Number.isFinite(index) ? index : 0,
+    ...withThreads,
+  };
 }
@@
   setSelection(value);
   try {
-    await mutation.mutateAsync(toStored(value));
+    await mutation.mutateAsync(toStored(value, compute));
   } catch (error) {
```
Also applies to: 76-80
```ts
const { data: snapshot, refetch: refetchSnapshot, isFetching: snapshotLoading } =
  api.hardware.getSnapshot.useQuery(undefined, {
    staleTime: 5 * 60 * 1000,
  });
```
Make Refresh actually force hardware re-detection.
The query is created with `undefined` input, and `refetchSnapshot()` reuses that input, so `hardwareService.detect(input?.refresh ?? false)` still receives `false` and returns the cached snapshot.
Proposed fix
```diff
 const { data: snapshot, refetch: refetchSnapshot, isFetching: snapshotLoading } =
   api.hardware.getSnapshot.useQuery(undefined, {
     staleTime: 5 * 60 * 1000,
   });
+
+const refreshSnapshot = async () => {
+  const freshSnapshot = await utils.hardware.getSnapshot.fetch({
+    refresh: true,
+  });
+  utils.hardware.getSnapshot.setData(undefined, freshSnapshot);
+};
@@
 <Button
   variant="ghost"
   size="sm"
-  onClick={() => refetchSnapshot()}
+  onClick={() => void refreshSnapshot()}
   disabled={snapshotLoading}
 >
```
Also applies to: 155-159
```ts
const hasAnyGpuBackend = useMemo(() => {
  const backends = snapshot?.availableBackends ?? [];
  return backends.some((b) => b !== "cpu");
}, [snapshot]);
```
Don't treat OpenBLAS as a GPU backend.
`availableBackends` can include `openblas`, which is CPU acceleration. With the current `b !== "cpu"` check, selecting a GPU on an OpenBLAS-only build suppresses the warning even though no GPU backend is available.
Proposed fix
```diff
+const GPU_BACKENDS = new Set(["cuda", "vulkan", "metal"]);
+
 export default function HardwareSettingsPage() {
@@
   const hasAnyGpuBackend = useMemo(() => {
     const backends = snapshot?.availableBackends ?? [];
-    return backends.some((b) => b !== "cpu");
+    return backends.some((b) => GPU_BACKENDS.has(b));
   }, [snapshot]);
```
Also applies to: 142-146
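Stripped of the React wrapper, the corrected predicate is just set membership (a sketch, with backend names taken from the proposed fix):

```typescript
// openblas accelerates CPU inference, so it must not count as a GPU backend.
const GPU_BACKENDS = new Set(["cuda", "vulkan", "metal"]);

function hasAnyGpuBackend(availableBackends: string[]): boolean {
  return availableBackends.some((b) => GPU_BACKENDS.has(b));
}
```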
- Add the missing `flashAttn` field to WhisperInitOptions so it stops being silently dropped on its way to the wrapper.
- Include `flashAttn` in sameInitOptions' equality so a change there forces a rebuild of the whisper context.
- Stop coalescing `opts.gpu` to true when undefined. The caller (WhisperProvider.resolveInitOptions) already sets it explicitly for cpu / gpu modes and leaves it undefined for auto, which lets whisper_context_default_params() supply its own default instead of us overriding it.
- Commit `whisperInstance` / `currentModelPath` / `currentInitOptions` only after a successful load, and free the half-initialised context if load() throws. Also clear the globals before freeing the previous instance so a free() that throws does not leave the worker pointing at a disposed native ctx.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drops the dedicated /settings/hardware route and sidebar entry and renders the compute-device selector, GPU list and warnings as an extra section inside Settings → Advanced. Keeps the feature available but gated as power-user territory, matching the maintainer's direction on PR amicalhq#139.
Also tightens the extracted section: toStored preserves existing compute.threads, hasAnyGpuBackend filters to known GPU backends only (cuda/metal/vulkan/rocm) instead of treating openblas as a GPU, and Refresh now calls utils.hardware.getSnapshot.fetch({ refresh: true }) so tapping it actually forces re-detection.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- whisper-wrapper/src/index.ts: setting gpuDevice now also forces gpu=true so a selected device can't silently fall back to CPU. Capture the resolved binding in the constructor and reuse it in transcribe()/free() so the ctx always goes back to the native addon that created it.
- whisper-wrapper/src/loader.ts: warn (once) when WHISPER_NATIVE_BACKEND holds a value outside the allowed set instead of silently falling back to "auto".
- whisper-wrapper/addon/addon.cpp: sanity-check the requested gpu_device – negative values are logged and skipped so whisper.cpp falls back to its own default instead of passing garbage to the backend.
- hardware-detection-service.ts: combine vendorCategory with VRAM so Intel/Apple integrated GPUs aren't flagged as dedicated; on systeminformation failure, keep the previous snapshot instead of caching an empty result forever.
- transcription-service.ts + whisper-provider.ts: move the compute-changed listener out of WhisperProvider and run the worker teardown inside transcriptionMutex.runExclusive(), so a setting change can no longer race a mid-call initializeWhisper → transcribeAudio.
- trpc/routers/hardware.ts: drop the unreachable void arch/platform statements after the return.
- docs/building-whisper-cuda-windows.md: tag the two fenced code blocks with `text` for markdownlint.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The specific case
I have a laptop with an Intel UHD (iGPU) + NVIDIA RTX 4080 Laptop (dGPU). With the current build, Vulkan does load, but it picks adapter 0, which on this box is the Intel iGPU. So "it works" in the sense that the GPU binary is used, but whisper-large-v3 runs on the Intel iGPU and keeps the CPU pegged at ~100% while the 12 GB RTX 4080 sits at 14% utilisation. There is no way for a user to redirect whisper to adapter 1 without rebuilding from source or editing env vars by hand. This is common on most modern Windows laptops and it is not user error.

What the PR actually does
The PR does three things, of which only the first strictly needs UI:
Moved the selector under Advanced, as you suggested
I just pushed a commit that drops the dedicated /settings/hardware route and sidebar entry.
Net UI impact: power-user-only, no new top-level nav entry, and users who don't need it never see it.

Also pushed CodeRabbit follow-ups in the same batch
While I was in there I addressed the actionable items from the latest CodeRabbit pass:
Let me know if the placement looks right or you'd like the section collapsed behind an accordion / gated on multi-GPU detection; happy to iterate.
Actionable comments posted: 4
Duplicate comments (1)
packages/whisper-wrapper/src/index.ts (1)
40-49: Potential issue | Major
Keep `gpuDevice` authoritative over `gpu: false`. Line 48 can undo lines 44-46 when callers pass `{ gpuDevice: 1, gpu: false }`, so a specific GPU can still initialize with GPU execution disabled. Apply `opts.gpu` first, then force `gpu=true` when `gpuDevice` is present.
Proposed fix
```diff
 this.binding = loadBinding(loadOpts);
 const initOpts: Record<string, unknown> = { model: modelPath };
+if (opts?.gpu !== undefined) initOpts.gpu = opts.gpu;
 // If the caller picked a specific GPU, make sure GPU execution is enabled.
 // Otherwise the native addon would set gpu_device but leave use_gpu=false,
 // silently falling back to CPU.
 if (opts?.gpuDevice !== undefined) {
   initOpts.gpu = true;
   initOpts.gpu_device = opts.gpuDevice;
 }
-if (opts?.gpu !== undefined) initOpts.gpu = opts.gpu;
 if (opts?.flashAttn !== undefined) initOpts.flash_attn = opts.flashAttn;
```
Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/whisper-wrapper/src/index.ts` around lines 40 - 49, The init logic allows opts.gpu to override a previously-set gpu=true from opts.gpuDevice, so callers like {gpuDevice: 1, gpu: false} will disable GPU despite a specific device being requested; change the order so you apply opts.gpu to initOpts first, then apply the gpuDevice branch (set initOpts.gpu = true and initOpts.gpu_device = opts.gpuDevice) last to ensure gpuDevice is authoritative; update the code paths around initOpts, opts.gpu, opts.gpuDevice, and gpu_device accordingly while leaving flash_attn handling unchanged.
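The ordering rule in this fix can be checked in isolation. A sketch with a hypothetical `buildInitOpts` helper condensing the constructor logic:

```typescript
interface WhisperOpts {
  gpu?: boolean;
  gpuDevice?: number;
  flashAttn?: boolean;
}

function buildInitOpts(modelPath: string, opts?: WhisperOpts): Record<string, unknown> {
  const initOpts: Record<string, unknown> = { model: modelPath };
  // Apply the generic gpu flag first...
  if (opts?.gpu !== undefined) initOpts.gpu = opts.gpu;
  // ...then let a specific device win, so { gpuDevice: 1, gpu: false } still
  // enables GPU execution instead of setting gpu_device with use_gpu=false.
  if (opts?.gpuDevice !== undefined) {
    initOpts.gpu = true;
    initOpts.gpu_device = opts.gpuDevice;
  }
  if (opts?.flashAttn !== undefined) initOpts.flash_attn = opts.flashAttn;
  return initOpts;
}
```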
Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/desktop/src/renderer/main/pages/settings/advanced/hardware-section.tsx`:
- Around line 116-144: The Select can end up with a value like "gpu:N" that has
no matching SelectItem when snapshot.gpus doesn't include that index, causing a
blank display; update the component handling around selection/gpus (the
selection state and handleChange) to detect when selection starts with "gpu:"
but gpus doesn't contain that index and either (a) render a disabled/unavailable
SelectItem for that exact `gpu:N` value (using the same SelectItem component)
with a label like "GPU N (not detected)" so the current persisted choice remains
visible, or (b) prompt/coerce the selection back to "auto" by calling
handleChange("auto") after user confirmation; use formatVram and gpu.index to
build the fallback label so it matches existing items.
In `@apps/desktop/src/services/transcription-service.ts`:
- Around line 79-87: The compute-change listener serialises reset/dispose with
transcriptionMutex but preload/model-load paths (preloadWhisperModel,
handleModelChange) call whisperProvider.preloadModel outside that lock, allowing
concurrent dispose during init; fix by guarding all WhisperProvider lifecycle
ops with the same lock or by always acquiring both modelLoadMutex and
transcriptionMutex in a consistent order before calling
whisperProvider.preloadModel, whisperProvider.reset, or whisperProvider.dispose
(e.g. acquire modelLoadMutex then transcriptionMutex everywhere), and update
preloadWhisperModel() and handleModelChange() to acquire those mutexes before
interacting with whisperProvider so reset/dispose cannot interleave with model
preload/load.
In `@docs/CHANGELOG-compute-device-selector.md`:
- Around line 41-44: The changelog entry incorrectly states that WhisperProvider
disposes its worker when settings change; update the lifecycle note to reflect
that TranscriptionServiceβs compute-changed listener performs teardown. Edit the
sentence referencing `TranscriptionService`, `WhisperProvider`, and
`initializeModel` so it says TranscriptionService listens for `compute-changed`
and disposes/tears down the worker when settings change, while WhisperProvider
only receives forwarded compute settings during `initializeModel`.
- Line 61: The fenced code block needs an explicit language to satisfy
markdownlint MD040; update the triple-backtick fence that currently has no
language to use "text" and ensure the listed path
"packages/whisper-wrapper/addon/addon.cpp" remains inside the fence so the block
becomes ```text ... ```; locate the empty fence in the CHANGELOG entry and
replace it with a fenced block starting with ```text and ending with ``` to fix
the lint.
---
Duplicate comments:
In `@packages/whisper-wrapper/src/index.ts`:
- Around lines 40-49: The init logic allows opts.gpu to override a
previously-set gpu=true from opts.gpuDevice, so a caller passing
{gpuDevice: 1, gpu: false} disables the GPU despite requesting a specific
device. Reorder so opts.gpu is applied to initOpts first and the gpuDevice
branch (set initOpts.gpu = true and initOpts.gpu_device = opts.gpuDevice) runs
last, making gpuDevice authoritative; leave flash_attn handling unchanged.
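A sketch of the suggested ordering. Field names (`gpu`, `gpuDevice`, `flashAttn`, `gpu_device`, `flash_attn`) come from the review comment; the function shape and type names are assumptions standing in for the real init plumbing in index.ts:

```typescript
// Assumed JS-side options shape (camelCase, per the wrapper API).
interface WhisperInitOptions {
  gpu?: boolean;
  gpuDevice?: number;
  flashAttn?: boolean;
}

// Assumed native-side options shape (snake_case, per whisper.cpp).
interface NativeInitOpts {
  gpu?: boolean;
  gpu_device?: number;
  flash_attn?: boolean;
}

function buildInitOpts(opts: WhisperInitOptions): NativeInitOpts {
  const initOpts: NativeInitOpts = {};
  if (typeof opts.gpu === "boolean") {
    initOpts.gpu = opts.gpu; // generic toggle applied first
  }
  if (typeof opts.gpuDevice === "number") {
    // Applied last so a specific device is authoritative:
    // {gpuDevice: 1, gpu: false} still ends up with gpu enabled.
    initOpts.gpu = true;
    initOpts.gpu_device = opts.gpuDevice;
  }
  if (typeof opts.flashAttn === "boolean") {
    initOpts.flash_attn = opts.flashAttn; // untouched by the reorder
  }
  return initOpts;
}
```

With the branches in this order, the contradictory `{gpuDevice: 1, gpu: false}` input resolves to `{gpu: true, gpu_device: 1}` instead of silently dropping the device request.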
Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 46b545b5-1963-4e4f-9688-7f8d939a8cde
Files selected for processing (15):
- README.md
- apps/desktop/src/i18n/locales/en.json
- apps/desktop/src/i18n/locales/es.json
- apps/desktop/src/pipeline/providers/transcription/whisper-provider.ts
- apps/desktop/src/renderer/main/pages/settings/advanced/hardware-section.tsx
- apps/desktop/src/renderer/main/pages/settings/advanced/index.tsx
- apps/desktop/src/services/hardware-detection-service.ts
- apps/desktop/src/services/transcription-service.ts
- apps/desktop/src/trpc/routers/hardware.ts
- docs/CHANGELOG-compute-device-selector.md
- docs/building-whisper-cuda-windows.md
- docs/hardware-selection.md
- packages/whisper-wrapper/addon/addon.cpp
- packages/whisper-wrapper/src/index.ts
- packages/whisper-wrapper/src/loader.ts
Files skipped from review due to trivial changes (3):
- README.md
- docs/hardware-selection.md
- docs/building-whisper-cuda-windows.md
Files skipped from review as they are similar to previous changes (5):
- packages/whisper-wrapper/addon/addon.cpp
- apps/desktop/src/pipeline/providers/transcription/whisper-provider.ts
- apps/desktop/src/trpc/routers/hardware.ts
- apps/desktop/src/i18n/locales/es.json
- apps/desktop/src/i18n/locales/en.json
- transcription-service.ts: compute-changed reset now runs under modelLoadMutex + transcriptionMutex (in that order), and preloadWhisperModel() takes modelLoadMutex, so a setting change can no longer race a startup preload or handleModelChange() preload.
- hardware-section.tsx: render a "GPU #N (not detected)" fallback item when the persisted compute.gpuDevice is no longer in the snapshot, so the Select doesn't render blank after a GPU is unplugged / detection temporarily fails. New i18n key settings.hardware.computeDevice.options.unavailable in en/es.
- CHANGELOG-compute-device-selector.md: correct the lifecycle note (worker disposal lives in TranscriptionService, not WhisperProvider) and add a "text" language tag to the "Files touched" fence for markdownlint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
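The hardware-section.tsx fallback described above can be sketched as follows; the `GpuInfo` type, label text, and helper names are illustrative assumptions, not the component's real code:

```typescript
// Assumed shape of one GPU entry in the hardware snapshot.
interface GpuInfo {
  index: number;
  name: string;
  vramMb?: number;
}

// Build a human-readable label, appending VRAM when known.
function gpuOptionLabel(gpu: GpuInfo): string {
  const vram = gpu.vramMb ? ` (${(gpu.vramMb / 1024).toFixed(0)} GB)` : "";
  return `GPU #${gpu.index}: ${gpu.name}${vram}`;
}

// Build Select items; if the persisted device index is missing from the
// snapshot (GPU unplugged, detection failed), synthesize a fallback item
// so the Select never renders blank.
function selectItems(snapshot: GpuInfo[], persistedDevice?: number) {
  const items = snapshot.map((gpu) => ({
    value: gpu.index,
    label: gpuOptionLabel(gpu),
  }));
  if (
    persistedDevice !== undefined &&
    !snapshot.some((gpu) => gpu.index === persistedDevice)
  ) {
    items.push({
      value: persistedDevice,
      label: `GPU #${persistedDevice} (not detected)`,
    });
  }
  return items;
}
```

In the real component the "(not detected)" text would come from the new `settings.hardware.computeDevice.options.unavailable` i18n key rather than a literal.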
All review feedback addressed — ready for another look.

What changed after your question: Placement — moved the selector under Settings → Advanced as you suggested, dropping the dedicated Hardware page.

CodeRabbit passes — three review passes, all clean now:
Highlights of what was fixed:
Verification
Happy to iterate on anything else you spot — and if you'd rather gate the section behind "multi-GPU detected" or an accordion I can do that too.
Lets users pick which CPU or GPU runs local Whisper models from a new Settings → Hardware tab. Auto (current behaviour) is the default; CPU forces the CPU binary; picking a specific GPU forwards its index to whisper.cpp's `whisper_context_params.gpu_device`.

Also folds in the fixes that were required to build the CUDA whisper binary on Windows with CUDA 13.x — they are prerequisites for the GPU path to be exercised at all.
Whisper wrapper

- addon.cpp: read `gpu_device` option and propagate it.
- index.ts: expose `gpu`, `gpuDevice`, `flashAttn`, `threads` and `preferredBackend`; actually forward them to `binding.init`.
- loader.ts: accept a `preferredBackend` / `WHISPER_NATIVE_BACKEND` env override; cache the binding per backend.
- CMakeLists.txt: pass `/Zc:preprocessor` to MSVC (required by CCCL in CUDA 13) and iterate CMAKE_JS_INC natively so paths containing spaces resolve correctly.

Desktop app

- db/schema.ts: new `compute` section on AppSettingsData.
- services/settings-service.ts: get/set compute settings with a `compute-changed` event.
- services/hardware-detection-service.ts (new): enumerate GPUs via the existing `systeminformation` dep with a one-shot cache.
- trpc: new `hardware.getSnapshot` query and `settings.{get,set}ComputeSettings` procedures.
- New `/settings/hardware` route and page with the selector, detected hardware list and refresh button; sidebar entry and title mapping added.
- `en` and `es` locales for the new strings (ja / zh-TW deferred to native translators).

Docs
Tested on Windows 11 + RTX 4080 Laptop + CUDA 13.2: rebuilt the wrapper with the CMake fixes, copied the two cuBLAS DLLs next to whisper.node, launched the app, dictated a sentence. Log confirmed `whisper_native_binding: cuda`; GPU peaked at 100% while CPU stayed low. macOS (Metal) and Linux (Vulkan) paths were not re-tested on this branch; they rely on the existing loader fallback.