Releases: tetherto/qvac
BCI Whispercpp Test Assets v0.1.0
Model files and test fixtures for @qvac/bci-whispercpp integration tests and examples.
QVAC SDK v0.9.0
📦 NPM: https://www.npmjs.com/package/@qvac/sdk/v/0.9.0
This release significantly expands the SDK's capabilities with finetuning support, image generation via Stable Diffusion, duplex streaming transcription, and a suspend/resume lifecycle for mobile apps. Delegation gets healthier with heartbeat probes and remote cancellation. Tool-calling completions are now more robust with KV cache fixes, and a new profiler gives deep visibility into operation performance. React Native compatibility improves with Buffer-free diffusion and better progress event handling.
💥 Breaking Changes
ping() Replaced by heartbeat()
The ping() API has been replaced by heartbeat(), which supports both local and delegated (P2P) health checks. This enables proactive provider status monitoring before and during delegated inference.
Before:

```js
import { ping } from "@qvac/sdk";
const pong = await ping();
```

After:

```js
import { heartbeat } from "@qvac/sdk";

// Local heartbeat (replaces ping)
await heartbeat();

// Delegated heartbeat — check if a remote provider is alive
await heartbeat({
  delegate: { topic: "topicHex", providerPublicKey: "peerHex", timeout: 3000 },
});
```

🔌 New APIs
Finetuning
The SDK now supports LoRA finetuning of loaded LLM models. Training runs can be started, paused, resumed, cancelled, and inspected — all through a single finetune() function. Progress streams provide real-time loss and step metrics.
```js
import { finetune } from "@qvac/sdk";

const handle = finetune({
  modelId,
  options: {
    trainDatasetDir: "./dataset/train",
    validation: { type: "dataset", path: "./dataset/eval" },
    outputParametersDir: "./artifacts/lora",
    numberOfEpochs: 2,
  },
});

for await (const progress of handle.progressStream) {
  console.log(progress.global_steps, progress.loss);
}

const result = await handle.result;
```

Operations: `start`, `resume`, `pause`, `cancel`, `getState`. Omit `operation` to let the addon auto-detect whether to start fresh or resume.
Image Generation (Diffusion)
Stable Diffusion models are now integrated as a first-class SDK capability. Load a diffusion model and generate images with step-by-step progress tracking.
```js
import { loadModel, diffusion, SD_V2_1_1B_Q8_0 } from "@qvac/sdk";

const modelId = await loadModel({
  modelSrc: SD_V2_1_1B_Q8_0,
  modelType: "diffusion",
  modelConfig: { prediction: "v" },
});

const { progressStream, outputs, stats } = diffusion({
  modelId,
  prompt: "a cat sitting on a windowsill",
  width: 512,
  height: 512,
  steps: 20,
});

for await (const { step, totalSteps } of progressStream) {
  console.log(`${step}/${totalSteps}`);
}

const buffers = await outputs;
```

Duplex Streaming Transcription (transcribeStream)
A new bidirectional streaming API lets you feed audio incrementally and receive transcription segments as speech is detected, enabling real-time voice interfaces.
```js
import { transcribeStream } from "@qvac/sdk";

const session = await transcribeStream({ modelId });
session.write(audioChunk);
session.end();

for await (const text of session) {
  console.log(text);
}

session.destroy();
```

The previous single-shot `transcribeStream({ modelId, audioChunk })` pattern still works but logs a deprecation warning — use `transcribe()` for batch transcription.
Suspend/Resume Lifecycle
Mobile and desktop apps can now cleanly suspend and resume SDK operations when the app enters the background or foreground, preventing resource leaks and stale state.
```js
import { suspend, resume } from "@qvac/sdk";

await suspend(); // app going to background
await resume(); // app returning to foreground
```

Delegated Cancellation
Remote inference and downloads running on a delegation provider can now be cancelled from the consumer side.
```js
import { cancel } from "@qvac/sdk";

await cancel({ operation: "inference", modelId: "delegated-model-id" });

await cancel({
  operation: "downloadAsset",
  downloadKey: "download-key",
  delegate: { topic: "topicHex", providerPublicKey: "peerHex" },
});
```

Delegation Health Check Timeout
A new `healthCheckTimeout` option on the `delegate` config lets you control how long the RPC health probe waits before marking a cached connection as stale and reconnecting.
```js
await loadModel({
  modelSrc: LLAMA_3_2_1B_INST_Q4_0,
  modelType: "llm",
  delegate: {
    topic: topicHex,
    providerPublicKey,
    timeout: 30_000,
    healthCheckTimeout: 2000,
  },
});
```

Addon Stats Across All Operations
All inference operations now return detailed performance stats from the underlying addons. Completion, transcription, translation, TTS, and embedding responses all include stats like tokensPerSecond, timeToFirstToken, audioDuration, and the new backendDevice field ("cpu" or "gpu").
```js
const { embedding, stats } = await embed({ modelId, text: "hello" });
console.log(stats?.backendDevice); // "cpu" | "gpu"
```

✨ Features
- CLD2 language detection is now integrated into the SDK for automatic language identification.
- OCR plugin updated to work with `@qvac/ocr-onnx@0.4.0`.
- TTS interface refactored — the TTS package uses a new files-based constructor with absolute paths, replacing the legacy loader pattern.
🐞 Bug Fixes
- KV cache preserved across tool-call round-trips — multi-turn tool-calling completions no longer lose context between rounds.
- KV cache save race condition fixed in tool-calling completions — concurrent saves no longer corrupt the cache.
- `<think>` blocks stripped before parsing tool calls — reasoning traces from models like DeepSeek no longer break tool call extraction.
- Progress event buffering — throttled progress events are now buffered instead of dropped, ensuring no updates are lost during fast download sequences.
- RPC progress throttling — progress frames are throttled to prevent `Maximum call stack size exceeded` errors during high-frequency updates.
- Clean process exit — the Bare runtime process global is now handled correctly, and RPC close triggers a clean exit.
- Connection teardown race in `closeConnections` resolved — concurrent teardowns no longer deadlock.
- React Native diffusion compatibility — `Buffer` replaced with `Uint8Array` in the diffusion client, fixing React Native builds.
- Download progress accuracy — registry downloads now use network-layer progress instead of disk I/O measurements.
- VLM addon classification — the model registry was regenerated to fix incorrect VLM addon type assignments.
- ONNX companion files — `.onnx.data` companion files are now correctly resolved during registry model resolution.
- Security hardening — multiple code scanning alerts resolved across SDK pod packages.
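The progress-event buffering fix can be illustrated with a minimal sketch. This is an illustrative pattern only, not the SDK's actual throttle code; `createBufferedThrottle` is a hypothetical helper showing why buffering (rather than dropping) throttled events guarantees the latest update is always delivered:

```js
// Illustrative sketch (not the SDK's implementation): a throttle that
// buffers the most recent event during the cooldown window and flushes it
// afterwards, so the final progress value is never lost.
function createBufferedThrottle(emit, intervalMs) {
  let lastEmit = 0;
  let pending = null;
  let timer = null;

  function flush() {
    timer = null;
    if (pending !== null) {
      lastEmit = Date.now();
      const event = pending;
      pending = null;
      emit(event);
    }
  }

  return function push(event) {
    const now = Date.now();
    if (now - lastEmit >= intervalMs) {
      lastEmit = now;
      emit(event); // fast path: emit immediately
    } else {
      pending = event; // buffer instead of dropping
      if (!timer) timer = setTimeout(flush, intervalMs - (now - lastEmit));
    }
  };
}
```

With a drop-based throttle, a burst ending at 100% could discard the terminal event; here the last buffered event is flushed once the window elapses.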
📦 Model Changes
Model registry updated: 312 → 653 (+341). See model changes for the full list.
- 295 Bergamot translation models — offline NMT covering 42 language pairs bidirectional (az, be, bg, bn, bs, ca, da, de, el, et, fa, fi, gu, he, hi, hr, hu, id, is, kn, ko, lt, lv, ml, ms, mt, nb, nl, nn, pl, ro, sk, sl, sq, sr, sv, ta, te, tr, uk, vi). Each pair includes model weights, lexical shortlists, vocabularies, and metadata.
- 5 FLUX models — FLUX.2 Klein 4B in Q4_0, Q4_K_M, Q6_K, Q8_0 quantizations plus VAE.
- 4 Stable Diffusion models — SD v2.1 1B (Q4_0, Q8_0) and SDXL Base 1.0 3B (Q4_0, Q8_0).
- 17 TTS Supertonic models — Official Supertone FP32 variants including duration predictor, text encoder, vocoder, config, unicode indexer, and 10 voice styles.
- 1 LLM model — Qwen3 4B (Q4_K_M).
🧹 Other Changes
- Updated addon dependencies: `@qvac/tts-onnx` to v0.6.7, `@qvac/transcription-whispercpp` to latest, Parakeet to v0.2.7, `@qvac/diffusion-cpp` to ^0.1.3.
- Replaced FeatureBase support links with Discord channel.
- Bumped `bare-crypto` and `@qvac/rag` for runtime stability.
- Renamed `@tetherto` npm references to `@qvac` namespace across READMEs.
- Improved test infrastructure with SDK test bootstrap and CI model caching.
QVAC LLM Addon v0.16.0
This release migrates the LLM addon off BaseInference inheritance and the WeightsProvider download layer onto the composable createJobHandler + exclusiveRunQueue utilities from @qvac/infer-base@^0.4.0. The constructor signature is replaced with a single object whose files.model field is an ordered array of absolute paths and files.projectionModel is an optional absolute path for multimodal models. This is a breaking change — every caller must update.
Breaking Changes
Constructor signature: single object with files, no Loader
LlmLlamacpp now takes a single { files, config, logger?, opts? } object. The old Loader + diskPath + modelName + two-arg (args, config) shape is gone — callers pre-resolve absolute paths and supply them as files.model.
```js
// BEFORE (≤ 0.15.x)
const FilesystemDL = require('@qvac/dl-filesystem')
const loader = new FilesystemDL({ dirPath: '/models' })
const model = new LlmLlamacpp({
  loader,
  modelName: 'Qwen3-1.7B-Q4_0.gguf',
  diskPath: '/models',
  logger: console,
  opts: { stats: true }
}, { ctx_size: '4096', gpu_layers: '99' })

// AFTER (0.16.0)
const model = new LlmLlamacpp({
  files: {
    model: ['/models/Qwen3-1.7B-Q4_0.gguf']
  },
  config: { ctx_size: '4096', gpu_layers: '99' },
  logger: console,
  opts: { stats: true }
})
```

For sharded models the caller passes the full ordered list — the `<basename>.tensors.txt` companion first, followed by every `<basename>-NNNNN-of-MMMMM.gguf` shard in ascending order. For multimodal models, `files.projectionModel` carries the absolute path to the mmproj file:
```js
const model = new LlmLlamacpp({
  files: {
    model: [
      '/models/medgemma-4b-it-Q4_1.tensors.txt',
      '/models/medgemma-4b-it-Q4_1-00001-of-00005.gguf',
      '/models/medgemma-4b-it-Q4_1-00002-of-00005.gguf',
      '/models/medgemma-4b-it-Q4_1-00003-of-00005.gguf',
      '/models/medgemma-4b-it-Q4_1-00004-of-00005.gguf',
      '/models/medgemma-4b-it-Q4_1-00005-of-00005.gguf'
    ],
    projectionModel: '/models/mmproj-model-f16.gguf'
  },
  config: { gpu_layers: '99' }
})
```

BaseInference inheritance and WeightsProvider removed
LlmLlamacpp no longer extends BaseInference and no longer touches the WeightsProvider download layer. The class composes createJobHandler and exclusiveRunQueue from @qvac/infer-base@^0.4.0 directly. Public lifecycle methods (load / run / finetune / pause / cancel / unload / getState) are unchanged in shape, but downloadWeights and the loader-based progress callbacks are gone — the caller is responsible for placing files on disk before constructing the model.
In-memory streaming from network sources (URLs, Hyperdrive) is no longer supported; previously it was possible through the Loader abstraction. The SDK does not currently use it (models are stored to disk first), and support can be re-added if the SDK needs that feature in the future.
Dependency changes
- `@qvac/infer-base` bumped from `^0.3.0` to `^0.4.0`.
- `bare-fs` is now a runtime dependency (used to stream shards from disk).
- `@qvac/dl-base` and `@qvac/dl-filesystem` are no longer used by this package and have been removed from `devDependencies`.
getState() returns a narrower shape
getState() previously returned { configLoaded, weightsLoaded, destroyed } (the three-field shape inherited from BaseInference). It now returns { configLoaded } only. The weightsLoaded and destroyed fields are gone — weightsLoaded collapsed into configLoaded because the refactored load() does both in one step, and destroyed is no longer tracked since unload() resets configLoaded and nulls the addon handle instead. Callers reading state.weightsLoaded or state.destroyed must switch to state.configLoaded.
Public methods removed from LlmLlamacpp
LlmLlamacpp previously exposed these methods via BaseInference inheritance, all of which are now gone:
- `downloadWeights(onDownloadProgress, opts)` — the download layer is removed; the caller places files on disk and passes absolute paths in `files.model` / `files.projectionModel`.
- `unpause()` / `stop()` — BaseInference job-lifecycle helpers. The refactor still exposes `pause()` and `cancel()`; `unpause` is superseded by issuing a new `run()` after `cancel()`.
- `status()` — replaced by `getState()` for the static readiness flag; per-job state is observed via the `QvacResponse` returned by `run()`.
- `destroy()` — folded into `unload()`, which now both releases native resources and nulls `this.addon`.
- `getApiDefinition()` — no longer exposed; consumers should import types from `index.d.ts`.
load() takes no arguments
load() previously forwarded ...args through BaseInference.load into LLM's _load(closeLoader, onDownloadProgress). Both arguments are gone — closeLoader is meaningless without a Loader, and onDownloadProgress is superseded by the caller owning download-and-placement before construction. Call await model.load() with no arguments.
Type exports removed from index.d.ts
The following exports are no longer part of the package's public type surface because the loader/download layer they described is gone: ReportProgressCallback, Loader, DownloadWeightsOptions, DownloadResult. TypeScript consumers importing any of these must update to the new LlmLlamacppArgs / files shape.
Features
Constructor input validation
The constructor now throws TypeError('files.model must be a non-empty array of absolute paths') when files or files.model is missing or empty. This produces a clear error for callers porting old code instead of a confusing Cannot read properties of undefined.
run()-before-load() guard
Calling run() before load() now throws Error('Addon not initialized. Call load() first.') instead of dereferencing null and crashing. finetune() already had this guard since the previous release.
load() is now idempotent when already loaded
A second load() call on an already-loaded instance is now a silent no-op instead of unloading and reloading. This aligns with the ReadyResource pattern used elsewhere in QVAC and prevents accidental double-loads from triggering expensive work. Callers that intentionally want to swap weights must call unload() first (which clears configLoaded) and then load() again.
Crash-safe shard streaming
If _streamShards() or addon.activate() throws mid-load (for example a corrupted shard file or a native init failure), the partially-initialized addon is now best-effort-unloaded and this.addon is reset to null. A subsequent load() call starts cleanly instead of leaking a zombie native instance.
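The recovery path can be sketched as follows. This is illustrative only; `crashSafeLoad` and its `activate` / `unload` calls are stand-ins for the pattern described, not the addon's actual internals:

```js
// Illustrative sketch of crash-safe activation (not the addon's real code).
// If activation throws mid-load, the partially-initialized handle is
// best-effort released and the reference cleared, so a retry starts clean.
async function crashSafeLoad(self, createAddon) {
  self.addon = createAddon();
  try {
    await self.addon.activate(); // may throw on a corrupt shard or native init failure
  } catch (err) {
    try { await self.addon.unload(); } catch {} // best-effort cleanup
    self.addon = null; // no zombie native instance left behind
    throw err;
  }
}
```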
Restored JSDoc on FinetuneOptions
Every FinetuneOptions field carries a /** … */ doc comment again, including the default values (numberOfEpochs = 1, learningRate = 1e-4, batchSize = 128, …) so IDE tooltips show them without needing to read docs/finetuning.md.
Bug Fixes
unload() clears the addon reference
unload() now sets this.addon = null after await this.addon.unload(), so post-unload cancel() / pause() / run() calls hit the explicit guards rather than dereferencing a disposed native handle. pause(), cancel(), and the job-handler cancel closure all use optional chaining for the same reason.
Removed dead _isSuppressedNoResponseLog filter
The _createFilteredLogger infrastructure that wrapped the user-supplied logger to swallow 'No response found for job' warnings was tied to the old BaseInference _jobToResponse Map. The new architecture cannot emit that message at all, so the filter, the wrapped logger, and the _originalLogger indirection are all removed. The user-supplied logger is now used directly.
load() is serialized through the exclusive run queue
load() is now routed through the same exclusiveRunQueue used by run(), finetune(), and unload(). Previously two overlapping load() calls on the same instance could both pass the configLoaded guard before it flipped to true, both stream shards into and activate the native addon, and clobber this.addon — leaking one native handle. Concurrent load() on a single instance is now safe.
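The serialization guarantee can be sketched with a minimal promise-chain queue. This is an illustrative pattern in the spirit of `exclusiveRunQueue`, not the actual `@qvac/infer-base` implementation:

```js
// Minimal sketch of an exclusive run queue (illustrative, not the real
// @qvac/infer-base code). Each submitted task waits for the previous one
// to settle, so overlapping load() calls cannot interleave.
function createExclusiveRunQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const result = tail.then(task, task); // run after the previous task settles
    tail = result.catch(() => {}); // keep the chain alive on failure
    return result;
  };
}
```

With every `load()`, `run()`, and `unload()` routed through one such queue, a second `load()` only starts after the first has finished, so it observes `configLoaded === true` and returns as a no-op instead of activating a second native handle.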
Constructor rejects non-absolute path entries
Each entry in files.model is now validated with path.isAbsolute() (matching the existing error-message contract), and the same check now applies to the optional files.projectionModel — previously it had no validation at all. Relative paths are rejected at construction time instead of bubbling up from bare-fs or the native load.
Pull Requests
- #1494 - chore[bc]: LLM addon interface refactor — remove BaseInference and WeightsProvider
QVAC Embed Addon v0.14.0
This release migrates the embed addon off BaseInference inheritance and the WeightsProvider download layer onto the composable createJobHandler + exclusiveRunQueue utilities from @qvac/infer-base@^0.4.0. The constructor signature is replaced with a single object whose files.model field is an ordered array of absolute paths, mirroring the parallel LLM and diffusion addon refactors. This is a breaking change — every caller must update.
Breaking Changes
Constructor signature: single object with files, no Loader
GGMLBert now takes a single { files, config?, logger?, opts? } object. The old Loader + diskPath + modelName + two-arg (args, config) shape is gone — callers pre-resolve absolute paths and supply them as files.model.
```js
// BEFORE (≤ 0.13.x)
const FilesystemDL = require('@qvac/dl-filesystem')
const loader = new FilesystemDL({ dirPath: '/models' })
const model = new GGMLBert({
  loader,
  modelName: 'bge-small-en-v1.5-q4_0.gguf',
  diskPath: '/models',
  logger: console,
  opts: { stats: true }
}, { device: 'gpu', batch_size: '512' })

// AFTER (0.14.0)
const model = new GGMLBert({
  files: {
    model: ['/models/bge-small-en-v1.5-q4_0.gguf']
  },
  config: { device: 'gpu', batch_size: '512' },
  logger: console,
  opts: { stats: true }
})
```

For sharded models the caller passes the full ordered list — the `<basename>.tensors.txt` companion first, followed by every `<basename>-NNNNN-of-MMMMM.gguf` shard in ascending order:
```js
const model = new GGMLBert({
  files: {
    model: [
      '/models/big-embed-model.tensors.txt',
      '/models/big-embed-model-00001-of-00003.gguf',
      '/models/big-embed-model-00002-of-00003.gguf',
      '/models/big-embed-model-00003-of-00003.gguf'
    ]
  },
  config: { device: 'gpu' }
})
```

BaseInference inheritance and WeightsProvider removed
GGMLBert no longer extends BaseInference and no longer touches the WeightsProvider download layer. The class composes createJobHandler and exclusiveRunQueue from @qvac/infer-base@^0.4.0 directly. Public lifecycle methods (load / run / cancel / unload / getState) are unchanged in shape, but downloadWeights and the loader-based progress callbacks are gone — the caller is responsible for placing files on disk before constructing the model.
In-memory streaming from network sources (URLs, Hyperdrive) is no longer supported; previously it was possible through the Loader abstraction. The SDK does not currently use it (models are stored to disk first), and support can be re-added if the SDK needs that feature in the future.
Dependency changes
- `@qvac/infer-base` bumped from `^0.2.2` to `^0.4.0`.
- `bare-fs` is now a runtime dependency (used to stream shards from disk).
- `@qvac/dl-filesystem` and `@qvac/dl-hyperdrive` are no longer used by this package and have been removed from `devDependencies` / `peerDependencies`.
getState() returns a narrower shape
getState() previously returned { configLoaded, weightsLoaded, destroyed } (the three-field shape inherited from BaseInference). It now returns { configLoaded } only. The weightsLoaded and destroyed fields are gone — weightsLoaded collapsed into configLoaded because the refactored load() does both in one step, and destroyed is no longer tracked since unload() resets configLoaded and nulls the addon handle instead. Callers reading state.weightsLoaded or state.destroyed must switch to state.configLoaded.
Public methods removed from GGMLBert
GGMLBert previously exposed these methods via BaseInference inheritance, all of which are now gone:
- `downloadWeights(onDownloadProgress, opts)` — the download layer is removed; the caller places files on disk and passes absolute paths in `files.model`.
- `pause()` / `unpause()` / `stop()` — BaseInference job-lifecycle helpers. The refactor uses `createJobHandler` directly; use `cancel()` to terminate an in-flight run.
- `status()` — replaced by `getState()` for the static readiness flag; per-job state is observed via the `QvacResponse` returned by `run()`.
- `destroy()` — folded into `unload()`, which now both releases native resources and nulls `this.addon`.
- `getApiDefinition()` — no longer exposed; consumers should import types from `index.d.ts`.
load() takes no arguments
load() previously forwarded ...args through BaseInference.load into embed's _load(closeLoader, reportProgressCallback). Both arguments are gone — closeLoader is meaningless without a Loader, and reportProgressCallback is superseded by the caller owning download-and-placement before construction. Call await model.load() with no arguments.
Type exports removed from index.d.ts
The following exports are no longer part of the package's public type surface because the loader/download layer they described is gone: ReportProgressCallback, Loader, GGMLArgs, DownloadWeightsOptions, DownloadResult. TypeScript consumers importing any of these must update to the new GGMLBertArgs / files shape.
BertInterface outputCb signature: jobId dropped
The exported BertInterface class's constructor still takes (binding, configurationParams, outputCb), but the outputCb signature changed:
```ts
// BEFORE
(addon: unknown, event: string, jobId: number, data: unknown, error?: Error) => void

// AFTER
(addon: unknown, event: string, data: unknown, error?: Error) => void
```

The `jobId: number` argument is gone because `createJobHandler` owns the single active job directly; the wrapper no longer needs a per-job identifier in the callback chain. External callers constructing `BertInterface` with a custom `outputCb` must drop the third argument.
BertInterface.runJob return type
BertInterface.runJob(input) previously returned Promise<void>. It now returns Promise<boolean> — true if the job was accepted, false if the addon was already busy. GGMLBert uses this return to surface a busy error to the caller instead of silently dropping the job.
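The accepted/busy contract can be illustrated with a stand-in runner. This is a hypothetical class, not the real `BertInterface` (which wraps a native binding); it only demonstrates the boolean return described above:

```js
// Hypothetical stand-in showing the Promise<boolean> contract:
// true when the job was accepted, false when the runner is already busy.
class SingleJobRunner {
  #busy = false;

  async runJob(input) {
    if (this.#busy) return false; // already busy — job rejected, not dropped silently
    this.#busy = true;
    try {
      await new Promise((resolve) => setTimeout(resolve, 10)); // stand-in for real work
      return true; // job accepted
    } finally {
      this.#busy = false;
    }
  }
}
```

A caller can then surface a busy error instead of losing the job: if `await runner.runJob(input)` returns `false`, throw or retry.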
Features
Constructor input validation
The constructor now throws TypeError('files.model must be a non-empty array of absolute paths') when files or files.model is missing or empty. This produces a clear error for callers porting old code instead of a confusing Cannot read properties of undefined.
run()-before-load() guard
Calling run() before load() now throws Error('Addon not initialized. Call load() first.') instead of dereferencing null and crashing.
load() is now idempotent when already loaded
A second load() call on an already-loaded instance is now a silent no-op instead of unloading and reloading. This aligns with the ReadyResource pattern used elsewhere in QVAC and prevents accidental double-loads from triggering expensive work. Callers that intentionally want to swap weights must call unload() first (which clears configLoaded) and then load() again.
Crash-safe shard streaming
If _streamShards() or addon.activate() throws mid-load (for example a corrupted shard file or a native init failure), the partially-initialized addon is now best-effort-unloaded and this.addon is reset to null. A subsequent load() call starts cleanly instead of leaking a zombie native instance.
Bug Fixes
unload() clears the addon reference
unload() now sets this.addon = null after await this.addon.unload(), so post-unload cancel() / run() calls hit the explicit guards rather than dereferencing a disposed native handle. cancel() and the job-handler cancel closure both use optional chaining for the same reason.
Unknown addon events no longer pollute the output stream
_addonOutputCallback previously fed any non-stats / non-error event payload into response.output, including unknown events. It now logs unknown events at warn level (these indicate a native-layer change worth surfacing) and only forwards Embeddings payloads to the active response.
load() is serialized through the exclusive run queue
load() is now routed through the same exclusiveRunQueue used by run() and unload(). Previously two overlapping load() calls on the same instance could both pass the configLoaded guard before it flipped to true, both stream shards into and activate the native addon, and clobber this.addon — leaking one native handle. Concurrent load() on a single instance is now safe.
Constructor rejects non-absolute path entries
Each entry in files.model is now validated with path.isAbsolute() (matching the existing error-message contract). Relative paths are rejected at construction time instead of bubbling up from bare-fs or the native load.
Pull Requests
- #1493 - chore[bc]: embed addon interface refactor — remove BaseInference and WeightsProvider
QVAC Stable Diffusion Addon v0.3.0
This release migrates the diffusion addon off BaseInference inheritance and onto the composable createJobHandler + exclusiveRunQueue utilities from @qvac/infer-base@^0.4.0. The constructor signature is replaced with a single object whose files field carries absolute paths for every model component, mirroring the parallel embed and LLM addon refactors. This is a breaking change — every caller must update.
Breaking Changes
Constructor signature: single object with files instead of (args, config)
ImgStableDiffusion now takes a single { files, config, logger?, opts? } object. The old diskPath + modelName + per-component filename pattern is gone — callers pass absolute paths directly via files. Companion model fields are renamed (clipLModel → clipL, clipGModel → clipG, t5XxlModel → t5Xxl, llmModel → llm, vaeModel → vae).
```js
// BEFORE (≤ 0.2.x)
const model = new ImgStableDiffusion({
  diskPath: '/models',
  modelName: 'flux-2-klein-4b-Q8_0.gguf',
  llmModel: 'Qwen3-4B-Q4_K_M.gguf',
  vaeModel: 'flux2-vae.safetensors',
  logger: console
}, { threads: 8 })

// AFTER (0.3.0)
const model = new ImgStableDiffusion({
  files: {
    model: '/models/flux-2-klein-4b-Q8_0.gguf',
    llm: '/models/Qwen3-4B-Q4_K_M.gguf',
    vae: '/models/flux2-vae.safetensors'
  },
  config: { threads: 8 },
  logger: console,
  opts: { stats: true }
})
```

BaseInference inheritance removed
ImgStableDiffusion no longer extends BaseInference. The class composes createJobHandler and exclusiveRunQueue from @qvac/infer-base@^0.4.0 directly. The public lifecycle (load / run / cancel / unload / getState) is unchanged in shape; only construction differs. Internal helpers like _withExclusiveRun and _outputCallback are removed.
Caller owns absolute paths — addon no longer joins diskPath + filename
Callers that previously relied on the addon to resolve path.join(diskPath, filename) must now do that resolution themselves before constructing the model.
getState() returns a narrower shape
getState() previously returned { configLoaded, weightsLoaded, destroyed } (the three-field shape from BaseInference). It now returns { configLoaded } only. The weightsLoaded and destroyed fields are gone — weightsLoaded collapsed into configLoaded because the refactored load() does both in one step, and destroyed is no longer tracked since unload() resets configLoaded and nulls the addon handle instead. Callers reading state.weightsLoaded or state.destroyed must switch to state.configLoaded.
Public methods removed from ImgStableDiffusion
ImgStableDiffusion previously exposed these methods via BaseInference inheritance, all of which are now gone:
- `downloadWeights(onDownloadProgress, opts)` — the diffusion addon never used the loader in practice, but the inherited method was still present on the public surface. It is removed along with the base class.
- `pause()` / `unpause()` / `stop()` — BaseInference job-lifecycle helpers. The refactor uses `createJobHandler` directly; use `cancel()` to terminate an in-flight generation.
- `status()` — replaced by `getState()` for the static readiness flag; per-job state is observed via the `QvacResponse` returned by `run()`.
- `destroy()` — folded into `unload()`, which now both releases native resources and nulls `this.addon`.
- `getApiDefinition()` — no longer exposed; consumers should import types from `index.d.ts`.
cancel() no longer accepts a jobId
BaseInference.cancel(jobId) took an optional jobId argument. The refactor's cancel() is parameterless — there is always at most one active generation per instance, owned by createJobHandler. Any caller passing a jobId will have it ignored; update call sites to await model.cancel().
Features
Constructor input validation
The constructor now throws TypeError('files.model must be an absolute path string') when files.model is missing or not a string, or TypeError('files.model must be an absolute path (got: <value>)') when supplied as a relative path. This produces a clear error for callers porting old code instead of a confusing Cannot read properties of undefined. The same validation applies to optional companion fields (clipL, clipG, t5Xxl, llm, vae) when supplied.
run()-before-load() guard
Calling run() before load() now throws Error('Addon not initialized. Call load() first.') instead of crashing in native code. Covered by a new regression test in test/integration/api-behavior.test.js.
load() is now idempotent when already loaded
A second load() call on an already-loaded instance is now a silent no-op instead of unloading and reloading. This aligns with the ReadyResource pattern used elsewhere in QVAC and prevents accidental double-loads from triggering expensive work. Callers that intentionally want to swap weights must call unload() first (which clears configLoaded) and then load() again.
Broader split-layout detection
isSplitLayout now also triggers when only clipL or clipG is supplied. This closes a footgun where a FLUX.1 caller passing { model, clipL, clipG, vae } (without t5Xxl) would silently mis-route the diffusion model into the all-in-one path parameter and fail to load.
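The broadened check can be sketched as a simple predicate. Illustrative only, under the assumption that the text-encoder components (`clipL`, `clipG`, `t5Xxl`) drive the decision; this is not the addon's actual source:

```js
// Illustrative sketch: any supplied text-encoder component selects the
// split layout, so a FLUX.1-style { model, clipL, clipG, vae } call no
// longer falls through to the all-in-one path.
function isSplitLayout(files) {
  return Boolean(files.clipL || files.clipG || files.t5Xxl);
}
```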
Bug Fixes
unload() clears the addon reference
unload() now sets this.addon = null after await this.addon.unload(), so post-unload cancel() / run() calls hit the explicit if (!this.addon) guard rather than dereferencing a disposed native handle.
Unknown addon events no longer pollute the output stream
_addonOutputCallback previously had a fallthrough that pushed any non-error / non-image / non-stats event into response.output (including null and undefined). It now logs unknown events at debug level and does not feed them into the active response.
Crash-safe activation
If addon.activate() throws during _load() (for example a native init failure or a missing model file discovered late), the partially-initialized addon is now best-effort-unloaded, the native logger is released, and this.addon is reset to null. A subsequent load() call starts cleanly instead of leaking a zombie native instance.
load() is serialized through the exclusive run queue
load() is now routed through the same exclusiveRunQueue used by run() and unload(). Previously two overlapping load() calls on the same instance could both pass the configLoaded guard before it flipped to true, both allocate a native addon, and clobber this.addon — leaking one native handle. Concurrent load() on a single instance is now safe.
Pull Requests
- #1496 - chore[bc]: diffusion addon interface refactor — remove BaseInference
QVAC Stable Diffusion Addon v0.2.0
Added
- FLUX.2 img2img support with in-context conditioning (`ref_images`) via `init_image` parameter
- JS-side input validation for `readImageDimensions()` with buffer-length guards for truncated PNG/JPEG
- Regression tests for FLUX img2img prediction guard and truncated image handling
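The kind of buffer-length guard described above can be sketched for the PNG case. The `readPngDimensions` helper below is hypothetical, not the addon's actual `readImageDimensions()` implementation:

```js
// Hypothetical sketch of a truncation-safe PNG dimension reader. A valid
// PNG stores width and height as big-endian u32s at byte offsets 16-23 of
// the IHDR chunk, so any buffer shorter than 24 bytes must be rejected
// before indexing into it.
function readPngDimensions(buf) {
  const PNG_MAGIC = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  if (buf.length < 24) return null; // guard: truncated header
  if (!buf.subarray(0, 8).equals(PNG_MAGIC)) return null; // not a PNG
  return {
    width: buf.readUInt32BE(16),
    height: buf.readUInt32BE(20),
  };
}
```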
Changed
- FLUX img2img now requires explicit `prediction: 'flux2_flow'` in config to prevent silent fallback to SDEdit
- Updated `prediction` docstring to clarify auto-detection is insufficient for FLUX img2img
- Exported `readImageDimensions()` for testing and external use
Fixed
- `readImageDimensions()` now safely handles truncated/corrupt PNG and JPEG buffers
QVAC Stable Diffusion Addon v0.1.3
Changed
- README, `index.d.ts`, and `index.js` JSDoc no longer claim FLUX.1 support for `clipLModel` and `t5XxlModel`. The addon exposes SDXL, SD3, and FLUX.2-klein only — FLUX.1 was never wired through the JS layer. The example model name in the constructor JSDoc is also corrected to `flux-2-klein-4b-Q8_0.gguf`.
QVAC OCR Addon v0.4.2
Fixed
- Updated README to use current package name (`@qvac/ocr-onnx`) and monorepo paths
- Removed redundant `ensure-npm-public` job from on-merge workflow
QVAC OCR Addon v0.4.1
Fixed
- SIGABRT crash on process exit in OCR addon
- Use HTTPS instead of SSH for vcpkg registry URLs
Changed
- Updated OCR integration tests for `createJobHandler` migration
- Removed hyperdrive references and dependencies
- Renamed `dl-hyperdrive` and `dl-filesystem` package references
- Migrated qvac-devops to oss-action
QVAC LLM Addon v0.14.4
Changed
- Updated qvac-fabric dependency from 7248.2.1 to 7248.2.3, which fixes OpenCL kernel cache support on Android.
Added
- `openclCacheDir` option in `LlamaConfig` (`index.d.ts`): writable directory for the OpenCL kernel binary cache, required on Android for fast GPU startup.
- `cache-type-k` and `cache-type-v` options in `LlamaConfig` (`index.d.ts`): configure KV cache quantization types.