Releases: magcil/deepaudio-x

v0.4.4

20 Apr 11:31
18cd7fd

What's Changed

Verbosity Control for Training & Evaluation

Both Trainer and Evaluator now accept a verbose parameter, giving users explicit control over console output.

Trainer

  • verbose=True (default): epoch header, per-epoch metrics (train loss, val loss, elapsed time), and checkpointer messages are printed as before.
  • verbose=False: all epoch-level output is suppressed. EarlyStopper warnings and checkpointer error messages always surface regardless of this setting.
  • A Training Complete summary is always printed after training finishes, showing the best epoch, corresponding train/val losses, and checkpoint path.

Evaluator

  • verbose=True (default): classification report, confusion matrix, and average posteriors are printed after evaluation.
  • verbose=False: report output is suppressed. "Evaluation has finished." always prints.
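The gating pattern described above (epoch-level output behind the flag, final summaries unconditional) can be sketched in a few lines. The class and method names here are illustrative stand-ins, not the library's actual internals:

```python
class VerboseTrainer:
    """Illustrative sketch of the verbosity pattern, not the real Trainer."""

    def __init__(self, verbose: bool = True):
        self.verbose = verbose

    def _log(self, msg: str) -> None:
        # Epoch-level output is gated on the verbose flag.
        if self.verbose:
            print(msg)

    def train(self, epochs: int = 2) -> None:
        for epoch in range(1, epochs + 1):
            self._log(f"Epoch {epoch}/{epochs}: train_loss=..., val_loss=...")
        # The final summary prints unconditionally.
        print("Training Complete: best epoch, losses, checkpoint path")
```

With `verbose=False` only the closing summary reaches the console, which keeps logs clean in batch jobs while still recording the outcome.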

Internal

  • Removed ConsoleLogger callback — logging is now handled inline in Trainer and Evaluator.

Full Changelog: v0.4.3...v0.4.4

v0.4.3

06 Apr 04:58
d5413cc

What's New

MPS Support

  • Added DeviceName type alias ("cuda", "mps", "cpu")
  • Trainer and Evaluator now accept an explicit device parameter — users on Apple Silicon can pass device="mps" to leverage the Metal GPU backend
  • Fixed hardcoded "cuda" in PaSST mel spectrogram autocast guard — replaced with x.device.type
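A device-selection helper along these lines captures the idea; `resolve_device` is a hypothetical name for illustration (the library's actual API is simply passing `device="mps"` to `Trainer`/`Evaluator`), and the availability flags stand in for `torch.cuda.is_available()` / `torch.backends.mps.is_available()`:

```python
from typing import Literal, Optional

DeviceName = Literal["cuda", "mps", "cpu"]

def resolve_device(requested: Optional[DeviceName],
                   cuda_available: bool,
                   mps_available: bool) -> DeviceName:
    """Honor an explicit request; otherwise prefer CUDA, then MPS, then CPU."""
    if requested is not None:
        # An explicit user choice always wins.
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```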

DataLoader Efficiency

  • pin_memory is now conditional on CUDA — avoids unsupported behavior on MPS and CPU
  • Added persistent_workers=True — eliminates worker process respawn overhead at epoch boundaries
  • pad_collate_fn now uses torch.stack for equal-length batches (e.g. when segment_duration is set), falling back to pad_sequence for variable-length batches
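The stack-vs-pad dispatch can be shown with a pure-Python stand-in (real code would call `torch.stack` on equal-length tensors and `torch.nn.utils.rnn.pad_sequence` otherwise; the function below is a sketch, not the library's `pad_collate_fn`):

```python
def pad_collate(batch):
    """Stand-in for the stack-vs-pad dispatch, using plain lists."""
    lengths = {len(item) for item in batch}
    if len(lengths) == 1:
        # Equal-length batch (e.g. fixed segment_duration): a cheap stack.
        return [list(item) for item in batch]
    # Variable-length batch: right-pad every item with zeros to the longest.
    max_len = max(lengths)
    return [list(item) + [0] * (max_len - len(item)) for item in batch]
```

Stacking is cheaper because it copies each element exactly once into a contiguous block, with no per-item length bookkeeping.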

v0.4.2

02 Apr 13:29
5f8f41b

What's New

Checkpoint Architecture Persistence

  • AudioClassifier and Backbone now store a config dict capturing all constructor arguments (backbone name, pooling, num_classes, etc.)
  • Checkpoints (.pt files) now contain both state_dict and config — fully self-describing, no need to remember how the model was built.
  • New from_checkpoint(path) classmethod reconstructs model architecture + weights in one call:
    model = AudioClassifier.from_checkpoint("checkpoint.pt")
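The self-describing checkpoint pattern looks roughly like this; `TinyClassifier`, `to_checkpoint`, and the field names are illustrative stand-ins, not `AudioClassifier`'s real internals:

```python
class TinyClassifier:
    """Sketch of a self-describing checkpoint: config + weights together."""

    def __init__(self, backbone: str, num_classes: int, pooling: str = "gap"):
        # Capture every constructor argument so the model can be rebuilt later.
        self.config = {"backbone": backbone,
                       "num_classes": num_classes,
                       "pooling": pooling}
        self.state = {}  # stand-in for a real state_dict

    def to_checkpoint(self) -> dict:
        # Architecture config and weights travel in one payload.
        return {"config": self.config, "state_dict": self.state}

    @classmethod
    def from_checkpoint(cls, payload: dict) -> "TinyClassifier":
        # Rebuild the architecture from config, then attach the weights.
        model = cls(**payload["config"])
        model.state = payload["state_dict"]
        return model
```

The payload dict is what would be handed to `torch.save` / loaded by `torch.load` in practice.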

Public API

  • Added AVAILABLE_BACKBONES and AVAILABLE_POOLING runtime tuple constants for programmatic inspection of supported options.
  • Added PoolingName type alias (Literal["gap", "simpool", "ep"]) alongside the existing BackboneName.
  • Removed internal names (BACKBONES, POOLING, AudioClassifierConstructor, BackboneConstructor) from the public namespace — users interact only via AudioClassifier and Backbone.
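Runtime constants like these make early validation straightforward. A minimal sketch, assuming the tuple values from `PoolingName` above (`validate_pooling` is a hypothetical helper, not part of the library):

```python
AVAILABLE_POOLING = ("gap", "simpool", "ep")  # mirrors the library constant

def validate_pooling(name: str) -> str:
    """Fail fast with the valid options listed, instead of erroring deep
    inside model construction (hypothetical helper)."""
    if name not in AVAILABLE_POOLING:
        raise ValueError(f"unknown pooling {name!r}; "
                         f"choose one of {AVAILABLE_POOLING}")
    return name
```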

Documentation

  • Fixed AudioClassifier and Backbone rendering as "alias of..." in ReadTheDocs API reference.
  • Added uv-based environment setup instructions in the Installation page.
  • Updated Contributing page with development setup (uv sync) and test execution (uv run pytest -v) instructions.
  • Updated README training and evaluation examples to use from_checkpoint; added PyPI publish badge.

Tests

  • Updated test_evaluation_loop and test_inference to use AudioClassifier.from_checkpoint(...) instead of manually calling torch.load + load_state_dict.

v0.4.1

24 Mar 12:07
acbf0d1

Inference Pipeline

  • `predict()` is now a pure low-level method — eval mode and gradient context are the caller's responsibility
  • `inference_on_waveform()` automatically manages eval mode and restores training state via the new `@eval_mode` decorator
  • Added `ValueError` for non-1D input in `inference_on_waveform()`
  • New `utils/decorators.py` with `eval_mode` decorator using `try/finally` for guaranteed state restoration
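The decorator idea can be sketched as follows; `DummyModel` and the boolean `training` flag are stand-ins for a real `nn.Module` (where the wrapper would call `self.eval()` and `self.train(was_training)`):

```python
import functools

def eval_mode(method):
    """Flip the model into eval mode, run the wrapped method, and restore
    the previous flag even if the method raises."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        was_training = self.training
        self.training = False  # real code would call self.eval()
        try:
            return method(self, *args, **kwargs)
        finally:
            # try/finally guarantees the prior state comes back.
            self.training = was_training
    return wrapper

class DummyModel:
    training = True

    @eval_mode
    def inference_on_waveform(self, waveform):
        # Inside the call the model is in eval mode.
        return self.training
```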

Evaluator

  • Fixed O(n²) `np.concatenate` — results now accumulate in lists and are concatenated once after the loop
  • Replaced `torch.no_grad()` context manager with `@torch.inference_mode()` decorator
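The accumulation fix, sketched with plain lists (real code would call `np.concatenate(chunks)` once after the loop):

```python
def collect_predictions(batches):
    """Accumulate per-batch results in a list and join once after the loop,
    instead of re-concatenating a growing array every iteration, which
    copies all prior data each time and is O(n^2) overall."""
    chunks = []
    for batch in batches:
        chunks.append(batch)  # O(1) amortized append
    # Single O(n) join at the end.
    return [x for chunk in chunks for x in chunk]
```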

Trainer

  • Decomposed `train()` into `train_step()`, `val_step()`, and `epoch_step()` — users can now build custom training loops around `epoch_step()`
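A custom loop built around the decomposed steps might look like this; the method names follow the release notes, but the bodies and losses are fabricated for illustration:

```python
class LoopSketch:
    """Sketch of the decomposed training loop, not the real Trainer."""

    def __init__(self):
        self.history = []

    def train_step(self, epoch: int) -> float:
        return 1.0 / (epoch + 1)   # pretend training loss

    def val_step(self, epoch: int) -> float:
        return 1.5 / (epoch + 1)   # pretend validation loss

    def epoch_step(self, epoch: int):
        losses = (self.train_step(epoch), self.val_step(epoch))
        self.history.append(losses)
        return losses

# Custom logic around epoch_step: stop once validation loss is low enough.
trainer = LoopSketch()
for epoch in range(10):
    _, val_loss = trainer.epoch_step(epoch)
    if val_loss < 0.5:
        break
```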

Module Structure

  • Renamed `dtos/` → `schemas/` and split into `items.py`, `predictions.py`, and `types.py`
  • `BackboneName` moved to `schemas/types.py` to eliminate circular imports
  • Fixed backbone registry typo: `monilenet_` → `mobilenet_`
  • Suppressed `FutureWarning` from deprecated `weight_norm` in BEATs backbone

Public API

  • Removed internal names (`AudioClassifierConstructor`, `BackboneConstructor`, `BACKBONES`, `POOLING`) from `__all__`
  • `AudioClassifier` and `Backbone` are now the only exported model constructors

CI

  • Added `publish.yml` workflow for automated PyPI publishing on GitHub release

v0.4.0

13 Feb 14:12
7c8b015

  1. Add MobileNet variants from https://github.com/fschmid56/EfficientAT/
  2. Bump version to 0.4.0
  3. Update and integrate the MobileNet variants in testing

v0.3.7

09 Feb 19:50

Update ReadTheDocs:

  1. Finalize API reference
  2. Fix docstrings and generate instructional example code blocks
  3. Add citation & contributing pages

v0.3.6

09 Feb 14:17

  1. Optimize inference by using bincount instead of masking
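Assuming this refers to a bincount-style count (as in `np.bincount`), the speedup comes from a single pass over the data rather than one masked scan per class. A pure-Python sketch of the idea:

```python
def label_counts(indices, num_classes):
    """Single-pass count over predicted class indices, the idea behind
    np.bincount."""
    counts = [0] * num_classes
    for i in indices:
        counts[i] += 1
    return counts

# The masked alternative scans the whole array once per class:
#   [sum(1 for i in indices if i == c) for c in range(num_classes)]
# which is num_classes times more work.
```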

v0.3.5

09 Feb 13:36

  1. Fix inference to compute label counts
  2. The result is now the label with the most occurrences, with ties broken by the higher mean posterior
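The aggregation rule described above can be sketched as follows (function and variable names are illustrative, not the library's actual inference code):

```python
from collections import defaultdict

def aggregate_segments(labels, posteriors):
    """Most frequent label wins; ties are broken by higher mean posterior."""
    counts = defaultdict(int)
    probs = defaultdict(list)
    for label, p in zip(labels, posteriors):
        counts[label] += 1
        probs[label].append(p)
    # Rank by (occurrences, mean posterior), highest first.
    return max(counts, key=lambda lab: (counts[lab],
                                        sum(probs[lab]) / len(probs[lab])))
```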

v0.3.4

09 Feb 12:21

  1. Fix inference bug: process segments by sample count rather than time duration
  2. Add a test for inference
  3. Bump version to 0.3.4
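The sample-count approach can be sketched like this (a minimal illustration, not the library's actual segmentation code): computing the segment length once in integer samples means every boundary lands on an exact sample index, avoiding the drift that float-seconds arithmetic can introduce.

```python
def split_segments(waveform, sample_rate, segment_duration):
    """Split a 1-D waveform into fixed-length segments by sample count."""
    seg_len = int(segment_duration * sample_rate)  # samples per segment
    return [waveform[start:start + seg_len]
            for start in range(0, len(waveform) - seg_len + 1, seg_len)]
```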

v0.3.3

09 Feb 05:48

Update README:

  1. Add a hyperlink to the uv installation instructions
  2. Add cache busters for the Python-versions and PyPI-version badges