Releases · magcil/deepaudio-x
v0.4.4
What's Changed
Verbosity Control for Training & Evaluation
Both `Trainer` and `Evaluator` now accept a `verbose` parameter, giving users explicit control over console output.
Trainer
- `verbose=True` (default): epoch header, per-epoch metrics (train loss, val loss, elapsed time), and checkpointer messages are printed as before.
- `verbose=False`: all epoch-level output is suppressed. `EarlyStopper` warnings and checkpointer error messages always surface regardless of this setting.
- A Training Complete summary is always printed after training finishes, showing the best epoch, corresponding train/val losses, and checkpoint path.
Evaluator
- `verbose=True` (default): classification report, confusion matrix, and average posteriors are printed after evaluation.
- `verbose=False`: report output is suppressed. "Evaluation has finished." always prints.
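The pattern described above (routine output gated by `verbose`, warnings never suppressed) can be sketched with a hypothetical `Reporter` helper; this is an illustration of the behavior, not part of the library's API:

```python
class Reporter:
    """Minimal sketch of the verbosity pattern: routine messages are
    gated by `verbose`, while warnings always surface."""

    def __init__(self, verbose: bool = True):
        self.verbose = verbose

    def info(self, msg: str) -> None:
        # Epoch headers, per-epoch metrics, checkpointer messages, etc.
        if self.verbose:
            print(msg)

    def warn(self, msg: str) -> None:
        # Early-stopping warnings and error messages are never suppressed.
        print(f"WARNING: {msg}")
```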
Internal
- Removed `ConsoleLogger` callback — logging is now handled inline in `Trainer` and `Evaluator`.
Full Changelog: v0.4.3...v0.4.4
v0.4.3
What's New
MPS Support
- Added `DeviceName` type alias (`"cuda"`, `"mps"`, `"cpu"`)
- `Trainer` and `Evaluator` now accept an explicit `device` parameter — users on Apple Silicon can pass `device="mps"` to leverage the Metal GPU backend
- Fixed hardcoded `"cuda"` in the PaSST mel spectrogram autocast guard — replaced with `x.device.type`
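A device-selection sketch, assuming `device` accepts the `DeviceName` strings above; `pick_device` is a hypothetical helper, not part of the library:

```python
import torch


def pick_device() -> str:
    """Hypothetical helper returning a DeviceName-style string.
    Prefers CUDA, then Apple's Metal backend (MPS), then CPU."""
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The same idea is why `x.device.type` beats a hardcoded `"cuda"` inside an autocast guard: it follows whatever device the tensor actually lives on.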
DataLoader Efficiency
- `pin_memory` is now conditional on CUDA — avoids unsupported behavior on MPS and CPU
- Added `persistent_workers=True` — eliminates worker process respawn overhead at epoch boundaries
- `pad_collate_fn` now uses `torch.stack` for equal-length batches (e.g. when `segment_duration` is set), falling back to `pad_sequence` for variable-length batches
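The stack-versus-pad dispatch can be sketched as a standalone collate function; `collate` below is a simplified stand-in for the library's `pad_collate_fn`, whose real signature may differ:

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def collate(batch):
    """Simplified pad-collate sketch. Each item is a (waveform, label)
    pair with waveform shape (T,)."""
    waveforms, labels = zip(*batch)
    lengths = {w.shape[-1] for w in waveforms}
    if len(lengths) == 1:
        # Equal-length batch (e.g. a fixed segment duration): a plain
        # stack avoids the bookkeeping of pad_sequence.
        x = torch.stack(list(waveforms))
    else:
        # Variable-length batch: zero-pad to the longest waveform.
        x = pad_sequence(list(waveforms), batch_first=True)
    return x, torch.tensor(labels)
```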
v0.4.2
What's New
Checkpoint Architecture Persistence
- `AudioClassifier` and `Backbone` now store a `config` dict capturing all constructor arguments (backbone name, pooling, num_classes, etc.)
- Checkpoints (`.pt` files) now contain both `state_dict` and `config` — fully self-describing, with no need to remember how the model was built.
- New `from_checkpoint(path)` classmethod reconstructs model architecture + weights in one call: `model = AudioClassifier.from_checkpoint("checkpoint.pt")`
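The self-describing checkpoint pattern can be illustrated with a hypothetical `TinyClassifier` stand-in (not the library's `AudioClassifier`; the real class records more constructor arguments):

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical stand-in illustrating config-carrying checkpoints."""

    def __init__(self, in_dim: int, num_classes: int):
        super().__init__()
        # Record every constructor argument so checkpoints are self-describing.
        self.config = {"in_dim": in_dim, "num_classes": num_classes}
        self.fc = nn.Linear(in_dim, num_classes)

    def save_checkpoint(self, path: str) -> None:
        torch.save({"state_dict": self.state_dict(), "config": self.config}, path)

    @classmethod
    def from_checkpoint(cls, path: str) -> "TinyClassifier":
        ckpt = torch.load(path, map_location="cpu")
        model = cls(**ckpt["config"])              # rebuild the architecture...
        model.load_state_dict(ckpt["state_dict"])  # ...then restore the weights
        return model
```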
Public API
- Added `AVAILABLE_BACKBONES` and `AVAILABLE_POOLING` runtime tuple constants for programmatic inspection of supported options.
- Added `PoolingName` type alias (`Literal["gap", "simpool", "ep"]`) alongside the existing `BackboneName`.
- Removed internal names (`BACKBONES`, `POOLING`, `AudioClassifierConstructor`, `BackboneConstructor`) from the public namespace — users interact only via `AudioClassifier` and `Backbone`.
Documentation
- Fixed `AudioClassifier` and `Backbone` rendering as "alias of..." in the ReadTheDocs API reference.
- Added `uv`-based environment setup instructions to the Installation page.
- Updated the Contributing page with development setup (`uv sync`) and test execution (`uv run pytest -v`) instructions.
- Updated the README training and evaluation examples to use `from_checkpoint`; added a PyPI publish badge.
Tests
- Updated `test_evaluation_loop` and `test_inference` to use `AudioClassifier.from_checkpoint(...)` instead of manually calling `torch.load` + `load_state_dict`.
v0.4.1
Inference Pipeline
- `predict()` is now a pure low-level method — eval mode and gradient context are the caller's responsibility
- `inference_on_waveform()` automatically manages eval mode and restores training state via the new `@eval_mode` decorator
- Added `ValueError` for non-1D input in `inference_on_waveform()`
- New `utils/decorators.py` with `eval_mode` decorator using `try/finally` for guaranteed state restoration
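A minimal sketch of a `try/finally`-based `eval_mode` decorator, assuming the decorated method lives on an object with PyTorch-style `training`/`eval()`/`train()` members; the library's actual implementation may differ:

```python
from functools import wraps


def eval_mode(fn):
    """Run the decorated method with the model in eval mode, then restore
    the previous training state even if the method raises."""
    @wraps(fn)
    def wrapper(self, *args, **kwargs):
        was_training = self.training
        self.eval()
        try:
            return fn(self, *args, **kwargs)
        finally:
            # Guaranteed restoration, exception or not.
            self.train(was_training)
    return wrapper
```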
Evaluator
- Fixed O(n²) `np.concatenate` — results now accumulate in lists and are concatenated once after the loop
- Replaced `torch.no_grad()` context manager with `@torch.inference_mode()` decorator
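The list-accumulate-then-concatenate fix can be sketched generically with NumPy; `collect` is a hypothetical helper illustrating the pattern:

```python
import numpy as np


def collect(batches):
    """Accumulate per-batch results in a Python list and concatenate once.
    Calling np.concatenate inside the loop instead would copy the growing
    array every iteration, for O(n^2) total work."""
    chunks = [np.asarray(b) for b in batches]
    return np.concatenate(chunks) if chunks else np.empty(0)
```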
Trainer
- Decomposed `train()` into `train_step()`, `val_step()`, and `epoch_step()` — users can now build custom training loops around `epoch_step()`
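A hypothetical sketch of the decomposition, with a loss-returning callable standing in for the forward pass; the actual `Trainer` method signatures may differ:

```python
class MiniTrainer:
    """Sketch of the train_step/val_step/epoch_step split.
    `step_fn` stands in for a forward pass returning a loss value."""

    def __init__(self, step_fn):
        self.step_fn = step_fn

    def train_step(self, batch):
        return self.step_fn(batch)  # a real trainer would also backprop here

    def val_step(self, batch):
        return self.step_fn(batch)

    def epoch_step(self, train_batches, val_batches):
        # One full epoch: mean train loss, then mean validation loss.
        train_loss = sum(self.train_step(b) for b in train_batches) / len(train_batches)
        val_loss = sum(self.val_step(b) for b in val_batches) / len(val_batches)
        return train_loss, val_loss
```

A custom loop then calls `epoch_step` once per epoch and applies its own scheduling or stopping logic.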
Module Structure
- Renamed `dtos/` → `schemas/` and split into `items.py`, `predictions.py`, and `types.py`
- `BackboneName` moved to `schemas/types.py` to eliminate circular imports
- Fixed backbone registry typo: `monilenet_` → `mobilenet_`
- Suppressed `FutureWarning` from deprecated `weight_norm` in BEATs backbone
Public API
- Removed internal names (`AudioClassifierConstructor`, `BackboneConstructor`, `BACKBONES`, `POOLING`) from `__all__`
- `AudioClassifier` and `Backbone` are now the only exported model constructors
CI
- Added `publish.yml` workflow for automated PyPI publishing on GitHub release
v0.4.0
- Added MobileNet variants from https://github.com/fschmid56/EfficientAT/
- Bumped version to 0.4.0
- Updated tests to cover the MobileNet variants
v0.3.7
Updated ReadTheDocs:
- Finalized the API reference
- Fixed docstrings and generated instructional example code blocks
- Added citation & contributing pages
v0.3.6
- Optimized inference by leveraging bitcount instead of masking
v0.3.5
- Fixed inference to compute label counts
- The result is now correct: the label with the most occurrences (ties broken by the higher mean posterior) is returned
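The selection rule above (most frequent label, ties broken by the higher mean posterior) can be sketched with NumPy; `aggregate` is a hypothetical helper, not the library's function:

```python
import numpy as np


def aggregate(labels, posteriors):
    """Pick the most frequent per-segment label; break count ties by the
    higher mean posterior among the tied labels."""
    labels = np.asarray(labels)
    posteriors = np.asarray(posteriors)
    counts = np.bincount(labels)
    tied = np.flatnonzero(counts == counts.max())  # labels sharing the top count
    if len(tied) == 1:
        return int(tied[0])
    means = [posteriors[labels == lab].mean() for lab in tied]
    return int(tied[int(np.argmax(means))])
```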
v0.3.4
- Fixed a bug in inference by segmenting on sample points rather than time duration
- Added one inference test
- Version bump to 0.3.4
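Segmenting by integer sample points rather than float seconds can be sketched as follows; `segment_by_samples` is a hypothetical helper illustrating the idea:

```python
def segment_by_samples(num_samples, segment_duration, sample_rate):
    """Compute segment boundaries in integer sample points, avoiding the
    float drift that comes from slicing by time duration."""
    seg_len = int(segment_duration * sample_rate)
    # Half-open (start, end) index pairs; the last segment may be shorter.
    return [(start, min(start + seg_len, num_samples))
            for start in range(0, num_samples, seg_len)]
```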
v0.3.3
Updated README:
- Added a hyperlink to the uv installation instructions
- Added a cache buster to the Python versions and PyPI version badges