Releases: KatherLab/MediSwarm
v1.4.1
MediSwarm v1.4.1
Changes
- Update 2-site deploy test configuration to use DL0 (RUMC_1) + DL2 (MHA_1)
- Bump version to 1.4.1
Deploy Test Validation
Successfully validated challenge_1DivideAndConquer over Tailscale VPN with 2 clients:
- 30+ error-free training rounds across DL0 (RUMC_1) and DL2 (MHA_1)
- P2P model exchange (689MB model): ~2-3 seconds
- Adaptive epoch calculation with EPOCHS_MAX_CAP=10 working correctly
- Both `swarm_config` and `swarm_start` phases completed cleanly
Full Changelog
v1.4.0
What's New
Webviewer (Live Monitor)
- Fix age column flicker — Replaced `<meta http-equiv="refresh">` with JS-based auto-refresh and client-side age ticking. Age now counts up smoothly without resetting to 0s on reload.
- Hostname column — Dashboard now shows which machine each run is coming from (parsed from `heartbeat.json`).
- Error status detection — Runs that hit `FATAL_SYSTEM_ERROR`, `EXECUTION_EXCEPTION`, `RuntimeError`, `OutOfMemoryError`, or `CUDA out of memory` are now flagged with a red "error" badge instead of appearing as "stale" or "finished".
- Default metrics visibility — Only train/val ACC and AUC-ROC are shown by default in charts. All other series are hidden but toggleable via the Chart.js legend.
- Label distribution chart — Detail page now shows a grouped bar chart of class counts per train/val/test split, parsed from console output.
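The error-badge logic above amounts to scanning a run's console output for known failure markers. A minimal sketch of that idea, assuming a hypothetical `run_status` helper (the marker strings come from the notes above; the function itself is illustrative, not the actual dashboard code):

```python
# Failure markers listed in the release notes; matching any of them
# flags the run as "error" instead of "stale" or "finished".
ERROR_PATTERNS = [
    "FATAL_SYSTEM_ERROR",
    "EXECUTION_EXCEPTION",
    "RuntimeError",
    "OutOfMemoryError",
    "CUDA out of memory",
]

def run_status(console_output: str, finished: bool) -> str:
    """Return 'error' if any known failure marker appears in the log,
    otherwise a normal lifecycle status. Illustrative sketch only."""
    if any(pattern in console_output for pattern in ERROR_PATTERNS):
        return "error"
    return "finished" if finished else "running"
```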
Training
- Reduce EPOCHS_MAX_CAP default from 20 → 10, preventing excessive epochs on small sites (e.g. RUMC_1 with 22 samples was doing 20 epochs per round, now capped at 10). Override with the `EPOCHS_MAX_CAP` env var.
Heartbeat / Live Sync
- Hostname field added to `heartbeat.json` output
- ANSI escape code stripping from `RUN_NAME` (fixes garbled names from colored terminal output)
- Quote cleanup on `kit_version` field
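The `RUN_NAME` cleanup above boils down to removing terminal escape sequences and stray quotes. A sketch of that kind of sanitization, assuming a hypothetical `clean_run_name` helper (not the exact MediSwarm implementation):

```python
import re

# Matches 7-bit ANSI CSI sequences, e.g. the color codes "\x1b[32m" / "\x1b[0m"
# that colored training output leaks into run names.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")

def clean_run_name(raw: str) -> str:
    """Strip ANSI escape codes, surrounding whitespace, and wrapping quotes."""
    return ANSI_RE.sub("", raw).strip().strip("\"'")
```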
CI/CD
- Deploy test workflow now triggers on release publish instead of weekly schedule (manual dispatch retained)
Housekeeping
MediSwarm v1.3.0
Released: 2026-04-05
Major release adding the STAMP histopathology classification pipeline, FedProx aggregation strategy, comprehensive CI/CD infrastructure, Duke benchmark pipeline, and expanded documentation with architecture diagrams.
🔬 STAMP Classification Pipeline
Full support for KatherLab STAMP 2.4.0 histopathology classification in federated learning:
- Separate `Dockerfile_STAMP` — Python 3.11, PyTorch 2.7.1, CUDA 12.6 (independent from ODELIA's Python 3.10/PyTorch 2.2.2 image)
- Build flag — `buildDockerImageAndStartupKits.sh` now accepts `-d`/`--dockerfile` to select between `Dockerfile_ODELIA` and `Dockerfile_STAMP`
- Synthetic dataset generator — Creates 2 sites × 15 patients with H5 feature files for integration testing
- Integration tests — Preflight check, local training, and NVFlare simulation mode (3 rounds, 2 clients)
- Per-round metrics CSV — `STAMPMetricsCallback` writes ground-truth/prediction probabilities and summary metrics per epoch
Two Docker Images
After v1.3.0, MediSwarm maintains two Docker images:
| Image | Python | PyTorch | Use Case |
|---|---|---|---|
| `jefftud/odelia:<ver>` | 3.10 | 2.2.2 | 3D breast MRI classification |
| `jefftud/stamp:<ver>` | 3.11 | 2.7.1 | STAMP histopathology classification |
🔄 FedProx Aggregation Strategy
Alternative to FedAvg for improved convergence with non-IID medical data:
- `FedProxCallback` — Lightning callback adds proximal term `(μ/2) × ‖w_local − w_global‖²` to gradient updates
- Cross-pipeline — Compatible with both ODELIA (`pytorch_lightning`) and STAMP (`lightning`)
- Configurable — Set `FEDPROX_MU` environment variable (default: 0 = disabled, recommended: 0.001–0.01)
- Documentation — `docs/AGGREGATION_STRATEGIES.md` compares FedAvg, FedProx, Scaffold, and FedOpt with decision matrix
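The proximal term above penalizes local weights for drifting from the last global model. A plain-Python sketch of the math only (the real `FedProxCallback` operates on torch parameter tensors inside a Lightning hook; `fedprox_penalty` is a hypothetical name for illustration):

```python
def fedprox_penalty(local_weights, global_weights, mu: float) -> float:
    """Compute the FedProx proximal term (μ/2) · ‖w_local − w_global‖²,
    which is added to the local training loss each step.
    With mu = 0 (the default) the penalty vanishes and training
    reduces to standard FedAvg local updates."""
    sq_norm = sum((wl - wg) ** 2 for wl, wg in zip(local_weights, global_weights))
    return 0.5 * mu * sq_norm
```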
🧪 CI/CD for STAMP
Expanded test infrastructure covering both pipelines:
- Unit tests — `test_stamp_training.py` (465 lines), `test_stamp_model_wrapper.py` (257 lines), `test_fedprox_callback.py` (286 lines)
- Integration tests — STAMP Docker build + preflight + local training + simulation in `pr-test.yaml`
- Unified packages — `unit-tests.yaml` switched from `pytorch-lightning` to unified `lightning` package
- Timeout — PR test timeout increased from 45 to 60 minutes
- Cleanup — CI cleanup step now kills `stamp` and `nvflare` containers alongside `odelia`
📊 Duke Benchmark Pipeline
Automated end-to-end benchmarking on the Duke Breast MRI dataset:
- `run_duke_benchmark.sh` — Orchestrates build → deploy → swarm training → result collection → local model comparison
- Configurable deploy — `deploy_and_test.sh` reads `SITES` and `SERVER_NAME` from `deploy_sites.conf` (backward-compatible defaults)
- `deploy_sites.conf.example` — Template with dl0/dl2/dl3 configuration for TUD compute cluster
- Results template — `docs/DUKE_BENCHMARK_RESULTS.md` for recording benchmark outcomes
📐 Architecture Documentation
Expanded README from 46 lines to 214 lines:
- System Architecture — Mermaid diagram showing site-to-server topology with NVFlare aggregation
- Training Pipeline — Mermaid sequence diagram showing federated learning round lifecycle
- Supported Pipelines — Comparison table (ODELIA 3D CNN vs STAMP Classification)
- Key Features — Privacy, Docker reproducibility, multi-pipeline support
- Project Structure — Annotated directory tree
🔐 Differential Privacy Assessment
Gap analysis and roadmap (documentation only — implementation deferred to v1.4.0):
- `docs/DIFFERENTIAL_PRIVACY.md` — Current `PercentilePrivacy` is gradient clipping, NOT formal (ε,δ)-DP. Detailed analysis of Opacus/DP-SGD integration path, compatibility issues, and privacy budget accounting
- `docs/DIFFERENTIAL_PRIVACY_DECISION.md` — Architecture decision record
Changed
- `deploy_and_test.sh` container matching broadened to include `stamp` and `nvflare` alongside `odelia`
- CI `pr-test.yaml` timeout increased from 45 to 60 minutes
- CI cleanup step now kills `stamp` and `nvflare` containers
Stats
Upgrade Notes
- No breaking changes from v1.2.0
- ODELIA pipeline users: no action required — `Dockerfile_ODELIA` is unchanged
- STAMP pipeline users: build with `./buildDockerImageAndStartupKits.sh -d docker_config/Dockerfile_STAMP -p <project>`
- FedProx: opt-in via `FEDPROX_MU` env var — set to 0 or leave unset for standard FedAvg behavior
Full Changelog: v1.2.0...v1.3.0
MediSwarm v1.2.0
Highlights
This release introduces STAMP classification support for swarm learning, a prediction workflow for external test data, significant code deduplication, improved training stability, and comprehensive documentation for making standalone training code MediSwarm-compatible.
New Features
STAMP Classification Job (#249)
- New `STAMP_classification` job for swarm learning with STAMP's data pipeline (H5 features + clinical tables)
- Supports VIT, MLP, TransMIL, and other STAMP model architectures
- Configurable via `STAMP_*` environment variables
- Stratified train/val split with STAMP's data loading pipeline
Prediction Workflow (#247)
- New prediction workflow for evaluating trained swarm models on external test data
- Supports both ODELIA 3D CNN and STAMP classification models
- Configurable via environment variables for model path, data directory, and output format
Weighted Epochs Per Site (#251)
- Replaces hardcoded per-site epoch dictionaries with a formula-based approach
- Formula: `epochs = base_epochs × (reference_size / num_train_samples)`, clamped to `[1, max_cap]`
- Sites with fewer training samples get more local epochs per round, equalizing gradient updates across sites
- Configurable via `EPOCHS_PER_ROUND`, `EPOCHS_REFERENCE_DATASET_SIZE`, `EPOCHS_MAX_CAP` env vars
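The formula above can be sketched as a small helper. The env var names come from the notes; the default values and the rounding choice here are assumptions for illustration, not MediSwarm's exact implementation:

```python
import os

def epochs_for_site(num_train_samples: int) -> int:
    """Adaptive per-site epochs: base × (reference / site size), clamped to [1, cap].
    Smaller sites get more local epochs per round, so each site contributes
    a comparable number of gradient updates. Defaults are illustrative."""
    base = int(os.environ.get("EPOCHS_PER_ROUND", 1))
    reference = int(os.environ.get("EPOCHS_REFERENCE_DATASET_SIZE", 100))
    cap = int(os.environ.get("EPOCHS_MAX_CAP", 10))
    epochs = round(base * reference / max(num_train_samples, 1))
    return max(1, min(epochs, cap))
```

With these illustrative defaults, a 22-sample site like RUMC_1 lands well under the cap, while a 5-sample site hits `EPOCHS_MAX_CAP`.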
Best + Last Model Checkpoints (#251)
- `finalize_training()` now saves both best (by monitor metric) and latest checkpoints
- Deployers can choose between peak-validation and final-aggregated models
Server Dashboard Enhancement (#240)
- Enhanced server-side monitoring dashboard for real-time swarm training visibility
Client Stability Improvements (#245)
- Systemd service for VPN with auto-reconnect and keepalive
- GPU health check script for pre-training and Docker health checks
- Docker container restart policies (`--restart=on-failure:5`)
- VPN health monitor with automatic service restart after consecutive failures
Infrastructure & DevOps
Code Deduplication (#241)
- Consolidated 5 duplicate challenge job directories into shared `_shared/custom/` with symlinks
- Moved build scripts to `scripts/build/` and CI scripts to `scripts/ci/`
- Single source of truth for training code across all ODELIA/challenge jobs
NVFlare Workflow Enhancements (#242)
- Cross-site evaluation (CSE) workflow added to server and client configs
- Tuned timeouts from 100-hour placeholders to practical values
- Explicit metric comparator configuration
- PercentilePrivacy filter for gradient quality control
Automated Tests (#243)
- New unit test suite in `tests/unit_tests/` (models_config, env_config, data_module)
- GitHub Actions workflow for unit tests on PRs
- Fixed hardcoded paths in `test_challenge_models.py`
Docker Build Optimization (#250)
- Reordered Dockerfile layers: pip installs (expensive, stable) before apt installs (cheap, frequent CVE bumps)
- Added `--no-cache-dir` flags to reduce image size
- Consolidated RUN layers for better caching
NVFlare 2.7.2 Upgrade (#235, #236)
- Upgraded from NVFlare 2.5.x to 2.7.2
Bug Fixes
- Fix integration test printed icons (#224)
- Fix site name argument ordering (#237, fixes #227)
- Update CI Node.js version (#238, fixes #222)
- Fix CI apt-get update permissions (#239)
- Fix CLI flags for env vars lost when using sudo (#230)
Documentation
- MediSwarm Compatibility Guide (#244, addresses #216) — step-by-step guide for making standalone training code MediSwarm-compatible
- Updated README with correct repository links
Training Improvements (#246)
- Class-weighted loss for imbalanced datasets
- Gradient accumulation (effective batch size of 8)
- Gradient clipping (val=1.0) to prevent explosion
- `16-mixed` precision for stability
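Class-weighted loss, as mentioned above, typically means weighting each class inversely to its frequency. A sketch of one common scheme (the release notes don't specify the exact formula MediSwarm uses; `class_weights` is a hypothetical helper):

```python
from collections import Counter

def class_weights(labels) -> dict:
    """Inverse-frequency weights: weight_c = N / (num_classes · count_c).
    Rare classes get weights > 1, common classes < 1, so the loss is not
    dominated by the majority class. Illustrative sketch only."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}
```

Such a weight dict is typically passed (as a tensor) to a loss like cross-entropy so that minority-class errors contribute proportionally more to the gradient.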
Full Changelog: v1.1.0...v1.2.0
v1.1.0 — Challenge Models
MediSwarm v1.1.0 — Challenge Models Release
This release integrates five ODELIA challenge models into MediSwarm for federated swarm training, along with infrastructure improvements for deployment, testing, and CI.
New Challenge Models
| Job | Model Architecture |
|---|---|
| `challenge_1DivideAndConquer` | ResidualEncoder |
| `challenge_2BCN_AIM` | SwinUNETR |
| `challenge_3agaldran` | MViT v2 |
| `challenge_4abmil` | CrossModalAttentionABMIL + Swin |
| `challenge_5pimed` | ResNet18 |
Each challenge job is a self-contained NVFlare application with its own model code, data pipeline, configs, and synthetic dataset generator.
Highlights
- `--job` flag for `docker.sh` — Participants can now run preflight checks and local training for any challenge model: `./docker.sh --preflight_check --job challenge_5pimed --data_dir $DATADIR --scratch_dir $SCRATCHDIR --GPU device=0`
- Pretrained weight caching — Large model weights (`checkpoint_final.pth`, `mvit_v2_s-ae3be167.pth`) are stored outside job directories to prevent NVFlare from bundling them during job submission
- MODEL_NAME env var fix — All challenge jobs hardcode their MODEL_NAME to prevent the `docker.sh` default (`MST`) from silently overriding the intended model
- Deployment automation — New `deploy_and_test.sh` script for multi-site Docker image push, startup kit deployment, and swarm lifecycle management
- Live sync — New `kit_live_sync/` for startup kit synchronization with heartbeat monitoring
- CI reliability — Fixed script permissions, auto-install of `gdown`, NVFlare submodule sync
Breaking Changes
None. The default behavior of `docker.sh` (without `--job`) remains unchanged and runs `ODELIA_ternary_classification`.
odelia-challenge-v1.0
What's Changed
- Dev dashboard enhancements by @Ultimate-Storm in #5
- Merge include_nvflare to dev-7-test-controller by @oleschwen in #13
- Dev 7 test controller by @oleschwen in #10
- App example cifar10 by @Ultimate-Storm in #12
- Dev 8 single logging by @oleschwen in #17
- Dev 11 minimal application code by @oleschwen in #15
- Get training via VPN to work by @oleschwen in #20
- Dev controller first review by @Ultimate-Storm in #29
- Dev 36 fix tests after merge by @oleschwen in #37
- dev-22 use same application code for local and swarm training by @oleschwen in #24
- dev-27 use consistent nvflare version by @oleschwen in #28
- Dev 26 versioning docker images by @oleschwen in #33
- Documentation and docker.sh scripts for multiple GPUs by @oleschwen in #46
- Dev 32 setup swarm training on odelia data by @oleschwen in #47
- fix missing version numbers for last-in-line apt packages by @oleschwen in #48
- further corrections of apt package versions by @oleschwen in #49
- Apt package version update by @oleschwen in #57
- 53 do not git directory in image by @oleschwen in #54
- update apt package versions by @oleschwen in #59
- 55 automate updating apt packages by @oleschwen in #56
- update apt package version by @oleschwen in #61
- Update apt package versions 20250612 by @oleschwen in #65
- Dev apt version ci by @Ultimate-Storm in #63
- Update apt package versions 2025-06-23 by @oleschwen in #66
- Fix auto apt update github action by @Ultimate-Storm in #67
- Dev 34 latest update from gustav code by @oleschwen in #58
- Automatically updating APT package by @Ultimate-Storm in #71
- chore: Update APT versions in Dockerfile by @github-actions in #78
- Dev demo odelia by @Ultimate-Storm in #79
New Contributors
- @Ultimate-Storm made their first contribution in #5
- @oleschwen made their first contribution in #13
- @github-actions made their first contribution in #78
Full Changelog: https://github.com/KatherLab/MediSwarm/commits/Odelia_Challenge