This repo is dedicated to the updated version of the algorithm presented in the MLST.
The current projects include:
- data - Scripts for generating training and testing data
- train - PyTorch (Lightning) code for training neural networks
- deploy - Triton wrapper (hermes) code to deploy trained neural networks
The repo includes implementations of both gwak1 and gwak2. The configs for gwak1 live in the corresponding folder, and the corresponding rules in the Snakefile have _gwak1 in their names.
The project uses uv, Conda and Snakemake to run the code. Follow the installation instructions below to prepare your environment.
If you do not have Miniconda installed on your machine, follow these steps first:
- use the quickstart repo to set up Miniconda and install poetry
$ git clone git@github.com:ml4gw/quickstart.git
$ cd quickstart
$ make
If you see the following error, it is a known problem tracked in issue #7:
Verifying checksum... Done.
Preparing to install helm into /you/path/miniconda3-tmp/bin/
helm installed into /you/path/miniconda3-tmp/bin//helm
helm not found. Is /you/path/miniconda3-tmp/bin/ on your $PATH?
Failed to install helm
For support, go to https://github.com/helm/helm.
make: *** [Makefile:65: install-helm] Error 1
To work around it, run the following commands:
$ source ~/.bashrc
$ make install-poetry install-kubectl install-s3cmd
If everything was installed successfully, continue to the steps below.
If you already have Miniconda installed on your machine, follow these steps:
- check out this repo and clone the submodules (such as ml4gw)
$ git clone git@github.com:ML4GW/gwak.git
$ cd gwak
$ git submodule update --init --recursive
- create a new Conda environment
$ conda env create -n gwak --file environment.yaml
$ conda activate gwak
- install the gwak project in editable mode
$ pip install -e .
Now you are ready to gwak! As a first step, you can run the training with:
$ cd gwak
$ snakemake -c1 train_all
- if you want to modify any of the submodules, first make the changes locally and then re-install gwak to pick up the changes:
$ pip install -e .
$ curl -LsSf https://astral.sh/uv/install.sh | sh
$ uv python install 3.11
$ uv python pin 3.11
$ uv tool install "snakemake>=8"
You might need to remove the base snakemake if you have one already.
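To see whether the snakemake found on your PATH satisfies the "snakemake>=8" requirement above, a small check can help. This is a minimal sketch: version_major and snakemake_is_v8 are hypothetical helper names, not part of the repo, and the check assumes that `snakemake --version` prints a dotted version string.

```shell
# Hypothetical helpers, not part of the gwak repo.

# Print the major component of a dotted version string, e.g. 8.20.3 -> 8.
version_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed only when a snakemake executable is on PATH and its
# reported major version is at least 8.
snakemake_is_v8() {
    command -v snakemake >/dev/null 2>&1 || return 1
    [ "$(version_major "$(snakemake --version)")" -ge 8 ]
}
```

Running `command -v snakemake` on its own shows which installation is picked up first. If it points to an older base install, that is the one you may need to remove.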
Add the following to your .bashrc file.
# ~/.bashrc
# --------------------------------------------------
# GWAK environment variables
# --------------------------------------------------
# Top-level directory to write or save container images
export GWAK_IMAGE_DIR=/path/to/containers
# Local directory for collecting outputs generated by containerized jobs
export GWAK_CONTAIN_OUTPUT_DIR=/path/to/containers/output
# Root directory of the GWAK repository
export GWAK_DIR=/path/to/gwak_repo
# Top-level directory containing different versions of GWAK dataset
export GWAK_DATA_DIR=/path/to/data
# Directory for saving outputs including models, logs, checkpoints, and intermediate results
export GWAK_OUTPUT_DIR=/path/to/output
# Directory for storing visualization artifacts and plots
export GWAK_LOUVRE_DIR=/path/to/louvre
# Dataset directory for O4 MDC BBC short-0/1
export GWAK_BBC_SHORT_0_DATA_DIR=/path/to/O4_MDC_short-0
export GWAK_BBC_SHORT_1_DATA_DIR=/path/to/O4_MDC_short-1
After the environment variables are set (source ~/.bashrc), you can run the following snakemake command without any further environment installation:
$ snakemake -c1 scan_all
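Before launching a rule, it can help to confirm that every variable from the .bashrc block is actually set in the current shell. The function below is a minimal sketch. check_gwak_env is a hypothetical helper name, not part of the repo, and the variable list is copied from the .bashrc block.

```shell
# Hypothetical helper, not part of the gwak repo.
# Prints one "missing: NAME" line per unset or empty GWAK variable
# and returns non-zero if any are missing.
check_gwak_env() {
    _missing=0
    for _var in GWAK_IMAGE_DIR GWAK_CONTAIN_OUTPUT_DIR GWAK_DIR \
                GWAK_DATA_DIR GWAK_OUTPUT_DIR GWAK_LOUVRE_DIR \
                GWAK_BBC_SHORT_0_DATA_DIR GWAK_BBC_SHORT_1_DATA_DIR; do
        # Look up the value of the variable whose name is stored in _var.
        eval "_val=\${$_var:-}"
        if [ -z "$_val" ]; then
            echo "missing: $_var"
            _missing=1
        fi
    done
    return "$_missing"
}
```

One possible use is `check_gwak_env && snakemake -c1 scan_all`, so that the rule only starts when nothing is missing.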
To run with containers:
$ snakemake -c1 build_containers
$ snakemake -c1 production_export
The results from the container will be redirected to GWAK_CONTAIN_OUTPUT_DIR.
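The source does not say whether the container rules create GWAK_IMAGE_DIR and GWAK_CONTAIN_OUTPUT_DIR on their own. Assuming they may not, the sketch below creates both directories up front. ensure_gwak_container_dirs is a hypothetical helper name, not part of the repo.

```shell
# Hypothetical helper, not part of the gwak repo.
# Creates the image and container-output directories if they do not exist.
ensure_gwak_container_dirs() {
    # ${VAR:?message} stops a non-interactive shell with the message
    # when VAR is unset or empty, so mkdir never runs on an empty path.
    mkdir -p "${GWAK_IMAGE_DIR:?GWAK_IMAGE_DIR is not set}" \
             "${GWAK_CONTAIN_OUTPUT_DIR:?GWAK_CONTAIN_OUTPUT_DIR is not set}"
}
```

Calling `ensure_gwak_container_dirs` once before `snakemake -c1 build_containers` is enough, since `mkdir -p` leaves directories that already exist unchanged.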
