
Running BGCFlow in DTU LSF HPC #361

@matinnuhamunada

DTU students and staff who want to run BGCFlow on the LSF HPC facility can follow these steps:

# log in to the login node
ssh [email protected]
# once you have been assigned a scratch dir, create a symlink to it in your home dir
SCRATCH_DIR="/work3/<user_id>/" # change the user id accordingly

# create the symlink to the scratch dir (run this from your home dir)
ln -s "$SCRATCH_DIR" drive
cd drive/
mkdir bgcflow
cd bgcflow
# we keep the config and workflow in the home directory because it is backed up
ln -s ~/bgcflow/workflow/ workflow
ln -s ~/bgcflow/config/ config

which results in something like this:

gbarlogin1(matinnu) $ tree
.
├── config -> /zhome/b2/0/153431/bgcflow/config/
└── workflow -> /zhome/b2/0/153431/bgcflow/workflow/

2 directories, 0 files
(base) ~/drive/bgcflow
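
If you need to redo this layout later (for example after being assigned a different scratch dir), the steps above can be wrapped in a small function. This is only a sketch: `setup_bgcflow_links` is a made-up helper, not part of BGCFlow, and it assumes BGCFlow is already cloned to `~/bgcflow` as in the listing above.

```shell
# Hypothetical helper: link a scratch dir into the home dir and mirror the
# config/ and workflow/ links from the steps above. Safe to re-run.
setup_bgcflow_links() {
    scratch_dir=$1          # e.g. /work3/<user_id>
    home_dir=${2:-$HOME}    # where the bgcflow clone and the "drive" link live

    if [ ! -d "$scratch_dir" ]; then
        echo "no such scratch dir: $scratch_dir" >&2
        return 1
    fi

    # -f replaces an existing link, -n stops ln from descending into it
    ln -sfn "$scratch_dir" "$home_dir/drive"
    mkdir -p "$home_dir/drive/bgcflow"

    for name in config workflow; do
        if [ ! -d "$home_dir/bgcflow/$name" ]; then
            echo "missing $home_dir/bgcflow/$name" >&2
            return 1
        fi
        ln -sfn "$home_dir/bgcflow/$name" "$home_dir/drive/bgcflow/$name"
    done
}
```

It would be called as `setup_bgcflow_links "/work3/<user_id>"`; the optional second argument only exists so the function can be tried out against a throwaway directory first.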
  • Install the LSF plugin
linuxsh # go to one of the worker nodes

conda run -n bgcflow mamba install bioconda::snakemake-executor-plugin-lsf -y
  • Execute the workflow

Create the profile config file:

jobs: 4
executor: lsf
default-resources:
    mem_mb: 200
#set-threads:
#    myrule: 5
set-resources:
    prokka:
        mem: 8000MB
    antismash:
        mem: 8000MB
    checkm:
        mem: 16000MB
    automlst_wrapper:
        mem: 8000MB
    bigscape:
        mem: 24000MB
    arts:
        mem: 8000MB
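
Snakemake's `--profile` option takes a directory and reads the `config.yaml` inside it, so the file above has to be saved under that name (in the final listing further down it sits at `~/bgcflow/profiles/config.yaml`). A quick sanity check before submitting anything could look like the sketch below; `check_profile_dir` is a made-up name and the check is deliberately shallow.

```shell
# Hypothetical check: does this directory look like a usable LSF profile?
# Returns 1 if config.yaml is missing, 2 if it does not select the lsf executor.
check_profile_dir() {
    profile_dir=$1
    if [ ! -f "$profile_dir/config.yaml" ]; then
        echo "no config.yaml in $profile_dir" >&2
        return 1
    fi
    # Only looks for the top-level "executor: lsf" line from the profile above.
    if ! grep -Eq '^executor:[[:space:]]*lsf[[:space:]]*$' "$profile_dir/config.yaml"; then
        echo "config.yaml does not set 'executor: lsf'" >&2
        return 2
    fi
    return 0
}
```

For example: `check_profile_dir ~/bgcflow/profiles && echo "profile looks fine"`.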

Then run the workflow:

# IMPORTANT: run this from a worker node (use linuxsh to get there)
cd ~/drive/bgcflow/
conda run -n bgcflow snakemake --executor lsf --use-conda --profile <path to a directory containing the profile config.yaml> --default-resources  lsf_project=<project name> lsf_queue=<hpc>
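
Since the command is long and easy to mistype, it may help to assemble it with a small function and look at it before running. The sketch below only prints the command; `bgcflow_lsf_cmd` is a made-up name, and any extra arguments (for instance Snakemake's `-n` for a dry run) are placed before `--default-resources` so they are not read as resource values.

```shell
# Hypothetical helper: print the snakemake command from above for review.
# It does not run anything.
bgcflow_lsf_cmd() {
    profile_dir=$1
    project=$2
    queue=$3
    if [ -z "$profile_dir" ] || [ -z "$project" ] || [ -z "$queue" ]; then
        echo "usage: bgcflow_lsf_cmd <profile dir> <project name> <queue> [extra snakemake args]" >&2
        return 1
    fi
    shift 3
    # Rebuild the argument list; remaining "$@" holds the extra snakemake args.
    set -- conda run -n bgcflow snakemake --executor lsf --use-conda \
        --profile "$profile_dir" "$@" \
        --default-resources "lsf_project=$project" "lsf_queue=$queue"
    printf '%s\n' "$*"
}
```

For example, `bgcflow_lsf_cmd ~/bgcflow/profiles <project name> <hpc> -n` prints the dry-run variant of the command, which can then be pasted into the worker node shell. Arguments containing spaces are printed unquoted, so the output is meant for reading, not for `eval`.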

To find the right queue, use: bqueues -u <user id>
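
To get just the queue names out of that listing (handy for copy-pasting into `lsf_queue=`), the output can be filtered with awk. This is a sketch that assumes the default `bqueues` layout, i.e. one header line followed by one line per queue with the name in the first column; `queue_names` is a made-up helper.

```shell
# Hypothetical filter: read `bqueues` output on stdin and print the first
# column of every non-empty line after the header.
queue_names() {
    awk 'NR > 1 && NF > 0 { print $1 }'
}
```

Usage: `bqueues -u <user id> | queue_names`.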

  • The final structure will look like this:
gbarlogin1(matinnu) $ (cd ~ && tree bgcflow/ drive/ -L 2)
bgcflow/
├── CITATION.cff
├── config
│   ├── config.yaml
│   ├── Lactobacillus_delbrueckii
│   └── lanthipeptide_lactobacillus
├── Dockerfile
├── envs.yaml
├── LICENSE
├── profiles
│   └── config.yaml
├── README.md
├── resources
│   └── README.md
└── workflow
    ├── Alleleome
    ├── BGC
    ├── bgcflow
    ├── Database
    ├── envs
    ├── lsabgc
    ├── Metabase
    ├── misc
    ├── notebook
    ├── ppanggolin
    ├── report
    ├── Report
    ├── rules
    ├── rules_bgc.yaml
    ├── rules_ppanggolin.yaml
    ├── rules.yaml
    ├── schemas
    ├── scripts
    └── Snakefile
drive/
└── bgcflow
    ├── config -> /zhome/b2/0/153431/bgcflow/config/
    └── workflow -> /zhome/b2/0/153431/bgcflow/workflow/

17 directories, 19 files
(base) ~
