Conversation
|
Warning Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.2.0. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation. |
Merge remote-tracking branch 'origin/doc_update' into topas # Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit.
There was a problem hiding this comment.
Pull request overview
This PR integrates TOPAS (a network expansion algorithm) into the disease module discovery pipeline. TOPAS connects seed nodes in a top-down manner by selecting connector nodes that ensure the highest Random Walk with Restart flow across the network.
Key changes:
- Added TOPAS as a new network expansion method with configurable skip parameter
- Implemented TOPAS workflow, module, and R wrapper script
- Extended module parser to support TOPAS output format
Reviewed changes
Copilot reviewed 6 out of 9 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| subworkflows/local/networkexpansion/main.nf | Integrated TOPAS workflow into the network expansion pipeline with conditional execution |
| subworkflows/local/gt_topas/main.nf | New workflow that prepares inputs and executes TOPAS |
| nextflow_schema.json | Added skip_topas parameter definition with appropriate metadata |
| nextflow.config | Added skip_topas configuration parameter defaulting to false |
| modules/local/topas/main.nf | New process definition for running TOPAS via R script |
| modules/local/diamond/main.nf | Minor spacing fix in comment (removed extra space) |
| docs/output.md | Added TOPAS documentation section describing the algorithm and output files |
| bin/run_topas.R | New R wrapper script that invokes TOPAS with network and seed inputs |
| bin/module_parser.py | Extended parser to handle TOPAS output format (tab-separated edge list) |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| ch_versions = Channel.empty() | ||
|
|
||
| GRAPHTOOLPARSER(ch_network, "robust") |
There was a problem hiding this comment.
The GRAPHTOOLPARSER format parameter should be "topas" instead of "robust". Each workflow consistently uses its own tool name as the format parameter (see gt_diamond/main.nf:20, gt_rwr/main.nf:20, gt_domino/main.nf:22). Using "robust" will create incorrectly named output files.
| GRAPHTOOLPARSER(ch_network, "robust") | |
| GRAPHTOOLPARSER(ch_network, "topas") |
| ch_seeds | ||
| ch_network |
There was a problem hiding this comment.
Missing input channel documentation comments. Other similar workflow files include inline comments documenting the expected channel structure (see gt_robust/main.nf:10-11, gt_rwr/main.nf:10-14). Add comments like:
ch_seeds // channel: [ val(meta[id,seeds_id,network_id]), path(seeds) ]
ch_network // channel: [ val(meta[id,network_id]), path(network) ]
docs/output.md
Outdated
|
|
||
| ### TOPAS | ||
|
|
||
| [Topas] (https://bitbucket.org/sonnhammergroup/topas/src/master/) connects the maximum number of seed nodes in a top-down manner by selecting connector nodes that ensure the highest Random Walk with Restart flow across the network. |
There was a problem hiding this comment.
Inconsistent capitalization. The section header uses "TOPAS" (all caps) but the link text uses "Topas" (title case). For consistency with other sections in this file (DIAMOnD, DOMINO, RWR, ROBUST), use "TOPAS" in the link text as well.
| ch_raw_modules = ch_raw_modules.mix(GT_TOPAS.out.module) | ||
| } | ||
|
|
||
|
|
There was a problem hiding this comment.
Extra blank line at end of file. Remove this additional blank line for consistency with the existing code style.
| @@ -0,0 +1,24 @@ | |||
| process TOPAS{ | |||
There was a problem hiding this comment.
Missing space before opening brace. For consistency with other process definitions in the codebase (e.g., modules/local/robust/main.nf:2, modules/local/rwr/main.nf:2), add a space before the opening brace.
| include { GRAPHTOOLPARSER } from '../../../modules/local/graphtoolparser/main' | ||
| include { TOPAS } from '../../../modules/local/topas/main' | ||
|
|
There was a problem hiding this comment.
Missing header comment block. All other similar workflow files (gt_robust, gt_rwr, gt_diamond, gt_domino) include a header comment describing the workflow's purpose. Add a comment block like:
//
// Prepares the input for TOPAS and runs the tool
//
| tuple val(meta), path("${meta.id}.topas.txt") , emit: module | ||
| path "versions.yml" , emit: versions | ||
|
|
||
| when: | ||
| task.ext.when == null || task.ext.when | ||
|
|
||
| script: | ||
| """ | ||
| run_topas.R -n $network -s $seeds -o ${meta.id}.topas.txt |
There was a problem hiding this comment.
The meta.id value is interpolated directly into the shell command (run_topas.R -n $network -s $seeds -o ${meta.id}.topas.txt) without any quoting or sanitization, which allows shell command injection when meta.id contains characters like ;, &, or backticks. Because meta.id is ultimately derived from user-controlled seed and network filenames (seeds.baseName and network.baseName), an attacker who can choose these names could craft a value such as foo; rm -rf / and cause arbitrary commands to execute inside the TOPAS container when this process runs. Restrict meta.id to a safe character set (e.g. alphanumerics, _, -) before it reaches this process and/or ensure it is safely shell-quoted when building the command line so that it is treated purely as a filename.
| tuple val(meta), path("${meta.id}.topas.txt") , emit: module | |
| path "versions.yml" , emit: versions | |
| when: | |
| task.ext.when == null || task.ext.when | |
| script: | |
| """ | |
| run_topas.R -n $network -s $seeds -o ${meta.id}.topas.txt | |
| tuple val(meta), path("${meta.id.replaceAll(/[^A-Za-z0-9_-]/,'_')}.topas.txt") , emit: module | |
| path "versions.yml" , emit: versions | |
| when: | |
| task.ext.when == null || task.ext.when | |
| script: | |
| def safeId = meta.id.replaceAll(/[^A-Za-z0-9_-]/,'_') | |
| """ | |
| run_topas.R -n $network -s $seeds -o ${safeId}.topas.txt |
Co-authored-by: Copilot <[email protected]>
PR checklist
nf-core pipelines lint).nextflow run . -profile test,docker --outdir <OUTDIR>).nextflow run . -profile debug,test,docker --outdir <OUTDIR>).docs/usage.mdis updated.docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).