Add ONT telomere test data (GIAB HG002) #1947
Merged
pinin4fjords merged 1 commit into nf-core:modules on Mar 24, 2026
17 real telomeric ONT reads for testing telogator2 and other telomere analysis tools. Downsampled from the GIAB 2025.01 ONT release (SUP basecalling, R10.4.1).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Thanks @fellen31!
pinin4fjords added a commit to nf-core/modules that referenced this pull request on Mar 24, 2026
- tlens: tlens_by_allele.tsv (primary result)
- plots: *.png (allele and violin plots, optional)
- qc: qc directory (stats, read lengths, metadata)

Also revert modules_testdata_base_path now that nf-core/test-datasets#1947 is merged.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
github-merge-queue bot pushed a commit to nf-core/modules that referenced this pull request on Mar 24, 2026
* Add new module: telogator2

  Add nf-core module for telogator2, a tool for allele-specific telomere length estimation and TVR characterization from long-read sequencing data (ONT/PacBio).

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* add telomere test data and temporarily point to fork

  - Add ONT telomere reads test (exercises real analysis path)
  - Keep PacBio no-telomere test (exercises graceful fallback)
  - Temporarily override modules_testdata_base_path to pinin4fjords/test-datasets#telogator2-test-data (revert before merge)

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: combine fasta and fai into single reference channel

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: simplify process script and assert failure on no telomere reads

  Remove error-catching wrapper from telogator2 process script. When no telomere reads are found the tool now fails with a clear error message, which the no-telomere test asserts against.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* refactor: emit individual files instead of directory

  - tlens: tlens_by_allele.tsv (primary result)
  - plots: *.png (allele and violin plots, optional)
  - qc: qc directory (stats, read lengths, metadata)

  Also revert modules_testdata_base_path now that nf-core/test-datasets#1947 is merged.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: exclude non-deterministic rng.txt from snapshot

  The qc/rng.txt file contains a random seed that differs across runs. Assert qc output exists but don't snapshot it.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: handle non-deterministic telogator2 outputs in tests

  Set fixed random seed (--rng 42) via test config. Assert tlens header structure rather than md5 since TL values vary across runs due to minimap2 non-determinism. Assert plots and qc exist without snapshotting.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: address PR review feedback

  - Use module_args pattern for ext.args in test config
  - Snapshot output file names (not md5s) for non-deterministic outputs
  - Remove PNGs from stub (plots are optional)

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix: make plots a required output

  The two main plots (all_final_alleles.png, violin_atl.png) are always produced on a successful run. Remove optional flag and add them back to the stub.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* refactor: split QC directory into individual output channels

  Emit cmd, stats, qc_readlens, readlens, and rng as separate channels instead of a single qc directory. Touch all QC files in stub.

  Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
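The commit message above describes testing around run-to-run variation by checking the header structure of tlens_by_allele.tsv and the names of the emitted files instead of content checksums. The module's actual tests are written in nf-test; what follows is only a minimal Python sketch of the same idea. The column names and both helper functions are hypothetical placeholders, since the real telogator2 header is not shown in this thread.

```python
import csv
import io
from pathlib import PurePosixPath

# Hypothetical column names, for illustration only; the real
# tlens_by_allele.tsv header is not given in this thread.
EXPECTED_COLUMNS = ["chr", "allele_id", "tl_p75"]


def header_matches(tsv_text, expected=EXPECTED_COLUMNS):
    """Return True if the first row of the TSV equals the expected columns."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, None)  # None when the file is empty
    return header == list(expected)


def output_names(paths):
    """Reduce output paths to sorted base names, ignoring file contents."""
    return sorted(PurePosixPath(p).name for p in paths)


# Two runs whose values differ still pass the structural check,
# whereas an md5 comparison of the two files would fail.
run_a = "chr\tallele_id\ttl_p75\nchr1p\t0\t5123\n"
run_b = "chr\tallele_id\ttl_p75\nchr1p\t0\t5098\n"
assert header_matches(run_a) and header_matches(run_b)
```

The trade-off is that a structural check will not catch a regression that changes values but keeps the header, which is the price of tolerating outputs that legitimately vary between runs.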
Summary
data/genomics/homo_sapiens/nanopore/bam/HG002_ont_telomere/HG002_ont_tel_sub.bam (~491 KB) + index

Context
Needed for the new telogator2 nf-core module (nf-core/modules#11033). Existing PacBio test BAMs don't contain telomere reads, so the module's tests can only exercise telogator2's "no telomere reads" fallback path.

🤖 Generated with Claude Code
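To illustrate the distinction the Context draws between reads with and without telomere sequence, here is a rough Python sketch that counts back-to-back copies of the canonical human telomere repeat (TTAGGG, or CCCTAA on the opposite strand) in a read. This is not how telogator2 identifies telomere reads. The function names and the threshold of 10 repeat units are assumptions chosen purely for illustration.

```python
import re

# Canonical human telomere repeat and its reverse complement.
FORWARD = re.compile(r"(?:TTAGGG)+")
REVERSE = re.compile(r"(?:CCCTAA)+")


def longest_repeat_run(seq):
    """Return the largest number of back-to-back TTAGGG or CCCTAA units."""
    seq = seq.upper()
    best = 0
    for pattern in (FORWARD, REVERSE):
        for match in pattern.finditer(seq):
            # Each repeat unit is 6 bases long.
            best = max(best, len(match.group(0)) // 6)
    return best


def looks_telomeric(seq, min_units=10):
    """Crude screen: True if the read has a run of at least min_units repeats.

    min_units=10 is an arbitrary illustrative threshold, not a telogator2 setting.
    """
    return longest_repeat_run(seq) >= min_units
```

An exact-match screen like this is deliberately simplistic: it ignores sequencing errors and telomere variant repeats, both of which a real tool has to handle.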