Add new module: telogator2 #11033

Merged
pinin4fjords merged 12 commits into master from new-module/telogator2
Mar 24, 2026

@pinin4fjords
Member

@pinin4fjords pinin4fjords commented Mar 24, 2026

Summary

  • Add nf-core module for telogator2, a tool for allele-specific telomere length (TL) estimation and telomere variant repeat (TVR) characterization from long-read sequencing data (ONT/PacBio)
  • Uses bioconda package telogator2=2.2.3 with Wave-built Docker and Singularity containers
  • Accepts BAM/CRAM input with optional reference genome, uses topic-based version reporting

Inputs

| Input | Type | Description |
|---|---|---|
| reads | BAM/CRAM | Long-read alignments (PacBio HiFi or ONT) |
| reads_index | BAI/CRAI | Index for reads |
| fasta + fai | FASTA + FAI | Optional reference genome (single channel) |

Outputs

| Output | Type | Description |
|---|---|---|
| tlens | TSV | Telomere lengths per allele (tlens_by_allele.tsv) |
| plots | PNG | Allele and violin plots (optional) |
| qc | Directory | QC stats, read length distributions, run metadata |

Notes

  • The tool exits with code 1 and a clear error message when no telomere reads are found in the input. The module does not catch this failure, so pipelines should handle it themselves if some samples are expected to lack telomere reads.
  • Test data uses real ONT telomeric reads from GIAB HG002 (test-datasets#1947, "Add ONT telomere test data (GIAB HG002)"), plus a PacBio BAM without telomere reads to test the failure path.
  • TL values are non-deterministic across runs (minimap2 alignment), so tests assert the TSV header structure and output existence rather than exact md5 checksums. A fixed random seed (--rng 42) is set in the test config.
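
The header-only assertion described in the last note can be sketched in Python as follows. This is illustrative only: the module's real tests are written with nf-test, and the column names used here (`chr`, `allele_id`, `tl`) are hypothetical placeholders, not telogator2's actual header.

```python
import csv
import io

# Hypothetical column names for illustration only; the real header of
# tlens_by_allele.tsv is not reproduced in this PR.
EXPECTED_HEADER = ["chr", "allele_id", "tl"]


def header_matches(tsv_text, expected):
    """Return True if the first row of the TSV text equals the expected columns."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    first_row = next(reader, None)  # None when the text is empty
    return first_row == expected


# Two made-up runs whose values differ but whose header is identical.
run_a = "chr\tallele_id\ttl\nchr1p\t0\t5123\n"
run_b = "chr\tallele_id\ttl\nchr1p\t0\t5098\n"

print(header_matches(run_a, EXPECTED_HEADER), header_matches(run_b, EXPECTED_HEADER))
```

A checksum comparison of `run_a` and `run_b` would fail because the values differ, while the header check passes for both, which is the trade-off the note describes.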

Test plan

  • nf-test passes for all 3 tests (telomere reads, no telomere reads, stub) with --profile docker
  • Checked stderr for silent errors - none found
  • CI nf-test validation

🤖 Generated with Claude Code

Add nf-core module for telogator2, a tool for allele-specific telomere
length estimation and TVR characterization from long-read sequencing
data (ONT/PacBio).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pinin4fjords pinin4fjords marked this pull request as draft March 24, 2026 11:26
@pinin4fjords
Member Author

pinin4fjords commented Mar 24, 2026

TODO: Generate telomere-containing long-read test data so the tests exercise telogator2's actual analysis path, not just the "no telomere reads found" fallback.

pinin4fjords and others added 8 commits March 24, 2026 11:45
- Add ONT telomere reads test (exercises real analysis path)
- Keep PacBio no-telomere test (exercises graceful fallback)
- Temporarily override modules_testdata_base_path to
  pinin4fjords/test-datasets#telogator2-test-data (revert before merge)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove error-catching wrapper from telogator2 process script. When no
telomere reads are found the tool now fails with a clear error message,
which the no-telomere test asserts against.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
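
Since the module no longer wraps the failure, a caller that expects some samples without telomere reads has to classify the exit code itself. The Python sketch below shows one way to do that in principle. It is not part of the module: `run_tool` is a hypothetical helper, and the command is a stand-in that only mimics a process exiting with code 1, not a real telogator2 invocation.

```python
import subprocess
import sys


def run_tool(cmd):
    """Run a command and classify the outcome by its exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return "ok", result.stderr
    if result.returncode == 1:
        # Per the PR notes, telogator2 exits with code 1 when no telomere
        # reads are found. Other failures may share this code, so the
        # caller should inspect stderr before treating it as benign.
        return "failed_exit_1", result.stderr
    return "failed_other", result.stderr


# Stand-in command: writes a placeholder message to stderr and exits with 1.
fake_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('placeholder message\\n'); sys.exit(1)",
]
status, stderr = run_tool(fake_cmd)
print(status)
```

In a Nextflow pipeline the equivalent decision would normally be made through the process error-handling configuration rather than a wrapper script.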
- tlens: tlens_by_allele.tsv (primary result)
- plots: *.png (allele and violin plots, optional)
- qc: qc directory (stats, read lengths, metadata)

Also revert modules_testdata_base_path now that
nf-core/test-datasets#1947 is merged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The qc/rng.txt file contains a random seed that differs across runs.
Assert qc output exists but don't snapshot it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Set fixed random seed (--rng 42) via test config. Assert tlens header
structure rather than md5 since TL values vary across runs due to
minimap2 non-determinism. Assert plots and qc exist without
snapshotting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pinin4fjords pinin4fjords marked this pull request as ready for review March 24, 2026 12:42
Contributor

@fellen31 fellen31 left a comment

Nice!

pinin4fjords and others added 2 commits March 24, 2026 13:22
- Use module_args pattern for ext.args in test config
- Snapshot output file names (not md5s) for non-deterministic outputs
- Remove PNGs from stub (plots are optional)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
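
The idea of snapshotting file names instead of md5s can be illustrated with a small Python sketch. It is an analogy only, since the actual tests use nf-test snapshots: `snapshot_names` and `write_run` are hypothetical helpers, the file names are the ones mentioned elsewhere in this PR, and the file contents are made up.

```python
import tempfile
from pathlib import Path


def snapshot_names(directory):
    """Return the sorted file names in a directory, ignoring file contents."""
    return sorted(p.name for p in Path(directory).iterdir() if p.is_file())


def write_run(directory, contents):
    """Write each name -> text pair as a file in the given directory."""
    for name, data in contents.items():
        (Path(directory) / name).write_text(data)


# Two simulated runs: same output files, different (made-up) contents.
with tempfile.TemporaryDirectory() as run_a, tempfile.TemporaryDirectory() as run_b:
    write_run(run_a, {"tlens_by_allele.tsv": "5123\n",
                      "all_final_alleles.png": "a", "violin_atl.png": "a"})
    write_run(run_b, {"tlens_by_allele.tsv": "5098\n",
                      "all_final_alleles.png": "b", "violin_atl.png": "b"})
    names_a = snapshot_names(run_a)
    names_b = snapshot_names(run_b)

print(names_a == names_b)
```

The name-based snapshot stays stable across runs even though every file's content, and therefore its checksum, differs.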
The two main plots (all_final_alleles.png, violin_atl.png) are always
produced on a successful run. Remove optional flag and add them back
to the stub.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pinin4fjords pinin4fjords requested a review from fellen31 March 24, 2026 13:30
Contributor

@fellen31 fellen31 left a comment

I would just touch all QC files in the stub section, so that they are output correctly in a pipeline stub run. Otherwise, everything looks good!

Emit cmd, stats, qc_readlens, readlens, and rng as separate channels
instead of a single qc directory. Touch all QC files in stub.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@pinin4fjords pinin4fjords requested a review from fellen31 March 24, 2026 13:45
Contributor

@fellen31 fellen31 left a comment

Looks good to me!

@pinin4fjords pinin4fjords added this pull request to the merge queue Mar 24, 2026
Merged via the queue into master with commit 690bb56 Mar 24, 2026
25 checks passed
@pinin4fjords pinin4fjords deleted the new-module/telogator2 branch March 24, 2026 13:59