Rdata I/O: phenotype input and per-mask output for STAARpipelineSummary

Running STAARpipeline-Tutorial end-to-end via favor-cli needs Rdata support at both ends:
- input: the phenotype is often shipped as an .Rdata data frame
- output: STAARpipelineSummary calls `get(load(.))` on per-shard output files and expects named R objects

Current state: phenotype loading supports CSV/TSV only; outputs are written as parquet plus JSON metadata.

Tutorial expectations (from STAARpipelineSummary scripts):
- individual: one data frame per shard, filename `<output>_<chr>_<groupid>.Rdata`, columns CHR,POS,REF,ALT,ALT_AF,MAC,N,pvalue,Score,SE,Est
- gene-centric coding/noncoding/ncRNA: list of mask data frames; columns include Gene,Chr,Category,#SNV,cMAC,MAF_cutoff,STAAR-O,ACAT-O,STAAR-S(1,25),STAAR-S(1,1),STAAR-B(1,25),STAAR-B(1,1),STAAR-A(1,25),STAAR-A(1,1), plus per-annotation sub p-values
- sliding window: same column shape keyed by chr,start_loc,end_loc
- SCANG: list with `_res`, `_top1`, and `_emthr` entries for each of SCANG_O, SCANG_S, and SCANG_B
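
As a rough illustration of the individual-analysis expectation above, here is a minimal Rust sketch of the shard filename pattern and a strict column-name check. The names `INDIVIDUAL_COLUMNS`, `individual_shard_filename`, and `check_columns` are hypothetical and not part of favor-cli; passing `<chr>` and `<groupid>` as plain strings is also an assumption, since their exact formatting is not specified here.

```rust
/// Column names expected in the individual-analysis data frame,
/// in the order listed in this issue.
pub const INDIVIDUAL_COLUMNS: [&str; 11] = [
    "CHR", "POS", "REF", "ALT", "ALT_AF", "MAC", "N", "pvalue", "Score", "SE", "Est",
];

/// Builds `<output>_<chr>_<groupid>.Rdata`.
/// Assumption: `chr` and `group_id` are already formatted the way the
/// summary scripts expect, so they are passed through unchanged.
pub fn individual_shard_filename(output: &str, chr: &str, group_id: &str) -> String {
    format!("{output}_{chr}_{group_id}.Rdata")
}

/// Returns an error describing the first mismatch between `actual` and
/// `expected`. The comparison is case-sensitive and order-sensitive.
pub fn check_columns(actual: &[&str], expected: &[&str]) -> Result<(), String> {
    if actual.len() != expected.len() {
        return Err(format!(
            "expected {} columns, found {}",
            expected.len(),
            actual.len()
        ));
    }
    for (i, (a, e)) in actual.iter().zip(expected).enumerate() {
        if a != e {
            return Err(format!("column {i}: expected `{e}`, found `{a}`"));
        }
    }
    Ok(())
}
```

A check like this could run just before a shard is written, so that a renamed column fails in favor-cli rather than later inside the summary scripts.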

Needs:
- Rdata reader for phenotype input (serde-rdata or equivalent)
- Rdata writer for per-shard outputs
- --output-format flag accepting parquet (default), rdata, or both
- object and column names that match the STAARpipelineSummary load sites exactly
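
One possible shape for the `--output-format` value, sketched in Rust using only the standard library. The `OutputFormat` type and its helper methods are hypothetical; the accepted strings (`parquet`, `rdata`, `both`) and the parquet default come from the list above, while case-sensitive matching is an assumption.

```rust
use std::str::FromStr;

/// Hypothetical value type for the proposed `--output-format` flag.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum OutputFormat {
    /// Current behaviour, kept as the default.
    #[default]
    Parquet,
    Rdata,
    Both,
}

impl OutputFormat {
    /// True when parquet shards should be written.
    pub fn writes_parquet(self) -> bool {
        matches!(self, Self::Parquet | Self::Both)
    }

    /// True when Rdata shards should be written.
    pub fn writes_rdata(self) -> bool {
        matches!(self, Self::Rdata | Self::Both)
    }
}

impl FromStr for OutputFormat {
    type Err = String;

    // Assumption: values are matched case-sensitively, in lowercase.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "parquet" => Ok(Self::Parquet),
            "rdata" => Ok(Self::Rdata),
            "both" => Ok(Self::Both),
            other => Err(format!(
                "unknown output format `{other}` (expected parquet, rdata, or both)"
            )),
        }
    }
}
```

Keeping the two `writes_*` predicates separate lets each writer check only its own condition, so `both` needs no special-casing at the call sites.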

Rdata I/O: phenotype input and per-mask output for STAARpipelineSummary #109
