RecordContext handles normalized fields and writes Parquet. Format-specific reading should sit behind a trait so that new formats plug in without touching the processing core:
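trait VariantReader {
    fn sample_names(&self) -> &[String];
    // Boxed rather than `impl Iterator`: a method returning `impl Trait` would
    // make the trait unusable as `dyn VariantReader`, which `ingest` relies on.
    fn records(&mut self) -> Box<dyn Iterator<Item = Result<RawRecord, CohortError>> + '_>;
}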
Implementations: VcfVariantReader (noodles-vcf), BcfVariantReader (noodles-bcf), BgenVariantReader, GdsVariantReader (SeqArray/HDF5).
Each reader owns its parallelism strategy (BGZF threads, tabix region splits, BGEN .bgi index, HDF5 chunks). RecordContext stays format-agnostic.
Composition: fn ingest(reader: &mut dyn VariantReader, ctx: &mut RecordContext, output: &dyn Output)
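A minimal sketch of how this composition could look end to end, to check that the pieces fit together behind `dyn VariantReader`. Everything beyond the names in this issue is an assumption for illustration: the fields of `RawRecord`, the `CohortError::Parse` variant, the counters on `RecordContext`, the `Output::write` method, the `Result` return type of `ingest`, and the `InMemoryReader` / `MemoryOutput` test doubles are all hypothetical. The iterator is boxed here so that the trait can be used as a trait object.

```rust
use std::cell::RefCell;

/// Hypothetical raw record; the real type would carry format-specific data.
#[derive(Debug, Clone, PartialEq)]
pub struct RawRecord {
    pub chrom: String,
    pub pos: u64,
    pub genotypes: Vec<u8>,
}

/// Hypothetical error type with a single illustrative variant.
#[derive(Debug, PartialEq)]
pub enum CohortError {
    Parse(String),
}

pub trait VariantReader {
    fn sample_names(&self) -> &[String];
    /// Boxed so the trait can be used as `dyn VariantReader`.
    fn records(&mut self) -> Box<dyn Iterator<Item = Result<RawRecord, CohortError>> + '_>;
}

/// Hypothetical stand-in for the format-agnostic core: it only counts here.
#[derive(Default)]
pub struct RecordContext {
    pub n_samples: usize,
    pub n_records: usize,
}

/// Hypothetical sink; `&self` matches the `&dyn Output` parameter of `ingest`.
pub trait Output {
    fn write(&self, record: &RawRecord) -> Result<(), CohortError>;
}

/// Drives any reader through the shared core, stopping at the first error.
pub fn ingest(
    reader: &mut dyn VariantReader,
    ctx: &mut RecordContext,
    output: &dyn Output,
) -> Result<(), CohortError> {
    ctx.n_samples = reader.sample_names().len();
    for record in reader.records() {
        let record = record?;
        output.write(&record)?;
        ctx.n_records += 1;
    }
    Ok(())
}

/// Test double: yields pre-built records instead of parsing a file.
pub struct InMemoryReader {
    pub samples: Vec<String>,
    pub pending: Vec<Result<RawRecord, CohortError>>,
}

impl VariantReader for InMemoryReader {
    fn sample_names(&self) -> &[String] {
        &self.samples
    }

    fn records(&mut self) -> Box<dyn Iterator<Item = Result<RawRecord, CohortError>> + '_> {
        // `drain` hands out owned items while borrowing `self` mutably.
        Box::new(self.pending.drain(..))
    }
}

/// Test double: collects rows in memory behind a `RefCell`.
#[derive(Default)]
pub struct MemoryOutput {
    pub rows: RefCell<Vec<RawRecord>>,
}

impl Output for MemoryOutput {
    fn write(&self, record: &RawRecord) -> Result<(), CohortError> {
        self.rows.borrow_mut().push(record.clone());
        Ok(())
    }
}
```

With this shape, a new format only needs its own `VariantReader` impl; `ingest`, `RecordContext`, and the output side are unchanged. The cost of boxing is one allocation per `records()` call plus dynamic dispatch per record, which is likely small next to decoding, though that has not been measured here.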
Related: #69, #74, #86