When ingesting many VCF blocks (e.g. 23 UKB chr22 blocks), a failure at file 18 of 23 loses all prior work. The entire ingest restarts from zero.
A simple checkpoint mechanism would let us skip already-completed files on retry:
- After each file's worker finishes, record it in a progress.json alongside the output
- On restart, check which part files already exist and skip those workers
- Final metadata merge (scan_and_register) runs after all files are done
This matters most for the parallel ingest path where jobs run on shared HPC queues and can be preempted.
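
The steps above could be sketched roughly as follows. This is a minimal, sequential illustration rather than the actual ingest code: `ingest_with_checkpoint`, `ingest_one`, `finalize`, the `.part` suffix, and the `{"completed": [...]}` layout of progress.json are all assumptions made for the example. `finalize` stands in for the final metadata merge (scan_and_register).

```python
import json
import os
import tempfile
from pathlib import Path


def load_progress(progress_path):
    """Return the set of input files recorded as completed (empty if no checkpoint yet)."""
    try:
        with open(progress_path) as fh:
            return set(json.load(fh)["completed"])
    except FileNotFoundError:
        return set()


def save_progress(progress_path, completed):
    """Write progress.json via a temp file + rename, so a preempted job
    cannot leave a half-written checkpoint behind."""
    progress_path = Path(progress_path)
    fd, tmp = tempfile.mkstemp(dir=progress_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump({"completed": sorted(completed)}, fh)
    os.replace(tmp, progress_path)


def ingest_with_checkpoint(vcf_files, out_dir, ingest_one, finalize):
    """Ingest each file once, skipping files finished in a previous run.

    ingest_one(vcf, part_path) and finalize(out_dir) are hypothetical
    callables supplied by the caller.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    progress_path = out_dir / "progress.json"
    completed = load_progress(progress_path)

    for vcf in vcf_files:
        key = str(vcf)
        part = out_dir / (Path(key).name + ".part")  # assumed naming scheme
        # Skip only if the checkpoint AND the part file agree the work is done.
        if key in completed and part.exists():
            continue
        ingest_one(key, part)
        completed.add(key)
        save_progress(progress_path, completed)

    # Runs only once every file has succeeded.
    finalize(out_dir)
```

Requiring both the progress.json entry and the part file guards against a worker that was killed mid-write. In the parallel path, several workers updating one progress.json would need a lock, or each worker could write its own small marker file instead; that choice is left open here.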