
# Runtime and accuracy metrics for all release models

## Setup

The runtime and accuracy numbers reported on this page were generated using n2-standard-96 GCP instances, which have the following configuration:

*   GCP instance type: n2-standard-96
*   CPUs: 96-core (vCPU)
*   Memory: 384GiB
*   GPUs: 0
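An instance matching this configuration can be created with `gcloud` before running the case studies. This is only a sketch: the instance name, zone, image family, and boot disk size below are illustrative placeholders, not values specified on this page.

```shell
# Sketch only: create an n2-standard-96 instance for benchmarking.
# The name, zone, image, and disk size are illustrative; adjust for your project.
gcloud compute instances create "${USER}-deepvariant-runtime" \
  --machine-type=n2-standard-96 \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud \
  --boot-disk-size=300GB \
  --zone=us-central1-b
```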

## WGS (Illumina)

### Runtime

Runtime is on HG003 (all chromosomes). Reported runtime is an average of 5 runs.

| Stage                            | Time (minutes)           |
| -------------------------------- | ------------------------ |
| make_examples                    | 45m13.77s                |
| call_variants                    | 16m25.61s                |
| postprocess_variants (with gVCF) | 6m51.14s                 |
| vcf_stats_report (optional)      | 5m16.42s                 |
| total                            | 78m57.99s (1h18m57.99s)  |

### Accuracy

hap.py results on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out during training.

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 501527   | 2974     | 1262     | 0.994105      | 0.997591         | 0.995845        |
| SNP   | 3306720  | 20776    | 4900     | 0.993756      | 0.998521         | 0.996133        |
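As a sanity check, METRIC.Recall can be recomputed from the raw counts in the table (recall = TRUTH.TP / (TRUTH.TP + TRUTH.FN)). Note that hap.py derives precision from QUERY.TP, which differs from TRUTH.TP and is not shown in these tables, so precision cannot be reproduced exactly from these columns alone. A sketch using the WGS SNP row:

```shell
# Recompute METRIC.Recall for the WGS SNP row from TRUTH.TP and TRUTH.FN.
# hap.py's precision uses QUERY.TP / (QUERY.TP + QUERY.FP), and QUERY.TP is
# not listed in the table above, so only recall is checked here.
awk 'BEGIN {
  tp = 3306720; fn = 20776           # SNP row: TRUTH.TP, TRUTH.FN
  printf "recall=%.6f\n", tp / (tp + fn)
}'
# prints: recall=0.993756
```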

See VCF stats report.

## WES (Illumina)

### Runtime

Runtime is on HG003 (all chromosomes). Reported runtime is an average of 5 runs.

| Stage                            | Time (minutes) |
| -------------------------------- | -------------- |
| make_examples                    | 3m0.54s        |
| call_variants                    | 0m33.30s       |
| postprocess_variants (with gVCF) | 0m38.91s       |
| vcf_stats_report (optional)      | 0m4.97s        |
| total                            | 4m45.64s       |

### Accuracy

hap.py results on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out during training.

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 1024     | 27       | 8        | 0.97431       | 0.992417         | 0.98328         |
| SNP   | 24983    | 296      | 60       | 0.988291      | 0.997604         | 0.992926        |

See VCF stats report.

## PacBio (HiFi)

### Updated dataset

We have updated the PacBio test data from HG003 Sequel II to the latest Revio data with SPRQ chemistry to showcase performance on the updated platform and chemistry. The numbers reported here were generated using the BAM at:

`gs://deepvariant/pacbio-case-study-testdata/HG003.SPRQ.pacbio.GRCh38.nov2024.bam`

It is also available here.

### Runtime

Runtime is on HG003 (all chromosomes). Reported runtime is an average of 5 runs.

| Stage                            | Time (minutes)          |
| -------------------------------- | ----------------------- |
| make_examples                    | 36m48.09s               |
| call_variants                    | 11m33.13s               |
| postprocess_variants (with gVCF) | 4m47.06s                |
| vcf_stats_report (optional)      | 5m26.10s                |
| total                            | 66m14.44s (1h06m14.44s) |

### Accuracy

hap.py results on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out during training.

Starting from v1.4.0, users don't need to phase the BAMs first, and only need to run DeepVariant once.

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 501348   | 3153     | 3117     | 0.99375       | 0.994046         | 0.993898        |
| SNP   | 3321474  | 6021     | 3903     | 0.998191      | 0.998828         | 0.998509        |

See VCF stats report.

## ONT_R104

### Runtime

Runtime is on HG003 reads (all chromosomes). Reported runtime is an average of 5 runs.

| Stage                            | Time (minutes)         |
| -------------------------------- | ---------------------- |
| make_examples                    | 55m56.13s              |
| call_variants                    | 17m29.76s              |
| postprocess_variants (with gVCF) | 5m58.82s               |
| vcf_stats_report (optional)      | 6m23.70s               |
| total                            | 91m6.31s (1h31m6.31s)  |

### Accuracy

hap.py results on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out during training.

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 450891   | 53610    | 40728    | 0.893737      | 0.919559         | 0.906464        |
| SNP   | 3319370  | 8125     | 2954     | 0.997558      | 0.999111         | 0.998334        |

See VCF stats report.

## Hybrid (Illumina + PacBio HiFi)

### Runtime

Runtime is on HG003 (all chromosomes). Reported runtime is an average of 5 runs.

| Stage                            | Time (minutes)           |
| -------------------------------- | ------------------------ |
| make_examples                    | 62m2.28s                 |
| call_variants                    | 65m3.32s                 |
| postprocess_variants (with gVCF) | 3m43.18s                 |
| vcf_stats_report (optional)      | 5m6.89s                  |
| total                            | 154m30.64s (2h34m30.64s) |

### Accuracy

Evaluating on HG003 (all chromosomes, using NIST v4.2.1 truth), which was held out during training of the hybrid model.

| Type  | TRUTH.TP | TRUTH.FN | QUERY.FP | METRIC.Recall | METRIC.Precision | METRIC.F1_Score |
| ----- | -------- | -------- | -------- | ------------- | ---------------- | --------------- |
| INDEL | 503160   | 1341     | 2243     | 0.997342      | 0.99577          | 0.996555        |
| SNP   | 3323907  | 3588     | 1981     | 0.998922      | 0.999405         | 0.999163        |

See VCF stats report.

## Inspect outputs that produced the metrics above

The DeepVariant VCFs, gVCFs, and hap.py evaluation outputs are available at:

`gs://deepvariant/case-study-outputs`

You can also inspect them in a web browser here: https://42basepairs.com/browse/gs/deepvariant/case-study-outputs

## How to reproduce the metrics on this page

For simplicity and consistency, we report runtime with a CPU instance with 96 CPUs. This is NOT the fastest or cheapest configuration.

Use `gcloud compute ssh` to log in to the newly created instance.

Download and run any of the following case study scripts:

```bash
# Get the script.
curl -O https://raw.githubusercontent.com/google/deepvariant/r1.9/scripts/inference_deepvariant.sh

# WGS
bash inference_deepvariant.sh --model_preset WGS

# WES
bash inference_deepvariant.sh --model_preset WES

# PacBio
bash inference_deepvariant.sh --model_preset PACBIO

# ONT_R104
bash inference_deepvariant.sh --model_preset ONT_R104

# Hybrid
bash inference_deepvariant.sh --model_preset HYBRID_PACBIO_ILLUMINA
```

Runtime metrics are taken from the log emitted after each stage of DeepVariant; the runtime numbers reported above are the average of 5 runs each. The accuracy metrics come from the hap.py summary.csv output file. The runs are deterministic, so all 5 runs produced the same output.
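The per-type rows shown in the tables above can be pulled out of a hap.py summary.csv with a short awk one-liner. This is a sketch: the column names (`Type`, `Filter`, `METRIC.Recall`, ...) follow hap.py's standard summary.csv header, and resolving them by name from the header row avoids hard-coding column positions, which vary across hap.py versions.

```shell
# Sketch: extract Type, Filter, recall, precision, and F1 from a hap.py
# summary.csv. Column indices are resolved from the header row by name,
# so the script does not depend on exact column positions.
awk -F, '
NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
$(col["Type"]) ~ /^(INDEL|SNP)$/ {
  printf "%s %s %s %s %s\n", $(col["Type"]), $(col["Filter"]),
         $(col["METRIC.Recall"]), $(col["METRIC.Precision"]),
         $(col["METRIC.F1_Score"])
}' summary.csv
```

Depending on which filter level you want (hap.py reports both `ALL` and `PASS` rows), add a condition such as `$(col["Filter"]) == "PASS"` to the pattern.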