jmillerlab/prs_comparisons

prs_comparisons

This repository contains the scripts necessary to replicate the polygenic risk score comparisons on the ADSP r5 dataset using GenoPred. Postprocessing scripts are also included.

Step 1: Calculate polygenic risk scores (PRS) for each person using GenoPred and combine those scores into a single file called final_merged_output.csv (see the GenoPred directory)

Step 2: Separate cases and controls from other data in ADSP using separateCaseControl.py

Step 3: Extract the needed columns from the ADSP dataset for cases and controls using createCSVfromSeparatedPheno.py

Step 4: Extract needed columns from ADNI using readADNI.py

Step 5: Add ADNI output to ADSP output using addADNItoADSP.sh

Step 6: Combine phenotype data with PRS data using mergePRS_pdata.py

Step 7: Separate combined data based on genetic ancestry using ancestry projections from GenoPred and separateByGeneticAncestry.py

Step 8: Remove highly correlated features in each population-separated CSV using assessFeatureCorrelation.py, which measures correlation as the absolute value of Spearman's rho. The correlation threshold is configurable; the default is 0.7.

Step 9: Determine the optimal threshold for maximizing precision for each PRS in each population using findOptimalThreshold.py. Each candidate threshold is a PRS value, and an individual meets the threshold when their PRS is greater than or equal to that value. By default, at least 10 individuals must meet a threshold.

Step 10: Plot the best PRS for every threshold, stratified by APOE diplotype, using plot_feature.py.
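
The correlation filter in Step 8 can be illustrated with a minimal, standard-library sketch. This is not the code in assessFeatureCorrelation.py: the function names, the toy feature columns, the use of `>=` against the threshold, and the choice to keep the first feature of each correlated pair are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - my) ** 2 for b in ry) ** 0.5
    if spread_x == 0 or spread_y == 0:
        return 0.0  # assumption: treat a constant column as uncorrelated
    return covariance / (spread_x * spread_y)


def drop_correlated(columns, threshold=0.7):
    """Keep the first feature of each highly correlated pair (assumed policy)."""
    dropped = set()
    for first, second in combinations(columns, 2):
        if first in dropped or second in dropped:
            continue
        if abs(spearman_rho(columns[first], columns[second])) >= threshold:
            dropped.add(second)
    return [name for name in columns if name not in dropped]


# Hypothetical feature columns; the names and values are made up.
features = {
    "prs_a": [0.1, 0.4, 0.2, 0.9, 0.7],
    "prs_b": [1.0, 4.1, 2.2, 9.5, 7.3],  # same rank order as prs_a
    "prs_c": [0.5, 0.1, 0.9, 0.3, 0.2],
}
kept = drop_correlated(features)
print(kept)
```

Here `prs_b` is removed because its rank order matches `prs_a` exactly (rho of 1), while `prs_c` is kept because its absolute rho against `prs_a` is 0.5, below the default of 0.7.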

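The threshold search in Step 9 can be sketched in the same way. This is a hedged illustration rather than the logic of findOptimalThreshold.py: it assumes precision means the fraction of individuals at or above the threshold who are cases, that every observed PRS value is a candidate threshold, and that ties go to the lowest threshold. The function name, the `min_count` parameter name, and the toy data are hypothetical.

```python
def find_optimal_threshold(scores, is_case, min_count=10):
    """Return (threshold, precision) with the highest precision, or None.

    An individual meets a threshold when their score is >= that threshold.
    Thresholds met by fewer than min_count individuals are skipped.
    """
    best = None
    for threshold in sorted(set(scores)):
        selected = [case for score, case in zip(scores, is_case) if score >= threshold]
        if len(selected) < min_count:
            continue
        precision = sum(selected) / len(selected)
        # Strict comparison: on a tie, the lowest threshold is kept (assumption).
        if best is None or precision > best[1]:
            best = (threshold, precision)
    return best


# Toy data: 20 individuals with PRS values 1..20; 1 marks a case, 0 a control.
scores = list(range(1, 21))
is_case = [0] * 8 + [1, 0] + [1] * 10

best = find_optimal_threshold(scores, is_case)
print(best)
```

In this toy data, the ten individuals with a PRS of 11 or higher are all cases, so a threshold of 11 gives a precision of 1.0. Higher thresholds are skipped because fewer than 10 individuals meet them.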