GitHub Repo for BGE analysis
The BGE analysis paper is now on bioRxiv:
The exome-qc directory contains Python and R scripts for generating the QC metrics of BGE data and the corresponding plots shown in the manuscript.
The imputation directory contains a sub-folder called glimpse_comparison, which holds the initial scripts used to implement GLIMPSE on Hail Batch and to test GLIMPSE parameters on a subset of BGE data. The final scripts used in the BGE imputation pipeline include:
Run batches of 200 individuals at a time (to minimize costs) through the GLIMPSE2 imputation software on the Broad's Hail Batch service. Each batch of 200 individuals will be written to its own .bcf file.
This script is a Hail Batch implementation of code described here:
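As a rough illustration of the batching idea described above (not the repository's actual script, and separate from the code linked above), the sketch below splits a list of sample IDs into groups of 200 and pairs each group with its own .bcf output name. The function name `make_batches` and the `batch_NNNN.bcf` naming scheme are hypothetical, and the GLIMPSE2 and Hail Batch calls themselves are intentionally left out.

```python
from typing import List, Tuple

BATCH_SIZE = 200  # batch size stated in this README


def make_batches(
    sample_ids: List[str], batch_size: int = BATCH_SIZE
) -> List[Tuple[str, List[str]]]:
    """Split sample IDs into consecutive batches, one output .bcf name per batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = []
    for start in range(0, len(sample_ids), batch_size):
        chunk = sample_ids[start:start + batch_size]
        batch_index = start // batch_size
        # Hypothetical naming scheme; the real pipeline may name files differently.
        out_name = f"batch_{batch_index:04d}.bcf"
        batches.append((out_name, chunk))
    return batches


if __name__ == "__main__":
    samples = [f"sample_{i}" for i in range(450)]
    for out_name, chunk in make_batches(samples):
        print(out_name, len(chunk))
```

With 450 samples this yields two full batches of 200 and a final batch of 50; in a real run, each batch would be submitted as its own imputation job.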
The concordance directory contains scripts used in computing accuracy metrics of BGE imputed SNPs with Global Screening Array data used as ground truth, as shown in the manuscript.
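This README does not say which accuracy metrics the concordance scripts compute, so the following is only a minimal sketch of two metrics commonly used when comparing imputed genotypes to array truth data: non-reference concordance and the squared correlation between imputed dosages and truth genotypes. The function names, the 0/1/2 alt-allele-count encoding, and the choice to skip pairs where both calls are homozygous reference are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def non_reference_concordance(
    imputed: Sequence[Optional[int]], truth: Sequence[Optional[int]]
) -> Optional[float]:
    """Share of matching genotype pairs, ignoring missing calls and pairs
    where both calls are homozygous reference (assumed definition)."""
    matches = 0
    total = 0
    for imp, tru in zip(imputed, truth):
        if imp is None or tru is None:
            continue  # skip missing calls
        if imp == 0 and tru == 0:
            continue  # skip hom-ref / hom-ref pairs
        total += 1
        if imp == tru:
            matches += 1
    return matches / total if total else None


def dosage_r2(
    dosages: Sequence[Optional[float]], truth: Sequence[Optional[int]]
) -> Optional[float]:
    """Squared Pearson correlation between imputed dosages and truth genotypes."""
    pairs = [(d, t) for d, t in zip(dosages, truth) if d is not None and t is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_d = sum(d for d, _ in pairs) / n
    mean_t = sum(t for _, t in pairs) / n
    cov = sum((d - mean_d) * (t - mean_t) for d, t in pairs)
    var_d = sum((d - mean_d) ** 2 for d, _ in pairs)
    var_t = sum((t - mean_t) ** 2 for _, t in pairs)
    if var_d == 0 or var_t == 0:
        return None  # correlation is undefined for a constant vector
    return (cov * cov) / (var_d * var_t)


if __name__ == "__main__":
    imputed_gt = [0, 1, 2, 1, None, 0]
    array_gt = [0, 1, 1, 1, 2, 1]
    print(non_reference_concordance(imputed_gt, array_gt))
    print(dosage_r2([0.0, 1.0, 2.0], [0, 1, 2]))
```

Both functions return `None` when the metric is undefined (no informative pairs, or no variation in one of the inputs), so such cases are not silently reported as zero.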