Home
SLRP is software for long range phasing and IBD detection in inbred and/or isolated populations genotyped on dense SNP arrays. The methodology was described in Palin et al. 2011. The master branch of this repository contains a further developed version of the software with a better user interface (VCF files) and various tricks for phasing thousands of samples (the fastPreProc, slice_length, procs, float and IBDcoverLimit options).
- Software requirements
- Command line options
- Examples
If you need help running or installing the software, don't hesitate to contact me, Kimmo Palin, at firstname.lastname at sanger.ac.uk
If you use the software in scientific publications, please cite:
Palin, K., Campbell, H., Wright, A. F., Wilson, J. F. and Durbin, R. (2011), Identity-by-descent-based phasing and imputation in founder populations using graphical models. Genetic Epidemiology. doi: 10.1002/gepi.20635
The installation seems to be as trivial as:
pip install --user --egg git+https://github.com/kpalin/SLRP.git
The output VCF files from SLRP carry some per-genotype annotations that are not described elsewhere but might be useful for many users. Here is some extra description of the various FORMAT fields.
- GP Genotype posterior probabilities. These are calculated from the diplotype probabilities in the SLRP model, and the heterozygous probability is a sum of the two diplotypes representing two alternative phases.
- HQ Haplotype Quality. Phred scaled, one value per haplotype: for i=1,2, -10 log10(probability that the i:th allele is wrong).
- GQ Genotype Quality. Phred scaled probability of wrong genotype call, calculated again from the diplotypes, summing over phase.
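As a rough illustration of how the three fields relate to each other, here is a minimal Python sketch for a single biallelic site. It is not code from SLRP: the function names `phred` and `format_fields`, the cap at 99, and the choice to call the most probable diplotype are all assumptions made for the example. The input is a dictionary from ordered diplotypes (allele on haplotype 1, allele on haplotype 2) to their posterior probabilities.

```python
import math


def phred(p_error, cap=99):
    """Phred-scale an error probability; the cap of 99 is an assumed convention."""
    if p_error <= 0.0:
        return cap
    return min(cap, int(round(-10.0 * math.log10(p_error))))


def format_fields(diplotype_probs):
    """Derive GP, GQ and HQ from ordered diplotype posteriors (hypothetical helper).

    diplotype_probs maps (a1, a2), with alleles coded 0/1, to a probability.
    """
    total = sum(diplotype_probs.values())
    probs = {d: p / total for d, p in diplotype_probs.items()}

    # GP: the heterozygous entry sums the two alternative phases.
    gp = [
        probs.get((0, 0), 0.0),
        probs.get((0, 1), 0.0) + probs.get((1, 0), 0.0),
        probs.get((1, 1), 0.0),
    ]

    # Assumption: the reported call is the most probable diplotype.
    call = max(probs, key=probs.get)

    # GQ: probability that the unphased genotype of the call is wrong.
    gq = phred(1.0 - gp[sum(call)])

    # HQ: for each haplotype i, the mass of diplotypes whose i:th allele
    # differs from the called one.
    hq = [
        phred(sum(p for d, p in probs.items() if d[i] != call[i]))
        for i in (0, 1)
    ]
    return {"call": call, "GP": gp, "GQ": gq, "HQ": hq}


if __name__ == "__main__":
    example = {(0, 0): 0.01, (0, 1): 0.90, (1, 0): 0.08, (1, 1): 0.01}
    print(format_fields(example))
```

Note that in this sketch a confidently heterozygous genotype can still have modest haplotype qualities, because GQ sums over both phases while HQ penalises probability mass on the opposite phase.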
Hoai Tuong Nguyen and Anne-Louise Leutenegger from INSERM have been very helpful in testing the package.
Here are a few ideas on how to improve SLRP:
- Calculate marginals (forward-backward, sum-product) instead of max-marginals (Viterbi, max-product). It might even be faster. Scale the values on each node to sum to one and avoid branching like the plague.
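To make the idea concrete, here is a toy sum-product (forward-backward) pass on a simple chain, with the messages at each node rescaled to sum to one. This is only a sketch of the general technique, not SLRP's model or code: the chain structure, the function name `forward_backward` and the argument layout (`init[s]`, `trans[s][t]`, `emit[k][s]`) are assumptions for the example. The rescaling keeps the numbers in range without any log-space tricks or conditional underflow checks.

```python
def forward_backward(init, trans, emit):
    """Posterior state marginals on a chain via scaled sum-product.

    init[s]     : prior probability of state s at the first node
    trans[s][t] : probability of moving from state s to state t
    emit[k][s]  : likelihood of the observation at node k given state s
    """
    n, n_states = len(emit), len(init)
    states = range(n_states)

    def normalise(values):
        z = sum(values)
        return [v / z for v in values]

    # Forward messages, each scaled to sum to one.
    fwd = [normalise([init[s] * emit[0][s] for s in states])]
    for k in range(1, n):
        prev = fwd[-1]
        fwd.append(normalise([
            emit[k][t] * sum(prev[s] * trans[s][t] for s in states)
            for t in states
        ]))

    # Backward messages, scaled the same way.
    bwd = [[1.0 / n_states] * n_states for _ in range(n)]
    for k in range(n - 2, -1, -1):
        bwd[k] = normalise([
            sum(trans[s][t] * emit[k + 1][t] * bwd[k + 1][t] for t in states)
            for s in states
        ])

    # The marginal at each node is the normalised product of both messages.
    return [
        normalise([f * b for f, b in zip(fwd[k], bwd[k])])
        for k in range(n)
    ]


if __name__ == "__main__":
    marginals = forward_backward(
        init=[0.5, 0.5],
        trans=[[0.9, 0.1], [0.1, 0.9]],
        emit=[[1.0, 0.0], [0.5, 0.5]],
    )
    print(marginals)
```

Replacing each `sum` over predecessor states with `max` would give the max-product (Viterbi style) recursion, so the two variants differ only in that one operation.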