EcoPopDL-DP is a framework for environment-aware, population-informed genomic prediction using deep learning and ChromoMap. It addresses the challenge of predicting complex traits influenced by genotype-by-environment and genotype-by-location interactions in resource-limited breeding populations. The framework integrates genomic, population-structure, and environmental data to improve predictive accuracy for traits such as yield, flowering time, and seed weight.
- ChromoMap: A visual-spatial representation of SNP-level genomic variation, chromosome structure, and positional relationships.
- Deep Learning Integration: Utilizes convolutional neural networks (CNNs) for feature extraction and trait prediction.
- Linear Mixed Model (LMM): Captures both fixed and random effects to refine predictions and improve interpretability.
- Hybrid Framework: Combines CNN-derived features with LMM for improved genomic prediction performance.
- Data Augmentation and Transfer Learning: Improves model generalizability and robustness.
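To make the ChromoMap idea concrete, the sketch below turns one sample's SNP calls into a small grid of RGB values, one row per chromosome. It is an illustration only and not the project's implementation: the function name `encode_chromomap`, the 0/1/2 genotype coding, the color palette, and the one-row-per-chromosome layout are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical color scheme (not taken from the project): RGB per genotype call.
GENOTYPE_COLORS = {
    0: (33, 102, 172),   # homozygous reference
    1: (247, 247, 247),  # heterozygous
    2: (178, 24, 43),    # homozygous alternate
}
MISSING = (0, 0, 0)      # missing call or padding


def encode_chromomap(snps, genotypes):
    """Build a grid of RGB tuples for one sample.

    snps:      list of (snp_id, chromosome, position) tuples
    genotypes: dict mapping snp_id -> 0, 1, 2, or None for a missing call
    Each chromosome becomes one row. SNPs are ordered by position, and
    shorter rows are right-padded with the MISSING color.
    """
    by_chrom = defaultdict(list)
    for snp_id, chrom, pos in snps:
        by_chrom[chrom].append((pos, snp_id))

    width = max((len(v) for v in by_chrom.values()), default=0)
    image = []
    for chrom in sorted(by_chrom):
        row = [GENOTYPE_COLORS.get(genotypes.get(snp_id), MISSING)
               for _, snp_id in sorted(by_chrom[chrom])]
        row.extend([MISSING] * (width - len(row)))
        image.append(row)
    return image


# Toy input: two SNPs on chromosome 1, one (missing) SNP on chromosome 2.
snps = [("s1", 1, 100), ("s2", 1, 50), ("s3", 2, 10)]
sample = {"s1": 2, "s2": 0, "s3": None}
image = encode_chromomap(snps, sample)
```

A grid like this could be written to an image file and passed to a CNN. The actual resolution, palette, and layout used by EcoPopDL-DP are defined in the repository scripts.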
- Preliminary Data Processing:
  - Input Data: Genotypic data, phenotypic data, and environmental variables.
  - Genotypic Data Processing: Minor allele frequency (MAF) filtering and linkage disequilibrium (LD) pruning.
  - Phenotypic Data Processing: Normalization and outlier removal.
- Genetic Ancestry Analysis:
  - Incorporates unsupervised and supervised admixture analysis to derive genetic ancestry profiles.
  - Integrates population clusters into predictive models.
- Prediction Model Design:
  - ChromoMap Generation: Encodes SNPs into a color-coded image representing the genome.
  - CNN Architecture: Leverages EfficientNet-B0 for trait prediction.
  - Feature Engineering: Extracts and integrates genomic and metadata features.
  - Linear Mixed Model: Adds environmental covariates and population structure.
- Benchmarking:
  - Compares against baseline models such as GBLUP, RRBLUP, Bayesian Ridge Regression, Lasso Regression, and SVM.
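The preliminary processing steps in the workflow above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: genotypes are assumed to be coded as 0/1/2 alternate-allele counts with `None` for missing calls, the 0.05 MAF threshold and the 3-standard-deviation outlier cutoff are common conventions chosen for the example, and all function names are hypothetical. LD pruning is not covered here.

```python
from statistics import mean, pstdev


def minor_allele_frequency(calls):
    """MAF for one SNP from 0/1/2 alternate-allele counts; None = missing."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(genotypes, threshold=0.05):
    """Keep SNPs whose MAF is at least `threshold`.

    genotypes: dict mapping snp_id -> list of calls, one per sample.
    The 0.05 default is an assumed convention, not the project's setting.
    """
    return {snp: calls for snp, calls in genotypes.items()
            if minor_allele_frequency(calls) >= threshold}


def normalize_phenotype(values, max_abs_z=3.0):
    """Drop values more than `max_abs_z` SDs from the mean, then z-score."""
    mu, sd = mean(values), pstdev(values)
    kept = [v for v in values if sd == 0 or abs(v - mu) / sd <= max_abs_z]
    # Recompute the statistics on the retained values before scaling.
    mu, sd = mean(kept), pstdev(kept)
    return [(v - mu) / sd if sd else 0.0 for v in kept]


# Toy data: one monomorphic SNP, one common SNP, one SNP with missing calls.
genotypes = {
    "rare": [0] * 10,
    "common": [0, 1, 2, 1, 0, 2, 1, 0, 1, 2],
    "missing": [None, 1, 1, None],
}
kept_snps = filter_by_maf(genotypes)
scaled = normalize_phenotype([1.0, 2.0, 3.0])
```

In this toy input the monomorphic SNP is removed and the other two are kept. Real pipelines typically run these steps with dedicated tools on standard genotype formats; the formats expected by EcoPopDL-DP are described in its scripts.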
To install and run the EcoPopDL-DP framework:
- Clone the repository:
git clone https://git.cs.usask.ca/qnm481/ecopopgp.git
cd ecopopgp
- Prepare input data:
  - Ensure genotypic, phenotypic, and environmental data are in the appropriate formats as outlined in the scripts.
For questions or inquiries, please open an issue on our repository or contact us at [email protected].
- Thulani Hewavithana
- Sophie Duchesne
- Bunyamin Tar’an
- Ian Stavness
- Steve Shirtliffe
- Kirstin Bett
- Isobel A. P. Parkin
- Lingling Jin
This project is licensed under the MIT License.
We welcome contributions! To contribute:
- Fork the repository.
- Create a new branch:
git checkout -b feature-branch
- Commit changes:
git commit -m 'Add new feature'
- Push to the branch:
git push origin feature-branch
- Create a pull request.