Response Surface Analysis of Genomic Prediction Accuracy Values Using Quality Control Covariates in Soybean. (March 2019)
- Record Type:
- Journal Article
- Title:
- Response Surface Analysis of Genomic Prediction Accuracy Values Using Quality Control Covariates in Soybean. (March 2019)
- Main Title:
- Response Surface Analysis of Genomic Prediction Accuracy Values Using Quality Control Covariates in Soybean
- Authors:
- Jarquín, Diego
Howard, Reka
Graef, George
Lorenz, Aaron - Abstract:
- An important and broadly used tool for selection purposes and to increase yield and genetic gain in plant breeding programs is genomic prediction (GP). Genomic prediction is a technique where molecular marker information and phenotypic data are used to predict the phenotype (eg, yield) of individuals for which only marker data are available. Higher prediction accuracy can be achieved not only by using efficient models but also by using quality molecular marker and phenotypic data. The steps of a typical quality control (QC) of marker data include the elimination of markers with certain level of minor allele frequency (MAF) and missing marker values and the imputation of missing marker values. In this article, we evaluated how the prediction accuracy is influenced by the combination of 12 MAF values, 27 different percentages of missing marker values, and 2 imputation techniques (IT; naïve and Random Forest (RF)). We constructed a response surface of prediction accuracy values for the two ITs as a function of MAF and percentage of missing marker values using soybean data from the University of Nebraska–Lincoln Soybean Breeding Program. We found that both the genetic architecture of the trait and the IT affect the prediction accuracy implying that we have to be careful how we perform QC on the marker data. For the corresponding combinations MAF-percentage of missing values we observed that implementing the RF imputation increased the number of markers by 2 to 5 times than theAn important and broadly used tool for selection purposes and to increase yield and genetic gain in plant breeding programs is genomic prediction (GP). Genomic prediction is a technique where molecular marker information and phenotypic data are used to predict the phenotype (eg, yield) of individuals for which only marker data are available. Higher prediction accuracy can be achieved not only by using efficient models but also by using quality molecular marker and phenotypic data. The steps of a typical quality control (QC) of marker data include the elimination of markers with certain level of minor allele frequency (MAF) and missing marker values and the imputation of missing marker values. In this article, we evaluated how the prediction accuracy is influenced by the combination of 12 MAF values, 27 different percentages of missing marker values, and 2 imputation techniques (IT; naïve and Random Forest (RF)). We constructed a response surface of prediction accuracy values for the two ITs as a function of MAF and percentage of missing marker values using soybean data from the University of Nebraska–Lincoln Soybean Breeding Program. We found that both the genetic architecture of the trait and the IT affect the prediction accuracy implying that we have to be careful how we perform QC on the marker data. For the corresponding combinations MAF-percentage of missing values we observed that implementing the RF imputation increased the number of markers by 2 to 5 times than the simple naïve imputation method that is based on the mean allele dosage of the non-missing values at each loci. We conclude that there is not a unique strategy (combination of the QCs and imputation method) that outperforms the results of the others for all traits. … (more)
- Is Part Of:
- Evolutionary bioinformatics online. Volume 15(2019)
- Journal:
- Evolutionary bioinformatics online
- Issue:
- Volume 15(2019)
- Issue Display:
- Volume 15, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 15
- Issue:
- 2019
- Issue Sort Value:
- 2019-0015-2019-0000
- Page Start:
- Page End:
- Publication Date:
- 2019-03
- Subjects:
- genomic prediction -- quality control -- response surface -- soybean -- imputation -- Random Forest -- minor allele frequency -- missing marker score
Bioinformatics -- Periodicals
Evolutionary computation -- Periodicals
Genetic programming (Computer science) -- Periodicals
Computational Biology
Evolution, Molecular
Bioinformatics
Electronic journals
Periodicals
Fulltext
Internet Resources
Periodicals
Periodicals
576.8 - Journal URLs:
- http://insights.sagepub.com/journal-evolutionary-bioinformatics-j17 ↗
http://www.uk.sagepub.com/home.nav ↗
http://www.la-press.com/evolutionary-bioinformatics-journal-j17 ↗
http://bibpurl.oclc.org/web/38943 ↗ - DOI:
- 10.1177/1176934319831307 ↗
- Languages:
- English
- ISSNs:
- 1176-9343
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 12120.xml