The impact of clustering methods for cross-validation, choice of phenotypes, and genotyping strategies on the accuracy of genomic predictions. (5th February 2019)
- Record Type:
- Journal Article
- Title:
- The impact of clustering methods for cross-validation, choice of phenotypes, and genotyping strategies on the accuracy of genomic predictions. (5th February 2019)
- Main Title:
- The impact of clustering methods for cross-validation, choice of phenotypes, and genotyping strategies on the accuracy of genomic predictions
- Authors:
- Baller, Johnna L
Howard, Jeremy T
Kachman, Stephen D
Spangler, Matthew L
- Abstract:
- For genomic predictors to be of use in genetic evaluation, their predicted accuracy must be a reliable indicator of their utility, and thus unbiased. The objective of this paper was to evaluate the accuracy of prediction of genomic breeding values (GBV) using different clustering strategies and response variables. Red Angus genotypes (n = 9,763) were imputed to a reference 50K panel. The influence of clustering method [k-means, k-medoids, principal component (PC) analysis on the numerator relationship matrix (A) and the identical-by-state genomic relationship matrix (G) as both data and covariance matrices, and random] and response variables [deregressed estimated breeding values (DEBV) and adjusted phenotypes] were evaluated for cross-validation. The GBV were estimated using a Bayes C model for all traits. Traits for DEBV included birth weight (BWT), marbling (MARB), rib-eye area (REA), and yearling weight (YWT). Adjusted phenotypes included BWT, YWT, and ultrasonically measured intramuscular fat percentage and REA. Prediction accuracies were estimated using the genetic correlation between GBV and associated response variable using a bivariate animal model. A simulation mimicking a cattle population, replicated 5 times, was conducted to quantify differences between true and estimated accuracies. The simulation used the same clustering methods and response variables, with the addition of 2 genotyping strategies (random and top 25% of individuals), and forward validation. The prediction accuracies were estimated similarly, and true accuracies were estimated as the correlation between the residuals of a bivariate model including true breeding value (TBV) and GBV. Using the adjusted Rand index, random clusters were clearly different from relationship-based clustering methods. In both real and simulated data, random clustering consistently led to the largest estimates of accuracy, while no method was consistently associated with more or less bias than other methods. In simulation, random genotyping led to higher estimated accuracies than selection of the top 25% of individuals. Interestingly, random genotyping seemed to overpredict true accuracy while selective genotyping tended to underpredict accuracy. When forward in time validation was used, DEBV led to less biased estimates of GBV accuracy. Results suggest the highest, least biased GBV accuracies are associated with random genotyping and DEBV.
- Is Part Of:
- Journal of animal science. Volume 97:Number 4(2019)
- Journal:
- Journal of animal science
- Issue:
- Volume 97:Number 4(2019)
- Issue Display:
- Volume 97, Issue 4 (2019)
- Year:
- 2019
- Volume:
- 97
- Issue:
- 4
- Issue Sort Value:
- 2019-0097-0004-0000
- Page Start:
- 1534
- Page End:
- 1549
- Publication Date:
- 2019-02-05
- Subjects:
- beef cattle -- bias -- genomic prediction -- simulation
Livestock -- Periodicals
Livestock
Electronic journals
Periodicals
636.005
- Journal URLs:
- https://dl.sciencesocieties.org/publications/jas/index
http://www.asas.org/jas/
https://academic.oup.com/jas
http://www.oxfordjournals.org/
- DOI:
- 10.1093/jas/skz055
- Languages:
- English
- ISSNs:
- 0021-8812
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11999.xml