"Noisy beets": impact of phenotyping errors on genomic predictions for binary traits in Beta vulgaris. Issue 1 (December 2016)
- Record Type:
- Journal Article
- Title:
- "Noisy beets": impact of phenotyping errors on genomic predictions for binary traits in Beta vulgaris. Issue 1 (December 2016)
- Main Title:
- "Noisy beets": impact of phenotyping errors on genomic predictions for binary traits in Beta vulgaris
- Authors:
- Biscarini, Filippo
Nazzicari, Nelson
Broccanello, Chiara
Stevanato, Piergiorgio
Marini, Simone
- Abstract:
- Background: Noise (errors) in scientific data is endemic and may have a detrimental effect on statistical analyses and experimental results. The effects of noisy data have been assessed in genome-wide association studies for case-control experiments in human medicine. Little is known, however, about the impact of noisy data on genomic predictions, a widely used statistical application in plant and animal breeding.
Results: In this study, the sensitivity to noise in the data of five classification methods was investigated: K-nearest neighbours (KNN), random forest (RF), ridge logistic regression (LR), and support vector machines (SVM) with linear or radial basis function kernels. A sugar beet population of 123 plants phenotyped for a binary trait and genotyped for 192 SNP (single nucleotide polymorphism) markers was used. Labels (0/1 phenotype) were randomly sampled to generate noise. Starting from the base scenario without errors in the labels, increasing proportions of noisy labels, up to 50 %, were generated and introduced in the data.
Conclusions: Local classification methods (KNN and RF) showed higher tolerance to noisy labels than methods that leverage global data properties (LR and the two SVM models). In particular, KNN outperformed all other classifiers, with AUC (area under the ROC curve) higher than 0.95 up to 20 % noisy labels. The runner-up method, RF, had an AUC of 0.941 with 20 % noise.
- Is Part Of:
- Plant methods. Volume 12:Issue 1(2016)
- Journal:
- Plant methods
- Issue:
- Volume 12:Issue 1(2016)
- Issue Display:
- Volume 12, Issue 1 (2016)
- Year:
- 2016
- Volume:
- 12
- Issue:
- 1
- Issue Sort Value:
- 2016-0012-0001-0000
- Page Start:
- 1
- Page End:
- 8
- Publication Date:
- 2016-12
- Subjects:
- Noisy data -- Classification -- K-nearest neighbours (KNN) -- Random forest (RF) -- Support vector machines (SVM) -- Ridge logistic regression -- Sugar beet -- Binomial phenotype -- Robustness to errors -- Genomic predictions
Botany -- Methodology -- Periodicals
572.2
- Journal URLs:
- http://pubmedcentral.com/tocrender.fcgi?journal=354&action=archive
http://www.plantmethods.com/
http://link.springer.com/
- DOI:
- 10.1186/s13007-016-0136-4
- Languages:
- English
- ISSNs:
- 1746-4811
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 10038.xml
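
The abstract above describes introducing increasing proportions of noisy labels into a binary (0/1) phenotype. The following is a minimal sketch of one way such noise could be generated, not the authors' procedure: it assumes that "noise" means flipping the label of a randomly chosen subset of plants, which may differ from the sampling scheme actually used in the paper. The function name `add_label_noise`, its parameters, and the toy labels are hypothetical.

```python
import random


def add_label_noise(labels, noise_fraction, seed=0):
    """Return a copy of binary `labels` with a fraction of entries flipped.

    Assumption: noise is modelled as flipping 0 <-> 1 for a randomly chosen
    subset of the labels; the paper's exact scheme may differ.
    """
    if not 0.0 <= noise_fraction <= 1.0:
        raise ValueError("noise_fraction must be between 0 and 1")
    rng = random.Random(seed)  # seeded so that runs are reproducible
    n_noisy = round(noise_fraction * len(labels))
    noisy_idx = rng.sample(range(len(labels)), n_noisy)
    noisy = list(labels)  # copy, so the original labels stay untouched
    for i in noisy_idx:
        noisy[i] = 1 - noisy[i]
    return noisy


# Toy labels: 123 entries to match the population size in the abstract.
# The values themselves are made up for illustration.
labels = [i % 2 for i in range(123)]

for fraction in (0.0, 0.1, 0.2, 0.5):
    noisy = add_label_noise(labels, fraction)
    changed = sum(a != b for a, b in zip(labels, noisy))
    print(f"{fraction:.0%} noise: {changed} of {len(labels)} labels changed")
```

Each noisy label vector could then be passed to a classifier of choice and scored with AUC against held-out data, mirroring the comparison summarised in the abstract.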