Does encoding matter? A novel view on the quantitative genetic trait prediction problem. Issue 9 (July 2016)
- Record Type:
- Journal Article
- Title:
- Does encoding matter? A novel view on the quantitative genetic trait prediction problem. Issue 9 (July 2016)
- Main Title:
- Does encoding matter? A novel view on the quantitative genetic trait prediction problem
- Authors:
- He, Dan
Parida, Laxmi
- Abstract:
- Background: Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict the quantitative trait values by simultaneously modeling all marker effects. Genetic trait prediction is usually represented as linear regression models, which require quantitative encodings for the genotypes. There is a large body of work on prediction algorithms, but none of it has investigated the effects of the encodings on the genetic trait prediction problem. Methods: In this work, we view the genetic trait prediction problem from a novel angle: as a multiple regression problem on categorical data, which requires encoding the categorical data as numerical data. We further propose two novel encoding methods and show that they are able to generate numerical features with higher predictive power. Results and Discussion: Our experiments show that our methods are superior to the other encoding methods for both the single-marker model and the epistasis model. We show that the quantitative genetic trait prediction problem depends heavily on the encoding of genotypes for both models. Conclusions: We conducted a detailed analysis of the performance of the hybrid encodings. To our knowledge, this is the first work that discusses the effects of encodings on the genetic trait prediction problem.
- Is Part Of:
- BMC bioinformatics. Volume 17:Issue 9(2016)
- Journal:
- BMC bioinformatics
- Issue:
- Volume 17:Issue 9(2016)
- Issue Display:
- Volume 17, Issue 9 (2016)
- Year:
- 2016
- Volume:
- 17
- Issue:
- 9
- Issue Sort Value:
- 2016-0017-0009-0000
- Page Start:
- 1
- Page End:
- 9
- Publication Date:
- 2016-07
- Subjects:
- Quantitative genetic trait prediction -- Encoding -- Epistasis -- Ridge regression
Bioinformatics -- Periodicals
Computational biology -- Periodicals
570.285
- Journal URLs:
- http://www.biomedcentral.com/bmcbioinformatics/
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=13
http://link.springer.com/
- DOI:
- 10.1186/s12859-016-1127-1
- Languages:
- English
- ISSNs:
- 1471-2105
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - Digital store
British Library HMNTS - ELD Digital store
- Ingest File:
- 10045.xml