DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models. (November 2018)
- Record Type:
- Journal Article
- Title:
- DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models. (November 2018)
- Main Title:
- DNA methylation-based age prediction using massively parallel sequencing data and multiple machine learning models
- Authors:
- Aliferi, Anastasia
Ballard, David
Gallidabino, Matteo D.
Thurtle, Helen
Barron, Leon
Syndercombe Court, Denise
- Abstract:
- Highlights: 110 whole blood samples were analysed for 12 age-correlated CpGs using the Illumina MiSeq. 17 different statistical models were developed using R and Trajan Neural Network Simulation. A support vector machine model was selected with a MAE of 4.1 years in the blind test set. The accuracy of both DNA methylation quantification and age prediction was retained down to 2 ng of DNA input (PCR stage). 50% of saliva samples (n = 34) were accurately predicted with less than 4 years error and 70% with less than 7 years.
Abstract: The field of DNA intelligence focuses on retrieving information from DNA evidence that can help narrow down large groups of suspects or define target groups of interest. With recent breakthroughs on the estimation of geographical ancestry and physical appearance, the estimation of chronological age comes to complete this circle of information. Recent studies have identified methylation sites in the human genome that correlate strongly with age and can be used for the development of age-estimation algorithms. In this study, 110 whole blood samples from individuals aged 11–93 years were analysed using a DNA methylation quantification assay based on bisulphite conversion and massively parallel sequencing (Illumina MiSeq) of 12 CpG sites. Using this data, 17 different statistical modelling approaches were compared based on root mean square error (RMSE) and a Support Vector Machine with polynomial function (SVMp) model was selected for further testing. For the selected model (RMSE = 4.9 years) the mean average error (MAE) of the blind test (n = 33) was calculated at 4.1 years, with 52% of the samples predicting with less than 4 years of error and 86% with less than 7 years. Furthermore, the sensitivity of the method was assessed both in terms of methylation quantification accuracy and prediction accuracy in the first validation of this kind. The described method retained its accuracy down to 10 ng of initial DNA input or ∼2 ng bisulphite PCR input. Finally, 34 saliva samples were analysed and following basic normalisation, the chronological age of the donors was predicted with less than 4 years of error for 50% of the samples and with less than 7 years of error for 70%.
- Is Part Of:
- Forensic science international. Genetics. Volume 37(2018)
- Journal:
- Forensic science international. Genetics
- Issue:
- Volume 37(2018)
- Issue Display:
- Volume 37, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 37
- Issue:
- 2018
- Issue Sort Value:
- 2018-0037-2018-0000
- Page Start:
- 215
- Page End:
- 226
- Publication Date:
- 2018-11
- Subjects:
- DNA methylation -- Age prediction -- Artificial neural networks -- Machine learning -- Whole blood -- Saliva -- Sperm
Forensic genetics -- Periodicals
Génétique légale -- Périodiques
Forensic genetics
Electronic journals
Periodicals
614.1
- Journal URLs:
- http://www.clinicalkey.com.au/dura/browse/journalIssue/18724973
http://www.clinicalkey.com/dura/browse/journalIssue/18724973
http://www.sciencedirect.com/science/journal/18724973
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.fsigen.2018.09.003
- Languages:
- English
- ISSNs:
- 1872-4973
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3987.764050
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 23131.xml
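
The abstract in this record reports three kinds of accuracy figure for the age predictions: RMSE, MAE, and the share of samples predicted with less than 4 or less than 7 years of error. As a minimal sketch of how such figures relate to each other, the Python below computes all three from paired true and predicted ages. It is not the authors' code: the function name `error_summary` and the four sample ages are hypothetical and chosen only for illustration, MAE is read here as mean absolute error, and "less than" is taken as a strict inequality.

```python
import math


def error_summary(true_ages, predicted_ages, thresholds=(4, 7)):
    """Summarise prediction error for paired true and predicted ages (years).

    Hypothetical helper for illustration; not taken from the catalogued article.
    """
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("need two non-empty sequences of equal length")

    # Absolute error per sample, in years.
    abs_errors = [abs(p - t) for t, p in zip(true_ages, predicted_ages)]
    n = len(abs_errors)

    # MAE: mean of the absolute errors (assumed meaning of "MAE").
    mae = sum(abs_errors) / n
    # RMSE: square root of the mean squared error.
    rmse = math.sqrt(sum(e * e for e in abs_errors) / n)
    # Fraction of samples whose error is strictly below each threshold.
    within = {k: sum(e < k for e in abs_errors) / n for k in thresholds}

    return {"mae": mae, "rmse": rmse, "within": within}


# Made-up ages, not data from the study.
true_ages = [20, 35, 50, 65]
predicted_ages = [23, 30, 50, 73]
summary = error_summary(true_ages, predicted_ages)
print(summary)
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same samples and grows faster when a few predictions are far off.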