A deep hybrid model to detect multi-locus interacting SNPs in the presence of noise. (November 2018)
- Record Type:
- Journal Article
- Title:
- A deep hybrid model to detect multi-locus interacting SNPs in the presence of noise. (November 2018)
- Main Title:
- A deep hybrid model to detect multi-locus interacting SNPs in the presence of noise
- Authors:
- Uppu, Suneetha
Krishna, Aneesh
- Abstract:
- Highlights: A deep hybrid method is proposed to detect multi-locus SNP interactions in the high-dimensional genome. The method combines deep neural networks and random forest. The proposed method is evaluated on various simulated scenarios and real data applications in the absence and presence of noise. The experimental findings show improved performance of the proposed method over traditional machine learning approaches.
Abstract: Identifying genetic variants associated with complex diseases is a central focus of genome-wide association studies. These studies rely extensively on univariate analysis and ignore interaction effects. It is widely accepted that the etiology of most complex diseases depends on interactions between genetic variants and/or environmental factors. Several machine learning and data mining methods have been consistently successful in exposing these interaction effects. However, despite many efforts, there has been no major breakthrough, owing to the various biological complexities and the statistical and computational challenges facing the field of genetic epidemiology. Deep learning is an emerging machine learning approach that promises to reveal the hidden patterns of big data for accurate predictions. In this study, a deep neural network is unified with a random forest to form a hybrid architecture for reliable detection of multi-locus interactions between single nucleotide polymorphisms. The proposed hybrid method is evaluated on various simulated scenarios in the absence of main effects for six epistasis models. The best model with optimal hyper-parameters (found by grid and random grid search) is chosen to enhance the power of the method by maximising the model's prediction accuracy. The performance metrics of each model are analysed for both training and validation. Further, the performance of the method is evaluated in the presence of noise due to missing data, genotyping errors, genetic heterogeneity, and phenocopy, and their combined effects. The power of the method in detecting two-locus interactions is compared with that of previous methods in the presence and absence of noise. On average, the power of the proposed method is much higher than that of the previous methods for all simulated scenarios. Finally, the findings are confirmed on data from chronic dialysis patients, obtained from the published study performed at the Kaohsiung Chang Gung Memorial Hospital. It is observed that the interaction between SNP 21 (2) and SNP 28 (2) in the mitochondrial D-loop carries the highest risk of disease manifestation.
- Is Part Of:
- International journal of medical informatics. Volume 119(2018)
- Journal:
- International journal of medical informatics
- Issue:
- Volume 119(2018)
- Issue Display:
- Volume 119, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 119
- Issue:
- 2018
- Issue Sort Value:
- 2018-0119-2018-0000
- Page Start:
- 134
- Page End:
- 151
- Publication Date:
- 2018-11
- Subjects:
- Deep neural networks -- Random forest -- SNP-SNP interactions -- Genetic heterogeneity -- Phenocopy -- Genotyping error -- Genome-wide association studies
Medical informatics -- Periodicals
Information science -- Periodicals
Computers -- Periodicals
Medical technology -- Periodicals
Medical Informatics -- Periodicals
Technology, Medical -- Periodicals
Computers
Information science
Medical informatics
Medical technology
Electronic journals
Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/13865056
http://www.clinicalkey.com/dura/browse/journalIssue/13865056
http://www.clinicalkey.com.au/dura/browse/journalIssue/13865056
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ijmedinf.2018.09.003
- Languages:
- English
- ISSNs:
- 1386-5056
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4542.345250
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 7966.xml
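
The abstract in this record describes evaluating the method on simulated two-locus data with no main effects, then again after adding noise from missing data, genotyping errors, and phenocopy. The record gives no implementation details, so the following is only a minimal sketch of how such noise might be injected into simulated genotypes. Every name here (`simulate_xor_dataset`, `add_missing`, `add_genotyping_error`, `add_phenocopy`, the `MISSING` sentinel) is hypothetical, and the XOR-style penetrance is an illustrative assumption, not necessarily one of the six epistasis models used in the study. Genetic heterogeneity is left out for brevity.

```python
import random

MISSING = -1  # hypothetical sentinel for a missing genotype call


def simulate_xor_dataset(n_samples, n_snps, rng):
    """Simulate genotypes coded 0/1/2 and a binary case/control label.

    Each genotype is the sum of two alleles drawn with frequency 0.5.
    The label is the XOR of heterozygosity at SNP 0 and SNP 1, an
    illustrative interaction in which neither SNP alone is informative.
    """
    genotypes = [
        [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_snps)]
        for _ in range(n_samples)
    ]
    labels = [(row[0] % 2) ^ (row[1] % 2) for row in genotypes]
    return genotypes, labels


def add_missing(genotypes, rate, rng):
    """Replace each genotype with MISSING with probability `rate`."""
    return [
        [MISSING if rng.random() < rate else g for g in row]
        for row in genotypes
    ]


def add_genotyping_error(genotypes, rate, rng):
    """With probability `rate`, swap a genotype for a different valid call."""
    return [
        [
            rng.choice([x for x in (0, 1, 2) if x != g])
            if rng.random() < rate
            else g
            for g in row
        ]
        for row in genotypes
    ]


def add_phenocopy(labels, rate, rng):
    """Relabel controls as cases with probability `rate`.

    This mimics affected individuals whose disease does not arise from
    the simulated genetic interaction.
    """
    return [1 if (y == 0 and rng.random() < rate) else y for y in labels]


if __name__ == "__main__":
    rng = random.Random(42)
    genotypes, labels = simulate_xor_dataset(200, 10, rng)
    noisy_genotypes = add_genotyping_error(
        add_missing(genotypes, 0.05, rng), 0.05, rng
    )
    noisy_labels = add_phenocopy(labels, 0.05, rng)
    print(len(noisy_genotypes), sum(noisy_labels))
```

The noise functions return new lists instead of modifying their inputs, so the clean and noisy versions of one simulated dataset can be compared side by side. The 5% rates in the usage example are arbitrary placeholders, not values taken from the study.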