A comprehensive comparison of residue-level methylation levels with the regression-based gene-level methylation estimations by ReGear. Issue 4 (13th October 2020)
- Record Type:
- Journal Article
- Title:
- A comprehensive comparison of residue-level methylation levels with the regression-based gene-level methylation estimations by ReGear. Issue 4 (13th October 2020)
- Main Title:
- A comprehensive comparison of residue-level methylation levels with the regression-based gene-level methylation estimations by ReGear
- Authors:
- Cai, Jinpu
Xu, Yuyang
Zhang, Wen
Ding, Shiying
Sun, Yuewei
Lyu, Jingyi
Duan, Meiyu
Liu, Shuai
Huang, Lan
Zhou, Fengfeng
- Abstract:
- Motivation: DNA methylation is a biological process that affects gene function without changing the underlying DNA sequence. The DNA methylation machinery usually attaches methyl groups to specific cytosine residues, which modifies the chromatin architecture. Such modifications in promoter regions may inactivate some tumor-suppressor genes. DNA methylation within the coding region may significantly reduce transcription elongation efficiency. Gene function may therefore be tuned through the methylation of some cytosines. Methods: This study hypothesizes that the overall methylation level across a gene may have a better association with sample labels, such as disease status, than the methylation levels of individual cytosines. The gene methylation level is formulated as a regression model using the methylation levels of all the cytosines within the gene. A comprehensive evaluation of various feature selection algorithms and classification algorithms is carried out between the gene-level and residue-level methylation levels. Results: A comprehensive evaluation was conducted to compare the gene and cytosine methylation levels for their associations with the sample labels and their classification performances. Unsupervised clustering was also improved using the gene methylation levels. Some genes demonstrated statistically significant associations with the class label even when no residue-level methylation features had statistically significant associations with the class label. In summary, the trained gene methylation levels improved various methylome-based machine learning models. Both the methodological development of regression algorithms and the experimental validation of the gene-level methylation biomarkers are worthy of further investigation in future studies. The source code, example data files and manual are available at http://www.healthinformaticslab.org/supp/ .
- Is Part Of:
- Briefings in bioinformatics. Volume 22:Issue 4(2021)
- Journal:
- Briefings in bioinformatics
- Issue:
- Volume 22:Issue 4(2021)
- Issue Display:
- Volume 22, Issue 4 (2021)
- Year:
- 2021
- Volume:
- 22
- Issue:
- 4
- Issue Sort Value:
- 2021-0022-0004-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-10-13
- Subjects:
- regression -- ReGear -- differential methylation -- feature selection -- classification -- hierarchical clustering
Genetics -- Data processing -- Periodicals
Molecular biology -- Data processing -- Periodicals
Genomes -- Data processing -- Periodicals
572.80285
- Journal URLs:
- http://bib.oxfordjournals.org
http://www.oxfordjournals.org/content?genre=journal&issn=1477-4054
http://ukcatalogue.oup.com/
http://firstsearch.oclc.org
- DOI:
- 10.1093/bib/bbaa253
- Languages:
- English
- ISSNs:
- 1467-5463
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2283.958363
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24947.xml
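
The Methods part of the abstract describes a gene-level methylation value as a regression over the methylation levels of the cytosines within the gene. The sketch below illustrates that idea only; it is not the ReGear implementation. It assumes ordinary least squares with the binary class label as the response, and the function name `gene_level_methylation`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np

def gene_level_methylation(beta, labels):
    """Illustrative only: fit ordinary least squares of the class label on
    the CpG methylation values of one gene and return the fitted values as
    a single gene-level score per sample.

    beta   : array of shape (n_samples, n_cpgs), methylation levels in [0, 1]
    labels : array of shape (n_samples,), class labels coded as 0/1
    """
    beta = np.asarray(beta, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Prepend a column of ones so the model has an intercept term.
    X = np.column_stack([np.ones(beta.shape[0]), beta])
    weights, *_ = np.linalg.lstsq(X, labels, rcond=None)
    return weights, X @ weights

# Synthetic data (an assumption for illustration): 40 samples, 5 cytosines
# in one gene, each cytosine shifted only slightly between the two classes.
rng = np.random.default_rng(0)
n_samples, n_cpgs = 40, 5
labels = np.repeat([0, 1], n_samples // 2)
beta = rng.normal(0.5, 0.1, size=(n_samples, n_cpgs)) + 0.03 * labels[:, None]
beta = np.clip(beta, 0.0, 1.0)

weights, gene_scores = gene_level_methylation(beta, labels)
print("regression weights:", weights)
print("mean gene score, class 0:", gene_scores[labels == 0].mean())
print("mean gene score, class 1:", gene_scores[labels == 1].mean())
```

The scores here are computed on the same samples used for fitting, so any downstream evaluation of such a score would need the regression to be fitted on training samples only.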