Prediction of RNA Methylation Status From Gene Expression Data Using Classification and Regression Methods. (April 2020)
- Record Type:
- Journal Article
- Title:
- Prediction of RNA Methylation Status From Gene Expression Data Using Classification and Regression Methods. (April 2020)
- Main Title:
- Prediction of RNA Methylation Status From Gene Expression Data Using Classification and Regression Methods
- Authors:
- Xue, Hao
Wei, Zhen
Chen, Kunqi
Tang, Yujiao
Wu, Xiangyu
Su, Jionglong
Meng, Jia - Abstract:
- RNA N 6 -methyladenosine (m 6 A) has emerged as an important epigenetic modification for its role in regulating the stability, structure, processing, and translation of RNA. Instability of m 6 A homeostasis may result in flaws in stem cell regulation, decrease in fertility, and risk of cancer. To this day, experimental detection and quantification of RNA m 6 A modification are still time-consuming and labor-intensive. There is only a limited number of epitranscriptome samples in existing databases, and a matched RNA methylation profile is not often available for a biological problem of interests. As gene expression data are usually readily available for most biological problems, it could be appealing if we can estimate the RNA methylation status from gene expression data using in silico methods. In this study, we explored the possibility of computational prediction of RNA methylation status from gene expression data using classification and regression methods based on mouse RNA methylation data collected from 73 experimental conditions. Elastic Net-regularized Logistic Regression (ENLR), Support Vector Machine (SVM), and Random Forests (RF) were constructed for classification. Both SVM and RF achieved the best performance with the mean area under the curve (AUC) = 0.84 across samples; SVM had a narrower AUC spread. Gene Site Enrichment Analysis was conducted on those sites selected by ENLR as predictors to access the biological significance of the model. Three functionalRNA N 6 -methyladenosine (m 6 A) has emerged as an important epigenetic modification for its role in regulating the stability, structure, processing, and translation of RNA. Instability of m 6 A homeostasis may result in flaws in stem cell regulation, decrease in fertility, and risk of cancer. To this day, experimental detection and quantification of RNA m 6 A modification are still time-consuming and labor-intensive. There is only a limited number of epitranscriptome samples in existing databases, and a matched RNA methylation profile is not often available for a biological problem of interests. As gene expression data are usually readily available for most biological problems, it could be appealing if we can estimate the RNA methylation status from gene expression data using in silico methods. In this study, we explored the possibility of computational prediction of RNA methylation status from gene expression data using classification and regression methods based on mouse RNA methylation data collected from 73 experimental conditions. Elastic Net-regularized Logistic Regression (ENLR), Support Vector Machine (SVM), and Random Forests (RF) were constructed for classification. Both SVM and RF achieved the best performance with the mean area under the curve (AUC) = 0.84 across samples; SVM had a narrower AUC spread. Gene Site Enrichment Analysis was conducted on those sites selected by ENLR as predictors to access the biological significance of the model. Three functional annotation terms were found statistically significant: phosphoprotein, SRC Homology 3 (SH3) domain, and endoplasmic reticulum. All 3 terms were found to be closely related to m 6 A pathway. For regression analysis, Elastic Net was implemented, which yielded a mean Pearson correlation coefficient = 0.68 and a mean Spearman correlation coefficient = 0.64. Our exploratory study suggested that gene expression data could be used to construct predictors for m 6 A methylation status with adequate accuracy. Our work showed for the first time that RNA methylation status may be predicted from the matched gene expression data. This finding may facilitate RNA modification research in various biological contexts when a matched RNA methylation profile is not available, especially in the very early stage of the study. … (more)
- Is Part Of:
- Evolutionary bioinformatics online. Volume 16(2020)
- Journal:
- Evolutionary bioinformatics online
- Issue:
- Volume 16(2020)
- Issue Display:
- Volume 16, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 16
- Issue:
- 2020
- Issue Sort Value:
- 2020-0016-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-04
- Subjects:
- Epitranscriptomics -- next generation sequencing -- multiomics -- machine learning -- supervised learning
Bioinformatics -- Periodicals
Evolutionary computation -- Periodicals
Genetic programming (Computer science) -- Periodicals
Computational Biology
Evolution, Molecular
Bioinformatics
Electronic journals
Periodicals
Fulltext
Internet Resources
Periodicals
Periodicals
576.8 - Journal URLs:
- http://insights.sagepub.com/journal-evolutionary-bioinformatics-j17 ↗
http://www.uk.sagepub.com/home.nav ↗
http://www.la-press.com/evolutionary-bioinformatics-journal-j17 ↗
http://bibpurl.oclc.org/web/38943 ↗ - DOI:
- 10.1177/1176934320915707 ↗
- Languages:
- English
- ISSNs:
- 1176-9343
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 14512.xml