Identification of coenzyme-binding proteins with machine learning algorithms. (April 2019)
- Record Type:
- Journal Article
- Title:
- Identification of coenzyme-binding proteins with machine learning algorithms. (April 2019)
- Main Title:
- Identification of coenzyme-binding proteins with machine learning algorithms
- Authors:
- Liu, Yong
Munteanu, Cristian R.
Kong, Zhiwei
Ran, Tao
Sahagún-Ruiz, Alfredo
He, Zhixiong
Zhou, Chuanshe
Tan, Zhiliang
- Abstract:
- Highlights: Coenzyme-binding proteins (CoEnBPs) play a vital role in fatty acid biosynthesis, gene regulation, lipid synthesis, etc. In silico screening of new peptides for coenzyme-binding activity might be an efficient process. A dataset of 42 SGTI features for 2,897 proteins (456 CoEnBPs) was used as input to construct classification models. The best model was a Random Forest (RF) using 3 SGTI features, with an AUROC of 0.971, a TP rate of 91.7%, and an FP rate of 7.6%. ML models for predicting new CoEnBPs could be useful for future drug development or enzyme catalysis and metabolism research.
Abstract: Coenzyme-binding proteins play a vital role in cellular metabolic processes such as fatty acid biosynthesis, enzyme and gene regulation, lipid synthesis, particular vesicular traffic, and β-oxidation donation of acyl-CoA esters. Based on the theory of Star Graph Topological Indices (SGTIs) of protein primary sequences, we propose a method for developing a first classification model to predict proteins with coenzyme-binding properties. To model the properties of coenzyme-binding proteins, we created a dataset of 2,897 proteins, 456 of which have coenzyme-binding activity. The SGTIs of the peptide sequences were calculated with the Sequence to Star Network (S2SNet) application. We used the SGTIs as inputs to several classification techniques in the machine learning software Weka. A Random Forest classifier based on 3 features of the embedded and non-embedded graphs was identified as the best predictive model for coenzyme-binding proteins. This model achieved a true positive (TP) rate of 91.7%, a false positive (FP) rate of 7.6%, and an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.971. Predicting new coenzyme-binding proteins with this model could be useful for further drug development or enzyme metabolism research.
- Is Part Of:
- Computational biology and chemistry. Volume 79(2019)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 79(2019)
- Issue Display:
- Volume 79, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 79
- Issue:
- 2019
- Issue Sort Value:
- 2019-0079-2019-0000
- Page Start:
- 185
- Page End:
- 192
- Publication Date:
- 2019-04
- Subjects:
- Coenzyme-binding -- Protein sequence -- Topological indices -- Classification model -- Random Forest
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2019.01.014
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 9621.xml