Recognizing protein-metal ion ligands binding residues by random forest algorithm with adding orthogonal properties. (June 2022)
- Record Type:
- Journal Article
- Title:
- Recognizing protein-metal ion ligands binding residues by random forest algorithm with adding orthogonal properties. (June 2022)
- Main Title:
- Recognizing protein-metal ion ligands binding residues by random forest algorithm with adding orthogonal properties
- Authors:
- You, Xiaoxiao
Hu, Xiuzhen
Feng, Zhenxing
Wang, Ziyang
Hao, Sixi
Yang, Caiyun
- Abstract:
- Accurately identifying protein-metal ion ligand binding residues is key to studying protein function. Because the numbers of binding and non-binding residues are heavily imbalanced, false positives are hard to eliminate from binding-residue predictions, so identifying these residues remains challenging. In this paper, the binding sites of seven metal ions (Ca²⁺, Mg²⁺, Zn²⁺, Fe³⁺, Mn²⁺, Cu²⁺ and Co²⁺) were the objects of study. Besides the commonly adopted parameters (amino acids and predicted secondary structure information), we introduced ten orthogonal properties as a parameter. These orthogonal properties are clusters of 188 physical and chemical characteristics that can be used to describe three-dimensional structural information. With the optimized parameters, we used the Random Forest algorithm to predict ion ligand binding residues. The proposed method obtained good prediction results, with the MCC values for Mg²⁺, Ca²⁺ and Zn²⁺ reaching 0.255, 0.254 and 0.540, respectively. Compared with the IonSeq method, the method developed in this paper has advantages in binding-residue prediction for some ions.
Highlights: We introduced ten orthogonal properties as a parameter. The Random Forest was implemented with the "ranger" and "caret" packages in the R language. We used a weighted method to preprocess the data sets, which were significantly imbalanced. We attempted to build an ensemble of the RF and SVM predictors. …
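The abstract reports results as MCC (Matthews correlation coefficient) values and notes that binding and non-binding residues are heavily imbalanced. As a minimal sketch of why MCC suits that setting, the Python snippet below computes MCC from confusion-matrix counts using the standard formula. The function name `mcc` and all counts are hypothetical illustrations, not data from the article, and the article itself used R rather than Python.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (undefined denominator),
    which is a common convention and an assumption of this sketch.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical imbalanced set: 50 binding residues, 950 non-binding residues.
# A predictor that labels every residue "non-binding" is 95% accurate,
# yet its MCC is 0.0 because it finds no binding residues at all.
all_negative = mcc(tp=0, fp=0, tn=950, fn=50)

# A hypothetical predictor that recovers 30 of the 50 binding residues
# at the cost of 40 false positives.
partial = mcc(tp=30, fp=40, tn=910, fn=20)

print(f"all-negative MCC: {all_negative:.3f}")
print(f"partial MCC:      {partial:.3f}")
```

Accuracy alone would rank the all-negative predictor above the partial one (0.95 versus 0.94), while MCC ranks them the other way round, which is the behaviour wanted when false positives and missed binding residues both matter.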
- Is Part Of:
- Computational biology and chemistry. Volume 98(2022)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 98(2022)
- Issue Display:
- Volume 98, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 98
- Issue:
- 2022
- Issue Sort Value:
- 2022-0098-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Metal ion ligands -- Binding residues -- Ten orthogonal properties -- Random Forest algorithm
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2022.107693
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 21569.xml