Prediction of protein–protein interactions based on elastic net and deep forest. (15th August 2021)
- Record Type:
- Journal Article
- Title:
- Prediction of protein–protein interactions based on elastic net and deep forest. (15th August 2021)
- Main Title:
- Prediction of protein–protein interactions based on elastic net and deep forest
- Authors:
- Yu, Bin
Chen, Cheng
Wang, Xiaolin
Yu, Zhaomin
Ma, Anjun
Liu, Bingqiang
- Abstract:
- Highlights: A novel method (GcForest-PPI) to predict protein–protein interactions. The PseAAC, AD, MMI, CTD, AAC-PSSM and DPC-PSSM are fused to extract feature information. The elastic net is employed to eliminate redundant and irrelevant features. We are the first to use deep forest as a classifier to predict PPIs via layer-by-layer processing of raw features. The GcForest-PPI model has good generalization ability on cross-species datasets and PPI networks.
Abstract: Prediction of protein–protein interactions (PPIs) helps to grasp the molecular roots of disease. However, wet-lab experiments to predict PPIs are limited and costly. Machine-learning-based frameworks can not only identify PPIs automatically, but also offer a promising alternative that provides new ideas for drug research and development. We present a novel deep-forest-based method for PPI prediction. Firstly, pseudo amino acid composition (PAAC), autocorrelation descriptor (Auto), multivariate mutual information (MMI), composition-transition-distribution (CTD), amino acid composition position-specific scoring matrix (AAC-PSSM), and dipeptide composition PSSM (DPC-PSSM) are adopted to extract and construct the pattern of PPIs. Secondly, elastic net is utilized to optimize the initial feature vectors and boost the predictive performance. Finally, we ensemble XGBoost, random forest, and extremely randomized trees to construct a deep forest model via a cascade architecture for PPI prediction (GcForest-PPI). Benchmark experiments reveal that the proposed approach outperforms other state-of-the-art predictors on Saccharomyces cerevisiae and Helicobacter pylori. We also apply GcForest-PPI to independent test sets, the CD9-core network, a crossover network, and a cancer-specific network. The evaluation shows that GcForest-PPI can boost prediction accuracy, complement experiments, and improve drug discovery.
- Is Part Of:
- Expert systems with applications. Volume 176(2021)
- Journal:
- Expert systems with applications
- Issue:
- Volume 176(2021)
- Issue Display:
- Volume 176, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 176
- Issue:
- 2021
- Issue Sort Value:
- 2021-0176-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-08-15
- Subjects:
- Protein-protein interactions -- Multi-information fusion -- Elastic net -- Deep forest
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2021.114876
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 23807.xml