A novel feature selection method to predict protein structural class. (October 2018)
- Record Type:
- Journal Article
- Title:
- A novel feature selection method to predict protein structural class. (October 2018)
- Main Title:
- A novel feature selection method to predict protein structural class
- Authors:
- Yuan, Mingshun
Yang, Zijiang
Huang, Guangzao
Ji, Guoli
- Abstract:
- Highlights: Propose a novel feature selection method for high-dimensional protein data classification. Provide a multi-class label coding strategy. Make use of the idea of multi-step components extraction to increase robustness. Propose a method with good time efficiency.
Abstract: Integrating various features from different protein properties helps to improve the prediction accuracy of protein structural class, but it requires dealing with the resulting integrated high-dimensional data. Thus, the feature selection process used to select the informative features from the integrated features becomes an indispensable step. This paper proposes a novel feature selection method, Partial-Maximum-Correlation-Information based Recursive Feature Elimination (PMCI-RFE), to quickly select the best feature subset from the integrated high-dimensional protein feature set and so improve the prediction performance of protein structural class. PMCI-RFE can also be used to find different types of informative features to further analyze some biological relationships. The proposed PMCI-RFE method uses the correlation information between the feature space and the class encoding space to select informative features, based on the idea of orthogonal component projection in the feature space. The experimental results on six widely used benchmark datasets show that, compared with four other state-of-the-art feature selection methods, PMCI-RFE is fast and effective, and that it can indeed make full use of different protein property information and improve the predictability of protein structural class. …
- Is Part Of:
- Computational biology and chemistry. Volume 76(2018)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 76(2018)
- Issue Display:
- Volume 76, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 76
- Issue:
- 2018
- Issue Sort Value:
- 2018-0076-2018-0000
- Page Start:
- 118
- Page End:
- 129
- Publication Date:
- 2018-10
- Subjects:
- Feature selection -- Protein structural class -- Maximum correlation information (MCI) -- Recursive feature elimination (RFE)
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2018.06.007
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 23145.xml