Enzyme classification using multiclass support vector machine and feature subset selection. (October 2017)
- Record Type:
- Journal Article
- Title:
- Enzyme classification using multiclass support vector machine and feature subset selection. (October 2017)
- Main Title:
- Enzyme classification using multiclass support vector machine and feature subset selection
- Authors:
- Pradhan, Debasmita
Padhy, Sudarsan
Sahoo, Biswajit
- Abstract:
- Highlights: Physico-chemical properties important for protein function classification are identified using the Orthogonal Forward Selection (OFS) method. A given protein is classified as enzyme or non-enzyme using a binary SVM. An enzyme is classified into 6 functional classes using a multiclass SVM. In comparison, our model (OFS followed by multiclass SVM) performs better than other methods.
Abstract: Proteins are the macromolecules responsible for almost all biological processes in a cell. With the availability of a large number of protein sequences from different sequencing projects, the challenge for scientists is to characterize their functions. As wet-lab methods are time-consuming and expensive, many computational methods have been proposed, such as FASTA, PSI-BLAST, DNA microarray clustering, and Nearest Neighborhood classification on protein–protein interaction networks. The support vector machine (SVM) is one such method; it has been used successfully for problems such as protein fold recognition and protein structure prediction. Cai et al. (2003) used SVM to classify proteins into functional classes and to predict their function, representing the protein sequences by their physico-chemical properties. In this paper, a model comprising feature subset selection followed by a multiclass support vector machine is proposed to determine the functional class of a newly generated protein sequence. To train and test the model, 32 physico-chemical properties of enzymes from 6 enzyme classes are considered. To determine the features that contribute significantly to functional classification, the Sequential Forward Floating Selection (SFFS), Orthogonal Forward Selection (OFS), and SVM Recursive Feature Elimination (SVM-RFE) algorithms are used. It is observed that, of the 32 properties considered initially, only 20 features are sufficient to classify the proteins into their functional classes, with accuracy ranging from 91% to 94%. In comparison, OFS followed by SVM performs better than the other methods. Our model generalizes the existing model to include multiclass classification and to identify the most significant features affecting protein function.
- Is Part Of:
- Computational biology and chemistry. Volume 70(2017)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 70(2017)
- Issue Display:
- Volume 70, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 70
- Issue:
- 2017
- Issue Sort Value:
- 2017-0070-2017-0000
- Page Start:
- 211
- Page End:
- 219
- Publication Date:
- 2017-10
- Subjects:
- Enzyme classification -- Multiclass SVM -- Sequential Forward Floating Selection (SFFS) -- Orthogonal Forward Selection (OFS) -- SVM Recursive Feature Elimination (SVM-RFE) -- Random Forest (RF)
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2017.08.009
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 4716.xml