A multiple information fusion method for predicting subcellular locations of two different types of bacterial protein simultaneously. (January 2016)
- Record Type:
- Journal Article
- Title:
- A multiple information fusion method for predicting subcellular locations of two different types of bacterial protein simultaneously. (January 2016)
- Main Title:
- A multiple information fusion method for predicting subcellular locations of two different types of bacterial protein simultaneously
- Authors:
- Chen, Jing
Xu, Huimin
He, Ping-an
Dai, Qi
Yao, Yuhua
- Abstract:
- Subcellular localization prediction of bacterial protein is an important component of bioinformatics, which has great importance for drug design and other applications. For the prediction of protein subcellular localization, as we all know, lots of computational tools have been developed in the recent decades. In this study, we firstly introduce three kinds of protein sequences encoding schemes: physicochemical-based, evolutionary-based, and GO-based. The original and consensus sequences were combined with physicochemical properties. And elements information of different rows and columns in position-specific scoring matrix were taken into consideration simultaneously for more core and essence information. Computational methods based on gene ontology (GO) have been demonstrated to be superior to methods based on other features. Then principal component analysis (PCA) is applied for feature selection and reduced vectors are input to a support vector machine (SVM) to predict protein subcellular localization. The proposed method can achieve a prediction accuracy of 98.28% and 97.87% on a stringent Gram-positive (Gpos) and Gram-negative (Gneg) dataset with Jackknife test, respectively. At last, we calculate "absolute true overall accuracy (ATOA)", which is stricter than overall accuracy. The ATOA obtained from the proposed method is also up to 97.32% and 93.06% for Gpos and Gneg. From both the rationality of testing procedure and the success rates of test results, the current method can improve the prediction quality of protein subcellular localization.
- Is Part Of:
- Bio systems. Volume 139(2016:Jan.)
- Journal:
- Bio systems
- Issue:
- Volume 139(2016:Jan.)
- Issue Display:
- Volume 139 (2016)
- Year:
- 2016
- Volume:
- 139
- Issue Sort Value:
- 2016-0139-0000-0000
- Page Start:
- 37
- Page End:
- 45
- Publication Date:
- 2016-01
- Subjects:
- Physicochemical properties -- Position-specific score matrix -- Gene ontology -- Principal component analysis -- Support vector machine
Biological systems -- Periodicals
Biology -- Periodicals
Evolution -- Periodicals
Biologie -- Périodiques
Évolution -- Périodiques
570
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03032647
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.biosystems.2015.12.002
- Languages:
- English
- ISSNs:
- 0303-2647
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2089.670000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 1575.xml