Predicting protein subcellular localization based on information content of gene ontology terms. (December 2016)
- Record Type:
- Journal Article
- Title:
- Predicting protein subcellular localization based on information content of gene ontology terms. (December 2016)
- Main Title:
- Predicting protein subcellular localization based on information content of gene ontology terms
- Authors:
- Zhang, Shu-Bo
Tang, Qiang-Rong
- Abstract:
- Highlights: A GO-driven method to predict protein subcellular localization. Information content is derived from the lower part of the GO graph. The feature vector is constructed by combining information from both the upper and lower parts of the three GO graphs. Integrated features generally outperform individual features. Integrated features with an SVM classifier produce over 90% prediction accuracy.
Abstract: Predicting the location where a protein resides within a cell is important in cell biology. Computational approaches to this problem have attracted increasing attention from the biomedical community. Among the protein features used to predict the subcellular localization of proteins, features derived from Gene Ontology (GO) have been shown to be superior to others. However, most work in this field focuses on the presence or absence of some predefined GO terms. We propose a method to derive information from the intrinsic structure of the GO graph. The feature vector was constructed so that each element represents the information content of a GO term annotating the protein under investigation, and a support vector machine was used as the classifier to test the extracted features. Evaluation experiments were conducted on three protein datasets, and the results show that our method improves eukaryotic and human subcellular location prediction accuracy by up to 1.1% over previous studies that also used GO-based features. In particular, in the scenario where the cellular component annotation is absent, our method achieves satisfactory results, with an overall accuracy of more than 87%. …
- Is Part Of:
- Computational biology and chemistry. Volume 65(2016)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 65(2016)
- Issue Display:
- Volume 65, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 65
- Issue:
- 2016
- Issue Sort Value:
- 2016-0065-2016-0000
- Page Start:
- 1
- Page End:
- 7
- Publication Date:
- 2016-12
- Subjects:
- Protein subcellular localization -- Gene ontology -- Information content -- Support vector machines
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2016.09.009
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 7632.xml