Protein (multi-)location prediction: utilizing interdependencies via a generative model. (10th June 2015)
- Record Type:
- Journal Article
- Title:
- Protein (multi-)location prediction: utilizing interdependencies via a generative model. (10th June 2015)
- Main Title:
- Protein (multi-)location prediction: utilizing interdependencies via a generative model
- Authors:
- Simha, Ramanuja
Briesemeister, Sebastian
Kohlbacher, Oliver
Shatkay, Hagit
- Abstract:
- Motivation: Proteins are responsible for a multitude of vital tasks in all living organisms. Given that a protein's function and role are strongly related to its subcellular location, protein location prediction is an important research area. While proteins move from one location to another and can localize to multiple locations, most existing location prediction systems assign only a single location per protein. A few recent systems attempt to predict multiple locations for proteins; however, their performance leaves much room for improvement. Moreover, such systems do not capture dependencies among locations and usually consider locations as independent. We hypothesize that a multi-location predictor that captures location inter-dependencies can improve location predictions for proteins.
Results: We introduce a probabilistic generative model for protein localization, and develop a system based on it, which we call MDLoc, that utilizes inter-dependencies among locations to predict multiple locations for proteins. The model captures location inter-dependencies using Bayesian networks and represents dependency between features and locations using a mixture model. We use iterative processes for learning model parameters and for estimating protein locations. We evaluate our classifier, MDLoc, on a dataset of single- and multi-localized proteins derived from the DBMLoc dataset, which is the most comprehensive protein multi-localization dataset currently available. Our results, obtained by using MDLoc, significantly improve upon results obtained by an initial simpler classifier, as well as on results reported by other top systems.
Availability and implementation: MDLoc is available at: http://www.eecis.udel.edu/~compbio/mdloc.
Contact: shatkay@udel.edu.
- Is Part Of:
- Bioinformatics. Volume 31:Number 12(2015)
- Journal:
- Bioinformatics
- Issue:
- Volume 31:Number 12(2015)
- Issue Display:
- Volume 31, Issue 12 (2015)
- Year:
- 2015
- Volume:
- 31
- Issue:
- 12
- Issue Sort Value:
- 2015-0031-0012-0000
- Page Start:
- i365
- Page End:
- i374
- Publication Date:
- 2015-06-10
- Subjects:
- Bioinformatics -- Periodicals
Genomics -- Data processing -- Periodicals
Computational biology -- Periodicals
572.80285
- Journal URLs:
- http://bioinformatics.oxfordjournals.org
http://firstsearch.oclc.org
http://ukcatalogue.oup.com/
- DOI:
- 10.1093/bioinformatics/btv264
- Languages:
- English
- ISSNs:
- 1367-4803
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2072.348000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12388.xml