Multi-instance multi-label distance metric learning for genome-wide protein function prediction. (August 2016)
- Record Type:
- Journal Article
- Title:
- Multi-instance multi-label distance metric learning for genome-wide protein function prediction. (August 2016)
- Main Title:
- Multi-instance multi-label distance metric learning for genome-wide protein function prediction
- Authors:
- Xu, Yonghui
Min, Huaqing
Song, Hengjie
Wu, Qingyao
- Abstract:
- Multi-instance multi-label (MIML) learning has been proven to be effective for the genome-wide protein function prediction problems where each training example is associated with not only multiple instances but also multiple class labels. To find an appropriate MIML learning method for genome-wide protein function prediction, many studies in the literature attempted to optimize objective functions in which dissimilarity between instances is measured using the Euclidean distance. But in many real applications, Euclidean distance may be unable to capture the intrinsic similarity/dissimilarity in feature space and label space. Unlike other previous approaches, in this paper, we propose to learn a multi-instance multi-label distance metric learning framework (MIMLDML) for genome-wide protein function prediction. Specifically, we learn a Mahalanobis distance to preserve and utilize the intrinsic geometric information of both feature space and label space for MIML learning. In addition, we try to deal with the sparsely labeled data by giving weight to the labeled data. Extensive experiments on seven real-world organisms covering the biological three-domain system (i.e., archaea, bacteria, and eukaryote; Woese et al., 1990) show that the MIMLDML algorithm is superior to most state-of-the-art MIML learning algorithms.
- Is Part Of:
- Computational biology and chemistry. Volume 63(2016)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 63(2016)
- Issue Display:
- Volume 63, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 63
- Issue:
- 2016
- Issue Sort Value:
- 2016-0063-2016-0000
- Page Start:
- 30
- Page End:
- 40
- Publication Date:
- 2016-08
- Subjects:
- Protein function prediction -- Genome wide -- Distance metric learning -- Machine learning -- Multi-instance multi-label learning
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2016.02.011
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 7368.xml