Robust fuzzy clustering for multiple instance regression. (June 2019)
- Record Type:
- Journal Article
- Title:
- Robust fuzzy clustering for multiple instance regression. (June 2019)
- Main Title:
- Robust fuzzy clustering for multiple instance regression
- Authors:
- Trabelsi, Mohamed
Frigui, Hichem
- Abstract:
- Highlights: We propose a novel multiple instance regression (MIR) algorithm that is based on robust clustering. The proposed algorithm can identify the optimal number of linear regression models and the primary instances within each bag, and can assign labels at the instance level as well as at the bag level. The proposed algorithm was evaluated using synthetic data sets of varying difficulty and was shown to outperform four existing MIR algorithms. The proposed algorithm was applied to remote sensing data to predict the yearly average yield of a crop and to drug activity prediction. We showed that our approach achieves higher accuracy and more consistent results than existing methods.
Abstract: Multiple instance regression (MIR) operates on a collection of bags, where each bag contains many instances sharing the same real-valued label. Only a few instances, called primary instances, contribute to the bag labels. The remaining ones are noisy observations. The goal in MIR is to identify the primary instances within each bag and learn a regression model that can predict the label of a previously unseen bag. In this paper, we show that regression models can be identified as clusters when appropriate features and distances are used. We introduce an algorithm, called Robust Fuzzy Clustering for Multiple Instance Regression (RFC-MIR), that can learn multiple linear models simultaneously. First, RFC-MIR uses constrained fuzzy memberships to obtain an initial partition where instances can belong to multiple models to varying degrees. Then, it uses unconstrained possibilistic memberships to allow the initial local models to expand and converge to the global model. These memberships are also used to identify the primary instances within each bag. After clustering, the possibilistic memberships are used to identify the optimal number of regression models. We evaluate our approach on synthetic data sets generated by varying the dimensionality of the feature space, the number of instances per bag, and the noise level. We also validate RFC-MIR using two real applications: prediction of the yearly average yield of a crop using remote sensing data, and drug activity prediction. These applications have been used consistently to validate existing MIR algorithms. We show that our approach achieves higher accuracy than existing methods.
- Is Part Of:
- Pattern recognition. Volume 90(2019:Jun.)
- Journal:
- Pattern recognition
- Issue:
- Volume 90(2019:Jun.)
- Issue Display:
- Volume 90 (2019)
- Year:
- 2019
- Volume:
- 90
- Issue Sort Value:
- 2019-0090-0000-0000
- Page Start:
- 424
- Page End:
- 435
- Publication Date:
- 2019-06
- Subjects:
- Multiple instance regression -- Fuzzy clustering -- Possibilistic clustering -- Multiple model regression
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2019.01.030
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9562.xml
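
Note on the abstract: the abstract contrasts constrained fuzzy memberships with unconstrained possibilistic memberships. As a rough illustration of that distinction only, the sketch below applies the textbook fuzzy c-means and possibilistic c-means membership formulas to squared regression residuals. It is not the RFC-MIR algorithm from the article. The function names, the fuzzifier `m`, the scale parameters `eta`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def squared_residuals(X, y, models):
    """Squared residual of every instance under every linear model.

    X: (n, d) instance features; y: (n,) label each instance inherits
    from its bag; models: (C, d + 1) rows of [weights..., bias].
    Returns an array of shape (C, n).
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    preds = models @ Xb.T                          # (C, n) predictions
    return (preds - y[None, :]) ** 2

def fuzzy_memberships(d2, m=2.0, eps=1e-12):
    """Constrained memberships: each column (instance) sums to 1."""
    inv = (d2 + eps) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def possibilistic_memberships(d2, eta, m=2.0):
    """Unconstrained memberships: each value depends only on the
    instance's own residual relative to the model's scale eta."""
    return 1.0 / (1.0 + (d2 / eta[:, None]) ** (1.0 / (m - 1.0)))

# Hypothetical toy data: two 1-D linear models, y = 2x and y = -x + 1.
models = np.array([[2.0, 0.0],
                   [-1.0, 1.0]])
X = np.array([[1.0], [1.0], [1.0]])
y = np.array([2.0, 0.0, 50.0])  # fits model 0, fits model 1, fits neither

d2 = squared_residuals(X, y, models)
u = fuzzy_memberships(d2)                                  # constrained
t = possibilistic_memberships(d2, eta=np.array([1.0, 1.0]))  # unconstrained

print(np.round(u, 3))
print(np.round(t, 4))
```

In this toy setting the third instance fits neither model, yet the sum-to-one constraint still splits its fuzzy membership roughly evenly between the two models, while its possibilistic memberships are both close to zero. This is the general reason possibilistic memberships are considered more robust to noisy observations; how RFC-MIR uses them to pick primary instances and the number of models is described in the article itself.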