Network‐based semisupervised clustering. (24th March 2021)
- Record Type:
- Journal Article
- Title:
- Network‐based semisupervised clustering. (24th March 2021)
- Main Title:
- Network‐based semisupervised clustering
- Authors:
- Frigau, Luca
Contu, Giulia
Mola, Francesco
Conversano, Claudio
- Other Names:
- Ekin Tahir guestEditor.
Puggioni Gavino guestEditor.
Canas Rodrigues Paulo guestEditor.
- Abstract:
- Semisupervised clustering extends standard clustering methods to the semisupervised setting, in some cases considering situations when clusters are associated with a given outcome variable that acts as a "noisy surrogate," that is, a good proxy of the unknown clustering structure. In this article, a novel approach to semisupervised clustering associated with an outcome variable, named network‐based semisupervised clustering (NeSSC), is introduced. It combines an initialization, a training, and an agglomeration phase. In the initialization and training phases, a matrix of pairwise affinity of the instances is estimated by a classifier. In the agglomeration phase, the matrix of pairwise affinity is transformed into a complex network, in which a community detection algorithm searches for the underlying community structure. Thus, a partition of the instances into clusters highly homogeneous in terms of the outcome is obtained. We consider a particular specification of NeSSC that uses classification or regression trees as classifiers and Louvain, label propagation, and Walktrap as possible community detection algorithms. NeSSC's stopping criterion and the choice of the optimal partition of the original data are also discussed. Several applications on both real and simulated data are presented to demonstrate the effectiveness of the proposed semisupervised clustering method and the benefits it provides in terms of improved interpretability of results with respect to three alternative semisupervised clustering methods.
- Is Part Of:
- Applied stochastic models in business and industry. Volume 37:Number 2(2021)
- Journal:
- Applied stochastic models in business and industry
- Issue:
- Volume 37:Number 2(2021)
- Issue Display:
- Volume 37, Issue 2 (2021)
- Year:
- 2021
- Volume:
- 37
- Issue:
- 2
- Issue Sort Value:
- 2021-0037-0002-0000
- Page Start:
- 182
- Page End:
- 202
- Publication Date:
- 2021-03-24
- Subjects:
- community detection -- complex networks -- label propagation -- Louvain -- tree‐based classifiers -- Walktrap
Stochastic analysis -- Periodicals
Stochastic processes -- Periodicals
Business mathematics -- Periodicals
Finance -- Mathematical models -- Periodicals
Industrial management -- Mathematical models -- Periodicals
338.00151923
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/asmb.2618
- Languages:
- English
- ISSNs:
- 1524-1904
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 1580.062200
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 16349.xml
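
The abstract in this record describes estimating a pairwise affinity matrix with a classifier and then searching the resulting network for communities. The Python sketch below illustrates that general idea only and is not the authors' NeSSC implementation. It rests on several assumptions that do not come from the article: depth-1 trees (stumps) stand in for classification trees, affinity is taken to be the share of rounds in which two instances land in the same leaf, edges are kept when affinity is at least a hypothetical `min_weight`, and a basic label propagation routine plays the role of the community detection algorithm. All function names, parameters, and the toy data are illustrative, and the article's stopping criterion and choice of optimal partition are not reproduced.

```python
import random
from collections import Counter, defaultdict


def fit_stump(X, y, feature):
    """Depth-1 tree on one feature: return the threshold with the fewest
    misclassified labels when each side predicts its majority class."""
    values = sorted({row[feature] for row in X})
    best_thr, best_err = values[0], len(y) + 1
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [lab for row, lab in zip(X, y) if row[feature] <= thr]
        right = [lab for row, lab in zip(X, y) if row[feature] > thr]
        err = sum(len(side) - Counter(side).most_common(1)[0][1]
                  for side in (left, right))
        if err < best_err:
            best_thr, best_err = thr, err
    return best_thr


def affinity_matrix(X, y, n_rounds=30, frac=0.75, seed=0):
    """Assumed affinity: share of rounds in which two rows share a leaf."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_rounds):
        # Train each stump on a random subset of rows and one random feature.
        rows = rng.sample(range(n), max(2, int(frac * n)))
        feature = rng.randrange(p)
        thr = fit_stump([X[i] for i in rows], [y[i] for i in rows], feature)
        leaf = [row[feature] > thr for row in X]  # leaf id for every row
        for i in range(n):
            for j in range(n):
                if i != j and leaf[i] == leaf[j]:
                    counts[i][j] += 1
    return [[c / n_rounds for c in row] for row in counts]


def label_propagation(A, min_weight=0.5, max_sweeps=100, seed=0):
    """Basic weighted label propagation on edges with A[i][j] >= min_weight."""
    rng = random.Random(seed)
    n = len(A)
    labels = list(range(n))  # every node starts in its own community
    for _ in range(max_sweeps):
        changed = False
        order = list(range(n))
        rng.shuffle(order)
        for i in order:
            scores = defaultdict(float)
            for j in range(n):
                if j != i and A[i][j] >= min_weight:
                    scores[labels[j]] += A[i][j]
            if not scores:
                continue  # isolated node keeps its own label
            # Heaviest neighbouring label wins; ties go to the smallest label.
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
    return labels


# Toy data (made up): two well-separated groups whose outcome y matches the group.
X = [[0.1 * k, 0.2 * k] for k in range(6)] + \
    [[10 + 0.1 * k, 20 + 0.2 * k] for k in range(6)]
y = [0] * 6 + [1] * 6

A = affinity_matrix(X, y)
clusters = label_propagation(A)
print(clusters)
```

On this toy data the first six and last six instances end up in two separate communities, each homogeneous in the outcome. A fuller experiment would presumably swap in full classification or regression trees and a library implementation of Louvain, label propagation, or Walktrap.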