Mixture‐based clustering for count data using approximated Fisher Scoring and Minorization–Maximization approaches. (27th December 2020)
- Record Type:
- Journal Article
- Title:
- Mixture‐based clustering for count data using approximated Fisher Scoring and Minorization–Maximization approaches. (27th December 2020)
- Main Title:
- Mixture‐based clustering for count data using approximated Fisher Scoring and Minorization–Maximization approaches
- Authors:
- Bregu, Ornela
Zamzami, Nuha
Bouguila, Nizar
- Abstract:
- The multinomial distribution has been widely used to model count data. To increase clustering efficiency, we use an approximation to the Fisher scoring algorithm, which is more robust regarding the choice of initial parameter values. Then, we use a novel approach to estimate the optimal number of components, based on minimum message length criterion. Moreover, we consider a generalization of the multinomial model obtained by introducing the Dirichlet as prior, yielding the Dirichlet Compound Multinomial (DCM). Even though DCM can address the burstiness phenomenon of count data, the presence of Gamma function in its density function usually leads to undesired complications. In this article, we use two alternative representations of DCM distribution to perform clustering based on finite mixture models, where the mixture parameters are estimated using the minorization–maximization framework. To evaluate and compare the performance of our proposed models, we have considered three challenging real‐world applications that involve high‐dimensional count vectors, namely, sentiment analysis, facial expression recognition, and human action recognition. The results show that the proposed algorithms increase the clustering efficiency of their respective models remarkably, and the best results are achieved by the second parametrization of DCM, which can accommodate over‐dispersed count data.
- Is Part Of:
- Computational intelligence. Volume 37:Number 1(2021)
- Journal:
- Computational intelligence
- Issue:
- Volume 37:Number 1(2021)
- Issue Display:
- Volume 37, Issue 1 (2021)
- Year:
- 2021
- Volume:
- 37
- Issue:
- 1
- Issue Sort Value:
- 2021-0037-0001-0000
- Page Start:
- 596
- Page End:
- 620
- Publication Date:
- 2020-12-27
- Subjects:
- approximated Fisher scoring -- count data -- DCM -- high‐dimensional -- minorization–maximization -- MML
Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/coin.12429
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 15883.xml