A data-driven selection of the number of clusters in the Dirichlet allocation model via Bayesian mixture modelling. Issue 15 (13th October 2019)
- Record Type:
- Journal Article
- Title:
- A data-driven selection of the number of clusters in the Dirichlet allocation model via Bayesian mixture modelling. Issue 15 (13th October 2019)
- Main Title:
- A data-driven selection of the number of clusters in the Dirichlet allocation model via Bayesian mixture modelling
- Authors:
- Saraiva, E. F.
Pereira, C. A. B.
Suzuki, A. K.
- Abstract:
- In this paper, we consider a Bayesian mixture model in which the mixture weights can be integrated out, yielding a procedure that treats the number of clusters as an unknown quantity. To determine clusters and estimate the parameters of interest, we develop an MCMC algorithm called the sequential data-driven allocation sampler. In this algorithm, a single observation has a non-zero probability of creating a new cluster, and a set of observations may create a new cluster through split-merge moves. The split-merge moves are developed using a sequential allocation procedure based on allocation probabilities that are calculated from the Kullback–Leibler divergence between the posterior distribution given the previously allocated observations and the posterior distribution that also includes a 'new' observation. We verify the performance of the proposed algorithm on simulated data and then illustrate its use on three publicly available real data sets.
- Is Part Of:
- Journal of statistical computation and simulation. Volume 89:Issue 15(2019)
- Journal:
- Journal of statistical computation and simulation
- Issue:
- Volume 89:Issue 15(2019)
- Issue Display:
- Volume 89, Issue 15 (2019)
- Year:
- 2019
- Volume:
- 89
- Issue:
- 15
- Issue Sort Value:
- 2019-0089-0015-0000
- Page Start:
- 2848
- Page End:
- 2870
- Publication Date:
- 2019-10-13
- Subjects:
- Mixture model -- Bayesian approach -- Gibbs sampling -- Metropolis–Hastings -- split-merge update -- Kullback–Leibler divergence
62M05
Mathematical statistics -- Data processing -- Periodicals
Digital computer simulation -- Periodicals
519.5028505
- Journal URLs:
- http://www.tandfonline.com/loi/gscs20
http://www.tandfonline.com/
- DOI:
- 10.1080/00949655.2019.1643345
- Languages:
- English
- ISSNs:
- 0094-9655
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5066.820000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11352.xml