Weighting schemes based on EM algorithm for LDA. (2018)
- Record Type:
- Journal Article
- Title:
- Weighting schemes based on EM algorithm for LDA. (2018)
- Main Title:
- Weighting schemes based on EM algorithm for LDA
- Authors:
- Ju, Yaya
Yan, Jianfeng
Liu, Zhiqiang
Yang, Lu
- Abstract:
- Latent Dirichlet allocation (LDA) is a popular probabilistic topic modelling method, which automatically finds latent topics from a corpus. LDA users often encounter two major problems: first, LDA treats each word equally, and common words tend to scatter across almost all topics without reason, thereby leading to bad topic interpretability, consistency, and overlap. Second, an appropriate way to distinguish low-dimensional topic features for better classification performance is lacking. To overcome these two shortcomings, we propose two novel weighting schemes: a word-weighted scheme, which is realised by introducing a weight factor during the iterative process, and a topic-weighted scheme, which is realised by combining the Jensen-Shannon (JS) distance and the entropy of the generated low-dimensional topic features as a weight coefficient, using expectation-maximisation (EM). Experimental results show that the word-weighted scheme can find better topics for improving the clustering performance effectively, and the topic-weighted scheme has a larger effect on text classification than traditional methods.
- Is Part Of:
- International journal of high performance systems architecture. Volume 8:Number 1/2(2018)
- Journal:
- International journal of high performance systems architecture
- Issue:
- Volume 8:Number 1/2(2018)
- Issue Display:
- Volume 8, Issue 1/2 (2018)
- Year:
- 2018
- Volume:
- 8
- Issue:
- 1/2
- Issue Sort Value:
- 2018-0008-NaN-0000
- Page Start:
- 94
- Page End:
- 104
- Publication Date:
- 2018
- Subjects:
- latent Dirichlet allocation -- LDA -- expectation-maximisation -- word-weighted scheme -- topic-weighted scheme
Computer architecture -- Periodicals
Computer systems -- Periodicals
High performance computing -- Periodicals
004.205
- Journal URLs:
- http://www.inderscience.com/jhome.php?jcode=ijhpsa
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1751-6528
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 9261.xml