Supervising topic models with Gaussian processes. (May 2018)
- Record Type:
- Journal Article
- Title:
- Supervising topic models with Gaussian processes. (May 2018)
- Main Title:
- Supervising topic models with Gaussian processes
- Authors:
- Kandemir, Melih
Kekeç, Taygun
Yeniterzi, Reyyan
- Abstract:
- Highlights: We propose the first model that can supervise Latent Dirichlet Allocation (LDA) with Gaussian Processes (GPs). LDA and GP are jointly trained by a novel variational inference algorithm that adapts ideas from Deep GPs. Unlike Supervised LDA (sLDA), our model learns non-linear mappings from topic activations to document classes. By virtue of this non-linearity, our model outperforms sLDA, as well as a disjointly trained cascade of LDA and GP, on three real-world data sets from two different domains.
Abstract: Topic modeling is a powerful approach for modeling data represented as high-dimensional histograms. While the high dimensionality of such input data is extremely beneficial in unsupervised applications such as language modeling and text data exploration, it introduces difficulties when class information is available to boost prediction performance. Feeding such input directly to a classifier suffers from the curse of dimensionality. Performing dimensionality reduction and classification disjointly, on the other hand, cannot reach optimal performance, because information is lost in the gap between these two mutually unaware steps. Existing supervised topic models, introduced as a remedy to such scenarios, have thus far incorporated only linear classifiers in order to keep inference tractable, at a dramatic sacrifice in expressive power. In this paper, we propose the first Bayesian construction to perform topic modeling and non-linear classification jointly. We use the well-known Latent Dirichlet Allocation (LDA) for topic modeling and sparse Gaussian processes for non-linear classification. We combine these two components by a latent variable encoding the empirical topic distribution of each document in the corpus. We achieve a novel variational inference scheme by adapting ideas from the newly emerging deep Gaussian processes into the realm of topic modeling. We demonstrate that our model outperforms other existing approaches such as: (i) disjoint LDA and non-linear classification, (ii) joint LDA and linear classification, (iii) joint non-LDA linear subspace modeling and linear classification, and (iv) non-linear classification without topic modeling, on three benchmark data sets from two real-world applications: text categorization and image tagging.
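The disjointly trained LDA-then-GP cascade that the abstract uses as a baseline can be sketched with scikit-learn; this is a toy illustration under that assumption (documents, labels, and hyperparameters below are invented), not the authors' joint variational model, which trains both components together:

```python
# Sketch of the DISJOINT baseline: unsupervised LDA reduces word
# histograms to topic activations, then a GP classifier is trained
# on those activations. All data and settings here are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock prices rose sharply today",
    "markets fell as investors sold stock",
]
labels = [0, 0, 1, 1]  # toy classes: pets vs. finance

# Step 1: high-dimensional bag-of-words histograms.
counts = CountVectorizer().fit_transform(docs)

# Step 2: LDA fit with no knowledge of the labels (the "gap"
# the paper closes by joint training).
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topics = lda.fit_transform(counts)  # per-document topic activations

# Step 3: non-linear GP classifier on the low-dimensional topic space.
gp = GaussianProcessClassifier(kernel=RBF(length_scale=1.0), random_state=0)
gp.fit(topics, labels)
preds = gp.predict(topics)
```

Because the two steps are unaware of each other, the topic space is not shaped by the labels; the paper's contribution is to couple them in one Bayesian model with a shared variational objective.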
- Is Part Of:
- Pattern recognition. Volume 77(2018:May)
- Journal:
- Pattern recognition
- Issue:
- Volume 77(2018:May)
- Issue Display:
- Volume 77 (2018)
- Year:
- 2018
- Volume:
- 77
- Issue Sort Value:
- 2018-0077-0000-0000
- Page Start:
- 226
- Page End:
- 236
- Publication Date:
- 2018-05
- Subjects:
- Latent Dirichlet allocation -- Nonparametric Bayesian inference -- Gaussian processes -- Variational inference -- Supervised topic models
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2017.12.019
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11338.xml