Unsupervised learning of semantic representation for documents with the law of total probability. (2nd November 2017)
- Record Type:
- Journal Article
- Title:
- Unsupervised learning of semantic representation for documents with the law of total probability. (2nd November 2017)
- Main Title:
- Unsupervised learning of semantic representation for documents with the law of total probability
- Authors:
- WEI, YANG
- WEI, JINMAO
- YANG, ZHENGLU
- Abstract:
- The semantic information of documents needs to be represented because it is the basis for many applications, such as document summarization, web search, and text analysis. Although many studies have explored this problem by enriching document vectors with the relatedness of the words involved, the performance remains far from satisfactory because the physical boundaries of documents hinder the evaluation of the relatedness between words. To address this problem, we propose an effective approach to further infer the implicit relatedness between words via their common related words. To avoid overestimation of the implicit relatedness, we restrict the inference in terms of the marginal probabilities of the words based on the law of total probability. The proposed method measures the relatedness between words, which is confirmed theoretically and experimentally. Thorough evaluation on real datasets illustrates that significant improvement on document clustering has been achieved with the proposed method compared with state-of-the-art methods.
- Is Part Of:
- Natural language engineering. Volume 24:Part 4(2018)
- Journal:
- Natural language engineering
- Issue:
- Volume 24:Part 4(2018)
- Issue Display:
- Volume 24, Issue 4, Part 4 (2018)
- Year:
- 2018
- Volume:
- 24
- Issue:
- 4
- Part:
- 4
- Issue Sort Value:
- 2018-0024-0004-0004
- Page Start:
- 491
- Page End:
- 522
- Publication Date:
- 2017-11-02
- Subjects:
- Natural language processing (Computer science) -- Periodicals
- Software engineering -- Periodicals
- 006.35
- Journal URLs:
- http://journals.cambridge.org/action/displayJournal?jid=NLE
- DOI:
- 10.1017/S1351324917000420
- Languages:
- English
- ISSNs:
- 1351-3249
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD Digital store
- Ingest File:
- 6800.xml