An unsupervised multilingual approach for online social media topic identification. (15th September 2017)
- Record Type:
- Journal Article
- Title:
- An unsupervised multilingual approach for online social media topic identification. (15th September 2017)
- Main Title:
- An unsupervised multilingual approach for online social media topic identification
- Authors:
- Lo, Siaw Ling
Chiong, Raymond
Cornforth, David
- Abstract:
- Highlights: An unsupervised multilingual approach to identify topics on Twitter is proposed. Localised language can be leveraged for identifying relevant and important topics. 'Joint' term ranking coupled with DPMM clustering consistently performed well. Multilingual sentiment analysis is essential to understand sentiment on the ground. Topics coverage of social media and main stream media does not always stay the same.
Abstract: Social media data can be valuable in many ways. However, the vast amount of content shared and the linguistic variants of languages used on social media are making it very challenging for high-value topics to be identified. In this paper, we present an unsupervised multilingual approach for identifying highly relevant terms and topics from the mass of social media data. This approach combines term ranking, localised language analysis, unsupervised topic clustering and multilingual sentiment analysis to extract prominent topics through analysis of Twitter's tweets from a period of time. It is observed that each of the ranking methods tested has their strengths and weaknesses, and that our proposed 'Joint' ranking method is able to take advantage of the strengths of the ranking methods. This 'Joint' ranking method coupled with an unsupervised topic clustering model is shown to have the potential to discover topics of interest or concern to a local community. Practically, being able to do so may help decision makers to gauge the true opinions or concerns on the ground. Theoretically, the research is significant as it shows how an unsupervised online topic identification approach can be designed without much manual annotation effort, which may have great implications for future development of expert and intelligent systems.
- Is Part Of:
- Expert systems with applications. Volume 81(2017)
- Journal:
- Expert systems with applications
- Issue:
- Volume 81(2017)
- Issue Display:
- Volume 81, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 81
- Issue:
- 2017
- Issue Sort Value:
- 2017-0081-2017-0000
- Page Start:
- 282
- Page End:
- 298
- Publication Date:
- 2017-09-15
- Subjects:
- Topic identification -- Multilingual analysis -- Unsupervised learning -- Social media
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2017.03.029
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 1557.xml