A linguistic approach for determining the topics of Spanish Twitter messages. (April 2015)
- Record Type:
- Journal Article
- Title:
- A linguistic approach for determining the topics of Spanish Twitter messages. (April 2015)
- Main Title:
- A linguistic approach for determining the topics of Spanish Twitter messages
- Authors:
- Vilares, David
Alonso, Miguel A.
Gómez-Rodríguez, Carlos
- Abstract:
- The vast number of opinions and reviews provided in Twitter is helpful in order to make interesting findings about a given industry, but given the huge number of messages published every day, it is important to detect the relevant ones. In this respect, the Twitter search functionality is not a practical tool when we want to poll messages dealing with a given set of general topics. This article presents an approach to classify Twitter messages into various topics. We tackle the problem from a linguistic angle, taking into account part-of-speech, syntactic and semantic information, showing how language processing techniques should be adapted to deal with the informal language present in Twitter messages. The TASS 2013 General corpus, a collection of tweets that has been specifically annotated to perform text analytics tasks, is used as the dataset in our evaluation framework. We carry out a wide range of experiments to determine which kinds of linguistic information have the greatest impact on this task and how they should be combined in order to obtain the best-performing system. The results lead us to conclude that relating features by means of contextual information adds complementary knowledge over pure lexical models, making it possible to outperform them on standard metrics for multilabel classification tasks.
- Is Part Of:
- Journal of information science. Volume 41:Number 2(2015)
- Journal:
- Journal of information science
- Issue:
- Volume 41:Number 2(2015)
- Issue Display:
- Volume 41, Issue 2 (2015)
- Year:
- 2015
- Volume:
- 41
- Issue:
- 2
- Issue Sort Value:
- 2015-0041-0002-0000
- Page Start:
- 127
- Page End:
- 145
- Publication Date:
- 2015-04
- Subjects:
- multilabel topic classification -- natural language processing -- Twitter
Information science -- Periodicals
Information science
Periodicals
020.5
- Journal URLs:
- http://jis.sagepub.com/archive/
http://www.ingenta.com/journals/browse/bks/jis?mode=direct
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0165-5515;screen=info;ECOIP
- DOI:
- 10.1177/0165551514561652
- Languages:
- English
- ISSNs:
- 0165-5515
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 6297.xml