Deep sentiments in Roman Urdu text using Recurrent Convolutional Neural Network model. Issue 4 (July 2020)
- Record Type:
- Journal Article
- Title:
- Deep sentiments in Roman Urdu text using Recurrent Convolutional Neural Network model. Issue 4 (July 2020)
- Main Title:
- Deep sentiments in Roman Urdu text using Recurrent Convolutional Neural Network model
- Authors:
- Mahmood, Zainab
Safder, Iqra
Nawab, Rao Muhammad Adeel
Bukhari, Faisal
Nawaz, Raheel
Alfakeeh, Ahmed S.
Aljohani, Naif Radi
Hassan, Saeed-Ul
- Abstract:
- Highlights: This study contributes to generating the Roman Urdu corpus by annotating 10,021 reviews for the task of sentiment analysis. The deep learning models such as RCNN can help in text classification task for the under resource Urdu language. The RCNN outperforms Rule-based and N-gram methods, due to its ability to learn a language-independent textual context.
Abstract: Although over 64 million people worldwide speak Urdu language and are well aware of its Roman script, limited research and efforts have been made to carry out sentiment analysis and build language resources for the Roman Urdu language. This article proposes a deep learning model to mine the emotions and attitudes of people expressed in Roman Urdu - consisting of 10,021 sentences from 566 online threads belonging to the following genres: Sports; Software; Food & Recipes; Drama; and Politics. The objectives of this research are twofold: (1) to develop a human-annotated benchmark corpus for the under-resourced Roman Urdu language for the sentiment analysis; and (2) to evaluate sentiment analysis techniques using the Rule-based, N-gram, and Recurrent Convolutional Neural Network (RCNN) models. Using Corpus, annotated by three experts to be positive, negative, and neutral with 0.557 Cohen's Kappa score, we run two sets of tests, i.e., binary classification (positive and negative) and tertiary classification (positive, negative and neutral). Finally, the results of the RCNN model are analyzed by comparing it with the outcome of the Rule-based and N-gram models. We show that the RCNN model outperforms baseline models in terms of accuracy of 0.652 for binary classification and 0.572 for tertiary classification.
- Is Part Of:
- Information processing & management. Volume 57:Issue 4(2020:Jul.)
- Journal:
- Information processing & management
- Issue:
- Volume 57:Issue 4(2020:Jul.)
- Issue Display:
- Volume 57, Issue 4 (2020)
- Year:
- 2020
- Volume:
- 57
- Issue:
- 4
- Issue Sort Value:
- 2020-0057-0004-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-07
- Subjects:
- Roman Urdu corpus -- Urdu sentiment analysis -- Under resource language -- Deep learning -- Text Classification -- Recurrent Convolutional Neural Networks (RCNN)
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2020.102233
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20467.xml