Multi-level embeddings for processing Arabic social media contents. (November 2021)

Record Type:: Journal Article
Title:: Multi-level embeddings for processing Arabic social media contents. (November 2021)
Main Title:: Multi-level embeddings for processing Arabic social media contents
Authors:: Moudjari, Leila
Benamara, Farah
Akli-Astouati, Karima
Abstract:: Embeddings are very popular representations that allow computing semantic and syntactic similarities between linguistic units from a text co-occurrence matrix. Units can vary from character n-grams to words, including more coarse-grained units such as sentences and documents. Recently, multi-level embeddings combining representations from different units have been proposed as an alternative to single-level embeddings to account for the internal structure of words (i.e., morphology) and help systems generalise well over out-of-vocabulary words. These representations, either pre-trained or learned, have been shown to be quite effective, outperforming word-level baselines in several NLP tasks such as machine translation, part-of-speech tagging and named entity recognition. Our aim here is to contribute to this line of research by proposing, for the first time in Arabic NLP, an in-depth study of the impact of various subword configurations, ranging from characters to character n-grams (including words), for social media text classification. We propose several neural architectures to learn character, subword and word embeddings, as well as a combination of these three levels, exploring different composition functions to obtain the final representation of a given text. To evaluate the effectiveness of these representations, we perform extrinsic evaluations on three text classification tasks (sentiment analysis, emotion detection and irony detection) while accounting for different …
Is Part Of:: Computer speech & language. Volume 70(2021)
Journal:: Computer speech & language
Issue:: Volume 70(2021)
Issue Display:: Volume 70, Issue 2021 (2021)
Year:: 2021
Volume:: 70
Issue:: 2021
Issue Sort Value:: 2021-0070-2021-0000
Page Start::
Page End::
Publication Date:: 2021-11
Subjects:: Multi-level embeddings -- Convolutional neural networks -- Arabic dialects -- Sentiment analysis -- Emotion detection -- Irony detection
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2021.101240
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 17252.xml