Comprehensive analysis of aspect term extraction methods using various text embeddings. (September 2021)
- Record Type:
- Journal Article
- Title:
- Comprehensive analysis of aspect term extraction methods using various text embeddings. (September 2021)
- Main Title:
- Comprehensive analysis of aspect term extraction methods using various text embeddings
- Authors:
- Augustyniak, Łukasz
Kajdanowicz, Tomasz
Kazienko, Przemysław
- Abstract:
- Highlights: A wide comparison with ablation analysis of Aspect Term Extraction methods. A guideline to choose the best Aspect Term Extraction models for particular user needs. We analyzed 88 method combinations, model customizations (LSTM vs. BiLSTM, CRF vs. no CRF) and pre-trained word embeddings. We also evaluated the results of three contextual text representations (BERT, Flair, ELMo) using the BiLSTM-CRF model. We analyzed the influence on the performance of extending the word vectorization step with character-based word embeddings. The experimental results on SemEval datasets revealed that the BiLSTM could be used as a very good predictor. Language model-based models are not always the best and obvious choice for the vector representation layer in NLP tasks.
Abstract: Recently, a variety of model designs and methods have blossomed in the context of the sentiment analysis domain. However, there is still a lack of comprehensive studies of Aspect-based Sentiment Analysis. We want to fill this gap and propose a comparison with ablation analysis of Aspect Term Extraction using various text embedding methods. We particularly focused on simple architectures based on long short-term memory (LSTM) with optional conditional random field (CRF) enhancement using different pre-trained word embeddings. Moreover, we analyzed the influence on the performance of extending the word vectorization step with character-based word embeddings. The experimental results on SemEval datasets revealed that bi-directional long short-term memory (BiLSTM) could be used as a very good predictor, even compared to very sophisticated and complex models using huge word embeddings or language models. We presented a comprehensive analysis of various customizations of LSTM-based architecture and word/character embeddings that could be used as a guideline to choose the best model version for particular user needs.
- Is Part Of:
- Computer speech & language. Volume 69(2021)
- Journal:
- Computer speech & language
- Issue:
- Volume 69(2021)
- Issue Display:
- Volume 69, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 69
- Issue:
- 2021
- Issue Sort Value:
- 2021-0069-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-09
- Subjects:
- Aspect-based sentiment analysis -- Aspect term extraction -- Word embeddings -- Character embeddings -- LSTM -- BiLSTM -- CRF -- SemEval
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2021.101217
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 16782.xml