Should one use term proximity or multi-word terms for Arabic information retrieval?. (November 2019)
- Record Type:
- Journal Article
- Title:
- Should one use term proximity or multi-word terms for Arabic information retrieval?. (November 2019)
- Main Title:
- Should one use term proximity or multi-word terms for Arabic information retrieval?
- Authors:
- El Mahdaouy, Abdelkader
Gaussier, Eric
El Alaoui, Saïd Ouatik
- Abstract:
- Highlights:
Explore whether term dependencies (TDs) can help improve Arabic IR systems.
Consider explicit term dependencies based on MWTs and implicit term proximity.
The study provides complete extensions of most standard IR models, and their comparison, to deal with TDs.
Formal condition that IR models should satisfy to deal with term dependencies.
The difference in performance between the two types of TDs is not statistically significant.
Abstract: Recently, several information retrieval (IR) models have been proposed in order to boost retrieval performance using term dependencies. However, in the context of the Arabic language, most IR researchers have focused on the problem of stemming, which is highly challenging in this language. In this paper, we explore whether term dependencies can help improve Arabic IR systems, and which methods are best to use. To do so, we consider both explicit term dependencies based on multi-word terms (MWTs), which are extracted using syntactic patterns and statistical filters, and implicit ones based on the notion of cross-terms or term proximities. Our experiments, performed on standard TREC Arabic IR collections, show the importance of taking term dependencies into account for Arabic IR. To the best of our knowledge, this is the first study that provides complete extensions, and their comparison, of most standard IR models to deal with term dependencies in the Arabic language.
- Is Part Of:
- Computer speech & language. Volume 58(2019)
- Journal:
- Computer speech & language
- Issue:
- Volume 58(2019)
- Issue Display:
- Volume 58, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 58
- Issue:
- 2019
- Issue Sort Value:
- 2019-0058-2019-0000
- Page Start:
- 76
- Page End:
- 97
- Publication Date:
- 2019-11
- Subjects:
- Arabic information retrieval -- Multi-word terms -- Term proximity -- Term dependence IR models
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2019.04.002
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11148.xml