Extracting translations from comparable corpora for Cross-Language Information Retrieval using the language modeling framework. Issue 2 (March 2016)

Record Type:: Journal Article
Title:: Extracting translations from comparable corpora for Cross-Language Information Retrieval using the language modeling framework. Issue 2 (March 2016)
Main Title:: Extracting translations from comparable corpora for Cross-Language Information Retrieval using the language modeling framework
Authors:: Rahimi, Razieh
Shakery, Azadeh
King, Irwin
Abstract:: Highlights: Proposing a language modeling method to extract translations from comparable corpora. Comparing two similarity functions for deriving bilingual word correlations. Improving translation quality by integrating co-occurrence relations into word models. Comparing different estimations of translation probabilities from word correlations. Showing the significant impact of probability estimation methods on CLIR performance.
Abstract: A main challenge in Cross-Language Information Retrieval (CLIR) is to estimate a proper translation model from available translation resources, since translation quality directly affects retrieval performance. Among different translation resources, we focus on obtaining translation models from comparable corpora, because they provide appropriate translations for both languages and domains with limited linguistic resources. In this paper, we employ a two-step approach to build an effective translation model from comparable corpora for the CLIR task, without requiring any additional linguistic resources. In the first step, translations are extracted by deriving correlations between source–target word pairs. In the second step, these correlations are used to estimate word translation probabilities. We propose a language modeling approach for the first step, where modeling based on probability distributions provides two key advantages. First, our approach can be tuned more easily than heuristically adjusted previous work. Second, it …
Is Part Of:: Information processing & management. Volume 52:Issue 2(2016:Mar.)
Journal:: Information processing & management
Issue:: Volume 52:Issue 2(2016:Mar.)
Issue Display:: Volume 52, Issue 2 (2016)
Year:: 2016
Volume:: 52
Issue:: 2
Issue Sort Value:: 2016-0052-0002-0000
Page Start:: 299
Page End:: 318
Publication Date:: 2016-03
Subjects:: Translation model -- Bilingual lexicon -- Comparable corpora -- Cross-Language Information Retrieval -- Language modeling framework
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
Journal URLs:: http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
DOI:: 10.1016/j.ipm.2015.08.001
Languages:: English
ISSNs:: 0306-4573
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 7616.xml