A tree does not make a well-formed sentence: Improving syntactic string-to-tree statistical machine translation with more linguistic knowledge. (July 2015)
- Record Type:
- Journal Article
- Title:
- A tree does not make a well-formed sentence: Improving syntactic string-to-tree statistical machine translation with more linguistic knowledge. (July 2015)
- Main Title:
- A tree does not make a well-formed sentence: Improving syntactic string-to-tree statistical machine translation with more linguistic knowledge
- Authors:
- Sennrich, Rico
Williams, Philip
Huck, Matthias
- Abstract:
- Highlights: String-to-tree SMT systems aim to produce grammatically well-formed translation. Modelling assumptions of synchronous grammars can result in ungrammatical output. Grammar overgeneralisations can be reduced by parser choice and modifications. Overgeneralisation errors can be prevented with rule-based linguistic constraints.
Abstract: Synchronous context-free grammars (SCFGs) can be learned from parallel texts that are annotated with target-side syntax, and can produce translations by building target-side syntactic trees from source strings. Ideally, producing syntactic trees would entail that the translation is grammatically well-formed, but in reality, this is often not the case. Focusing on translation into German, we discuss various ways in which string-to-tree translation models over- or undergeneralise. We show how these problems can be addressed by choosing a suitable parser and modifying its output, by introducing linguistic constraints that enforce morphological agreement and constrain subcategorisation, and by modelling the productive generation of German compounds.
- Is Part Of:
- Computer speech & language. Volume 32(2015)
- Journal:
- Computer speech & language
- Issue:
- Volume 32(2015)
- Issue Display:
- Volume 32, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 32
- Issue:
- 2015
- Issue Sort Value:
- 2015-0032-2015-0000
- Page Start:
- 27
- Page End:
- 45
- Publication Date:
- 2015-07
- Subjects:
- Statistical machine translation -- Syntactic translation models -- String-to-tree models -- Morphology
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2014.09.002
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5431.xml