Data augmentation for low‐resource languages NMT guided by constrained sampling. Issue 1 (25th August 2021)
- Record Type:
- Journal Article
- Title:
- Data augmentation for low‐resource languages NMT guided by constrained sampling. Issue 1 (25th August 2021)
- Main Title:
- Data augmentation for low‐resource languages NMT guided by constrained sampling
- Authors:
- Maimaiti, Mieradilijiang
Liu, Yang
Luan, Huanbo
Sun, Maosong
- Abstract:
- Data augmentation (DA) is a ubiquitous approach for several text generation tasks. In machine translation, especially in low‐resource language scenarios, many DA methods have appeared. The most commonly used methods build a pseudocorpus by randomly sampling, omitting, or replacing some words in the text. However, previous approaches can hardly guarantee the quality of the augmented data. In this study, we try to augment the corpus by introducing a constrained sampling method. We also build an evaluation framework to select higher‐quality data after augmentation. Namely, we use a discriminator submodel to mitigate syntactic and semantic errors to some extent. Experimental results show that our augmentation method consistently outperforms all previous state‐of‐the‐art methods on both small and large‐scale corpora, in eight language pairs from four corpora, by 2.38–4.18 bilingual evaluation understudy (BLEU) points.
- Is Part Of:
- International journal of intelligent systems. Volume 37:Issue 1(2022)
- Journal:
- International journal of intelligent systems
- Issue:
- Volume 37:Issue 1(2022)
- Issue Display:
- Volume 37, Issue 1 (2022)
- Year:
- 2022
- Volume:
- 37
- Issue:
- 1
- Issue Sort Value:
- 2022-0037-0001-0000
- Page Start:
- 30
- Page End:
- 51
- Publication Date:
- 2021-08-25
- Subjects:
- artificial intelligence -- constrained sampling -- data augmentation -- low‐resource languages -- natural language processing -- neural machine translation
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
006.3
- Journal URLs:
- http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1098-111X
https://www.hindawi.com/journals/ijis
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/int.22616
- Languages:
- English
- ISSNs:
- 0884-8173
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4542.310500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20029.xml