Incorporating history and future into non-autoregressive machine translation. (January 2023)
- Record Type:
- Journal Article
- Title:
- Incorporating history and future into non-autoregressive machine translation. (January 2023)
- Main Title:
- Incorporating history and future into non-autoregressive machine translation
- Authors:
- Wang, Shuheng
Huang, Heyan
Shi, Shumin
- Abstract:
- In non-autoregressive machine translation, the target tokens are generated by the decoder in one shot. Although this decoding process can significantly reduce decoding latency, non-autoregressive machine translation still sacrifices translation accuracy. We argue that the reason for this decrease is the lack of target dependencies, that is, history and future information, between target tokens. In this work, we propose a novel method to address this problem. We suppose that the hidden representation of a target token from the decoder should consist of three parts: history, present, and future information. We dynamically aggregate this parts-to-whole information with a capsule network for the decoder to improve the performance of non-autoregressive machine translation. In addition, to ensure that the capsules learn the information as we expect, we introduce an autoregressive decoder. Several experiments on benchmark tasks demonstrate that explicitly modeling history and future information can significantly improve the performance of the NAT model. Extensive analyses show that our model is able to learn history and future information as we expect.
Highlights: In order to solve the problem that the NAT model cannot model target dependency, we adapt the capsule network to help the model learn future and history information. To ensure that the capsule network learns as we expect, we design an autoregressive model to extract information. We conduct a series of experiments on the benchmark tasks, and the results show that our model can improve the performance.
- Is Part Of:
- Computer speech & language. Volume 77(2023)
- Journal:
- Computer speech & language
- Issue:
- Volume 77(2023)
- Issue Display:
- Volume 77, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 77
- Issue:
- 2023
- Issue Sort Value:
- 2023-0077-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- Machine translation -- Non-autoregressive -- Capsule network -- History and future information
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2022.101439
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 23382.xml