DORA: Towards policy optimization for task-oriented dialogue system with efficient context. (March 2022)

Record Type:: Journal Article
Title:: DORA: Towards policy optimization for task-oriented dialogue system with efficient context. (March 2022)
Main Title:: DORA: Towards policy optimization for task-oriented dialogue system with efficient context
Authors:: Jeon, Hyunmin
Lee, Gary Geunbae
Abstract:: Recently, reinforcement learning (RL) has been applied to task-oriented dialogue systems by using latent actions to solve shortcomings of supervised learning (SL). In this paper, we propose a multi-domain task-oriented dialogue system, called Dialogue System with Optimizing a Recurrent Action Policy using Efficient Context (DORA), that uses SL, with subsequently applied RL, to optimize dialogue systems using a recurrent dialogue policy. This dialogue policy recurrently generates explicit system actions as both a word-level and a high-level policy. As a result, DORA is clearly optimized during both SL and RL steps by using an explicit system action policy that considers an efficient context instead of the entire dialogue history. The system actions are both interpretable and controllable, whereas the latent actions are not. DORA improved the success rate by 6.6 points on MultiWOZ 2.0 and by 10.9 points on MultiWOZ 2.1. Highlights: A recurrent action policy optimizes the dialogue system well with both supervised learning and reinforcement learning. Explicit system actions enable effective reward shaping for dialogue policy optimization. The optimized dialogue policy can be controlled by post-processing of system actions. The dialogue system operates as a hybrid system through system action control. An efficient input context can replace the entire dialogue history.
Is Part Of:: Computer speech & language. Volume 72(2022)
Journal:: Computer speech & language
Issue:: Volume 72(2022)
Issue Display:: Volume 72, Issue 2022 (2022)
Year:: 2022
Volume:: 72
Issue:: 2022
Issue Sort Value:: 2022-0072-2022-0000
Page Start:
Page End:
Publication Date:: 2022-03
Subjects:: Task-oriented dialogue system -- Multi-domain dialogue -- Policy optimization -- Recurrent Action Policy -- Efficient context
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2021.101310
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 20100.xml