Reward estimation for dialogue policy optimisation. (September 2018)

Record Type:: Journal Article
Title:: Reward estimation for dialogue policy optimisation. (September 2018)
Main Title:: Reward estimation for dialogue policy optimisation
Authors:: Su, Pei-Hao
Gašić, Milica
Young, Steve
Abstract:: Highlights: Off-line neural network-based reward model. On-line Gaussian process-based reward model. Neural network-based dialogue embedding. Human user evaluation. Abstract: Viewing dialogue management as a reinforcement learning task enables a system to learn to act optimally by maximising a reward function. This reward function is designed to induce the system behaviour required for the target application; for goal-oriented applications, this usually means fulfilling the user's goal as efficiently as possible. However, in real-world spoken dialogue system applications, the reward is hard to measure because the user's goal is frequently known only to the user. The system can, of course, ask the user whether the goal has been satisfied, but this can be intrusive. Furthermore, in practice, the accuracy of the user's response has been found to be highly variable. This paper presents two approaches to tackling this problem. Firstly, a recurrent neural network is used as a task success predictor, pre-trained on off-line data to estimate task success during subsequent on-line dialogue policy learning. Secondly, an on-line learning framework is described in which a dialogue policy is trained jointly with a reward function modelled as a Gaussian process with active learning. This Gaussian process operates on a fixed-dimension embedding which encodes each variable-length dialogue. This dialogue embedding is generated in both a supervised and unsupervised fashion using …
Is Part Of:: Computer speech & language. Volume 51 (2018)
Journal:: Computer speech & language
Issue:: Volume 51 (2018)
Issue Display:: Volume 51, Issue 2018 (2018)
Year:: 2018
Volume:: 51
Issue:: 2018
Issue Sort Value:: 2018-0051-2018-0000
Page Start:: 24
Page End:: 43
Publication Date:: 2018-09
Subjects:: Dialogue systems -- Reinforcement learning -- Deep learning -- Reward estimation -- Gaussian process -- Active learning
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2018.02.003
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 6640.xml