This is an interim version of our Electronic Legal Deposit Catalogue for eJournals and eBooks while we continue to recover from a cyber-attack.

Electronic Legal Deposit Catalogue

HARVARD Citation

    Su, P. et al. (2018). Reward estimation for dialogue policy optimisation. Computer speech & language. pp. 24-43. [Online].

© 2026 British Library, 96 Euston Road, London, NW1 2DB

All text is © British Library Board and is available under a CC-BY Licence except where otherwise stated.