A novel policy gradient algorithm with PSO-based parameter exploration for continuous control. (April 2020)

Record Type:: Journal Article
Title:: A novel policy gradient algorithm with PSO-based parameter exploration for continuous control. (April 2020)
Main Title:: A novel policy gradient algorithm with PSO-based parameter exploration for continuous control
Authors:: Liu, Tundong
Li, Liduan
Shao, Guifang
Wu, Xiaomin
Huang, Meng
Abstract:: Continuous control has attracted enormous attention due to its essential role in real-world applications. However, it is considerably difficult to address through explicit modeling in practice. Model-free policy gradient (PG) methods in reinforcement learning (RL) are promising approaches, but they suffer from slow convergence and complex computation owing to the high variance of gradient estimation and sophisticated backpropagation. Therefore, in this paper, a gradient-free policy gradient algorithm with PSO-based parameter exploration (PG-PSOPE) is proposed for continuous control tasks. To reduce variance and improve the convergence rate, particle swarm optimization (PSO) is combined with PG to provide a novel way of training the policy network in RL. Experimental results on simulated physical control tasks verify the effectiveness of the proposed algorithm. In addition, PG-PSOPE is superior in both convergence speed and final performance to the typical on-policy PG and the off-policy deep RL method. Furthermore, a comparison of training time under different tasks shows the simplicity and high effectiveness of PG-PSOPE: in the best case, its running time is reduced by a factor of 58 compared with other gradient-based methods.
Is Part Of:: Engineering applications of artificial intelligence. Volume 90(2020)
Journal:: Engineering applications of artificial intelligence
Issue:: Volume 90(2020)
Issue Display:: Volume 90, Issue 2020 (2020)
Year:: 2020
Volume:: 90
Issue:: 2020
Issue Sort Value:: 2020-0090-2020-0000
Page Start:
Page End:
Publication Date:: 2020-04
Subjects:: Continuous control -- Model-free reinforcement learning -- Convergence improvement -- Gradient-free algorithm -- Policy gradient -- Parameter exploration
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
Journal URLs:: http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
DOI:: 10.1016/j.engappai.2020.103525
Languages:: English
ISSNs:: 0952-1976
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 13443.xml