Learning‐based T‐sHDP(λ) for optimal control of a class of nonlinear discrete‐time systems. (24th October 2021)

Record Type:: Journal Article
Title:: Learning‐based T‐sHDP(λ) for optimal control of a class of nonlinear discrete‐time systems. (24th October 2021)
Main Title:: Learning‐based T‐sHDP(λ) for optimal control of a class of nonlinear discrete‐time systems
Authors:: Yu, Luyang
Liu, Weibo
Liu, Yurong
Alsaadi, Fawaz E.
Other Names:: Karimi, Hamid Reza, guest editor
Wang, Ning, guest editor
Man, Zhihong, guest editor
Abstract:: This article investigates the optimal control problem via reinforcement learning for a class of nonlinear discrete‐time systems. The nonlinear system under consideration is assumed to be partially unknown. A new learning‐based algorithm, T‐step heuristic dynamic programming with eligibility traces (T‐sHDP(λ)), is proposed to tackle the optimal control problem for such partially unknown systems. First, the optimal control problem is converted into an equivalent problem, namely solving a Bellman equation. Then, T‐sHDP(λ) is used to obtain an approximate solution of the Bellman equation, and a rigorous convergence analysis is conducted. Instead of the commonly used single‐step update, T‐sHDP(λ) stores past returns over a finite number of steps by introducing a parameter, and then uses this knowledge to update the value function (VF) at multiple time instants synchronously, so as to achieve a higher convergence speed. To implement T‐sHDP(λ), a neural network‐based actor‐critic architecture is applied to approximate the VF and the optimal control scheme. Finally, the feasibility of the algorithm is demonstrated by two illustrative simulation examples.
Is Part Of:: International journal of robust and nonlinear control. Volume 32:Number 5(2022)
Journal:: International journal of robust and nonlinear control
Issue:: Volume 32:Number 5(2022)
Issue Display:: Volume 32, Issue 5 (2022)
Year:: 2022
Volume:: 32
Issue:: 5
Issue Sort Value:: 2022-0032-0005-0000
Page Start:: 2624
Page End:: 2643
Publication Date:: 2021-10-24
Subjects:: eligibility traces (ET) -- heuristic dynamic programming (HDP) -- learning‐based optimal control -- value iteration
Automatic control -- Periodicals
Control theory -- Periodicals
Nonlinear systems -- Periodicals
629.836
Journal URLs:: http://onlinelibrary.wiley.com/
DOI:: 10.1002/rnc.5847
Languages:: English
ISSNs:: 1049-8923
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 4542.538900
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
Ingest File:: 21112.xml