Novel two-dimensional off-policy Q-learning method for output feedback optimal tracking control of batch process with unknown dynamics. (May 2022)
- Record Type:
- Journal Article
- Title:
- Novel two-dimensional off-policy Q-learning method for output feedback optimal tracking control of batch process with unknown dynamics. (May 2022)
- Main Title:
- Novel two-dimensional off-policy Q-learning method for output feedback optimal tracking control of batch process with unknown dynamics
- Authors:
- Shi, Huiyuan
Yang, Chen
Jiang, Xueying
Su, Chengli
Li, Ping
- Abstract:
- Abstract: Reinforcement learning (RL) is an artificial intelligence algorithm that can learn an adaptive optimal control law online. Because previous control approaches usually depend heavily on the model parameters of the system, and most existing RL methods are based on state feedback, their application in actual industrial production is limited. Additionally, developing accurate process system models and ensuring the closed-loop system's control performance are more challenging, as modern businesses place a premium on product quality and economic efficiency. As a result, this work introduces a novel data-driven two-dimensional (2D) off-policy Q-learning method based on output feedback to achieve optimal tracking control for batch processes. First, the system is extended with the error between the actual output and the given set-point to ensure good tracking performance. Second, by analyzing the relationship between the value function and the Q-function obtained from the 2D system's performance index, a 2D Bellman equation is derived in terms of output feedback, independent of the model parameters. The proposed method solves the optimal control problem by executing policy iteration using only measurement data of the system along the batch and time directions. The unbiasedness and convergence of the proposed approach are then rigorously proved. Finally, simulation results for an injection molding process demonstrate that the proposed method determines the optimal control law as the number of batches increases.
Highlights: Compared to the 1D Q-learning-based output feedback control method, the control law gain obtained by the proposed method has a smaller deviation and converges faster. In contrast to a recent 2D method proposed for solving optimal tracking control problems based on state feedback, the method described in this paper can effectively resolve the problem of unmeasurable states encountered in industrial production processes. Not only is the solution obtained by the proposed 2D RL method immune to detection noise, but the method also eliminates the need for a state matrix during the learning process.
- Is Part Of:
- Journal of process control. Volume 113(2022)
- Journal:
- Journal of process control
- Issue:
- Volume 113(2022)
- Issue Display:
- Volume 113, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 113
- Issue:
- 2022
- Issue Sort Value:
- 2022-0113-2022-0000
- Page Start:
- 29
- Page End:
- 41
- Publication Date:
- 2022-05
- Subjects:
- 2D Two-dimensional -- RL Reinforcement learning
Two-dimensional (2D) -- Reinforcement learning (RL) -- Batch process -- Q-learning -- Output feedback
Process control -- Periodicals
Fabrication -- Contrôle -- Périodiques
Process control
Periodicals
Electronic journals
660.281
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09591524
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.jprocont.2022.03.006
- Languages:
- English
- ISSNs:
- 0959-1524
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5042.645000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21401.xml