A real-world application of Markov chain Monte Carlo method for Bayesian trajectory control of a robotic manipulator. (June 2022)
- Record Type:
- Journal Article
- Title:
- A real-world application of Markov chain Monte Carlo method for Bayesian trajectory control of a robotic manipulator. (June 2022)
- Main Title:
- A real-world application of Markov chain Monte Carlo method for Bayesian trajectory control of a robotic manipulator
- Authors:
- Tavakol Aghaei, Vahid
Ağababaoğlu, Arda
Yıldırım, Sinan
Onat, Ahmet
- Abstract:
- Abstract: Reinforcement learning methods are being applied to control problems in robotics domain. These algorithms are well suited for dealing with the continuous large scale state spaces in robotics field. Even though policy search methods related to stochastic gradient optimization algorithms have become a successful candidate for coping with challenging robotics and control problems in recent years, they may become unstable when abrupt variations occur in gradient computations. Moreover, they may end up with a locally optimal solution. To avoid these disadvantages, a Markov chain Monte Carlo (MCMC) algorithm for policy learning under the RL configuration is proposed. The policy space is explored in a non-contiguous manner such that higher reward regions have a higher probability of being visited. The proposed algorithm is applied in a risk-sensitive setting where the reward structure is multiplicative. Our method has the advantages of being model-free and gradient-free, as well as being suitable for real-world implementation. The merits of the proposed algorithm are shown with experimental evaluations on a 2-Degree of Freedom robot arm. The experiments demonstrate that it can perform a thorough policy space search while maintaining adequate control performance and can learn a complex trajectory control task within a small finite number of iteration steps.
Highlights: Policy search is formulated as a Bayesian framework where return is likelihood function. Likelihood function is combined with an uninformative prior to establish a posterior. MCMC is used to explore the high-reward regions of the policy space. Parameterized control policy is learned by an MCMC-based RL algorithm. The proposed method is a gradient-free and model-free algorithm. …
- Is Part Of:
- ISA transactions. Volume 125(2022)
- Journal:
- ISA transactions
- Issue:
- Volume 125(2022)
- Issue Display:
- Volume 125, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 125
- Issue:
- 2022
- Issue Sort Value:
- 2022-0125-2022-0000
- Page Start:
- 580
- Page End:
- 590
- Publication Date:
- 2022-06
- Subjects:
- Reinforcement learning -- Markov chain Monte Carlo -- Bayesian learning -- Intelligent control -- Policy search
Engineering instruments -- Periodicals
Engineering instruments
Periodicals
Electronic journals
629.805
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00190578
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.isatra.2021.06.010
- Languages:
- English
- ISSNs:
- 0019-0578
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4582.700000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21744.xml