Integration of imitation learning using GAIL and reinforcement learning using task-achievement rewards via probabilistic graphical model. (17th August 2020)
- Record Type:
- Journal Article
- Title:
- Integration of imitation learning using GAIL and reinforcement learning using task-achievement rewards via probabilistic graphical model. (17th August 2020)
- Main Title:
- Integration of imitation learning using GAIL and reinforcement learning using task-achievement rewards via probabilistic graphical model
- Authors:
- Kinose, Akira
Taniguchi, Tadahiro
- Abstract:
- The integration of reinforcement learning (RL) and imitation learning (IL) is an important problem that has long been studied in the field of intelligent robotics. RL optimizes policies to maximize the cumulative reward, whereas IL attempts to extract general knowledge about the trajectories demonstrated by experts, i.e., demonstrators. Because each has its own drawbacks, many methods that combine them and compensate for their respective drawbacks have been explored thus far. However, many of these methods are heuristic and do not have a solid theoretical basis. This paper presents a new theory for integrating RL and IL by extending the probabilistic graphical model (PGM) framework for RL, control as inference. We develop a new PGM for RL with multiple types of rewards, called probabilistic graphical model for Markov decision processes with multiple optimality emissions (pMDP-MO). Furthermore, we demonstrate that the integrated learning method of RL and IL can be formulated as a probabilistic inference of policies on pMDP-MO by considering the discriminator in generative adversarial imitation learning (GAIL) as an additional optimality emission. We adapt the GAIL and task-achievement reward to our proposed framework, achieving significantly better performance than policies trained with baseline methods.
- Is Part Of:
- Advanced robotics. Volume 34:Number 16(2020)
- Journal:
- Advanced robotics
- Issue:
- Volume 34:Number 16(2020)
- Issue Display:
- Volume 34, Issue 16 (2020)
- Year:
- 2020
- Volume:
- 34
- Issue:
- 16
- Issue Sort Value:
- 2020-0034-0016-0000
- Page Start:
- 1055
- Page End:
- 1067
- Publication Date:
- 2020-08-17
- Subjects:
- Imitation learning -- reinforcement learning -- probabilistic inference -- control as inference -- generative adversarial imitation learning
Robotics -- Periodicals
Robotics -- Japan -- Periodicals
Robotics
Japan
Periodicals
629.89205
- Journal URLs:
- http://www.catchword.com/rpsv/cw/vsp/01691864/contp1.htm
http://catalog.hathitrust.org/api/volumes/oclc/14883000.html
http://www.tandfonline.com/toc/tadr20/current
http://www.tandfonline.com/
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0169-1864;screen=info;ECOIP
http://www.ingentaselect.com/vl=16659242/cl=11/nw=1/rpsv/cw/vsp/01691864/contp1.htm
- DOI:
- 10.1080/01691864.2020.1778521
- Languages:
- English
- ISSNs:
- 0169-1864
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0696.926500
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 22698.xml