Data-driven planning via imitation learning. (December 2018)

Record Type:: Journal Article
Title:: Data-driven planning via imitation learning. (December 2018)
Main Title:: Data-driven planning via imitation learning
Authors:: Choudhury, Sanjiban
Bhardwaj, Mohak
Arora, Sankalp
Kapoor, Ashish
Ranade, Gireeja
Scherer, Sebastian
Dey, Debadeepta
Abstract:: Robot planning is the process of selecting a sequence of actions that optimize for a task-specific objective. For instance, the objective for a navigation task would be to find collision-free paths, whereas the objective for an exploration task would be to map unknown areas. The optimal solutions to such tasks are heavily influenced by the implicit structure in the environment, i.e. the configuration of objects in the world. State-of-the-art planning approaches, however, do not exploit this structure, thereby expending valuable effort searching the action space instead of focusing on potentially good actions. In this paper, we address the problem of enabling planners to adapt their search strategies by inferring such good actions in an efficient manner using only the information uncovered by the search up until that time. We formulate this as a problem of sequential decision making under uncertainty where at a given iteration a planning policy must map the state of the search to a planning action. Unfortunately, the training process for such partial-information-based policies is slow to converge and susceptible to poor local minima. Our key insight is that if we could fully observe the underlying world map, we would easily be able to disambiguate between good and bad actions. We hence present a novel data-driven imitation learning framework to efficiently train planning policies by imitating a clairvoyant oracle: an oracle that at train time has full knowledge about the …
Is Part Of:: International journal of robotics research. Volume 37:Number 13/14(2018)
Journal:: International journal of robotics research
Issue:: Volume 37:Number 13/14(2018)
Issue Display:: Volume 37, Issue 13/14 (2018)
Year:: 2018
Volume:: 37
Issue:: 13/14
Issue Sort Value:: 2018-0037-NaN-0000
Page Start:: 1632
Page End:: 1672
Publication Date:: 2018-12
Subjects:: Imitation learning -- POMDPs -- sequential decision making -- QMDPs -- heuristic search
Robots -- Periodicals
Robots, Industrial -- Periodicals
629.89205
Journal URLs:: http://ijr.sagepub.com/
http://www.uk.sagepub.com/home.nav
DOI:: 10.1177/0278364918781001
Languages:: English
ISSNs:: 0278-3649
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 9580.xml