A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments. (August 2020)

Record Type:: Journal Article
Title:: A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments. (August 2020)
Main Title:: A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments
Authors:: Abdelfattah, Sherif
Kasmarik, Kathryn
Hu, Jiankun
Abstract:: Multi-objective Markov decision processes are a special kind of multi-objective optimization problem that involves sequential decision making while satisfying the Markov property of stochastic processes. Multi-objective reinforcement learning methods address this kind of problem by fusing the reinforcement learning paradigm with multi-objective optimization techniques. One major drawback of these methods is the lack of adaptability to non-stationary dynamics in the environment. This is because they adopt optimization procedures that assume stationarity in order to evolve a coverage set of policies that can solve the problem. This article introduces a developmental optimization approach that can evolve the policy coverage set while exploring the preference space over the defined objectives in an online manner. We propose a novel multi-objective reinforcement learning algorithm that can robustly evolve a convex coverage set of policies in an online manner in non-stationary environments. We compare the proposed algorithm with two state-of-the-art multi-objective reinforcement learning algorithms in stationary and non-stationary environments. Results showed that the proposed algorithm significantly outperforms the existing algorithms in non-stationary environments while achieving comparable results in stationary environments.
Is Part Of:: Adaptive behavior. Volume 28:Number 4(2020)
Journal:: Adaptive behavior
Issue:: Volume 28:Number 4(2020)
Issue Display:: Volume 28, Issue 4 (2020)
Year:: 2020
Volume:: 28
Issue:: 4
Issue Sort Value:: 2020-0028-0004-0000
Page Start:: 273
Page End:: 292
Publication Date:: 2020-08
Subjects:: Multi-objective optimization -- reinforcement learning -- non-stationary -- environment -- dynamics -- policy bootstrapping -- Markov decision processes
Animal behavior -- Periodicals
Animals -- Adaptation -- Periodicals
Adaptability (Psychology) -- Periodicals
Adaptation, Psychological -- Periodicals
Artificial intelligence -- Periodicals
591.5
Journal URLs:: http://adb.sagepub.com
http://www.uk.sagepub.com/home.nav
DOI:: 10.1177/1059712319869313
Languages:: English
ISSNs:: 1741-2633
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 13439.xml