A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments. (August 2020)
- Record Type:
- Journal Article
- Title:
- A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments. (August 2020)
- Main Title:
- A robust policy bootstrapping algorithm for multi-objective reinforcement learning in non-stationary environments
- Authors:
- Abdelfattah, Sherif
Kasmarik, Kathryn
Hu, Jiankun
- Abstract:
- Multi-objective Markov decision processes are a special kind of multi-objective optimization problem that involves sequential decision making while satisfying the Markov property of stochastic processes. Multi-objective reinforcement learning methods address this kind of problem by fusing the reinforcement learning paradigm with multi-objective optimization techniques. One major drawback of these methods is the lack of adaptability to non-stationary dynamics in the environment. This is because they adopt optimization procedures that assume stationarity in order to evolve a coverage set of policies that can solve the problem. This article introduces a developmental optimization approach that can evolve the policy coverage set while exploring the preference space over the defined objectives in an online manner. We propose a novel multi-objective reinforcement learning algorithm that can robustly evolve a convex coverage set of policies in an online manner in non-stationary environments. We compare the proposed algorithm with two state-of-the-art multi-objective reinforcement learning algorithms in stationary and non-stationary environments. Results showed that the proposed algorithm significantly outperforms the existing algorithms in non-stationary environments while achieving comparable results in stationary environments.
- Is Part Of:
- Adaptive behavior. Volume 28:Number 4(2020)
- Journal:
- Adaptive behavior
- Issue:
- Volume 28:Number 4(2020)
- Issue Display:
- Volume 28, Issue 4 (2020)
- Year:
- 2020
- Volume:
- 28
- Issue:
- 4
- Issue Sort Value:
- 2020-0028-0004-0000
- Page Start:
- 273
- Page End:
- 292
- Publication Date:
- 2020-08
- Subjects:
- Multi-objective optimization -- reinforcement learning -- non-stationary -- environment -- dynamics -- policy bootstrapping -- Markov decision processes
Animal behavior -- Periodicals
Animals -- Adaptation -- Periodicals
Adaptability (Psychology) -- Periodicals
Adaptation, Psychological -- Periodicals
Artificial intelligence -- Periodicals
591.5
- Journal URLs:
- http://adb.sagepub.com
http://www.uk.sagepub.com/home.nav
- DOI:
- 10.1177/1059712319869313
- Languages:
- English
- ISSNs:
- 1741-2633
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13439.xml