A Least-Squares Temporal Difference based method for solving resource allocation problems. (September 2020)
- Record Type:
- Journal Article
- Title:
- A Least-Squares Temporal Difference based method for solving resource allocation problems. (September 2020)
- Main Title:
- A Least-Squares Temporal Difference based method for solving resource allocation problems
- Authors:
- Forootani, Ali
Tipaldi, Massimo
Zarch, Majid Ghaniee
Liuzza, Davide
Glielmo, Luigi
- Abstract:
- Value function approximation plays a central role in Approximate Dynamic Programming (ADP) in overcoming the so-called curse of dimensionality associated with real stochastic processes. In this regard, we propose a novel Least-Squares Temporal Difference (LSTD) based method: the "Multi-trajectory Greedy LSTD" (MG-LSTD). It is an exploration-enhanced recursive LSTD algorithm with the policy improvement embedded within the LSTD iterations. It uses multi-trajectory Monte Carlo simulations to enhance exploration of the system state space. The method is applied to resource allocation problems modeled within a constrained Stochastic Dynamic Programming (SDP) framework. In particular, such problems are formulated as a set of parallel Birth–Death Processes (BDPs). Some operational scenarios are defined and solved to show the effectiveness of the proposed approach. Finally, we provide experimental evidence on the convergence properties of the MG-LSTD algorithm as a function of its key parameters.
Highlights: Least-Squares Temporal Difference (LSTD) based method to solve real stochastic processes. Multi-trajectory Greedy LSTD as an exploration-enhanced recursive LSTD algorithm. Policy improvement embedded into LSTD iterations. Modeling of resource allocation problems as a set of Birth–Death Processes. Calculation of pricing policies by means of the Multi-trajectory Greedy LSTD algorithm.
- Is Part Of:
- IFAC journal of systems and control. Volume 13(2020)
- Journal:
- IFAC journal of systems and control
- Issue:
- Volume 13(2020)
- Issue Display:
- Volume 13, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 13
- Issue:
- 2020
- Issue Sort Value:
- 2020-0013-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-09
- Subjects:
- Least-squares temporal difference -- Approximate dynamic programming -- Markov decision process -- Birth–death process -- Monte Carlo simulations
Automatic control -- Periodicals
Relay control systems -- Periodicals
Embedded computer systems -- Periodicals
Feedback control systems -- Periodicals
Artificial intelligence -- Periodicals
Artificial intelligence
Automatic control
Embedded computer systems
Feedback control systems
Relay control systems
Electronic journals
Periodicals
629.89
- Journal URLs:
- https://www.sciencedirect.com/science/journal/24686018
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.ifacsc.2020.100106
- Languages:
- English
- ISSNs:
- 2468-6018
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14010.xml
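
The abstract describes a recursive LSTD algorithm driven by multi-trajectory Monte Carlo simulation of Birth–Death Processes. As a rough illustration of those two ingredients only, the sketch below runs plain recursive LSTD(0) policy evaluation on a single birth-death chain, restarting from a random state for each trajectory. It is not the MG-LSTD algorithm of the article: there is no greedy policy improvement and no constraint handling. The function names, the transition probabilities, the reward (one unit per occupied resource), and the one-hot features are all assumptions chosen to keep the example small.

```python
import random


def one_hot(state, n_states):
    """Tabular feature vector: 1.0 at the index of `state`, 0.0 elsewhere."""
    return [1.0 if i == state else 0.0 for i in range(n_states)]


def step(state, capacity, p_birth, p_death, rng):
    """One transition of a birth-death chain on {0, ..., capacity}."""
    u = rng.random()
    if u < p_birth and state < capacity:
        return state + 1  # birth: one more resource becomes occupied
    if u >= 1.0 - p_death and state > 0:
        return state - 1  # death: one resource is released
    return state


def recursive_lstd(capacity=5, p_birth=0.3, p_death=0.2, gamma=0.9,
                   n_trajectories=50, horizon=100, eps=0.1, seed=0):
    """Estimate value-function weights with recursive LSTD(0).

    All default parameter values are hypothetical, not taken from the article.
    """
    rng = random.Random(seed)
    n = capacity + 1
    # inv_a tracks the inverse of A = eps*I + sum phi (phi - gamma*phi_next)^T
    inv_a = [[(1.0 / eps if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    b = [0.0] * n  # b = sum phi * reward
    for _ in range(n_trajectories):
        state = rng.randrange(n)  # random restart spreads visits over states
        for _ in range(horizon):
            nxt = step(state, capacity, p_birth, p_death, rng)
            reward = float(state)  # assumed reward: occupied resources
            phi = one_hot(state, n)
            phi_next = one_hot(nxt, n)
            diff = [phi[i] - gamma * phi_next[i] for i in range(n)]
            # Sherman-Morrison rank-one update of the inverse of A
            pu = [sum(inv_a[i][j] * phi[j] for j in range(n))
                  for i in range(n)]
            vp = [sum(diff[i] * inv_a[i][j] for i in range(n))
                  for j in range(n)]
            denom = 1.0 + sum(diff[i] * pu[i] for i in range(n))
            for i in range(n):
                for j in range(n):
                    inv_a[i][j] -= pu[i] * vp[j] / denom
            for i in range(n):
                b[i] += phi[i] * reward
            state = nxt
    # weights = A^{-1} b; with one-hot features, weights[s] estimates V(s)
    return [sum(inv_a[i][j] * b[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    for s, v in enumerate(recursive_lstd()):
        print(f"V({s}) ~ {v:.2f}")
```

One-hot features are used here so that each weight can be read directly as a state value; a compact feature map would take their place when the state space is too large to enumerate, which is the setting the abstract targets.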