A bounded actor–critic reinforcement learning algorithm applied to airline revenue management. (June 2019)

Record Type:: Journal Article
Title:: A bounded actor–critic reinforcement learning algorithm applied to airline revenue management. (June 2019)
Main Title:: A bounded actor–critic reinforcement learning algorithm applied to airline revenue management
Authors:: Lawhead, Ryan J.
Gosavi, Abhijit
Abstract:: Reinforcement Learning (RL) is an artificial intelligence technique used to solve Markov and semi-Markov decision processes. Actor critics form a major class of RL algorithms that suffer from a critical deficiency, which is that the values of the so-called actor in these algorithms can become very large, causing computer overflow. In practice, hence, one has to artificially constrain these values, via a projection, and at times further use temperature-reduction tuning parameters in the popular Boltzmann action-selection schemes to make the algorithm deliver acceptable results. This artificial bounding and temperature reduction, however, do not allow for full exploration of the state space, which often leads to sub-optimal solutions on large-scale problems. We propose a new actor–critic algorithm in which (i) the actor's values remain bounded without any projection and (ii) no temperature-reduction tuning parameter is needed. The algorithm also represents a significant improvement over a recent version in the literature, where, although the values remain bounded, they usually become very large in magnitude, necessitating the use of a temperature-reduction parameter. Our new algorithm is tested on an important problem in an area of management science known as airline revenue management, where the state space is very large. The algorithm delivers encouraging computational behavior, outperforming a well-known industrial heuristic called EMSR-b on industrial data.
Is Part Of:: Engineering applications of artificial intelligence. Volume 82(2019)
Journal:: Engineering applications of artificial intelligence
Issue:: Volume 82(2019)
Issue Display:: Volume 82, Issue 2019 (2019)
Year:: 2019
Volume:: 82
Issue:: 2019
Issue Sort Value:: 2019-0082-2019-0000
Page Start:: 252
Page End:: 262
Publication Date:: 2019-06
Subjects:: Reinforcement learning -- Actor critics -- Airline revenue management
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
Journal URLs:: http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
DOI:: 10.1016/j.engappai.2019.04.008
Languages:: English
ISSNs:: 0952-1976
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 10923.xml