A priori-knowledge/actor-critic reinforcement learning architecture for computing the mean–variance customer portfolio: The case of bank marketing campaigns. (November 2015)

Record Type:: Journal Article
Title:: A priori-knowledge/actor-critic reinforcement learning architecture for computing the mean–variance customer portfolio: The case of bank marketing campaigns. (November 2015)
Main Title:: A priori-knowledge/actor-critic reinforcement learning architecture for computing the mean–variance customer portfolio: The case of bank marketing campaigns
Authors:: Sánchez, Emma M.
Clempner, Julio B.
Poznyak, Alexander S.
Abstract:: In this paper we propose a novel recurrent reinforcement learning approach for controllable Markov chains that adjusts its policies according to a preprocessing step and an actor-critic architecture. The preprocessing step is applied when a new task must be learned from reinforcement: it draws on a priori knowledge to reduce computation time and to avoid exploring and learning everything from scratch. The actor-critic architecture is based on an iterated quadratic/Lagrange programming maximization algorithm for computing the optimal strategies of the mean–variance customer portfolio. This process can be viewed as a specific form of asynchronous value iteration with optimized computational properties. Using only the value-maximizing action at each state is unlikely in practice, so a specific selection of policies is used to ensure convergence. The proposed reinforcement model predicts a learning process that takes the risk of the customer portfolio into account. The resulting policies dynamically optimize the customer portfolio. We propose to apply three different learning rules, based on the transition matrices, the utilities and the costs, to estimate the objective function for the current policies. In particular, the learning rule for estimating the real costs imposes restrictions on the formulation of the portfolio: costs cannot be underestimated or overestimated. The learning rules allow the process to make use of past experiences and decide on …
Is Part Of:: Engineering applications of artificial intelligence. Volume 46:Part A(2015:Oct.)
Journal:: Engineering applications of artificial intelligence
Issue:: Volume 46:Part A(2015:Oct.)
Issue Display:: Volume 46 (2015)
Year:: 2015
Volume:: 46
Issue Sort Value:: 2015-0046-0000-0000
Page Start:: 82
Page End:: 92
Publication Date:: 2015-11
Subjects:: Reinforcement learning -- Preprocessing -- Actor-critic -- Mean–variance customer portfolio -- Markov chains
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
Journal URLs:: http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
DOI:: 10.1016/j.engappai.2015.08.011
Languages:: English
ISSNs:: 0952-1976
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 148.xml