AMRank: An adversarial Markov ranking model combining short- and long-term returns. (January 2023)
- Record Type:
- Journal Article
- Title:
- AMRank: An adversarial Markov ranking model combining short- and long-term returns. (January 2023)
- Main Title:
- AMRank: An adversarial Markov ranking model combining short- and long-term returns
- Authors:
- Peng, Dunlu
Chen, Yichao
- Abstract:
- Learning to rank (LTR) is a method of ranking search results using machine learning techniques. Currently, reinforcement-learning-based ranking models have achieved some success in LTR tasks. However, these models have disadvantages such as high-variance gradient estimates and training inefficiency, which pose great challenges to the convergence and accuracy of the ranking model. Combining short- and long-term returns, this paper proposes AMRank, an adversarial Markov ranking model, which is based on reinforcement learning and formalizes the ranking task as a Markov decision process. To address the aforementioned weaknesses, AMRank uses a sequence discriminator to output a long-term return with smaller variance and to conduct single-step updates, and a document discriminator to yield a short-term return. The two discriminators are trained simultaneously before the decision is made. In the training process, the policy network is applied as a generator to sample candidate documents and obtain negative samples. At the beginning of the decision, the discriminators output the returns based on the environment state and the policy, and the parameters of the policy network are then updated using the policy gradient method. Experimental results on three LETOR benchmark datasets, OHSUMED, MQ2007 and MQ2008, demonstrate that the proposed AMRank outperforms the baseline models on the document ranking task.
Highlights: AMRank, a novel document ranking model, is proposed, which combines MDP and GAN. AMRank employs long- and short-term returns to improve decision making. A sequence discriminator is presented to generate long-term returns. AMRank can realize one-step update and output return with less variance. …
- Is Part Of:
- Expert systems with applications. Volume 211(2023)
- Journal:
- Expert systems with applications
- Issue:
- Volume 211(2023)
- Issue Display:
- Volume 211, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 211
- Issue:
- 2023
- Issue Sort Value:
- 2023-0211-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- Document ranking -- Learning to rank -- Reinforcement learning -- Discriminator
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2022.118512
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24122.xml