Quality versus efficiency in document scoring with learning-to-rank models. Issue 6 (November 2016)
- Record Type:
- Journal Article
- Title:
- Quality versus efficiency in document scoring with learning-to-rank models. Issue 6 (November 2016)
- Main Title:
- Quality versus efficiency in document scoring with learning-to-rank models
- Authors:
- Capannini, Gabriele
Lucchese, Claudio
Nardini, Franco Maria
Orlando, Salvatore
Perego, Raffaele
Tonellotto, Nicola
- Abstract:
- Highlights: Characterization of the quality versus cost trade-off of Learning-to-Rank models. QuickRank: a public-domain Learning-to-Rank learning and evaluation framework. A new measure, named AuQC, for the evaluation of LtR algorithms.
Abstract: Learning-to-Rank (LtR) techniques leverage machine learning algorithms and large amounts of training data to induce high-quality ranking functions. Given a set of documents and a user query, these functions are able to precisely predict a score for each of the documents, in turn exploited to effectively rank them. Although the scoring efficiency of LtR models is critical in several applications – e.g., it directly impacts the response time and throughput of Web query processing – it has received relatively little attention so far. The goal of this work is to experimentally investigate the scoring efficiency of LtR models along with their ranking quality. Specifically, we show that machine-learned ranking models exhibit a quality versus efficiency trade-off. For example, each family of LtR algorithms has tuning parameters that can influence both effectiveness and efficiency, where higher ranking quality is generally obtained with more complex and expensive models. Moreover, LtR algorithms that learn complex models, such as those based on forests of regression trees, are generally more expensive and more effective than other algorithms that induce simpler models such as linear combinations of features. We extensively analyze the quality versus efficiency trade-off of a wide spectrum of state-of-the-art LtR algorithms, and we propose a sound methodology to devise the most effective ranker given a time budget. To guarantee reproducibility, we used publicly available datasets and we contribute an open-source C++ framework providing optimized, multi-threaded implementations of the most effective tree-based learners: Gradient Boosted Regression Trees (GBRT), Lambda-MART (λ-MART), and the first public-domain implementation of Oblivious Lambda-MART (Ωλ-MART), an algorithm that induces forests of oblivious regression trees. We investigate how the different training parameters impact the quality versus efficiency trade-off, and provide a thorough comparison of several algorithms in the quality-cost space. The experiments conducted show that there is no overall best algorithm; the optimal choice depends on the time budget.
- Is Part Of:
- Information processing & management. Volume 52: Issue 6 (2016: Nov.)
- Journal:
- Information processing & management
- Issue:
- Volume 52: Issue 6 (2016: Nov.)
- Issue Display:
- Volume 52, Issue 6 (2016)
- Year:
- 2016
- Volume:
- 52
- Issue:
- 6
- Issue Sort Value:
- 2016-0052-0006-0000
- Page Start:
- 1161
- Page End:
- 1177
- Publication Date:
- 2016-11
- Subjects:
- Efficiency -- Learning-to-rank -- Document scoring
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2016.05.004
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5040.xml