Reducing correlation of random forest–based learning‐to‐rank algorithms using subsample size. (29th April 2019)
- Record Type:
- Journal Article
- Title:
- Reducing correlation of random forest–based learning‐to‐rank algorithms using subsample size. (29th April 2019)
- Main Title:
- Reducing correlation of random forest–based learning‐to‐rank algorithms using subsample size
- Authors:
- Ibrahim, Muhammad
- Abstract:
- Learning‐to‐rank (LtR) has become an integral part of modern ranking systems. In this field, random forest–based rank‐learning algorithms are shown to be among the top performers. Traditionally, each tree of a random forest is learnt using a bootstrapped copy of the training set, where approximately 63% of the examples are unique. The goal of using a bootstrapped copy instead of the original training set is to reduce the correlation between individual trees, thereby making the prediction of the ensemble more accurate. In this regard, the following question may be raised: how can we leverage the correlation between the trees in favor of performance and scalability of a random forest–based LtR algorithm? In this article, we investigate whether we can further decrease the correlation between the trees while maintaining or possibly improving accuracy. Among several potential options to achieve this goal, we investigate the size of the subsamples used for learning individual trees. We examine the performance of a random forest–based LtR algorithm as we control the correlation using this parameter. Experiments on LtR data sets reveal that for small‐ to moderate‐sized data sets, substantial reduction in training time can be achieved using only a small amount of training data per tree. Moreover, due to the positive correlation between the variability across the trees and performance of a random forest, we observe an increase in accuracy while maintaining the same level of model stability as the baseline. For big data sets, although our experiments did not observe an increase in accuracy (because, with larger data sets, the individual tree variance is already comparatively smaller), our technique is still applicable as it allows for greater scalability.
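The 63% figure quoted in the abstract follows from the fact that, in a bootstrap sample of size n, each example is omitted with probability (1 − 1/n)^n, which tends to 1/e ≈ 0.368, leaving about 1 − 1/e ≈ 0.632 of the examples unique. A minimal sketch of both ideas (the helper names are illustrative, not taken from the article's own implementation):

```python
import random

def bootstrap_unique_fraction(n, seed=0):
    # Draw a bootstrap sample (n draws with replacement) and return the
    # fraction of distinct training examples it contains; for large n this
    # concentrates near 1 - 1/e ≈ 0.632.
    rng = random.Random(seed)
    sample = [rng.randrange(n) for _ in range(n)]
    return len(set(sample)) / n

def subsample_indices(n, subsample_size, seed=0):
    # Per-tree subsample drawn WITHOUT replacement: choosing a small
    # subsample_size reduces the overlap between the training sets of
    # different trees, which is the correlation-control knob the
    # article studies.
    rng = random.Random(seed)
    return rng.sample(range(n), subsample_size)

print(bootstrap_unique_fraction(100_000))  # prints a value near 0.632
```

Each tree of the ensemble would then be grown on its own `subsample_indices` draw (with a different seed per tree) instead of a full bootstrapped copy, trading per-tree training data for lower inter-tree correlation and faster training.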
- Is Part Of:
- Computational intelligence. Volume 35:Number 4(2019)
- Journal:
- Computational intelligence
- Issue:
- Volume 35:Number 4(2019)
- Issue Display:
- Volume 35, Issue 4 (2019)
- Year:
- 2019
- Volume:
- 35
- Issue:
- 4
- Issue Sort Value:
- 2019-0035-0004-0000
- Page Start:
- 774
- Page End:
- 798
- Publication Date:
- 2019-04-29
- Subjects:
- information retrieval -- learning‐to‐rank -- random forests -- subsampling -- scalability
Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/coin.12213
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 12061.xml