Reducing hardware hit by queries in web search engines. Issue 6 (November 2016)
- Record Type:
- Journal Article
- Title:
- Reducing hardware hit by queries in web search engines. Issue 6 (November 2016)
- Main Title:
- Reducing hardware hit by queries in web search engines
- Authors:
- Mendoza, Marcelo
Marín, Mauricio
Gil-Costa, Verónica
Ferrarotti, Flavio
- Abstract:
- Highlights: We propose a collection selection method outperforming the state of the art. We conduct a learning process over a term space with low dimensionality. The learning process uses IR measures as reward functions. We introduce a new novelty detection method for incremental learning.
Abstract: In this paper, we introduce a new collection selection strategy to be operated in search engines with document partitioned indexes. Our method involves the selection of those document partitions that are most likely to deliver the best results to the formulated queries, reducing the number of queries that are submitted to each partition. This method employs learning algorithms that are capable of ranking the partitions, maximizing the probability of recovering documents with high gain. The method operates by building vector representations of each partition on the term space that is spanned by the queries. The proposed method is able to generalize to new queries and elaborate document lists with high precision for queries not considered during the training phase. To update the representations of each partition, our method employs incremental learning strategies. Beginning with an inversion test of the partition lists, we identify queries that contribute with new information and add them to the training phase. The experimental results show that our collection selection method favorably compares with state-of-the-art methods. In addition our method achieves a suitable performance with low parameter sensitivity making it applicable to search engines with hundreds of partitions. …
- Is Part Of:
- Information processing & management. Volume 52:Issue 6(2016:Nov.)
- Journal:
- Information processing & management
- Issue:
- Volume 52:Issue 6(2016:Nov.)
- Issue Display:
- Volume 52, Issue 6 (2016)
- Year:
- 2016
- Volume:
- 52
- Issue:
- 6
- Issue Sort Value:
- 2016-0052-0006-0000
- Page Start:
- 1031
- Page End:
- 1052
- Publication Date:
- 2016-11
- Subjects:
- Query routing -- Distributed information retrieval -- Incremental learning
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2016.04.008
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 7325.xml