A meta-learning configuration framework for graph-based similarity search indexes. Issue 112 (February 2023)
- Record Type:
- Journal Article
- Title:
- A meta-learning configuration framework for graph-based similarity search indexes. Issue 112 (February 2023)
- Main Title:
- A meta-learning configuration framework for graph-based similarity search indexes
- Authors:
- Oyamada, Rafael S.
Shimomura, Larissa C.
Barbon, Sylvio
Kaster, Daniel S.
- Abstract:
- Abstract: Similarity searches retrieve elements in a dataset with similar characteristics to the input query element. Recent works show that graph-based methods have outperformed others in the literature, such as tree-based and hash-based methods. However, graphs are highly parameter-sensitive for indexing and searching, which usually demands extra time for finding a suitable trade-off for specific user requirements. Current approaches to select parameters rely on observing published experimental results or Grid Search procedures. While the former has no guarantees that good settings for a dataset will also perform well on a different one, the latter is computationally expensive and limited to a small range of values. In this work, we propose a meta-learning-based recommender framework capable of providing a suitable graph configuration according to the characteristics of the input dataset. We present two instantiations of the framework: a global instantiation that uses the whole meta-database to train meta-models and a dataset-similarity-based instantiation that relies on clustering to generate meta-models tailored to datasets with similar characteristics. We also developed generic and tuned versions of the instantiations. The generic versions can satisfy user requirements in orders of magnitude faster than the traditional Grid Search. The tuned versions provide more accurate predictions at a higher cost. Our results show that the tuned methods outperform the Grid Search for most cases, providing recommendations close to the optimal one and being a suitable alternative, particularly for more challenging datasets.
Highlights:
Framework based on meta-learning to automatically select and configure proximity graphs.
Intelligent index parameterization for image databases.
Meaningful dataset descriptors for approximate nearest neighbor searches.
Strategies for instantiating the proposed recommender system.
- Is Part Of:
- Information systems. Issue 112(2023)
- Journal:
- Information systems
- Issue:
- Issue 112(2023)
- Issue Display:
- Volume 112, Issue 112 (2023)
- Year:
- 2023
- Volume:
- 112
- Issue:
- 112
- Issue Sort Value:
- 2023-0112-0112-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-02
- Subjects:
- Similarity searching -- Graph-based indexes -- Parameter recommending -- Meta-learning
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2022.102123
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24466.xml