An efficient framework for zero-shot sketch-based image retrieval. (June 2022)
- Record Type:
- Journal Article
- Title:
- An efficient framework for zero-shot sketch-based image retrieval. (June 2022)
- Main Title:
- An efficient framework for zero-shot sketch-based image retrieval
- Authors:
- Tursun, Osman
Denman, Simon
Sridharan, Sridha
Goan, Ethan
Fookes, Clinton
- Abstract:
- Highlights: We propose an efficient framework for zero-shot sketch-based image retrieval. The model is trained in an end-to-end way with three introduced learning objectives: domain-balanced quadruplet loss, semantic classification loss and semantic knowledge preservation loss. A low-cost but accurate semantic knowledge distillation pipeline is introduced. It does not require a language model or an online teacher network as with past approaches. The proposed method achieved state-of-the-art results on three challenging zero-shot sketch-based image retrieval datasets: Sketchy Extended, TU-Berlin Extended and QuickDraw Extended.
Abstract: Zero-shot sketch-based image retrieval (ZS-SBIR) has recently attracted the attention of the computer vision community due to its real-world applications, and the more realistic and challenging setting that it presents over SBIR. ZS-SBIR inherits the main challenges of multiple computer vision problems including content-based image retrieval (CBIR), zero-shot learning and domain adaptation. The majority of previous studies using deep neural networks have achieved improved results by either projecting sketches and images into a common low-dimensional space, or transferring knowledge from seen to unseen classes. However, those approaches are trained with complex frameworks composed of multiple deep convolutional neural networks (CNNs) and are dependent on category-level word labels. This increases the requirements for training resources and datasets. In comparison, we propose a simple and efficient framework that does not require high computational training resources, and learns the semantic embedding space from a vision model rather than a language model, as is done by related studies. Furthermore, at training and inference stages our method only uses a single CNN. In this work, a pre-trained ImageNet CNN (i.e., ResNet50) is fine-tuned with three proposed learning objectives: domain-balanced quadruplet loss, semantic classification loss, and semantic knowledge preservation loss. The domain-balanced quadruplet and semantic classification losses are introduced to learn discriminative, semantic and domain invariant features by considering ZS-SBIR as an object detection and verification problem. To preserve semantic knowledge learned with ImageNet and exploit it for unseen categories, the semantic knowledge preservation loss is proposed. To reduce computational cost and increase the accuracy of the semantic knowledge distillation process, ground-truth semantic knowledge is prepared in a class-oriented fashion prior to training. Extensive experiments are conducted on three challenging ZS-SBIR datasets: Sketchy Extended, TU-Berlin Extended and QuickDraw Extended. The proposed method achieves state-of-the-art results, and outperforms the majority of related works by a substantial margin.
- Is Part Of:
- Pattern recognition. Volume 126(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 126(2022)
- Issue Display:
- Volume 126, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 126
- Issue:
- 2022
- Issue Sort Value:
- 2022-0126-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Sketch-based image retrieval -- Zero-shot learning -- Knowledge distillation -- Similarity learning
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2022.108528
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22254.xml