Granularity-aware distillation and structure modeling region proposal network for fine-grained image classification. (May 2023)
- Record Type:
- Journal Article
- Title:
- Granularity-aware distillation and structure modeling region proposal network for fine-grained image classification. (May 2023)
- Main Title:
- Granularity-aware distillation and structure modeling region proposal network for fine-grained image classification
- Authors:
- Ke, Xiao
Cai, Yuhang
Chen, Baitao
Liu, Hao
Guo, Wenzhong
- Abstract:
- Highlights: We propose a granularity-aware distillation module to enhance the representation ability of the model. We adopt a multi-granularity feature fusion learning strategy to jointly learn multi-level information, and use cross-layer self-distillation regularization to improve the robustness of features at different granularity levels. We propose a structure modeling region proposal module. Based on the collaborative learning of discriminative semantics and structural semantics under weak supervision, the model can improve the ability to capture multi-granularity discriminative regions and mine regional structural semantic associations. GDSMP-Net reports state-of-the-art performance on four widely used challenging datasets: CUB-200-2011, Stanford Cars, FGVC-Aircraft and NA-birds.
Abstract: Fine-grained visual classification (FGVC) aims to identify objects belonging to multiple sub-categories of the same super-category. The key to solving fine-grained classification problems is to learn discriminative visual feature representations with only subtle differences. Although previous work based on refined feature learning has made great progress, high-level semantic features often lack key information about the nuances of fine-grained visual objects. How to efficiently integrate semantic information of different granularities from classification networks is critical. In this paper, we propose the Granularity-aware Distillation and Structure Modeling region Proposal Network (GDSMP-Net). Our solution integrates multi-granularity hierarchical information through a multi-granularity fusion learning strategy to enhance feature representation. In view of the inherent challenge of large intra-class differences in FGVC, a cross-layer self-distillation regularization is proposed to strengthen the connection between high-level semantics and low-level semantics for robust multi-granularity feature learning. On this basis, we use a weakly supervised method to generate local branches, and the collaborative learning of discriminative semantics and structural semantics based on local regions helps the model perceive contextual information and capture structural interactions between local semantics. Comprehensive experiments show that our method achieves state-of-the-art performance on four widely used challenging datasets (CUB-200-2011, Stanford Cars, FGVC-Aircraft and NA-birds).
- Is Part Of:
- Pattern recognition. Volume 137(2023)
- Journal:
- Pattern recognition
- Issue:
- Volume 137(2023)
- Issue Display:
- Volume 137, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 137
- Issue:
- 2023
- Issue Sort Value:
- 2023-0137-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-05
- Subjects:
- Fine-grained visual classification -- Multi-granularity feature learning -- Knowledge distillation -- Structure modeling
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2023.109305
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25689.xml