TPSN: Transformer-based multi-Prototype Search Network for few-shot semantic segmentation. (October 2022)
- Record Type:
- Journal Article
- Title:
- TPSN: Transformer-based multi-Prototype Search Network for few-shot semantic segmentation. (October 2022)
- Main Title:
- TPSN: Transformer-based multi-Prototype Search Network for few-shot semantic segmentation
- Authors:
- Wang, Wenjian
Duan, Lijuan
En, Qing
Zhang, Baochang
Liang, Fangfang
- Abstract:
- Few-shot semantic segmentation aims to segment new objects in an image with limited annotations. Typically, in metric-based few-shot learning, a category's representation is obtained by averaging global support object information. However, a single prototype cannot accurately describe a category. Meanwhile, simple foreground averaging also ignores the dependencies between objects and their surroundings. In this paper, we propose a novel Transformer-based Prototype Search Network (TPSN) for few-shot segmentation. We use the transformer encoder to integrate information between different image regions and then use the decoder to express a category in terms of multiple prototypes. The multi-prototype approach can effectively alleviate the feature fluctuation caused by limited annotation data. Moreover, unlike previous few-shot prototype frameworks, we use adaptive prototype search during multi-prototype extraction instead of the ordinary averaging operation. This helps the network integrate information from different image regions and fuse object features with their dependent background information, obtaining more reasonable prototype expressions. In addition, to encourage a category's prototypes to focus on different parts while remaining consistent in high-level semantics, we use diversity and consistency losses to constrain multi-prototype training. Experiments show that our algorithm achieves state-of-the-art performance in few-shot segmentation on two datasets: PASCAL-5^i and COCO-20^i.
Highlights: Multiple prototypes give better segmentation results than a single prototype. Transforming annotated mask information into a learnable Guide-Embedding provides more detailed support for few-shot learning. Communication between image region features helps the model capture the context surrounding categories.
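The single-prototype baseline that the abstract contrasts against (averaging the support object's features into one vector, then matching query pixels by similarity) can be illustrated with a minimal sketch. This is not the TPSN method and is not taken from the paper: the function names, the toy two-dimensional features, and the use of cosine similarity against a foreground and a background prototype are all assumptions chosen for illustration.

```python
import math

def masked_average_prototype(features, mask):
    """Average the feature vectors of the pixels where mask == 1.

    features: list of C-dimensional vectors, one per (flattened) pixel.
    mask: list of 0/1 labels with the same length as features.
    """
    selected = [f for f, m in zip(features, mask) if m == 1]
    if not selected:
        raise ValueError("mask selects no pixels")
    dim = len(selected[0])
    return [sum(f[c] for f in selected) / len(selected) for c in range(dim)]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def segment(query_features, fg_proto, bg_proto):
    """Label a query pixel 1 if it is closer to the foreground prototype."""
    return [
        1 if cosine(q, fg_proto) > cosine(q, bg_proto) else 0
        for q in query_features
    ]

# Toy support image: four pixels with hypothetical 2-D features.
support_features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
support_mask = [1, 1, 0, 0]  # the first two pixels belong to the object

fg = masked_average_prototype(support_features, support_mask)
bg = masked_average_prototype(support_features, [1 - m for m in support_mask])

# Toy query image: two pixels to be labelled.
query_features = [[0.9, 0.1], [0.2, 0.8]]
prediction = segment(query_features, fg, bg)
print(prediction)
```

Because every foreground pixel collapses into one averaged vector, any variation across object parts is lost, which is the limitation the abstract cites as motivation for using multiple prototypes.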
- Is Part Of:
- Computers & electrical engineering. Volume 103(2022)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 103(2022)
- Issue Display:
- Volume 103, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 103
- Issue:
- 2022
- Issue Sort Value:
- 2022-0103-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-10
- Subjects:
- Semantic segmentation -- Few-shot learning -- Vision transformer -- Multiple prototypes
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2022.108326
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24061.xml