Semantic segmentation using stride spatial pyramid pooling and dual attention decoder. (November 2020)
- Record Type:
- Journal Article
- Title:
- Semantic segmentation using stride spatial pyramid pooling and dual attention decoder. (November 2020)
- Main Title:
- Semantic segmentation using stride spatial pyramid pooling and dual attention decoder
- Authors:
- Peng, Chengli
Ma, Jiayi
- Abstract:
- Highlights: We propose an SSPP structure to capture multiscale semantic information. Attention mechanism is applied to bridge the information gap in segmentation networks. We propose a new decoder to make full use of the low- and high-level feature maps. Auxiliary loss is applied to make the network easier to train. Our method attains state-of-the-art performance on PASCAL VOC 2012, Cityscapes and COCO-Stuff.
Abstract: Semantic segmentation is an end-to-end task that requires both semantic and spatial accuracy. It is important for deep learning-based segmentation methods to effectively utilize the high-level feature map whose semantic information is abundant and the low-level feature map whose spatial information is accurate. However, existing segmentation networks typically cannot take full advantage of these two kinds of feature maps, leading to inferior performance. This paper attempts to overcome this challenge by introducing two novel structures. On the one hand, we propose a structure called stride spatial pyramid pooling (SSPP) to capture multiscale semantic information from the high-level feature map. Compared with existing pyramid pooling methods based on the atrous convolution, the SSPP structure is able to gather more information from the high-level feature map with faster inference speed, which improves the utilization efficiency of the high-level feature map significantly. On the other hand, we propose a dual attention decoder consisting of a channel attention branch and a spatial attention branch to make full use of the high- and low-level feature maps simultaneously. The dual attention decoder can result in a more "semantic" low-level feature map and a high-level feature map with more accurate spatial information, which bridges the gap between these two kinds of feature maps and benefits their fusion. We evaluate the proposed model on several publicly available semantic image segmentation benchmarks including PASCAL VOC 2012, Cityscapes and COCO-Stuff. The qualitative and quantitative results demonstrate that our method can achieve the state-of-the-art performance.
- Is Part Of:
- Pattern recognition. Volume 107(2020:Nov.)
- Journal:
- Pattern recognition
- Issue:
- Volume 107(2020:Nov.)
- Issue Display:
- Volume 107 (2020)
- Year:
- 2020
- Volume:
- 107
- Issue Sort Value:
- 2020-0107-0000-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-11
- Subjects:
- Semantic segmentation -- Convolutional neural networks -- Pyramid pooling -- Attention mechanism
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2020.107498
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19199.xml