Real-time and light-weighted unsupervised video object segmentation network. (December 2021)
- Record Type:
- Journal Article
- Title:
- Real-time and light-weighted unsupervised video object segmentation network. (December 2021)
- Main Title:
- Real-time and light-weighted unsupervised video object segmentation network
- Authors:
- Zhao, Zongji
Zhao, Sanyuan
Shen, Jianbing
- Abstract:
- Highlights: A real-time unsupervised video object segmentation network that requires no manually labeled segmentation mask for the first frame. The Dynamic ASPP module dynamically selects features at different scales and greatly reduces the number of parameters. The RNN-Conv module can be stacked to capture higher-level spatiotemporal patterns while preventing gradient explosion and vanishing.
Abstract: Video object segmentation is one of the most practical computer vision tasks, especially in the unsupervised case, which has no manually labeled segmentation mask at the beginning of a video sequence. In this paper, we propose a new real-time unsupervised video object segmentation network. Based on the encoder-decoder framework, we present a Dynamic ASPP module and an RNN-Conv module. The former adds a dynamic selection mechanism to the Atrous Spatial Pyramid Pooling structure, so that the dilated convolutional kernels adaptively select features of the appropriate scale through a channel attention mechanism. Compared with directly concatenating the dilated convolutional features, dynamically selecting feature maps reduces the number of parameters and makes the module more efficient. The RNN-Conv module combines RNN units with external convolutional blocks, aggregating the temporal features of a video sequence with the spatial information extracted by the convolutional network. We stack this module to extract deeper spatiotemporal features than a traditional RNN network. This module helps to avoid vanishing and exploding gradients during network training. We test our network on popular video object segmentation datasets. The experimental results demonstrate the effectiveness of our model.
- Is Part Of:
- Pattern recognition. Volume 120(2021)
- Journal:
- Pattern recognition
- Issue:
- Volume 120(2021)
- Issue Display:
- Volume 120, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 120
- Issue:
- 2021
- Issue Sort Value:
- 2021-0120-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-12
- Subjects:
- Unsupervised video object segmentation -- Salient object detection
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2021.108120
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 18480.xml
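
Illustrative note on the abstract: the dynamic selection step that the abstract attributes to the Dynamic ASPP module can be sketched roughly as below. This is a minimal sketch and not the authors' implementation. It assumes a selective-kernel style of channel attention (sum the branches, pool to one descriptor per channel, softmax across branches), which the abstract does not confirm. The names `dynamic_select`, `branch_weights` and `projection_params` are hypothetical, and `branch_weights` stands in for attention layers that a real network would learn.

```python
import math


def global_avg_pool(feature):
    """Reduce a feature map (C channels of flat pixel lists) to C channel means."""
    return [sum(channel) / len(channel) for channel in feature]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_select(branches, branch_weights):
    """Fuse K same-shaped branch outputs with per-channel attention.

    branches: K feature maps, one per dilation rate; each is a list of
        C channels, and each channel is a flat list of pixel values.
    branch_weights: K x C scalars (hypothetical stand-in for learned
        attention parameters).
    Returns one feature map with C channels, not K * C.
    """
    num_channels = len(branches[0])
    # Sum the branches element-wise, then squeeze to one value per channel.
    summed = [
        [sum(pixels) for pixels in zip(*(branch[c] for branch in branches))]
        for c in range(num_channels)
    ]
    descriptor = global_avg_pool(summed)

    fused = []
    for c in range(num_channels):
        # One logit per branch; softmax makes the branches compete per channel.
        logits = [weights[c] * descriptor[c] for weights in branch_weights]
        attention = softmax(logits)
        fused.append([
            sum(a * branch[c][i] for a, branch in zip(attention, branches))
            for i in range(len(summed[c]))
        ])
    return fused


def projection_params(in_channels, out_channels):
    """Weight count of a bias-free 1x1 convolution."""
    return in_channels * out_channels
```

Under these assumptions the parameter saving comes from the layer that follows the branches. With, say, 4 branches of 256 channels each (illustrative figures, not taken from the paper), concatenation feeds 1024 channels into a 1x1 projection, so `projection_params(1024, 256)` gives 262144 weights, whereas selection keeps 256 channels and `projection_params(256, 256)` gives 65536.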