Video super-resolution via mixed spatial-temporal convolution and selective fusion. (June 2022)
- Record Type:
- Journal Article
- Title:
- Video super-resolution via mixed spatial-temporal convolution and selective fusion. (June 2022)
- Main Title:
- Video super-resolution via mixed spatial-temporal convolution and selective fusion
- Authors:
- Sun, Wei
Gong, Dong
Shi, Javen Qinfeng
van den Hengel, Anton
Zhang, Yanning
- Abstract:
- Highlights: A Mixed Spatial-Temporal Convolution block combines spatial-temporal information. Selective feature fusion strategy ensures the effectiveness of feature merging. Motion compensation alleviates the influence of misalignments. Our approach can achieve better performance on various datasets.
Abstract: Video super-resolution aims to recover the high-resolution (HR) contents from the low-resolution (LR) observations, relying on compositing the spatial-temporal information in the LR frames. It is crucial to model the spatial-temporal information jointly since video sequences are three-dimensional spatial-temporal signals. Compared with explicitly estimating motions between the 2D frames, 3D convolutional neural networks (CNNs) have shown their efficiency and effectiveness for video super-resolution (SR), as a natural way of spatial-temporal data modelling. Though promising, the performance of 3D CNNs is still far from satisfactory. The high computational and memory requirements limit the development of more advanced designs to extract and fuse the information from a larger spatial and temporal scale. We thus propose a Mixed Spatial-Temporal Convolution (MSTC) block that simultaneously extracts the spatial information and the supplemented temporal dependency among frames by jointly applying 2D and 3D convolution. To further fuse the learned features corresponding to different frames, we propose a novel similarity-based selective features strategy, unlike previous methods that directly stack the learned features. Additionally, an attention-based motion compensation module is applied to alleviate the influence of misalignment between frames. Experiments on three widely used benchmark datasets and a real-world dataset show that, relying on superior feature extraction and fusion ability, the proposed network can outperform previous state-of-the-art methods, especially for recovering the confusing details.
- Is Part Of:
- Pattern recognition. Volume 126(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 126(2022)
- Issue Display:
- Volume 126, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 126
- Issue:
- 2022
- Issue Sort Value:
- 2022-0126-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Video super-resolution -- Mixed spatial-temporal convolution -- Selective feature fusion
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2022.108577
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22254.xml