Encoder deep interleaved network with multi-scale aggregation for RGB-D salient object detection. (August 2022)
- Record Type:
- Journal Article
- Title:
- Encoder deep interleaved network with multi-scale aggregation for RGB-D salient object detection. (August 2022)
- Main Title:
- Encoder deep interleaved network with multi-scale aggregation for RGB-D salient object detection
- Authors:
- Feng, Guang
Meng, Jinyu
Zhang, Lihe
Lu, Huchuan
- Abstract:
- Highlights: We design a two-stream deep interleaved encoder network to obtain multi-level continuous multi-modal features for saliency detection. We utilize a cross-modal mutual guidance module to locate the salient region. We present a residual multi-scale aggregation module to combine the global-to-local context progressively. Our method performs favorably against other state-of-the-art saliency detection algorithms, and the network can run at about 93 FPS in the testing stage.
Abstract: Recently, RGB-D salient object detection (SOD) has attracted widespread research interest. Existing RGB-D SOD approaches mainly consider cross-modal information fusion in the decoder, and their multi-modal interaction mainly concentrates on features at the same level of the RGB stream and the depth stream. They do not deeply explore the coherence of multi-modal features at different levels. In this paper, we design a two-stream deep interleaved encoder network to extract RGB and depth information and realize their mixing simultaneously. This network allows us to gradually learn multi-modal representations at different levels, from shallow to deep. Moreover, to further fuse multi-modal features in the decoding stage, we propose a cross-modal mutual guidance module and a residual multi-scale aggregation module to implement the global guidance and local refinement of the salient region. Extensive experiments on six benchmark datasets demonstrate that the proposed approach performs favorably against most state-of-the-art methods under different evaluation metrics. During the testing stage, the model runs at a real-time speed of 93 FPS.
- Is Part Of:
- Pattern recognition. Volume 128(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 128(2022)
- Issue Display:
- Volume 128, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 128
- Issue:
- 2022
- Issue Sort Value:
- 2022-0128-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-08
- Subjects:
- RGB-D salient object detection -- Deep interleaved encoder -- Cross-modal mutual guidance -- Residual multi-scale feature aggregation -- Real-time
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2022.108666
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22284.xml