Boundary-induced and scene-aggregated network for monocular depth prediction. (July 2021)
- Record Type:
- Journal Article
- Title:
- Boundary-induced and scene-aggregated network for monocular depth prediction. (July 2021)
- Main Title:
- Boundary-induced and scene-aggregated network for monocular depth prediction
- Authors:
- Xue, Feng
Cao, Junfeng
Zhou, Yu
Sheng, Fei
Wang, Yankai
Ming, Anlong
- Abstract:
- Highlights: The problems of predicting the incorrect farthest region and the blurred depth around boundaries are deeply explored. The Boundary-induced and scene-aggregated network is proposed to address the two issues above. A well-designed DCE obtains the correlations between long-distance pixels and the correlations between multi-scale regions. To extract the depth boundary, a BUBF module is designed to gradually fuse features of adjacent levels. A Stripe Refinement Module (SRM) is designed to refine depth around the boundary. Numerous experiments on the NYUD v2, iBims-1, and SUN-RGBD datasets prove the effectiveness of our method.
Abstract: Monocular depth prediction is an important task in scene understanding. It aims to predict the dense depth of a single RGB image. With the development of deep learning, performance on this task has improved greatly. However, two issues remain unresolved: (1) the deep feature encodes the wrong farthest region in a scene, which leads to a distorted 3D structure of the predicted depth; (2) the low-level features are insufficiently utilized, which makes it even harder to estimate the depth near edges with sudden depth change. To tackle these two issues, we propose the Boundary-induced and Scene-aggregated network (BS-Net). In this network, the Depth Correlation Encoder (DCE) is first designed to obtain the contextual correlations between the regions in an image, and to perceive the farthest region by considering the correlations. Meanwhile, the Bottom-Up Boundary Fusion (BUBF) module is designed to extract accurate boundaries that indicate depth change. Finally, the Stripe Refinement Module (SRM) is designed to refine the dense depth induced by the boundary cue, which improves the boundary accuracy of the predicted depth. Several experimental results on the NYUD v2 dataset and the iBims-1 dataset illustrate the state-of-the-art performance of the proposed approach. The SUN-RGBD dataset is additionally employed to evaluate the generalization of our method. Code is available at https://github.com/XuefengBUPT/BS-Net.
- Is Part Of:
- Pattern recognition. Volume 115(2021)
- Journal:
- Pattern recognition
- Issue:
- Volume 115(2021)
- Issue Display:
- Volume 115, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 115
- Issue:
- 2021
- Issue Sort Value:
- 2021-0115-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-07
- Subjects:
- Monocular depth prediction -- Boundary-induced -- Depth correlation
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2021.107901
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17373.xml