CMA: Cross-modal attention for 6D object pose estimation. (June 2021)

Record Type:: Journal Article
Title:: CMA: Cross-modal attention for 6D object pose estimation. (June 2021)
Main Title:: CMA: Cross-modal attention for 6D object pose estimation
Authors:: Zou, Lu
Huang, Zhangjin
Wang, Fangjun
Yang, Zhouwang
Wang, Guoping
Abstract:: Highlights: We present CMA, a novel cross-modal data fusion approach that incorporates the attention mechanism. CMA extracts discriminative cross-modal features that are more robust for 6D object pose estimation. We evaluate our method on two widely used datasets: the LINEMOD dataset and the YCB-Video dataset. Experimental results demonstrate that our method achieves superior performance over state-of-the-art methods on both datasets, as well as high efficiency. Abstract: Deep learning methods for 6D object pose estimation based on RGB and depth (RGB-D) images have been successfully applied to robotic manipulation and grasping. Among these approaches, the fusion of RGB and depth modalities is one of the most critical issues. Most existing works perform fusion via either simple concatenation or element-wise multiplication of the features generated by these two modalities. Despite achieving impressive progress, such fusion strategies do not explicitly consider the different contributions of the RGB and depth modalities, leaving a gap for performance enhancement. In this paper, we present a Cross-Modal Attention (CMA) component for the problem of 6D object pose estimation. With the attention mechanism, features of the two modalities are aggregated adaptively through the attention weights, such that powerful representations from the RGB-D images can be efficiently extracted. Comprehensive experiments on both LINEMOD and YCB-Video datasets demonstrate that the …
Is Part Of:: Computers & graphics. Volume 97(2021)
Journal:: Computers & graphics
Issue:: Volume 97(2021)
Issue Display:: Volume 97, Issue 2021 (2021)
Year:: 2021
Volume:: 97
Issue:: 2021
Issue Sort Value:: 2021-0097-2021-0000
Page Start:: 139
Page End:: 147
Publication Date:: 2021-06
Subjects:: 6D object pose estimation -- Cross-modal data fusion -- Attention mechanism
Computer graphics -- Periodicals
006.6
Journal URLs:: http://www.elsevier.com/journals
DOI:: 10.1016/j.cag.2021.04.018
Languages:: English
ISSNs:: 0097-8493
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.700000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 17245.xml
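
Illustrative note on the abstract: the abstract contrasts fusion by simple concatenation or element-wise multiplication with fusion in which the two modalities are aggregated adaptively through attention weights. The sketch below is only a minimal illustration of that general idea, not the CMA architecture, which this record does not describe. The function names, the `query` scoring vector, and the toy feature values are all hypothetical; in a real model the scoring parameters would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def concat_fuse(rgb_feat, depth_feat):
    """Baseline 1: simple concatenation (fixed, input-independent)."""
    return list(rgb_feat) + list(depth_feat)


def multiply_fuse(rgb_feat, depth_feat):
    """Baseline 2: element-wise multiplication (fixed, input-independent)."""
    return [r * d for r, d in zip(rgb_feat, depth_feat)]


def attention_fuse(rgb_feat, depth_feat, query):
    """Weight each modality by a softmax attention score, then sum.

    `query` is a hypothetical scoring vector (an assumption of this sketch).
    Returns (fused_vector, (w_rgb, w_depth)); the two weights sum to 1.
    """
    scale = math.sqrt(len(query))
    scores = [dot(rgb_feat, query) / scale, dot(depth_feat, query) / scale]
    w_rgb, w_depth = softmax(scores)
    fused = [w_rgb * r + w_depth * d for r, d in zip(rgb_feat, depth_feat)]
    return fused, (w_rgb, w_depth)


if __name__ == "__main__":
    # Toy per-point features; real features would come from two encoders.
    rgb = [1.0, 0.0, 2.0, 1.0]
    depth = [0.0, 1.0, 0.0, 1.0]
    query = [1.0, 0.0, 1.0, 0.0]
    fused, weights = attention_fuse(rgb, depth, query)
    print("attention weights (rgb, depth):", weights)
    print("attention-fused feature:", fused)
    print("concatenated feature:", concat_fuse(rgb, depth))
    print("multiplied feature:", multiply_fuse(rgb, depth))
```

Unlike the two baselines, the attention weights here depend on the input features, so the relative contribution of each modality can change from one point to the next.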