Relation-aware attention for video captioning via graph learning. (April 2023)

Record Type:: Journal Article
Title:: Relation-aware attention for video captioning via graph learning. (April 2023)
Main Title:: Relation-aware attention for video captioning via graph learning
Authors:: Tu, Yunbin
Zhou, Chang
Guo, Junjun
Li, Huafeng
Gao, Shengxiang
Yu, Zhengtao
Abstract:: Highlights: We extend the conventional attention mechanism into a relation-aware attention mechanism via graph learning, which aims to 1) support proper semantic alignment between target word states and attended visual features and 2) leverage attention information from the past and future to guide the current attention process. A linguistics-to-vision heterogeneous graph (HTG) is learned to enhance the inter-relations between target word states and attended visual features. Moreover, a vision-to-vision homogeneous graph (HMG) is learned to capture the intra-relations among all the attended visual features. Experimental results on two benchmark datasets show that our model performs much better than many state-of-the-art methods. Abstract: Video captioning often uses an attentive encoder-decoder as the baseline model. However, the conventional attention mechanism still has two problems. First, the attended visual feature is often irrelevant to the target word state, because the attention process uses only the unidirectional flow from vision to linguistics and lacks the reverse flow. Second, each attention result is independent, because it is computed from the previous word states alone, without considering attention information from the past and future. This does not reflect human attention habits. In this paper, we extend the conventional attention mechanism into a relation-aware attention mechanism. To this end, we propose two kinds of graph …
Is Part Of:: Pattern recognition. Volume 136(2023)
Journal:: Pattern recognition
Issue:: Volume 136(2023)
Issue Display:: Volume 136, Issue 2023 (2023)
Year:: 2023
Volume:: 136
Issue:: 2023
Issue Sort Value:: 2023-0136-2023-0000
Page Start:
Page End:
Publication Date:: 2023-04
Subjects:: Video captioning -- Relation-aware attention -- Graph learning
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
Journal URLs:: http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
DOI:: 10.1016/j.patcog.2022.109204
Languages:: English
ISSNs:: 0031-3203
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 25681.xml