Video question answering via grounded cross-attention network learning. Issue 4 (July 2020)

Record Type:: Journal Article
Title:: Video question answering via grounded cross-attention network learning. Issue 4 (July 2020)
Main Title:: Video question answering via grounded cross-attention network learning
Authors:: Ye, Yunan
Zhang, Shifeng
Li, Yimeng
Qian, Xufeng
Tang, Siliang
Pu, Shiliang
Xiao, Jun
Abstract:: Highlights: We study the problem of video question answering from the viewpoint of modeling the rough video representation and the grounded video representation. The joint question-video representation, based on the rough and grounded representations of the video, is learned for answer prediction. We propose the grounded cross-attention network learning framework, a novel hierarchical cross-attention method with a Q-O cross-attention layer and a Q-V-H cross-attention layer. The proposed GCANet adopts a novel mutual attention learning mechanism. We construct two large-scale datasets for video question answering. Extensive experiments validate the effectiveness of our method. Abstract: Video Question Answering is a burgeoning and challenging task in visual information retrieval (VIR), which automatically generates the answer to a question based on referenced video content. Unlike existing visual question answering methods, which mainly focus on static image content, video question answering takes the temporal dimension into account because of the essential structural difference between images and video. In this paper, we study the problem of video question answering from the viewpoint of grounded cross-attention network learning. Specifically, we propose a novel hierarchical cross-attention mechanism of mutual attention learning for video question answering, named GCANet. We first obtain the multi-level rough video representation from …
Is Part Of:: Information processing & management. Volume 57:Issue 4(2020:Jul.)
Journal:: Information processing & management
Issue:: Volume 57:Issue 4(2020:Jul.)
Issue Display:: Volume 57, Issue 4 (2020)
Year:: 2020
Volume:: 57
Issue:: 4
Issue Sort Value:: 2020-0057-0004-0000
Page Start:
Page End:
Publication Date:: 2020-07
Subjects:: Visual information retrieval -- Video question answering -- Cross-attention
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
Journal URLs:: http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
DOI:: 10.1016/j.ipm.2020.102265
Languages:: English
ISSNs:: 0306-4573
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 20467.xml