Image caption generation with dual attention mechanism. Issue 2 (March 2020)
- Record Type:
- Journal Article
- Title:
- Image caption generation with dual attention mechanism. Issue 2 (March 2020)
- Main Title:
- Image caption generation with dual attention mechanism
- Authors:
- Liu, Maofu
Li, Lingjun
Hu, Huijun
Guan, Weili
Tian, Jing
- Abstract:
- Highlights: We combine visual attention and textual attention to form a dual attention mechanism to guide the image caption generation. We adopt FCN to predict image tags and fuse tag generation and image caption generation to train the encoder-decoder model. Our proposed model achieves state-of-the-art performance on the AIC-ICC Chinese image caption dataset.
Abstract: As a field at the intersection of computer vision and natural language processing, image caption generation has been an active research topic in recent years, contributing to multimodal social media translation from unstructured image data to structured text data. Previous research has proposed a series of image captioning methods, such as template-based, retrieval-based, and encoder-decoder approaches. Among these, the encoder-decoder framework is the most widely used in image caption generation: the encoder extracts image features with a Convolutional Neural Network (CNN), and the decoder adopts a Recurrent Neural Network (RNN) to generate the image description. The Neural Image Caption (NIC) model has achieved good performance in image captioning; however, some challenges remain to be addressed. To tackle the lack of image information and the deviation from the core content of the image, our proposed model explores visual attention to deepen the understanding of the image, incorporating the image labels generated by a Fully Convolutional Network (FCN) into the generation of the image caption. Furthermore, our proposed model exploits textual attention to increase the integrity of the information. Finally, the label generation, attached to the textual attention mechanism, and the image caption generation are merged to form an end-to-end trainable framework. Extensive experiments have been carried out on the AIC-ICC image caption benchmark dataset, and the experimental results show that our proposed model is effective and feasible for image caption generation.
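The dual attention described in the abstract can be sketched in a few lines: at each decoding step, the decoder's hidden state attends once over CNN region features (visual attention) and once over embeddings of the FCN-predicted tags (textual attention), and the two context vectors are fused. This is a minimal NumPy illustration under assumed shapes and scaled dot-product scoring, not the authors' actual implementation; all names (`attend`, `dual_attention_context`) and dimensions are hypothetical.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(query, keys):
    # scaled dot-product attention: score each key against the query,
    # normalize to weights, and return the weighted sum as the context
    scores = keys @ query / np.sqrt(query.size)
    weights = softmax(scores)
    return weights @ keys, weights

def dual_attention_context(hidden, region_feats, tag_embs):
    # visual attention over CNN region features
    v_ctx, _ = attend(hidden, region_feats)
    # textual attention over embeddings of FCN-predicted tags
    t_ctx, _ = attend(hidden, tag_embs)
    # fuse the two contexts to guide the decoder at this time step
    return np.concatenate([v_ctx, t_ctx])

rng = np.random.default_rng(0)
h = rng.normal(size=8)               # decoder hidden state (toy size)
regions = rng.normal(size=(49, 8))   # e.g. a 7x7 CNN feature map, flattened
tags = rng.normal(size=(5, 8))       # embeddings of 5 predicted tags
ctx = dual_attention_context(h, regions, tags)
print(ctx.shape)  # (16,) -- visual and textual contexts concatenated
```

In the paper's full model this fused context would condition the RNN decoder's next word prediction, and the tag-generation and caption-generation losses are trained jointly end to end.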
- Is Part Of:
- Information processing & management. Volume 57:Issue 2(2020:Mar.)
- Journal:
- Information processing & management
- Issue:
- Volume 57:Issue 2(2020:Mar.)
- Issue Display:
- Volume 57, Issue 2 (2020)
- Year:
- 2020
- Volume:
- 57
- Issue:
- 2
- Issue Sort Value:
- 2020-0057-0002-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-03
- Subjects:
- Image caption generation -- Textual attention -- Visual attention -- Dual attention -- Fully convolutional network
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2019.102178
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12552.xml