Revisiting image captioning via maximum discrepancy competition. (February 2022)

Record Type:: Journal Article
Title:: Revisiting image captioning via maximum discrepancy competition. (February 2022)
Main Title:: Revisiting image captioning via maximum discrepancy competition
Authors:: Wan, Boyang
Jiang, Wenhui
Fang, Yu-Ming
Zhu, Minwei
Li, Qin
Liu, Yang
Abstract:: Highlights: We propose a new model comparison method that does not require an unaffordable large-scale subjective annotation experiment. A new similarity function named NGSM is proposed as a semantic distance measure to model the discrepancy between captions. With NGSM, informative images can be selected effectively from an arbitrary large-scale raw image dataset. We demonstrate quantitative results on the generalization ability of the competing ICMs and provide a detailed analysis of the key factor in improving the generalization ability of ICMs. Abstract: Image captioning has been a hot research topic bridging computer vision and natural language processing during the past several decades. It has achieved great progress with the help of large-scale datasets and deep learning techniques. Despite the variety of image captioning models (ICMs), their performance has reached a bottleneck, judging from the publicly published results. Considering the marginal performance gains brought by recent ICMs, we raise the following question: "How do recent ICMs perform on in-the-wild images?" To answer this question, we compare existing ICMs by evaluating their generalization ability. Specifically, we propose a novel method based on maximum discrepancy competition to diagnose existing ICMs. First, we establish a new test set containing only informative images selected by applying maximum discrepancy competition to the existing ICMs, from an arbitrary large-scale raw …
Is Part Of:: Pattern recognition. Volume 122(2022)
Journal:: Pattern recognition
Issue:: Volume 122(2022)
Issue Display:: Volume 122, Issue 2022 (2022)
Year:: 2022
Volume:: 122
Issue:: 2022
Issue Sort Value:: 2022-0122-2022-0000
Page Start:
Page End:
Publication Date:: 2022-02
Subjects:: Image captioning -- Model comparison -- Attention mechanism
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
Journal URLs:: http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
DOI:: 10.1016/j.patcog.2021.108358
Languages:: English
ISSNs:: 0031-3203
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 19718.xml