BERT-hLSTMs: BERT and hierarchical LSTMs for visual storytelling. (May 2021)

Record Type:: Journal Article
Title:: BERT-hLSTMs: BERT and hierarchical LSTMs for visual storytelling. (May 2021)
Main Title:: BERT-hLSTMs: BERT and hierarchical LSTMs for visual storytelling
Authors:: Su, Jing
Dai, Qingyun
Guerin, Frank
Zhou, Mian
Abstract:: Highlights: We propose a novel end-to-end BERT-dcLSTM framework to automatically generate coherent descriptions for sequential images. Firstly, we have exploited a pre-training (BERT) model to obtain sentence representation and word representation of the dataset which can efficiently enrich the meaning of sentences. Furthermore, a dual chain LSTM (dcLSTM) has been used to learn the relations between sequential images and corresponding descriptions and generate more coherent descriptions for sequential images. We have evaluated the performance of our proposed approach on the Sequential Image Narrative Dataset (SIND). Experimental results show that our proposed model outperforms the other closely related baselines under the automatic metrics BLEU and CIDEr, and can generate more consistent descriptions and efficiently learn the dependencies between images.
Abstract: Visual storytelling is a creative and challenging task, aiming to automatically generate a story-like description for a sequence of images. The descriptions generated by previous visual storytelling approaches lack coherence because they use word-level sequence generation methods and do not adequately consider sentence-level dependencies. To tackle this problem, we propose a novel hierarchical visual storytelling framework which separately models sentence-level and word-level semantics. We use the transformer-based BERT to obtain embeddings for sentences and words. We then employ a hierarchical LSTM network: the …
Is Part Of:: Computer speech & language. Volume 67(2021)
Journal:: Computer speech & language
Issue:: Volume 67(2021)
Issue Display:: Volume 67, Issue 2021 (2021)
Year:: 2021
Volume:: 67
Issue:: 2021
Issue Sort Value:: 2021-0067-2021-0000
Page Start:
Page End:
Publication Date:: 2021-05
Subjects:: Visual storytelling -- BERT -- Hierarchical LSTMs -- Sentence vector
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2020.101169
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 15407.xml