Topic-aware video summarization using multimodal transformer. (August 2023)
- Record Type:
- Journal Article
- Title:
- Topic-aware video summarization using multimodal transformer. (August 2023)
- Main Title:
- Topic-aware video summarization using multimodal transformer
- Authors:
- Zhu, Yubo
Zhao, Wentian
Hua, Rui
Wu, Xinxiao
- Abstract:
- Highlights: A new task that aims to generate multiple video summaries to meet different user interests. A new TopicSum dataset containing 136 videos to support the study of this task. A multimodal Transformer model that simultaneously predicts topics and generates topic-related summaries by fusing multimodal features. Extensive experiments on the TopicSum dataset that show the effectiveness of the method in both quantitative and qualitative evaluations.
Abstract: Video summarization aims to generate a short and compact summary that represents the original video. Existing methods mainly focus on extracting a general, objective synopsis that precisely summarizes the video content. However, in real scenarios, a video usually contains rich content covering multiple topics, and different people may be interested in different aspects of the same video. In this paper, we propose a novel topic-aware video summarization task that generates multiple video summaries with different topics. To support the study of this new task, we first build a video benchmark dataset by collecting videos from various types of movies and annotating them with topic labels and frame-level importance scores. We then propose a multimodal Transformer model for topic-aware video summarization, which simultaneously predicts topic labels and generates topic-related summaries by adaptively fusing multimodal features extracted from the video. Experimental results show the effectiveness of our method.
- Is Part Of:
- Pattern recognition. Volume 140(2023)
- Journal:
- Pattern recognition
- Issue:
- Volume 140(2023)
- Issue Display:
- Volume 140, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 140
- Issue:
- 2023
- Issue Sort Value:
- 2023-0140-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-08
- Subjects:
- Topic-aware video summarization -- Multimodal transformer -- Video summarization dataset
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2023.109578
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 27019.xml