Weakly-supervised action localization via embedding-modeling iterative optimization. (May 2021)
- Record Type:
- Journal Article
- Title:
- Weakly-supervised action localization via embedding-modeling iterative optimization. (May 2021)
- Main Title:
- Weakly-supervised action localization via embedding-modeling iterative optimization
- Authors:
- Zhang, Xiao-Yu
Shi, Haichao
Li, Changsheng
Li, Peng
Li, Zekun
Ren, Peng
- Abstract:
- Highlights: An iterative optimizing mechanism is designed for effective action recognition with trimmed-untrimmed parallel modeling. A shared subspace embedding is proposed with generative adversarial networks for coherent knowledge transfer. A two-stage self-attentive representation learning workflow is developed to capture the fine-grained frame-level relevance. Extensive experiments are conducted on two challenging untrimmed video datasets with superior results.
Abstract: Action recognition and localization in untrimmed videos in weakly supervised scenario is a challenging problem of great application prospects. Limited by the information available in video-level labels, it is a promising attempt to fully leverage the instructive knowledge learned on trimmed videos to facilitate analysis of untrimmed videos, considering that there are abundant trimmed videos which are publicly available and well segmented with semantic descriptions. In order to enforce effective trimmed-untrimmed augmentation, this paper presents a novel framework of embedding-modeling iterative optimization network, referred to as IONet. In the proposed method, action classification modeling and shared subspace embedding are learned jointly in an iterative way, so that robust cross-domain knowledge transfer is achieved. With a carefully designed two-stage self-attentive representation learning workflow for untrimmed videos, irrelevant backgrounds are eliminated and fine-grained temporal relevance can be robustly explored. Extensive experiments are conducted on two benchmark datasets, i.e., THUMOS14 and ActivityNet1.3, and experimental results clearly corroborate the efficacy of our method. Source code is available on GitHub.
- Is Part Of:
- Pattern recognition. Volume 113(2021)
- Journal:
- Pattern recognition
- Issue:
- Volume 113(2021)
- Issue Display:
- Volume 113, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 113
- Issue:
- 2021
- Issue Sort Value:
- 2021-0113-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-05
- Subjects:
- Action recognition -- Temporal action localization -- Attention mechanism -- Generative adversarial networks -- Subspace embedding
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2021.107831
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15807.xml