Spatiotemporal distilled dense-connectivity network for video action recognition. (August 2019)
- Record Type:
- Journal Article
- Title:
- Spatiotemporal distilled dense-connectivity network for video action recognition. (August 2019)
- Main Title:
- Spatiotemporal distilled dense-connectivity network for video action recognition
- Authors:
- Hao, Wangli
Zhang, Zhaoxiang
- Abstract:
- Highlights: We propose a novel Spatiotemporal Distilled Dense-Connectivity Network (STDDCN) for action recognition. We generalize dense-connectivity to the spatiotemporal domain by building block-level dense connections between appearance and motion streams, permitting effective spatiotemporal interaction at the feature representation layers. We propose a novel knowledge distillation module, composed of two students and a teacher, allowing appearance and motion streams to interact effectively at the high-level layers. Our model obtains promising performance in action recognition on two benchmark datasets, UCF101 and HMDB51.
Abstract: Two-stream convolutional neural networks show great promise for action recognition tasks. However, most two-stream based approaches train the appearance and motion subnetworks independently, which may lead to a decline in performance due to the lack of interaction between the two streams. To overcome this limitation, we propose a Spatiotemporal Distilled Dense-Connectivity Network (STDDCN) for video action recognition. This network implements both knowledge distillation and dense-connectivity (adapted from DenseNet). Using this STDDCN architecture, we aim to explore interaction strategies between appearance and motion streams along different hierarchies. Specifically, block-level dense connections between appearance and motion pathways enable spatiotemporal interaction at the feature representation layers. Moreover, knowledge distillation between the two streams (each treated as a student) and their last fusion (treated as the teacher) allows both streams to interact at the high-level layers. The special architecture of STDDCN allows it to gradually obtain effective hierarchical spatiotemporal features. Moreover, it can be trained end-to-end. Finally, numerous ablation studies validate the effectiveness and generalization of our model on two benchmark datasets, UCF101 and HMDB51. Simultaneously, our model achieves promising performance.
- Is Part Of:
- Pattern recognition. Volume 92(2019:Aug.)
- Journal:
- Pattern recognition
- Issue:
- Volume 92(2019:Aug.)
- Issue Display:
- Volume 92 (2019)
- Year:
- 2019
- Volume:
- 92
- Issue Sort Value:
- 2019-0092-0000-0000
- Page Start:
- 13
- Page End:
- 24
- Publication Date:
- 2019-08
- Subjects:
- Two-stream -- Action recognition -- Dense-connectivity -- Knowledge distillation
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2019.03.005
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9993.xml