A hierarchical model for learning to understand head gesture videos. (January 2022)
- Record Type:
- Journal Article
- Title:
- A hierarchical model for learning to understand head gesture videos. (January 2022)
- Main Title:
- A hierarchical model for learning to understand head gesture videos
- Authors:
- Li, Jiachen
Xu, Songhua
Qin, Xueying
- Abstract:
- Highlights: Propose a hierarchical model to understand head gesture videos. Utilize the multi-task learning framework. Compress features using stacked BLSTM. Propose several applications in multiple fields.
Abstract: Head gesture videos recorded of a person bear rich information about the individual. Automatically understanding these videos can empower many useful human-centered applications in areas such as smart health, education, work safety and security. To understand a video's content, low-level head gesture signals carried in the video that capture characteristics of both human postures and motions need to be translated into high-level semantic labels. To meet this aim, we propose a hierarchical model for learning to understand head gesture videos. Given a head gesture video of an arbitrary length, the model first segments the full-length video into multiple short clips for clip-based feature extraction. Multiple base feature extraction procedures are then independently tuned via a set of peripheral learning tasks without consuming any labels of the goal task. These independently derived base features are subsequently aggregated through a multi-task learning framework, coupled with a feature dimensionality reduction module, to optimally learn to accomplish the end video understanding task in a weakly supervised manner, utilizing the limited amount of video labels available of the goal task. Experimental results show that the hierarchical model is superior to multiple state-of-the-art peer methods in tackling versatile video understanding tasks.
- Is Part Of:
- Pattern recognition. Volume 121(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 121(2022)
- Issue Display:
- Volume 121, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 121
- Issue:
- 2022
- Issue Sort Value:
- 2022-0121-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-01
- Subjects:
- Video understanding -- Head gesture videos -- Stacked BLSTM -- Multi-task learning -- Transfer learning
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2021.108256
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 23804.xml