Stacked residual blocks based encoder–decoder framework for human motion prediction. Issue 4 (29th October 2020)
- Record Type:
- Journal Article
- Title:
- Stacked residual blocks based encoder–decoder framework for human motion prediction. Issue 4 (29th October 2020)
- Main Title:
- Stacked residual blocks based encoder–decoder framework for human motion prediction
- Authors:
- Liu, Xiaoli
Yin, Jianqin
- Abstract:
- Human motion prediction is an important and challenging task in computer vision with various applications. Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) have been proposed to address this challenging task. However, RNNs exhibit their limitations on long‐term temporal modelling and spatial modelling of motion signals. CNNs show their inflexible spatial and temporal modelling capability that mainly depends on a large convolutional kernel and the stride of convolutional operation. Moreover, those methods predict multiple future poses recursively, which easily suffer from noise accumulation. The authors present a new encoder–decoder framework based on the residual convolutional block with a small filter to predict future human poses, which can flexibly capture the hierarchical spatial and temporal representation of the human motion signals from the motion capture sensor. Specifically, the encoder is stacked by multiple residual convolutional blocks to hierarchically encode the spatio‐temporal features of previous poses. The decoder is built with two fully connected layers to automatically reconstruct the spatial and temporal information of future poses in a non‐recursive manner, which can avoid noise accumulation that differs from prior works. Experimental results show that the proposed method outperforms baselines on the Human3.6M dataset, which shows the effectiveness of the proposed method. The code is available at https://github.com/lily2lab/residual_prediction_network .
- Is Part Of:
- Cognitive computation and systems. Volume 2:Issue 4(2020)
- Journal:
- Cognitive computation and systems
- Issue:
- Volume 2:Issue 4(2020)
- Issue Display:
- Volume 2, Issue 4 (2020)
- Year:
- 2020
- Volume:
- 2
- Issue:
- 4
- Issue Sort Value:
- 2020-0002-0004-0000
- Page Start:
- 242
- Page End:
- 246
- Publication Date:
- 2020-10-29
- Subjects:
- recurrent neural nets -- feature extraction -- decoding -- motion estimation -- pose estimation -- computer vision -- image coding -- convolutional neural nets
long‐term temporal modelling -- spatial modelling -- convolutional neural networks -- RNNs -- recurrent neural networks -- human motion prediction -- stacked residual blocks -- Human3.6M dataset -- temporal information -- spatial information -- spatio‐temporal features -- multiple residual convolutional blocks -- motion capture sensor -- human motion signals -- temporal representation -- hierarchical spatial representation -- future human poses -- residual convolutional block -- encoder–decoder framework -- noise accumulation -- multiple future poses -- convolutional operation -- convolutional kernel -- temporal modelling capability -- inflexible spatial capability -- CNNs
Cognitive science -- Periodicals
Artificial intelligence -- Periodicals
Neurosciences -- Periodicals
Computer science -- Periodicals
Neurosciences
Computer science
Cognitive science
Artificial intelligence
Periodicals
Electronic journals
006.3
- Journal URLs:
- https://digital-library.theiet.org/content/journals/ccs
https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=8694204
https://ietresearch.onlinelibrary.wiley.com/loi/25177567
http://www.theiet.org/
- DOI:
- 10.1049/ccs.2020.0008
- Languages:
- English
- ISSNs:
- 2517-7567
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 16401.xml
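
The abstract in this record describes an encoder stacked from residual convolutional blocks with a small filter, followed by a decoder of two fully connected layers that outputs all future poses in one non-recursive pass. The sketch below illustrates that data flow only. It is not the authors' implementation (their code is at the repository named in the abstract). The 3x3 single-channel kernel, the ReLU activations, the layer sizes, the random weights, and every function name here are assumptions chosen for illustration.

```python
import random

def conv3x3_same(x, k):
    """3x3 convolution with zero padding over a (time x joint-dim) grid."""
    T, D = len(x), len(x[0])
    out = [[0.0] * D for _ in range(T)]
    for t in range(T):
        for d in range(D):
            s = 0.0
            for i in (-1, 0, 1):
                for j in (-1, 0, 1):
                    if 0 <= t + i < T and 0 <= d + j < D:
                        s += k[i + 1][j + 1] * x[t + i][d + j]
            out[t][d] = s
    return out

def residual_block(x, k):
    """y = x + ReLU(conv(x)); the skip connection preserves the input shape."""
    h = conv3x3_same(x, k)
    return [[xv + max(0.0, hv) for xv, hv in zip(xr, hr)]
            for xr, hr in zip(x, h)]

def dense(v, w, b):
    """Fully connected layer; w holds one row of weights per output unit."""
    return [sum(wi * vi for wi, vi in zip(row, v)) + bi
            for row, bi in zip(w, b)]

def rand_matrix(rng, rows, cols, scale=0.1):
    """Hypothetical untrained weights, drawn uniformly for the demo."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]

def predict(past, kernels, w1, b1, w2, b2, n_future):
    """Encode past poses with stacked residual blocks, then decode every
    future frame at once (no prediction is fed back as input)."""
    h = past
    for k in kernels:                      # encoder: stacked residual blocks
        h = residual_block(h, k)
    flat = [v for row in h for v in row]   # flatten the encoded features
    hidden = [max(0.0, v) for v in dense(flat, w1, b1)]  # decoder layer 1
    out = dense(hidden, w2, b2)                          # decoder layer 2
    D = len(past[0])
    return [out[t * D:(t + 1) * D] for t in range(n_future)]

# Toy sizes (assumed): 5 observed frames, 3 predicted frames, 4 values per pose.
rng = random.Random(0)
T_IN, T_OUT, D, HIDDEN = 5, 3, 4, 8
kernels = [rand_matrix(rng, 3, 3) for _ in range(2)]
w1, b1 = rand_matrix(rng, HIDDEN, T_IN * D), [0.0] * HIDDEN
w2, b2 = rand_matrix(rng, T_OUT * D, HIDDEN), [0.0] * (T_OUT * D)
past = rand_matrix(rng, T_IN, D, scale=1.0)
future = predict(past, kernels, w1, b1, w2, b2, T_OUT)
print(len(future), "future frames of", len(future[0]), "values each")
```

Because the decoder emits all `T_OUT` frames from a single forward pass, no predicted frame is reused as input, which is the property the abstract credits with avoiding noise accumulation.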