Combination of temporal‐channels correlation information and bilinear feature for action recognition. Issue 8 (23rd November 2020)
- Record Type:
- Journal Article
- Title:
- Combination of temporal‐channels correlation information and bilinear feature for action recognition. Issue 8 (23rd November 2020)
- Main Title:
- Combination of temporal‐channels correlation information and bilinear feature for action recognition
- Authors:
- Cai, Jiahui
Hu, Jianguo
Li, Shiren
Lin, Jialing
Wang, Jun
- Abstract:
- In this study, the authors focus on improving the spatio–temporal representation ability of three‐dimensional (3D) convolutional neural networks (CNNs) in the video domain. They observe two unfavourable issues: (i) the convolutional filters only dedicate to learning local representation along input channels. Also they treat channel‐wise features equally, without emphasising the important features; (ii) traditional global average pooling layer only captures first‐order statistics, ignoring finer detail features useful for classification. To mitigate these problems, they proposed two modules to boost 3D CNNs' performance, which are temporal‐channel correlation (TCC) and bilinear pooling module. The TCC module can capture the information of inter‐channel correlations over the temporal domain. Moreover, the TCC module generates channel‐wise dependencies, which can adaptively re‐weight the channel‐wise features. Therefore, the network can focus on learning important features. With regards to the bilinear pooling module, it can capture more complex second‐order statistics in deep features and generate a second‐order classification vector. We can get more accurate classification results by combining the first‐order and second‐order classification vector. Extensive experiments show that adding our proposed modules to I3D network could consistently improve the performance and outperform the state‐of‐the‐art methods. The code and models are available at https://github.com/caijh33/I3D_TCC_Bilinear .
- Is Part Of:
- IET computer vision. Volume 14:Issue 8(2020)
- Journal:
- IET computer vision
- Issue:
- Volume 14:Issue 8(2020)
- Issue Display:
- Volume 14, Issue 8 (2020)
- Year:
- 2020
- Volume:
- 14
- Issue:
- 8
- Issue Sort Value:
- 2020-0014-0008-0000
- Page Start:
- 634
- Page End:
- 641
- Publication Date:
- 2020-11-23
- Subjects:
- learning (artificial intelligence) -- vectors -- object recognition -- image representation -- image classification -- stereo image processing -- convolutional neural nets -- correlation methods -- higher order statistics -- video signal processing
temporal‐channels correlation information -- bilinear feature -- three‐dimensional convolutional neural networks -- video domain -- 3D CNN -- convolutional filters -- channel‐wise features -- first‐order statistics -- temporal‐channel correlation -- bilinear pooling module -- TCC module -- channel‐wise dependencies -- second‐order statistics -- deep features -- second‐order classification vector -- first‐order classification vector -- spatio–temporal representation -- I3D network
Computer vision -- Periodicals
Pattern recognition systems -- Periodicals
006.37
- Journal URLs:
- http://digital-library.theiet.org/content/journals/iet-cvi
http://www.ietdl.org/IET-CVI
https://ietresearch.onlinelibrary.wiley.com/journal/17519640
http://www.theiet.org/
- DOI:
- 10.1049/iet-cvi.2020.0023
- Languages:
- English
- ISSNs:
- 1751-9632
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4363.252250
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 16685.xml