Video anomaly detection with spatio-temporal dissociation. (February 2022)
- Record Type:
- Journal Article
- Title:
- Video anomaly detection with spatio-temporal dissociation. (February 2022)
- Main Title:
- Video anomaly detection with spatio-temporal dissociation
- Authors:
- Chang, Yunpeng
Tu, Zhigang
Xie, Wei
Luo, Bin
Zhang, Shifu
Sui, Haigang
Yuan, Junsong
- Abstract:
- Highlights: We propose a novel autoencoder architecture that dissociates the spatio-temporal representation and learns the regularity in both the spatial and motion feature spaces to detect anomalies in videos. We design an efficient motion autoencoder, which takes consecutive video frames as input and the RGB difference as output to imitate the movement of optical flow. The proposed method is much faster than the optical flow-based motion representation learning approach, with an average running speed of 32 fps. We exploit a variance attention module to automatically assign an importance weight to the moving part of video clips, which improves the performance of the motion autoencoder. To learn the normality in both the spatial and motion feature spaces, we concatenate the representations extracted from the two streams at the same spatial location, and optimize the two streams and the deep K-means cluster jointly with the early fusion strategy. We fuse the spatio-temporal information with their distance from the deep K-means cluster at the pixel level to calculate the anomaly score. Compared with our prior frame-level fusion scheme, experimental results show that the performance of the new architecture is improved.
Abstract: Anomaly detection in videos remains a challenging task due to the ambiguous definition of anomaly and the complexity of visual scenes in real video data. Unlike previous work, which uses reconstruction or prediction as an auxiliary task to learn the temporal regularity, in this work we explore a novel convolutional autoencoder architecture that dissociates the spatio-temporal representation to capture the spatial and the temporal information separately, since abnormal events usually differ from normal ones in appearance and/or motion behavior. Specifically, the spatial autoencoder models the normality in the appearance feature space by learning to reconstruct the input of the first individual frame (FIF), while the temporal part takes the first four consecutive frames as the input and the RGB difference as the output to simulate the motion of optical flow in an efficient way. Abnormal events, which are irregular in appearance or in motion behavior, lead to a large reconstruction error. To improve detection performance on fast-moving outliers, we exploit a variance-based attention module and insert it into the motion autoencoder to highlight large movement areas. In addition, we propose a deep K-means cluster strategy to force the spatial and the motion encoder to extract a compact representation. Extensive experiments on several publicly available datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance. The code is publicly released at the link 1. …
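Illustrative note (not part of the catalogued record): the abstract names three building blocks, an RGB-difference target for the motion stream, a variance-based attention map, and a distance to deep K-means cluster centres. The NumPy sketch below shows one plausible reading of each. All function names, array shapes, the use of consecutive-frame differences, and the softmax normalisation are assumptions made for illustration; they are not taken from the authors' released code.

```python
import numpy as np


def rgb_difference_target(frames: np.ndarray) -> np.ndarray:
    """Differences between consecutive frames.

    frames has shape (T, H, W, C); the result has shape (T - 1, H, W, C).
    Assumption: the motion target is the frame-to-frame RGB difference.
    """
    frames = frames.astype(np.float32)
    return frames[1:] - frames[:-1]


def variance_attention(features: np.ndarray) -> np.ndarray:
    """Spatial attention weights from per-location channel variance.

    features has shape (H, W, D). Assumption: the variance is taken over
    the channel axis and normalised with a softmax over all locations.
    """
    variance = features.var(axis=-1)        # shape (H, W)
    logits = variance - variance.max()      # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()          # weights sum to 1 over (H, W)


def cluster_distance(features: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Squared Euclidean distance to the nearest cluster centre per location.

    features has shape (H, W, D); centers has shape (K, D).
    """
    diff = features[:, :, None, :] - centers[None, None, :, :]  # (H, W, K, D)
    return (diff ** 2).sum(axis=-1).min(axis=-1)                # (H, W)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.random((5, 8, 8, 3))         # hypothetical 5-frame RGB clip
    feats = rng.random((4, 4, 16))          # hypothetical encoder features
    centers = rng.random((3, 16))           # hypothetical K = 3 centres
    print(rgb_difference_target(clip).shape)
    print(variance_attention(feats).shape)
    print(cluster_distance(feats, centers).shape)
```

In this reading, locations with a high attention weight or a large cluster distance would contribute more to a pixel-level anomaly score; how the published method actually combines these terms is not stated in this record.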
- Is Part Of:
- Pattern recognition. Volume 122(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 122(2022)
- Issue Display:
- Volume 122, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 122
- Issue:
- 2022
- Issue Sort Value:
- 2022-0122-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-02
- Subjects:
- Video anomaly detection -- Spatio-temporal dissociation -- Simulate motion of optical flow -- Deep K-means cluster
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2021.108213
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19718.xml