Maximization and restoration: Action segmentation through dilation passing and temporal reconstruction. (September 2022)
- Record Type:
- Journal Article
- Title:
- Maximization and restoration: Action segmentation through dilation passing and temporal reconstruction. (September 2022)
- Main Title:
- Maximization and restoration: Action segmentation through dilation passing and temporal reconstruction
- Authors:
- Park, Junyong
Kim, Daekyum
Huh, Sejoon
Jo, Sungho
- Abstract:
- Highlights: We propose a divide-and-conquer method that first maximizes frame accuracy and then reconstructs the features to reduce over-segmentation. Dilation passing network propagates long- and short-range features enabling better understanding of the relation between frames. Temporal reconstruction network uses a convolutional encoder-decoder to capture local context for temporal consistency among frames. Our model achieves meaningful results over the state-of-the-art models on three challenging datasets.
Abstract: Action segmentation aims to split videos into segments of different actions. Recent work focuses on dealing with long-range dependencies of long, untrimmed videos, but still suffers from over-segmentation and performance saturation due to increased model complexity. This paper addresses the aforementioned issues through a divide-and-conquer strategy that first maximizes the frame-wise classification accuracy of the model and then reduces the over-segmentation errors. This strategy is implemented with the Dilation Passing and Reconstruction Network, composed of the Dilation Passing Network, which primarily aims to increase accuracy by propagating information of different dilations, and the Temporal Reconstruction Network, which reduces over-segmentation errors by temporally encoding and decoding the output features from the Dilation Passing Network. We also propose a weighted temporal mean squared error loss that further reduces over-segmentation. Through evaluations on the 50Salads, GTEA, and Breakfast datasets, we show that our model achieves significant results compared to existing state-of-the-art models.
- Is Part Of:
- Pattern recognition. Volume 129(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 129(2022)
- Issue Display:
- Volume 129, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 129
- Issue:
- 2022
- Issue Sort Value:
- 2022-0129-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-09
- Subjects:
- Action segmentation -- Temporal segmentation -- Video understanding
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2022.108764
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22275.xml