ThumbDet: One thumbnail image is enough for object detection. (June 2023)
- Record Type:
- Journal Article
- Title:
- ThumbDet: One thumbnail image is enough for object detection. (June 2023)
- Main Title:
- ThumbDet: One thumbnail image is enough for object detection
- Authors:
- Zhang, Yongqiang
Zhang, Yin
Tian, Rui
Zhang, Zian
Bai, Yancheng
Zuo, Wangmeng
Ding, Mingli
- Abstract:
- Highlights: A novel Transformer-based architecture is proposed for object detection on very low-resolution images. An image down-sampling module uses the feature extraction capabilities of CNNs to generate thumbnail images from the original ones. A distillation-boost supervision strategy is introduced to keep the detection performance on thumbnail images close to that of original-size inputs. Our method achieves satisfactory detection performance (i.e. 32.3% in mAP) while drastically reducing computation and memory requirements (a 1.26× speed-up). Our method outperforms the bicubic method by a large margin (3.2% in mAP) and achieves results comparable to the SOTA method on the COCO dataset.
Abstract: Computer vision has witnessed great success thanks to deep convolutional neural networks (CNNs). However, state-of-the-art methods often benefit from large models and datasets, which introduce heavy parameter and computational requirements. Deploying such large models in real-world applications is very difficult because of limited computing resources. Although many researchers focus on designing efficient block structures to compress model parameters, they overlook that large-scale input images are also an important factor in algorithm efficiency. Reducing input resolution is a useful way to boost runtime efficiency; however, traditional interpolation methods assume a fixed degradation criterion that greatly hurts performance. To solve the above problems, in this paper, we propose a novel framework named ThumbDet for reducing model computation while maintaining detection accuracy. In our framework, we first design an image down-sampling module to learn a small-scale image that looks realistic and contains discriminative properties. Furthermore, we propose a distillation-boost supervision strategy to keep the detection performance on small-scale images close to that of original-size inputs. Extensive experiments conducted on the standard object detection dataset MS COCO demonstrate the effectiveness of the proposed method when using very low-resolution images (i.e. 4× down-sampling) as inputs. In particular, ThumbDet achieves satisfactory detection performance (i.e. 32.3% in mAP) while drastically reducing computation and memory requirements (i.e. a speed-up of 1.26×), outperforming traditional interpolation methods (e.g. bicubic) by an absolute +3.2% in terms of mAP.
- Is Part Of:
- Pattern recognition. Volume 138(2023)
- Journal:
- Pattern recognition
- Issue:
- Volume 138(2023)
- Issue Display:
- Volume 138, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 138
- Issue:
- 2023
- Issue Sort Value:
- 2023-0138-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-06
- Subjects:
- Object detection -- Down-sampling network -- Knowledge distillation
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2023.109424
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 26088.xml