STDnet: Exploiting high resolution feature maps for small object detection. (May 2020)
- Record Type:
- Journal Article
- Title:
- STDnet: Exploiting high resolution feature maps for small object detection. (May 2020)
- Main Title:
- STDnet: Exploiting high resolution feature maps for small object detection
- Authors:
- Bosquet, Brais
Mucientes, Manuel
Brea, Víctor M.
- Abstract:
- The accuracy of small object detection with convolutional neural networks (ConvNets) lags behind that of larger objects. This can be observed in popular contests like MS COCO. This is in part caused by the lack of specific architectures and of datasets with a sufficiently large number of small objects. Our work addresses these two issues. First, this paper introduces STDnet, a convolutional neural network focused on the detection of small objects, which we define as those under 16 × 16 pixels. The high performance of STDnet is built on a novel early visual attention mechanism, called Region Context Network (RCN), which chooses the most promising regions while discarding the rest of the input image. Processing only specific areas allows STDnet to keep high resolution feature maps in deeper layers, with low memory overhead and higher frame rates. High resolution feature maps proved to be key to increasing localization accuracy for such small objects. Second, we also present USC-GRAD-STDdb, a video dataset with more than 56,000 annotated small objects in challenging scenarios. Experimental results over USC-GRAD-STDdb show that STDnet improves the AP@.5 of the best state-of-the-art object detectors for small target detection from 50.8% to 57.4%. Performance has also been tested in MS COCO for objects under 16 × 16 pixels. In addition, a spatio-temporal baseline network, STDnet-bST, has been proposed to make use of the information of successive frames, increasing the AP@.5 of STDnet by 2.3%. Finally, optimizations have been carried out to fit embedded devices such as the Jetson TX2.
Highlights: Object detection does not contemplate the study of small objects under 16 × 16 pixels. STDnet is built on a novel early visual attention mechanism to select promising areas. High-resolution feature maps are key to increasing accuracy for small objects. USC-GRAD-STDdb is a video dataset with more than 56,000 annotated small objects. STDnet outperforms its state-of-the-art counterparts for small target detection.
- Is Part Of:
- Engineering applications of artificial intelligence. Volume 91(2020)
- Journal:
- Engineering applications of artificial intelligence
- Issue:
- Volume 91(2020)
- Issue Display:
- Volume 91, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 91
- Issue:
- 2020
- Issue Sort Value:
- 2020-0091-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-05
- Subjects:
- Small object detection -- Convolution neural networks (ConvNets) -- Deep learning
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.engappai.2020.103615
- Languages:
- English
- ISSNs:
- 0952-1976
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13419.xml