Semantic-aware scene recognition. (June 2020)
- Record Type:
- Journal Article
- Title:
- Semantic-aware scene recognition. (June 2020)
- Main Title:
- Semantic-aware scene recognition
- Authors:
- López-Cifuentes, Alejandro
Escudero-Viñolo, Marcos
Bescós, Jesús
García-Martín, Álvaro
- Abstract:
- Highlights: A novel approach for scene recognition based on an end-to-end multi-modal CNN that combines image and context information by means of an attention module. Context information, in the shape of semantic segmentation, is used to gate features extracted from an RGB image. The gating process reinforces the learning of indicative scene content and enhances scene disambiguation. The proposed approach outperforms the state of the art while significantly reducing the number of network parameters.
Abstract: Scene recognition is currently one of the most challenging research fields in computer vision. This may be due to the ambiguity between classes: images of several scene classes may share similar objects, which causes confusion among them. The problem is aggravated when images of a particular scene class are notably different. Convolutional Neural Networks (CNNs) have significantly boosted performance in scene recognition, although it remains far below that of other recognition tasks (e.g., object or image recognition). In this paper, we describe a novel approach for scene recognition based on an end-to-end multi-modal CNN that combines image and context information by means of an attention module. Context information, in the shape of a semantic segmentation, is used to gate features extracted from the RGB image by leveraging information encoded in the semantic representation: the set of scene objects and stuff, and their relative locations. This gating process reinforces the learning of indicative scene content and enhances scene disambiguation by refocusing the receptive fields of the CNN towards them. Experimental results on three publicly available datasets show that the proposed approach outperforms every other state-of-the-art method while significantly reducing the number of network parameters. All the code and data used in this paper are available at: https://github.com/vpulab/Semantic-Aware-Scene-Recognition
- Is Part Of:
- Pattern recognition. Volume 102(2020:Jun.)
- Journal:
- Pattern recognition
- Issue:
- Volume 102(2020:Jun.)
- Issue Display:
- Volume 102 (2020)
- Year:
- 2020
- Volume:
- 102
- Issue Sort Value:
- 2020-0102-0000-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-06
- Subjects:
- Scene recognition -- Deep learning -- Convolutional neural networks -- Semantic segmentation
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2020.107256
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12955.xml