Understanding image concepts using ISTOP model. (May 2016)
- Record Type:
- Journal Article
- Title:
- Understanding image concepts using ISTOP model. (May 2016)
- Main Title:
- Understanding image concepts using ISTOP model
- Authors:
- Zarchi, M.S.
Tan, R.T.
van Gemeren, C.
Monadjemi, A.
Veltkamp, R.C.
- Abstract:
- This paper focuses on recognizing image concepts by introducing the ISTOP model. The model parses images from the scene down to an object's parts by using a context-sensitive grammar. Since there is a gap between the scene and object levels, this grammar proposes the "Visual Term" level to bridge the gap. A visual term is a higher concept level than the object level, representing a few co-occurring objects. The grammar used in the model can be embodied in an And–Or graph representation. The hierarchical structure of the graph decomposes an image from the scene level into the visual term, object, and part levels by terminal and non-terminal nodes, while the horizontal links in the graph impose the context and constraints between the nodes. In order to learn the grammar constraints and their weights, we propose an algorithm that can operate on weakly annotated datasets. This algorithm searches the dataset to find visual terms without supervision and then learns the weights of the constraints using a latent SVM. The experimental results on the Pascal VOC dataset show that our model outperforms the state-of-the-art approaches in recognizing image concepts.
Highlights: In understanding an image, there is a significant gap between the scene level and the object level. The ISTOP model can parse an image from the scene level to the visual term, object, and part levels by a context-sensitive grammar. The visual term is a new concept that can bridge the gap between the scene level and the object level. The context used in the grammar can improve object detection as well as visual term detection.
- Is Part Of:
- Pattern recognition. Volume 53(2016:May)
- Journal:
- Pattern recognition
- Issue:
- Volume 53(2016:May)
- Issue Display:
- Volume 53 (2016)
- Year:
- 2016
- Volume:
- 53
- Issue Sort Value:
- 2016-0053-0000-0000
- Page Start:
- 174
- Page End:
- 183
- Publication Date:
- 2016-05
- Subjects:
- Visual term -- Context sensitive grammar -- Latent SVM -- And–Or graph -- Concept recognition -- Image parsing
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2015.11.010
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 1385.xml