A hierarchical recurrent approach to predict scene graphs from a visual‐attention‐oriented perspective. (22nd March 2019)
- Record Type:
- Journal Article
- Title:
- A hierarchical recurrent approach to predict scene graphs from a visual‐attention‐oriented perspective. (22nd March 2019)
- Main Title:
- A hierarchical recurrent approach to predict scene graphs from a visual‐attention‐oriented perspective
- Authors:
- Gao, Wenjing
Zhu, Yonghua
Zhang, Wenjun
Zhang, Ke
Gao, Honghao
- Other Names:
- Pang, Shaoning (guest editor)
Zhang, Xuyun (guest editor)
Ikeda, Kazushi (guest editor)
Puthal, Deepak (guest editor)
Li, Jianxin (guest editor)
Sarrafzadeh, Abdolhossein (guest editor)
- Abstract:
- A scene graph provides a powerful intermediate knowledge structure for various visual tasks, including semantic image retrieval, image captioning, and visual question answering. In this paper, the task of predicting a scene graph for an image is formulated as two connected problems, i.e., recognizing the relationship triplets, structured as <subject‐predicate‐object>, and constructing the scene graph from the recognized relationship triplets. For relationship triplet recognition, we develop a novel hierarchical recurrent neural network with a visual attention mechanism. This model is composed of two attention‐based recurrent neural networks in a hierarchical organization. The first network generates a topic vector for each relationship triplet, whereas the second network predicts each word in that relationship triplet given the topic vector. This approach successfully captures the compositional structure and contextual dependency of an image and the relationship triplets describing its scene. For scene graph construction, an entity localization approach to determine the graph structure is presented with the assistance of available attention information. Then, the procedures for automatically converting the generated relationship triplets into a scene graph are clarified through an algorithm. Extensive experimental results on two widely used data sets verify the feasibility of the proposed approach.
- Is Part Of:
- Computational intelligence. Volume 35:Number 3(2019)
- Journal:
- Computational intelligence
- Issue:
- Volume 35:Number 3(2019)
- Issue Display:
- Volume 35, Issue 3 (2019)
- Year:
- 2019
- Volume:
- 35
- Issue:
- 3
- Issue Sort Value:
- 2019-0035-0003-0000
- Page Start:
- 496
- Page End:
- 516
- Publication Date:
- 2019-03-22
- Subjects:
- hierarchical recurrent neural network -- relationship triplet recognition -- scene graph -- visual attention mechanism
Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/coin.12202
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 11357.xml
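
The abstract describes converting recognized <subject‐predicate‐object> triplets into a scene graph, using entity localization to decide which mentions refer to the same entity. The following is a minimal sketch of that general idea only, not the authors' algorithm: it assumes each entity mention already carries a bounding box, and it merges mentions that share a label and overlap strongly. The names `iou`, `SceneGraph`, `build_scene_graph`, the box format, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical box format: (x1, y1, x2, y2) in pixels.
Box = tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class SceneGraph:
    nodes: list = field(default_factory=list)  # entries: (label, box)
    edges: list = field(default_factory=list)  # entries: (subj_idx, predicate, obj_idx)

    def node_for(self, label: str, box: Box, threshold: float) -> int:
        """Reuse a node with the same label and enough overlap, else add one."""
        for idx, (other_label, other_box) in enumerate(self.nodes):
            if other_label == label and iou(other_box, box) >= threshold:
                return idx
        self.nodes.append((label, box))
        return len(self.nodes) - 1


def build_scene_graph(triplets, threshold: float = 0.5) -> SceneGraph:
    """Each triplet is ((subj_label, subj_box), predicate, (obj_label, obj_box))."""
    graph = SceneGraph()
    for (s_label, s_box), predicate, (o_label, o_box) in triplets:
        s = graph.node_for(s_label, s_box, threshold)
        o = graph.node_for(o_label, o_box, threshold)
        edge = (s, predicate, o)
        if edge not in graph.edges:  # skip duplicate relationships
            graph.edges.append(edge)
    return graph


# Made-up example data: two overlapping "man" mentions and one distant one.
example = [
    (("man", (0, 0, 10, 20)), "riding", ("horse", (5, 10, 30, 30))),
    (("man", (0, 0, 10, 19)), "wearing", ("hat", (2, 0, 6, 3))),
    (("man", (50, 0, 60, 20)), "near", ("horse", (5, 10, 30, 30))),
]
graph = build_scene_graph(example)
```

In this sketch the first two "man" mentions collapse into one node because their boxes overlap almost entirely, while the third stays separate. The paper itself reports using attention information for localization, which this box-overlap stand-in does not reproduce.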