TransNav: spatial sequential transformer network for visual navigation. Issue 5 (31st August 2022)
- Record Type:
- Journal Article
- Title:
- TransNav: spatial sequential transformer network for visual navigation. Issue 5 (31st August 2022)
- Main Title:
- TransNav: spatial sequential transformer network for visual navigation
- Authors:
- Zhou, Kang
Zhang, Huyin
Li, Fei
- Abstract:
- Visual navigation task is to steer an embodied agent finding the given target based on observation. The effective transformer from observation of the agent to visual representation determines the navigation actions and promotes more informed navigation policy. In this work, we propose a spatial sequential transformer network (SSTNet) for learning informative visual representation in deep reinforcement learning. SSTNet is composed by spatial attention probability fused model (SAF) and sequential transformer network (STNet). SAF enforces cross-modal state into visual clues in reinforcement learning. It encodes semantic information about observed objects, as well as spatial information about their location, which jointly exploiting image inter-relations. STNet generates (imagines) the next observations and makes action inference of the aspects most relevant to the target. It decodes the image intra-relations. This way, the agent learns to understand the causality between navigation actions and dynamic changes in observations. SSTNet is conditioned on an auto-regressive model on the desired reward, past states, actions, and knowledge graph. The whole navigation framework considers the local and global visual information, as well as time sequential information. Thus, it allows the agent to navigate towards the sought-after object effectively. We evaluate our model on the AI2THOR framework show that our method attains at least $10\%$ improvement of average success rate over most state-of-the-art models. Code and datasets can be found in https://github.com/zhoukang123/SDTNet_2022 .
- Is Part Of:
- Journal of computational design and engineering. Volume 9:Issue 5(2022)
- Journal:
- Journal of computational design and engineering
- Issue:
- Volume 9:Issue 5(2022)
- Issue Display:
- Volume 9, Issue 5 (2022)
- Year:
- 2022
- Volume:
- 9
- Issue:
- 5
- Issue Sort Value:
- 2022-0009-0005-0000
- Page Start:
- 1866
- Page End:
- 1878
- Publication Date:
- 2022-08-31
- Subjects:
- visual navigation -- knowledge graph -- reinforcement learning -- spatial attention -- transformer network
Engineering -- Data processing -- Periodicals
Computer-aided design -- Periodicals
Computer-aided design
Engineering -- Data processing
Electronic journals
Periodicals
620.0042
- Journal URLs:
- http://bibpurl.oclc.org/web/76338
http://www.jcde.org/
http://www.sciencedirect.com/science/journal/22884300
http://www.journals.elsevier.com/journal-of-computational-design-and-engineering
https://academic.oup.com/jcde
http://www.oxfordjournals.org/
- DOI:
- 10.1093/jcde/qwac084
- Languages:
- English
- ISSNs:
- 2288-4300
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24102.xml