Answering knowledge-based visual questions via the exploration of Question Purpose. (January 2023)
- Record Type:
- Journal Article
- Title:
- Answering knowledge-based visual questions via the exploration of Question Purpose. (January 2023)
- Main Title:
- Answering knowledge-based visual questions via the exploration of Question Purpose
- Authors:
- Song, Lingyun
Li, Jianao
Liu, Jun
Yang, Yang
Shang, Xuequn
Sun, Mingxuan
- Abstract:
- Highlights: Explore the purposes of visual questions and use the purposes for answer prediction. Estimate the correctness of each candidate answer by the consistency between candidate answers and the Question Purpose. Exploit the knowledge facts accordant with the Question Purpose to improve predictive results.
Abstract: Visual question answering has been greatly advanced by deep learning technologies, but still remains an open problem subjected to two aspects of factors. First, previous works estimate the correctness of each candidate answer mainly by its semantic correlations with visual questions, overlooking the fact that some questions and their answers are semantically inconsistent. Second, previous works that require external knowledge mainly uses the knowledge facts retrieved by key words or visual objects. However, the retrieved knowledge facts may only be related to the semantics of the question, but are useless or even misleading for answer prediction. To address these issues, we investigate how to capture the purpose of visual questions and propose a Purpose Guided Visual Question Answering model, called PGVQA. It mainly has two appealing properties: (1) It can estimate the correctness of candidate answers based on the Question Purpose (QP) that reveals which aspects of the concept are examined by visual questions. This is helpful for avoiding the negative effect of the semantic inconsistency between answers and questions. (2) It can incorporate the knowledge facts accordant with the QP into answer prediction, which helps to improve the probability of answering visual questions correctly. Empirical studies on benchmark datasets show that PGVQA achieves state-of-the-art performance.
- Is Part Of:
- Pattern recognition. Volume 133(2023)
- Journal:
- Pattern recognition
- Issue:
- Volume 133(2023)
- Issue Display:
- Volume 133, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 133
- Issue:
- 2023
- Issue Sort Value:
- 2023-0133-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- Visual question answering -- DNN -- Question Purpose
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2022.109015
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24024.xml