VD-PCR: Improving visual dialog with pronoun coreference resolution. (May 2022)
- Record Type:
- Journal Article
- Title:
- VD-PCR: Improving visual dialog with pronoun coreference resolution. (May 2022)
- Main Title:
- VD-PCR: Improving visual dialog with pronoun coreference resolution
- Authors:
- Yu, Xintong
Zhang, Hongming
Hong, Ruixin
Song, Yangqiu
Zhang, Changshui
- Abstract:
- Highlights: A novel framework VD-PCR to improve visual dialog models with pronoun coreference. The joint training helps visual dialog models understand pronouns better. The history pruning with pronoun coreference prevents overfitting to dialog history. VD-PCR achieves state-of-the-art results on the VisDial dataset.
Abstract: The visual dialog task requires an AI agent to interact with humans in multi-round dialogs based on a visual environment. As a common linguistic phenomenon, pronouns are often used in dialogs to improve the communication efficiency. As a result, resolving pronouns (i.e., grounding pronouns to the noun phrases they refer to) is an essential step towards understanding dialogs. In this paper, we propose VD-PCR, a novel framework to improve Visual Dialog understanding with Pronoun Coreference Resolution in both implicit and explicit ways. First, to implicitly help models understand pronouns, we design novel methods to perform the joint training of the pronoun coreference resolution and visual dialog tasks. Second, after observing that the coreference relationship of pronouns and their referents indicates the relevance between dialog rounds, we propose to explicitly prune the irrelevant history rounds in visual dialog models' input. With pruned input, the models can focus on relevant dialog history and ignore the distraction in the irrelevant one. With the proposed implicit and explicit methods, VD-PCR achieves state-of-the-art experimental results on the VisDial dataset.
- Is Part Of:
- Pattern recognition. Volume 125(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 125(2022)
- Issue Display:
- Volume 125, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 125
- Issue:
- 2022
- Issue Sort Value:
- 2022-0125-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-05
- Subjects:
- Vision and language -- Visual dialog -- Pronoun coreference resolution
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2022.108540
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22253.xml