Human-Centric Image Captioning. (June 2022)
- Record Type:
- Journal Article
- Title:
- Human-Centric Image Captioning. (June 2022)
- Main Title:
- Human-Centric Image Captioning
- Authors:
- Yang, Zuopeng
Wang, Pengbo
Chu, Tianshu
Yang, Jie
- Abstract:
- Highlights: We propose a new task, Human-Centric Image Captioning, and build a dataset, HC-COCO. We introduce Human-Centric Feature Hierarchization, which hierarchizes image features more explicitly for human-centric captioning by incorporating human body part information. We propose a novel three-branch architecture for separate information flow control and optimization, which helps generate more detailed captions for human activities. Our proposed method achieves state-of-the-art performance on HC-COCO, outperforming the previous state of the art by a clear margin.
Abstract: In this paper, we propose a new topic, Human-Centric Captioning, which mainly describes human behavior in an image. Human activities and relationships are the primary objectives of visual understanding in daily applications. However, existing image captioning systems cannot treat humans differently from other objects, which limits their ability to understand and describe diverse human activities. As the first to explore this new task, we build a novel Human-Centric COCO dataset concentrating on humans. Accordingly, we propose a novel Human-Centric Captioning Model (HCCM) that focuses on human-centric feature hierarchization and sentence generation. Specifically, our model first uses human body part-level knowledge to hierarchize the image features and then applies a novel three-branch captioning model that processes these hierarchical features independently to calibrate the descriptions of human actions. Comprehensive experiments demonstrate that our HCCM achieves state-of-the-art performance, with BLEU-4, CIDEr and SPICE scores of 41.5, 127.3 and 23.5, respectively. Dataset and code are publicly available at https://github.com/JohnDreamer/HCCM/.
- Is Part Of:
- Pattern recognition. Volume 126(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 126(2022)
- Issue Display:
- Volume 126, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 126
- Issue:
- 2022
- Issue Sort Value:
- 2022-0126-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06
- Subjects:
- Human-centric -- Image captioning -- Feature hierarchization
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2022.108545
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22254.xml