Semi-supervised cross-modal hashing via modality-specific and cross-modal graph convolutional networks. (April 2023)
- Record Type:
- Journal Article
- Title:
- Semi-supervised cross-modal hashing via modality-specific and cross-modal graph convolutional networks. (April 2023)
- Main Title:
- Semi-supervised cross-modal hashing via modality-specific and cross-modal graph convolutional networks
- Authors:
- Wu, Fei
Li, Shuaishuai
Gao, Guangwei
Ji, Yimu
Jing, Xiao-Yuan
Wan, Zhiguo - Abstract:
- Highlights: MCGCN for the first time builds cross-modal graph and jointly learns modality-specific and modality-shared features for semi-supervised cross-modal hashing. MCGCN provides a three-channel network architecture, including two modality-specific channels and a cross-modal channel to model cross-modal graph with heterogeneous image and text features. To effectively reduce the modality gap, network training is guided by adversarial scheme. MCGCN obtains state-of-the-art semi-supervised cross-modal hashing performance. Abstract: Cross-modal hashing maps heterogeneous multimedia data into Hamming space for retrieving relevant samples across modalities, which has received great research interests due to its rapid retrieval and low storage cost. In real-world applications, due to high manual annotation cost of multi-media data, we can only make use of limited number of labeled data with rich unlabeled data. In recent years, several semi-supervised cross-modal hashing (SCH) methods have been presented. However, how to fully explore and jointly utilize the modality-specific (complementarity) and modality-shared (correlation) information for retrieval has not been well studied for existing SCH works. In this paper, we propose a novel SCH approach named Modality-specific and Cross-modal Graph Convolutional Networks (MCGCN). The network architecture contains two modality-specific channels and a cross-modal channel to learn modality-specific and shared representations for eachHighlights: MCGCN for the first time builds cross-modal graph and jointly learns modality-specific and modality-shared features for semi-supervised cross-modal hashing. MCGCN provides a three-channel network architecture, including two modality-specific channels and a cross-modal channel to model cross-modal graph with heterogeneous image and text features. To effectively reduce the modality gap, network training is guided by adversarial scheme. MCGCN obtains state-of-the-art semi-supervised cross-modal hashing performance. Abstract: Cross-modal hashing maps heterogeneous multimedia data into Hamming space for retrieving relevant samples across modalities, which has received great research interests due to its rapid retrieval and low storage cost. In real-world applications, due to high manual annotation cost of multi-media data, we can only make use of limited number of labeled data with rich unlabeled data. In recent years, several semi-supervised cross-modal hashing (SCH) methods have been presented. However, how to fully explore and jointly utilize the modality-specific (complementarity) and modality-shared (correlation) information for retrieval has not been well studied for existing SCH works. In this paper, we propose a novel SCH approach named Modality-specific and Cross-modal Graph Convolutional Networks (MCGCN). The network architecture contains two modality-specific channels and a cross-modal channel to learn modality-specific and shared representations for each modality, respectively. Graph convolutional network (GCN) is leveraged in these three channels to explore intra-modal and inter-modal similarity, and perform semantic information propagation from labeled data to unlabeled data. Modality-specific and shared representations for each modality are fused with attention scheme. To further reduce the modality gap, a discriminative model is designed, learning to classify the modality of representations, and network training is guided by adversarial scheme. Experiments on two widely used multi-modal datasets demonstrate MCGCN outperforms state-of-the-art semi-supervised/supervised cross-modal hashing methods. … (more)
- Is Part Of:
- Pattern recognition. Volume 136(2023)
- Journal:
- Pattern recognition
- Issue:
- Volume 136(2023)
- Issue Display:
- Volume 136, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 136
- Issue:
- 2023
- Issue Sort Value:
- 2023-0136-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-04
- Subjects:
- Cross-modal hashing -- semi-supervised learning -- graph convolutional networks -- modality-specific features -- modality-shared features
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4 - Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203 ↗
http://www.sciencedirect.com/ ↗ - DOI:
- 10.1016/j.patcog.2022.109211 ↗
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 25681.xml