3D face-model reconstruction from a single image: A feature aggregation approach using hierarchical transformer with weak supervision. (December 2022)
- Record Type:
- Journal Article
- Title:
- 3D face-model reconstruction from a single image: A feature aggregation approach using hierarchical transformer with weak supervision. (December 2022)
- Main Title:
- 3D face-model reconstruction from a single image: A feature aggregation approach using hierarchical transformer with weak supervision
- Authors:
- Basak, Shubhajit
Corcoran, Peter
McDonnell, Rachel
Schukat, Michael - Abstract:
- Abstract: Convolutional Neural Networks (CNN) have gained popularity as the de-facto model for any computer vision task. However, CNN have drawbacks, i.e. they fail to extract long-range perceptions in images. Due to their ability to capture long-range dependencies, transformer networks are adopted in computer vision applications, where they show state-of-the-art (SOTA) results in popular tasks like image classification, instance segmentation, and object detection. Although they gained ample attention, transformers have not been applied to 3D face reconstruction tasks. In this work, we propose a novel hierarchical transformer model, added to a feature pyramid aggregation structure, to extract the 3D face parameters from a single 2D image. More specifically, we use pre-trained Swin Transformer backbone networks in a hierarchical manner and add the feature fusion module to aggregate the features in multiple stages. We use a semi-supervised training approach and train our model in a supervised way with the 3DMM parameters from a publicly available dataset and unsupervised training with a differential renderer on other parameters like facial keypoints and facial features. We also train our network on a hybrid unsupervised loss and compare the results with other SOTA approaches. When evaluated across two public datasets on face reconstruction and dense 3D face alignment tasks, our method can achieve comparable results to the current SOTA performance and in some instances doAbstract: Convolutional Neural Networks (CNN) have gained popularity as the de-facto model for any computer vision task. However, CNN have drawbacks, i.e. they fail to extract long-range perceptions in images. Due to their ability to capture long-range dependencies, transformer networks are adopted in computer vision applications, where they show state-of-the-art (SOTA) results in popular tasks like image classification, instance segmentation, and object detection. Although they gained ample attention, transformers have not been applied to 3D face reconstruction tasks. In this work, we propose a novel hierarchical transformer model, added to a feature pyramid aggregation structure, to extract the 3D face parameters from a single 2D image. More specifically, we use pre-trained Swin Transformer backbone networks in a hierarchical manner and add the feature fusion module to aggregate the features in multiple stages. We use a semi-supervised training approach and train our model in a supervised way with the 3DMM parameters from a publicly available dataset and unsupervised training with a differential renderer on other parameters like facial keypoints and facial features. We also train our network on a hybrid unsupervised loss and compare the results with other SOTA approaches. When evaluated across two public datasets on face reconstruction and dense 3D face alignment tasks, our method can achieve comparable results to the current SOTA performance and in some instances do better than the SOTA methods. A detailed subjective evaluation also shows that our method performs better than the previous works in realism and occlusion resistance. … (more)
- Is Part Of:
- Neural networks. Volume 156(2022)
- Journal:
- Neural networks
- Issue:
- Volume 156(2022)
- Issue Display:
- Volume 156, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 156
- Issue:
- 2022
- Issue Sort Value:
- 2022-0156-2022-0000
- Page Start:
- 108
- Page End:
- 122
- Publication Date:
- 2022-12
- Subjects:
- Face reconstruction -- Hierarchical transformer -- Feature fusion -- ViT -- Swin Transformer
Neural computers -- Periodicals
Neural networks (Computer science) -- Periodicals
Neural networks (Neurobiology) -- Periodicals
Nervous System -- Periodicals
Ordinateurs neuronaux -- Périodiques
Réseaux neuronaux (Informatique) -- Périodiques
Réseaux neuronaux (Neurobiologie) -- Périodiques
Neural computers
Neural networks (Computer science)
Neural networks (Neurobiology)
Periodicals
006.32 - Journal URLs:
- http://www.sciencedirect.com/science/journal/08936080 ↗
http://www.elsevier.com/journals ↗ - DOI:
- 10.1016/j.neunet.2022.09.019 ↗
- Languages:
- English
- ISSNs:
- 0893-6080
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - 6081.280800
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 24323.xml