Concept formation through multimodal integration using multimodal BERT and VQ-VAE. (16th February 2023)