Concept formation through multimodal integration using multimodal BERT and VQ-VAE. (16th February 2023)
- Record Type:
- Journal Article
- Title:
- Concept formation through multimodal integration using multimodal BERT and VQ-VAE. (16th February 2023)
- Main Title:
- Concept formation through multimodal integration using multimodal BERT and VQ-VAE
- Authors:
- Miyazawa, Kazuki
Nagai, Takayuki
- Abstract:
- Humans form concepts from multimodal information obtained from multiple sensory organs and understand the environment and language by predicting unobserved information through these concepts. Robots can also make better predictions by forming concepts from multimodal information in a bottom-up manner. In contrast, BERT, a language model, acquires semantic representations of sentences applicable to various language comprehension tasks by learning relationships between words through mask prediction learning. We believe that BERT can be extended to multimodality and learn multimodal representations for world understanding by learning relationships among data across modalities. Therefore, we propose a multimodal learning model using BERT and VQ-VAEs with multimodal information as input and evaluate the model in terms of model predictability and concept formation. To verify the usefulness of the proposed model, experiments were conducted using a multimodal dataset of objects acquired by a robot. The experimental results showed that the proposed model has higher category classification accuracy and word prediction performance than multimodal latent Dirichlet allocation (MLDA), which is used as a concept formation model. An analysis of the representation of the model showed that clusters of object categories were formed within the model.
- Is Part Of:
- Advanced robotics. Volume 37:Number 4(2023)
- Journal:
- Advanced robotics
- Issue:
- Volume 37:Number 4(2023)
- Issue Display:
- Volume 37, Issue 4 (2023)
- Year:
- 2023
- Volume:
- 37
- Issue:
- 4
- Issue Sort Value:
- 2023-0037-0004-0000
- Page Start:
- 281
- Page End:
- 296
- Publication Date:
- 2023-02-16
- Subjects:
- Multimodal learning -- robot learning -- transformers -- concept formation -- natural language processing
Robotics -- Periodicals
Robotics -- Japan -- Periodicals
Robotics
Japan
Periodicals
629.89205
- Journal URLs:
- http://www.catchword.com/rpsv/cw/vsp/01691864/contp1.htm
http://catalog.hathitrust.org/api/volumes/oclc/14883000.html
http://www.tandfonline.com/toc/tadr20/current
http://www.tandfonline.com/
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0169-1864;screen=info;ECOIP
http://www.ingentaselect.com/vl=16659242/cl=11/nw=1/rpsv/cw/vsp/01691864/contp1.htm
- DOI:
- 10.1080/01691864.2022.2141583
- Languages:
- English
- ISSNs:
- 0169-1864
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0696.926500
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 25744.xml