Double articulation analyzer with deep sparse autoencoder for unsupervised word discovery from speech signals. (17th June 2016)
- Record Type:
- Journal Article
- Title:
- Double articulation analyzer with deep sparse autoencoder for unsupervised word discovery from speech signals. (17th June 2016)
- Main Title:
- Double articulation analyzer with deep sparse autoencoder for unsupervised word discovery from speech signals
- Authors:
- Taniguchi, Tadahiro
Nakashima, Ryo
Liu, Hailong
Nagasaka, Shogo
- Abstract:
- Direct word discovery from audio speech signals is a very difficult and challenging problem for a developmental robot. Human infants are able to discover words directly from speech signals, and, to understand human infants' developmental capability using a constructive approach, it is very important to build a machine learning system that can acquire knowledge about words and phonemes, i.e. a language model and an acoustic model, autonomously in an unsupervised manner. To achieve this, the nonparametric Bayesian double articulation analyzer (NPB-DAA) with the deep sparse autoencoder (DSAE) is proposed in this paper. The NPB-DAA has been proposed to achieve totally unsupervised direct word discovery from speech signals. However, the performance was still unsatisfactory, although it outperformed pre-existing unsupervised learning methods. In this paper, we integrate the NPB-DAA with the DSAE, which is a neural network model that can be trained in an unsupervised manner, and demonstrate its performance through an experiment about direct word discovery from auditory speech signals. The experiment shows that the combined method, the NPB-DAA with the DSAE, outperforms pre-existing unsupervised learning methods, and shows state-of-the-art performance. It is also shown that the proposed method outperforms several standard speech recognizer-based methods with true word dictionaries.
- Is Part Of:
- Advanced robotics. Volume 30:Number 11/12(2016)
- Journal:
- Advanced robotics
- Issue:
- Volume 30:Number 11/12(2016)
- Issue Display:
- Volume 30, Issue 11/12 (2016)
- Year:
- 2016
- Volume:
- 30
- Issue:
- 11/12
- Page Start:
- 770
- Page End:
- 783
- Publication Date:
- 2016-06-17
- Subjects:
- Bayesian nonparametrics -- deep learning -- speech recognition -- unsupervised learning -- word discovery
Robotics -- Periodicals
Robotics -- Japan -- Periodicals
Robotics
Japan
Periodicals
629.89205
- Journal URLs:
- http://www.catchword.com/rpsv/cw/vsp/01691864/contp1.htm
http://catalog.hathitrust.org/api/volumes/oclc/14883000.html
http://www.tandfonline.com/toc/tadr20/current
http://www.tandfonline.com/
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0169-1864;screen=info;ECOIP
http://www.ingentaselect.com/vl=16659242/cl=11/nw=1/rpsv/cw/vsp/01691864/contp1.htm
- DOI:
- 10.1080/01691864.2016.1159981
- Languages:
- English
- ISSNs:
- 0169-1864
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0696.926500
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 437.xml