Integrating articulatory data in deep neural network-based acoustic modeling. (March 2016)

Record Type:: Journal Article
Title:: Integrating articulatory data in deep neural network-based acoustic modeling. (March 2016)
Main Title:: Integrating articulatory data in deep neural network-based acoustic modeling
Authors:: Badino, Leonardo
Canevari, Claudia
Fadiga, Luciano
Metta, Giorgio
Abstract:: Highlights: We test strategies to exploit articulatory data in DNN-HMM phone recognition. Autoencoder-transformed articulatory features produce the best results. Pre-training of phone classifier DNNs is driven by acoustic-to-articulatory mapping. Articulatory information is useful in noisy conditions and in cross-speaker settings. Abstract: Hybrid deep neural network–hidden Markov model (DNN-HMM) systems have become the state of the art in automatic speech recognition. In this paper we experiment with DNN-HMM phone recognition systems that use measured articulatory information. Deep neural networks are used both to compute phone posterior probabilities and to perform acoustic-to-articulatory mapping (AAM). The AAM processes we propose are based on deep representations of the acoustic and the articulatory domains. Such representations allow us to: (i) create different pre-training configurations of the DNNs that perform AAM; (ii) perform AAM on a transformed (through DNN autoencoders) articulatory feature (AF) space that captures strong statistical dependencies between articulators. Traditionally, neural networks that approximate the AAM are used to generate AFs that are appended to the observation vector of the speech recognition system. Here we also study a novel approach (AAM-based pretraining) where a DNN performing the AAM is instead used to pretrain the DNN that computes the phone posteriors. Evaluations on both the MOCHA-TIMIT msak0 and the mngu0 datasets show …
Is Part Of:: Computer speech & language. Volume 36(2016)
Journal:: Computer speech & language
Issue:: Volume 36(2016)
Issue Display:: Volume 36, Issue 2016 (2016)
Year:: 2016
Volume:: 36
Issue:: 2016
Issue Sort Value:: 2016-0036-2016-0000
Page Start:: 173
Page End:: 195
Publication Date:: 2016-03
Subjects:: DNN-HMM -- Acoustic-to-articulatory mapping -- Deep neural networks -- Acoustic modeling -- Electromagnetic articulography -- Autoencoders
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2015.05.005
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 528.xml