Environmentally robust ASR front-end for deep neural network acoustic models. (May 2015)

Record Type:: Journal Article
Title:: Environmentally robust ASR front-end for deep neural network acoustic models. (May 2015)
Main Title:: Environmentally robust ASR front-end for deep neural network acoustic models
Authors:: Yoshioka, T.
Gales, M.J.F.
Abstract:: Highlights: Effects of various front-end schemes are examined using DNN acoustic models. Meeting transcription experiments are conducted using a single distant microphone. Both speaker independent/adaptive configurations are considered. A pipeline is proposed to integrate different classes of front-end schemes. The pipeline is used to analyse the way in which different schemes interact. Abstract: This paper examines the individual and combined impacts of various front-end approaches on the performance of deep neural network (DNN) based speech recognition systems in distant talking situations, where acoustic environmental distortion degrades the recognition performance. Training of a DNN-based acoustic model consists of generation of state alignments followed by learning the network parameters. This paper first shows that the network parameters are more sensitive to the speech quality than the alignments and thus this stage requires improvement. Then, various front-end robustness approaches to addressing this problem are categorised based on functionality. The degree to which each class of approaches impacts the performance of DNN-based acoustic models is examined experimentally. Based on the results, a front-end processing pipeline is proposed for efficiently combining different classes of approaches. Using this front-end, the combined effects of different classes of approaches are further evaluated in a single distant microphone-based meeting transcription task …
Is Part Of:: Computer speech & language. Volume 31(2015)
Journal:: Computer speech & language
Issue:: Volume 31(2015)
Issue Display:: Volume 31, Issue 2015 (2015)
Year:: 2015
Volume:: 31
Issue:: 2015
Issue Sort Value:: 2015-0031-2015-0000
Page Start:: 65
Page End:: 86
Publication Date:: 2015-05
Subjects:: Environmental robustness -- Deep neural network -- Front-end -- Meeting transcription
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2014.11.008
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 5432.xml