Acoustic adaptation to dynamic background conditions with asynchronous transformations. (January 2017)

Record Type:: Journal Article
Title:: Acoustic adaptation to dynamic background conditions with asynchronous transformations. (January 2017)
Main Title:: Acoustic adaptation to dynamic background conditions with asynchronous transformations
Authors:: Saz, Oscar
Hain, Thomas
Abstract:: Highlights: We propose asynchronous CMLLR adaptation for complex backgrounds. Noise adaptive techniques and speaker factorisation are also explored. Results in a noisy WSJ benchmark task show up to 40% WER reduction. Evaluation in the transcription of multi-media data achieves 3% WER reduction. Abstract: This paper proposes a framework for performing adaptation to complex and non-stationary background conditions in Automatic Speech Recognition (ASR) by means of asynchronous Constrained Maximum Likelihood Linear Regression (aCMLLR) transforms and asynchronous Noise Adaptive Training (aNAT). The proposed method aims to apply the feature transform that best compensates the background for every input frame. The implementation is done with a new Hidden Markov Model (HMM) topology that expands the usual left-to-right HMM into parallel branches adapted to different background conditions and permits transitions among them. Using this, the proposed adaptation does not require ground truth or previous knowledge about the background in each frame as it aims to maximise the overall log-likelihood of the decoded utterance. The proposed aCMLLR transforms can be further improved by retraining models in an aNAT fashion and by using speaker-based MLLR transforms in cascade for an efficient modelling of background effects and speaker. An initial evaluation in a modified version of the WSJCAM0 corpus incorporating 7 different background conditions provides a benchmark in which to evaluate the …
Is Part Of:: Computer speech & language. Volume 41(2016)
Journal:: Computer speech & language
Issue:: Volume 41(2016)
Issue Display:: Volume 41, Issue 2016 (2016)
Year:: 2016
Volume:: 41
Issue:: 2016
Issue Sort Value:: 2016-0041-2016-0000
Page Start:: 180
Page End:: 194
Publication Date:: 2017-01
Subjects:: Speech recognition -- Acoustic adaptation -- Factorisation -- Dynamic background -- Media transcription
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2016.06.008
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 2481.xml