Multi-Channel Speech Enhancement and Amplitude Modulation Analysis for Noise Robust Automatic Speech Recognition. (November 2017)

Record Type:: Journal Article
Title:: Multi-Channel Speech Enhancement and Amplitude Modulation Analysis for Noise Robust Automatic Speech Recognition. (November 2017)
Main Title:: Multi-Channel Speech Enhancement and Amplitude Modulation Analysis for Noise Robust Automatic Speech Recognition
Authors:: Moritz, Niko
Adiloğlu, Kamil
Anemüller, Jörn
Goetze, Stefan
Kollmeier, Birger
Abstract:: Highlights: An elaborate ASR system is proposed that outperforms the best results of the 3rd CHiME challenge. Amplitude modulation filter bank (AMFB) features demonstrate enhanced ASR robustness in comparison to the commonly used frame splicing approach. A microphone failure detection method is proposed to detect erroneous recordings of a multichannel recording device. Qualities of two multi-channel speech enhancement techniques, which are a MVDR and NMF-based method, are emphasized by system combination using minimum Bayes risk (MBR) decoding. Abstract: The paper describes a system for automatic speech recognition (ASR) that is benchmarked with data of the 3rd CHiME challenge, a dataset comprising distant microphone recordings of noisy acoustic scenes in public environments. The proposed ASR system employs various methods to increase recognition accuracy and noise robustness. Two different multi-channel speech enhancement techniques are used to eliminate interfering sounds in the audio stream. One speech enhancement method aims at separating the target speaker's voice from background sources based on non-negative matrix factorization (NMF) using variational Bayesian (VB) inference to estimate NMF parameters. The second technique is based on a time-varying minimum variance distortionless response (MVDR) beamformer that uses spatial information to suppress sound signals not arriving from a desired direction. Prior to speech enhancement, a microphone channel failure detector …
Is Part Of:: Computer speech & language. Volume 46 (2017)
Journal:: Computer speech & language
Issue:: Volume 46 (2017)
Issue Display:: Volume 46, Issue 2017 (2017)
Year:: 2017
Volume:: 46
Issue:: 2017
Issue Sort Value:: 2017-0046-2017-0000
Page Start:: 558
Page End:: 573
Publication Date:: 2017-11
Subjects:: Speech enhancement -- Non-negative matrix factorization -- Feature extraction -- Modulation frequency analysis -- CHiME -- Amplitude modulation filter bank
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2016.11.004
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 4753.xml