A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions. (November 2017)
- Record Type:
- Journal Article
- Title:
- A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions. (November 2017)
- Main Title:
- A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions
- Authors:
- Sivasankaran, Sunit
Vincent, Emmanuel
Illina, Irina
- Abstract:
- Highlights: A detailed study of various techniques for reverberation-robust ASR is conducted. Performances are evaluated and compared for new as well as established techniques. The weighted prediction error method of dereverberation and BeamformIt worked best in combination with DNN-based dereverberation. fMLLR features were found to work best.
Abstract: Robustness to reverberation is a key concern for distant-microphone ASR. Various approaches have been proposed, including single-channel or multichannel dereverberation, robust feature extraction, alternative acoustic models, and acoustic model adaptation. However, to the best of our knowledge, a detailed study of these techniques in varied reverberation conditions is still missing in the literature. In this paper, we conduct a series of experiments to assess the impact of various dereverberation and acoustic model adaptation approaches on the ASR performance in the range of reverberation conditions found in real domestic environments. We consider both established approaches such as WPE and newer approaches such as learning hidden unit contribution (LHUC) adaptation, whose performance has not been reported before in this context, and we employ them in combination. Our results indicate that performing weighted prediction error (WPE) dereverberation on a reverberated test speech utterance and decoding using a deep neural network (DNN) acoustic model trained with multi-condition reverberated speech with feature-space maximum likelihood linear regression (fMLLR) transformed features outperforms more recent approaches and helps significantly reduce the word error rate (WER).
- Is Part Of:
- Computer speech & language. Volume 46(2017)
- Journal:
- Computer speech & language
- Issue:
- Volume 46(2017)
- Issue Display:
- Volume 46, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 46
- Issue:
- 2017
- Issue Sort Value:
- 2017-0046-2017-0000
- Page Start:
- 444
- Page End:
- 460
- Publication Date:
- 2017-11
- Subjects:
- Robust ASR -- Dereverberation -- Acoustic model adaptation -- Evaluation
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2017.02.003
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 2908.xml