Uncertainty weighting and propagation in DNN–HMM-based speech recognition. (January 2018)
- Record Type:
- Journal Article
- Title:
- Uncertainty weighting and propagation in DNN–HMM-based speech recognition. (January 2018)
- Main Title:
- Uncertainty weighting and propagation in DNN–HMM-based speech recognition
- Authors:
- Novoa, José
Fredes, Josué
Poblete, Víctor
Yoma, Néstor Becerra
- Abstract:
- Highlights: An uncertainty weighting scheme is proposed for DNN–HMM based speech recognition. This uncertainty weighting scheme is combined with uncertainty propagation. Results were obtained with a competitive baseline system.
Abstract: In this paper an uncertainty weighting scheme for DNN–HMM-based speech recognition is proposed to increase discriminability in the decoding process. To this end, the DNN pseudo-log-likelihoods are weighted according to the uncertainty variance assigned to the acoustic observation. The results presented here suggest that substantial reduction in WER is achieved with clean training. Moreover, modelling the uncertainty propagation through the DNN is not required and no approximations for non-linear activation functions are made. The presented method can be applied to any network topology that delivers log-likelihood-like scores. It can be combined with any noise removal technique and adds a minimal computational cost. This technique was exhaustively evaluated and combined with uncertainty-propagation-based schemes for computing the pseudo-log-likelihoods and uncertainty variance at the DNN output. Two proposed methods optimized the parameters of the weighting function by leveraging the grid search either on a development database representing the given task or on each utterance based on discrimination metrics. Experiments with Aurora-4 task showed that, with clean training, the proposed weighting scheme can reduce WER by a maximum of 21% compared with a baseline system with spectral subtraction and uncertainty propagation using the unscented transform. The uncertainty weighting method reduced the gap between clean and multi-noise/multi-condition training. This can be useful when it is not easy to train a DNN–HMM system in conditions that are similar to the testing ones. Finally, the presented results on the use of uncertainty are very competitive with those published elsewhere using the same database as the one employed here.
- Is Part Of:
- Computer speech & language. Volume 47(2018)
- Journal:
- Computer speech & language
- Issue:
- Volume 47(2018)
- Issue Display:
- Volume 47, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 47
- Issue:
- 2018
- Issue Sort Value:
- 2018-0047-2018-0000
- Page Start:
- 30
- Page End:
- 46
- Publication Date:
- 2018-01
- Subjects:
- Automatic speech recognition -- Deep neural network -- Uncertainty weighting -- Uncertainty propagation -- DNN–HMM
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2017.06.005
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20786.xml