Differenced maximum mutual information criterion for robust unsupervised acoustic model adaptation. (March 2016)
- Record Type:
- Journal Article
- Title:
- Differenced maximum mutual information criterion for robust unsupervised acoustic model adaptation. (March 2016)
- Main Title:
- Differenced maximum mutual information criterion for robust unsupervised acoustic model adaptation
- Authors:
- Delcroix, Marc
Ogawa, Atsunori
Hahm, Seong-Jun
Nakatani, Tomohiro
Nakamura, Atsushi
- Abstract:
- Highlights: The differenced-MMI (dMMI) is a discriminative criterion that generalizes MPE and BMMI. We discuss the behavior of dMMI when there are errors in the transcription labels. dMMI may be less sensitive to such errors than other criteria. We support our claim with unsupervised speaker adaptation experiments. dMMI based adaptation achieves significant gains over MLLR for 2 LVCSR tasks.
Abstract: Discriminative criteria have been widely used for training acoustic models for automatic speech recognition (ASR). Many discriminative criteria have been proposed including maximum mutual information (MMI), minimum phone error (MPE), and boosted MMI (BMMI). Discriminative training is known to provide significant performance gains over conventional maximum-likelihood (ML) training. However, as discriminative criteria aim at direct minimization of the classification error, they strongly rely on having accurate reference labels. Errors in the reference labels directly affect the performance. Recently, the differenced MMI (dMMI) criterion has been proposed for generalizing conventional criteria such as BMMI and MPE. dMMI can approach BMMI or MPE if its hyper-parameters are properly set. Moreover, dMMI introduces intermediate criteria that can be interpreted as smoothed versions of BMMI or MPE. These smoothed criteria are robust to errors in the reference labels. In this paper, we demonstrate the effect of dMMI on unsupervised speaker adaptation where the reference labels are estimated from a first recognition pass and thus inevitably contain errors. In particular, we introduce dMMI-based linear regression (dMMI-LR) adaptation and demonstrate significant gains in performance compared with MLLR and BMMI-LR in two large vocabulary lecture recognition tasks.
- Is Part Of:
- Computer speech & language. Volume 36(2016)
- Journal:
- Computer speech & language
- Issue:
- Volume 36(2016)
- Issue Display:
- Volume 36, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 36
- Issue:
- 2016
- Issue Sort Value:
- 2016-0036-2016-0000
- Page Start:
- 24
- Page End:
- 41
- Publication Date:
- 2016-03
- Subjects:
- Discriminative criterion -- Differenced maximum mutual information -- Speech recognition -- Acoustic model adaptation -- Unsupervised adaptation
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2015.08.001
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 528.xml