Articulatory feature-based pronunciation modeling. (March 2016)
- Record Type:
- Journal Article
- Title:
- Articulatory feature-based pronunciation modeling. (March 2016)
- Main Title:
- Articulatory feature-based pronunciation modeling
- Authors:
- Livescu, Karen
Jyothi, Preethi
Fosler-Lussier, Eric
- Abstract:
- Highlights: We review our research developing articulatory feature-based pronunciation models. We argue for sub-phonetic features inspired by articulatory phonology. We review evaluations via frame-level perplexity and lexical access performance. Our models typically outperform phone-based models. Context-dependent surface feature models outperform context-independent models.
Abstract: Spoken language, especially conversational speech, is characterized by great variability in word pronunciation, including many variants that differ grossly from dictionary prototypes. This is one factor in the poor performance of automatic speech recognizers on conversational speech, and it has been very difficult to mitigate in traditional phone-based approaches to speech recognition. An alternative approach, which has been studied by ourselves and others, is one based on sub-phonetic features rather than phones. In such an approach, a word's pronunciation is represented as multiple streams of phonological features rather than a single stream of phones. Features may correspond to the positions of the speech articulators, such as the lips and tongue, or may be more abstract categories such as manner and place. This article reviews our work on a particular type of articulatory feature-based pronunciation model. The model allows for asynchrony between features, as well as per-feature substitutions, making it more natural to account for many pronunciation changes that are difficult to handle with phone-based models. Such models can be efficiently represented as dynamic Bayesian networks. The feature-based models improve significantly over phone-based counterparts in terms of frame perplexity and lexical access accuracy. The remainder of the article discusses related work and future directions.
- Is Part Of:
- Computer speech & language. Volume 36(2016)
- Journal:
- Computer speech & language
- Issue:
- Volume 36(2016)
- Issue Display:
- Volume 36, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 36
- Issue:
- 2016
- Issue Sort Value:
- 2016-0036-2016-0000
- Page Start:
- 212
- Page End:
- 232
- Publication Date:
- 2016-03
- Subjects:
- Speech recognition -- Articulatory features -- Pronunciation modeling -- Dynamic Bayesian networks
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2015.07.003
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 528.xml