Evaluating the predictions of objective intelligibility metrics for modified and synthetic speech. (January 2016)
- Record Type:
- Journal Article
- Title:
- Evaluating the predictions of objective intelligibility metrics for modified and synthetic speech. (January 2016)
- Main Title:
- Evaluating the predictions of objective intelligibility metrics for modified and synthetic speech
- Authors:
- Tang, Yan
Cooke, Martin
Valentini-Botinhao, Cassia
- Abstract:
- Highlights: Algorithmically modified speech is used to assess objective intelligibility metrics. Reduced predictive power of the metrics for the given speech is demonstrated. Metrics show two opposite predictive patterns in fluctuating and stationary maskers. The glimpse proportion metric is extended.
Abstract: Several modification algorithms that alter natural or synthetic speech with the goal of improving intelligibility in noise have been proposed recently. A key requirement of many modification techniques is the ability to predict intelligibility, both offline during algorithm development, and online, in order to determine the optimal modification for the current noise context. While existing objective intelligibility metrics (OIMs) have good predictive power for unmodified natural speech in stationary and fluctuating noise, little is known about their effectiveness for other forms of speech. The current study evaluated how well seven OIMs predict listener responses in three large datasets of modified and synthetic speech which together represent 396 combinations of speech modification, masker type and signal-to-noise ratio. The chief finding is a clear reduction in predictive power for most OIMs when faced with modified and synthetic speech. Modifications introducing durational changes are particularly harmful to intelligibility predictors. OIMs that measure masked audibility tend to over-estimate intelligibility in the presence of fluctuating maskers relative to stationary maskers, while OIMs that estimate the distortion caused by the masker to a clean speech prototype exhibit the reverse pattern. …
- Is Part Of:
- Computer speech & language. Volume 35(2016)
- Journal:
- Computer speech & language
- Issue:
- Volume 35(2016)
- Issue Display:
- Volume 35, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 35
- Issue:
- 2016
- Issue Sort Value:
- 2016-0035-2016-0000
- Page Start:
- 73
- Page End:
- 92
- Publication Date:
- 2016-01
- Subjects:
- Objective intelligibility metric -- Noise -- Speech modifications -- Synthetic speech
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2015.06.002
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 8942.xml