Comparing human and automatic speech recognition in a perceptual restoration experiment. (January 2016)
- Record Type:
- Journal Article
- Title:
- Comparing human and automatic speech recognition in a perceptual restoration experiment. (January 2016)
- Main Title:
- Comparing human and automatic speech recognition in a perceptual restoration experiment
- Authors:
- Remes, Ulpu
Ramírez López, Ana
Juvela, Lauri
Palomäki, Kalle
Brown, Guy J.
Alku, Paavo
Kurimo, Mikko
- Abstract:
- Highlights: Missing-data methods are evaluated in a perceptual restoration task. Human and automatic speech recognition performance are compared. Methods include a novel approach to cepstral-domain bounded marginalisation.
Abstract: Speech that has been distorted by introducing spectral or temporal gaps is still perceived as continuous and complete by human listeners, so long as the gaps are filled with additive noise of sufficient intensity. When such perceptual restoration occurs, the speech is also more intelligible compared to the case in which noise has not been added in the gaps. This observation has motivated so-called 'missing data' systems for automatic speech recognition (ASR), but there have been few attempts to determine whether such systems are a good model of perceptual restoration in human listeners. Accordingly, the current paper evaluates missing data ASR in a perceptual restoration task. We evaluated two systems that use a new approach to bounded marginalisation in the cepstral domain, and a bounded conditional mean imputation method. Both methods model available speech information as a clean-speech posterior distribution that is subsequently passed to an ASR system. The proposed missing data ASR systems were evaluated using distorted speech, in which spectro-temporal gaps were optionally filled with additive noise. Speech recognition performance of the proposed systems was compared against a baseline ASR system, and with human speech recognition performance on the same task. We conclude that missing data methods improve speech recognition performance in a manner that is consistent with perceptual restoration in human listeners.
- Is Part Of:
- Computer speech & language. Volume 35(2016)
- Journal:
- Computer speech & language
- Issue:
- Volume 35(2016)
- Issue Display:
- Volume 35, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 35
- Issue:
- 2016
- Issue Sort Value:
- 2016-0035-2016-0000
- Page Start:
- 14
- Page End:
- 31
- Publication Date:
- 2016-01
- Subjects:
- Automatic speech recognition -- Missing data -- Observation uncertainties -- Perceptual restoration -- Uncertainty propagation
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2015.06.005
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 8942.xml