Whispered speech recognition using deep denoising autoencoder. (March 2017)
- Record Type:
- Journal Article
- Title:
- Whispered speech recognition using deep denoising autoencoder. (March 2017)
- Main Title:
- Whispered speech recognition using deep denoising autoencoder
- Authors:
- Grozdić, Đorđe T.
Jovičić, Slobodan T.
Subotić, Miško
- Abstract:
- Recently, Deep Denoising Autoencoders (DDAE) have shown state-of-the-art performance on various machine learning tasks. In this paper, the authors extend this approach to whispered speech recognition, which is one of the most challenging problems in Automatic Speech Recognition (ASR). Because of the profound differences between the acoustic characteristics of neutral and whispered speech, the performance of traditional ASR systems trained on neutral speech degrades significantly when they are applied to whisper. This mismatch between training and testing is successfully alleviated by the newly proposed system based on deep learning, in which a DDAE is applied to generate whisper-robust cepstral features. The system was tested and compared, in terms of word recognition accuracy, with a conventional Hidden Markov Model (HMM) speech recognizer in an isolated word recognition task on a real database of whispered speech (WhiSpe). Three types of cepstral coefficients were used in the experiments: MFCC (Mel-Frequency Cepstral Coefficients), TECC (Teager-Energy Cepstral Coefficients) and TEMFCC (Teager-based Mel-Frequency Cepstral Coefficients). The experimental results showed that the proposed system significantly improves whisper recognition accuracy and outperforms the traditional HMM-MFCC baseline, with an absolute improvement of 31% in whisper recognition accuracy. The highest word recognition rate in whispered speech, 92.81%, was achieved with the TECC feature.
- Is Part Of:
- Engineering applications of artificial intelligence. Volume 59(2016:Nov.)
- Journal:
- Engineering applications of artificial intelligence
- Issue:
- Volume 59(2016:Nov.)
- Issue Display:
- Volume 59 (2016)
- Year:
- 2016
- Volume:
- 59
- Issue Sort Value:
- 2016-0059-0000-0000
- Page Start:
- 15
- Page End:
- 22
- Publication Date:
- 2017-03
- Subjects:
- Automatic speech recognition (ASR) -- Whispered speech -- Deep denoising autoencoder (DDAE) -- Deep learning
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.engappai.2016.12.012
- Languages:
- English
- ISSNs:
- 0952-1976
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 253.xml