Vocal effort compensation for MFCC feature extraction in a shouted versus normal speaker recognition task. (January 2019)
- Record Type:
- Journal Article
- Title:
- Vocal effort compensation for MFCC feature extraction in a shouted versus normal speaker recognition task. (January 2019)
- Main Title:
- Vocal effort compensation for MFCC feature extraction in a shouted versus normal speaker recognition task
- Authors:
- Jokinen, Emma
Saeidi, Rahim
Kinnunen, Tomi
Alku, Paavo
- Abstract:
- Highlights: Speaker recognition from shouts is poor with systems trained using normal speech. Recognition drops due to vocal effort mismatch between training and testing. Two methods are proposed to compensate for vocal effort mismatch in speaker recognition. The techniques modify the all-pole power spectrum of the MFCC computation chain. The proposed methods improve identification rates in vocal effort mismatch.
Abstract: In shouting, speakers use increased vocal effort to convey spoken messages over distance or above environmental noise. For automatic speaker recognition systems trained using normal speech, shouting causes a severe vocal effort mismatch between enrollment and test, which reduces recognition performance. In this study, two compensation methods are proposed to tackle the mismatch in a shouted versus normal speaker recognition task. These techniques are applied in the feature extraction stage of a speaker recognition system to bring the spectral envelopes of shouts closer to those of normal speech. The techniques modify the all-pole power spectrum of the MFCC computation chain with a shouted-to-normal compensation filter that is obtained using a GMM-based statistical mapping. In an evaluation using a state-of-the-art i-vector-based recognition system, the proposed techniques provided considerable improvements in identification rates compared to the case in which shouted speech spectra were not processed.
- Is Part Of:
- Computer speech & language. Volume 53(2019)
- Journal:
- Computer speech & language
- Issue:
- Volume 53(2019)
- Issue Display:
- Volume 53, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 53
- Issue:
- 2019
- Issue Sort Value:
- 2019-0053-2019-0000
- Page Start:
- 1
- Page End:
- 11
- Publication Date:
- 2019-01
- Subjects:
- Speaker recognition -- Vocal effort mismatch -- Shouted speech
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2018.06.002
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 7529.xml