A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification. (May 2015)
- Record Type:
- Journal Article
- Title:
- A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification. (May 2015)
- Main Title:
- A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification
- Authors:
- Poblete, Victor
Espic, Felipe
King, Simon
Stern, Richard M.
Huenupán, Fernando
Fredes, Josué
Yoma, Nestor Becerra
- Abstract:
- Highlights: A novel method to instantaneously normalize speech features is proposed (LNCC). LNCC does not need any alterations to the statistical modeling. LNCC dramatically compensates for static spectral tilts. LNCC can be more robust to time varying spectral tilts than standard methods. LNCC does not require the computation and storage of additional statistics.
Abstract: This paper proposes a new set of speech features called Locally-Normalized Cepstral Coefficients (LNCC) that are based on Seneff's Generalized Synchrony Detector (GSD). First, an analysis of the GSD frequency response is provided to show that it generates spurious peaks at harmonics of the detected frequency. Then, the GSD frequency response is modeled as a quotient of two filters centered at the detected frequency. The numerator is a triangular band pass filter centered around a particular frequency similar to the ordinary Mel filters. The denominator term is a filter that responds maximally to frequency components on either side of the numerator filter. As a result, a local normalization is performed without the spurious peaks of the original GSD. Speaker verification results demonstrate that the proposed LNCC features are of low computational complexity and far more effectively compensate for spectral tilt than ordinary MFCC coefficients. LNCC features do not require the computation and storage of a moving average of the feature values, and they provide relative reductions in Equal Error Rate (EER) as high as 47.7%, 34.0% or 25.8% when compared with MFCC, MFCC + CMN, or MFCC + RASTA in one case of variable spectral tilt, respectively. …
- Is Part Of:
- Computer speech & language. Volume 31(2015)
- Journal:
- Computer speech & language
- Issue:
- Volume 31(2015)
- Issue Display:
- Volume 31, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 31
- Issue:
- 2015
- Issue Sort Value:
- 2015-0031-2015-0000
- Page Start:
- 1
- Page End:
- 27
- Publication Date:
- 2015-05
- Subjects:
- Channel robust feature extraction -- Auditory models -- Spectral local normalization -- Synchrony detection
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2014.10.006
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5432.xml
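
- Illustrative note:
- The abstract describes local normalization as a quotient of two filters: a triangular band-pass numerator and a denominator that responds to the frequencies on either side of it. The Python sketch below is only a toy illustration of why such a quotient can reduce sensitivity to channel gain and spectral tilt. It is not the authors' LNCC implementation: the filter shapes, the function names (`triangular_weights`, `flank_weights`, `locally_normalized_log_energy`), the bin layout, and the synthetic spectrum are all assumptions made for this example, and no Mel warping or cepstral transform is included.

```python
import math


def triangular_weights(n_bins, center, half_width):
    """Triangular weights peaking at `center`, zero `half_width` bins away."""
    return [max(0.0, 1.0 - abs(k - center) / half_width) for k in range(n_bins)]


def flank_weights(n_bins, center, half_width):
    """Hypothetical denominator filter: zero at `center`, peaking on both sides."""
    left = triangular_weights(n_bins, center - half_width, half_width)
    right = triangular_weights(n_bins, center + half_width, half_width)
    return [l + r for l, r in zip(left, right)]


def band_energy(weights, power_spectrum):
    """Weighted sum of power-spectrum bins."""
    return sum(w * p for w, p in zip(weights, power_spectrum))


def plain_log_energy(power_spectrum, center, half_width):
    """Log energy of the triangular band alone (no local normalization)."""
    n = len(power_spectrum)
    return math.log(band_energy(triangular_weights(n, center, half_width), power_spectrum))


def locally_normalized_log_energy(power_spectrum, center, half_width):
    """Log of the numerator-band energy divided by the flanking-band energy."""
    n = len(power_spectrum)
    num = band_energy(triangular_weights(n, center, half_width), power_spectrum)
    den = band_energy(flank_weights(n, center, half_width), power_spectrum)
    if num <= 0.0 or den <= 0.0:
        raise ValueError("band energies must be positive")
    return math.log(num / den)


# Synthetic power spectrum (an assumption for the demo, not real speech).
spectrum = [1.0 + 0.5 * math.sin(0.3 * k) for k in range(64)]
# Channel 1: constant gain. Channel 2: exponential spectral tilt.
gained = [4.0 * p for p in spectrum]
tilted = [p * math.exp(-0.05 * k) for k, p in enumerate(spectrum)]

center, half_width = 40, 6
ref_plain = plain_log_energy(spectrum, center, half_width)
ref_local = locally_normalized_log_energy(spectrum, center, half_width)

plain_gain_shift = plain_log_energy(gained, center, half_width) - ref_plain
local_gain_shift = locally_normalized_log_energy(gained, center, half_width) - ref_local
plain_tilt_shift = plain_log_energy(tilted, center, half_width) - ref_plain
local_tilt_shift = locally_normalized_log_energy(tilted, center, half_width) - ref_local

print("constant gain: plain shift", plain_gain_shift, "local shift", local_gain_shift)
print("spectral tilt: plain shift", plain_tilt_shift, "local shift", local_tilt_shift)
```

In this toy setting a constant gain multiplies numerator and denominator equally and cancels in the quotient, while the plain log energy shifts by the log of the gain. A tilt is not cancelled exactly, because the gain differs slightly between the two filters, but the residual shift is smaller than for the unnormalized band. No statistics are accumulated across frames, which is consistent with the abstract's point that no moving average needs to be computed or stored.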