Discriminative subspace modeling of SNR and duration variabilities for robust speaker verification. (September 2017)

Record Type:: Journal Article
Title:: Discriminative subspace modeling of SNR and duration variabilities for robust speaker verification. (September 2017)
Main Title:: Discriminative subspace modeling of SNR and duration variabilities for robust speaker verification
Authors:: Li, Na
Mak, Man-Wai
Lin, Wei-Wei
Chien, Jen-Tzung
Abstract:: Highlights: Model SNR and duration variability of i-vectors in discriminative subspaces. Use variational Bayesian methods to infer the latent variable model that defines the SNR and duration subspaces. Perform better than PLDA, SNR-invariant PLDA and PLDA with uncertainty propagation on long test utterances. Abstract: Although i-vectors together with probabilistic LDA (PLDA) have achieved great success in speaker verification, suppressing the undesirable effects of variability in utterance length and background noise level remains a challenge. This paper aims to improve the robustness of i-vector based speaker verification systems by compensating for utterance-length variability and noise-level variability. Inspired by recent findings that noise-level variability can be modeled by a signal-to-noise ratio (SNR) subspace and that duration variability can be modeled as additive noise in the i-vector space, we propose to add an SNR factor and a duration factor to the PLDA model. In this framework, we assume that i-vectors derived from utterances with comparable durations share similar duration-specific information and that i-vectors extracted from utterances within a narrow SNR range have similar SNR-specific information. Based on these assumptions, an i-vector can be represented as a linear combination of four components: speaker, SNR, duration, and channel. A variational Bayes algorithm is developed to infer this latent variable model via a …
Is Part Of:: Computer speech & language. Volume 45(2017)
Journal:: Computer speech & language
Issue:: Volume 45(2017)
Issue Display:: Volume 45, Issue 2017 (2017)
Year:: 2017
Volume:: 45
Issue:: 2017
Issue Sort Value:: 2017-0045-2017-0000
Page Start:: 83
Page End:: 103
Publication Date:: 2017-09
Subjects:: Speaker verification -- Duration variation -- SNR mismatch -- Variational Bayes -- I-vector -- PLDA
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2017.04.001
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 2060.xml