Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition. (November 2019)

Record Type:: Journal Article
Title:: Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition. (November 2019)
Main Title:: Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition
Authors:: Novotný, Ondřej
Plchot, Oldřich
Glembek, Ondřej
Černocký, Jan "Honza"
Burget, Lukáš
Abstract:: Highlights: This work presents an analysis of a DNN-based autoencoder for speech enhancement, dereverberation and denoising for robust speaker verification (SV). It includes a detailed performance analysis for different SV system paradigms (i-vectors vs. x-vectors), features and test conditions. Presented results are computed on the public and widely used NIST SRE 2010, 2016, PRISM and SITW. It also presents results achieved with pure autoencoder enhancement, multi-condition PLDA training and their simultaneous use. Best system performance is achieved by combining the DNN autoencoder with x-vectors and multi-condition training in PLDA. Abstract: In this work, we present an analysis of a DNN-based autoencoder for speech enhancement, dereverberation and denoising. The target application is a robust speaker verification (SV) system. We start our approach by carefully designing a data augmentation process to cover a wide range of acoustic conditions and to obtain rich training data for various components of our SV system. We augment several well-known databases used in SV with artificially noised and reverberated data and we use them to train a denoising autoencoder (mapping noisy and reverberated speech to its clean version) as well as an x-vector extractor which is currently considered as state-of-the-art in SV. Later, we use the autoencoder as a preprocessing step for a text-independent SV system. We compare results achieved with autoencoder enhancement, multi-condition PLDA …
Is Part Of:: Computer speech & language. Volume 58(2019)
Journal:: Computer speech & language
Issue:: Volume 58(2019)
Issue Display:: Volume 58, Issue 2019 (2019)
Year:: 2019
Volume:: 58
Issue:: 2019
Issue Sort Value:: 2019-0058-2019-0000
Page Start:: 403
Page End:: 421
Publication Date:: 2019-11
Subjects:: Speaker verification -- Signal enhancement -- Autoencoder -- Neural network -- Robustness -- Embedding
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2019.06.004
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 11148.xml