13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE. (September 2020)

Record Type:: Journal Article
Title:: 13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE. (September 2020)
Main Title:: 13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE
Authors:: Matějka, Pavel
Plchot, Oldřich
Glembek, Ondřej
Burget, Lukáš
Rohdin, Johan
Zeinali, Hossein
Mošner, Ladislav
Silnova, Anna
Novotný, Ondřej
Diez, Mireia
"Honza" Černocký, Jan
Abstract:: Highlights: We present a "longitudinal study" of all important milestone techniques used in speaker recognition by evaluating on multiple NIST SREs. We provide an analysis of the difficulty of individual NIST SREs. We investigate the impact of the amount of training data on the performance of particular speaker recognition methods. We also evaluate milestone techniques on the Speakers In The Wild (SITW) and VOiCES challenge datasets, as the amount of, and interest in, user-contributed audiovisual content grows. Abstract: In this paper, we present a brief history and a "longitudinal study" of all important milestone modelling techniques used in text-independent speaker recognition since Brno University of Technology (BUT) first participated in the NIST Speaker Recognition Evaluation (SRE) in 2006: GMM MAP, GMM MAP with eigen-channel adaptation, Joint Factor Analysis, i-vector and DNN embedding (x-vector). To emphasize the historical context, the techniques are evaluated on all NIST SRE sets since 2004 on a time-machine principle, i.e. a system is always trained using all data available up to the year of evaluation. Moreover, as user-contributed audiovisual content dominates today's Internet, we include the Speakers In The Wild (SITW) and VOiCES challenge datasets as representative examples in the evaluation of our systems. We not only present a comparison of the modelling techniques but also show the effect of sampling frequency.
Is Part Of:: Computer speech & language. Volume 63(2020)
Journal:: Computer speech & language
Issue:: Volume 63(2020)
Issue Display:: Volume 63, Issue 2020 (2020)
Year:: 2020
Volume:: 63
Issue:: 2020
Issue Sort Value:: 2020-0063-2020-0000
Page Start:
Page End:
Publication Date:: 2020-09
Subjects:: Speaker recognition -- NIST -- Evaluations -- GMM -- Eigen-channel compensation -- JFA -- I-vectors -- DNN Embedding -- X-vectors
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2019.101035
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 13576.xml