Optimization of the area under the ROC curve using neural network supervectors for text-dependent speaker verification. (September 2020)

Record Type:: Journal Article
Title:: Optimization of the area under the ROC curve using neural network supervectors for text-dependent speaker verification. (September 2020)
Main Title:: Optimization of the area under the ROC curve using neural network supervectors for text-dependent speaker verification
Authors:: Mingote, Victoria
Miguel, Antonio
Ortega, Alfonso
Lleida, Eduardo
Abstract:: Highlights: The use of different alignment techniques inside deep neural networks to keep the temporal structure and to encode the relevant information in a differentiable supervector. The use of Convolutional Neural Networks to increase the temporal context of the representation for the training process of the system. The use of a cost function close to the objective of the verification task. A set of experiments on a small database with several phrases for text-dependent, phrase-based speaker verification. A study of the system performance using the proposed alignment techniques in combination with two neural network alternatives, and of the proposed novel cost function in contrast with the traditional triplet loss. Abstract: This paper explores two techniques to improve the performance of text-dependent speaker verification systems based on deep neural networks. First, we propose a general alignment mechanism to keep the temporal structure of each phrase and obtain a supervector with the speaker and phrase information, since both are relevant for text-dependent verification. As we show, it is possible to use different alignment techniques to replace the global average pooling, providing significant gains in performance. Moreover, we also present a novel back-end approach to train a neural network for detection tasks by optimizing the Area Under the Curve (AUC) as an alternative to the usual triplet loss function, so the system is end-to-end, with a cost function …
Is Part Of:: Computer speech & language. Volume 63(2020)
Journal:: Computer speech & language
Issue:: Volume 63(2020)
Issue Display:: Volume 63, Issue 2020 (2020)
Year:: 2020
Volume:: 63
Issue:: 2020
Issue Sort Value:: 2020-0063-2020-0000
Page Start:
Page End:
Publication Date:: 2020-09
Subjects:: Text dependent speaker verification -- Supervectors -- Alignment -- Triplet neural network -- AUC
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2020.101078
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 13576.xml