On the use of blind channel response estimation and a residual neural network to detect physical access attacks to speaker verification systems. (March 2021)

Record Type:: Journal Article
Title:: On the use of blind channel response estimation and a residual neural network to detect physical access attacks to speaker verification systems. (March 2021)
Main Title:: On the use of blind channel response estimation and a residual neural network to detect physical access attacks to speaker verification systems
Authors:: Avila, Anderson R.
Alam, Jahangir
Prado, Fabiano O. Costa
O'Shaughnessy, Douglas
Falk, Tiago H.
Abstract:: Highlights: The use of blind channel response estimation as a new approach for replay attack detection. The proposed method outperformed the baseline systems in two spoofing datasets. Further improvement achieved after combining recent deep learning models. Front- and back-end based on single feature extraction and single neural network classifier.
Abstract: Spoofing attacks have been acknowledged as a serious threat to automatic speaker verification (ASV) systems. In this paper, we are specifically concerned with replay attack scenarios. As a countermeasure to the problem, we propose a front-end based on the blind estimation of the channel response magnitude and as a back-end a residual neural network. Our hypothesis is that the magnitude response of the channel, obtained by subtracting the log-magnitude spectrum of the observed signal from the prediction of the log-magnitude spectrum average of the observed signal's clean counterpart, will capture the nuances of room ambiences, recordings and playback devices. The performance of these features is investigated on a benchmark back-end, based on a Gaussian mixture model and on a deep neural network classifier. Our experiments are performed on the 2017 and 2019 Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof) datasets. The benchmark systems are the same as used in the challenges and are based on constant-Q cepstral coefficients (CQCC) and linear-frequency cepstral coefficients (LFCC) features. …
Is Part Of:: Computer speech & language. Volume 66(2021)
Journal:: Computer speech & language
Issue:: Volume 66(2021)
Issue Display:: Volume 66, Issue 2021 (2021)
Year:: 2021
Volume:: 66
Issue:: 2021
Issue Sort Value:: 2021-0066-2021-0000
Page Start::
Page End::
Publication Date:: 2021-03
Subjects:: Automatic speaker recognition -- Spoofing attacks -- Replay attack -- Channel estimation
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2020.101163
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 15413.xml
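
Note on the abstract: the channel estimate it describes, a difference between the log-magnitude spectrum of the observed signal and a predicted average log-magnitude spectrum of its clean counterpart, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the frame length, the hop size, the Hann window and the averaging over frames are all assumptions made for illustration, and the record does not say how the clean-speech average is predicted, so it is simply passed in as an argument. The sign convention (observed minus clean) follows the usual convolutive model, in which the observed log-magnitude is the clean log-magnitude plus that of the channel; the opposite order differs only in sign.

```python
import numpy as np


def avg_log_magnitude(signal, frame_len=512, hop=256, eps=1e-10):
    """Average log-magnitude spectrum over Hann-windowed frames.

    frame_len, hop and eps are illustrative defaults, not values
    taken from the catalogued article.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitudes + eps).mean(axis=0)


def channel_log_magnitude(observed, clean_avg_log_mag_pred):
    """Hypothetical helper: channel log-magnitude estimate per frequency bin.

    clean_avg_log_mag_pred stands in for the predicted average
    log-magnitude spectrum of the clean counterpart. How that
    prediction is obtained is outside this sketch.
    """
    return avg_log_magnitude(observed) - clean_avg_log_mag_pred


# Toy usage: a "channel" that only halves the amplitude.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
observed = 0.5 * clean
# Oracle stand-in for the prediction step (an assumption of this demo).
clean_pred = avg_log_magnitude(clean)
estimate = channel_log_magnitude(observed, clean_pred)
```

In this toy case the estimate is flat across frequency at roughly log(0.5), because a pure gain change has no spectral shaping. A replayed recording would instead be expected to show frequency-dependent structure from the room, recorder and loudspeaker, which is the cue the abstract proposes to exploit.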