Deep domain adaptation for anti-spoofing in speaker verification systems. (November 2019)
- Record Type:
- Journal Article
- Title:
- Deep domain adaptation for anti-spoofing in speaker verification systems. (November 2019)
- Main Title:
- Deep domain adaptation for anti-spoofing in speaker verification systems
- Authors:
- Himawan, Ivan
Villavicencio, Fernando
Sridharan, Sridha
Fookes, Clinton
- Abstract:
- Highlights: Novel deep domain architectures based on CNN for spoofing detection are proposed. A comparison of features using a CNN as a back-end classifier. An extensive cross-database evaluation on ASVspoof and BTAS databases. Shows that deep domain adaptation in both supervised and unsupervised settings improves the spoofing detection performance.
Abstract: With the increasing use of voice as a biometric, it has become imperative to develop countermeasures to thwart malicious spoofing attacks on speaker recognition systems. Even though there has been a significant research effort over the last few years dedicated to the development of countermeasures to detect and deflect spoofing attacks, the problem is far from being solved. While a deep learning technique has been successfully applied in anti-spoofing research, it suffers from a data scarcity issue where large amounts of labeled training data are required to build a robust model. In this paper, we investigate a domain adaptation approach of deep architectures in both a supervised setting where we use labeled data, and in an unsupervised setting where we assume unlabeled data when transferring knowledge from the source to target domain. Specifically, we employ convolutional neural networks (CNNs) as back-end classifiers for spoofed speech detection. For supervised domain adaptation, we propose joint neural network training while allowing the weights to be shared between the source and target streams, and an additional domain regularizer. In the unsupervised domain adaptation scenario, the weights are not shared in order to explicitly model the domain shift. However, the weights are related by weight regularizers to take into account the difference between the two domains. We conduct extensive cross-database (domain mismatch) experiments using ASVspoof 2015 and BTAS 2016 datasets to demonstrate the generalization capability of the proposed deep domain architectures for spoofing detection. Experimental results reveal that the proposed architectures can generalize across databases for both supervised and unsupervised adaptation scenarios.
- Is Part Of:
- Computer speech & language. Volume 58(2019)
- Journal:
- Computer speech & language
- Issue:
- Volume 58(2019)
- Issue Display:
- Volume 58, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 58
- Issue:
- 2019
- Issue Sort Value:
- 2019-0058-2019-0000
- Page Start:
- 377
- Page End:
- 402
- Publication Date:
- 2019-11
- Subjects:
- Convolutional neural networks -- ASVspoof 2015 -- BTAS 2016 -- Deep domain adaptation -- Speaker verification -- Cross-database study
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2019.05.007
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 11148.xml