Synthetic speech detection using fundamental frequency variation and spectral features. (March 2018)

Record Type:: Journal Article
Title:: Synthetic speech detection using fundamental frequency variation and spectral features. (March 2018)
Main Title:: Synthetic speech detection using fundamental frequency variation and spectral features
Authors:: Pal, Monisankha
Paul, Dipjyoti
Saha, Goutam
Abstract:: Highlights: Proposed synthetic speech detection using score fusion of CQCC, APGDF and fundamental frequency variation (FFV) features. Best spoofing detection performance on the ASVspoof 2015 evaluation dataset, with an overall EER of 0.05%. Produced state-of-the-art performance for ASV integrated with a countermeasure framework. Superior performance in generalization ability assessment. Abstract: Recent work on the vulnerability of automatic speaker verification (ASV) systems confirms that malicious spoofing attacks using synthetic speech can provoke a significant increase in the false acceptance rate. Reliable detection of synthetic speech is key to developing countermeasures against synthetic-speech-based spoofing attacks. In this paper, we address this by focusing on three major types of artifacts, related to magnitude, phase and pitch variation, which are introduced during the generation of synthetic speech. We propose a new approach to detecting synthetic speech using score-level fusion of three front-end features, namely constant Q cepstral coefficients (CQCCs), the all-pole group delay function (APGDF) and fundamental frequency variation (FFV). CQCC and APGDF were each used earlier for the spoofing detection task and yielded the best performance among magnitude- and phase-spectrum-related features, respectively. The novel FFV feature, introduced in this paper to extract pitch variation at the frame level, provides complementary information to CQCC and APGDF. Experimental results show that the …
Is Part Of:: Computer speech & language. Volume 48(2018)
Journal:: Computer speech & language
Issue:: Volume 48(2018)
Issue Display:: Volume 48, Issue 2018 (2018)
Year:: 2018
Volume:: 48
Issue:: 2018
Issue Sort Value:: 2018-0048-2018-0000
Page Start:: 31
Page End:: 50
Publication Date:: 2018-03
Subjects:: All-pole group delay function (APGDF) -- Anti-spoofing -- Constant Q cepstral coefficient (CQCC) -- Fundamental frequency variation (FFV) -- Score-level fusion -- Spoofing attack
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2017.10.001
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 5454.xml