ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. (November 2020)
- Record Type:
- Journal Article
- Title:
- ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. (November 2020)
- Main Title:
- ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech
- Authors:
- Wang, Xin
Yamagishi, Junichi
Todisco, Massimiliano
Delgado, Héctor
Nautsch, Andreas
Evans, Nicholas
Sahidullah, Md
Vestman, Ville
Kinnunen, Tomi
Lee, Kong Aik
Juvela, Lauri
Alku, Paavo
Peng, Yu-Huai
Hwang, Hsin-Te
Tsao, Yu
Wang, Hsin-Min
Le Maguer, Sébastien
Becker, Markus
Henderson, Fergus
Clark, Rob
Zhang, Yu
Wang, Quan
Jia, Ye
Onuma, Kai
Mushika, Koji
Kaneda, Takashi
Jiang, Yuan
Liu, Li-Juan
Wu, Yi-Chiao
Huang, Wen-Chin
Toda, Tomoki
Tanaka, Kou
Kameoka, Hirokazu
Steiner, Ingmar
Matrouf, Driss
Bonastre, Jean-François
Govender, Avashna
Ronanki, Srikanth
Zhang, Jing-Xuan
Ling, Zhen-Hua
- Abstract:
- Highlights: We describe the protocol and design of the ASVspoof Challenge 2019 database. We detail the speech synthesis and voice conversion algorithms used in the database. We detail the carefully controlled simulation used to generate replay spoofing speech. We evaluate baseline countermeasure and ASV systems on the database. Human assessment found that one spoofing system can fool human listeners.
Abstract: Automatic speaker verification (ASV) is one of the most natural and convenient means of biometric person recognition. Unfortunately, just like all other biometric systems, ASV is vulnerable to spoofing, also referred to as "presentation attacks." These vulnerabilities are generally unacceptable and call for spoofing countermeasures or "presentation attack detection" systems. In addition to impersonation, ASV systems are vulnerable to replay, speech synthesis, and voice conversion attacks. The ASVspoof challenge initiative was created to foster research on anti-spoofing and to provide common platforms for the assessment and comparison of spoofing countermeasures. The first edition, ASVspoof 2015, focused upon the study of countermeasures for detecting text-to-speech synthesis (TTS) and voice conversion (VC) attacks. The second edition, ASVspoof 2017, focused instead upon replay spoofing attacks and countermeasures. The ASVspoof 2019 edition is the first to consider all three spoofing attack types within a single challenge. While all three originate from the same source database and the same underlying protocol, they are explored in two specific use case scenarios. Spoofing attacks within a logical access (LA) scenario are generated with the latest speech synthesis and voice conversion technologies, including state-of-the-art neural acoustic and waveform model techniques. Replay spoofing attacks within a physical access (PA) scenario are generated through carefully controlled simulations that support much more revealing analysis than was previously possible. Also new to the 2019 edition is the use of the tandem detection cost function metric, which reflects the impact of spoofing and countermeasures on the reliability of a fixed ASV system. This paper describes the database design, protocol, spoofing attack implementations, and baseline ASV and countermeasure results. It also describes a human assessment of spoofed data in the logical access scenario. The assessment demonstrated that the spoofing data in the ASVspoof 2019 database have varied degrees of perceived quality and similarity to the target speakers, including spoofed data that cannot be differentiated from bona fide utterances even by human subjects. It is expected that the ASVspoof 2019 database, with its varied coverage of different types of spoofing data, could further foster research on anti-spoofing.
- Is Part Of:
- Computer speech & language. Volume 64(2020)
- Journal:
- Computer speech & language
- Issue:
- Volume 64(2020)
- Issue Display:
- Volume 64, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 64
- Issue:
- 2020
- Issue Sort Value:
- 2020-0064-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-11
- Subjects:
- Automatic speaker verification -- Countermeasure -- Anti-spoofing -- Presentation attack -- Presentation attack detection -- Text-to-speech synthesis -- Voice conversion -- Replay -- ASVspoof challenge -- Biometrics -- Media forensics
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2020.101114
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 13459.xml