Speaker anonymization by modifying fundamental frequency and x-vector singular value. (May 2022)
- Record Type:
- Journal Article
- Title:
- Speaker anonymization by modifying fundamental frequency and x-vector singular value. (May 2022)
- Main Title:
- Speaker anonymization by modifying fundamental frequency and x-vector singular value
- Authors:
- Mawalim, Candy Olivia
Galajit, Kasorn
Karnjana, Jessada
Kidani, Shunsuke
Unoki, Masashi
- Abstract:
- Speaker anonymization is a method of protecting voice privacy by concealing individual speaker characteristics while preserving linguistic information. The VoicePrivacy Challenge 2020 was initiated to generalize the task of speaker anonymization. In the challenge, two frameworks for speaker anonymization were introduced; in this study, we propose a method of improving the primary framework by modifying the state-of-the-art speaker individuality feature (namely, x-vector) in a neural waveform speech synthesis model. Our proposed method is constructed based on x-vector singular value modification with a clustering model. We also propose a technique of modifying the fundamental frequency and speech duration to enhance the anonymization performance. To evaluate our method, we carried out objective and subjective tests. The overall objective test results show that our proposed method improves the anonymization performance in terms of the speaker verifiability, whereas the subjective evaluation results show improvement in terms of the speaker dissimilarity. The intelligibility and naturalness of the anonymized speech with speech prosody modification were slightly reduced (less than 5% of word error rate) compared to the results obtained by the baseline system. Highlights: Anonymizing personally identifiable information is crucial to protect speech privacy. We propose an anonymization method by modifying the F0, duration, and x-vector. A reliable subjective test is conducted to evaluate a speaker anonymization system. The results show that our proposed method improved speaker verifiability. …
- Is Part Of:
- Computer speech & language. Volume 73(2022)
- Journal:
- Computer speech & language
- Issue:
- Volume 73(2022)
- Issue Display:
- Volume 73, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 73
- Issue:
- 2022
- Issue Sort Value:
- 2022-0073-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-05
- Subjects:
- Speaker anonymization -- X-vector singular value -- Fundamental frequency -- Clustering -- Subjective evaluation
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2021.101326
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20398.xml