Accentron: Foreign accent conversion to arbitrary non-native speakers using zero-shot learning. (March 2022)

Record Type:: Journal Article
Title:: Accentron: Foreign accent conversion to arbitrary non-native speakers using zero-shot learning. (March 2022)
Main Title:: Accentron: Foreign accent conversion to arbitrary non-native speakers using zero-shot learning
Authors:: Ding, Shaojin
Zhao, Guanlong
Gutierrez-Osuna, Ricardo
Abstract:: Abstract: Foreign accent conversion (FAC) aims to create a new voice that has the voice identity of a given second-language (L2) speaker but with a native (L1) accent . Previous FAC approaches usually require training a separate model for each L2 speaker and, more importantly, generally require considerable speech data from each L2 speaker for training. To address these limitations, we propose Accentron, an approach that can generate accent-converted speech for arbitrary L2 speakers unseen during training. In the proposed approach, we first train a speaker-independent acoustic model on L1 corpora to extract bottleneck features that represent the linguistic content of utterances. Then, we develop a speaker encoder and an accent encoder to generate embedding vectors for the desired voice identity (L2 speaker's) and accent (L1 accent), respectively. Lastly, we use a sequence-to-sequence model to transform bottleneck-features to Mel-spectrograms, conditioned on the L2 speaker embedding and the L1 accent embedding. We conducted experiments on the L2-ARCTIC corpus under two testing conditions: the standard FAC setting where test L2 speakers were seen during training, and a zero-shot FAC setting where test L2 speakers were unseen during training. Accentron achieves over 27% relative improvement in accentedness ratings compared to two state-of-the-art FAC systems in the standard FAC setting. More importantly, our results show that Accentron generalizes to the zero-shot FAC setting … (more)
Is Part Of:: Computer speech & language. Volume 72(2022)
Journal:: Computer speech & language
Issue:: Volume 72(2022)
Issue Display:: Volume 72, Issue 2022 (2022)
Year:: 2022
Volume:: 72
Issue:: 2022
Issue Sort Value:: 2022-0072-2022-0000
Page Start:
Page End:
Publication Date:: 2022-03
Subjects:: Foreign accent conversion -- Speech synthesis -- Computer-assisted pronunciation training
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/ ↗
http://www.elsevier.com/journals ↗
DOI:: 10.1016/j.csl.2021.101302 ↗
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms) ↗
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 20051.xml