VStyclone: Real-time Chinese voice style clone. (January 2023)
- Record Type:
- Journal Article
- Title:
- VStyclone: Real-time Chinese voice style clone. (January 2023)
- Main Title:
- VStyclone: Real-time Chinese voice style clone
- Authors:
- Wu, Yichun
Zhao, Huihuang
Liang, Xiaoman
Sun, Yaqi
- Abstract:
- This paper proposes a novel Chinese speech cloning model named VStyclone, which consists of three stages: multi-speaker training, target speaker encoding, and target speaker synthesis. We design an efficient tone extractor that reallocates resources across the sequences of log-mel spectrogram frames obtained from multiple speakers' speech, allowing the network to learn the features of different speakers differently. This lets the network focus more on the voice features of the target speaker and extract them accurately. To cluster the voices of the same speaker and separate the voices of different speakers, we build an optimal softmax loss to optimize the model. We then develop a style synthesizer that is based on the transformer rather than a recurrent neural network, so that the model can process text information in parallel and is better able to handle long-distance contextual information. We also embed a style extraction module in the style synthesizer to dynamically capture style ranges in an unsupervised manner. In addition, VStyclone uses a generative adversarial network as the base generation model of the vocoder to improve generation speed; it runs 1.2 times faster than real time on CPU. Overall, VStyclone achieves state-of-the-art results.
- Is Part Of:
- Computers & electrical engineering. Volume 105(2023)
- Journal:
- Computers & electrical engineering
- Issue:
- Volume 105(2023)
- Issue Display:
- Volume 105, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 105
- Issue:
- 2023
- Issue Sort Value:
- 2023-0105-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- VStyclone -- Voice clone -- Efficient tone extractor -- Style synthesizer -- Transformer -- Vocoder
Computer engineering -- Periodicals
Electrical engineering -- Periodicals
Electrical engineering -- Data processing -- Periodicals
Ordinateurs -- Conception et construction -- Périodiques
Électrotechnique -- Périodiques
Électrotechnique -- Informatique -- Périodiques
Computer engineering
Electrical engineering
Electrical engineering -- Data processing
Periodicals
Electronic journals
621.302854
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00457906/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compeleceng.2022.108534
- Languages:
- English
- ISSNs:
- 0045-7906
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.680000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25029.xml