Language-independent acoustic cloning of HTS voices. (May 2019)
- Record Type:
- Journal Article
- Title:
- Language-independent acoustic cloning of HTS voices. (May 2019)
- Main Title:
- Language-independent acoustic cloning of HTS voices
- Authors:
- Magariños, Carmen
Erro, Daniel
Banga, Eduardo R.
- Abstract:
- Highlights: We present a method to transplant speaker identity between synthetic voice models. Suitable for statistical parametric speech synthesis based on hidden Markov models. The method does not require any phonetic or linguistic information. Subjective and objective tests show a good performance of the proposed approach. Our method outperforms the Kullback–Leibler divergence-based transform mapping.
Abstract: Speaker adaptation techniques can be classified as intra-lingual or cross-lingual depending on whether or not the source model and the target speaker employ the same language. Most of the work in this field has been focused on the first case, while the second one has been less explored. In this paper we address the cross-lingual paradigm in the framework of a HMM-based speech synthesis system by further developing a formerly proposed approach. This method is able to clone a given speaker into a different language by combining the linguistic structure and the acoustic characteristics of two HTS models. In this work, we discuss the extension of the adaptation procedure to some other source model parameters that were kept unmodified in the initial version, and compare the performance of both versions by means of subjective and objective tests. These results are also contrasted with those obtained by a KLD-based technique proposed in the literature for a similar purpose. While no significant preference for any of the versions of our method is observed, our approach clearly outperforms the KLD-based technique.
- Is Part Of:
- Computer speech & language. Volume 55(2019)
- Journal:
- Computer speech & language
- Issue:
- Volume 55(2019)
- Issue Display:
- Volume 55, Issue 2019 (2019)
- Year:
- 2019
- Volume:
- 55
- Issue:
- 2019
- Issue Sort Value:
- 2019-0055-2019-0000
- Page Start:
- 168
- Page End:
- 186
- Publication Date:
- 2019-05
- Subjects:
- HMM-based speech synthesis -- Cross-lingual speaker adaptation -- Voice cloning -- Polyglot synthesis -- Multilingual synthesis
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2018.12.006
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9439.xml