HMM-based expressive singing voice synthesis with singing style control and robust pitch modeling. (November 2015)
- Record Type:
- Journal Article
- Title:
- HMM-based expressive singing voice synthesis with singing style control and robust pitch modeling. (November 2015)
- Main Title:
- HMM-based expressive singing voice synthesis with singing style control and robust pitch modeling
- Authors:
- Nose, Takashi
Kanemoto, Misa
Koriyama, Tomoki
Kobayashi, Takao
- Abstract:
- Highlights: We propose singing style control for HMM-based expressive singing voice synthesis. Feature-space pitch adaptive training is proposed for robust pitch modeling. Robust vibrato modeling is proposed for unclear vibrato expressions. Experiments show the techniques improve the naturalness and similarity. Experiments show we can control singing style expressivity intuitively.
Abstract: This paper proposes a singing style control technique based on multiple regression hidden semi-Markov models (MRHSMMs) for changing singing styles and their intensities appearing in synthetic singing voices. In the proposed technique, singing styles and their intensities are represented by low-dimensional vectors called style vectors and are modeled in accordance with the assumption that mean parameters of acoustic models are given as multiple regressions of the style vectors. In the synthesis process, we can weaken or emphasize the intensities of singing styles by setting a desired style vector. In addition, the idea of pitch adaptive training is extended to the case of the MRHSMM to improve the modeling accuracy of pitch associated with musical notes. A novel vibrato modeling technique is also presented to extract vibrato parameters from singing voices that sometimes have unclear vibrato expressions. Subjective evaluations show that we can intuitively control singing styles and their intensities while maintaining the naturalness of synthetic singing voices comparable to the conventional HSMM-based singing voice synthesis.
- Is Part Of:
- Computer speech & language. Volume 34(2015)
- Journal:
- Computer speech & language
- Issue:
- Volume 34(2015)
- Issue Display:
- Volume 34, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 34
- Issue:
- 2015
- Issue Sort Value:
- 2015-0034-2015-0000
- Page Start:
- 308
- Page End:
- 322
- Publication Date:
- 2015-11
- Subjects:
- HMM-based singing voice synthesis -- Singing style control -- Multiple-regression HSMM -- Pitch adaptive training -- Vibrato modeling
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2015.04.001
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 6446.xml