Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation. (November 2018)
- Record Type:
- Journal Article
- Title:
- Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation. (November 2018)
- Main Title:
- Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation
- Authors:
- Sharma, Pulkit
Abrol, Vinayak
Nivedita
Sao, Anil Kumar
- Abstract:
- In this paper, we explore the frameworks of compressed sensing (CS) and sparse representation (SR) to reduce the footprint of a unit selection based speech synthesis (USS) system. In the CS based framework, footprint reduction is achieved by storing either the CS measurements or the signs of the CS measurements instead of the raw speech waveforms. For efficient reconstruction from CS measurements, the speech signal should have a sparse representation over a predefined basis/dictionary. Hence, in this work, we also study the effectiveness of sparse representation for compressing the speech waveform. Experimental results are reported for an analytical dictionary (DCT matrix) and for several learned dictionaries derived using the K-singular value decomposition (KSVD), method of optimal directions (MOD), greedy adaptive dictionary (GAD), and principal component analysis (PCA) algorithms. To further increase compression in the SR based framework, the significant coefficients of the sparse vector are selected adaptively, based on the type of speech segment (e.g., voiced, unvoiced). Experimental studies on two different Indian languages suggest that CS/SR based footprint reduction methods can be used as an alternative to the existing compression methods employed in USS systems.
- Is Part Of:
- Computer speech & language. Volume 52(2018)
- Journal:
- Computer speech & language
- Issue:
- Volume 52(2018)
- Issue Display:
- Volume 52, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 52
- Issue:
- 2018
- Issue Sort Value:
- 2018-0052-2018-0000
- Page Start:
- 191
- Page End:
- 208
- Publication Date:
- 2018-11
- Subjects:
- Sparse representation -- Speech synthesis -- Dictionary learning -- Compressed sensing
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2018.05.003
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17055.xml
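
Illustrative note on the abstract: the two storage ideas it describes (keeping only the significant coefficients of a sparse representation over a DCT dictionary, and keeping CS measurements or only their signs) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the frame length, the number of retained coefficients, and the Gaussian measurement matrix are all hypothetical choices made for illustration, and reconstruction from CS measurements (which needs a sparse recovery solver) is left out.

```python
import numpy as np

def dct_dictionary(n):
    """Orthonormal DCT-II matrix of size n x n; rows are basis vectors."""
    k = np.arange(n)[:, None]   # frequency index
    t = np.arange(n)[None, :]   # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    d[0, :] /= np.sqrt(2.0)     # rescale the DC row so the matrix is orthonormal
    return d

def keep_largest(coeffs, k):
    """Zero all but the k largest-magnitude coefficients."""
    sparse = np.zeros_like(coeffs)
    idx = np.argsort(np.abs(coeffs))[-k:]
    sparse[idx] = coeffs[idx]
    return sparse

def compress_frame(frame, k):
    """SR-style compression: DCT coefficients, truncated to k entries."""
    d = dct_dictionary(len(frame))
    return keep_largest(d @ frame, k)

def reconstruct_frame(sparse_coeffs):
    """Invert the orthonormal DCT (the transpose is the inverse)."""
    d = dct_dictionary(len(sparse_coeffs))
    return d.T @ sparse_coeffs

def cs_measurements(frame, m, seed=0):
    """CS-style storage: m < len(frame) random projections and their signs.

    The Gaussian matrix is an assumption; it is regenerated from the seed,
    so only the measurements (or their signs) would need to be stored.
    """
    rng = np.random.default_rng(seed)
    phi = rng.standard_normal((m, len(frame))) / np.sqrt(m)
    y = phi @ frame
    return y, np.sign(y)

if __name__ == "__main__":
    n = 64
    d = dct_dictionary(n)
    # Synthetic frame that is exactly 3-sparse in the DCT domain.
    frame = 2.0 * d[3] - 1.5 * d[10] + 0.5 * d[20]
    coeffs = compress_frame(frame, k=3)
    error = np.max(np.abs(reconstruct_frame(coeffs) - frame))
    y, signs = cs_measurements(frame, m=16)
    print("nonzero coefficients stored:", np.count_nonzero(coeffs))
    print("max reconstruction error:", error)
    print("CS measurements stored:", y.shape[0], "instead of", n, "samples")
```

In this sketch the stored footprint shrinks from 64 samples to 3 coefficient values (plus their indices), or to 16 measurements, or to 16 sign bits. Real speech frames are only approximately sparse, so the number of retained coefficients trades size against quality, which is where the adaptive selection by segment type mentioned in the abstract would come in.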