Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation. (November 2018)

Record Type:: Journal Article
Title:: Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation. (November 2018)
Main Title:: Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation
Authors:: Sharma, Pulkit
Abrol, Vinayak
Nivedita,
Sao, Anil Kumar
Abstract:: In this paper, we explore the framework of compressed sensing (CS) and sparse representation (SR) to reduce the footprint of a unit selection based speech synthesis (USS) system. In the CS based framework, footprint reduction is achieved by storing either CS measurements or the signs of CS measurements, instead of the raw speech waveforms. For efficient reconstruction from CS measurements, the speech signal should have a sparse representation over a predefined basis/dictionary. Hence, in this work, we also study the effectiveness of sparse representation for compressing the speech waveform. The experimental results are demonstrated using an analytical dictionary (DCT matrix) and several learned dictionaries, derived using the K-singular value decomposition (KSVD), method of optimal directions (MOD), greedy adaptive dictionary (GAD) and principal component analysis (PCA) algorithms. To further increase compression in the SR based framework for footprint reduction, the significant coefficients of the sparse vector are selected adaptively, based on the type of speech segment (e.g., voiced, unvoiced, etc.). Experimental studies on two different Indian languages suggest that CS/SR based footprint reduction methods can be used as an alternative to the existing compression methods employed in USS systems.
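Illustrative note (not part of the catalogue record or the paper): the two storage strategies named in the abstract can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The frame is synthetic and exactly sparse in the DCT dictionary, the measurement matrix is random Gaussian, and reconstruction uses orthogonal matching pursuit, which is a common choice that the abstract does not name. The function names (`dct_dictionary`, `keep_largest`, `omp`, `demo`) and all sizes are hypothetical. The sign-only (1-bit) variant and the learned dictionaries (KSVD, MOD, GAD, PCA) are not covered.

```python
import numpy as np

def dct_dictionary(n):
    """Orthonormal DCT-II synthesis dictionary; each column is one cosine atom."""
    k = np.arange(n)[None, :]          # atom (frequency) index
    t = np.arange(n)[:, None]          # sample index
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    D[:, 0] /= np.sqrt(2.0)            # rescale the DC atom to unit norm
    return D

def keep_largest(coeffs, k):
    """SR-style compression: keep the k largest-magnitude coefficients."""
    out = np.zeros_like(coeffs)
    idx = np.argsort(np.abs(coeffs))[-k:]
    out[idx] = coeffs[idx]
    return out

def omp(A, y, k):
    """Orthogonal matching pursuit (assumed solver): greedy k-sparse fit of y = A s."""
    norms = np.linalg.norm(A, axis=0)
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual) / norms))
        if j in support:
            break
        support.append(j)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    s = np.zeros(A.shape[1])
    s[support] = coef
    return s

def demo(n=256, m=128, k=5, seed=0):
    """Return relative reconstruction errors (SR route, CS route)."""
    rng = np.random.default_rng(seed)
    D = dct_dictionary(n)

    # Synthetic stand-in for a speech frame: exactly k-sparse in the DCT.
    s_true = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)
    s_true[support] = rng.choice([-1.0, 1.0], size=k) * rng.uniform(1.0, 2.0, size=k)
    frame = D @ s_true

    # SR route: store only k coefficients (and their positions).
    s_kept = keep_largest(D.T @ frame, k)
    sr_err = np.linalg.norm(frame - D @ s_kept) / np.linalg.norm(frame)

    # CS route: store m < n random measurements instead of n samples.
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)
    y = Phi @ frame
    s_hat = omp(Phi @ D, y, k)
    cs_err = np.linalg.norm(frame - D @ s_hat) / np.linalg.norm(frame)
    return sr_err, cs_err

if __name__ == "__main__":
    print(demo())
```

Real speech frames are only approximately sparse, so both errors would be nonzero in practice and would depend on the dictionary and on how many coefficients or measurements are stored.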
Is Part Of:: Computer speech & language. Volume 52(2018)
Journal:: Computer speech & language
Issue:: Volume 52(2018)
Issue Display:: Volume 52, Issue 2018 (2018)
Year:: 2018
Volume:: 52
Issue:: 2018
Issue Sort Value:: 2018-0052-2018-0000
Page Start:: 191
Page End:: 208
Publication Date:: 2018-11
Subjects:: Sparse representation -- Speech synthesis -- Dictionary learning -- Compressed sensing
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2018.05.003
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 17055.xml