Sparse coding based features for speech units classification. (January 2018)
- Record Type:
- Journal Article
- Title:
- Sparse coding based features for speech units classification. (January 2018)
- Main Title:
- Sparse coding based features for speech units classification
- Authors:
- Sharma, Pulkit
Abrol, Vinayak
Dileep, A.D.
Sao, Anil Kumar
- Abstract:
- Highlights: Sparse representation (SR) based features for speech units classification. PCA-based multiple dictionaries to extract SR based feature from speech signals. Emphasizing discriminative information using a dictionary based on weighted decomposition of principal components. Proposed sparse features perform well even in noisy conditions.
Abstract: In this work, we propose sparse representation based features for speech units classification tasks. In order to effectively capture the variations in a speech unit, the proposed method employs multiple class specific dictionaries. Here, the training data belonging to each class is clustered into multiple clusters, and a principal component analysis (PCA) based dictionary is learnt for each cluster. It has been observed that coefficients corresponding to middle principal components can effectively discriminate among different speech units. Exploiting this observation, we propose to use a transformation function known as weighted decomposition (WD) of principal components, which is used to emphasize the discriminative information present in the PCA-based dictionary. In this paper, both raw speech samples and mel frequency cepstral coefficients (MFCC) are used as an initial representation for feature extraction. For comparison, various popular dictionary learning techniques such as K-singular value decomposition (KSVD), simultaneous codeword optimization (SimCO) and greedy adaptive dictionary (GAD) are also employed in the proposed framework. The effectiveness of the proposed features is demonstrated using continuous density hidden Markov model (CDHMM) based classifiers for (i) classification of isolated utterances of E-set of English alphabet, (ii) classification of consonant-vowel (CV) segments in Hindi language and (iii) classification of phoneme from TIMIT phonetic corpus. …
- Is Part Of:
- Computer speech & language. Volume 47(2018)
- Journal:
- Computer speech & language
- Issue:
- Volume 47(2018)
- Issue Display:
- Volume 47, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 47
- Issue:
- 2018
- Issue Sort Value:
- 2018-0047-2018-0000
- Page Start:
- 333
- Page End:
- 350
- Publication Date:
- 2018-01
- Subjects:
- Sparse representation -- Dictionary learning -- Speech recognition
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2017.08.004
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20786.xml
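
The pipeline outlined in the abstract (cluster each class's training data, learn a PCA-based dictionary per cluster, use sparse coefficients as features) might look roughly like the sketch below. It is an illustration under several assumptions, not the authors' implementation: the clustering is plain k-means, sparse coding keeps the largest-magnitude coefficients (valid because a PCA dictionary has orthonormal columns), features from all dictionaries are simply concatenated, and the weighted decomposition (WD) step is omitted because the abstract does not define it. All function names and parameters are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed clustering choice); returns row labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def pca_dictionary(X, n_atoms):
    """Return (mean, D); columns of D are the leading principal directions of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_atoms].T

def sparse_code(x, mean, D, sparsity):
    """Keep the `sparsity` largest-magnitude coefficients of x in the basis D."""
    coeffs = D.T @ (x - mean)
    keep = np.argsort(np.abs(coeffs))[-sparsity:]
    code = np.zeros_like(coeffs)
    code[keep] = coeffs[keep]
    return code

def learn_dictionaries(data_by_class, n_clusters, n_atoms):
    """One PCA dictionary per cluster per class; near-empty clusters are skipped."""
    dictionaries = []
    for label, X in data_by_class.items():
        labels = kmeans(X, n_clusters)
        for j in range(n_clusters):
            members = X[labels == j]
            if len(members) > 1:
                dictionaries.append((label, *pca_dictionary(members, n_atoms)))
    return dictionaries

def extract_features(x, dictionaries, sparsity):
    """Concatenate sparse codes from every dictionary (an assumed combination rule)."""
    return np.concatenate([sparse_code(x, m, D, sparsity) for _, m, D in dictionaries])

# Synthetic stand-in data: two classes of 8-dimensional frames.
rng = np.random.default_rng(0)
data_by_class = {
    "a": rng.normal(0.0, 1.0, size=(60, 8)),
    "b": rng.normal(3.0, 1.0, size=(60, 8)),
}
dictionaries = learn_dictionaries(data_by_class, n_clusters=2, n_atoms=4)
features = extract_features(data_by_class["a"][0], dictionaries, sparsity=2)
```

In the paper's setting the rows of each class matrix would be raw speech frames or MFCC vectors, and the resulting feature vectors would be passed to the CDHMM classifiers mentioned in the abstract.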