Speech intelligibility assessment of dysarthria using Fisher vector encoding. (January 2023)
- Record Type:
- Journal Article
- Title:
- Speech intelligibility assessment of dysarthria using Fisher vector encoding. (January 2023)
- Main Title:
- Speech intelligibility assessment of dysarthria using Fisher vector encoding
- Authors:
- H.M., Chandrashekar
Karjigi, Veena
Sreedevi, N.
- Abstract:
- Highlights: Intelligibility assessment of dysarthric speech using UA and TORGO databases. Different frame-level features are extracted from the temporal, spectral, and cepstral domains. Temporal encoding and Fisher vector encoding techniques are used to convert frame-level features to utterance-level features. Probabilistic linear discriminant analysis (PLDA) and artificial neural network (ANN) classifiers are used for the intelligibility assessment of dysarthric speech. Fisher vector encoding performance is superior to temporal encoding and ANN performance is superior to PLDA.
Abstract: Neuromuscular disorders can lead to dysarthria. Dysarthria is a speech disorder that mainly affects the human motor speech system. It often results in reduced speech intelligibility. Speech intelligibility is one of the parameters used to assess the severity of dysarthria. The intelligibility of speech decreases as the severity increases. Automatic speech intelligibility assessment could be reliable and cost-effective compared to the conventional methods, which need experienced speech pathologists. Automatic intelligibility assessment includes feature extraction and classification. Features extracted from spectral and cepstral domains are investigated for use in intelligibility assessment. Frame-level features are extracted from different signal representations in spectral and cepstral domains. Fisher vector encoding and temporal encoding that uses descriptive statistics are applied to convert frame-level features into utterance-level features. These features are fed to PLDA and ANN classifiers for intelligibility assessment. Overall, the performance of Fisher vector encoding is found to be superior to that of temporal encoding. For all features, ANN performs better than PLDA in assessing the intelligibility levels, as expected. The STFT and harmonics in the spectral domain, and CQCC in the cepstral domain, performed well for intelligibility assessment of unseen data from the TORGO database. …
- Is Part Of:
- Computer speech & language. Volume 77(2023)
- Journal:
- Computer speech & language
- Issue:
- Volume 77(2023)
- Issue Display:
- Volume 77, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 77
- Issue:
- 2023
- Issue Sort Value:
- 2023-0077-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- Dysarthria -- Intelligibility assessment -- Spectral domain representation -- Cepstral domain representation -- Fisher vector encoding -- Probabilistic linear discriminant analysis
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2022.101411
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 23321.xml
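
Illustrative note on the abstract: the step that converts frame-level features into a single utterance-level vector can be sketched as below. This is a minimal sketch and not the authors' implementation. It assumes that "descriptive statistics" means a per-dimension mean and standard deviation, and it uses the commonly cited Fisher vector formulation (gradients with respect to the means and standard deviations of a diagonal-covariance Gaussian mixture model), which this record does not specify. The function names, the toy frames, and the mixture parameters are all hypothetical; in practice the mixture would be fitted on training frames first.

```python
import math
from statistics import fmean, pstdev


def temporal_encoding(frames):
    """Utterance-level vector: per-dimension mean, then per-dimension std.

    Assumption: mean and population standard deviation stand in for the
    'descriptive statistics' mentioned in the abstract.
    """
    dims = list(zip(*frames))  # transpose: one tuple of values per dimension
    return [fmean(d) for d in dims] + [pstdev(d) for d in dims]


def fisher_vector(frames, weights, means, variances):
    """Fisher vector w.r.t. the means and std devs of a diagonal GMM.

    Assumption: the GMM parameters are given (hypothetical values here).
    No power or L2 normalisation is applied in this sketch.
    """
    T, K, D = len(frames), len(weights), len(frames[0])
    grad_mu = [[0.0] * D for _ in range(K)]
    grad_sigma = [[0.0] * D for _ in range(K)]
    for x in frames:
        # Log of (weight * Gaussian density) for each mixture component.
        log_p = []
        for k in range(K):
            lp = math.log(weights[k])
            for d in range(D):
                var = variances[k][d]
                lp -= 0.5 * (math.log(2 * math.pi * var)
                             + (x[d] - means[k][d]) ** 2 / var)
            log_p.append(lp)
        # Posterior of each component, computed stably via max-subtraction.
        m = max(log_p)
        post = [math.exp(lp - m) for lp in log_p]
        total = sum(post)
        post = [p / total for p in post]
        # Accumulate posterior-weighted first- and second-order deviations.
        for k in range(K):
            for d in range(D):
                z = (x[d] - means[k][d]) / math.sqrt(variances[k][d])
                grad_mu[k][d] += post[k] * z
                grad_sigma[k][d] += post[k] * (z * z - 1.0)
    fv = []
    for k in range(K):
        fv += [g / (T * math.sqrt(weights[k])) for g in grad_mu[k]]
    for k in range(K):
        fv += [g / (T * math.sqrt(2 * weights[k])) for g in grad_sigma[k]]
    return fv


# Toy utterance: 3 frames of 2-dimensional features (hypothetical values).
frames = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0]]
# Hypothetical 2-component diagonal GMM.
weights = [0.5, 0.5]
means = [[0.0, 1.0], [2.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

temporal = temporal_encoding(frames)                      # length 2 * D
fisher = fisher_vector(frames, weights, means, variances)  # length 2 * K * D
```

Either vector has a fixed length regardless of how many frames the utterance contains, which is what lets a classifier such as PLDA or an ANN consume utterances of different durations.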