Automatic intelligibility classification of sentence-level pathological speech. (January 2015)
- Record Type:
- Journal Article
- Title:
- Automatic intelligibility classification of sentence-level pathological speech. (January 2015)
- Main Title:
- Automatic intelligibility classification of sentence-level pathological speech
- Authors:
- Kim, Jangwon
Kumar, Naveen
Tsiartas, Andreas
Li, Ming
Narayanan, Shrikanth S.
- Abstract:
- Highlights: We propose novel sentence-level features to capture atypical variation. Our sentence-level features are effective for intelligibility classification. We propose a post-classification posterior smoothing scheme. Our smoothing scheme improves classification accuracy of our systems. We test feature-level and subsystem fusions for the final intelligibility decision.
Abstract: Pathological speech usually refers to the condition of speech distortion resulting from atypicalities in voice and/or in the articulatory mechanisms owing to disease, illness or other physical or biological insult to the production system. Although automatic evaluation of speech intelligibility and quality could come in handy in these scenarios to assist experts in diagnosis and treatment design, the many sources and types of variability often make it a very challenging computational processing problem. In this work we propose novel sentence-level features to capture abnormal variation in the prosodic, voice quality and pronunciation aspects in pathological speech. In addition, we propose a post-classification posterior smoothing scheme which refines the posterior of a test sample based on the posteriors of other test samples. Finally, we perform feature-level fusions and subsystem decision fusion for arriving at a final intelligibility decision. The performances are tested on two pathological speech datasets, the NKI CCRT Speech Corpus (advanced head and neck cancer) and the TORGO database (cerebral palsy or amyotrophic lateral sclerosis), by evaluating classification accuracy without overlapping subjects' data among training and test partitions. Results show that the feature sets of each of the voice quality subsystem, prosodic subsystem, and pronunciation subsystem, offer significant discriminating power for binary intelligibility classification. We observe that the proposed posterior smoothing in the acoustic space can further reduce classification errors. The smoothed posterior score fusion of subsystems shows the best classification performance (73.5% for unweighted, and 72.8% for weighted, average recalls of the binary classes).
- Is Part Of:
- Computer speech & language. Volume 29(2015)
- Journal:
- Computer speech & language
- Issue:
- Volume 29(2015)
- Issue Display:
- Volume 29, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 29
- Issue:
- 2015
- Issue Sort Value:
- 2015-0029-2015-0000
- Page Start:
- 132
- Page End:
- 144
- Publication Date:
- 2015-01
- Subjects:
- Pathological speech -- Automatic intelligibility assessment -- Dysarthric speech -- Head and neck cancer
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2014.02.001
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5426.xml