Sequence labeling to detect stuttering events in read speech. (July 2020)
- Record Type:
- Journal Article
- Title:
- Sequence labeling to detect stuttering events in read speech. (July 2020)
- Main Title:
- Sequence labeling to detect stuttering events in read speech
- Authors:
- Alharbi, Sadeen
Hasan, Madina
Simons, Anthony J H
Brumfitt, Shelagh
Green, Phil
- Abstract:
- Highlights: Data augmentation improved the performance of all applied classifiers. On human transcripts, without feature engineering, the BLSTM outperforms the CRF classifiers. Adding auxiliary features to support the CRFaux classifier yields performance improvements. On ASR transcripts, scored against human transcription, the results of the CRFngram, CRFaux and BLSTM classifiers all degrade.
Abstract: Stuttering is a speech disorder that, if treated during childhood, may be prevented from persisting into adolescence. A clinician must first determine the severity of stuttering, assessing a child during a conversational or reading task, recording each instance of disfluency, either in real time, or after transcribing the recorded session and analysing the transcript. The current study evaluates the ability of two machine learning approaches, namely conditional random fields (CRF) and bi-directional long-short-term memory (BLSTM), to detect stuttering events in transcriptions of stuttering speech. The two approaches are compared for their performance both on ideal hand-transcribed data and also on the output of automatic speech recognition (ASR). We also study the effect of data augmentation to improve performance. A corpus of 35 speakers' read speech (13K words) was supplemented with a corpus of 63 speakers' spontaneous speech (11K words) and an artificially-generated corpus (50K words). Experimental results show that, without feature engineering, BLSTM classifiers outperform CRF classifiers by 33.6%. However, adding features to support the CRF classifier yields performance improvements of 45% and 18% over the CRF baseline and BLSTM results, respectively. Moreover, adding more data to train the CRF and BLSTM classifiers consistently improves the results. …
- Is Part Of:
- Computer speech & language. Volume 62(2020)
- Journal:
- Computer speech & language
- Issue:
- Volume 62(2020)
- Issue Display:
- Volume 62, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 62
- Issue:
- 2020
- Issue Sort Value:
- 2020-0062-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-07
- Subjects:
- Stuttering event detection -- Speech disorder -- CRF -- BLSTM
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2019.101052
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 12937.xml