Dynamic time warping in phoneme modeling for fast pronunciation error detection. (1st February 2016)
- Record Type:
- Journal Article
- Title:
- Dynamic time warping in phoneme modeling for fast pronunciation error detection. (1st February 2016)
- Main Title:
- Dynamic time warping in phoneme modeling for fast pronunciation error detection
- Authors:
- Miodonska, Zuzanna
Bugdol, Marcin D.
Krecichwost, Michal
- Abstract:
- The presented paper describes a novel approach to the detection of pronunciation errors. It makes use of the modeling of well-pronounced and mispronounced phonemes by means of the Dynamic Time Warping (DTW) algorithm. Four approaches that make use of the DTW phoneme modeling were developed to detect pronunciation errors: Variations of the Word Structure (VoWS), Normalized Phoneme Distances Thresholding (NPDT), Furthest Segment Search (FSS) and Normalized Furthest Segment Search (NFSS). The performance evaluation of each module was carried out using a speech database of correctly and incorrectly pronounced words in the Polish language, with up to 10 patterns of every trained word from a set of 12 words having different phonetic structures. The performance of DTW modeling was compared to Hidden Markov Models (HMM) that were used for the same four approaches (VoWS, NPDT, FSS, NFSS). The average error rate (AER) was the lowest for DTW with NPDT (AER=0.287) and scored better than HMM with FSS (AER=0.473), which was the best result for HMM. The DTW modeling was faster than HMM for all four approaches. This technique can be used for computer-assisted pronunciation training systems that can work with a relatively small training speech corpus (less than 20 patterns per word) to support speech therapy at home.
Highlights: A low-complexity phoneme modeling method based on Dynamic Time Warping is proposed. Four variants of fast mispronunciation detection method based on DTW are presented. The proposed method works on a small speech corpus with no additional linguistic data. The method proves to be faster and more efficient than methods of higher complexity.
- Is Part Of:
- Computers in biology and medicine. Volume 69(2016)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 69(2016)
- Issue Display:
- Volume 69, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 69
- Issue:
- 2016
- Issue Sort Value:
- 2016-0069-2016-0000
- Page Start:
- 277
- Page End:
- 285
- Publication Date:
- 2016-02-01
- Subjects:
- Pronunciation error detection -- CAPT systems -- DTW algorithm -- Phoneme modeling -- Word structure analysis
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2015.12.004
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 68.xml