A hybrid refinement scheme for intra- and cross-corpora phonetic segmentation. (January 2015)
- Record Type:
- Journal Article
- Title:
- A hybrid refinement scheme for intra- and cross-corpora phonetic segmentation. (January 2015)
- Main Title:
- A hybrid refinement scheme for intra- and cross-corpora phonetic segmentation
- Authors:
- Zhao, Sixuan
Soon, Ing Yann
Koh, Soo Ngee
Luke, Kang Kwong
- Abstract:
- Highlights: A hybrid refinement scheme for phonetic segmentation is proposed. Statistical correction is applied to improve segmentation results. Multi-resolution fusion leads to higher segmentation accuracy. Refinements based on predictive models further reduce segmentation errors. Cross-corpora segmentation can be improved by the hybrid refinement scheme.
Abstract: This paper proposes a hybrid refinement scheme for more accurate localization of phonetic boundaries by combining three different post-processing techniques, including statistical correction, fusion, and predictive models. A statistical method based on state-level correction is proposed to improve the segmentation results. Effects of search ranges on the statistical correction process are studied and a state selection scheme is used to enhance the correction results. This paper also examines the effects of time resolution, i.e., stepsize, of acoustic models on the accuracy of segmentation. A multi-resolution fusion process is proposed to further refine the statistically corrected results. Finally, predictive models are designed to improve the segmentation accuracy by incorporating various acoustic features and searching around the preliminary boundary with a smaller stepsize. By applying the hybrid refinement scheme on a well-known corpus, significant improvements of segmentation results in terms of segmentation accuracy with different tolerances, mean absolute error (MAE), and root-mean-square error (RMSE) can be observed. Furthermore, a scenario of cross-corpora segmentation is examined in generating the segmentation results for a new corpus with a small set of labeled data. Experimental results show that the proposed refinement procedure can generate segmentation results comparable to those given by well-trained acoustic models obtained from the new corpus.
- Is Part Of:
- Computer speech & language. Volume 29(2015)
- Journal:
- Computer speech & language
- Issue:
- Volume 29(2015)
- Issue Display:
- Volume 29, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 29
- Issue:
- 2015
- Issue Sort Value:
- 2015-0029-2015-0000
- Page Start:
- 81
- Page End:
- 97
- Publication Date:
- 2015-01
- Subjects:
- Phonetic segmentation -- Statistical correction -- Multi-resolution -- Predictive model -- Cross-corpora segmentation
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2014.08.001
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5426.xml