Native and non-native class discrimination using speech rhythm- and auditory-based cues. (May 2015)
- Record Type:
- Journal Article
- Title:
- Native and non-native class discrimination using speech rhythm- and auditory-based cues. (May 2015)
- Main Title:
- Native and non-native class discrimination using speech rhythm- and auditory-based cues
- Authors:
- Selouani, S.-A.
Alotaibi, Y.
Cichocki, W.
Gharsellaoui, S.
Kadi, K.
- Abstract:
- Highlights: New models of speech rhythm and auditory knowledge are proposed. Use both duration and intensity metrics to classify native and non-native accents. Perform accent classification by logistic regression (LR) and with baseline systems. LR-based approach provides the best classification of native/non-native Arabic speech. Combination of auditory indicative features and rhythm metrics provides the best classification.
Abstract: In recent years, the use of rhythm-based features in speech processing systems has received growing interest. This approach uses a wide array of rhythm metrics that have been developed to capture speech timing differences between and within languages. However, the reliability of rhythm metrics is being increasingly called into question. In this paper, we propose two modifications to this approach. First, we describe a model that is based on auditory cues that simulate the external, middle and inner parts of the ear. We evaluate this model by performing experiments to discriminate between native and non-native Arabic speech. Data are from the West Point Arabic Speech Corpus; testing is done on standard classifiers based on Gaussian Mixture Models (GMMs), Support Vector Machines (SVMs) and a hybrid GMM/SVM. Results show that the auditory-based model consistently outperforms a traditional rhythm-metric approach that includes both duration- and intensity-based metrics. Second, we propose a framework that combines the rhythm metrics and the auditory-based cues in the context of a Logistic Regression (LR) method that can optimize feature combination. Further results show that the proposed LR-based method improves performance over the standard classifiers in the discrimination between the native and non-native Arabic speech.
- Is Part Of:
- Computer speech & language. Volume 31(2015)
- Journal:
- Computer speech & language
- Issue:
- Volume 31(2015)
- Issue Display:
- Volume 31, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 31
- Issue:
- 2015
- Issue Sort Value:
- 2015-0031-2015-0000
- Page Start:
- 28
- Page End:
- 48
- Publication Date:
- 2015-05
- Subjects:
- Speech rhythm -- Auditory cues -- Duration -- Intensity -- SVMs -- GMMs -- Hybrid GMM/SVM -- Logistic regression
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2014.11.003
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5432.xml