Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits. (January 2015)
- Record Type:
- Journal Article
- Title:
- Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits. (January 2015)
- Main Title:
- Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits
- Authors:
- Pohjalainen, Jouni
Räsänen, Okko
Kadioglu, Serdar
- Abstract:
- Highlights: We study high-dimensional feature selection for paralinguistic classification. New supervised and unsupervised methods overfit less than forward selection. Combined selection methods and kNN challenge popular high-dimensional classifiers. In each task, the best methods show improved performance using much fewer features. We report features for speaker traits that we found using the selection methods.
Abstract: This study focuses on feature selection in paralinguistic analysis and presents recently developed supervised and unsupervised methods for feature subset selection and feature ranking. Using the standard k-nearest-neighbors (kNN) rule as the classification algorithm, the feature selection methods are evaluated individually and in different combinations in seven paralinguistic speaker trait classification tasks. In each analyzed data set, the overall number of features highly exceeds the number of data points available for training and evaluation, making a well-generalizing feature selection process extremely difficult. The performance of feature sets on the feature selection data is observed to be a poor indicator of their performance on unseen data. The studied feature selection methods clearly outperform a standard greedy hill-climbing selection algorithm by being more robust against overfitting. When the selection methods are suitably combined with each other, the performance in the classification task can be further improved. In general, it is shown that the use of automatic feature selection in paralinguistic analysis can be used to reduce the overall number of features to a fraction of the original feature set size while still achieving a comparable or even better performance than baseline support vector machine or random forest classifiers using the full feature set. The most typically selected features for recognition of speaker likability, intelligibility and five personality traits are also reported.
- Is Part Of:
- Computer speech & language. Volume 29(2015)
- Journal:
- Computer speech & language
- Issue:
- Volume 29(2015)
- Issue Display:
- Volume 29, Issue 2015 (2015)
- Year:
- 2015
- Volume:
- 29
- Issue:
- 2015
- Issue Sort Value:
- 2015-0029-2015-0000
- Page Start:
- 145
- Page End:
- 171
- Publication Date:
- 2015-01
- Subjects:
- Feature selection -- Pattern recognition -- Machine learning -- Computational paralinguistics
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2013.11.004
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 5426.xml