Regularization of neural network model with distance metric learning for i-vector based spoken language identification. (July 2017)
- Record Type:
- Journal Article
- Title:
- Regularization of neural network model with distance metric learning for i-vector based spoken language identification. (July 2017)
- Main Title:
- Regularization of neural network model with distance metric learning for i-vector based spoken language identification
- Authors:
- Lu, Xugang
Shen, Peng
Tsao, Yu
Kawai, Hisashi
- Abstract:
- Highlights: Pair-wise distance metric learning is designed on the feature transform layers. A soft-max layer is stacked on the feature transform layers as a classifier layer. The coupled feature extraction and classification functions are learned with a regularized objective function. Experiments showed significant improvements compared to conventional regularization algorithms.
Abstract: The i-vector representation and modeling technique has been successfully applied in spoken language identification (SLI). The advantage of the i-vector representation is that any speech utterance of variable duration can be represented as a fixed-length vector. In modeling, a discriminative transform or classifier must be applied to emphasize the variations correlated to language identity, since the i-vector representation encodes several types of acoustic variation (e.g., speaker variation, transmission channel variation, etc.). Owing to its strong nonlinear discriminative power, the neural network model has been used directly to learn the mapping function between the i-vector representation and the language identity labels. In most studies, only the point-wise feature-label information is fed to the model for parameter learning, which may result in model overfitting, particularly with limited training data. In this study, we propose to integrate pair-wise distance metric learning as a regularizer of model parameter optimization. In the representation space of the nonlinear transforms in the hidden layers, a distance metric is explicitly learned to minimize the pair-wise intra-class variation and maximize the inter-class variation. Using the pair-wise distance metric learning, the i-vectors are transformed to a new feature space, wherein they are much more discriminative for samples belonging to different languages while being much more similar for samples belonging to the same language. We tested the algorithm on an SLI task and obtained promising results, which outperformed conventional regularization methods.
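The objective described in the abstract (point-wise cross entropy plus a pair-wise distance metric term over hidden-layer representations) can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function names, the contrastive-style hinge on inter-class pairs, the `margin`, and the weight `lam` are all assumptions introduced here for clarity.

```python
import numpy as np

def pairwise_metric_regularizer(h, labels, margin=1.0):
    """Pair-wise distance metric term over hidden representations h.

    Penalizes large distances between same-class pairs and inter-class
    distances smaller than a margin (a common contrastive-style choice;
    the paper's exact objective may differ).
    """
    n, loss, pairs = len(h), 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            d2 = float(np.sum((h[i] - h[j]) ** 2))
            if labels[i] == labels[j]:
                loss += d2  # pull same-language pairs together
            else:
                # push different-language pairs at least `margin` apart
                loss += max(0.0, margin - np.sqrt(d2)) ** 2
            pairs += 1
    return loss / pairs

def regularized_objective(ce_loss, h, labels, lam=0.1):
    # total objective = point-wise cross entropy
    #                 + lam * pair-wise distance metric term
    return ce_loss + lam * pairwise_metric_regularizer(h, labels)
```

In practice this term would be computed on mini-batch activations of the feature transform layers and back-propagated jointly with the soft-max cross entropy, so the same hidden layers serve both the classification and the metric-learning criteria.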
- Is Part Of:
- Computer speech & language. Volume 44(2017)
- Journal:
- Computer speech & language
- Issue:
- Volume 44(2017)
- Issue Display:
- Volume 44, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 44
- Issue:
- 2017
- Issue Sort Value:
- 2017-0044-2017-0000
- Page Start:
- 48
- Page End:
- 60
- Publication Date:
- 2017-07
- Subjects:
- Neural network model -- Cross entropy -- Pair-wise distance metric learning -- Spoken language identification,
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2017.01.006
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 371.xml