Restricted Boltzmann machines for vector representation of speech in speaker recognition. (January 2018)

Record Type:: Journal Article
Title:: Restricted Boltzmann machines for vector representation of speech in speaker recognition. (January 2018)
Main Title:: Restricted Boltzmann machines for vector representation of speech in speaker recognition
Authors:: Ghahabi, Omid
Hernando, Javier
Abstract:: Highlights: An efficient low dimensional vector representation of speech based on GMM and RBM, referred to as GMM–RBM vectors, is proposed. A Universal RBM (URBM) is trained to learn the total speaker and session variability among background GMM supervectors. A variant of Rectified Linear Units (ReLU), referred to as variable ReLU (VReLU), is proposed to train the URBM efficiently. URBM is then used to transform unseen supervectors to the proposed GMM–RBM vectors.
Abstract: Over the last few years, i-vectors have been the state-of-the-art technique in speaker recognition. Recent advances in Deep Learning (DL) technology have improved the quality of i-vectors but the DL techniques in use are computationally expensive and need phonetically labeled background data. The aim of this work is to develop an efficient alternative vector representation of speech by keeping the computational cost as low as possible and avoiding phonetic labels, which are not always accessible. The proposed vectors will be based on both Gaussian Mixture Models (GMM) and Restricted Boltzmann Machines (RBM) and will be referred to as GMM–RBM vectors. The role of RBM is to learn the total speaker and session variability among background GMM supervectors. This RBM, which will be referred to as Universal RBM (URBM), will then be used to transform unseen supervectors to the proposed low dimensional vectors. The use of different activation functions for training the URBM and different transformation functions …
Is Part Of:: Computer speech & language. Volume 47(2018)
Journal:: Computer speech & language
Issue:: Volume 47(2018)
Issue Display:: Volume 47, Issue 2018 (2018)
Year:: 2018
Volume:: 47
Issue:: 2018
Issue Sort Value:: 2018-0047-2018-0000
Page Start:: 16
Page End:: 29
Publication Date:: 2018-01
Subjects:: Restricted Boltzmann machine -- Deep learning -- Variable rectified linear unit -- Speaker recognition -- GMM–RBM vector -- i-vector
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2017.06.007
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 20786.xml