Ensemble of deep neural networks using acoustic environment classification for statistical model-based voice activity detection. (July 2016)

Record Type:: Journal Article
Title:: Ensemble of deep neural networks using acoustic environment classification for statistical model-based voice activity detection. (July 2016)
Main Title:: Ensemble of deep neural networks using acoustic environment classification for statistical model-based voice activity detection
Authors:: Hwang, Inyoung
Park, Hyung-Min
Chang, Joon-Hyuk
Abstract:: Highlights: We develop voice activity detection based on a statistical model. DNNs are used for the voice activity detection. An ensemble of DNNs is devised for different noise environments. A separate DNN is built to detect the current environment. Abstract: In this paper, we investigate an ensemble of deep neural networks (DNNs) using an acoustic environment classification (AEC) technique for statistical model-based voice activity detection (VAD). From an investigation of statistical model-based VAD, it is known that the traditional decision rule is based on the geometric mean of the likelihood ratio or the support vector machine (SVM), which is a shallow model with zero or one hidden layer. Since shallow models cannot take advantage of the diversity of the space distribution of features, in the training step we build multiple DNNs according to the different noise types by employing the parameters of the statistical model-based VAD algorithm. In addition, a separate DNN is designed for the AEC algorithm in order to choose the best DNN for each noise. In the on-line noise-aware VAD step, the AEC is first performed on a frame-by-frame basis using the separate DNN, so that the a posteriori probabilities identifying the noise are obtained. Once the probabilities are obtained for each noise, the environmental knowledge allows us to combine the speech presence probabilities derived from the ensemble of the DNNs …
Is Part Of:: Computer speech & language. Volume 38(2016)
Journal:: Computer speech & language
Issue:: Volume 38(2016)
Issue Display:: Volume 38, Issue 2016 (2016)
Year:: 2016
Volume:: 38
Issue:: 2016
Issue Sort Value:: 2016-0038-2016-0000
Page Start:: 1
Page End:: 12
Publication Date:: 2016-07
Subjects:: Voice activity detection -- Statistical model -- Acoustic environment classification -- Deep neural network -- Ensemble
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2015.11.003
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 2379.xml