Design of mixture of GMMs for Query-by-Example Spoken Term Detection. (November 2018)

Record Type:: Journal Article
Title:: Design of mixture of GMMs for Query-by-Example Spoken Term Detection. (November 2018)
Main Title:: Design of mixture of GMMs for Query-by-Example Spoken Term Detection
Authors:: Madhavi, Maulik C.
Patil, Hemant A.
Abstract:: Highlights: A mixture of GMMs is presented, where the prior probabilities of the mixtures are set by broad phoneme posterior probabilities. The posteriorgram of a mixture of GMMs is computed with different cepstral representations. The posteriorgram of a mixture of GMMs is effective under different evaluation factors for the TIMIT and SWS 2013 corpora. Broad phoneme posterior probabilities obtained with limited labeled data are also exploited for training. Abstract: This paper presents the design of a mixture of Gaussian Mixture Models (GMMs) for Query-by-Example Spoken Term Detection (QbE-STD). The speech data governs acoustically similar broad phonetic structures. To capture broad phonetic structure, we exploit additional information of broad phoneme classes (such as vowels, semi-vowels, nasals, fricatives, and plosives) for the training of the GMM. The mixture of GMMs is tied with GMMs of these broad phoneme classes, i.e., each GMM expresses the probability density function (pdf) of a broad phoneme category. The Expectation Maximization (EM) algorithm is used to obtain the GMM for each broad phoneme class. Thus, a mixture of GMMs represents the spoken query with the broad phonetic constraints. These constraints restrict the posterior probability within the broad class, which results in a better posteriorgram design. The novelty of our work lies in prior probability assignments (as weights of the mixture of GMMs) for better Gaussian posteriorgram design. The proposed simple yet effective …
Is Part Of:: Computer speech & language. Volume 52(2018)
Journal:: Computer speech & language
Issue:: Volume 52(2018)
Issue Display:: Volume 52, Issue 2018 (2018)
Year:: 2018
Volume:: 52
Issue:: 2018
Issue Sort Value:: 2018-0052-2018-0000
Page Start:: 41
Page End:: 55
Publication Date:: 2018-11
Subjects:: Query-by-Example Spoken Term Detection -- Phone posteriorgram -- Mixture of GMMs -- Gaussian posteriorgram
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2018.04.006
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 17055.xml
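
Illustrative note on the abstract: the sketch below shows one plausible reading of how a per-frame posteriorgram could be formed from a mixture of GMMs whose mixture weights are broad phoneme posterior probabilities. It is not taken from the paper. The function names, the diagonal-covariance assumption, the toy parameters, and the hard-coded class posteriors are all hypothetical; in practice the class GMMs would be trained with EM and the class posteriors would come from a separate broad phoneme classifier.

```python
import math

def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def within_class_posteriors(x, gmm):
    """Posterior of each component of one class GMM, given frame x.

    gmm is a list of (weight, mean, variance) tuples (assumed layout).
    """
    logs = [math.log(w) + diag_gauss_logpdf(x, m, v) for w, m, v in gmm]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_posteriorgram_frame(x, class_gmms, class_posteriors):
    """Posterior vector over all components of all class GMMs for one frame.

    Each class GMM's component posteriors are scaled by that class's
    broad phoneme posterior, so the mass assigned to a class's
    components is restricted to the class posterior.
    """
    frame = []
    for cls, gmm in class_gmms.items():
        prior = class_posteriors[cls]
        frame.extend(prior * p for p in within_class_posteriors(x, gmm))
    return frame

# Toy, hand-picked parameters (assumptions, not trained models).
class_gmms = {
    "vowel": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "nasal": [(1.0, [3.0, 3.0], [1.0, 1.0])],
}
# Stand-in for the output of a broad phoneme classifier on this frame.
class_posteriors = {"vowel": 0.8, "nasal": 0.2}

frame = mixture_posteriorgram_frame([0.2, 0.1], class_gmms, class_posteriors)
print(frame)
```

Stacking such vectors over all frames of an utterance would give a posteriorgram; the entries for each class's components sum to that class's posterior, and the whole vector sums to one when the class posteriors do.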