A novel approach to unsupervised pattern discovery in speech using Convolutional Neural Network. (January 2022)
- Record Type:
- Journal Article
- Title:
- A novel approach to unsupervised pattern discovery in speech using Convolutional Neural Network. (January 2022)
- Main Title:
- A novel approach to unsupervised pattern discovery in speech using Convolutional Neural Network
- Authors:
- R., Kishore Kumar
Sreenivasa Rao, K.
- Abstract:
- Abstract: In this paper, a novel approach to unsupervised pattern discovery for speech signals is proposed. Recently, we introduced an image processing method (IPM) that extracts the desired keywords present in a pair of speech utterances. This method performs well in detecting true positives, but it also detects a higher number of false positives. Therefore, this paper aims to reduce the detection of false positives and improve the accuracy of the pattern discovery task. In the proposed work, we use a Convolutional Neural Network (CNN) as a binary classifier to separate true keyword match candidates from false ones. A new frame histogram technique is introduced to generate sufficient training samples from the IPM to train the CNN. The trained CNN model classifies the matched patterns into true and false classes and identifies the pairs of speech documents that contain the same keyword. The proposed method is evaluated on Hindi and Bengali speech databases. The results are compared with state-of-the-art methods. The detected matched pairs of speech utterances are grouped into broader domain clusters using Newman's clustering algorithm. These clusters are useful for speech retrieval tasks.
Highlights: A novel approach is proposed to detect the keyword patterns using a convolutional neural network on the image processing method. The frame histogram technique, along with statistical logic, is adopted to extract the training data for building the CNN. The trained CNN model is tested with the outcome of the image processing method to improve accuracy. The derived keyword match candidates are used to cluster the speech utterances to perform the speech retrieval task efficiently.
- Is Part Of:
- Computer speech & language. Volume 71(2022)
- Journal:
- Computer speech & language
- Issue:
- Volume 71(2022)
- Issue Display:
- Volume 71, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 71
- Issue:
- 2022
- Issue Sort Value:
- 2022-0071-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-01
- Subjects:
- 00-01 -- 99-00
Speech processing -- Gaussian posterior features -- Frame histogram -- Unsupervised pattern discovery -- Convolutional neural network -- Clustering of speech utterances
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2021.101259
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19108.xml