Speech enhancement using sparse dictionary learning in wavelet packet transform domain. (July 2017)
- Record Type:
- Journal Article
- Title:
- Speech enhancement using sparse dictionary learning in wavelet packet transform domain. (July 2017)
- Main Title:
- Speech enhancement using sparse dictionary learning in wavelet packet transform domain
- Authors:
- Mavaddaty, Samira
Ahadi, Seyed Mohammad
Seyedin, Sanaz
- Abstract:
- Highlights: A new learning-based speech enhancement method via sparse representation in the wavelet packet transform domain is presented. Dictionary learning procedures for speech and noise training data, based on a coherence criterion, are proposed. Supervised and semi-supervised enhancement algorithms are introduced using the presented voice activity detector (VAD). The proposed VAD is based on the energy of the sparse coefficient matrices in each speech enhancement scenario. A domain adaptation method that alleviates the mismatch between the training and test steps is employed in the enhancement procedure.
Abstract: Sparse coding, as a successful representation method for many signals, has recently been employed in speech enhancement. This paper presents a new learning-based speech enhancement algorithm via sparse representation in the wavelet packet transform domain. We propose sparse dictionary learning procedures for training data of speech and noise signals based on a coherence criterion, for each subband of the decomposition level. Using these learning algorithms, self-coherence between atoms of each dictionary and mutual coherence between speech and noise dictionary atoms are minimized along with the approximation error. The speech enhancement algorithm is introduced in two scenarios, supervised and semi-supervised. In each scenario, a voice activity detector scheme is employed based on the energy of sparse coefficient matrices when the observation data is coded over the corresponding dictionaries. In the proposed supervised scenario, we take advantage of domain adaptation techniques to transform a learned noise dictionary into a dictionary adapted to the noise conditions captured from the test environment. Using this step, observation data is sparsely coded, based on the current situation of the noisy space, with low sparse approximation error. This technique has a prominent role in obtaining better enhancement results, particularly when the noise is non-stationary. In the proposed semi-supervised scenario, adaptive thresholding of wavelet coefficients is carried out based on the variance of the estimated noise in each frame of different subbands. The proposed approaches lead to significantly better speech enhancement results in comparison with the earlier methods in this context and the traditional procedures, based on different objective and subjective measures as well as a statistical test. …
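Illustrative note: the abstract names two quantities without defining them here, the coherence between dictionary atoms and the energy of the sparse coefficient matrices used by the VAD. The following is a minimal sketch of how such quantities are commonly computed, not the authors' implementation. The function names, the use of the maximum absolute inner product as the coherence measure, and the simple energy comparison rule are all assumptions made for illustration.

```python
import numpy as np

def normalize_atoms(D):
    """Scale each column (atom) of D to unit L2 norm."""
    norms = np.linalg.norm(D, axis=0, keepdims=True)
    return D / np.maximum(norms, 1e-12)

def self_coherence(D):
    """Largest absolute inner product between two distinct atoms of D."""
    Dn = normalize_atoms(D)
    gram = np.abs(Dn.T @ Dn)
    np.fill_diagonal(gram, 0.0)  # ignore each atom's product with itself
    return float(gram.max())

def mutual_coherence(D_speech, D_noise):
    """Largest absolute inner product between a speech atom and a noise atom."""
    cross = normalize_atoms(D_speech).T @ normalize_atoms(D_noise)
    return float(np.abs(cross).max())

def energy_vad(C_speech, C_noise):
    """Hypothetical decision rule (an assumption, not taken from the paper):
    mark a frame as speech when its coefficients over the speech dictionary
    carry more energy than its coefficients over the noise dictionary.
    Columns are frames; rows are atoms."""
    e_speech = np.sum(C_speech ** 2, axis=0)
    e_noise = np.sum(C_noise ** 2, axis=0)
    return e_speech > e_noise
```

A learning procedure of the kind the abstract describes would aim to keep both coherence values small while also keeping the approximation error low; the exact objective and the actual VAD rule are given in the article itself.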
- Is Part Of:
- Computer speech & language. Volume 44(2017)
- Journal:
- Computer speech & language
- Issue:
- Volume 44(2017)
- Issue Display:
- Volume 44, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 44
- Issue:
- 2017
- Issue Sort Value:
- 2017-0044-2017-0000
- Page Start:
- 22
- Page End:
- 47
- Publication Date:
- 2017-07
- Subjects:
- Speech enhancement -- Dictionary learning -- Sparse representation -- Domain adaptation -- Voice activity detector -- Wavelet packet transform
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2017.01.009
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 371.xml