Phase sensitive masking-based single channel speech enhancement using conditional generative adversarial network. (January 2022)

Record Type:: Journal Article
Title:: Phase sensitive masking-based single channel speech enhancement using conditional generative adversarial network. (January 2022)
Main Title:: Phase sensitive masking-based single channel speech enhancement using conditional generative adversarial network
Authors:: Routray, Sidheswar
Mao, Qirong
Abstract:: We propose PSMGAN, an efficient phase sensitive masking-based single-channel speech enhancement technique using a conditional generative adversarial network (cGAN). Time–frequency (T-F) masking-based speech enhancement approaches using deep neural networks (DNNs) have shown large improvements in speech intelligibility. However, these approaches fail to achieve good enhancement results at low signal-to-noise ratio (SNR) conditions because they ignore phase information during reconstruction. Generative adversarial networks (GANs), by contrast, have been applied effectively to speech enhancement and achieve improved performance owing to adversarial training. Motivated by the recent success of GANs, we introduce phase sensitive masking (PSM) in a cGAN framework for the speech enhancement task. We choose a conditional generative model because the data generation process can be controlled using additional temporal context information. In addition, we use gradient penalty regularization in the discriminator of the cGAN to avoid the vanishing gradient problem, which in turn stabilizes training and increases the quality of the generated samples. We use the PSM because it involves both amplitude and phase information and produces an improved estimate of the clean speech signal, with higher SNR than other T-F masks. Experimental results show that the proposed PSM-based cGAN architecture yields significant improvements in …
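Illustrative note: the abstract does not give the mask formula, so the following is only a minimal sketch of the commonly used phase sensitive mask definition, (|S| / |Y|) · cos(θ_S − θ_Y), for a single T-F bin. The function names, the `eps` guard, and the example values are assumptions made for illustration and are not taken from the article.

```python
import cmath
import math


def phase_sensitive_mask(clean_bin: complex, noisy_bin: complex, eps: float = 1e-8) -> float:
    """Commonly used PSM for one T-F bin: (|S| / |Y|) * cos(theta_S - theta_Y).

    clean_bin and noisy_bin are complex STFT coefficients (hypothetical inputs).
    eps is an assumed guard against division by zero.
    """
    magnitude_ratio = abs(clean_bin) / (abs(noisy_bin) + eps)
    phase_difference = cmath.phase(clean_bin) - cmath.phase(noisy_bin)
    return magnitude_ratio * math.cos(phase_difference)


def apply_mask(mask: float, noisy_bin: complex) -> complex:
    """Scale the noisy magnitude by the mask while keeping the noisy phase."""
    return mask * noisy_bin


# Example bin: the clean and noisy coefficients differ in magnitude and phase.
clean = complex(1.0, 0.0)
noisy = complex(1.0, 1.0)
mask = phase_sensitive_mask(clean, noisy)
enhanced = apply_mask(mask, noisy)
```

Unlike a magnitude-only ratio mask, this value shrinks as the phase difference between clean and noisy speech grows. In practice the mask is often truncated to a fixed range before training; that step is omitted here.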
Is Part Of:: Computer speech & language. Volume 71(2022)
Journal:: Computer speech & language
Issue:: Volume 71(2022)
Issue Display:: Volume 71, Issue 2022 (2022)
Year:: 2022
Volume:: 71
Issue:: 2022
Issue Sort Value:: 2022-0071-2022-0000
Publication Date:: 2022-01
Subjects:: Single channel speech enhancement -- Phase sensitive mask -- Deep learning -- Conditional generative adversarial network (cGAN) -- Adversarial training
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2021.101270
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 19058.xml