μ-law SGAN for generating spectra with more details in speech enhancement. (April 2021)
- Record Type:
- Journal Article
- Title:
- μ-law SGAN for generating spectra with more details in speech enhancement. (April 2021)
- Main Title:
- μ-law SGAN for generating spectra with more details in speech enhancement
- Authors:
- Li, Hongfeng
Xu, Yanyan
Ke, Dengfeng
Su, Kaile
- Abstract:
- Abstract: The goal of monaural speech enhancement is to separate clean speech from noisy speech. Recently, many studies have employed generative adversarial networks (GAN) to deal with monaural speech enhancement tasks. When using generative adversarial networks for this task, the output of the generator is a speech waveform or a spectrum, such as a magnitude spectrum, a mel-spectrum or a complex-valued spectrum. The spectra generated by current speech enhancement methods in the time–frequency domain usually lack details, such as consonants and harmonics with low energy. In this paper, we propose a new type of adversarial training framework for spectrum generation, named μ-law spectrum generative adversarial networks (μ-law SGAN). We introduce a trainable μ-law spectrum compression layer (USCL) into the proposed discriminator to compress the dynamic range of the spectrum. As a result, the compressed spectrum can display more detailed information. In addition, we use the spectrum transformed by USCL to regularize the generator's training, so that the generator can pay more attention to the details of the spectrum. Experimental results on the open dataset Voice Bank + DEMAND show that μ-law SGAN is an effective generative adversarial architecture for speech enhancement. Moreover, visual spectrogram analysis suggests that μ-law SGAN pays more attention to the enhancement of low energy harmonics and consonants.
Highlights:
Introduce the trainable USCL to D (the discriminator in GAN) to compress the dynamic range of inputs and make D focus on the details of input spectra.
Utilize the trainable USCL regularization to make G (the generator in GAN) pay more attention to the spectrum details.
Utilize SSNR regularization as the time-domain objective function, which is complementary to the objective function in the T–F domain. …
- Is Part Of:
- Neural networks. Volume 136(2021)
- Journal:
- Neural networks
- Issue:
- Volume 136(2021)
- Issue Display:
- Volume 136, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 136
- Issue:
- 2021
- Issue Sort Value:
- 2021-0136-2021-0000
- Page Start:
- 17
- Page End:
- 27
- Publication Date:
- 2021-04
- Subjects:
- Signal processing -- Speech enhancement -- Deep neural networks -- Generative adversarial networks -- μ-law SGAN
Neural computers -- Periodicals
Neural networks (Computer science) -- Periodicals
Neural networks (Neurobiology) -- Periodicals
Nervous System -- Periodicals
Ordinateurs neuronaux -- Périodiques
Réseaux neuronaux (Informatique) -- Périodiques
Réseaux neuronaux (Neurobiologie) -- Périodiques
Neural computers
Neural networks (Computer science)
Neural networks (Neurobiology)
Periodicals
006.32
- Journal URLs:
- http://www.sciencedirect.com/science/journal/08936080
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.neunet.2020.12.017
- Languages:
- English
- ISSNs:
- 0893-6080
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 6081.280800
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15792.xml
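
Illustrative note on the abstract: the record describes a trainable μ-law spectrum compression layer (USCL) that compresses the dynamic range of a spectrum so that low-energy details become more visible. The exact form of USCL is not given in this record. As a minimal sketch of the general idea only, the following uses the classic fixed μ-law companding formula, y = ln(1 + μx) / ln(1 + μ), and assumes magnitudes already normalized to [0, 1]. The function names, the value μ = 255, and the sample magnitudes are all assumptions chosen for illustration, not the authors' implementation.

```python
import math


def mu_law_compress(x, mu=255.0):
    """Classic mu-law compression of a normalized magnitude x in [0, 1].

    Assumption: this is the standard fixed companding curve, not the
    trainable USCL from the paper.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be normalized to [0, 1]")
    if mu <= 0.0:
        raise ValueError("mu must be positive")
    return math.log1p(mu * x) / math.log1p(mu)


def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress for y in [0, 1]."""
    if mu <= 0.0:
        raise ValueError("mu must be positive")
    return math.expm1(y * math.log1p(mu)) / mu


# Hypothetical normalized magnitudes: one strong component and
# progressively weaker ones standing in for low-energy details.
magnitudes = [1.0, 0.1, 0.01, 0.001]
compressed = [mu_law_compress(m) for m in magnitudes]

for m, c in zip(magnitudes, compressed):
    print(f"{m:8.3f} -> {c:.3f}")
```

With these assumed settings, a magnitude of 0.01 maps to roughly 0.23, so weak components occupy a much larger share of the output range than they do on a linear scale, while the endpoints 0 and 1 stay fixed.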