LSTM-convolutional-BLSTM encoder-decoder network for minimum mean-square error approach to speech enhancement. (15th January 2021)

Record Type:: Journal Article
Title:: LSTM-convolutional-BLSTM encoder-decoder network for minimum mean-square error approach to speech enhancement. (15th January 2021)
Main Title:: LSTM-convolutional-BLSTM encoder-decoder network for minimum mean-square error approach to speech enhancement
Authors:: Wang, Zeyu
Zhang, Tao
Shao, Yangyang
Ding, Biyun
Abstract:: In recent years, deep learning models have been employed for speech enhancement. Most existing deep learning methods use a fully Convolutional Neural Network (CNN) to capture time–frequency information from the input features. Compared with CNNs, a Long Short-Term Memory (LSTM) network is better suited to capturing contextual information along the time axis of the features. However, the computational load of a fully LSTM structure is heavy. To balance model complexity against the capability of capturing time–frequency features, we present an LSTM-Convolutional-BLSTM Encoder-Decoder (LCLED) network for speech enhancement. The LCLED additionally incorporates transpose convolution and skip connections. The key idea is to use two LSTM parts to model contextual information and convolutional layers to model frequency-dimension features. Furthermore, to achieve a higher quality of enhanced speech, the a priori Signal-to-Noise Ratio (SNR) is used as the learning target of the LCLED, and the Minimum Mean-Square Error (MMSE) approach is used for postprocessing. The results indicate that, compared with the fully LSTM structure, the proposed LCLED not only reduces model complexity and training time but also improves the quality and intelligibility of the enhanced speech.
Is Part Of:: Applied acoustics. Volume 172(2021)
Journal:: Applied acoustics
Issue:: Volume 172(2021)
Issue Display:: Volume 172, Issue 2021 (2021)
Year:: 2021
Volume:: 172
Issue:: 2021
Issue Sort Value:: 2021-0172-2021-0000
Page Start:
Page End:
Publication Date:: 2021-01-15
Subjects:: Speech enhancement -- LSTM-Convolutional-BLSTM Encoder-Decoder network -- Transpose convolution -- Minimum Mean-Square Error
Acoustical engineering -- Periodicals
Periodicals
620.2
Journal URLs:: http://www.sciencedirect.com/science/journal/0003682X
http://www.elsevier.com/journals
http://www.elsevier.com/homepage/elecserv.htt
DOI:: 10.1016/j.apacoust.2020.107647
Languages:: English
ISSNs:: 0003-682X
Deposit Type:: Legaldeposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 1571.400000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 20403.xml