Attention gated tensor neural network architectures for speech emotion recognition. (January 2022)
- Record Type:
- Journal Article
- Title:
- Attention gated tensor neural network architectures for speech emotion recognition. (January 2022)
- Main Title:
- Attention gated tensor neural network architectures for speech emotion recognition
- Authors:
- Pandey, Sandeep Kumar
Shekhawat, Hanumant Singh
Prasanna, S.R.M
- Abstract:
- In an attempt to make human-computer interaction more natural, we propose the use of Tensor Factorized Neural Networks (TFNN) and Attention Gated Tensor Factorized Neural Networks (AG-TFNN) for the Speech Emotion Recognition (SER) task. Standard speech representations such as 2D and 3D Mel-Spectrograms and the Temporal Modulation Spectrogram are explored to investigate how effectively the Tensor Factorization based architectures capture emotion-salient information. The hidden layers are explained as Deep Tensor Factorization based on the Tucker Decomposition, but with a unified discriminative objective function so that the factor matrices are learned in a discriminative sense. The core tensor produced in each hidden layer is the feature associated with that factorisation layer. Mel-Spectrograms are naturally in 2D tensor form, and thus TFNN and AG-TFNN become an appropriate choice over baselines such as Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) by providing fewer parameters to be learned and a simple architecture. Experiments conducted on the standard emotional speech datasets Emo-DB and IEMOCAP show that TFNN and AG-TFNN surpass the state-of-the-art given by the CNN+LSTM combination with fewer parameters.
- Is Part Of:
- Biomedical signal processing and control. Volume 71 (2022), Part A
- Journal:
- Biomedical signal processing and control
- Issue:
- Volume 71 (2022), Part A
- Issue Display:
- Volume 71, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 71
- Issue:
- 2022
- Issue Sort Value:
- 2022-0071-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-01
- Subjects:
- Tensor Factorized Neural Network -- Attention -- CNN -- LSTM -- Speech emotion -- Emo-DB -- IEMOCAP
Signal processing -- Periodicals
Biomedical engineering -- Periodicals
Signal Processing, Computer-Assisted -- Periodicals
Image Processing, Computer-Assisted -- Periodicals
Biomedical Engineering -- Periodicals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/17468094
http://www.elsevier.com/journals
http://www.sciencedirect.com/science?_ob=PublicationURL&_tockey=%23TOC%2329675%232006%23999989998%23626449%23FLA%23&_cdi=29675&_pubType=J&_auth=y&_acct=C000045259&_version=1&_urlVersion=0&_userid=836873&md5=664b5cf9a57fc91971a17faf20c32ec1
- DOI:
- 10.1016/j.bspc.2021.103173
- Languages:
- English
- ISSNs:
- 1746-8094
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2087.880400
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19704.xml
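
Illustrative note on the abstract: the abstract describes hidden layers built on a Tucker-style factorization whose core tensor serves as the layer's feature, and it claims fewer parameters than conventional baselines. The sketch below shows one way such a layer could look for a 2D input such as a Mel-Spectrogram. The layer form sigmoid(U1^T X U2 + B), the function names, and the example shapes are assumptions made for illustration; the record does not give the authors' exact formulation, and the attention gate of AG-TFNN is not modelled here.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def tf_layer(x, u1, u2, bias):
    """Hypothetical 2D factorized layer: sigmoid(U1^T X U2 + B).

    x is I1 x I2, u1 is I1 x J1, u2 is I2 x J2, bias is J1 x J2.
    The J1 x J2 result plays the role of the core tensor (the feature).
    """
    core = matmul(matmul(transpose(u1), x), u2)
    return [[1.0 / (1.0 + math.exp(-(c + b))) for c, b in zip(crow, brow)]
            for crow, brow in zip(core, bias)]


def tf_param_count(i1, i2, j1, j2):
    """Parameters of the factorized layer: two factor matrices plus bias."""
    return i1 * j1 + i2 * j2 + j1 * j2


def dense_param_count(i1, i2, j1, j2):
    """Parameters of a dense layer mapping the flattened input to J1*J2 units."""
    return (i1 * i2) * (j1 * j2) + j1 * j2


# Toy 4 x 3 input with 4 x 2 and 3 x 2 factor matrices (made-up values).
x = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
u1 = [[1, 0], [0, 1], [0, 0], [0, 0]]
u2 = [[1, 0], [0, 1], [0, 0]]
bias = [[0.0, 0.0], [0.0, 0.0]]
h = tf_layer(x, u1, u2, bias)  # 2 x 2 feature matrix
```

Because each mode of the input gets its own small factor matrix, the parameter count grows with the sum of the mode sizes rather than their product, which is the intuition behind the abstract's "fewer parameters" claim. The two counting helpers make that comparison concrete for any chosen shapes.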