I see it in your eyes: Training the shallowest-possible CNN to recognise emotions and pain from muted web-assisted in-the-wild video-chats in real-time. Issue 6 (November 2020)
- Record Type:
- Journal Article
- Title:
- I see it in your eyes: Training the shallowest-possible CNN to recognise emotions and pain from muted web-assisted in-the-wild video-chats in real-time. Issue 6 (November 2020)
- Main Title:
- I see it in your eyes: Training the shallowest-possible CNN to recognise emotions and pain from muted web-assisted in-the-wild video-chats in real-time
- Authors:
- Pandit, Vedhas
Schmitt, Maximilian
Cummins, Nicholas
Schuller, Björn
- Abstract:
- Highlights: We propose the shallowest-possible, and perhaps the shallowest-ever, convolutional neural network model that can predict emotions from real-life, noisy, laggy, internet-based (in-the-wild) videos in real time, capturing nuances of emotions, i.e. value- and time-continuous affect prediction. The research we present in this paper is directly relevant to healthcare, for applications such as real-time patient monitoring and AI-assisted doctor-patient consultations. The proposed models are computationally inexpensive and can be embedded into devices such as smartglasses. We use a novel feature selection paradigm that is driven by feature attribution score computations. We investigate and reason about the AI performance, and present computations on how exactly it utilises the input features to make affect-related predictions (Explainable AI). We compute the relevance and utilisation of facial action unit (FAU)-derived features by the model, comparing it against the human perception of emotion expression. We extend this FAU-based 'affect' prediction approach to the FAU-based 'pain-intensity' prediction problem. As FAUs can be extracted in near real-time, and because the models we developed are exceptionally shallow, this study paves the way for robust, cross-cultural, end-to-end, in-the-wild real-time affect and pain prediction that is also (nuanced or) value- and time-continuous.
Abstract: Robust value- and time-continuous emotion recognition has enormous potential benefits within healthcare. For example, within mental health, a real-time patient monitoring system capable of accurately inferring a patient's emotional state could help doctors make an appropriate diagnosis and treatment plan. Such interventions could be vital in terms of ensuring a higher quality of life for the patient involved. To make such tools a reality, the associated machine learning systems need to be fast, robust and generalisable. In this regard, we present herein a novel emotion recognition system consisting of the shallowest realisable Convolutional Neural Network (CNN) architecture. We draw insights from visualisations of the trained filter weights and the facial action unit (FAU) activations, i.e. the inputs to the model, of the participants featured in the in-the-wild, spontaneous video-chat sessions of the SEWA corpus. Further, we demonstrate the generalisability of this approach on the German, Hungarian, and Chinese cultures available in this corpus. The obtained cross-cultural performance is a testimony to the universality of FAUs in the expression and understanding of human affective behaviours. These learnings were moderately consistent with the human perception of emotional expression. The practicality of the proposed approach is also demonstrated in another key healthcare application: pain intensity prediction. Key results from these experiments highlight the transparency of the shallow CNN structure. As FAUs can be extracted in near real-time, and because the models we developed are exceptionally shallow, this study paves the way for robust, cross-cultural, end-to-end, in-the-wild, explainable real-time affect and pain prediction that is value- and time-continuous.
- Is Part Of:
- Information processing & management. Volume 57:Issue 6(2020:Nov.)
- Journal:
- Information processing & management
- Issue:
- Volume 57:Issue 6(2020:Nov.)
- Issue Display:
- Volume 57, Issue 6 (2020)
- Year:
- 2020
- Volume:
- 57
- Issue:
- 6
- Issue Sort Value:
- 2020-0057-0006-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-11
- Subjects:
- Affect recognition -- Healthcare -- Real-time -- Explainable -- Feature selection -- In-the-wild
Information storage and retrieval systems -- Periodicals
Information science -- Periodicals
Systèmes d'information -- Périodiques
Sciences de l'information -- Périodiques
Information science
Information storage and retrieval systems
Periodicals
658.4038
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064573
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.ipm.2020.102347
- Languages:
- English
- ISSNs:
- 0306-4573
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4493.893000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14754.xml