Analysis of gender and identity issues in depression detection on de-identified speech. (January 2021)
- Record Type:
- Journal Article
- Title:
- Analysis of gender and identity issues in depression detection on de-identified speech. (January 2021)
- Main Title:
- Analysis of gender and identity issues in depression detection on de-identified speech
- Authors:
- Lopez-Otero, Paula
Docio-Fernandez, Laura
- Abstract:
- Highlights: Depression detection tools can help to remotely monitor the patients' mental state. Speaker de-identification allows the protection of patients' privacy. Depression detection with different speaker de-identification techniques is evaluated. Gender-dependent de-identification leads to better depression detection results. Speaker-dependent de-identification does not outperform speaker-independent approaches.
Abstract: Research in the area of automatic monitoring of emotional state from speech permits envisaging future novel applications for the remote monitoring of some common mental disorders, such as depression. However, these tools raise some privacy concerns since speech is sent via telephone or the Internet, and it is further stored or processed in remote servers. Speaker de-identification can be used to protect the privacy of these patients, but this procedure might affect the ability to perceive the disease when using automatic depression detection approaches. It is also important that the resulting de-identified speech has enough quality since practitioners may need to listen to the recordings to assess the patients' state. This paper performs an extensive analysis of depression detection from de-identified speech using different de-identification approaches based on voice conversion. In previous work, a de-identification technique based on pretrained transformation functions was assessed in the context of depression detection. That strategy is speaker-independent (i.e. not speaker-specific) and gender-independent (i.e. the gender of the speaker is not necessarily preserved), which makes it possible to implement it in a real-world scenario where no parallel training data is required between input and source speakers. 
This paper aims at analyzing different aspects of the aforementioned speaker de-identification approach in a depression detection scenario: 1) comparing the performance of the proposed speaker-independent technique with a speaker-dependent setting where parallel data between input and source speaker are available; 2) analyzing how this system behaves when the gender of the speaker is preserved, since this might be a desirable feature and has not been addressed in previous work; 3) assessing the performance of two different voice conversion methods in a setting where a limited amount of training data is available; specifically, de-identification based on frequency warping and amplitude scaling (FW+AS) was compared with a strategy based on generative adversarial networks (GAN). Experimental validation was carried out in the framework of the Audio/Visual Emotion Challenge 2014, and the results suggest that speaker-independent and gender-dependent de-identification is the most suitable option for depression level estimation since the trade-off between de-identification and depression estimation performance was superior to the other alternatives. In addition, the results suggest that the de-identification approach based on GAN achieves better de-identification performance than FW+AS while achieving comparable results for depression detection.
- Is Part Of:
- Computer speech & language. Volume 65(2021)
- Journal:
- Computer speech & language
- Issue:
- Volume 65(2021)
- Issue Display:
- Volume 65, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 65
- Issue:
- 2021
- Issue Sort Value:
- 2021-0065-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-01
- Subjects:
- Speaker de-identification -- Depression detection -- Voice conversion -- Frequency warping and amplitude scaling -- Generative adversarial networks
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2020.101118
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 16859.xml