Unsupervised speech representation learning for behavior modeling using triplet enhanced contextualized networks. (November 2021)

Record Type:: Journal Article
Title:: Unsupervised speech representation learning for behavior modeling using triplet enhanced contextualized networks. (November 2021)
Main Title:: Unsupervised speech representation learning for behavior modeling using triplet enhanced contextualized networks
Authors:: Li, Haoqi
Baucom, Brian
Narayanan, Shrikanth
Georgiou, Panayiotis
Abstract:: Highlights: A manifold representation is derived from human speech to capture behavioral information. The feasibility of unsupervised cross-domain behavioral modeling is verified. Context information and metric learning are complementary in behavior modeling. Unsupervised training with large-scale data helps domain behavior modeling. Abstract: Speech encodes a wealth of information related to human behavior and has been used in a variety of automated behavior recognition tasks. However, extracting behavioral information from speech remains challenging, in part due to inadequate training data resources stemming from the often low occurrence frequencies of specific behavioral patterns. Moreover, supervised behavioral modeling typically relies on domain-specific construct definitions and corresponding manually annotated data, which makes generalization across domains challenging. In this paper, we exploit the stationary properties of human behavior within an interaction and present a representation learning method to capture behavioral information from speech in an unsupervised way. We hypothesize that nearby segments of speech share the same behavioral context and hence map onto similar underlying behavioral representations. We present an encoder-decoder based Deep Contextualized Network (DCN) as well as a Triplet-Enhanced DCN (TE-DCN) framework to capture the behavioral context and derive a manifold representation, where speech frames with similar behaviors are closer while …
Is Part Of:: Computer speech & language. Volume 70(2021)
Journal:: Computer speech & language
Issue:: Volume 70(2021)
Issue Display:: Volume 70, Issue 2021 (2021)
Year:: 2021
Volume:: 70
Issue:: 2021
Issue Sort Value:: 2021-0070-2021-0000
Page Start:
Page End:
Publication Date:: 2021-11
Subjects:: Behavior modeling -- Unsupervised representation learning -- Context information -- Metric learning
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
Journal URLs:: http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
DOI:: 10.1016/j.csl.2021.101226
Languages:: English
ISSNs:: 0885-2308
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 17252.xml