Automatic offline annotation of turn-taking transitions in task-oriented dialogue. (March 2023)
- Record Type:
- Journal Article
- Title:
- Automatic offline annotation of turn-taking transitions in task-oriented dialogue. (March 2023)
- Main Title:
- Automatic offline annotation of turn-taking transitions in task-oriented dialogue
- Authors:
- Brusco, Pablo
Gravano, Agustín
- Abstract:
- As the volume of recorded conversations continues to surge, so does the need for their automatic processing. Plenty of information beyond words may be extracted from the speech signal that could be valuable in domains such as call-center quality assurance. In particular, describing the dynamics of turn-taking exchanges allows for a deeper understanding of the development and outcome of a dialogue. In this paper, we investigate the construction of an automatic turn-taking annotation tool based on recordings of entire conversations (in offline mode), an unexplored topic to our knowledge. We experiment with two supervised learning approaches, using recurrent neural networks and random forests, on a corpus of Argentine Spanish task-oriented dialogues annotated with 12 turn-taking categories following standard guidelines. Our models achieve promising results, with F1 scores ranging 0.7–0.9 for the most frequent labels (e.g., smooth switches, backchannels), but much lower for the least frequent ones (various kinds of interruptions), for which further research is needed. We also evaluate our best-performing models considering their generalizability in scenarios of growing difficulty, including dialogues in two different languages (English and Slovak). Finally, to address the typical data scarcity issue, we analyze the impact of combining training data from different corpora, again including cross-linguistic data.
Highlights:
Machine learning (ML) models for offline annotation of turn-taking exchanges in entire dialogues.
Full taxonomy of turn-taking transitions, including smooth switches, interruptions, and backchannels.
Comparison of two ML approaches: random forests, and recurrent neural model (with a novel loss function strategy).
Generalizability assessment of our best models in scenarios of growing difficulty in Spanish.
Cross-linguistic evaluation in English and Slovak dialogues, as a way of dealing with labeled data scarcity.
- Is Part Of:
- Computer speech & language. Volume 78(2023)
- Journal:
- Computer speech & language
- Issue:
- Volume 78(2023)
- Issue Display:
- Volume 78, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 78
- Issue:
- 2023
- Issue Sort Value:
- 2023-0078-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-03
- Subjects:
- Dialogue -- Turn-taking -- Machine learning -- Offline audio processing
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2022.101462
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24470.xml