The TTS-driven affective embodied conversational agent EVA, based on a novel conversational-behavior generation algorithm. (January 2017)
- Record Type:
- Journal Article
- Title:
- The TTS-driven affective embodied conversational agent EVA, based on a novel conversational-behavior generation algorithm. (January 2017)
- Main Title:
- The TTS-driven affective embodied conversational agent EVA, based on a novel conversational-behavior generation algorithm
- Authors:
- Rojc, Matej
Mlakar, Izidor
Kačič, Zdravko
- Abstract:
- As a result of the convergence of different services delivered over the internet protocol, internet protocol television (IPTV) may be regarded as one of the most widespread user interfaces accepted by a highly diverse user domain. Every generation, from children to the elderly, can use IPTV for recreation, as well as for gaining social contact and stimulating the mind. However, technological advances in digital platforms go hand in hand with the complexity of their user interfaces, and thus induce technological disinterest and technological exclusion. Therefore, interactivity and affective content presentations are, from the perspective of advanced user interfaces, two key factors in any application incorporating human-computer interaction (HCI). Furthermore, the perception and understanding of the information (meaning) conveyed is closely interlinked with visual cues and non-verbal elements that speakers generate throughout human-human dialogues. In this regard, co-verbal behavior provides information to the communicative act. It supports the speaker's communicative goal and allows for a variety of other information to be added to his/her messages, including (but not limited to) psychological states, attitudes, and personality. In the present paper, we address complexity and technological disinterest through the integration of natural, human-like multimodal output that incorporates a novel combined data- and rule-driven co-verbal behavior generator that is able to extract features from unannotated, general text. The core of the paper discusses the processes that model and synchronize non-verbal features with verbal features even when dealing with unknown context and/or limited contextual information. In addition, the proposed algorithm incorporates data-driven (speech prosody, repository of motor skills) and rule-based concepts (grammar, gesticon). The algorithm first classifies the communicative intent, then plans the co-verbal cues and their form within the gesture unit, generates temporally synchronized co-verbal cues, and finally realizes them in the form of human-like co-verbal movements. In this way, the information can be represented in the form of both meaningfully and temporally synchronized co-verbal cues with accompanying synthesized speech, using communication channels to which people are most accustomed.
Highlights: Automatic planning, designing, and recreation of co-verbal behavior for a Smart IPTV system named UMB-SmartTV. Procedures and algorithms for modeling the conversational dialog. TTS- and data-driven expressive model for generating co-verbal behavior. Semiotic classification of intent incorporating linguistic and prosodic cues. Visual prosody reflecting features of the speech signal and the context of the input text.
- Is Part Of:
- Engineering applications of artificial intelligence. Volume 57(2016:Sep.)
- Journal:
- Engineering applications of artificial intelligence
- Issue:
- Volume 57(2016:Sep.)
- Issue Display:
- Volume 57 (2016)
- Year:
- 2016
- Volume:
- 57
- Issue Sort Value:
- 2016-0057-0000-0000
- Page Start:
- 80
- Page End:
- 104
- Publication Date:
- 2017-01
- Subjects:
- Co-verbal behavior -- Multimodal interfaces -- IPTV -- Embodied conversational agents -- Communicative intent -- Gestures -- Speech prosody -- Data-driven -- Segmentation
Engineering -- Data processing -- Periodicals
Artificial intelligence -- Periodicals
Expert systems (Computer science) -- Periodicals
Ingénierie -- Informatique -- Périodiques
Intelligence artificielle -- Périodiques
Systèmes experts (Informatique) -- Périodiques
Artificial intelligence
Engineering -- Data processing
Expert systems (Computer science)
Periodicals
620.00285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09521976
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.engappai.2016.10.006
- Languages:
- English
- ISSNs:
- 0952-1976
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3755.704500
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 2033.xml