14th Speech in Noise Workshop, 12-13 January 2023, Split, Croatia

T08
Towards spoken interaction with embodied agents

Khiet Truong
University of Twente, Enschede, Netherlands

In this presentation, I will highlight some of the research we have been carrying out on spoken interactions with embodied agents. Talking to embodied agents, such as robots or virtual agents, can be beneficial for several reasons, ranging from being able to multitask hands-free to expressing oneself in a less restricted manner in the context of storytelling or information search. Creating these spoken interactions requires not only automatic speech recognition technology but also knowledge of paralinguistics, conversational interaction, and human-computer interaction. I will present several examples of our research in which speech technology and knowledge from speech communication research drive our investigations into spoken interactions with embodied agents. The first example involves creating a listening agent. Research suggests that the display of listening behavior is often cued by the speaker's prosodic behavior. We developed and evaluated several backchannel strategies for a virtual agent. The second example concerns robot-supported spoken conversational search in media for children. Searching via speech in a multi-turn conversational interaction with the help of a robot has several benefits for children who find it difficult to compress their search needs into one query. By asking clarification questions or giving suggestions, the robot can support the child in their information seeking. However, how do we design these interactions in a responsible manner, and what role does trust play here? Thirdly, at a time when voice assistants are predominantly female-sounding by default, we wondered to what extent gender norms control the way we perceive voices. In this case, we looked into a nonverbal vocalisation that can be associated with multiple social functions, namely laughter. I will present the results of these studies and discuss implications for the development of spoken interactive embodied agents.

Last modified 2023-01-06 23:41:06