Challenges in reinforcement learning of negotiation dialogue policies

Kallirroi Georgila
University of Southern California, USA

3:15pm-4:15pm, 27 April 2016
EM G.44

Abstract

The dialogue policy of a dialogue system decides what dialogue move (also called "action") the system should make given the dialogue context (also called "dialogue state"). Building hand-crafted dialogue policies is a hard task, and there is no guarantee that the resulting policies will be optimal. This issue has motivated the dialogue community to use statistical methods for automatically learning dialogue policies, the most popular of which is reinforcement learning (RL). However, to date, RL has mainly been used to learn dialogue policies in slot-filling applications (e.g., restaurant recommendation, flight reservation), largely ignoring other more complex genres of dialogue such as negotiation.
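To make the state/action framing concrete, here is a minimal sketch of RL for a slot-filling dialogue policy: a toy two-slot domain with a simulated user, learned with tabular Q-learning. The domain, slot names, rewards, and hyperparameters are all illustrative assumptions, not anything from the talk.

```python
import random

# Toy slot-filling dialogue MDP (illustrative only): the dialogue state
# records which of two slots (e.g. cuisine, price) have been filled;
# actions either ask for a slot or confirm the request.
SLOTS = ["cuisine", "price"]
ACTIONS = ["ask_cuisine", "ask_price", "confirm"]

def step(state, action):
    """Simulated user: an asked slot gets filled; confirming with all
    slots filled ends the dialogue with a positive reward."""
    state = list(state)
    if action.startswith("ask_"):
        state[SLOTS.index(action[4:])] = 1
        return tuple(state), -0.1, False   # small per-turn cost
    if all(state):
        return tuple(state), 1.0, True     # successful confirmation
    return tuple(state), -1.0, True        # premature confirmation fails

def train(episodes=2000, alpha=0.2, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng, Q = random.Random(seed), {}
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            vals = {a: Q.get((state, a), 0.0) for a in ACTIONS}
            a = rng.choice(ACTIONS) if rng.random() < eps else max(vals, key=vals.get)
            nxt, r, done = step(state, a)
            best = 0.0 if done else max(Q.get((nxt, b), 0.0) for b in ACTIONS)
            Q[(state, a)] = Q.get((state, a), 0.0) + alpha * (r + gamma * best - Q.get((state, a), 0.0))
            state = nxt
    return Q

def policy(Q, state):
    """The learned dialogue policy: greedy action for a dialogue state."""
    return max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))
```

With everything observable and only four states, the learned policy simply asks for missing slots and then confirms; the challenges the talk addresses arise precisely because negotiation states and actions do not stay this small.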

This talk presents challenges in reinforcement learning of negotiation dialogue policies. The first part of the talk focuses on applying RL to a two-party multi-issue negotiation domain. Here the main challenges are the very large state and action spaces, and learning negotiation dialogue policies that can perform well in a variety of negotiation settings, including against interlocutors whose behaviour has not been observed before. In negotiation, the reward function of an agent can depend on multiple individual and socio-cultural factors. In the second part of the talk, I will show how we can use inverse reinforcement learning (IRL) to learn a model of cultural decision-making in a simple negotiation game (the Ultimatum Game), which generates behaviour close to that of human players of the game in four different cultures. Good negotiators try to adapt their behaviours based on their interlocutors' behaviours. However, current approaches to using RL for dialogue management assume that the dialogue system learns by interacting with a stationary environment (the user), i.e., an environment that does not change over time. In the third part of the talk, I will present an experiment comparing single-agent RL and multi-agent RL of negotiation dialogue policies in a resource allocation scenario.

Bio

Kallirroi Georgila is a Research Scientist at the Institute for Creative Technologies (ICT) at the University of Southern California (USC) and a Research Assistant Professor at USC's Computer Science Department. Before joining USC-ICT in 2009 she was a Research Scientist at the Educational Testing Service (ETS) and before that a Research Fellow at the School of Informatics at the University of Edinburgh. Her research interests include all aspects of spoken dialogue processing with a focus on reinforcement learning of dialogue policies, expressive conversational speech synthesis, and speech recognition. Dr. Georgila has published over 80 journal articles, conference and workshop papers, and technical reports on various topics in spoken dialogue processing. She has served on the organizing, senior, and program committees of many conferences and workshops. Her research work is funded by the National Science Foundation and the Army Research Office.