Interview: AAAI Symposium on Dialog with Robots Brings Together Researchers in Speech Technology, Artificial Intelligence, and Human-Robot Interaction
Matthew Marge
SLTC Newsletter, February 2011
Overview
The idea that we can have fluid conversations with robots has largely been limited to fictional robots such as C-3PO from Star Wars and Data from Star Trek. Researchers in human-robot dialog believe that we can take an established technology (spoken dialog systems), add it to robots, and redesign robots to work well with people on a variety of tasks (human-robot interaction). While several groups have developed dialog systems for robots, the spoken dialog and human-robot interaction communities have largely worked independently, publishing their work at conferences such as SIGdial and HRI, respectively.
In November, an AAAI symposium set out to address a growing problem in robotics: how should we build dialog systems for robots? This meeting brought together researchers from human-robot interaction and spoken dialog. The AAAI Fall Symposium on Dialog with Robots provided a forum for researchers in speech technology, robotics, human-computer interaction, and related fields to identify key challenges in the field, present new work in the area, and hold open discussions on several emerging topics. Leading the efforts behind the workshop were Dan Bohus (Microsoft Research), Eric Horvitz (Microsoft Research), Takayuki Kanda (Advanced Telecommunications Research Institute International), Bilge Mutlu (University of Wisconsin-Madison), and Antoine Raux (Honda Research Institute).
We had a chance to interview one of the organizers about the workshop and existing challenges facing human-robot dialog researchers.
SLTC: Would you consider the workshop a success? What do you think attendees learned most from the workshop?
Dan Bohus:
My sense is that overall the symposium was a great success - I thought we had a great response from the community, with a larger than expected number of technical submissions and participants. A large number of ideas and viewpoints were discussed and reflected upon during the technical presentations, the keynotes, and the very lively open discussion sessions. People I spoke with were all excited about the symposium and about getting the dialog and HRI communities to interact more closely, and I am hopeful we can carry that momentum forward: a few follow-up efforts are already in place; for instance, there is a special theme on situated dialog at this year's SIGdial... Personally, I really enjoyed seeing the diversity of viewpoints and angles from which people approach this space.
SLTC: What were your impressions from the discussion sessions? Can you identify any focus areas from them, particularly those that speech and language researchers should consider?
DB:
Like you point out, there were indeed a variety of topics raised and discussed throughout the symposium. For me, some of the major areas that emerged from the paper submissions and from the discussions we had surrounded issues of modeling communicative mechanisms in embodied settings (e.g., attention, engagement, turn-taking, grounding), the role of physicality in communication, the interplay between communication and actions (and the various challenges that interplay raises at different time scales), aspects of learning from and through interaction, spatial language understanding, etc. I think each of these areas poses interesting challenges and I think as a community we are only beginning to chart the landscape at this intersection of spoken dialog and HRI. There were also a number of interesting papers at the symposium describing various research systems, platforms, toolkits, etc., reflecting vibrant efforts in this space, and also highlighting the need for developing common challenges, toolkits, metrics, etc. as this nascent community moves forward.
Lessons learned
The organizers also composed a final report for the meeting [1]. Here's an excerpt summarizing the key contributions from the workshop:
Ideas spanning a spectrum of interrelated research topics were presented and discussed during oral presentations and a poster session. Recurrent themes centered around challenges and directions with the use of dialog by physically embodied agents, taking into consideration aspects of the task, surrounding environment, and broader context. Several presentations highlighted problems with modeling communicative competencies that are fundamental in creating, maintaining and organizing interactions in physical space, such as engagement, turn-taking, joint attention, and verbal and non-verbal communicative skills. Other presenters explored the challenges of leveraging physical context in various language understanding problems such as reference resolution, or the challenges of coupling action and communication in the interaction planning and dialog management process. A number of papers reported on developmental approaches for acquiring knowledge through interaction, and focused on challenges such as learning new words, concepts, meanings and intents, and grounding this learning in the interaction with the physical world. The topics covered also included interaction design challenges, descriptions of existing or planned systems, research platforms and toolkits, theoretical models, and experimental results.
Several discussion sessions allowed researchers from the diverse fields represented at the meeting to talk about what's next:
The symposium included three moderated, open discussions that provided a forum for exchanging ideas on some of the key topics in the symposium. The first discussion aimed to address some of the challenges at the crossroads of dialog and HRI. The physicality of such interactions was highlighted as a critical factor and the prospect of identifying a core, yet simple set of principles and first-order concepts to be reasoned about, or a “naïve physics” of situated dialog and discourse, was raised and discussed. A second open discussion centered around the interplay between action and communication, and highlighted ideas such as viewing communication as joint action and the importance of creating models for incremental processing that can support recognition and generation of actions and phenomena occurring on different time scales. The final discussion addressed several other fundamental issues such as how we might move forward in this nascent field. Discussion touched on the need for unified platforms and challenges for supporting comparative evaluations of different techniques, the pros and cons of simulation-based approaches, and even the value of revisiting fundamental questions: Why should we endow robots with the ability to engage in dialogue with people? What assumptions are we making - and which can we make?
We look forward to hearing more about how this rather new field grows!
References
- [1] Dan Bohus, Eric Horvitz, Takayuki Kanda, Bilge Mutlu, and Antoine Raux (2011), "Final Report: AAAI Fall Symposium on Dialog with Robots."
If you have comments, corrections, or additions to this article, please contact the author: Matthew Marge, mrma...@cs.cmu.edu.

