The CLASSiC Project: Computational Learning in Adaptive Systems for Spoken Conversation

Oliver Lemon

SLTC Newsletter, February 2011

This article describes some of the main achievements to date in the EC FP7 project "CLASSiC", which ends in early 2011. The project focuses on statistical methods for dialogue processing, and is one of the largest current European research projects in speech and language technology. It is coordinated by Heriot-Watt University, and is a collaboration between Cambridge University, the University of Geneva, the University of Edinburgh, L'Ecole Supérieure d'électricité (SUPELEC), and France Telecom / Orange Labs.

Project Summary

The overall goal of the CLASSiC project has been to develop statistical machine learning methods for the deployment of accurate and robust spoken dialogue systems (SDS). These systems can learn from experience - either from dialogue data that has already been collected, or online through interactions with users. We have deployed systems (for data collection and evaluation) for tourist information, customer support, and appointment scheduling. One system, for appointment scheduling, has been available for public use in France since March 2010.

The CLASSiC architecture

CLASSiC has proposed and developed a unified treatment of uncertainty across the entire SDS architecture (speech recognition, spoken language understanding, dialogue management, natural language generation, and speech synthesis). This architecture allows multiple possible analyses (e.g. n-best lists of ASR hypotheses, distributions over user goals) to be represented, maintained, and reasoned with robustly and efficiently. It supports a layered hierarchy of supervised learning and reinforcement learning methods, in order to facilitate mathematically principled optimisation and adaptation techniques. At the same time, the CLASSiC architecture maintains the modularity of traditional SDS, allowing the separate development of statistical models of speech recognition, spoken language understanding, dialogue management, natural language generation and speech synthesis. For more details, see citations below, or Deliverable 5.1.2.


A system "belief state" showing a probability distribution over possible user goals (size of the bar on the left indicates relative probability of the corresponding meanings on the right).
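To make the idea of a belief state concrete, here is a minimal sketch of how a distribution over user goals might be updated from an n-best list. It is not the CLASSiC implementation: the goal labels, the `update_belief` and `render_bars` functions, and the simple floor-based observation model are all assumptions made for illustration.

```python
def update_belief(belief, nbest, floor=0.05):
    """Bayes-style update of a belief over user goals.

    belief: dict mapping goal -> prior probability (sums to 1)
    nbest:  dict mapping goal -> confidence score from the n-best list
    floor:  assumed likelihood for goals missing from the n-best list,
            so that a recognition error never rules a goal out completely
    """
    unnormalised = {
        goal: prior * nbest.get(goal, floor)
        for goal, prior in belief.items()
    }
    total = sum(unnormalised.values())
    return {goal: p / total for goal, p in unnormalised.items()}


def render_bars(belief, width=20):
    """Text version of the bar display: longer bar = more probable goal."""
    lines = []
    for goal, p in sorted(belief.items(), key=lambda kv: -kv[1]):
        bar = "#" * round(p * width)
        lines.append(f"{bar:<{width}} {goal} ({p:.2f})")
    return lines


# Hypothetical tourist-information goals with a uniform prior.
prior = {"food=italian": 1 / 3, "food=indian": 1 / 3, "food=chinese": 1 / 3}
# Hypothetical n-best list: two competing interpretations of one utterance.
nbest = {"food=italian": 0.6, "food=indian": 0.3}

posterior = update_belief(prior, nbest)
for line in render_bars(posterior):
    print(line)
```

Because the update multiplies the prior by the evidence instead of replacing it, a goal that is absent from one n-best list keeps a small amount of probability and can recover in later turns.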

Research Areas

Progress is being made in several areas:

  • Computational Learning approaches to Dialogue Management (DM):
    • using Partially Observable Markov Decision Processes (POMDPs)
    • integration of online reinforcement learning capabilities into an industrial SDS.
  • Statistical approaches to Spoken Language Understanding (SLU).
  • Developing simulated users for training Dialogue Management and Natural Language Generation strategies.
  • A new statistical learning approach to Natural Language Generation (NLG) in SDS.
  • Online reinforcement learning of optimal Text-To-Speech (TTS) variants.
  • An effective statistical approach to synthesize natural speech with word-level emphasis.
  • Design of the CLASSiC architecture, and implementation of the CLASSiC prototype systems:
    • a Town Information system
    • an internet connection Self-Help system.
    • a customer service Appointment Scheduling system.
    • VoIP-based tourist information systems for real tourist use. (Please check back on our website for details.)
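Online reinforcement learning over a small set of alternatives, such as competing TTS variants or prompt wordings, can be pictured as a multi-armed bandit. The sketch below uses an epsilon-greedy rule purely as an illustration; it is not the algorithm used in the project, and the class name, variant labels, and reward definition (for example 1.0 for a completed call, 0.0 otherwise) are assumptions.

```python
import random


class EpsilonGreedySelector:
    """Pick among variants online, learning from a reward after each dialogue."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon            # probability of exploring at random
        self.rng = random.Random(seed)
        self.counts = {v: 0 for v in variants}
        self.values = {v: 0.0 for v in variants}  # running mean reward

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.counts))
        return max(self.values, key=self.values.get)

    def update(self, variant, reward):
        # Incremental mean: no need to store the full reward history.
        self.counts[variant] += 1
        n = self.counts[variant]
        self.values[variant] += (reward - self.values[variant]) / n


# Hypothetical usage with exploration switched off for a deterministic demo.
selector = EpsilonGreedySelector(["variant_a", "variant_b"], epsilon=0.0)
selector.update("variant_b", 1.0)   # one successful dialogue
selector.update("variant_b", 0.0)   # one unsuccessful dialogue
best = selector.choose()            # variant_b: mean 0.5 beats the untried 0.0
```

In a deployed system `epsilon` would be greater than zero so that every variant keeps being tried occasionally while most users receive the variant currently estimated to be best.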

Evaluation Results

The CLASSiC systems and components have been evaluated in simulation and in trials with real users, the latter both in laboratory conditions and "in the wild" (i.e. outside of the lab). The final evaluation of the systems is ongoing at the time of writing. (Please see our publications page for the referenced papers.)

We have obtained the following evaluation results to date:

  • The Hidden Information State system (a POMDP system) improves task success by 25% in high-noise conditions, when tested in simulation (Young et al., Computer Speech and Language 2009)
  • 5% reduction in word error rate when using predictions from a simulated user to re-rank n-best lists of ASR hypotheses (Lemon and Konstas, EACL 2009)
  • Effective transfer of semantic annotations from English to French (Henderson et al., Deliverable 2.3, 2009)
  • An SVM-based data-driven semantic parser was shown to perform as well as the state-of-the-art on the ATIS dataset, and outperformed a handcrafted parser by 4% in terms of semantic concept precision and recall, in the tourist information domain. (Mairesse et al., ICASSP 2009)
  • Adaptive Natural Language Generation using Reinforcement Learning techniques, evaluated with real users: 12% decrease in time taken, and a 15% increase in task completion rate (Janarthanam and Lemon, SIGDIAL 2010)
  • A statistical planning approach to NLG for Information Presentation (content planning and attribute selection) outperforms hand-coded policies and a policy learned from human performance. Tested in simulation in the Tourist Information domain (Rieser et al., ACL 2010)
  • A fully data-driven statistical language generator using factored language models was developed to produce utterances in the tourist information domain. Its output was shown not to differ significantly from human paraphrases (over 200 test utterances). Furthermore, active learning from semantically-labelled utterances collected through crowd-sourcing was shown to significantly improve performance on sparse training sets. (Mairesse et al., ACL 2010)
  • Expressive speech synthesis with word-level emphasis, evaluated with human listeners, obtained significant improvements in conveying emphasis: compared to a standard statistical synthesizer, the percentage of correctly conveyed emphasized words increased by over 20%. (Yu et al., ICASSP 2010)
  • Online reinforcement learning improved the commercial application's completion rate by 10% with real customers (Putois et al., SIGDIAL 2010)
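The idea of re-ranking n-best ASR hypotheses with predictions from a user model can be sketched as a log-linear combination of two scores. This is an illustration only, not the method of the cited paper: the `rerank` function, the interpolation `weight`, the `floor` probability, and the example utterances are assumptions, and a real user simulation would typically predict dialogue acts, not word strings.

```python
import math


def rerank(nbest, user_model, weight=0.5, floor=1e-3):
    """Re-order ASR hypotheses using what the user model expects to hear.

    nbest:      list of (hypothesis, asr_probability), probabilities > 0
    user_model: dict mapping hypothesis -> predicted probability
    weight:     how much to trust the user model relative to the recogniser
    floor:      assumed probability for hypotheses the user model did not predict
    """
    def score(item):
        hypothesis, asr_prob = item
        predicted = user_model.get(hypothesis, floor)
        return (1 - weight) * math.log(asr_prob) + weight * math.log(predicted)

    return sorted(nbest, key=score, reverse=True)


# Hypothetical n-best list in which the recogniser's top choice is a misrecognition.
nbest = [("a chip hotel", 0.45), ("a cheap hotel", 0.40), ("cheap hostel", 0.15)]
# Hypothetical predictions of what a user is likely to say at this point.
user_model = {"a cheap hotel": 0.7, "cheap hostel": 0.2}

reranked = rerank(nbest, user_model)
top_hypothesis = reranked[0][0]
```

Here the contextually plausible hypothesis moves to the top even though the recogniser alone ranked it second, which is the mechanism by which such re-ranking can reduce word error rate.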

Performance in shared challenges

CLASSiC partners have deployed the technology in the Spoken Dialogue Challenge and the CoNLL shared tasks on syntactic-semantic dependency parsing. The CLASSiC technologies were amongst the top performers on these tasks.

New and open dialogue data-sets

The CLASSiC project will release freely available dialogue data to the research community at the end of the project (project Deliverable D6.5). This can be expected towards the middle of 2011. This data will consist of anonymised system audio, logs, transcriptions, and some annotated data from several of the CLASSiC dialogue systems. The released data will amount to several thousand dialogues.

Acknowledgements and more information

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 216594 (CLASSiC project).

Thanks to the CLASSiC partners for providing input to this article.

For more information, see:

Oliver Lemon is a Professor in the School of Mathematical and Computer Sciences at Heriot-Watt University, Edinburgh, where he leads the Interaction Lab. He is the Coordinator of the CLASSiC Project.