Report from IEEE Workshop on Spoken Language Technology in Goa

Alex Rudnicky

SLTC Newsletter, January 2009

The second biennial IEEE workshop on Spoken Language Technology (SLT) took place December 15-18, 2008 in Goa, India. The SLT workshop alternates with the IEEE ASRU (Automatic Speech Recognition and Understanding) workshop and is meant to focus on topics that connect speech recognition and understanding to some of its applications, such as machine translation, dialog, search and summarization. A particular focus of this year’s meeting was Spoken Language Technology for Development (sometimes abbreviated as SLT4D), the application of SLT to social needs, and several sessions were devoted to this topic. SLT 2008 was also the first IEEE speech technology meeting to be held on the Indian subcontinent, and it provided a long-overdue opportunity for European, Asian and American researchers to interact with members of the very active Indian speech research community. Amitav Das, the Chair, and Srinivas Bangalore, the Co-Chair, did an outstanding job of organizing the meeting and its associated activities. Unfortunately, due to circumstances beyond the organizers’ control, attendance was much smaller than anticipated: there were 88 attendees out of 122 registrations. This year’s workshop attracted 154 regular submissions, of which 72 were accepted.

A special session on SLT in India introduced participants to the linguistic landscape of India. Your correspondent was struck by both the sheer diversity and the sizes of the language groups (22 official languages, 29 with more than a million speakers each, 122 languages spoken by at least 10,000 people). Some interesting accommodations have been prompted by practical considerations; for example, communities without a written language might borrow script from multiple neighboring groups. Dr Mallinkarjun of the Central Institute of Indian Languages described a major effort currently under way to collect corpora for 24 major languages, under the direction of the Linguistic Data Consortium for Indian Languages. The collection effort extends beyond speech and encompasses a variety of linguistic resources, including parallel corpora and dictionaries. Prof Hema Murthy of IIT-Madras described a variety of ways in which language technologies are being used for education and training (for example, to provide exam preparation); of particular interest were descriptions of how the needs of practical applications drive development in core technology (for example, speech synthesis). Dr Amitav Das of Microsoft India provided a comprehensive overview of speech technology activity in India, in both industry and academia. Dr Das reinforced the point that understanding how to use speech technology to meet people’s needs in turn generates interesting research questions.

The workshop keynote address was given by Prof Giuseppe Riccardi of the University of Trento. It described a comprehensive program of research exploring next-generation spoken language interfaces, including multi-modality and system-originated transactions, and featured a compelling demonstration of the latter in action.

The workshop also included two tutorials addressing spoken language technology in development. "Rapid Language Adaptation Tools and technologies for Multilingual Speech Processing System", prepared by Prof Tanja Schultz of Karlsruhe and Carnegie Mellon Universities (and presented by Prof Pascale Fung of HKUST), focused on the SPICE project and the development of tools for rapid acquisition of language data and simplified configuration of models and applications that would allow non-specialists to create useful artifacts. The second tutorial, on the World Wide Telecom Web, given by Dr Nitendra Rajput of IBM India in Delhi, described the development and successful deployment of a speech-only web service that enables access to information resources (for example, market prices) over mobile phones. Of particular note is that the system allows ordinary users to create their own content entirely over the phone (for example, advertising services or sharing knowledge about farming techniques).

Following the model set by previous workshops, the meeting consisted of a single track of poster sessions, each organized around a specific topic and preceded by an introductory overview lecture. In contrast to the previous SLT meeting, SLT 2008 featured a full session on spoken language generation; the kinds of problems addressed in summarization and search seem to have evolved toward ones that focus more on complete systems than on component technologies. In this meeting, as in others recently, machine translation featured prominently.