A Brief Overview of Meeting Understanding Tasks
Dilek Hakkani-Tür
SLTC Newsletter, January 2010
Meetings are an efficient way to interact, and they create unique knowledge-sharing opportunities among people with different areas of expertise. Every day, especially within organizations, people meet for various reasons, such as discussing issues, assigning tasks, and planning. Although meetings are so common, there is still no widely adopted automated or semi-automated mechanism for tracking meeting discussions, following up on task assignments, saving meeting content for later use by participants or non-participants, or checking the efficiency of meetings.
Recently, the availability of meeting corpora such as the ICSI [1] and AMI [2] corpora, and of shared task evaluations such as those run by NIST, has facilitated research on automatic processing of meeting speech. The ICSI corpus [1] consists of 75 natural meetings, with a variable number of participants, that took place at ICSI. The AMI corpus [2] is a multi-modal, scenario-driven data collection from meetings of 4 participants each. More generally, the amount of human-human communication stored in audio form has grown rapidly, providing ample source material for later use. In particular, the prominence of voice search as a basic user activity has increased significantly along with advances in speech recognition and retrieval.
However, there are still open questions: What information from these meetings would be useful later? Does this information depend on the user or the purpose? How could the transcripts of these conversations be used? Our goal in this article is not to answer these questions, but to provide a brief (and far from complete) overview of some of the research done in the last decade on processing meetings to provide access to their contents:
- Meeting summarization aims to generate a compact summary of meeting discussions. These summaries can be formed by extracting original speaker utterances (extractive summarization) or by formulating new sentences for the summary (abstractive summarization). Inspired by text summarization approaches, previous work on meeting summarization has mainly focused on the first approach [3,4,5]. Other work has considered interactive meeting summarization, where feedback from users (in the form of weighted keywords entered through a graphical user interface) is incorporated into summarization [6], or where participants' notes are used to learn to detect "noteworthy" utterances over a sequence of meetings of the same group of participants [7].
- Action item detection aims to detect the assignment of tasks to people, and any associated deadlines, during a meeting. The detected items can be used to enter such information into the relevant person's calendar, or to track status and progress in subsequent meetings. This task can be viewed as classifying meeting utterances into one or more categories. For example, four types of action-item-related utterances are identified in [8]: task description, responsible party identification, deadline assignment, and agreement/disagreement of the responsible party.
- Decision detection aims to detect decision-making sub-dialogs in multi-party dialogs. The decisions made in meetings can be used for indexing, so that one can go back and access the contents of a meeting around the time a specific decision was made. They can also be used to track the progress and efficiency of meetings. In related work, [9] attempted to model the structure of decision-making sub-dialogs and identified the role of utterances in the decision-making process, such as utterances that initiate a discussion by raising an issue, utterances that propose a resolution for the raised issue, and utterances that express agreement with a proposed resolution.
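The view of action item detection as classifying each utterance into one or more categories can be illustrated with a toy tagger. This is only a minimal sketch under stated assumptions, not the method of [8]: the label names, the cue patterns, and the functions `tag_utterance` and `tag_meeting` are hypothetical and invented here for illustration, whereas published systems typically learn classifiers from annotated meeting data.

```python
import re

# Hypothetical cue patterns, invented for illustration only. They are NOT
# taken from [8]; a real system would learn such cues from annotated data.
CUES = {
    "task_description": re.compile(r"\b(need to|should|have to|send|write|prepare)\b"),
    "responsible_party": re.compile(r"\b(i will|i'll|i can|you will|can you|could you)\b"),
    "deadline": re.compile(
        r"\b(by|before|until)\s+"
        r"(monday|tuesday|wednesday|thursday|friday|tomorrow|next week)\b"
    ),
    "agreement": re.compile(r"\b(okay|ok|sure|sounds good|yes|yeah)\b"),
}


def tag_utterance(text):
    """Return the set of labels whose cue pattern matches the utterance."""
    lowered = text.lower()
    return {label for label, pattern in CUES.items() if pattern.search(lowered)}


def tag_meeting(utterances):
    """Tag each (speaker, text) pair; one utterance may receive several labels."""
    return [(speaker, text, tag_utterance(text)) for speaker, text in utterances]


meeting = [
    ("A", "We need to send the report to the sponsor."),
    ("B", "I'll do that by Friday."),
    ("A", "Okay, sounds good."),
    ("C", "The weather was nice last weekend."),
]

for speaker, text, labels in tag_meeting(meeting):
    print(speaker, sorted(labels), text)
```

Returning a set of labels rather than a single label reflects the point that one utterance can play several roles at once, for example naming both the responsible party and the deadline.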
All of these tasks, along with less studied tasks such as argument diagramming [10], detection of agreement and disagreement [11,12], and detection of sentiment [13], would greatly improve the ability to automatically browse, summarize, and graphically visualize various aspects of the spoken content of meetings.
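As an illustration of extractive summarization, the sketch below greedily selects utterances in the spirit of maximum marginal relevance, the technique named in the title of [4]. It rests on simplifying assumptions: similarity is plain cosine over word counts rather than the corpus-based and knowledge-based measures studied in [4], and the function names and the `trade_off` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts for one utterance."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_summary(utterances, k=2, trade_off=0.7):
    """Greedily pick up to k utterances that are relevant to the whole
    meeting but not redundant with the utterances already selected."""
    vectors = [bag_of_words(u) for u in utterances]
    meeting = sum(vectors, Counter())  # word counts of the whole meeting
    selected = []
    while len(selected) < min(k, len(utterances)):
        best, best_score = None, -math.inf
        for i, vec in enumerate(vectors):
            if i in selected:
                continue
            relevance = cosine(vec, meeting)
            redundancy = max(
                (cosine(vec, vectors[j]) for j in selected), default=0.0
            )
            score = trade_off * relevance - (1 - trade_off) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    # Present the chosen utterances in their original meeting order.
    return [utterances[i] for i in sorted(selected)]


sample = [
    "We should review the budget for the new project.",
    "The budget for the new project needs a review.",
    "Also, we have to hire an engineer before the summer.",
    "Um, okay.",
]
print(mmr_summary(sample, k=2))
```

The redundancy term is what separates this from simply ranking utterances by relevance: a near-duplicate of an utterance that has already been chosen is penalized, so the summary covers more of the discussion within the same length.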
References:
[1] E. Shriberg, R. Dhillon, S. Bhagat, J. Ang, and H. Carvey. The ICSI meeting recorder dialog act (MRDA) corpus. In SIGdial Workshop on Discourse and Dialogue, 2004.
[2] J. Carletta, S. Ashby, S. Bourban, M. Guillemot, M. Kronenthal, G. Lathoud, M. Lincoln, I. McCowan, T. Hain, W. Kraaij, W. Post, J. Kadlec, P. Wellner, M. Flynn, and D. Reidsma. The AMI meeting corpus. In Proceedings of MLMI'05, Edinburgh, 2005.
[3] G. Murray, S. Renals, and J. Carletta. Extractive summarization of meeting recordings. In Proceedings of Interspeech. September 2005.
[4] S. Xie and Y. Liu. Using Corpus and Knowledge-Based Similarity Measure in Maximum Marginal Relevance for Meeting Summarization. In Proceedings of ICASSP. Las Vegas, NV, 2008.
[5] K. Riedhammer, D. Gillick, B. Favre, and D. Hakkani-Tür. Packing the Meeting Summarization Knapsack. In Proceedings of Interspeech. Brisbane, Australia, 2008.
[6] K. Riedhammer, B. Favre, and D. Hakkani-Tür. A Keyphrase-Based Approach to Interactive Meeting Summarization. In Proceedings of the IEEE Workshop on Spoken Language Technologies (SLT). Goa, India, 2008.
[7] S. Banerjee and A. Rudnicky. Detecting the Noteworthiness of Utterances in Human Meetings. In Proceedings of the 2009 SIGDial Conference on Dialog and Discourse Research. London, UK, 2009.
[8] M. Purver, P. Ehlen, and J. Niekrasz. Detecting action items in multi-party meetings: Annotation and initial experiments. In Proceedings of Machine Learning for Multi-modal Interaction. Springer-Verlag, 2006.
[9] R. Fernandez, M. Frampton, P. Ehlen, M. Purver, and S. Peters. Modelling and detecting decisions in multi-party dialogue. In Proc. of the 9th SIGdial Workshop on Discourse and Dialogue, 2008.
[10] R. Rienks, D. Heylen, and E. van der Weijden. Argument diagramming of meeting conversations. In A. Vinciarelli and J. Odobez, editors, Multimodal/Multiparty Meeting Processing, Workshop at the 7th Intl. Conference on Multimodal Interfaces, 2005.
[11] D. Hillard, M. Ostendorf, and E. Shriberg. Detection of agreement vs. disagreement in meetings: training with unlabeled data. In Proc. of HLT/NAACL, 2003.
[12] M. Galley, K. McKeown, J. Hirschberg, and E. Shriberg. Identifying agreement and disagreement in conversational speech: Use of Bayesian networks to model pragmatic dependencies. In Proc. of ACL, pages 669-676, 2004.
[13] T. Wilson. Annotating subjective content in meetings. In Proc. of LREC 2008, Marrakech, Morocco, 2008.
Dilek Hakkani-Tür is a Senior Research Scientist at the International Computer Science Institute (ICSI), with research interests in natural language and speech processing, spoken dialog processing, and active and unsupervised learning for language processing. Email: dilek@icsi.berkeley.edu

