Overview of NAACL HLT 2010
Mary Harper
SLTC Newsletter, July 2010
This article provides an overview of, and useful links to, the NAACL HLT 2010 conference, which included papers spanning computational linguistics, information retrieval, and speech technology. Due to its three-area focus and emphasis on statistical modeling and machine learning, the NAACL HLT conference will be of interest to many SLT readers, who should consider submitting papers to future conferences.
Conference Overview
The North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT 2010) conference was held on June 1–6, 2010 in downtown Los Angeles at the Millennium Biltmore Hotel. This conference included papers on innovative high-quality work spanning computational linguistics, information retrieval, and speech technology. This year there was a special "Noisy Data" theme to acknowledge the significant work taking place across several disciplines with non-pristine data. This conference, due to its three-area focus together with a heavy dose of statistical modeling and machine learning, should be of interest to SLT members.
The program contained pre-conference tutorials, oral and poster presentations of full (8 pages plus 1 for references) and short papers (4 pages), application demonstrations, a student research workshop, and post-conference workshops. For a look at the conference program please see the program. The conference had three sub-parts:
- Tutorials: Six tutorial sessions were held on June 1st. Links are included to the description of each tutorial. In some cases, the authors include presentation materials.
- Computational Psycholinguistics
- Data-Intensive Text Processing with MapReduce
- Integer Linear Programming in NLP
- Distributional Semantic Models
- Markov Logic in NLP
- Noisy Text Analytics
- Recent Advances in Dependency Parsing
- Textual Entailment
- Main Conference: The main conference was held on June 2-4. This year 291 full and 159 short papers were submitted and reviewed, with a 30.9% full paper acceptance rate and a 35.2% short paper acceptance rate. Submissions related to Acoustic Models, Dialog, Discourse, Generation, Grammar Engineering, Information Retrieval and Extraction, Language Models, Machine Learning, Machine Translation, Mathematical Linguistics, Morphology and Phonology, Parsing, Semantics, Sentiment, Summarization, and Word Sense Disambiguation. Full papers were required to describe substantial, original, completed, and unpublished work and, wherever appropriate, to include concrete evaluation and analysis. Short paper submissions were required to describe original and unpublished work with one of the following characteristics: a small, focused contribution, work in progress, negative result, opinion piece, or interesting application nugget.
The conference had the benefit of two interesting and diverse keynote presentations: "Recognition and Understanding of Meetings" by Steve Renals, University of Edinburgh, and "Music, Language, and Computational Modeling: Lessons from the Key-Finding Problem" by David Temperley, University of Rochester. See the keynote web page for talk abstracts and presentation slides. In addition, there was a panel session reflecting the noisy data conference theme: Recent and Future HLT Challenges in Industry, chaired by Kristina Toutanova. Two excellent papers were selected by committee for best paper awards this year:
- Best Full Paper: Coreference Resolution in a Modular, Entity-Centered Model, by Aria Haghighi and Dan Klein
- Best Short Paper: "cba to check the spelling": Investigating Parser Performance on Discussion Forum Posts, by Jennifer Foster
Electronic versions of the papers from the main conference, short papers on demonstrations, papers from the Student Research Workshop, and tutorial abstracts can be found by following the link to the ACL Anthology. Overall the conference program was quite strong.
- Post-conference Workshops: Sixteen post-conference workshops were held on June 5-6 on an interesting and diverse set of emerging topic areas. Many of the papers in these workshops are likely to be of interest to the SLT community. Electronic versions of all the workshop papers are available through the conference-affiliated workshop link to the ACL Anthology. The workshops are listed below:
- WS1: ALNLP: Workshop on Active Learning for NLP
- WS2: Analysis and Generation of Emotion: Workshop on Computational Approaches to Analysis and Generation of Emotion in Text
- WS3: CALC-10: Workshop on Computational Approaches to Linguistic Creativity 2010
- WS4: CL&W: Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids
- WS5: SocialMedia: Workshop on Computational Linguistics in a World of Social Media
- WS6: Computational Neurolinguistics: Workshop on Computational Neurolinguistics
- WS7: Mechanical Turk: Workshop on Creating Speech and Language Data With Amazon’s Mechanical Turk [Related articles: Creating Speech and Text Language Data with Amazon’s Mechanical Turk: a Report on the NAACL-HLT 2010 Workshop, and Interview: MTurk NAACL Workshop Organizers Talk Crowdsourcing, Speech, and the Future of Unsupervised Learning]
- WS8: Extracting and Using Constructions: Workshop on Extracting and Using Constructions in Computational Linguistics
- WS9: FAM-LbR: Formalisms and Methodology for Learning by Reading (FAM-LbR)
- WS10: Educational Applications: The 5th Workshop on Innovative Use of NLP for Building Educational Applications
- WS11: Louhi '10: Second Louhi Workshop on Text and Data Mining of Health Documents
- WS12: SemanticSearch: Workshop on Semantic Search [Related article: Bringing Semantics into Search: An Overview of the NAACL-HLT-2010 Workshop on Semantic Search]
- WS13: Assistive Technologies: First Workshop on Speech and Language Processing for Assistive Technologies (SLPAT)
- WS14: SPMRL: First Workshop on Statistical Parsing of Morphologically Rich Languages
- WS15: WAC-6: Sixth Web as Corpus Workshop
- WS16: Young Investigators in the Americas: Young Investigators Workshop on Computational Approaches to Languages of the Americas
Conference History
The Association for Computational Linguistics (ACL) Executive Board began the task of establishing the NAACL Chapter in 1997, and the first NAACL Executive Board was elected in December of 1999. The NAACL chapter, once established, decided to hold a domestic conference in every year in which neither an ACL nor a COLING conference would be held in North America. The NAACL conference was first convened in Seattle in 2000, and has been held annually since then except when ACL was held in North America. In 2003, NAACL was combined with the Human Language Technology (HLT) conference series. The HLT conferences brought together cross-disciplinary researchers focused on enabling computers to interact with humans using natural language and to provide humans with language services (e.g., translation, speech recognition, information retrieval, text summarization, and information extraction). The combination of the two conference series has been successful in providing a unified forum for the presentation of high-quality, cross-disciplinary, cutting-edge work and in fostering new research directions.
Acknowledgements
I would like to thank the NAACL HLT 2010 general chair, Ronald Kaplan; my program co-chairs, Jill Burstein and Gerald Penn; the local arrangements co-chairs, David Chiang, Eduard Hovy, Jonathan May, and Jason Riesa, who, among many other tasks, developed the NAACL HLT 2010 website; the publication co-chairs, Claudia Leacock and Richard Wicentowski, who prepared the proceedings for the conference and the ACL Anthology; and the many other people who contributed in various ways to the conference. I would also like to thank Giuseppe Di Fabbrizio for his input on this article.
Mary Harper was a technical program co-chair for NAACL HLT 2010. She is an Affiliate Research Professor in Computer Science and Electrical and Computer Engineering at the University of Maryland and a Principal Research Scientist at the Human Language Technology Center of Excellence at Johns Hopkins University. Her research focuses on computer modeling of human communication involving audio, textual, and visual sources. Email: mharper@umd.edu, Web: http://www.wam.umd.edu/~mharper

