Coling 2008

Manchester, 18-22 August, 2008

The 22nd International Conference on Computational Linguistics

Workshops: 16-17 and 23-24 August, 2008

Workshops will take place post-Coling 23-24 August, with the exception of CoNLL, a two-day workshop which will be co-located with Coling 2008, and will take place pre-Coling, 16-17 August.
Workshops accepted for Coling 2008 appear below. Apart from the two-day CoNLL, all workshops are one-day events; we reserve the right to change the date of workshops depending on numbers of participants.

List of Workshops

August 16-17: CoNLL-2008
August 23: W1, W2, W3, W4, W5
August 24: W6, W7, W8, W9

Post-Coling workshops will take place in the Alan Turing Building.

For your information, the Call for Workshop Proposals is available in our Archive section.


Submission deadline for Workshop proposals: 3 February
Notification of acceptance of Workshop proposal: 24 February
Workshop calls for papers: 1 March
CoNLL and IR4QA paper submission deadline: 28 April
Submission deadline for all other workshops: 5 May
Notification of acceptance of workshop papers: 6 June
Camera-ready copy of papers due: 1 July
Pre-Coling workshop (CoNLL): 16-17 August
Post-Coling workshops: 23, 24 August


Mark Stevenson (Chair), Sheffield University, UK
Eneko Agirre, Euskal Herriko Unibertsitatea, Spain
Tim Baldwin, University of Melbourne, Australia
Stephen Clark, Oxford University, UK
Diana McCarthy, University of Sussex, UK
Ellen Riloff, University of Utah, USA
Satoshi Sekine, New York University, USA

Pre-Coling Workshop, August 16-17

Twelfth Conference on Computational Natural Language Learning (CoNLL-2008)

The Twelfth Conference on Computational Natural Language Learning will take place as an official Coling Workshop, over the weekend preceding Coling, August 16-17. The CoNLL web site is at
As usual, CoNLL also involves a shared task, "Joint Learning of Syntactic and Semantic Dependencies". The website for the CoNLL shared task 2008 is updated continually, so please make sure to check it from time to time.
Workshop organizers: Alex Clark (Royal Holloway University of London) and Kristina Toutanova (Microsoft Research, Redmond WA)

Post-Coling Workshops, August 23

[W1] Human Judgements in Computational Linguistics

Human judgements play a key role in the development and the assessment of linguistic resources and methods in Computational Linguistics. They are commonly used in the creation of lexical resources and corpus annotation, and also in the evaluation of automatic approaches to linguistic tasks: In the developmental phase, human judgements help to define an inventory of categories as well as robust annotation criteria, and in the assessment phase they are used to evaluate the results of automatic systems against existing linguistic standards. Furthermore, systematically collected human judgements provide clues for research on linguistic issues that underlie the judgement task, providing insights complementary to introspective analysis or evidence gathered from corpora. The goal of this workshop is to discuss experiments that collect human judgements for Computational Linguistic purposes. A particular focus of the workshop is concerned with human judgements on "controversial" linguistic tasks (those that are not clear from a theoretical point of view, such as many tasks having to do with semantics or pragmatics), which tend to result in low agreement scores. Such controversial tasks and their sub-optimal results are typically poorly documented in the literature; however, they are especially well-suited as a basis for a fruitful discussion.
The workshop's website is at
Workshop organizers: Ron Artstein (University of Southern California, Marina del Rey CA), Gemma Boleda (Universitat Politècnica de Catalunya, Barcelona), Frank Keller (University of Edinburgh) and Sabine Schulte im Walde (Universität Stuttgart).

[W2] Cross-framework and Cross-domain Parser Evaluation

Broad-coverage parsing has come to a point where distinct approaches can offer (seemingly) comparable performance. Evaluation against trees in the WSJ section of the Penn Treebank (PTB) has helped significantly advance parsing research over the course of the past decade. However, modern treebank parsers still restrict themselves to only a subset of PTB annotation; there is reason to worry about the idiosyncrasies of this particular corpus; it remains unknown how much the ParsEval metric (or any intrinsic evaluation) can inform NLP application developers; and PTB-style analyses leave a lot to be desired in terms of linguistic sophistication. This workshop aims to bring together developers of broad-coverage parsers who are interested in questions of target representations and cross-framework and cross-domain evaluation and benchmarking.
The website for this workshop is at
Workshop administrative contact: Stephan Oepen (Universitetet i Oslo and CSLI Stanford)

[W3] 2nd MMIES Workshop: Multi-source, Multilingual Information Extraction and Summarization

The objective of the workshop is to bring together researchers and practitioners in the areas of extraction, summarization, and other information access technologies, to discuss recent approaches to multi-source and multi-lingual challenges. Approaches to coping with the idiosyncratic nature of the new Web 2.0 media are especially welcome, including mixed input, new jargon, ungrammatical and mixed-language input, and emotional discourse.
The MMIES2 website is at
Workshop organizers: Sivaji Bandyopadhyay (Jadavpur University, Kolkata), Thierry Poibeau (CNRS/Université Paris 13), Horacio Saggion (University of Sheffield) and Roman Yangarber (Helsingin Yliopisto).

[W4] Speech Processing for Safety Critical Translation and Pervasive Applications

Two ideas currently gaining popularity in the construction of spoken dialogue applications are safety-critical translation and pervasive applications. Medical applications have emerged as one of the most popular domains for speech translation. At a workshop on medical speech translation, held at HLT 2006, a measure of consensus emerged on at least some points: the key issue that differentiates the medical domain from most other application areas for speech translation is its safety-critical nature; systems can realistically be field-deployed now or in the very near future; and the basic communication model should be collaborative and allow the client users to play an active role. In addition, pervasive computing applications offer great opportunities as well as challenges for spoken language technologies. They can provide an effective and natural interface for mobile devices in situations where traditional modes of communication are less appropriate, including medical and other safety-critical systems. However, there is so far little agreement on many central questions, including choices of architectures, component technologies, and evaluation methodologies. In this workshop we would like to create a forum where people interested in these types of systems can meet, exchange ideas and demo live systems.
The website for this workshop is at
A list of accepted papers is here.
Workshop organizers: Pierrette Bouillon (Université de Genève), Farzad Ehsani (Fluential, Sunnyvale CA), Robert Frederking (Carnegie Mellon University, Pittsburgh) and Manny Rayner (Université de Genève)
Note: This workshop is a merger of the proposed Workshops Speech Translation for Medical and Other Safety-Critical Applications and Spoken Language Technologies for Pervasive Speech-based and Multimodal Applications (PESMA08).

[W5] Knowledge and Reasoning for Question Answering (KRAQ08)

The aim of this workshop is to investigate aspects of question answering that combine NLP, IR and AI techniques and formalisms. Of particular interest are: reasoning aspects (fusion, summarization, failure detection, dealing with incompleteness, etc.), textual entailment, argumentation and explanation analysis and production, cooperative response generation, complex question processing, and innovative applications that include, for example, multimedia aspects.
The KRAQ08 website is at
Workshop organizers: Patrick Saint-Dizier (CNRS IRIT Toulouse) and Marie-Francine Moens (Katholieke Universiteit Leuven)

Post-Coling Workshops, August 24

[W6] Grammar Engineering Across Frameworks (GEAF08)

Recent years have seen the development of techniques and resources to support robust, deep grammatical analysis of natural language in real-world domains and applications. The demands of these types of tasks have resulted in significant advances in areas such as parser efficiency, hybrid statistical/symbolic approaches to disambiguation, and the acquisition of large-scale lexicons. The effective acquisition, development, maintenance and enhancement of grammars is a central issue in such efforts, and the size and complexity of realistic grammars make these tasks extremely challenging; indeed, these tasks are often tackled in ways that have much in common with software engineering. This workshop aims to bring together grammar engineers from different frameworks (for example LFG, HPSG, TAG, CCG, dependency grammar) to compare their research and methodologies.
The GEAF08 website is at
Workshop organizers: Stephen Clark (Oxford University) and Tracy Holloway King (PARC, Palo Alto CA)

[W7] 2nd Information Retrieval for Question Answering Workshop (IR4QA'08)

Open domain question answering (QA) has become a very active research area over the past decade, due in large measure to the stimulus of the TREC Question Answering track. This track addresses the task of finding answers to natural language (NL) questions (e.g. "How tall is the Eiffel Tower?", "Who is Aaron Copland?", "What effect does second-hand smoke have on non-smokers?") from large text collections. This task stands in contrast to the more conventional IR task of retrieving documents relevant to a query, where the query may be simply a collection of keywords (e.g. "Eiffel Tower", "American composer, born Brooklyn NY 1900, ...").
The IR4QA'08 website is at
Workshop organizer: Mark Greenwood (University of Sheffield).

[W8] Cognitive Aspects of the Lexicon (CogALex-08)

Dictionaries are a fundamental component of any NLP system, natural or artificial. The general belief is: the bigger, the better. Yet the quality of a dictionary depends not only on its coverage (number of entries) and the granularity of the information it contains, but also on the means it offers to access the desired information (meaning, sound, word form, etc.). Access strategies depend, of course, not only on the task, but also on the knowledge available at the onset of consulting the resource. During analysis (reading) we start from words, looking for meanings (or other grammatical information), while in synthesis (writing) we start from concepts for which we try to find the corresponding lexical form.
This diversity of needs and of knowledge available at the onset of consultation has given rise to different kinds of resources (dictionaries, thesauri, encyclopedias, etc.). This strategy was justified when paper was the sole medium for storing information. Yet modern lexicographers work with huge digital corpora, using language technology to build, maintain and exploit the resource. While paper dictionaries are static and limited in terms of access possibilities, their electronic counterparts are highly reactive, allowing for navigation in a huge conceptual-lexical space, with information a mouse click away, quasi-instantaneously and in many forms (alphabetically, by domain, frequency, ...). In sum, the new possibilities are enormous, but to take advantage of them, which is the goal of this workshop, we must study people’s needs and access strategies (what do they know, what are they looking for, how do they search?) to decide on the kind of indexes to build.
The CogALex-08 website is at:
Workshop organizers: Michael Zock (CNRS-LIF, Marseille) and Chu-Ren Huang (Sinica, Taipei)

[W9] Textgraphs-3: Graph-based Algorithms for Natural Language Processing

Recent years have shown an increased interest in bringing the field of graph theory into Natural Language Processing. In many NLP applications entities can be naturally represented as nodes in a graph and relations between them can be represented as edges. Recent research has shown that graph-based representations of linguistic units as diverse as words, sentences and documents give rise to novel and efficient solutions in a variety of NLP tasks, ranging from part-of-speech tagging, word-sense disambiguation and parsing to information extraction, semantic role assignment, summarisation, sentiment analysis and up to the study of the evolutionary dynamics of language. The Textgraphs workshop addresses a broad spectrum of research areas and brings together researchers working on problems related to the use of graph-based algorithms for NLP as well as on the theory of graph-based methods.
The Textgraphs-3 website is at
Workshop organizers: Irina Matveeva (Accenture Technology Labs), Chris Biemann (Powerset, San Francisco), Monojit Choudhury (Microsoft Research India, Bangalore) and Mona Diab (Columbia University, New York).

Last updated: 2.6.08