Coling 2008

Manchester, 18-22 August, 2008

The 22nd International Conference on Computational Linguistics

Tutorial T2
Data Analysis for Lifelike Computational Linguistics

Sunday August 17, 9.30 - 1.00


Download a trailer for this tutorial here.


Modern theories of language production typically assert that no sentence can be generated in words (and that includes language cycled internally as thought) without the prior existence of a preverbal "speech act", a primitive expression of an individual's momentary communicative intent. Unfortunately, the basis of speech act processing is not yet clearly understood, not least because the encoding, being by definition preverbal, must involve icons and conceptual abstractions rather than words! As a consequence, the pragmatic aspects of language production have so far defied the "small system" techniques of analysis and design which tend to be used by academic software development teams. Nowhere is this lack of heavy-end engineering expertise more acutely felt than at the interface between the processes of verbal praxis and the structures of the mind's declarative memory "database".
This tutorial briefly reviews a number of industrial "big system" skills, before selecting the practice of relational data modelling for detailed attention. The ensuing tutor-guided practicals will address known linguistic issues, impart specific data modelling skills, and be illustrated throughout by output obtained from Project Konrad, a "Codasyl-style" semantic network database being developed by the author with technical support from International Software Products, Toronto.


The tutorial will equip participants with


The tutorial will be structured as follows:
  1. Introduction
  2. Some "small systems" systems analysis techniques
  3. Some "big systems" systems analysis techniques
  4. The mind as a big system
  5. Basic data modelling skills
  6. Data modelling for students of language
  7. Concluding comments


Derek J. Smith

Kathryn Livesey
Cardiff School of Health Sciences
University of Wales Institute
Cardiff CF5 2YB, Wales
dsmith at klivesey at

Derek Smith graduated as a psychologist in 1972 (London) and had an early career as an analyst-programmer with the data processing arm of British Telecom, where he specialised in the design and operation of very large databases and management support systems. Since 1991, he has lectured at UWIC's School of Health Sciences, where he has been responsible for the psycholinguistics and neuropsychology modules of the BSc (Hons) Speech and Language Therapy and BSc (Hons) Psychology undergraduate programmes, as well as for the Informatics and Project Management module of the MSc Interprofessional Studies. He holds a Postgraduate Diploma in Medical Education (Dundee) and applies the tried-and-tested Dundee "spiral curriculum" philosophy to his teaching. His research interests are based around his work with the Project Konrad software, and include simulations of biological memory and the network structure of Heideggerian Dasein.

Kathryn Livesey graduated as a psychologist in 2004 (UWIC), having been attracted to the subject by her earlier experiences working with schoolchildren and adults with learning disabilities. She was immediately recruited as a research assistant in UWIC's Food Research and Consultancy Unit, where she divides her time between her PhD research into the forensic ergonomics of food safety and occasional small outreach projects. For National Science Week 2006, she helped design and deliver a Welsh Assembly-funded "edudrama" designed to raise the awareness of Key Stage 2 schoolchildren as to what goes on in the brain during mathematics learning, and since 2007 she has worked with a drama group charity specialising in developing children's social and communication skills. She is currently leading the system testing in those areas of the Project Konrad database which simulate the phenomenon of inner speech, and is particularly interested in locating vulnerable points within the architecture of inner speech where comparatively minor cognitive deficits can potentially go on to impair other aspects of cognition.
