Coling 2008
Manchester, 18-22 August, 2008
The 22nd International Conference on Computational Linguistics
Tutorial T2
Data Analysis for Lifelike Computational Linguistics
Sunday August 17, 9.30 - 1.00
Download a trailer for this tutorial here.
Outline
Modern theories of language production typically assert that no sentence can be generated in words (and that includes language cycled internally as thought) without the prior existence of a preverbal "speech act", a primitive expression of an individual's momentary communicative intent. Unfortunately, the basis of speech act processing is not yet clearly understood, not least because - being by definition preverbal - the encoding must involve icons and conceptual abstractions rather than words! As a consequence, the pragmatic aspects of language production have so far defied the "small system" techniques of analysis and design which tend to be used by academic software development teams. Nowhere is this lack of heavy-end engineering expertise more acutely felt than at the interface between the processes of verbal praxis and the structures of the mind's declarative memory "database".
This tutorial briefly reviews a number of industrial "big system" skills, before selecting the practice of relational data modelling for detailed attention. The ensuing tutor-guided practicals will address known linguistic issues, impart specific data modelling skills, and be illustrated throughout by output obtained from Project Konrad, a "Codasyl-style" semantic network database being developed by the author with technical support from International Software Products, Toronto.
Objectives
The tutorial will equip participants with
- An appreciation of the pivotal role of data analysis in big systems design.
- The skills of data analysis, including that of data "normalisation".
- The basic skills to draw up, interpret, and if necessary improve data models in Bachman diagram format.
- The ability to integrate a data model with other graphic aids, especially the dataflow diagram and the Jackson structure diagram.
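As an illustration only, and not taken from the tutorial materials, the idea of data "normalisation" mentioned above can be sketched in a few lines of Python. The records, field names, and the `normalise` helper below are all hypothetical: a flat table that repeats each lemma's part of speech on every row is split into a lemma entity and a token entity linked by a key.

```python
# Hypothetical flat records: every row repeats the lemma's part of speech.
flat_rows = [
    {"token": "runs", "lemma": "run", "pos": "verb"},
    {"token": "ran", "lemma": "run", "pos": "verb"},
    {"token": "dogs", "lemma": "dog", "pos": "noun"},
]


def normalise(rows):
    """Split flat rows into a lemma table and a token table linked by lemma_id."""
    lemma_ids = {}  # (lemma, pos) -> lemma_id, assigned in order of first appearance
    tokens = []     # token rows that carry only a foreign key to the lemma table
    for row in rows:
        key = (row["lemma"], row["pos"])
        lemma_id = lemma_ids.setdefault(key, len(lemma_ids) + 1)
        tokens.append({"token": row["token"], "lemma_id": lemma_id})
    lemma_table = [
        {"lemma_id": lemma_id, "lemma": lemma, "pos": pos}
        for (lemma, pos), lemma_id in lemma_ids.items()
    ]
    return lemma_table, tokens


lemma_table, token_table = normalise(flat_rows)
```

After the split, each fact about a lemma is stored once, and the one-to-many link from lemma to token is the kind of relationship a data model diagram would draw as a line between two entity boxes.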
Structure
The tutorial will be structured as follows:
- Introduction
  - Introductions and session outline
  - Settling-in exercise
- Some "small systems" systems analysis techniques
  - The logic flowchart; two-minute illustrative workbook exercise
  - The Jackson program structure diagram; two-minute illustrative workbook exercise
- Some "big systems" systems analysis techniques
  - The logic of "structured development"; logical design versus physical implementation
  - The document flow diagram; two-minute illustrative workbook exercise
  - The program suite diagram; two-minute illustrative workbook exercise
  - Konrad 1 presentation of output
- The mind as a big system
  - What the study of Speech and Language Pathology tells us about the mind
  - Distributed modular cognition and the dataflow diagram
  - Coffee break and 20-minute groupwork exercise
  - Hierarchical modular cognition and the motor hierarchy diagram; Lordat (1843); two-minute illustrative workbook exercise
  - The PALPA (Kay, Lesser, and Coltheart, 1992) as the canonical language flow diagram; speech acts therein; declarative knowledge therein
  - The network occurrence diagram and its shortcomings
- Basic data modelling skills
  - Entities, relations, and attributes defined
  - Relational analysis
  - The entity-relationship diagram
  - Anderson's (1993) ACT-R propositional network diagram
  - The Bachman diagram; two-minute illustrative workbook exercise
  - Coffee break and 20-minute groupwork exercise
- Data modelling for students of language
  - Exercise feedback
  - 20-minute summative groupwork exercise
- Concluding comments
  - Exercise feedback
  - Work in progress - coping with speech acts in Konrad 2
  - Tutorial summary
  - Post-tutorial materials and contact instructions
  - Questions
Instructors
Derek Smith graduated as a psychologist in 1972 (London) and had an early career as analyst-programmer with the data processing arm of British Telecom, where he specialised in the design and operation of very large databases and management support systems. Since 1991, he has lectured at UWIC's School of Health Sciences, where he has been responsible for the psycholinguistics and neuropsychology modules of the BSc (Hons) Speech and Language Therapy and BSc (Hons) Psychology undergraduate programmes, as well as for the Informatics and Project Management module of the MSc Interprofessional Studies. He holds a Postgraduate Diploma in Medical Education (Dundee) and applies the tried-and-tested Dundee "spiral curriculum" philosophy to his teaching. His research interests are based around his work with the Project Konrad software, and include simulations of biological memory and the network structure of Heideggerian Dasein.
Kathryn Livesey graduated as a psychologist in 2004 (UWIC), having been attracted to the subject by her earlier experiences working with schoolchildren and adults with learning disabilities. She was immediately recruited as a research assistant in UWIC's Food Research and Consultancy Unit, where she divides her time between her PhD research into the forensic ergonomics of food safety and occasional small outreach projects. For National Science Week 2006, she helped design and deliver a Welsh Assembly-funded "edudrama" designed to raise the awareness of Key Stage 2 schoolchildren as to what goes on in the brain during mathematics learning, and since 2007 she has worked with a drama group charity specialising in developing children's social and communication skills. She is currently leading the system testing in those areas of the Project Konrad database which simulate the phenomenon of inner speech, and is particularly interested in locating vulnerable points within the architecture of inner speech where comparatively minor cognitive deficits can potentially go on to impair other aspects of cognition.