Paul Thompson
Research Associate

Contact details:
National Centre for Text Mining
(NaCTeM),
School of Computer Science,
Manchester Interdisciplinary Biocentre,
University of Manchester,
131 Princess Street,
M1 7DN
Tel: +44 (0)161 306 3091
Email: Paul.Thompson[at]manchester.ac.uk
Education
MPhil, UMIST (2004)
BSc (Hons) Computational Linguistics (1st Class), UMIST (1999)
Employment History
| 03/2007 - present | Research Associate, National Centre for Text Mining, The University of Manchester |
| 11/2006 - 02/2007 | Research Assistant, The University of Manchester |
| 06/2005 - 10/2006 | Knowledge Transfer Partnership Associate, Lorien Plc and The University of Manchester |
| 01/2000 - 05/2005 | Research Assistant, UMIST/The University of Manchester |
Research
My main research interests lie in Natural Language Processing, in
particular information extraction and corpus annotation. I have been
involved in the development of resources and user interfaces for a
number of NLP systems, as detailed in the Projects section below.
I am currently working on corpus annotation in the biomedical field. I was involved in the development of GREC, a corpus of MEDLINE abstracts that have been semantically annotated with gene regulation events. My most recent work involves the annotation of meta-knowledge about bio-events to aid in their correct interpretation.
Projects
METANET4U is a European project that aims to support language technology for European languages and multilingualism. It is a project in the META-NET Network of Excellence, a cluster of projects that aims to foster the mission of META.
META is the Multilingual Europe Technology Alliance, dedicated to building the technological foundations of a multilingual European information society. Our work at the University is concerned primarily with demonstrating how new applications can be built rapidly by combining individual language processing tools and resources once they have been made interoperable. We are using the UIMA framework and the U-Compare system to facilitate interoperability of tools and resources.
The main purpose of this project is to build two major reusable,
wide-coverage lexical and conceptual repositories for the biology
domain, i.e. a bio-lexicon and a bio-ontology, using text mining
techniques. My work has been focussed on annotation of gene regulation
events in MEDLINE abstracts, with the purpose of acquiring semantic
frames for verbs and nominalised verbs. The frames will be included
within the bio-lexicon, to aid in the extraction of biological facts.
The Arabic WordNet is based on WordNet
for English, developed at Princeton University, in which words are
grouped into sets of synonyms and structured according to basic semantic
relations between them. I worked on the development of a Java user
interface to allow searching of the Arabic WordNet and browsing of
relationships between words.
| 06/2005 - 10/2006 | Knowledge Transfer Partnership |
This project was a collaboration between the University of Manchester
and Lorien Plc, a recruitment company based in Leeds. It had the purpose
of assisting Lorien to increase their use of technology within the
recruitment and selection process. My work focussed on the design and
development of a web-based system for creating and administering online
pre-selection interviews and technical tests for job candidates.
| 09/2004 - 05/2005 | Parmenides |
The Parmenides project was concerned with knowledge and information
management. I worked on the development of pattern-matching rules and
associated ontologies used to extract entities and events in 3 separate
domains, i.e. biotechnology, weight management and terrorist attacks. I
also developed a user interface to display the extraction results.
This project involved the development of a framework for building
agent-based multilingual speech-based applications. I worked on the
development of several components and resources of a spoken email system
based on this framework.
| 01/2000 - 04/2001 | CONCERTO |
CONCERTO involved the conceptual annotation and retrieval of digital
documents. My work was centred on the information extraction module and
included writing pattern matching rules to discover named entities and
relationships between them, in addition to the development of user
interfaces to facilitate collaborative development of rules and
semi-automatic conceptual annotation.
Publications
Paul Thompson, Yoshinobu Kano, John McNaught, Steve Pettifer, Teresa Attwood, John Keane and Sophia Ananiadou (2011). "Promoting Interoperability of Resources in META-SHARE". Proceedings of the IJCNLP Workshop on Language Resources, Technology and Services in the Sharing Paradigm (LRTS), Chiang Mai, Thailand, November, pp. 50-58 (pdf).
Paul Thompson, Raheel Nawaz, John McNaught and Sophia Ananiadou (2011). "Enriching a biomedical event corpus with meta-knowledge annotation". BMC Bioinformatics, 12:393 (link) (Highly Accessed)
Paul Thompson, John McNaught, Simonetta Montemagni, Nicoletta Calzolari, Riccardo del Gratta, Vivian Lee, Simone Marchi, Monica Monachini, Piotr Pezik, Valeria Quochi, C.J. Rupp, Yutaka Sasaki, Giulia Venturi, Dietrich Rebholz-Schuhmann and Sophia Ananiadou (2011). "The BioLexicon: a large-scale terminological resource for biomedical text mining". BMC Bioinformatics, 12:397 (link) (Highly Accessed)
Sophia Ananiadou, Paul Thompson, Yoshinobu Kano, John McNaught, Teresa K. Attwood, Philip J. R. Day, John Keane, Dean A. Jackson and Steve Pettifer (2011). "Towards Interoperability of European Language Resources". Ariadne, 67 (link)
C.J. Rupp, Paul Thompson, William J. Black, John McNaught and Sophia Ananiadou (2010). "A Specialised Verb Lexicon as the Basis of Fact Extraction in the Biomedical Domain". Proceedings of Interdisciplinary Workshop on Verbs: The Identification and Representation of Verb Features (Verb 2010), Pisa, Italy. (pdf)
Raheel Nawaz, Paul Thompson and Sophia Ananiadou (2010). "Event Interpretation: A Step towards Event-Centred Text Mining". Proceedings of the First International Workshop on Automated Motif Discovery in Cultural Heritage and Scientific Communication Texts (AMICUS 2010), CLARIN/DARIAH 2010, Vienna, Austria.
Sophia Ananiadou, Paul Thompson and Raheel Nawaz (2010). "Improving Search Through Event-based Biomedical Text Mining". Proceedings of the First International Workshop on Automated Motif Discovery in Cultural Heritage and Scientific Communication Texts (AMICUS 2010), CLARIN/DARIAH 2010, Vienna, Austria.
Raheel Nawaz, Paul Thompson and Sophia Ananiadou (2010). "Evaluating a Meta-Knowledge Annotation Scheme for Bio-Events". Proceedings of the Workshop on Negation and Speculation in Natural Language Processing (NeSp-NLP 2010), ACL 2010, Uppsala, Sweden, pp. 69-77 (pdf)
Sophia Ananiadou, Paul Thompson, James Thomas, Tingting Mu, Sandy Oliver, Mark Rickinson, Yutaka Sasaki, Davy Weissenbacher and John McNaught (2010). "Supporting the Education Evidence Portal via Text Mining". Philosophical Transactions of the Royal Society A, 368(1925), 3829-3844. (link)
Raheel Nawaz, Paul Thompson, John McNaught and Sophia Ananiadou (2010).
"Meta-Knowledge Annotation of Bio-Events". Proceedings of the Seventh
International Conference on Language Resources and Evaluation (LREC
2010), Malta, May.
(pdf)
Paul Thompson, Syed A. Iqbal, John McNaught and Sophia Ananiadou (2009).
"Construction of an annotated corpus to support biomedical information
extraction". BMC Bioinformatics 10:349 (link)
Yutaka Sasaki, Paul Thompson, John McNaught and Sophia Ananiadou (2009).
"Biological Event Recognition with Textual Induction". Proceedings of
3rd International Symposium on Languages in Biology and Medicine
(LBM-2009). (pdf)
Yutaka Sasaki, Paul Thompson, John McNaught and Sophia Ananiadou (2009).
"Three BioNLP Tools Powered by the BioLexicon." Proceedings of EACL
2009 Demonstration Session, pp. 61--64. (pdf)
Giulia Venturi, Simonetta Montemagni, Simone Marchi, Yutaka Sasaki, Paul
Thompson, John McNaught, Sophia Ananiadou (2009). "Bootstrapping a Verb
Lexicon for Biomedical Information Extraction". Proceedings of the
10th International Conference on Intelligent Text Processing and
Computational Linguistics (CICLing 2009), pp. 137--148, Springer (pdf)
Yutaka Sasaki, Paul Thompson, Philip Cotter, John McNaught and Sophia
Ananiadou (2008). "Event Frame Extraction Based on a Gene Regulation
Corpus". Proceedings of the 22nd International Conference on
Computational Linguistics (Coling-2008), pp. 761-768, Manchester,
August (pdf)
Paul Thompson, Giulia Venturi, John McNaught, Simonetta Montemagni and
Sophia Ananiadou (2008). "Categorising Modality in Biomedical Texts". LREC
2008 workshop "Building and Evaluating resources for biomedical text
mining" Marrakech, Morocco, May. (pdf)
Paul Thompson, Philip Cotter, Sophia Ananiadou, John McNaught,
Simonetta Montemagni, Andrea Trabucco and Giulia Venturi (2008).
"Building a Bio-Event Annotated Corpus for the Acquisition of Semantic
Frames from Biomedical Corpora". Proceedings of the Sixth
International Conference on Language Resources and Evaluation (LREC
2008), Marrakech, Morocco, May.
(pdf)
William Black, Andrew Conroy, Adam Funk, Allan Ramsay, Mark Stairmand,
and Paul Thompson. (2004) “Multilingual Discourse Processing”. In B.
Gambäck and K. Jokinen, editors, Proceedings of the 20th
International Conference on Computational Linguistics, pp. 15-21,
Geneva, Switzerland, August. ACL. ‘Robust and Adaptive Information
Processing for Mobile Speech Interfaces: DUMAS Final Workshop’
(pdf)
Markku Turunen, Esa-Pekka Salonen, Mikko Hartikainen, Jaakko Hakulinen,
William Black, Allan Ramsay, Adam Funk, Andrew Conroy, Paul Thompson,
Mark Stairmand, Kristiina Jokinen, Jyrki Rissanen, Kari Kanto, Antti
Kerminen, Björn Gambäck, Magnus Sahlgren, Fredrik Olsson, Maria Cheadle,
Preben Hansen, and Stina Nylander. (2004). “AthosMail: A Multilingual
Adaptive Spoken Dialogue System for the E-Mail Domain”. In B. Gambäck
and K. Jokinen, editors, Proceedings of the 20th International
Conference on Computational Linguistics, pp. 77-86, Geneva,
Switzerland, August. ACL. ‘Robust and Adaptive Information Processing
for Mobile Speech Interfaces: DUMAS Final Workshop’.
(pdf)
Paul Thompson, Mark Stairmand, and William Black. (2004) “Utterance
Planning in an Agent-based Dialogue System”. In Proceedings of the
3rd International Conference on Natural Language Generation,
University of Brighton, Brockenhurst, England, July
(pdf)
William Black, Paul Thompson, Adam Funk, and Andrew Conroy. (2003)
“Learning to Classify Utterances in a Task-Oriented Dialogue”. In
Kristiina Jokinen, Yorick Wilks, Björn Gambäck, William Black, and
Roberta Catizone, editors, Proceedings of the EACL Workshop on
Dialogue Systems: Interaction, Adaptation and Styles of Management,
pp 9-16, Budapest, Hungary, April. (pdf)
Paul Thompson (2003) “Decision Trees for Dialogue Act Classification”.
In 6th Annual Computational Linguistics in the UK Research Colloquium,
Edinburgh, Scotland, January.