Paul Thompson

Research Fellow

Contact details:
National Centre for Text Mining (NaCTeM),
School of Computer Science,
Manchester Interdisciplinary Biocentre,
University of Manchester,
131 Princess Street,
M1 7DN
Tel: +44 (0)161 306 3091
Email: Paul.Thompson[at]manchester.ac.uk

Education

MPhil, UMIST (2004)
BSc (Hons) Computational Linguistics (1st Class), UMIST (1999)

Employment History

02/2016 - present	Research Fellow, National Centre for Text Mining, The University of Manchester
03/2007 - 01/2016	Research Associate, National Centre for Text Mining, The University of Manchester
11/2006 - 02/2007	Research Assistant, The University of Manchester
06/2005 - 10/2006	Knowledge Transfer Partnership Associate, Lorien, Plc and The University of Manchester
01/2000 - 05/2005	Research Assistant, UMIST/The University of Manchester

Research

My main research interests lie in Natural Language Processing, in particular information extraction and corpus annotation. I have been involved in the development of resources and user interfaces for a number of NLP systems, as detailed in the Projects section below.

Projects

01/2025 - 03/2024

Moxcy.AI

Mocxy.AI is a company specialising in helping other companies to improve and optimise their customer experience design. I worked on helping Mocxy.AI to streamline their workflow by leveraging the power of large language models (specifically ChatGPT). The aim was to use ChatGPT to semi-automate the process of extracting information about client companies' customers, and to use the extracted information to create detailed customer journey maps.

08/2022 - 12/2024

EPHOR

The Exposome Project for Health and Occupational Research (EPHOR) is laying the groundwork for evidence-based and cost-effective prevention for improving health at work, by developing a working life exposome toolbox. The project consortium consists of 19 exposure, health, and data scientists and technology partners from 12 different countries, who will work together to advance occupational health science in a unique way to reduce the burden of disease.

I have been working mainly on exploring the use of Natural Language Processing (NLP) as a means of streamlining the update of existing Job Exposure Matrices (JEMs) and the development of new JEMs. Based on a dataset of 100 papers sourced from existing literature reviews on workplace exposure to diesel exhaust and respirable crystalline silica, we have developed annotation guidelines for the semantic labelling of six individual named entity (NE) types within these papers, i.e. substance, occupation, industry/workplace, job task/activity, measurement device and sample type. We have used these guidelines to train experts in exposure assessment to annotate these papers according to the scheme. The resulting annotated corpus was used to train named entity recognition tools to automatically detect mentions of these NE types in literature articles. We subsequently developed guidelines for the annotation of events that link named entities to encode detailed information about the findings or outcomes of exposure assessment studies. Domain experts used the guidelines to enrich the named entity corpus with event annotations, to facilitate the training of methods to extract complex information from text relating to workplace exposures. It is intended that models trained using the corpus will be used as the basis for developing a semantic search system to ease the burden of locating literature evidence to support the development and update of JEMs.

03/2019 - 07/2022

Pacific Life Re/UnderwriteMe

This collaborative project with Pacific Life Re and UnderwriteMe developed text mining software that is aimed at semi-automating the process of life insurance underwriting. The software developed is able to automatically detect the presence of potential mortality risk factors (such as medical conditions, treatments and laboratory tests) in medical reports and documents, normalise them to concepts in the SNOMED CT clinical vocabulary, and automatically predict the level of mortality risk associated with each detected risk factor. I worked both with underwriters at Pacific Life Re to develop resources and tools required to develop an initial prototype of the system, and subsequently with the software development team at UnderwriteMe to extend the functionality of the prototype and to prepare the software for release as a commercial product.

02/2016 - 02/2019

MMPathIC

The aim of this project is to create an environment which enables new biomarker tests, based on molecular pathology techniques, to be developed. These can then be used to stratify patients, to allow more accurate diagnosis or prediction of the best treatments to use. The initial focus will be on people who suffer from inflammatory disease (psoriasis, rheumatoid arthritis and lupus), given the availability of a large number of patient samples for these diseases. Text mining (TM) will be employed to carry out automated semantic analysis of various "unstructured" textual information sources that may contain information that is relevant to the development of biomarker tests, including biomedical literature and electronic health records. Given that each of these sources comprises vast numbers of documents, information contained within them may be hidden and easily overlooked. TM techniques will be used in a number of ways to enhance the ease and efficiency with which unstructured textual information sources can be exploited to support the development of biomarker tests.

01/2014 - 06/2015

Mining the History of Medicine

This project, a collaboration between NaCTeM and the Centre for the History of Science, Technology and Medicine (CHSTM), aims to demonstrate the potential of text mining technology to assist medical historians to search and explore long-spanning archives of historical medical documents, and to help them to reveal, explore and discuss long-term, large-scale historical transformations related to medicine and public health. The project has focussed on two specific archives, i.e., the British Medical Journal (BMJ) (1840 - present day) and the London-area Medical Officer of Health (MOH) reports (1848-1972). To facilitate the automatic extraction of semantic information from these and other historical medical archives, we have developed a corpus, including documents from different periods and covering different writing styles, which has been manually annotated by medical historians with medically and historically relevant entity types, and events that involve these entities. Using this corpus, we have trained models for entity and event recognition that are robust to temporal and stylistic variations in the archives. We have additionally developed a time-sensitive inventory of medical terminology, by applying further text mining methods to the archives. The inventory lists medical terms, along with their (possibly time-sensitive) synonyms, variants and other semantically related terms. The trained models for entity and event recognition have been incorporated into an interoperable text mining pipeline for medical history. The culmination of the project has been the development of the History of Medicine (HOM) semantic search system. Using the results of applying the text mining pipeline to the entire contents of the archives, HOM allows users to rapidly refine search results, based upon the presence of specific types of semantic information within documents. Furthermore, the system provides graphical tracking of terminology usage over time, and suggests terms that are related to initial query terms, to help users to widen their searches.

02/2011 - 02/2013

METANET4U

METANET4U is a European project aiming at supporting language technology for European languages and multilingualism. It is a project in the META-NET Network of Excellence, a cluster of projects aiming at fostering the mission of META. META is the Multilingual Europe Technology Alliance, dedicated to building the technological foundations of a multilingual European information society. Our work at the University is concerned primarily with demonstrating how, by ensuring that individual language processing tools and resources are made interoperable, new applications can be built rapidly by combining these interoperable components. We are using the UIMA framework and the U-Compare system to facilitate interoperability of tools and resources.

03/2007 - 03/2009

BOOTStrep

The main purpose of this project is to build two major reusable, wide-coverage lexical and conceptual repositories for the biology domain, i.e. a bio-lexicon and a bio-ontology, using text mining techniques. My work has been focussed on annotation of gene regulation events in MEDLINE abstracts, with the purpose of acquiring semantic frames for verbs and nominalised verbs. The frames will be included within the bio-lexicon, to aid in the extraction of biological facts.

11/2006 - 02/2007

Arabic WordNet

The Arabic WordNet is based on WordNet for English, developed at Princeton University, in which words are grouped into sets of synonyms and structured according to basic semantic relations between them. I worked on the development of a Java user interface to allow searching of the Arabic WordNet and browsing of relationships between words.

06/2005 - 10/2006

Knowledge Transfer Partnership

This project was a collaboration between the University of Manchester and Lorien Plc, a recruitment company based in Leeds. It had the purpose of assisting Lorien to increase their use of technology within the recruitment and selection process. My work focussed on the design and development of a web-based system for creating and administering online pre-selection interviews and technical tests for job candidates.

09/2004 - 05/2005

Parmenides

The Parmenides project was concerned with knowledge and information management. I worked on the development of pattern-matching rules and associated ontologies used to extract entities and events in 3 separate domains, i.e. biotechnology, weight management and terrorist attacks. I also developed a user interface to display the extraction results.

10/2001 - 08/2004

DUMAS

This project involved the development of a framework for building agent-based multilingual speech-based applications. I worked on the development of several components and resources of a spoken email system based on this framework.

01/2000 - 04/2001

CONCERTO

CONCERTO involved the conceptual annotation and retrieval of digital documents. My work was centred on the information extraction module and included writing pattern-matching rules to discover named entities and relationships between them, in addition to the development of user interfaces to facilitate collaborative development of rules and semi-automatic conceptual annotation.

Publications

Liu, Z., Thompson, P., Rong, J. and Ananiadou, S. (In Press) ConspEmoLLM-v2: A robust and stable model to detect sentiment-transformed conspiracy theories. To appear in Proceedings of the 14th Conference on Prestigious Applications of Intelligent Systems (PAIS-2025)

Liu, Z., Liu, B., Thompson, P., Yang, K. and Ananiadou, S. (2024) ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model. Proceedings of the 27th European Conference on Artificial Intelligence (ECAI), pp. 4649-4656

Thompson, P., Ananiadou, S., Basinas, I., Brinchmann, B.C., Cramer, C., Galea, K.S., Ge, C., Georgiadis, P., Kirkeleit, J., Kuijpers, E., Nguyen, N., Nuñez, R., Schlünssen, V., Stokholm, Z.A., Taher, E.A., Tinnerberg, H., Van Tongeren, M. and Xie, Q. (2024). Supporting the working life exposome: Annotating occupational exposure for enhanced literature search. PLoS ONE 19(8): e0307844

Liu, Z., Zhang, T., Yang, K., Thompson, P., Yu, Z. and Ananiadou, S. (2024) Emotion detection for misinformation: A review. Information Fusion 107: 102300

Liu, B., Schlegel, V., Thompson, P., Batista-Navarro, R.T. and Ananiadou, S. (2023) Global information-aware argument mining based on a top-down multi-turn QA model. Information Processing & Management, 60(5): 103445

Inan, E., Thompson, P., Christopoulou, F., Yates, T. and Ananiadou, S. (2022) Knowledge Graph Enrichment of a Semantic Search System for Construction Safety. Intelligent Systems and Applications (IntelliSys 2022), pp. 33-52, Springer.

Inan, E., Thompson, P., Yates, T. and Ananiadou, S. (2021). HSEarch: semantic search system for workplace accident reports. Proceedings of the 43rd European Conference on Information Retrieval (ECIR 2021), pp. 514-519.

Thompson, P., Yates, T., Inan, E. and Ananiadou, S. (2020). Semantic Annotation for Improved Safety in Construction Work. Proceedings of LREC 2020, pp. 1983-1992.

Ju, M., Short, A.D., Thompson, P., Bakerly, N. D., Gkoutos, G., Tsaprouni, L. and Ananiadou, S. (2019). Annotating and Detecting Phenotypic Information for Chronic Obstructive Pulmonary Disease. JAMIA Open, 2(2), 261-271

Thompson, P., Daikou, S., Ueno, K., Batista-Navarro, R., Tsujii, J. and Ananiadou, S. (2018). Annotation and Detection of Drug Effects in Text for Pharmacovigilance. Journal of Cheminformatics, 10:37.

Thompson, P. and Ananiadou, S. (2018). HYPHEN: A flexible, hybrid method to map phenotype concept mentions to terminological resources. Terminology, 24(1), 91-121

Shardlow, M., Batista-Navarro, R., Thompson, P., Nawaz, R., McNaught, J. and Ananiadou, S. (2018). Identification of Research Hypotheses and New Knowledge from Scientific Literature. BMC Medical Informatics and Decision Making, 18:46

Ananiadou, S. and Thompson, P. (2017). Supporting Biological Pathway Curation Through Text Mining. In: Kalinichenko, L., Kuznetsov, S. and Manolopoulos, Y. (Eds.) Data Analytics and Management in Data Intensive Domains, pp. 59-73, Springer

Thompson, P. and Ananiadou, S. (2017). Extracting Gene-Disease Relations from Text to Support Biomarker Discovery. Proceedings of the 7th International Conference on Digital Health, pp. 180-189

Thompson, P., Ananiadou, S. and Tsujii, J. (2017). The GENIA Corpus: Annotation Levels and Applications. In: Ide, N. and Pustejovsky, J. (Eds.) Handbook of Linguistic Annotation, pp. 1395-1432, Springer

Thompson, P., Boylan, K., Freemont, A. and Ananiadou, S. (2017). Supporting biomarker discovery using text mining. Proceedings of Informatics for Health 2017

Alnazzawi, N., Thompson, P. and Ananiadou, S. (2016). Mapping Phenotypic Information in Heterogeneous Textual Sources to a Domain-Specific Terminological Resource. PLOS ONE, 11(9), e0162287

Korkontzelos, I., Thompson, P. and Ananiadou, S. (2016). Identifying content types of messages related to Open Source Software projects. Proceedings of LREC 2016, pp. 1837-1844

Thompson, P., Nawaz, R., McNaught, J. and Ananiadou, S. (2016). Enriching News Events with Meta-knowledge Information. Language Resources and Evaluation, 51(2), 409-438

Rehm, G., Uszkoreit, H., Ananiadou, S., Bel, N., Bieleviciene, A., Borin, L., Branco, A., Budin, G., Calzolari, N., Daelemans, W., Garabik, R., Grobelnik, M., Garcia-Mateo, C., Van Genabith, J., Hajic, J., Hernaez, I., Judge, J., Koeva, S., Krek, S., Krstev, C., Linden, K., Magnini, B., Mariani, J., McNaught, J., Melero, M., Monachini, M., Moreno, A., Odijk, J., Ogrodniczuk, M., Pezik, P., Piperidis, S., Przepiorkowski, A., Rognvaldsson, E., Rosner, M., Pedersen, B., Skadina, I., De Smedt, K., Tadic, M., Thompson, P., Tufis, D., Varadi, T., Vasiljevs, A., Vider, K. and Zabarskaite, J. (2016). The strategic impact of META-NET on the regional, national and international level. Language Resources and Evaluation

Thompson, P., Batista-Navarro, R. T. B., Kontonatsios, G., Carter, J., Toon, E., McNaught, J., Timmermann, C., Worboys, M. and Ananiadou, S. (2016). Text Mining the History of Medicine. PLOS ONE, 11(1), e0144717

Thompson, P., Carter, J., McNaught, J. and Ananiadou, S. (2015). Semantically Enhanced Search System for Historical Medical Archives. Proceedings of DigitalHeritage 2015, pp. 387-390

Thompson, P., McNaught, J. and Ananiadou, S. (2015). Customised OCR Correction for Historical Medical Text. Proceedings of DigitalHeritage 2015, pp. 35-42.

Alnazzawi, N., Thompson, P., Batista-Navarro, R. T. B. and Ananiadou, S. (2015). Using text mining techniques to extract phenotypic information from the PhenoCHF corpus. BMC Medical Informatics and Decision Making, 15(Suppl. 2):S3

Mihaila, C., Batista-Navarro, R. T. B., Alnazzawi, N., Kontonatsios, G., Korkontzelos, I., Rak, R., Thompson, P. and Ananiadou, S. (2015). Mining the biomedical literature. Health Care Analytics, CRC Press, pages 251-308.

Alnazzawi, N., Thompson, P. and Ananiadou, S. (2014). Building a semantically annotated corpus for congestive heart and renal failure from clinical records and the literature. Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi), Gothenburg, Sweden, pp. 69-74, Association for Computational Linguistics

Ananiadou, S., Thompson, P., Nawaz, R., McNaught, J. and Kell, D. B. (2014). Event Based Text Mining for Biology and Functional Genomics. Briefings in Functional Genomics, 14(3), 213-230

Kontonatsios, G., Mihaila, C., Korkontzelos, I., Thompson, P. and Ananiadou, S. (2014). A hybrid approach to compiling bilingual dictionaries of medical terms from parallel corpora. Statistical Language and Speech Processing, Second International Conference, SLSP 2014 pages 57-69, Springer

Miwa, M., Thompson, P., Korkontzelos, I. and Ananiadou, S. (2014). Comparable Study of Event Extraction in Newswire and Biomedical Domains. Proceedings of Coling 2014, pp. 2270-2279

Rehm, G., Uszkoreit, H., Ananiadou, S., Bel, N., Bieleviciene, A., Borin, L., Branco, A., Budin, G., Calzolari, N., Daelemans, W., Garabik, R., Grobelnik, M., Garcia-Mateo, C., Van Genabith, J., Hajic, J., Hernaez, I., Judge, J., Koeva, S., Krek, S., Krstev, C., Linden, K., Magnini, B., Mariani, J., McNaught, J., Melero, M., Monachini, M., Moreno, A., Odijk, J., Ogrodniczuk, M., Pezik, P., Piperidis, S., Przepiorkowski, A., Rognvaldsson, E., Rosner, M., Pedersen, B., Skadina, I., De Smedt, K., Tadic, M., Thompson, P., Tufis, D., Varadi, T., Vasiljevs, A., Vider, K. and Zabarskaite, J. (2014). The Strategic Impact of META-NET on the Regional, National and International Level. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pp. 1517-1524, European Language Resources Association

Rosner, M., Attard, A., Thompson, P., Gatt, A. and Ananiadou, S. (2014). Extending a Tool Resource Framework with U-Compare. Human Language Technology Challenges for Computer Science and Linguistics, Lecture Notes in Computer Science, vol 8387, pp. 315-326

Ananiadou, S., Thompson, P. and Nawaz, R. (2013). Enhancing Search: Events and their Discourse Context. Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science, vol 7817, pages 318-334, Springer

Ananiadou, S., Thompson, P. and Nawaz, R. (2013). Mining events from the literature for bioinformatics applications. ACM SIGWEB Newsletter, Autumn 2013

Batista-Navarro, R. T. B., Kontonatsios, G., Mihaila, C., Thompson, P., Rak, R., Nawaz, R., Korkontzelos, I. and Ananiadou, S. (2013). Facilitating the Analysis of Discourse Phenomena in an Interoperable NLP Platform. Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science, vol 7816, pages 559-571, Springer, Berlin Heidelberg

Kontonatsios, G., Korkontzelos, I., Kolluru, B., Thompson, P. and Ananiadou, S. (2013). Deploying and Sharing U-Compare Workflows as Web Services. Journal of Biomedical Semantics, 4:7

Kontonatsios, G., Thompson, P., Batista-Navarro, R. T. B., Mihaila, C., Korkontzelos, I. and Ananiadou, S. (2013). Extending an interoperable platform to facilitate the creation of multilingual and multimodal NLP applications. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Association for Computational Linguistics, Sofia, Bulgaria, pp. 43-48

Mihaila, C., Kontonatsios, G., Batista-Navarro, R. T. B., Thompson, P., Korkontzelos, I. and Ananiadou, S. (2013). Towards a Better Understanding of Discourse: Integrating Multiple Discourse Annotation Perspectives Using UIMA. Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, Association for Computational Linguistics, Sofia, Bulgaria, pp. 79-88 (LAW Challenge Award)

Nawaz, R., Thompson, P. and Ananiadou, S. (2013). Negated BioEvents: Analysis and Identification. BMC Bioinformatics, 14:14 (Highly Accessed)

Nawaz, R., Thompson, P. and Ananiadou, S. (2013). Towards Event-based Discourse Analysis of Biomedical Text. International Journal of Computational Linguistics and Applications, 4(2), 101-120

Nawaz, R., Thompson, P. and Ananiadou, S. (2013). Something old, something new: identifying knowledge source in bio-events. International Journal of Computational Linguistics and Applications, 4(1), 129-144

Thompson, P., Nawaz, R., Korkontzelos, I., Black, W.J., McNaught, J. and Ananiadou, S. (2013). News Search Using Discourse Analytics. Proceedings of the 2013 Digital Heritage International Congress, Marseille, France, pp. 597-604, IEEE

Sophia Ananiadou, John McNaught and Paul Thompson (2012). The English Language in the Digital Age. In Georg Rehm and Hans Uszkoreit (Eds.) White Paper Series, Springer

Maria Liakata, Paul Thompson, Anita de Waard, Raheel Nawaz, Henk Pander Maat and Sophia Ananiadou (2012). A three-way perspective on scientific discourse annotation for knowledge extraction. Proceedings of the ACL Workshop on Detecting Structure in Scholarly Discourse (DSSD), pp. 37-46

Makoto Miwa, Paul Thompson and Sophia Ananiadou (2012). Boosting automatic event extraction from the literature using domain adaptation and coreference resolution. Bioinformatics, 28(13), 1759-1765

Makoto Miwa, Paul Thompson, John McNaught, Douglas B. Kell and Sophia Ananiadou (2012). Extracting semantically enriched events from biomedical literature. BMC Bioinformatics 13:108 (Highly Accessed)

Raheel Nawaz, Paul Thompson and Sophia Ananiadou (2012). Identification of Manner in Bio-Events. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), pp. 3505-3510.

Raheel Nawaz, Paul Thompson and Sophia Ananiadou (2012). Meta-Knowledge Annotation at the Event Level: Comparison between Abstracts and Full Papers. Proceedings of the Third Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM 2012), pp. 24-31

Xinkai Wang, Paul Thompson and Sophia Ananiadou (2012). Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012) pp. 1148-1155

Rosner, M., Attard, A., Thompson, P., Gatt, A. and Ananiadou, S. (2011). Extending a Tool Resource Framework with U-Compare. Proceedings of the 5th Language & Technology Conference (LTC'2011)

Paul Thompson, Yoshinobu Kano, John McNaught, Steve Pettifer, Teresa Attwood, John Keane and Sophia Ananiadou (2011). "Promoting Interoperability of Resources in META-SHARE". Proceedings of the IJCNLP Workshop on Language Resources, Technology and Services in the Sharing Paradigm (LRTS), Chiang Mai, Thailand, November, pp. 50-58.

Paul Thompson, Raheel Nawaz, John McNaught and Sophia Ananiadou (2011). "Enriching a biomedical event corpus with meta-knowledge annotation". BMC Bioinformatics, 12:393 (Highly Accessed)

Paul Thompson, John McNaught, Simonetta Montemagni, Nicoletta Calzolari, Riccardo del Gratta, Vivian Lee, Simone Marchi, Monica Monachini, Piotr Pezik, Valeria Quochi, C.J. Rupp, Yutaka Sasaki, Giulia Venturi, Dietrich Rebholz-Schuhmann and Sophia Ananiadou (2011). "The BioLexicon: a large-scale terminological resource for biomedical text mining". BMC Bioinformatics 12:397 (Highly Accessed)

Sophia Ananiadou, Paul Thompson, Yoshinobu Kano, John McNaught, Teresa K. Attwood, Philip J. R. Day, John Keane, Dean A. Jackson and Steve Pettifer (2011). "Towards Interoperability of European Language Resources". Ariadne, 67

C.J. Rupp, Paul Thompson, William J. Black, John McNaught and Sophia Ananiadou (2010). "A Specialised Verb Lexicon as the Basis of Fact Extraction in the Biomedical Domain". Proceedings of Interdisciplinary Workshop on Verbs: The Identification and Representation of Verb Features (Verb 2010), Pisa, Italy.

Raheel Nawaz, Paul Thompson and Sophia Ananiadou (2010). "Event Interpretation: A Step towards Event-Centred Text Mining". Proceedings of the First International Workshop on Automated Motif Discovery in Cultural Heritage and Scientific Communication Texts (AMICUS 2010), CLARIN/DARIAH 2010, Vienna, Austria.

Sophia Ananiadou, Paul Thompson and Raheel Nawaz (2010). "Improving Search Through Event-based Biomedical Text Mining". Proceedings of the First International Workshop on Automated Motif Discovery in Cultural Heritage and Scientific Communication Texts (AMICUS 2010), CLARIN/DARIAH 2010, Vienna, Austria.

Raheel Nawaz, Paul Thompson and Sophia Ananiadou (2010). "Evaluating a Meta-Knowledge Annotation Scheme for Bio-Events". Proceedings of the Workshop on Negation and Speculation in Natural Language Processing (NeSp-NLP 2010), ACL 2010, Uppsala, Sweden, pp. 69-77

Sophia Ananiadou, Paul Thompson, James Thomas, Tingting Mu, Sandy Oliver, Mark Rickinson, Yutaka Sasaki, Davy Weissenbacher and John McNaught (2010). "Supporting the Education Evidence Portal via Text Mining". Philosophical Transactions of the Royal Society A, 368(1925), 3829-3844.

Raheel Nawaz, Paul Thompson, John McNaught and Sophia Ananiadou (2010). "Meta-Knowledge Annotation of Bio-Events". Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), Malta, May.

Paul Thompson, Syed A. Iqbal, John McNaught and Sophia Ananiadou (2009). "Construction of an annotated corpus to support biomedical information extraction". BMC Bioinformatics 10:349

Yutaka Sasaki, Paul Thompson, John McNaught and Sophia Ananiadou (2009). "Biological Event Recognition with Textual Induction". Proceedings of 3rd International Symposium on Languages in Biology and Medicine (LBM-2009).

Yutaka Sasaki, Paul Thompson, John McNaught and Sophia Ananiadou (2009). "Three BioNLP Tools Powered by the BioLexicon". Proceedings of EACL 2009 Demonstration Session, pp. 61-64.

Giulia Venturi, Simonetta Montemagni, Simone Marchi, Yutaka Sasaki, Paul Thompson, John McNaught and Sophia Ananiadou (2009). "Bootstrapping a Verb Lexicon for Biomedical Information Extraction". Proceedings of the 10th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2009), pp. 137-148, Springer

Yutaka Sasaki, Paul Thompson, Philip Cotter, John McNaught and Sophia Ananiadou (2008) Event Frame Extraction Based on a Gene Regulation Corpus, Proceedings of the 22nd International Conference on Computational Linguistics (Coling-2008), pp. 761-768, Manchester, August

Paul Thompson, Giulia Venturi, John McNaught, Simonetta Montemagni and Sophia Ananiadou (2008). "Categorising Modality in Biomedical Texts". LREC 2008 workshop "Building and Evaluating resources for biomedical text mining", Marrakech, Morocco, May.

Paul Thompson, Philip Cotter, Sophia Ananiadou, John McNaught, Simonetta Montemagni, Andrea Trabucco and Giulia Venturi (2008). "Building a Bio-Event Annotated Corpus for the Acquisition of Semantic Frames from Biomedical Corpora". Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, May.

William Black, Andrew Conroy, Adam Funk, Allan Ramsay, Mark Stairmand, and Paul Thompson. (2004) “Multilingual Discourse Processing”. In B. Gambäck and K. Jokinen, editors, Proceedings of the 20th International Conference on Computational Linguistics, pp. 15-21, Geneva, Switzerland, August. ACL. ‘Robust and Adaptive Information Processing for Mobile Speech Interfaces: DUMAS Final Workshop’

Markku Turunen, Esa-Pekka Salonen, Mikko Hartikainen, Jaakko Hakulinen, William Black, Allan Ramsay, Adam Funk, Andrew Conroy, Paul Thompson, Mark Stairmand, Kristiina Jokinen, Jyrki Rissanen, Kari Kanto, Antti Kerminen, Björn Gambäck, Magnus Sahlgren, Fredrik Olsson, Maria Cheadle, Preben Hansen, and Stina Nylander. (2004). “AthosMail: A Multilingual Adaptive Spoken Dialogue System for the E-Mail Domain”. In B. Gambäck and K. Jokinen, editors, Proceedings of the 20th International Conference on Computational Linguistics, pp. 77-86, Geneva, Switzerland, August. ACL. ‘Robust and Adaptive Information Processing for Mobile Speech Interfaces: DUMAS Final Workshop’.

Paul Thompson, Mark Stairmand, and William Black. (2004) “Utterance Planning in an Agent-based Dialogue System”. In Proceedings of the 3rd International Conference on Natural Language Generation, University of Brighton, Brockenhurst, England, July

William Black, Paul Thompson, Adam Funk, and Andrew Conroy. (2003) “Learning to Classify Utterances in a Task-Oriented Dialogue”. In Kristiina Jokinen, Yorick Wilks, Björn Gambäck, William Black, and Roberta Catizone, editors, Proceedings of the EACL Workshop on Dialogue Systems: Interaction, Adaptation and Styles of Management, pp. 9-16, Budapest, Hungary, April.

Paul Thompson (2003) “Decision Trees for Dialogue Act Classification”. In 6th Annual Computational Linguistics in the UK Research Colloquium, Edinburgh, Scotland, January.