Dr. Yaoyong Li

Postdoctoral Scientist
Paterson Institute for Cancer Research
University of Manchester
Wilmslow Road, Withington
Manchester, M20 4BX
UK

Tel: +44 161 446 8218
Fax: +44 161 446 3109
Email: Yaoyong.Li@manchester.ac.uk


Research Details

May 2009 --- Present
I work in the Applied Computational Biology and Bioinformatics group at the Paterson Institute for Cancer Research. I am currently working on a project to process the data produced by a next-generation sequencing machine, the SOLiD system.

February 2004 --- April 2009
I worked in the GATE team in the Sheffield NLP Group. My most recent project there was the EU project LarKC, which aims to build a platform and the essential plug-ins for massive distributed incomplete reasoning and semantic computing for the Semantic Web. I also worked on the SAM project, on semantic annotation of patents using the semi-automatic approach of GATE Teamware, and on another EU project, SEKT. My research focused on developing effective and efficient machine learning algorithms for text mining, and on building text mining systems in the GATE software. I also worked on applications of text mining, e.g. opinion mining, patent semantic annotation, biomedical text mining, and bioinformatics, and studied advanced learning algorithms for other NLP tasks, such as Chinese word segmentation, cross-language information retrieval, and cross-language document categorisation. I was also interested in, and worked on, Semantic Web technology.

July 2001 --- January 2004:
I worked for John Shawe-Taylor on the EU 5th Framework project 'Kernel Methods for Image and Text' (KerMIT). The project concerned developing machine learning algorithms and software for classification, clustering, ranking and filtering of text and images. I developed effective machine learning algorithms for document categorisation, filtering and cross-language information retrieval. Our document filtering system ranked second in the adaptive filtering task at TREC 2002.

Aug. 1999 --- July 2001:
I worked on the China-funded project 'The Natural Language Understanding and HNC Theory', part of the National Key Fundamental Research Program (the 973 Program), at the Institute of Acoustics, CAS, China. The project developed a new theory of natural language understanding -- 'Hierarchical Networks of Concepts' (HNC) -- and a machine translation system using HNC. HNC develops knowledge and methods for semantically analysing natural language. One of its core knowledge resources is an ontology for analysing sentences.

Sept. 1996 --- Aug. 1999:
I worked on the 'Chaotic Neural Networks and Intelligent Information Processing' project at Xi'an Jiaotong University, China. This project involved studying neurobiology-oriented dynamical neural network models and their applications to intelligent information processing.

Sept. 1993 --- Sept. 1996:
I worked on the 'Industrial Robot Vision' project at Xi'an Jiaotong University, China. The project involved designing and developing a state-of-the-art robot vision system for assembly lines.

Software:

I participate in developing GATE, a well-known software package for natural language engineering. In particular, I developed the machine learning facilities in GATE, the annotation merging plugin, and the inter-annotator agreement (IAA) computation. For more details about this work, see Chapter 11 and Chapter 9 of the GATE User Manual, respectively.

A Chinese Word Segmenter based on the Perceptron learning algorithm. It can be used for both simplified Chinese and traditional Chinese text. It obtained top scores in the SIGHAN-2005 Chinese word segmentation task. It has not been integrated into GATE yet. Please contact me if you are interested in using it.

The following PAUM models were learned from the SIGHAN-05 training data and can be used in the Chinese word segmentation plugin of GATE:
The PAUM model learned from the PKU training data in UTF-8 encoding
The PAUM model learned from the PKU training data in GB2312 encoding

Publications:

Yaoyong Li, Kalina Bontcheva, Hamish Cunningham (2009). Adapting SVM for Data Sparseness and Imbalance: A Case Study on Information Extraction. Natural Language Engineering, 15(02), 241-271. On-line Link

Kalina Bontcheva, Brian Davis, Adam Funk, Yaoyong Li, Ting Wang (2009). Human Language Technologies. Semantic Knowledge Management, John Davies, Marko Grobelnik, and Dunja Mladenic (Eds.), Springer, 37-49. On-line Link

Yaoyong Li, Hamish Cunningham (2008). Geometric and Quantum Methods for Information Retrieval. SIGIR Forum, 42(2), 22-32. On-line Link

Yaoyong Li, Kalina Bontcheva (2008). Adapting Support Vector Machines for F-term-based Classification of Patents. ACM Transactions on Asian Language Information Processing, 7(2), 7:1--7:19. On-line Link

M. Agatonovic, N. Aswani, K. Bontcheva, H. Cunningham, T. Heitz, Y. Li, I. Roberts and V. Tablan (2008). Large-scale, Parallel Automatic Patent Annotation. Proceedings of 1st International CIKM Workshop on Patent Information Retrieval - PaIR'08, Napa Valley, California, USA, 30 October, p1-8. On-line Link

Yaoyong Li, John Shawe-Taylor (2007). Advanced learning algorithms for cross-language patent retrieval and classification. Information Processing and Management, 43(5), 1183-1199. On-line Link

Yaoyong Li, Kalina Bontcheva (2007). Hierarchical, Perceptron-like Learning for Ontology Based Information Extraction. Proceedings of the 16th International World Wide Web Conference (WWW2007), 777-786. (PDF version available )

Yaoyong Li, Kalina Bontcheva, Hamish Cunningham (2007). Cost Sensitive Evaluation Measures for F-term Patent Classification. Proceedings of the First International Workshop on Evaluating Information Access (EVIA 2007), 44-53. (PDF version available )

Yaoyong Li, Kalina Bontcheva, Hamish Cunningham (2007). SVM Based Learning System for F-term Patent Classification. Proceedings of the Sixth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Cross-Lingual Information Access, 396-402. (PDF version available )

Yaoyong Li, Kalina Bontcheva, Hamish Cunningham (2007). Experiments of Opinion Analysis on the Corpora MPQA and NTCIR-6. Proceedings of the Sixth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Cross-Lingual Information Access, 323-329. (PDF version available )

Diana Maynard, Yaoyong Li, Wim Peters (2007). NLP Techniques for Term Extraction and Ontology Population. Bridging the Gap between Text and Knowledge - Selected Contributions to Ontology Learning and Population from Text, P. Buitelaar and P. Cimiano (editors). IOS Press, 107-127. On-line Link

Yaoyong Li, John Shawe-Taylor (2006). Using KCCA for Japanese-English Cross-language Information Retrieval and Document Classification. Journal of Intelligent Information Systems, 27(2), 117-133. On-line Link

Diana Maynard, Wim Peters, Yaoyong Li (2006). Metrics for Evaluation of Ontology-based Information Extraction. Proceedings of WWW 2006 Workshop on Evaluation of Ontologies for the Web (EON 2006), Edinburgh, Scotland. (PDF version available )

Ting Wang, Yaoyong Li, Kalina Bontcheva, Hamish Cunningham, and Ji Wang (2006). Automatic Extraction of Hierarchical Relations from Text. Y. Sure and J. Domingue (Eds.): ESWC 2006, LNCS 4011, 215--229, Springer-Verlag Berlin Heidelberg. (PDF version available )

Hamish Cunningham, Kalina Bontcheva, Yaoyong Li (2005). Knowledge Management and Human Language: Crossing the Chasm. Journal of Knowledge Management, Vol. 9, No. 5. pp. 108--131.

Yaoyong Li, Kalina Bontcheva, Hamish Cunningham (2005). Using Uneven Margins SVM and Perceptron for Information Extraction. Proceedings of Ninth Conference on Computational Natural Language Learning (CoNLL-2005), 72-79. ( PDF version available )

Yaoyong Li, Chuanjiang Miao, Kalina Bontcheva, Hamish Cunningham (2005). Perceptron Learning for Chinese Word Segmentation. Proceedings of Fourth SIGHAN Workshop on Chinese Language Processing (Sighan-05), 154-157. (PDF version available)

Yaoyong Li, Kalina Bontcheva, Hamish Cunningham (2005). SVM Based Learning System For Information Extraction. J. Winkler, M. Niranjan and N. Lawrence (Eds.): Deterministic and Statistical Methods in Machine Learning, LNAI 3635, Springer Verlag, 319-339. (PDF version available)

Yaoyong Li, Kalina Bontcheva, Hamish Cunningham (2005). SVM and Perceptron Based IE Systems For the Pascal Challenge. PASCAL Challenges Workshop, Southampton, UK. ( PPT version available )

Yaoyong Li and John Shawe-Taylor (2004), Combining Clustering with Canonical Correlation Analysis for Cross-Language Patent Retrieval, in Learning Methods for Text Understanding and Mining Workshop, Grenoble, France. (PDF version available )

Yaoyong Li and John Shawe-Taylor (2003). The SVM with uneven margins and Chinese document categorisation, in Proceedings of The 17th Pacific Asia Conference on Language, Information and Computation (PACLIC17), 216--227. ( PDF version available )

Nicola Cancedda, Nicolo Cesa-Bianchi, Alex Conconi, Claudio Gentile, Cyril Goutte, Thore Graepel, Yaoyong Li, Jean-Michel Renders, John Shawe-Taylor (2003), Kernel methods for document filtering. E. M. Voorhees and Lori P. Buckland, Editors, in Proceedings of The Eleventh Text Retrieval Conference (TREC 2002). ( PDF version available )

Yaoyong Li, Hugo Zaragoza, Ralf Herbrich, John Shawe-Taylor, and Jaz Kandola (2002), The Perceptron algorithm with uneven margins, in Proceedings of the 19th International Conference on Machine Learning (ICML-2002), 379--386. (PDF version available)

Yaoyong Li (2001), Example-based Chinese-English machine translation by using HNC theory, in Natural Language Understanding and Machine Translation -- the Proceedings of the 6th Joint Conference of Computational Linguistics of China (JSCL-2001), 319-325. (in Chinese)

Yaoyong Li (2001), The Chinese-English machine translation system based on the Analysis of Sentence Category of HNC, in The HNC and Linguistics -- Proceeding of the 1st Conference on Natural Language Theory and HNC, 294-301. (in Chinese)

Yaoyong Li and Nanning Zheng (1998), Improved dynamical model for neural network and the stabilisation of associative memory, Progress in Natural Science, 8(5), 610-618.

Yaoyong Li and Nanning Zheng (1998), Controlling the chaotic neural network, Chinese Journal of Computers Science and Technology, 21(supplement), 142-146. (in Chinese)

Yaoyong Li and Nanning Zheng (1997), The self-organized and time-delayed hybrid neural network model and its application to the occluded object recognition, Pattern Recognition And Artificial Intelligence, 10(4), 317-325. (in Chinese)

Yaoyong Li and Nanning Zheng (1997), A new Hopfield-type neural network model with more biological plausibility, Chinese Journal of Neuroanatomy, 13(supplement), 313. (in Chinese)

Yaoyong Li and Nanning Zheng (1996), The relationship between the size and generalization ability of neural networks, Journal of Xi'an Jiaotong University, 30(9), 22-29. (in Chinese)

Yaoyong Li and Nanning Zheng (1996), Letters to the editor - an explanation of Amirikian and Nishimura's neural net, Neural Networks, 9, 1085-1086.

Yaoyong Li, Nanning Zheng, and Lixing Yuan (1996), Controlling the chaotic neural network-a way of information integration, in Proceedings of the 1996 IEEE/SICE/RSJ Int. Conf. On Multisensor Fusion and Integration, 775-780.

Nanning Zheng, Yaoyong Li, and Wiek Houwers (1995), A local feature-based recognition of partially occluded objects using neural network, in Proceedings of the 1995 IEEE Industrial Electronics, Control, and Instrumentation, 1301-1306.

Yaoyong Li and Zili Zhang (1995), Solving the free boundary problem in continuous casting by using boundary element method, Applied Mathematics and Mechanics, 12(12), 1201-1208.
