6 July 2010
Inductive Logic Programming and Annotated Corpora
Room 10, 2nd Floor, JLB
12:30pm - 1:45pm
Patrick Tschorn - Sharp Laboratories of Europe, Ltd
Part-of-speech tagging, chunk parsing, named entity recognition and co-reference resolution are examples of tasks in which machine learning can be used to produce models from annotated corpora.
Inductive Logic Programming (ILP) is a subfield of machine learning. ILP algorithms represent examples, background knowledge and the target hypothesis as first-order Horn clauses (Prolog).
Annotated corpora can be represented as annotation graphs, which in turn can be expressed as first-order Horn clauses and hence be processed directly by ILP algorithms. This allows ILP to exploit long-distance relations of variable length between tokens.
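As a rough illustration of the representation described above, the following Python sketch turns a tiny annotation graph into ground Prolog facts. It is not taken from the talk: the `Arc` class, the `arc/4` predicate, the layer names and the example sentence are all assumptions made for this example. Each annotation is an arc between two numbered nodes, so an arc that spans several tokens (here a chunk) relates tokens that are not adjacent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One labelled arc of a (hypothetical) annotation graph."""
    start: int  # node where the annotation begins
    end: int    # node where the annotation ends
    layer: str  # annotation layer, e.g. "token", "pos", "chunk"
    label: str  # annotation value


def quote_atom(text: str) -> str:
    """Quote a string as a Prolog atom: escape backslashes, double single quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def arc_to_fact(arc: Arc) -> str:
    """Render an arc as a ground unit clause arc(Start, End, Layer, Label)."""
    return (
        f"arc({arc.start}, {arc.end}, "
        f"{quote_atom(arc.layer)}, {quote_atom(arc.label)})."
    )


def graph_to_facts(arcs):
    """Convert every arc of the graph into one Prolog fact, preserving order."""
    return [arc_to_fact(arc) for arc in arcs]


# Assumed example: three tokens, their part-of-speech tags, and one chunk
# whose arc spans the first two tokens.
arcs = [
    Arc(0, 1, "token", "The"),
    Arc(1, 2, "token", "talk"),
    Arc(2, 3, "token", "starts"),
    Arc(0, 1, "pos", "DT"),
    Arc(1, 2, "pos", "NN"),
    Arc(2, 3, "pos", "VBZ"),
    Arc(0, 2, "chunk", "NP"),
]

facts = graph_to_facts(arcs)
print("\n".join(facts))
```

Facts of this form could be loaded into a Prolog-based ILP system as background knowledge. The predicate layout actually used in the work presented may differ.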
While ILP is attractive for learning from annotated corpora, current ILP systems suffer from poor scalability due to high computational complexity.
In this talk, I will discuss an incremental variant of ILP and its application to sentence boundary detection and chunk parsing.
2003: M.A. Computational Linguistics and Artificial Intelligence, Osnabrück University, Germany
Automatic alignment of English-German parallel texts
2009: PhD Computer Science, Lancaster University, UK
Incremental ILP for learning from annotated corpora
present: researcher at Sharp Laboratories of Europe, Ltd.