15 April 2010
Latent TAG Derivations for Semantic Role Labeling
Meeting Room 10, 2nd Floor, JLB
12:30pm - 13:45pm
Dr Anoop Sarkar - Simon Fraser University - Canada
Anoop Sarkar (joint work with his students: Yudong Liu and Gholamreza Haffari)
Semantic Role Labeling (SRL) is a natural language processing task that aims to identify and label all the arguments for each predicate occurring in a sentence. SRL is difficult because arguments can appear in different syntactic positions relative to the predicate due to syntactic alternations. Furthermore, complex syntactic embedding can create long-distance dependencies between predicate and argument. As in other natural language learning tasks, identifying discriminative features plays an important role, and all state-of-the-art SRL systems use high-quality statistical parsers as a source of features for identifying and classifying semantic roles.
In statistical parsing, the use of latent information (such as state-splitting of non-terminals in a context-free grammar) has led to substantial improvements in parsing accuracy. However, apart from the sentence simplification approach of Vickrey and Koller (2008), latent information has not been exploited for semantic role labeling. In our work, we take the output of a statistical parser and then decompose the phrase structure tree into a large number of hidden tree-adjoining grammar (TAG) derivations. Each hidden, or latent, TAG derivation captures a different view of the structural dependency relationship between the predicate and argument.
We hypothesize that positive and negative examples of individual semantic roles can be reliably distinguished by possibly different latent TAG features. Motivated by this insight we show that latent support vector machines (LSVMs) can be used for the SRL task by exploiting these latent TAG features. In experiments on the PropBank-CoNLL 2005 data set, our method significantly outperforms the state of the art (even compared to models using global constraints or global inference over multiple parses). We show that latent SVMs offer an interesting new framework for NLP tasks, and using experimental analysis we examine how and why the method is effective at exploiting the latent TAG features in order to improve the precision of identifying and classifying semantic roles.
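As a rough illustration of the idea, a latent SVM typically scores an example by maximizing a linear score over its latent structures, here the candidate TAG derivations for one predicate-argument pair. The sketch below shows only that generic decision rule; it is not the speakers' implementation, and the feature names, weights, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[str, float]


def score(weights: Features, features: Features) -> float:
    """Linear score w . phi for one latent derivation (sparse features)."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def best_latent(weights: Features, derivations: List[Features]) -> Tuple[int, float]:
    """Return (index, score) of the highest-scoring latent derivation."""
    scores = [score(weights, d) for d in derivations]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def predict(weights: Features, derivations: List[Features], bias: float = 0.0) -> bool:
    """Label the candidate positive if its best derivation scores above zero."""
    _, best_score = best_latent(weights, derivations)
    return best_score + bias > 0


# Hypothetical weights and two hypothetical derivations of the same candidate.
weights = {"path=NP^S_VP": 1.5, "path=NP^VP": -0.5, "spine=S-VP": 0.5}
derivations = [
    {"path=NP^VP": 1.0},
    {"path=NP^S_VP": 1.0, "spine=S-VP": 1.0},
]
index, value = best_latent(weights, derivations)
is_argument = predict(weights, derivations)
```

Because the maximum is taken per example, positive and negative examples of the same role can be decided by different derivations, which matches the hypothesis stated above.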
Anoop Sarkar is an Associate Professor at Simon Fraser University in British Columbia, Canada. He received his Ph.D. from the Department of Computer and Information Sciences at the University of Pennsylvania under Prof. Aravind Joshi.
His research is focused on combining labeled and unlabeled data for statistical parsing and machine translation. His interests also include formal language theory and stochastic grammars, in particular tree automata and tree-adjoining grammars.
Anoop is currently spending his sabbatical year at the University of Edinburgh.