My primary research interests lie in the text and data mining of historic literature, and in feeding the results back into the texts as semantic enhancements for the benefit of future researchers.
To expand on that statement: I have been a researcher with the Open University since 2009, and my current research centres on information extraction, especially the extraction of implicit information. Rather than treating text purely as a sequence of tokens, I look to exploit typographic cues, such as changes in typeface or alignment, that conventional NLP techniques discard. I am also interested in how to record the results of this extraction as metadata, or as semantic enhancement to a document, in an effective and accessible manner, while addressing the complications that arise when different disciplines place conflicting requirements on a common text source.
I am a member of the Open University's Digital Humanities Steering Group.