22 November 2007

The Computer Generation of Cryptic Crossword Clues Using a Semantically-Backed Unchunking Grammar

Time: 12:30pm - 1:45pm

This presentation describes the development of a system (ENIGMA) that generates a wide variety of cryptic crossword clues for any given word. A valid cryptic clue must appear to make sense as a fragment of natural language, but it must also be possible for the reader to reinterpret it symbolically as a wordplay puzzle to which there is only one answer. This poses an unusual challenge for an NLG application: the input semantics is independent of the surface semantics of the output text, and the generated text has two layers of meaning at the same linguistic level. Unlike in ambiguous texts, both semantic layers are valid interpretations of the text, and furthermore they represent meanings drawn from separate and independent semantic frameworks. I present a novel approach, which I term Natural Language Creation (NLC), in which a semantically-backed grammar is used to ‘translate’ the symbolic, crossword semantics of the input into the natural semantics of the surface reading via a multilayered text. The implementation is a hybrid language generation and language understanding process which explores the large search space incrementally, progressively ‘unchunking’ potential lexicalizations and exploring the possible semantics of the surface reading as the text emerges. The talk gives a high-level overview of a PhD thesis which was submitted in August 2007.

Short Biography: David Hardcastle is currently working with the Natural Language Generation Group at the Open University. He undertook his PhD part-time at Birkbeck College, University of London, where he was previously an Associate Student, and he has a long-standing interest in computational linguistics. He has worked for a variety of companies; prior to coming to the Open University he worked for several years in commercial IT.

