22 November 2007
The Computer Generation of Cryptic Crossword Clues Using a Semantically-Backed Unchunking Grammar
12:30pm - 1:45pm
This presentation describes the development of a system (ENIGMA) that generates a
wide variety of cryptic crossword clues for any given word. A valid cryptic clue must
appear to make sense as a fragment of natural language, but it must also be possible
for the reader to reinterpret it symbolically as a wordplay puzzle to which there is
only one answer.
This poses an unusual challenge for an NLG application; the input semantics is
independent of the surface semantics of the output text, and the generated text has two
layers of meaning at the same linguistic level. Unlike in ambiguous texts, both semantic
layers are valid interpretations of the text, and furthermore they represent meanings
drawn from separate and independent semantic frameworks.
I present a novel approach, which I term Natural Language Creation (NLC), in which
a semantically-backed grammar is used to ‘translate’ the symbolic, crossword
semantics of the input into the natural semantics of the surface reading via a multilayered
text. The implementation is a hybrid of language generation and language
understanding that explores the large search space incrementally, progressively
‘unchunking’ potential lexicalizations and exploring the possible semantics of the
surface reading as the text emerges. The work presented is a high-level overview of a
PhD thesis that was submitted in August 2007.
David Hardcastle is currently working with the Natural Language Generation Group
at the Open University. He undertook his PhD part-time at Birkbeck College,
University of London, where he was previously an Associate Student, and he has a
long-standing interest in computational linguistics. He has worked for a variety of
companies; prior to coming to the Open University he worked for several years in