Trainable Methods for Surface Natural Language Generation

We present three systems for surface natural language generation that are trainable from annotated corpora. The first two systems, NLG1 and NLG2, require a corpus marked only with domain-specific semantic attributes, while the third, NLG3, requires a corpus marked with both semantic attributes and syntactic dependency information. All systems attempt to produce a grammatical natural language phrase from a domain-specific semantic representation. NLG1 serves as a baseline system and uses phrase frequencies to generate a whole phrase in one step, while NLG2 and NLG3 use maximum entropy probability models to generate each word in the phrase individually. NLG2 and NLG3 learn to determine both the word choice and the word order of the phrase. We present experiments in which we generate phrases to describe flights in the air travel domain.

We use maximum entropy models to drive generation with word bigram or dependency representations, taking into account (unrealised) semantic features. We use a large collection of generation templates for surface realization. We present maximum entropy models to learn attribute ordering and lexical choice for sentence generation from a semantic representation of attribute-value pairs, restricted to an air travel domain.
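The frequency-based baseline described above (choosing a whole phrase in one step from phrase frequencies) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the text: the function names `train_nlg1` and `generate_nlg1`, the `$attribute` placeholder notation, the attribute names (`city-fr`, `city-to`, `time-dep`), and the toy corpus. The sketch counts how often each phrase template occurs with a given attribute set and, at generation time, returns the most frequent template for the input attributes with the values filled in.

```python
from collections import Counter, defaultdict


def train_nlg1(corpus):
    """Count phrase templates per attribute set.

    corpus: iterable of (attribute_names, template) pairs, where the
    template marks each attribute with a hypothetical "$name" placeholder.
    """
    counts = defaultdict(Counter)
    for attribute_names, template in corpus:
        counts[frozenset(attribute_names)][template] += 1
    return counts


def generate_nlg1(counts, attribute_values):
    """Return the most frequent template for this attribute set, filled in.

    Returns None when the exact attribute set never occurred in training.
    """
    key = frozenset(attribute_values)
    if key not in counts:
        return None
    template, _ = counts[key].most_common(1)[0]
    # Substitute longer names first so one placeholder cannot clobber
    # another that shares its prefix.
    for name in sorted(attribute_values, key=len, reverse=True):
        template = template.replace("$" + name, attribute_values[name])
    return template


# Toy annotated corpus (invented for this example).
corpus = [
    ({"city-fr", "city-to"}, "flights from $city-fr to $city-to"),
    ({"city-fr", "city-to"}, "flights from $city-fr to $city-to"),
    ({"city-fr", "city-to"}, "flights to $city-to from $city-fr"),
    ({"city-fr", "city-to", "time-dep"},
     "flights from $city-fr to $city-to leaving at $time-dep"),
]

counts = train_nlg1(corpus)
phrase = generate_nlg1(counts, {"city-fr": "Boston", "city-to": "Denver"})
print(phrase)
```

A lookup of this kind can only reproduce phrases for attribute sets it has already seen, which is one way to read the motivation for the word-by-word maximum entropy systems: they can assemble a phrase for an unseen combination of attributes.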