Language Model Adaptation for Statistical Machine Translation via Structured Query Models

We explore unsupervised language model adaptation techniques for statistical machine translation. The hypotheses from the machine translation output are converted into queries at different levels of representation power and used to extract similar sentences from a very large monolingual text collection. Specific language models are then built from the retrieved data and interpolated with a general background model. Experiments show significant improvements when translating with these adapted language models. We apply a slightly different sentence-level strategy to language model adaptation, first generating an n-best list with a baseline system, then finding similar sentences in a monolingual target-language corpus. We construct specific language models by using machine translation output as queries to extract similar sentences from large monolingual corpora. We convert initial SMT hypotheses to queries and retrieve similar sentences from a large monolingual collection.
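The pipeline described above (hypotheses to query, retrieval of similar sentences, a specific language model, interpolation with a background model) can be sketched in a few lines of Python. This is only an illustrative sketch under several assumptions not taken from the text: the query is a plain bag of words, retrieval uses TF-IDF cosine similarity, both language models are add-one-smoothed unigram models, and the interpolation weight `lam` is fixed at 0.5. The toy corpus, the hypotheses, and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption made for this sketch)."""
    return text.lower().split()


def retrieve(query_tokens, corpus, top_k=2):
    """Rank corpus sentences by TF-IDF cosine similarity to a bag-of-words query."""
    docs = [Counter(tokenize(s)) for s in corpus]
    n = len(docs)
    df = Counter(w for d in docs for w in d)  # document frequency per word
    idf = {w: math.log((n + 1) / (df[w] + 1)) + 1.0 for w in df}

    def weigh(counts):
        # Words unseen in the corpus have no IDF and are dropped from the query.
        return {w: c * idf[w] for w, c in counts.items() if w in idf}

    def cosine(a, b):
        dot = sum(v * b.get(w, 0.0) for w, v in a.items())
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = weigh(Counter(query_tokens))
    scored = sorted(
        ((cosine(q, weigh(d)), i) for i, d in enumerate(docs)),
        key=lambda t: (-t[0], t[1]),
    )
    return [corpus[i] for score, i in scored[:top_k] if score > 0]


def unigram_lm(sentences, vocab):
    """Add-one-smoothed unigram model over a fixed vocabulary."""
    counts = Counter(w for s in sentences for w in tokenize(s))
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(specific, background, lam=0.5):
    """Linear interpolation: lam * P_specific + (1 - lam) * P_background."""
    return {w: lam * specific[w] + (1 - lam) * background[w] for w in background}


# Hypothetical monolingual target-language collection.
corpus = [
    "the central bank kept interest rates unchanged",
    "the football team won the final match",
    "interest rates were raised by the bank",
    "heavy rain is expected over the weekend",
    "the chef prepared a new seasonal menu",
]

# Hypothetical n-best hypotheses from a baseline translation system.
hypotheses = [
    "the central bank raised interest rates",
    "central bank increases the interest rate",
]

query = [w for h in hypotheses for w in tokenize(h)]
vocab = {w for s in corpus for w in tokenize(s)}

background = unigram_lm(corpus, vocab)
retrieved = retrieve(query, corpus, top_k=2)
specific = unigram_lm(retrieved, vocab)
adapted = interpolate(specific, background, lam=0.5)
```

In this toy setup the two finance sentences are retrieved, so the adapted model assigns more probability to words such as "bank" than the background model does. A realistic system would use higher-order n-gram models, a proper retrieval engine, and an interpolation weight tuned on held-out data.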