Base Noun Phrase Translation Using Web Data and the EM Algorithm

We consider here the problem of base noun phrase translation and propose a new method to perform the task. For a given base NP, we first search the web for its translation candidates. We next determine the possible translation(s) from among the candidates using one of two methods that we have developed. In one method, we employ an ensemble of naive Bayesian classifiers constructed with the EM algorithm. In the other method, we use TF-IDF vectors, also constructed with the EM algorithm. Experimental results indicate that the coverage and accuracy of our method are significantly better than those of the baseline methods relying on existing technologies. In our method, translation candidates of a term are generated compositionally by concatenating the translations of the term's constituents, and are then re-ranked by measuring their contextual similarity against the source-language term.
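The generate-then-re-rank idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy dictionary, and the context counts are all hypothetical, and the EM step is omitted entirely. The sketch simply assumes that the source term's context words have already been mapped into the target-language vocabulary, and it uses a plain smoothed TF-IDF weighting with cosine similarity as a stand-in for the similarity measure.

```python
from collections import Counter
from itertools import product
from math import log, sqrt


def generate_candidates(constituents, dictionary):
    """Concatenate every combination of constituent translations."""
    options = [dictionary.get(word, []) for word in constituents]
    return [" ".join(combo) for combo in product(*options)]


def tfidf_vectors(contexts):
    """Map each name to a TF-IDF vector built from its context word counts."""
    n_docs = len(contexts)
    # Document frequency: in how many contexts each word occurs.
    doc_freq = Counter(word for counts in contexts.values() for word in counts)
    return {
        name: {w: tf * (log(n_docs / doc_freq[w]) + 1.0) for w, tf in counts.items()}
        for name, counts in contexts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(source_context, candidate_contexts):
    """Order candidates by contextual similarity to the source term."""
    contexts = dict(candidate_contexts)
    contexts["__source__"] = source_context
    vectors = tfidf_vectors(contexts)
    source_vec = vectors.pop("__source__")
    scored = [(cosine(source_vec, vec), cand) for cand, vec in vectors.items()]
    return [cand for _, cand in sorted(scored, key=lambda p: p[0], reverse=True)]


# Toy data (hypothetical): a two-word source term and a tiny bilingual dictionary.
dictionary = {
    "information": ["xinxi", "qingbao"],
    "age": ["shidai", "nianling"],
}
candidates = generate_candidates(["information", "age"], dictionary)

# Assumption: only candidates actually found in web data have context counts.
candidate_contexts = {
    "xinxi shidai": Counter({"wangluo": 3, "jishu": 2, "shehui": 1}),
    "qingbao shidai": Counter({"junshi": 2, "shehui": 1}),
}
# Assumption: source-side context words are already mapped to the target vocabulary.
source_context = Counter({"wangluo": 2, "jishu": 2, "shehui": 1})

found = {c: candidate_contexts[c] for c in candidates if c in candidate_contexts}
ranking = rank_candidates(source_context, found)
print(ranking)  # the toy data is built so that "xinxi shidai" ranks first
```

Candidates with no context counts are dropped before ranking, which mirrors the idea that only candidates attested on the web are considered. A faithful version would replace the pre-mapped source context with a translated context distribution estimated by EM.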