Papers
arxiv:2001.06381

A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings

Published on Jan 17, 2020
Authors:

Abstract

This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings. Our method integrates multiple word embeddings created from complementary techniques, textual sources, knowledge bases and languages. Existing word vectors are projected to a common semantic space using linear transformations and averaging. With our method the resulting meta-embeddings maintain the dimensionality of the original embeddings without losing information while dealing with the out-of-vocabulary problem. An extensive empirical evaluation demonstrates the effectiveness of our technique with respect to previous work on various intrinsic and extrinsic multilingual evaluations, obtaining competitive results for Semantic Textual Similarity and state-of-the-art performance for word similarity and POS tagging (English and Spanish). The resulting cross-lingual meta-embeddings also exhibit excellent cross-lingual transfer learning capabilities. In other words, we can leverage pre-trained source embeddings from a resource-rich language in order to improve the word representations for under-resourced languages.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2001.06381 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2001.06381 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2001.06381 in a Space README.md to link it from this page.

Collections including this paper 1