Commit 8712fdb by ai-forever (parent: a31c394): Update README.md

README.md (changed):
# BERT large model multitask (cased) for Sentence Embeddings in Russian language.

The model is described [in this article](https://habr.com/ru/company/sberdevices/blog/560748/).
Russian SuperGLUE [metrics](https://russiansuperglue.com/login/submit_info/944).

For better quality, use mean token embeddings.

## Usage (HuggingFace Models Repository)

You can use the model directly from the model repository to compute sentence embeddings:

```python
from transformers import AutoTokenizer, AutoModel
import torch


# Mean-pool the token embeddings, using the attention mask to ignore padding
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # token embeddings from the last layer
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1)
    sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9)
    return sum_embeddings / sum_mask


sentences = ['Привет! Как твои дела?',
             'А правда, что 42 твое любимое число?']

# Load AutoModel from huggingface model repository
tokenizer = AutoTokenizer.from_pretrained("sberbank-ai/sbert_large_mt_nlu_ru")
model = AutoModel.from_pretrained("sberbank-ai/sbert_large_mt_nlu_ru")

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, max_length=24, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling: in this case, mean pooling over the tokens
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
```