Results of Usage (Sentence-Transformers) and Usage (HuggingFace Transformers) are different
On the model card page, two ways to generate embeddings are introduced: Usage (Sentence-Transformers) and Usage (HuggingFace Transformers).
However, if you input a sentence longer than a certain length (around 128 tokens), their outputs will not be the same.
The reason is that max_seq_length defined in sentence_bert_config.json is 128, while BERT's max_position_embeddings is 512, as shown in config.json.
In addition, this document says you cannot increase the length beyond what is maximally supported by the respective transformer model, but I think that in Usage (HuggingFace Transformers) the effective maximum length is actually 512 rather than 128, because nothing in that code path truncates to max_seq_length.
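To make the mismatch concrete, here is a minimal sketch (not taken from the model card) showing where the two limits live; "your-model-name" is a placeholder for whatever model is being discussed:

from sentence_transformers import SentenceTransformer
from transformers import AutoConfig

model_name = "your-model-name"  # placeholder, replace with the actual model id

# SentenceTransformer reads max_seq_length from sentence_bert_config.json and truncates inputs to it (128 here).
st_model = SentenceTransformer(model_name)
print(st_model.max_seq_length)

# The underlying BERT config only limits positions to max_position_embeddings (512), so the plain
# HuggingFace Transformers usage is not truncated to 128 unless you do it yourself.
hf_config = AutoConfig.from_pretrained(model_name)
print(hf_config.max_position_embeddings)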
Does this mean that we should use the SentenceTransformer class directly, rather than Usage (HuggingFace Transformers)? When should I use Usage (HuggingFace Transformers)?
I stumbled upon the same issue, and noticed empirically that the accuracy on a retrieval task decreases when the sentences are longer than 128 tokens.
The easiest fix seems to be to explicitly specify a maximum length of 128 when calling the tokenizer in the "Usage (HuggingFace Transformers)" section:
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt', max_length=128)
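For completeness, here is a minimal sketch of the full HuggingFace Transformers path with that fix applied, following the usual mean-pooling recipe from such model cards; "your-model-name" is a placeholder, and whether to L2-normalize at the end depends on the specific model card:

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average token embeddings, ignoring padding positions.
    token_embeddings = model_output[0]
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

model_name = "your-model-name"  # placeholder, replace with the actual model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

sentences = ["A sentence that may well be longer than 128 tokens ..."]

# Truncate to 128 tokens so the result matches what SentenceTransformer produces.
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt', max_length=128)

with torch.no_grad():
    model_output = model(**encoded_input)

sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)  # only if the model card normalizes embeddings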