Two questions: is max_seq_length = 75? If so, why 75?
Because of the max_seq_length = 75 coming from https://huggingface.co/NbAiLab/nb-sbert-base/blob/main/sentence_bert_config.json, I am getting different results when running the model with Sentence-Transformers vs. HuggingFace Transformers:
The Sentence-Transformers method is in fact limited to 75 tokens (including the two special tokens).
The HuggingFace Transformers method does not have a max length set, so sequences of up to 512 tokens, like the original BERT model, will work.
Is the 75 a mistake? Or is it intentional because the model wasn't fine-tuned on sentences longer than 75 - 2 (CLS & SEP) = 73 tokens?
See GitHub gist for details: https://gist.github.com/sam-h-long/a5874c55d2f4452651fe504fa607321f
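In case it's easier than opening the gist, here is a simplified sketch of the kind of comparison I mean (the gist has my actual code; the mean pooling below is the standard Sentence-Transformers recipe and an assumption on my part):

```python
from sentence_transformers import SentenceTransformer
from transformers import AutoModel, AutoTokenizer
import torch

long_text = "word " * 200  # well over 75 tokens

# Sentence-Transformers: silently truncates to max_seq_length = 75
st_model = SentenceTransformer("NbAiLab/nb-sbert-base")
print(st_model.max_seq_length)  # -> 75
st_embedding = st_model.encode(long_text)

# Plain HuggingFace Transformers with mean pooling: no 75-token cap,
# so up to 512 tokens contribute and the embedding differs
tokenizer = AutoTokenizer.from_pretrained("NbAiLab/nb-sbert-base")
model = AutoModel.from_pretrained("NbAiLab/nb-sbert-base")
encoded = tokenizer(long_text, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state
mask = encoded["attention_mask"].unsqueeze(-1).float()
hf_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
```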
Any thoughts on this?
Hi. Sorry about the delayed response.
It seems the max length comes from this line in the script we used as the basis for our training.
I'm not sure why it's so low, but it might still work well if you override it.
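Something like this should work if you want to raise the cap (untested against this model specifically, but max_seq_length is meant to be overridable):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("NbAiLab/nb-sbert-base")
print(model.max_seq_length)  # -> 75, read from sentence_bert_config.json

# Raise the limit up to the underlying BERT model's 512-token maximum
model.max_seq_length = 512
embedding = model.encode("A text longer than 75 tokens ...")
```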