---
pipeline_tag: sentence-similarity
tags:
- sentence-similarity
- sentence-transformers
license: mit
language:
- multilingual
- af
- am
- ar
- as
- az
- be
- bg
- bn
- br
- bs
- ca
- cs
- cy
- da
- de
- el
- en
- eo
- es
- et
- eu
- fa
- fi
- fr
- fy
- ga
- gd
- gl
- gu
- ha
- he
- hi
- hr
- hu
- hy
- id
- is
- it
- ja
- jv
- ka
- kk
- km
- kn
- ko
- ku
- ky
- la
- lo
- lt
- lv
- mg
- mk
- ml
- mn
- mr
- ms
- my
- ne
- nl
- no
- om
- or
- pa
- pl
- ps
- pt
- ro
- ru
- sa
- sd
- si
- sk
- sl
- so
- sq
- sr
- su
- sv
- sw
- ta
- te
- th
- tl
- tr
- ug
- uk
- ur
- uz
- vi
- xh
- yi
- zh
---
A quantized version of [multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small). Quantization was performed per layer under the same conditions as our ELSER v2 model, as described [here](https://www.elastic.co/search-labs/blog/articles/introducing-elser-v2-part-1#quantization).
The base model is described in [Text Embeddings by Weakly-Supervised Contrastive Pre-training](https://arxiv.org/pdf/2212.03533.pdf) (Liang Wang, Nan Yang, Xiaolong Huang, Binxing Jiao, Linjun Yang, Daxin Jiang, Rangan Majumder, Furu Wei, arXiv 2022).
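For illustration, the sketch below applies stock PyTorch dynamic int8 quantization to the base model's linear layers. The actual per-layer conditions follow the ELSER v2 procedure linked above, so treat this as an approximation rather than the production recipe.

```python
# Illustrative sketch only: dynamic int8 quantization of the linear layers.
# The per-layer selection used for the published artifact follows the
# ELSER v2 write-up and is not reproduced exactly here.
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("intfloat/multilingual-e5-small")
quantized_model = torch.quantization.quantize_dynamic(
    model,              # float model to copy and quantize
    {torch.nn.Linear},  # layer types whose weights become int8
    dtype=torch.qint8,
)
```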
## Benchmarks
We ran a number of small benchmarks to assess both the change in retrieval quality and the inference latency of the optimized model against the original baseline.
### Quality
Measuring NDCG@10 on the dev split of the MIRACL datasets for a selection of languages, we see mostly marginal changes in quality from quantization; the notable exception is Yoruba (yo), which regresses more substantially. (A minimal sketch of the NDCG@10 computation follows the table.)
| model | de | yo | ru | ar | es | th |
| --- | --- | --- | --- | --- | --- | --- |
| multilingual-e5-small | 0.75862 | 0.56193 | 0.80309 | 0.82778 | 0.81672 | 0.85072 |
| multilingual-e5-small-optimized | 0.75992 | 0.48934 | 0.79668 | 0.82017 | 0.81350 | 0.84316 |
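For reference, one common formulation of NDCG@10 for a single query is sketched below; `ranked` and `rels` are illustrative stand-ins for a model's ranking and the MIRACL qrels.

```python
import math

def ndcg_at_10(ranked, rels):
    """NDCG@10 for one query.

    ranked: doc ids in the order the model returned them.
    rels:   dict mapping doc id -> graded relevance (from the qrels).
    """
    dcg = sum(rels.get(doc, 0) / math.log2(rank + 2)
              for rank, doc in enumerate(ranked[:10]))
    ideal = sorted(rels.values(), reverse=True)[:10]
    idcg = sum(rel / math.log2(rank + 2) for rank, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```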
To test English out-of-domain performance, we used the test splits of several datasets from the BEIR benchmark. Measuring NDCG@10, we see a larger change on SciFact but only marginal changes on the other datasets evaluated. (A sketch of a BEIR evaluation run follows the table.)
| model | FiQA | SciFact | NFCorpus |
| --- | --- | --- | --- |
| multilingual-e5-small | 0.33126 | 0.67700 | 0.31004 |
| multilingual-e5-small-optimized | 0.31734 | 0.65484 | 0.30126 |
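A hedged sketch of how such a run can be set up with the `beir` package is below. The SciFact download URL follows BEIR's published dataset pattern, and note that `beir`'s generic dense wrapper does not add the `query: `/`passage: ` prefixes that E5 models expect, so exact scores from this snippet may differ.

```python
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

# Download and load the SciFact test split (URL per BEIR's dataset index).
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Exact (brute-force) dense retrieval with cosine similarity.
retriever = EvaluateRetrieval(
    DRES(models.SentenceBERT("intfloat/multilingual-e5-small"), batch_size=64),
    score_function="cos_sim",
)
results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
print(ndcg["NDCG@10"])
```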
### Performance
Using a PyTorch model traced for Linux and Intel CPUs, we benchmarked inference latency across a range of input lengths; the table reports average inference time (lower is better) and the relative speedup. Overall, the optimized model is roughly 20-55% faster, with the largest gains on short inputs. (A sketch of the timing harness follows the table.)
| input length (characters) | multilingual-e5-small | multilingual-e5-small-optimized | speedup |
| --- | --- | --- | --- |
| 0 - 50 | 0.0181 | 0.00826 | 54.36% |
| 50 - 100 | 0.0275 | 0.0164 | 40.36% |
| 100 - 150 | 0.0366 | 0.0237 | 35.25% |
| 150 - 200 | 0.0435 | 0.0301 | 30.80% |
| 200 - 250 | 0.0514 | 0.0379 | 26.26% |
| 250 - 300 | 0.0569 | 0.043 | 24.43% |
| 300 - 350 | 0.0663 | 0.0513 | 22.62% |
| 350 - 400 | 0.0737 | 0.0576 | 21.85% |
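A minimal sketch of such a timing harness is below; the traced-model path and the forward-pass signature are assumptions for illustration, and a real run would bucket inputs by character length as in the table above.

```python
import time
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("intfloat/multilingual-e5-small")
# Hypothetical path to a TorchScript model traced for the target CPU.
model = torch.jit.load("multilingual-e5-small-optimized.pt")
model.eval()

def mean_latency(text: str, runs: int = 100) -> float:
    """Average seconds per forward pass for a single input string."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        start = time.perf_counter()
        for _ in range(runs):
            model(inputs["input_ids"], inputs["attention_mask"])  # assumed signature
    return (time.perf_counter() - start) / runs
```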
### Disclaimer
This E5 model, as defined, hosted, integrated, and used in conjunction with our other Elastic Software, is covered by our standard warranty.