The model's performance

#25
by drmeir - opened

I deployed this model with the entry-level Inference Endpoint, and it takes 7 seconds to compute 35 embeddings, i.e. only 5 embeddings per second. Is this normal for this model, or is the endpoint slower than it should be?
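For reference, a minimal sketch of how such a throughput number can be measured: time one batched call and divide the batch size by the elapsed time. The `fake_embed` function below is a hypothetical stand-in for the real endpoint call (e.g. an HTTP POST to your endpoint URL); the helper itself works with any callable.

```python
import time

def measure_throughput(embed_fn, texts):
    """Time a single batched call and return embeddings per second."""
    start = time.perf_counter()
    embed_fn(texts)
    elapsed = time.perf_counter() - start
    return len(texts) / elapsed

# Hypothetical stand-in for a real endpoint call
# (e.g. requests.post(ENDPOINT_URL, json={"inputs": texts})).
def fake_embed(texts):
    time.sleep(0.01)  # simulate network + compute latency
    return [[0.0] * 384 for _ in texts]

rate = measure_throughput(fake_embed, ["hello"] * 35)
print(f"{rate:.1f} embeddings/second")
```

With the figures from the post (35 embeddings in 7 s), this works out to 35 / 7 = 5 embeddings per second.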
