Mistral-Inference package

#32
by swtb - opened

Can someone summarise the key advantages of using mistral-inference? Are there known performance issues with the transformers version? In an experiment I am running, Mistral Instruct is slower than Llama and Phi, and I am wondering whether this is really because of the Mistral model itself or because of the transformers implementation.
