Running it on an Apple MBP M3 - non-quantized
#14 opened by christianweyer
We are really loving the results we get with the online demo (https://llava.hliu.cc). Kudos!
However, when we try to run it, e.g. as an fp16 conversion, on our M3 Max with llama.cpp or Ollama, we get very poor results (apparently due to some still-pending PRs in llama.cpp).
How are people running it on a Mac without quantization to get the same results as the original demo?
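For context, this is the direction we would expect to work: a minimal sketch using the transformers MPS backend with unquantized fp16 weights. Note that the model id (`llava-hf/llava-1.5-7b-hf`), the prompt template, and the image path are our assumptions, not something confirmed to match the demo's setup:

```python
# Hypothetical sketch (our assumption, not a confirmed recipe): running
# unquantized LLaVA 1.5 weights on Apple Silicon via the transformers MPS backend.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed HF-converted checkpoint

processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 weights, no quantization
).to("mps")                     # Apple Silicon GPU

image = Image.open("example.jpg")  # placeholder image path
# Prompt template used by the llava-hf conversions
prompt = "USER: <image>\nWhat is shown in this picture? ASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt")
inputs = {k: v.to("mps") for k, v in inputs.items()}
inputs["pixel_values"] = inputs["pixel_values"].to(torch.float16)  # match model dtype

output = model.generate(**inputs, max_new_tokens=200)
print(processor.decode(output[0], skip_special_tokens=True))
```

Is something along these lines what people are doing, or is there a better path?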
Thanks!