Other versions?

#1
by wise-time - opened

A q8_0 version would be grand.

See the README. When this works, I'll upload it.

I don't think you can compile https://github.com/jploski/ggml (falcon-ggml) on Windows. Under Linux it looks like it just builds a libggml.a library.

I put instructions in the README for how to compile - it builds a bunch of stuff including the command line utility that you can use to do inference.

But yes I've only tested on Linux and don't know if it works on Windows also. I would assume it does, assuming you have a compiler and cmake installed. But I've not tested it.

I compiled it and got command-line inference working on Windows (MSVC 2019, Visual Studio 2022). It built the files with no issues about 3 days ago.

I'm getting a bunch of errors. I was able to bypass some by adding #define restrict to ggml.c, but I got no main.exe. Did you have to modify your code at all? Did you not use cmake to create the project files? How were you able to convert it? Would you mind sharing the main.exe?

I compiled successfully under Windows with the MSYS2 UCRT environment, without the CUDA stuff.
You have to change the
cmake -DGGML_CUBLAS=1 ..
to
cmake .. -G "MSYS Makefiles"
Model runs fine, though still slow - let's see how performance is optimized over time.
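
For anyone scripting this, the generator switch above can be sketched as a small shell helper. This is only a sketch: the function name configure_cmd is made up for illustration, and using the MSYSTEM variable (which MSYS2 shells set, e.g. to UCRT64) to detect MSYS2 is my assumption, not something from the README. Outside MSYS2 it just prints the CUDA command quoted above, which assumes CUDA is installed.

```shell
# Hypothetical helper: print the cmake configure command for the current shell.
# It only prints the command; it does not run cmake itself.
configure_cmd() {
    if [ -n "${MSYSTEM:-}" ]; then
        # Assumption: a non-empty MSYSTEM means we are inside an MSYS2 shell,
        # so do a CPU-only build with the MSYS Makefiles generator.
        printf '%s\n' 'cmake .. -G "MSYS Makefiles"'
    else
        # Otherwise keep the CUDA-enabled command quoted in the post above.
        printf '%s\n' 'cmake -DGGML_CUBLAS=1 ..'
    fi
}

configure_cmd
```

Run it from inside the build directory, for example with eval "$(configure_cmd)", then continue with the build step as usual.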

I managed to get it working with msys as well,
git clone https://github.com/cmp-nct/ggllm.cpp
cd ggllm.cpp
rm -rf build && mkdir build && cd build && cmake .. && cmake --build . --config Release

btw
having a 7b Model would be nice too

Ah, you are still so reluctant to publish 7b Falcons ...
Thanks again for the 40b Models, but 7b would be fine too - it's so much faster in testing, and it's already better than any llama 7b.

Pleeease ...
