Update README.md
README.md
@@ -39,7 +39,7 @@ The can be used with a new fork of llama.cpp that adds Falcon GGML support: [cmp
 <!-- compatibility_ggml start -->
 ## Compatibility
 
-To build cmp-nct's fork of llama.cpp with Falcon 40B support plus preliminary CUDA acceleration, please
+To build cmp-nct's fork of llama.cpp with Falcon 40B support plus preliminary CUDA acceleration, please try the following steps:
 
 ```
 git clone https://github.com/cmp-nct/ggllm.cpp
@@ -48,12 +48,16 @@ git checkout cuda-integration
 rm -rf build && mkdir build && cd build && cmake -DGGML_CUBLAS=1 .. && cmake --build . --config Release
 ```
 
-
+Compiling on Windows: developer cmp-nct notes: 'I personally compile it using VScode. When compiling with CUDA support using the Microsoft compiler it's essential to select the "Community edition build tools". Otherwise CUDA won't compile.'
+
+Once compiled you can then use `bin/falcon_main` just like you would use llama.cpp. For example:
 ```
-bin/falcon_main -t
+bin/falcon_main -t 8 -ngl 100 -m /workspace/wizard-falcon40b.ggmlv3.q3_K_S.bin -p "What is a falcon?\n### Response:"
 ```
 
-
+Using `-ngl 100` will offload all layers to GPU. If you do not have enough VRAM for this, either lower the number or try a smaller quant size as otherwise performance will be severely affected.
+
+Adjust `-t 8` according to what performs best on your system. Do not exceed the number of physical CPU cores you have.
 
 <!-- compatibility_ggml end -->
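The added README text advises keeping `-t` at or below the number of physical CPU cores. A minimal sketch of that check follows, assuming a POSIX shell and, for the physical core count, a Linux system with `lscpu`. The helper names `physical_cores` and `clamp_threads` are hypothetical and not part of ggllm.cpp. The script only prints the resulting command, so it can be tried without a compiled binary or a model file.

```shell
#!/bin/sh
# Sketch: pick a -t value that does not exceed the physical core count.
# physical_cores and clamp_threads are illustrative helpers, not ggllm.cpp tools.

# Count physical cores. Uses lscpu when available (assumed Linux); otherwise
# falls back to the logical CPU count, which can overcount on SMT systems.
physical_cores() {
    if command -v lscpu >/dev/null 2>&1; then
        # Each unique (core, socket) pair is one physical core.
        lscpu --parse=Core,Socket | grep -v '^#' | sort -u | wc -l | tr -d ' '
    else
        getconf _NPROCESSORS_ONLN
    fi
}

# Print the smaller of the requested thread count ($1) and the core count ($2).
# If the core count could not be determined (< 1), keep the requested value.
clamp_threads() {
    requested=$1
    cores=$2
    if [ "$cores" -ge 1 ] && [ "$requested" -gt "$cores" ]; then
        echo "$cores"
    else
        echo "$requested"
    fi
}

MODEL=/workspace/wizard-falcon40b.ggmlv3.q3_K_S.bin  # path taken from the README example
THREADS=$(clamp_threads 8 "$(physical_cores)")
NGL=100  # lower this if the model does not fit in VRAM

# Print the command rather than run it; printf '%s' leaves the \n in the prompt untouched.
printf '%s\n' "bin/falcon_main -t $THREADS -ngl $NGL -m $MODEL -p \"What is a falcon?\\n### Response:\""
```

The clamp only enforces the upper bound. The best value may still be lower, so it is worth timing a few `-t` settings on the target machine.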