Update README.md
README.md
CHANGED
@@ -30,7 +30,8 @@ Please note that these GGMLs are **not compatible with llama.cpp, or currently w
 
 ## Prompt template: orca
 
-
+```
+<system>: You are a helpful assistant
 
 <human>: {prompt}
 
@@ -66,7 +67,6 @@ As other options become available I will endeavour to update them here (do let m
 | mpt-30b-dolphin-v2.ggmlv1.q5_1.bin | q5_1 | 5 | 22.47 GB| 24.97 GB | 5-bit. Even higher accuracy, resource usage and slower inference. |
 | mpt-30b-dolphin-v2.ggmlv1.q8_0.bin | q8_0 | 8 | 31.83 GB| 34.33 GB | 8-bit. Almost indistinguishable from float16. High resource use and slow. Not recommended for most users. |
 
-
 **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
 
 <!-- footer start -->
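For illustration, the orca template added in the first hunk could be assembled with a minimal sketch like the following. Only the `<system>:` and `<human>:` lines appear in the hunk; the function name is an assumption, and any trailing assistant-turn marker falls outside the diff, so none is emitted here.

```python
def make_orca_prompt(user_prompt: str,
                     system_message: str = "You are a helpful assistant") -> str:
    """Assemble a prompt following the orca template shown in the diff.

    The <system>: and <human>: tags and the default system message come
    from the README hunk; everything else here is illustrative.
    """
    return f"<system>: {system_message}\n\n<human>: {user_prompt}\n"


# Example usage:
# make_orca_prompt("Summarize this text.")
```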