TheBloke committed
Commit 3fe0eae
1 Parent(s): 6e41676

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -30,7 +30,8 @@ Please note that these GGMLs are **not compatible with llama.cpp, or currently w
 
 ## Prompt template: orca
 
-```<system>: You are a helpful assistant
+```
+<system>: You are a helpful assistant
 
 <human>: {prompt}
 
@@ -66,7 +67,6 @@ As other options become available I will endeavour to update them here (do let m
 | mpt-30b-dolphin-v2.ggmlv1.q5_1.bin | q5_1 | 5 | 22.47 GB| 24.97 GB | 5-bit. Even higher accuracy, resource usage and slower inference. |
 | mpt-30b-dolphin-v2.ggmlv1.q8_0.bin | q8_0 | 8 | 31.83 GB| 34.33 GB | 8-bit. Almost indistinguishable from float16. High resource use and slow. Not recommended for most users. |
 
-
 **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
 
 <!-- footer start -->
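
The first hunk splits the opening code fence off the system line, so the template renders as a proper fenced block. To make the format concrete, here is a minimal sketch (the `build_orca_prompt` helper is hypothetical, not part of this repo) that fills the template exactly as the hunk shows it; the hunk ends at the `<human>` line, so any trailing assistant tag the full README may define is not reproduced:

```python
# Hypothetical helper, not from the repo: fill in the orca-style
# template exactly as it appears in the hunk above.
def build_orca_prompt(prompt: str,
                      system: str = "You are a helpful assistant") -> str:
    # The hunk cuts off after the <human> line, so anything the full
    # README appends after it is intentionally omitted here.
    return f"<system>: {system}\n\n<human>: {prompt}\n"

print(build_orca_prompt("What do the GGML quantisation levels trade off?"))
```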
 
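In the table above, each "Max RAM required" figure sits exactly 2.50 GB above the corresponding file size, i.e. the weights plus a fixed overhead. As a rough illustration of the offloading note (the even-split-across-layers assumption and the 48-layer count for MPT-30B are mine, not the README's), a sketch of the resulting RAM estimate:

```python
# Back-of-envelope sketch, not from the README: assume the weights are
# spread evenly across layers, so offloading n of total_layers layers
# moves roughly file_size_gb * n / total_layers from RAM to VRAM. The
# 2.50 GB gap between the table's file sizes and its Max RAM figures is
# treated here as a fixed CPU-side overhead.
def estimate_ram_gb(file_size_gb: float,
                    layers_offloaded: int = 0,
                    total_layers: int = 48,   # assumed layer count for MPT-30B
                    overhead_gb: float = 2.5) -> float:
    on_cpu = file_size_gb * (1 - layers_offloaded / total_layers)
    return on_cpu + overhead_gb

# q5_1 file from the table with half the layers offloaded:
print(f"{estimate_ram_gb(22.47, layers_offloaded=24):.1f} GB")  # ≈ 13.7 GB
```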