alvarobartt (HF staff) committed
Commit: bec663b
1 Parent(s): 15d745b

Update README.md

Files changed (1)
1. README.md (+10 -4)
README.md CHANGED
@@ -27,7 +27,8 @@ LLM outputs on helpfulness, truthfulness, honesty, and to what extent the answer
 UltraCM-13B is a 13b param LLM that was released by [OpenBMB](https://huggingface.co/openbmb), as part of their paper
 [UltraFeedback: Boosting Language Models with High-quality Feedback](https://arxiv.org/abs/2310.01377).
 
-This model contains the quantized variants using the GGUF format, introduced by the [llama.cpp](https://github.com/ggerganov/llama.cpp) team.
+This model contains the quantized variants using the GGUF format, introduced by the [llama.cpp](https://github.com/ggerganov/llama.cpp) team,
+and is also heavily inspired by [TheBloke](https://huggingface.co/TheBloke)'s work on quantizing most of the LLMs out there.
 
 ### Model Details
 
@@ -45,13 +46,18 @@ This model contains the quantized variants using the GGUF format, introduced by
 
 | Name | Quant method | Bits | Size | Max RAM required | Use case |
 | ---- | ---- | ---- | ---- | ---- | ----- |
+| [UltraCM-13b.q4_0.gguf](https://huggingface.co/alvarobartt/UltraCM-13b-GGUF/blob/main/UltraCM-13b.q4_0.gguf) | Q4_0 | 4 | 3.83 GB | 6.33 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
 | [UltraCM-13b.q4_k_s.gguf](https://huggingface.co/alvarobartt/UltraCM-13b-GGUF/blob/main/UltraCM-13b.q4_k_s.gguf) | Q4_K_S | 4 | 7.41 GB | 9.91 GB | small, greater quality loss |
-| [UltraCM-13b.q4_k_m.gguf](https://huggingface.co/alvarobartt/UltraCM-13b.GGUF/blob/main/UltraCM-13b.q4_k_m.gguf) | Q4_K_M | 4 | 7.87 GB | 10.37 GB | medium, balanced quality - recommended |
-| [UltraCM-13b.q5_k_s.gguf](https://huggingface.co/alvarobartt/UltraCM-13b.GGUF/blob/main/UltraCM-13b.q5_k_s.gguf) | Q5_K_S | 5 | 8.97 GB | 11.47 GB | large, low quality loss - recommended |
-| [UltraCM-13b.q5_k_m.gguf](https://huggingface.co/alvarobartt/UltraCM-13b.GGUF/blob/main/UltraCM-13b.q5_k_m.gguf) | Q5_K_M | 5 | 9.23 GB | 11.73 GB | large, very low quality loss - recommended |
+| [UltraCM-13b.q4_k_m.gguf](https://huggingface.co/alvarobartt/UltraCM-13b-GGUF/blob/main/UltraCM-13b.q4_k_m.gguf) | Q4_K_M | 4 | 7.87 GB | 10.37 GB | medium, balanced quality - recommended |
+| [UltraCM-13b.q5_0.gguf](https://huggingface.co/alvarobartt/UltraCM-13b-GGUF/blob/main/UltraCM-13b.q5_0.gguf) | Q5_0 | 5 | 4.65 GB | 7.15 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
+| [UltraCM-13b.q5_k_s.gguf](https://huggingface.co/alvarobartt/UltraCM-13b-GGUF/blob/main/UltraCM-13b.q5_k_s.gguf) | Q5_K_S | 5 | 8.97 GB | 11.47 GB | large, low quality loss - recommended |
+| [UltraCM-13b.q5_k_m.gguf](https://huggingface.co/alvarobartt/UltraCM-13b-GGUF/blob/main/UltraCM-13b.q5_k_m.gguf) | Q5_K_M | 5 | 9.23 GB | 11.73 GB | large, very low quality loss - recommended |
 
 **Note**: the above RAM figures assume no GPU offloading. If layers are offloaded to the GPU, this will reduce RAM usage and use VRAM instead.
 
+For more information on quantization, I'd highly suggest checking out [TheBloke](https://huggingface.co/TheBloke), as well as joining [their
+Discord server](https://discord.gg/Jq4vkcDakD).
+
 ### Uses
 
 #### Direct Use
 
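As a quick sketch of how any of the quants in the table above can be fetched programmatically (this assumes the `huggingface_hub` Python package; the repo id and filename come straight from the table, everything else is illustrative):

```python
# Download one of the GGUF quants listed in the table above.
# Assumes: pip install huggingface-hub
from huggingface_hub import hf_hub_download

# Q4_K_M is the "medium, balanced quality - recommended" quant from the table.
model_path = hf_hub_download(
    repo_id="alvarobartt/UltraCM-13b-GGUF",
    filename="UltraCM-13b.q4_k_m.gguf",
)
print(model_path)  # local path inside the Hugging Face cache
```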
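And a minimal sketch of the GPU offloading mentioned in the **Note** above, using `llama-cpp-python` (assuming a build with GPU support; the context size, layer count, and prompt here are illustrative assumptions, not the model's official prompt template):

```python
# Load a GGUF quant with llama-cpp-python, offloading layers to the GPU.
# Offloaded layers live in VRAM instead of RAM, which is why the RAM
# figures in the table above shrink when n_gpu_layers > 0.
# Assumes: pip install llama-cpp-python (built with GPU support, e.g. cuBLAS)
from llama_cpp import Llama

llm = Llama(
    model_path="UltraCM-13b.q4_k_m.gguf",  # e.g. the path returned by hf_hub_download
    n_ctx=2048,       # illustrative context window
    n_gpu_layers=35,  # illustrative; set to 0 for CPU-only inference
)

output = llm(
    "Write a short critique of the following answer: ...",  # placeholder prompt
    max_tokens=256,
)
print(output["choices"][0]["text"])
```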