nisten committed on
Commit 2a9427e • 1 Parent(s): 0b1ab1e

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -7,7 +7,7 @@ base_model: [mattshumer/Reflection-Llama-3.1-70B]
  # This gets 99.96% perplexity at 50gb filesize whereas fp8 (not tested on this model) is known to be 97-98.8%
 
 
- Only posting one quant because it's really annoying to make these and I haven't automated it yet, takes 30+ iterations of models as I have to recompile llama.cpp every build/test step until the lowest weight configs are found.
+ Only posting one quant because it's really annoying to make these and I haven't automated it yet, takes 30+ iterations of models as I have to recompile llama.cpp every build/test step until the lowest perplexity loss per weight quantization configs are found. End result is... saves 5gb of space vs regular q6_k
 
  >🐧 To download faster on Linux `sudo apt install -y aria2`
  >🍎 On Mac `brew install aria2`
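
The commit does not define the "99.96% perplexity" figure in the README heading. One plausible reading is the unquantized model's perplexity expressed as a percentage of the quantized model's perplexity. The sketch below assumes that reading; the function name `perplexity_retention` and the sample values are hypothetical and are not measurements from this model.

```python
def perplexity_retention(baseline_ppl: float, quantized_ppl: float) -> float:
    """Return baseline perplexity as a percentage of quantized perplexity.

    Assumption: lower perplexity is better, so a quantized model that
    matches the baseline exactly scores 100.0.
    """
    if baseline_ppl <= 0 or quantized_ppl <= 0:
        raise ValueError("perplexity values must be positive")
    return 100.0 * baseline_ppl / quantized_ppl


# Hypothetical values for illustration only, not results from the README.
baseline = 5.000
quantized = 5.002
print(f"{perplexity_retention(baseline, quantized):.2f}%")
```

Under this definition, comparing two quantization configs of similar file size comes down to picking the one with the higher retention value.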