Update README.md
README.md CHANGED
@@ -8,6 +8,12 @@ These files are the result of merging the [delta weights](https://huggingface.co
The code for merging is provided in the [WizardLM official Github repo](https://github.com/nlpxucan/WizardLM).

+The original WizardLM deltas are in float32, which results in an HF repo that is also float32 and therefore much larger than a normal 7B Llama model.
+
+Therefore, for this repo I converted the model to float16 to produce a standard-size 7B model.
+
+This was achieved by running **`model = model.half()`** prior to saving.
+
## WizardLM-7B HF
This repo contains the full unquantised model files in HF format for GPU inference and as a base for quantisation/conversion.
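
For reference, here is a minimal sketch of the float32 → float16 conversion described in the added lines, assuming the Hugging Face `transformers` API. The paths are hypothetical placeholders, not this repo's actual paths.

```python
# Minimal sketch: load the merged float32 model, cast to float16, and save.
# The paths below are hypothetical placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

merged_path = "path/to/merged-wizardlm-7b"   # float32 result of merging the WizardLM deltas
output_path = "path/to/wizardlm-7b-fp16"     # destination for the float16 model

model = AutoModelForCausalLM.from_pretrained(merged_path)
tokenizer = AutoTokenizer.from_pretrained(merged_path)

model = model.half()                         # cast all weights to float16

model.save_pretrained(output_path)
tokenizer.save_pretrained(output_path)
```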