Adding a note to the README that the LlamaTokenizerFast is not included in this build, so the Inference API will not work. Please use the LlamaTokenizerFast from Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct to use this model at this time.
README.md
CHANGED
@@ -12,6 +12,14 @@ This model was merged with the following HuggingFace TinyLlama models using ties
 - Tensoic/TinyLlama-1.1B-3T-openhermes
 - Josephgflowers/TinyLlama-3T-Cinder-v1.3
 
+## Why does the Inference API on HuggingFace not work for this merged model?
+
+The included [merge python script](https://huggingface.co/matlok/tinyllama-cinder-openhermes-32k/blob/main/run-tiny-merge.py) does not contain the **LlamaTokenizerFast** tokenizer. This means the HuggingFace Inference API will not work. The tokenizer to use with this model is:
+
+```
+TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
+```
+
 ## How do I fine-tune this model?
 
 Please refer to the Unsloth fine-tuning guide for:
@@ -20,8 +28,6 @@ Please refer to the Unsloth fine-tuning guide for:
 
 ## How do I generate my own model merges?
 
-Here's [the standalone python script](https://huggingface.co/matlok/tinyllama-cinder-openhermes-32k/blob/main/run-tiny-merge.py) used with logs below:
-
 ```python3
 #!/usr/bin/env python3
```
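The added README note says the merged weights must be paired with a tokenizer hosted in a separate repo. A minimal sketch of what that looks like in practice, assuming the `transformers` library is installed (the repo IDs are taken from the README text above, `load_merged_model` is a hypothetical helper name, and downloading the weights requires network access):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def load_merged_model(
    model_repo: str = "matlok/tinyllama-cinder-openhermes-32k",
    tokenizer_repo: str = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T",
):
    """Pair the merged weights with a tokenizer from a separate repo,
    since the merge script did not bundle a LlamaTokenizerFast."""
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_repo)
    model = AutoModelForCausalLM.from_pretrained(model_repo)
    return tokenizer, model
```

This is the same workaround the commit message describes: load the model and tokenizer from different repos until a tokenizer is bundled with the merge.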