LM Studio crash
Same here, on Windows with a 4080.
May be related: I'm getting an error with llama.cpp, using mixtral-8x7b-instruct-v0.1.Q6_K.gguf:
error loading model: create_tensor: tensor 'blk.0.ffn_gate.weight' not found
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model './models/mixtral-8x7b-instruct-v0.1.Q6_K.gguf'
You currently need a special build of llama.cpp (the mixtral branch), as support has not been merged into master yet.
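In case it helps, a rough sketch of building that branch and testing the model with it directly (the model path is just an example, adjust to wherever your GGUF lives):

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
git checkout mixtral
make
# smoke test: should load the model and generate a few tokens
./main -m ./models/mixtral-8x7b-instruct-v0.1.Q6_K.gguf -p "Hello" -n 32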
I'm getting an error in Text Generation WebUI:
shared.model, shared.tokenizer = load_model(shared.model_name, loader)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\llamaindex_text_generation_webui\text-generation-webui\modules\models.py", line 88, in load_model
output = load_func_map[loader](model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\llamaindex_text_generation_webui\text-generation-webui\modules\models.py", line 253, in llamacpp_loader
model, tokenizer = LlamaCppModel.from_pretrained(model_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\llamaindex_text_generation_webui\text-generation-webui\modules\llamacpp_model.py", line 91, in from_pretrained
result.model = Llama(**params)
^^^^^^^^^^^^^^^
File "C:\dev\llamaindex_text_generation_webui\text-generation-webui\installer_files\env\Lib\site-packages\llama_cpp_cuda\llama.py", line 923, in init
self._n_vocab = self.n_vocab()
^^^^^^^^^^^^^^
File "C:\dev\llamaindex_text_generation_webui\text-generation-webui\installer_files\env\Lib\site-packages\llama_cpp_cuda\llama.py", line 2184, in n_vocab
return self._model.n_vocab()
^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\llamaindex_text_generation_webui\text-generation-webui\installer_files\env\Lib\site-packages\llama_cpp_cuda\llama.py", line 250, in n_vocab
assert self.model is not None
^^^^^^^^^^^^^^^^^^^^^^
AssertionError
I'll see if I can try something with llama.cpp as mentioned above.
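A quick way to tell whether it's the WebUI or its bundled llama-cpp-python: try loading the GGUF directly with the same package the traceback points at (llama_cpp_cuda), from inside the WebUI's env; the model path below is just an example:

# if this fails the same way, the bundled llama-cpp-python simply predates Mixtral support
python -c "from llama_cpp_cuda import Llama; Llama(model_path='./models/mixtral-8x7b-instruct-v0.1.Q6_K.gguf')"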
I built that specific "mixtral" branch (https://github.com/ggerganov/llama.cpp/tree/mixtral), but I've never used a llama.cpp build of my own with TextGenWebUI. Are there any steps or docs?
# First go to your textwebui directory
cd yourdirectoryhere
# Activate the conda env: copy the launcher, then remove the last line of
# activate_conda.sh (the one that actually starts the server)
cp ./start_linux.sh activate_conda.sh
chmod +x activate_conda.sh
./activate_conda.sh
# Go to the repositories directory and clone llama-cpp-python
cd repositories/
git clone https://github.com/abetlen/llama-cpp-python
cd llama-cpp-python
# Delete the usual vendored llama.cpp and get the new one
cd vendor
rm -R llama.cpp/
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
git checkout mixtral
# Build and install the good stuff; check here for more details: https://github.com/abetlen/llama-cpp-python
cd ../../
# Uninstall the old llama-cpp-python first
pip list  # look for llama_cpp_python (and llama_cpp_python_cuda) or something similar
pip uninstall llama_cpp_python
# Install the new one (this assumes CUDA/cuBLAS; a plain `pip install .` works for CPU-only)
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install .
And you should be good! I'm not near my computer, so I'm writing this from memory. Don't hesitate to correct this and I will edit.
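If you want to double-check the install before relaunching the WebUI, something like this should now load the model without errors (adjust the path to your GGUF):

# sanity check for the rebuilt package, run from the activated conda env
python -c "from llama_cpp import Llama; Llama(model_path='./models/mixtral-8x7b-instruct-v0.1.Q6_K.gguf', n_ctx=512)"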
Downloaded and installed the new LM Studio this morning, which has experimental support for this model. LM Studio no longer gives me that error immediately, but I still get a "failed to load model" error after it tries to load for some time.
Working for me with LM Studio v0.2.9. So far, I'm really impressed with this one.