eformat/granite-3.0-8b-instruct-Q4_K_M-GGUF
Not all tools (vllm, llama.cpp) seem to support the new model config params it seems (25/10/2024).
# config.json
"model_type": "granite"
"architectures": [
"GraniteForCausalLM"
]
This gguf conversion done using old ones
# config.json
"model_type": "llama"
"architectures": [
"LlamaForCausalLM"
]
This gguf loads OK - tested using:
# llama.cpp
./llama-server --verbose --gpu-layers 99999 --parallel 2 --ctx-size 4096 -m ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
# vllm
vllm serve ~/instructlab/models/granite-3.0-8b-instruct-Q4_K_M.gguf
- Downloads last month
- 43
Model tree for eformat/granite-3.0-8b-instruct-Q4_K_M-GGUF
Base model
ibm-granite/granite-3.0-8b-base
Finetuned
ibm-granite/granite-3.0-8b-instruct