Quantized from: TokenBender/llama2-7b-chat-hf-codeCherryPop-qLoRA-merged
Converted to the GGML format with: llama.cpp master-b5fe67f (July 22, 2023)
Tested with: koboldcpp 1.36
Example usage:
koboldcpp.exe llama2-7b-chat-hf-codeCherryPop-qLoRA-merged-ggmlv3.Q6_K.bin --threads 6 --contextsize 4096 --stream --smartcontext --unbantokens --ropeconfig 1.0 10000 --noblas
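Once launched, koboldcpp also serves its local web UI and a KoboldAI-compatible HTTP API. As a minimal sketch for checking that the model loaded, assuming koboldcpp's default port (5001) and the KoboldAI-style /api/v1/model endpoint (both depend on your launch settings):

```python
# Minimal sketch: ask a locally running koboldcpp instance which model it loaded.
# Port 5001 and the /api/v1/model endpoint are assumptions based on koboldcpp's
# KoboldAI-compatible API; adjust host/port if you changed them at launch.
import requests

resp = requests.get("http://localhost:5001/api/v1/model", timeout=10)
resp.raise_for_status()
print(resp.json())  # reports the name of the currently loaded model
```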
Tested with the following prompt format (refer to the original model card and Stanford Alpaca for additional details; a sketch of using it programmatically follows the template):
### Instruction:
{code request}
### Response:
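As a rough illustration of the template above, the snippet below wraps a code request in the Instruction/Response format and sends it to a running koboldcpp instance. The port, the /api/v1/generate endpoint, and the generation parameters are assumptions based on the KoboldAI-style API that koboldcpp exposes; verify them against your koboldcpp version and settings.

```python
# Sketch: generate with the Alpaca-style Instruction/Response template via a
# local koboldcpp server. Port 5001, the /api/v1/generate endpoint, and the
# parameter names are assumptions based on koboldcpp's KoboldAI-compatible API.
import requests

def build_prompt(code_request: str) -> str:
    # Mirrors the template shown above: an Instruction block with the code
    # request, then an empty Response block for the model to complete.
    return f"### Instruction:\n{code_request}\n\n### Response:\n"

payload = {
    "prompt": build_prompt("Write a Python function that reverses a string."),
    "max_context_length": 4096,  # matches --contextsize 4096 in the example launch
    "max_length": 256,
    "temperature": 0.7,
}

resp = requests.post("http://localhost:5001/api/v1/generate", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```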