Original model: https://huggingface.co/Doctor-Shotgun/Nous-Capybara-limarpv3-34B
Quantized using erotiquantxl.parquet as the calibration dataset.
4.65bpw: can run 21k context in ~23.3GB of VRAM with the 8-bit cache option enabled.
Prompt format (extended Alpaca with a length modifier):
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input}
### Response: (length = long)
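
The template above can be filled in programmatically. The following is a minimal sketch, not part of the original model card: the function name `build_prompt` and its parameters are hypothetical, and joining the sections with single newlines is an assumption, since the card does not specify blank-line placement. Only the `long` length value appears in the card; any other value passed in is the caller's assumption.

```python
# Preamble copied verbatim from the prompt format above.
PREAMBLE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request."
)


def build_prompt(instruction: str, input_text: str, length: str = "long") -> str:
    """Fill the extended Alpaca template (hypothetical helper).

    Sections are joined with single newlines; this spacing is an assumption.
    """
    lines = [
        PREAMBLE,
        "### Instruction:",
        instruction,
        "### Input:",
        input_text,
        # The length modifier is appended to the response header.
        f"### Response: (length = {length})",
    ]
    # Trailing newline so the model's reply starts on a fresh line.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_prompt("Continue the story.", "Chapter one ends at the harbor."))
```

The returned string ends with the response header, so it can be passed as-is to whichever inference frontend is used to load the quantized model.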