Max position embeddings
I think that for CodeLlama, "max_position_embeddings": 16384, is the correct line; 4096 is for Llama 2.
This is true; however, the fine-tuning data only had items going up to 4k context. I can change it back to 16384, but results may not be great beyond 4k.
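For anyone who wants to double-check the values being discussed, here's a minimal sketch that reads the two relevant config fields; the base CodeLlama repo id is just used for illustration, so swap in whichever repo you actually care about:

```python
# Minimal sketch: inspect the context-related config fields that differ
# between Llama 2 and CodeLlama. The repo id here is only an example.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("codellama/CodeLlama-34b-hf")

print(cfg.max_position_embeddings)  # 16384 for CodeLlama; Llama 2 ships with 4096
print(cfg.rope_theta)               # 1000000.0 for CodeLlama; Llama 2 uses 10000.0
```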
Can fine-tuning at 4k affect CodeLlama's RoPE, when the base model was trained at 16k?
Anyway, that's what I have in TheBloke's GGUF quants of SB 2.2:
llm_load_print_meta: n_ctx_train = 16384
llm_load_print_meta: n_ctx = 16384 (my pick)
llm_load_print_meta: n_embd = 8192
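For what it's worth, n_ctx_train above comes from the GGUF metadata, while n_ctx is simply whatever you pass at load time; a minimal llama-cpp-python sketch, where the file name is an assumption:

```python
# Sketch: the runtime context window is chosen at load time via n_ctx,
# independently of the n_ctx_train value stored in the GGUF metadata.
from llama_cpp import Llama

llm = Llama(
    model_path="spicyboros-c34b-2.2.Q4_K_M.gguf",  # assumed filename for the quant
    n_ctx=16384,                                   # the "my pick" value above
)
print(llm.n_ctx())  # 16384
```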
It's difficult for us laypeople to follow all these changes from the base model to the fine-tune (that part I understand, even if I wonder whether the fine-tuning completely overrides the base model's initial training), and then to the quant, which displays different values! ^^
Anyway, thanks for your amazing work, Jon. I've been hooked since your very first version, which I downloaded as soon as it was on HF.
It's probably fine TBH, I was just playing it safe.
I don't think the fine-tuning at 4k would completely degrade the 16k performance. I would imagine the model will be reluctant to produce more than 4k tokens, and things like contextual question answering may suffer beyond that point simply due to the lack of fine-tuning data at those lengths, but I haven't had the time to really analyze it.
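One rough way to probe that (just a sketch, not a proper benchmark; the file name, filler text, and token counts are assumptions) is to bury a fact well past the 4k mark and see whether the model can still retrieve it:

```python
# Rough long-context probe: plant a "needle" beyond the 4k fine-tuning length
# and check whether the model can answer a question about it.
from llama_cpp import Llama

llm = Llama(model_path="spicyboros-c34b-2.2.Q4_K_M.gguf", n_ctx=16384)  # assumed filename

filler = "The quick brown fox jumps over the lazy dog. " * 800  # very roughly 7-8k tokens
needle = "For the record, the access code is 7431. "
prompt = filler + needle + "\nQuestion: What is the access code?\nAnswer:"

out = llm(prompt, max_tokens=16)
print(out["choices"][0]["text"])
```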
Glad you like the models!