
InfinityKuno-2x7B

GGUF-Imatrix quantizations of InfinityKuno-2x7B

Experimental model built from Endevor/InfinityRP-v1-7B and SanjiWatsuki/Kunoichi-DPO-v2-7B, merged into a Mixture-of-Experts (MoE) model with 2x7B parameters.

Perplexity

Measured with llama.cpp's perplexity tool on a private roleplay dataset.

| Format | PPL |
|--------|-----|
| FP16 | 3.2686 +/- 0.12496 |
| Q8_0 | 3.2738 +/- 0.12570 |
| Q5_K_M | 3.2589 +/- 0.12430 |
| IQ4_NL | 3.2689 +/- 0.12487 |
| IQ3_M | 3.3097 +/- 0.12233 |
| IQ2_M | 3.4658 +/- 0.13077 |
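For reference, perplexity is the exponential of the average negative log-likelihood per token, which is what llama.cpp reports above. A minimal sketch of that definition (the log-probability values here are hypothetical, not taken from this model):

```python
import math

def perplexity(token_logprobs):
    # PPL = exp(-(1/N) * sum of natural-log token probabilities)
    n = len(token_logprobs)
    return math.exp(-sum(token_logprobs) / n)

# Hypothetical per-token log-probabilities from a model run
logprobs = [-1.2, -0.8, -1.5, -1.0]
print(perplexity(logprobs))  # lower is better
```

Lower PPL means the model assigns higher probability to the held-out text, which is why the quantized rows above are compared against the FP16 baseline.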

Prompt format:

Alpaca, Extended Alpaca, Roleplay-Alpaca. (Any Alpaca-based prompt formatting should work fine.)
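As a sketch, an Alpaca-style prompt uses `### Instruction:` / `### Response:` headers; the builder below uses the standard Alpaca preamble, and the sample instruction is just a hypothetical example:

```python
def alpaca_prompt(instruction: str, response: str = "") -> str:
    # Standard Alpaca layout: preamble, instruction block, response block.
    # Leaving `response` empty yields a generation prompt for the model.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n{response}"
    )

print(alpaca_prompt("Introduce yourself."))
```

Extended and roleplay variants typically add extra sections (e.g. character or system context) before the instruction block, but keep the same header style.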


GGUF metadata:

- Model size: 12.9B params
- Architecture: llama
- Quantizations available: 4-bit, 5-bit, 8-bit
