63.17 MMLU-Pro Computer Science with `Q8_0`
by ubergarm
llama.cpp:

```bash
$ ./llama-server \
    --model "../models/bartowski/SuperNova-Medius-GGUF/SuperNova-Medius-Q8_0.gguf" \
    --n-gpu-layers 49 \
    --ctx-size 40960 \
    --parallel 10 \
    --cache-type-k f16 \
    --cache-type-v f16 \
    --threads 16 \
    --flash-attn \
    --mlock \
    --n-predict -1 \
    --host 127.0.0.1 \
    --port 8080
```
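Before kicking off the benchmark it's worth confirming the server responds; a minimal sketch below hits llama-server's OpenAI-compatible chat endpoint using the host/port from the flags above (the `model` field is required by the API schema, but llama-server just answers with whatever GGUF it loaded).

```python
# Minimal sanity check against llama-server's OpenAI-compatible endpoint.
# llama-server ignores the model name and serves the loaded GGUF.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "model": "SuperNova-Medius-Q8_0",  # placeholder; not used for routing
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 16,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```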
Ollama-MMLU-Pro:

Default .toml config, modified only for the local URL, model name, and parallel inferencing (see the sketch below). Run on 1x RTX 3090 Ti FE with 24 GB VRAM.
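The changes amount to pointing the benchmark at the local llama-server instead of Ollama and raising parallelism to match the server's `--parallel 10`. A sketch of the relevant fields, assuming the layout of Ollama-MMLU-Pro's default `config.toml` (exact key names may differ across versions):

```toml
# Only the fields changed from the defaults are shown.
[server]
url = "http://127.0.0.1:8080/v1"    # local llama-server, not Ollama
model = "SuperNova-Medius-Q8_0"

[test]
parallel = 10                       # matches the server's --parallel 10
```

With that config, the run produced: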
```
Finished testing computer science in 0 hours, 19 minutes, 48 seconds.
Total, 259/410, 63.17%
Random Guess Attempts, 2/410, 0.49%
Correct Random Guesses, 0/2, 0.00%
Adjusted Score Without Random Guesses, 259/408, 63.48%
Finished the benchmark in 0 hours, 19 minutes, 50 seconds.
Total, 259/410, 63.17%
```
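The adjusted score just drops the random-guess attempts from the denominator (and any lucky guesses from the numerator); the reported numbers check out:

```python
# Verify the adjusted score from the reported counts.
correct, total = 259, 410
guess_attempts, guess_correct = 2, 0

raw = correct / total
adjusted = (correct - guess_correct) / (total - guess_attempts)
print(f"raw {raw:.2%}, adjusted {adjusted:.2%}")  # raw 63.17%, adjusted 63.48%
```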
```
Token Usage:
Prompt tokens: min 1448, average 1601, max 2897, total 656306, tk/s 551.25
Completion tokens: min 59, average 273, max 2048, total 112019, tk/s 94.09
```
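The tk/s figures are total tokens divided by benchmark wall-clock time; a rough check against the reported 19 minutes 50 seconds lands within rounding of the reported values (the tool presumably times with sub-second precision):

```python
# Rough throughput check against the reported 19 m 50 s wall-clock time.
duration_s = 19 * 60 + 50      # 1190 s
print(656306 / duration_s)     # ~551.5 tk/s prompt (reported 551.25)
print(112019 / duration_s)     # ~94.1 tk/s completion (reported 94.09)
```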
Markdown Table:
| overall | computer science |
| ------- | ---------------- |
| 63.17 | 63.17 |