
It is worth noting that, compared with the prince-canuma version, this quantization is smaller in size and its accuracy is roughly one percentage point higher.

In my ERP testing, this model also performed better than the prince-canuma version.
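For loading, here is a minimal sketch using vLLM, which reads the W8A8 quantization config directly from the checkpoint. The prompt and sampling settings are illustrative only, and `tensor_parallel_size`/`max_model_len` simply mirror the eval runs below:

```python
from vllm import LLM, SamplingParams

# Illustrative sketch: serve the W8A8 dynamic per-token checkpoint with
# vLLM. The quantization scheme is picked up from the checkpoint config.
llm = LLM(
    model="noneUsername/TouchNight-Ministral-8B-Instruct-2410-HF-W8A8-Dynamic-Per-Token",
    tensor_parallel_size=2,  # mirrors the eval runs below
    max_model_len=2048,
)

params = SamplingParams(temperature=0.7, max_tokens=256)
out = llm.generate(["Briefly explain weight quantization."], params)
print(out[0].outputs[0].text)
```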

`vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto`

| Tasks | Version | Filter           | n-shot | Metric      | Value | Stderr   |
|-------|--------:|------------------|-------:|-------------|------:|----------|
| gsm8k |       3 | flexible-extract |      5 | exact_match | 0.820 | ± 0.0243 |
|       |         | strict-match     |      5 | exact_match | 0.816 | ± 0.0246 |

`vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto`

| Tasks | Version | Filter           | n-shot | Metric      | Value | Stderr   |
|-------|--------:|------------------|-------:|-------------|------:|----------|
| gsm8k |       3 | flexible-extract |      5 | exact_match | 0.804 | ± 0.0252 |
|       |         | strict-match     |      5 | exact_match | 0.804 | ± 0.0252 |

`vllm (pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float32), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto`

| Tasks | Version | Filter           | n-shot | Metric      | Value | Stderr   |
|-------|--------:|------------------|-------:|-------------|------:|----------|
| gsm8k |       3 | flexible-extract |      5 | exact_match | 0.820 | ± 0.0243 |
|       |         | strict-match     |      5 | exact_match | 0.816 | ± 0.0246 |

`vllm (pretrained=/root/autodl-tmp/output,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=float16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto`

| Tasks | Version | Filter           | n-shot | Metric      | Value | Stderr   |
|-------|--------:|------------------|-------:|-------------|------:|----------|
| gsm8k |       3 | flexible-extract |      5 | exact_match | 0.816 | ± 0.0246 |
|       |         | strict-match     |      5 | exact_match | 0.812 | ± 0.0248 |

`vllm (pretrained=/root/autodl-tmp/output,add_bos_token=true,tensor_parallel_size=2,max_model_len=2048,dtype=bfloat16), gen_kwargs: (None), limit: 250.0, num_fewshot: 5, batch_size: auto`

| Tasks | Version | Filter           | n-shot | Metric      | Value | Stderr   |
|-------|--------:|------------------|-------:|-------------|------:|----------|
| gsm8k |       3 | flexible-extract |      5 | exact_match | 0.796 | ± 0.0255 |
|       |         | strict-match     |      5 | exact_match | 0.792 | ± 0.0257 |
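These scores come from the lm-evaluation-harness vllm backend; the `Ministral-8B-Instruct-2410-HF` runs are the unquantized HF conversion, and the `output` runs are, presumably, the quantized model. A minimal reproduction sketch via the harness's Python API follows, assuming lm-evaluation-harness and vLLM are installed; the local `pretrained` path is from the original runs, so substitute your own copy of the model:

```python
import lm_eval

# Sketch of the first run above (float16 baseline). The pretrained path
# was local to the original machine; substitute your own checkpoint path
# or a Hugging Face repo id.
results = lm_eval.simple_evaluate(
    model="vllm",
    model_args=(
        "pretrained=/root/autodl-tmp/Ministral-8B-Instruct-2410-HF,"
        "add_bos_token=true,tensor_parallel_size=2,"
        "max_model_len=2048,dtype=float16"
    ),
    tasks=["gsm8k"],
    num_fewshot=5,
    limit=250,
    batch_size="auto",
)
print(results["results"]["gsm8k"])  # exact_match under both filters
```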
