---
license: apache-2.0
model-index:
- name: Llama3.1_CoT
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 22.46
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 19.9
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 1.51
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.15
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 11.77
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 19.32
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=xinchen9/Llama3.1_CoT
      name: Open LLM Leaderboard
---

### 1. Model Details
Introducing xinchen9/llama3-b8-ft, a language model with 8 billion parameters. It was fine-tuned from
[meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B).

The llama3-b8 model was fine-tuned on the [CoT-Collection](https://huggingface.co/datasets/kaist-ai/CoT-Collection) dataset.

### 2. How to Use
Here are some examples of how to use our model.
#### Text Completion
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name = "xinchen9/Llama3.1_CoT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the weights in bfloat16 and let device_map="auto" place them on the available devices.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
# Reuse the generation settings published with the model and pad with the EOS token.
model.generation_config = GenerationConfig.from_pretrained(model_name)
model.generation_config.pad_token_id = model.generation_config.eos_token_id
```
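
The snippet above loads the model and tokenizer but stops before generating any text. A minimal sketch of a completion call follows, assuming the `tokenizer` and `model` objects from that snippet. The helper name `complete`, the example prompt, and the default `max_new_tokens=100` are illustrative choices, not part of the original card.

```python
def complete(prompt, tokenizer, model, max_new_tokens=100):
    """Return the prompt followed by the model's continuation (hypothetical helper)."""
    # Tokenize the prompt and move the tensors to the device that holds the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Generate up to max_new_tokens tokens using the model's generation_config.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence, dropping special tokens such as EOS.
    return tokenizer.decode(outputs[0], skip_special_tokens=True)


# Example call, once `tokenizer` and `model` have been loaded as shown earlier:
# print(complete("The capital of France is", tokenizer, model))
```

Because this appears to be a base-model fine-tune used for plain text completion, the sketch passes raw text rather than a chat template; that is an assumption and may need adjusting.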

### 3. Disclaimer
The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. Please consult an attorney before using this model for commercial purposes.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_xinchen9__Llama3.1_CoT).

|      Metric       |Value|
|-------------------|----:|
|Avg.               |13.35|
|IFEval (0-Shot)    |22.46|
|BBH (3-Shot)       |19.90|
|MATH Lvl 5 (4-Shot)| 1.51|
|GPQA (0-shot)      | 5.15|
|MuSR (0-shot)      |11.77|
|MMLU-PRO (5-shot)  |19.32|
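
The Avg. row is consistent with an unweighted mean of the six benchmark scores. A quick check in Python, under that assumption (the variable and function names are illustrative):

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval (0-Shot)": 22.46,
    "BBH (3-Shot)": 19.90,
    "MATH Lvl 5 (4-Shot)": 1.51,
    "GPQA (0-shot)": 5.15,
    "MuSR (0-shot)": 11.77,
    "MMLU-PRO (5-shot)": 19.32,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(average)  # 13.35, matching the Avg. row
```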