Experimental model that exposes arco to some reasoning.
After some research, I noticed I had been finetuning models with a very high learning rate. Future models should be better, since they will retain most of arco's capability.
Task | Score | Metric |
---|---|---|
ARC Challenge | 0.3473 | acc_norm |
HellaSwag | 0.5986 | acc_norm |
MMLU | 0.2489 | acc |
PIQA | 0.7318 | acc_norm |
Winogrande | 0.6259 | acc |
The prompt format is:
Instruction: <your instruction>
Reasoning: // from here, the model generates the reasoning and then the output
Output:
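A minimal sketch of how the format above could be assembled and parsed in Python. The helper names `build_prompt` and `split_completion` are hypothetical and not part of this model's tooling. The sketch assumes the prompt ends right after `Reasoning:` with no trailing space, and that the model writes its final answer after an `Output:` marker.

```python
def build_prompt(instruction: str) -> str:
    """Build a prompt in the Instruction/Reasoning layout (hypothetical helper)."""
    # Assumption: the prompt stops right after "Reasoning:" so the model
    # continues with its reasoning and then the "Output:" section.
    return f"Instruction: {instruction.strip()}\nReasoning:"


def split_completion(completion: str) -> tuple[str, str]:
    """Split generated text into (reasoning, output) at the first "Output:" marker."""
    # str.partition returns the whole text as the first item when the marker
    # is missing, so the output part is then an empty string.
    reasoning, _, output = completion.partition("Output:")
    return reasoning.strip(), output.strip()


if __name__ == "__main__":
    prompt = build_prompt("What is 2 + 2?")
    print(prompt)
    # The completion below is a made-up string for illustration only,
    # not real output from the model.
    print(split_completion(" 2 plus 2 is 4.\nOutput: 4"))
```

The string returned by `build_prompt` can be passed to whichever generation setup you use, and `split_completion` applied to the generated continuation.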
Uploaded model
- Developed by: appvoid
- License: apache-2.0
- Finetuned from model : appvoid/arco
This llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.