Self-Rewarding Mistral-7B-v0.3 JA
Models produced by applying self-rewarding training to Mistral-7B-v0.3 with LoRA.
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
Training results at the end of each epoch:
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
1.1951 | 1.0 | 262 | 0.4563 |
0.9304 | 2.0 | 524 | 0.4279 |
0.9129 | 3.0 | 786 | 0.4242 |
0.9088 | 4.0 | 1048 | 0.4237 |
0.9089 | 5.0 | 1310 | 0.4237 |
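
The table can be read more easily with a little arithmetic. The sketch below copies the five rows verbatim and computes the step increment per epoch and the epoch-to-epoch change in validation loss. The tuple layout and the helper names `steps_per_epoch` and `val_loss_deltas` are illustrative assumptions, not part of the training code.

```python
# Rows copied from the table above: (training_loss, epoch, step, validation_loss)
rows = [
    (1.1951, 1.0, 262, 0.4563),
    (0.9304, 2.0, 524, 0.4279),
    (0.9129, 3.0, 786, 0.4242),
    (0.9088, 4.0, 1048, 0.4237),
    (0.9089, 5.0, 1310, 0.4237),
]


def steps_per_epoch(rows):
    """Return the set of step increments between consecutive evaluations."""
    steps = [0] + [r[2] for r in rows]
    return {later - earlier for earlier, later in zip(steps, steps[1:])}


def val_loss_deltas(rows):
    """Return the change in validation loss from each epoch to the next."""
    losses = [r[3] for r in rows]
    return [round(later - earlier, 4) for earlier, later in zip(losses, losses[1:])]


if __name__ == "__main__":
    print("step increments:", steps_per_epoch(rows))
    print("validation-loss deltas:", val_loss_deltas(rows))
```

Every evaluation is 262 optimizer steps after the previous one, and the validation loss changes by less than 0.001 after epoch 3, with no change at all between epochs 4 and 5.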
Base model: mistralai/Mistral-7B-v0.3