Update README.md
README.md CHANGED
```diff
@@ -3,6 +3,8 @@ base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 library_name: peft
 datasets:
 - sanaa-11/math-dataset
+language:
+- fr
 ---
 # Model Card for LLaMA 3.1 Fine-Tuned Model
 
@@ -91,7 +93,7 @@ for _ in range(5):
 ## Training Details
 
 ### Training Data
-- **Dataset**: The model was fine-tuned on a custom dataset consisting of
+- **Dataset**: The model was fine-tuned on a custom dataset consisting of 3.6K rows of math exercises, lesson content, and solutions, designed specifically for Moroccan students and written in French.
 
 ### Training Procedure
 
@@ -102,9 +104,9 @@ for _ in range(5):
 ### Training Hyperparameters
 - **Training Regime**: The model was fine-tuned using 4-bit quantization with QLoRA to reduce GPU and RAM usage. Training was performed in a Kaggle environment with limited resources.
 - **Batch Size**: 1 (with gradient accumulation steps of 8)
-- **Number of Epochs**:
+- **Number of Epochs**: 8
 - **Learning Rate**: 5e-5
 
 ## Evaluation
 
@@ -126,7 +128,7 @@ for _ in range(5):
 ### Summary
 **Model Examination**
 - The model demonstrated a consistent reduction in both training and validation loss across the training epochs, suggesting effective learning and generalization from the provided dataset.
 
 ## Environmental Impact
 **Carbon Emissions**
```
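For context on the `library_name: peft` and `base_model` metadata in the first hunk: the fine-tuned weights are a PEFT adapter meant to be loaded on top of `meta-llama/Meta-Llama-3.1-8B-Instruct`. A minimal sketch of loading it follows; the adapter repository id is a placeholder, since it is not named in this diff.

```python
# Minimal sketch: load the PEFT adapter on top of the base model.
# ADAPTER_REPO is a placeholder -- substitute the actual adapter repository id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "meta-llama/Meta-Llama-3.1-8B-Instruct"
ADAPTER_REPO = "your-username/llama-3.1-math-adapter"  # placeholder, not from the card

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_REPO)
model.eval()
```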
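The Training Data hunk points at the `sanaa-11/math-dataset` dataset (roughly 3.6K rows of French-language math exercises, lessons, and solutions). A quick way to inspect it, assuming the standard `datasets` loader and a `train` split:

```python
# Minimal sketch: inspect the fine-tuning dataset referenced in the card.
from datasets import load_dataset

ds = load_dataset("sanaa-11/math-dataset")
print(ds)              # split names and row counts (~3.6K rows per the card)
print(ds["train"][0])  # one record; the "train" split and column names are assumptions
```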
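The Training Hyperparameters hunk describes 4-bit QLoRA fine-tuning with a per-device batch size of 1, gradient accumulation of 8, 8 epochs, and a 5e-5 learning rate. A rough sketch of that setup with `transformers`/`peft`/`bitsandbytes` is below; the LoRA rank, alpha, and target modules are assumptions, as they are not stated in the card.

```python
# Minimal sketch of the described QLoRA setup (values from the card where stated,
# otherwise marked as assumptions).
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit quantization (QLoRA)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute on Kaggle GPUs
)

lora_config = LoraConfig(
    r=16,                                  # assumption: rank not stated in the card
    lora_alpha=32,                         # assumption
    target_modules=["q_proj", "v_proj"],   # assumption
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="./llama31-math-qlora",     # placeholder path
    per_device_train_batch_size=1,         # batch size 1
    gradient_accumulation_steps=8,         # effective batch size 8
    num_train_epochs=8,
    learning_rate=5e-5,
    fp16=True,                             # assumption: mixed precision on Kaggle
    logging_steps=10,
)
```

These configs would typically be passed, together with the 4-bit-quantized base model and the dataset, to a trainer such as TRL's `SFTTrainer` or the plain `transformers.Trainer`.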