|
---
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
library_name: peft
---
|
|
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
[gretelai/synthetic_text_to_sql](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql) is a rich dataset of high-quality synthetic Text-to-SQL samples. It contains 105,851 records, partitioned into 100,000 train and 5,851 test records; I used only 50,000 of the training records for this fine-tune.
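For reference, here is a minimal sketch of loading the dataset and drawing the 50,000-record subset with the `datasets` library. The selection strategy (taking the first 50,000 rows) is an assumption for illustration, not necessarily the exact preprocessing used:

```python
from datasets import load_dataset

# Load the full synthetic Text-to-SQL dataset
# (100,000 train / 5,851 test records).
dataset = load_dataset("gretelai/synthetic_text_to_sql")

# Keep 50,000 training records. Taking the first 50,000 rows is an
# assumed strategy; a shuffled sample would work equally well.
train_subset = dataset["train"].select(range(50_000))
print(train_subset)  # Dataset({..., num_rows: 50000})
```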
|
### Training Results
|
|
|
|
|
Training loss, logged every 10 steps:

| Step | Training Loss |
|------|---------------|
| 10 | 1.296000 |
| 20 | 1.331600 |
| 30 | 1.279400 |
| 40 | 1.312900 |
| 50 | 1.274100 |
| 60 | 1.271700 |
| 70 | 1.209100 |
| 80 | 1.192600 |
| 90 | 1.176700 |
| 100 | 1.118300 |
| 110 | 1.086800 |
| 120 | 1.048000 |
| 130 | 1.019500 |
| 140 | 1.001400 |
| 150 | 0.994300 |
| 160 | 0.934900 |
| 170 | 0.904500 |
| 180 | 0.879900 |
| 190 | 0.850400 |
| 200 | 0.828000 |
| 210 | 0.811400 |
| 220 | 0.846000 |
| 230 | 0.791100 |
| 240 | 0.766900 |
| 250 | 0.782000 |
| 260 | 0.718300 |
| 270 | 0.701800 |
| 280 | 0.720000 |
| 290 | 0.693600 |
| 300 | 0.676500 |
| 310 | 0.679900 |
| 320 | 0.673200 |
| 330 | 0.669500 |
| 340 | 0.692800 |
| 350 | 0.662200 |
| 360 | 0.761200 |
| 370 | 0.659600 |
| 380 | 0.683700 |
| 390 | 0.681200 |
| 400 | 0.674000 |
| 410 | 0.651800 |
| 420 | 0.641800 |
| 430 | 0.646500 |
| 440 | 0.664200 |
| 450 | 0.633600 |
| 460 | 0.646900 |
| 470 | 0.643400 |
| 480 | 0.658800 |
| 490 | 0.631500 |
| 500 | 0.678200 |
| 510 | 0.633400 |
| 520 | 0.623300 |
| 530 | 0.655700 |
| 540 | 0.631500 |
| 550 | 0.617700 |
| 560 | 0.644000 |
| 570 | 0.650200 |
| 580 | 0.618500 |
| 590 | 0.615400 |
| 600 | 0.614000 |
| 610 | 0.612800 |
| 620 | 0.616900 |
| 630 | 0.640200 |
| 640 | 0.613000 |
| 650 | 0.611400 |
| 660 | 0.617000 |
| 670 | 0.629800 |
| 680 | 0.648800 |
| 690 | 0.608800 |
| 700 | 0.603200 |
| 710 | 0.628200 |
| 720 | 0.629700 |
| 730 | 0.604400 |
| 740 | 0.610700 |
| 750 | 0.621300 |
| 760 | 0.617900 |
| 770 | 0.596500 |
| 780 | 0.612800 |
| 790 | 0.611700 |
| 800 | 0.618600 |
| 810 | 0.590900 |
| 820 | 0.590300 |
| 830 | 0.592900 |
| 840 | 0.611700 |
| 850 | 0.628300 |
| 860 | 0.590100 |
| 870 | 0.584800 |
| 880 | 0.591200 |
| 890 | 0.585900 |
| 900 | 0.607000 |
| 910 | 0.578800 |
| 920 | 0.576600 |
| 930 | 0.597600 |
| 940 | 0.602100 |
| 950 | 0.579000 |
| 960 | 0.597900 |
| 970 | 0.590600 |
| 980 | 0.606100 |
| 990 | 0.577600 |
| 1000 | 0.584000 |
| 1010 | 0.569300 |
| 1020 | 0.594000 |
| 1030 | 0.596100 |
| 1040 | 0.590600 |
| 1050 | 0.570300 |
| 1060 | 0.572800 |
| 1070 | 0.572200 |
| 1080 | 0.569900 |
| 1090 | 0.587200 |
| 1100 | 0.572200 |
| 1110 | 0.569700 |
| 1120 | 0.612500 |
| 1130 | 0.587800 |
| 1140 | 0.568100 |
| 1150 | 0.573100 |
| 1160 | 0.568300 |
| 1170 | 0.620800 |
| 1180 | 0.570600 |
| 1190 | 0.561500 |
| 1200 | 0.560200 |
| 1210 | 0.592400 |
| 1220 | 0.580500 |
| 1230 | 0.578300 |
| 1240 | 0.573400 |
| 1250 | 0.568800 |
| 1260 | 0.600500 |
| 1270 | 0.578800 |
| 1280 | 0.561300 |
| 1290 | 0.570900 |
| 1300 | 0.567700 |
| 1310 | 0.589800 |
| 1320 | 0.598200 |
| 1330 | 0.564900 |
| 1340 | 0.577500 |
| 1350 | 0.565700 |
| 1360 | 0.581400 |
| 1370 | 0.562000 |
| 1380 | 0.588200 |
| 1390 | 0.603800 |
| 1400 | 0.560300 |
| 1410 | 0.559600 |
| 1420 | 0.567000 |
| 1430 | 0.562700 |
| 1440 | 0.564200 |
| 1450 | 0.563700 |
| 1460 | 0.561100 |
| 1470 | 0.561100 |
| 1480 | 0.561600 |
| 1490 | 0.564800 |
| 1500 | 0.579100 |
| 1510 | 0.564100 |
| 1520 | 0.562900 |
| 1530 | 0.569800 |
| 1540 | 0.566200 |
| 1550 | 0.599100 |
| 1560 | 0.562000 |
| 1570 | 0.580600 |
| 1580 | 0.564900 |
| 1590 | 0.571900 |
| 1600 | 0.580000 |
| 1610 | 0.559200 |
| 1620 | 0.566900 |
| 1630 | 0.556100 |
|
|
|
|
|
![Training loss curve](https://cdn-uploads.huggingface.co/production/uploads/66465899a15e2eb8fd53727d/UNamiG8HciSUBxfS2erbv.png)
|
|
|
#### Training Hyperparameters |
|
|
|
The following hyperparameters were used during training: |
|
- num_train_epochs: 3
- per_device_train_batch_size: 2
- gradient_accumulation_steps: 4
- optim: adamw_torch_fused
- learning_rate: 2e-4
- max_grad_norm: 0.3
- weight_decay: 0.01
- lr_scheduler_type: cosine
- warmup_steps: 50
- bf16: True
- tf32: True
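These fields map one-to-one onto `transformers.TrainingArguments`; with per_device_train_batch_size=2 and gradient_accumulation_steps=4, the effective batch size is 8 per device. A minimal sketch of the configuration (`output_dir` and `logging_steps` are assumed values; logging_steps=10 would match the cadence of the loss table above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.1-8b-text-to-sql",  # assumed name for illustration
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,   # effective batch size: 2 x 4 = 8 per device
    optim="adamw_torch_fused",
    learning_rate=2e-4,
    max_grad_norm=0.3,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    warmup_steps=50,
    logging_steps=10,                # assumed; consistent with the loss table
    bf16=True,                       # bfloat16 mixed precision
    tf32=True,                       # TF32 matmuls on Ampere+ GPUs
)
```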
|
|
|
|