
Llama3.1-8b-instruct-SFT-2024-09-04

This model is a fine-tuned version of meta-llama/Meta-Llama-3.1-8B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7816

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 1.5
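
For reference, here is a minimal sketch of how these settings map onto the Hugging Face `TrainingArguments` API. The output path is a placeholder, and `optim="adamw_torch"` reproduces the Adam betas and epsilon above through its defaults; the model, dataset, and Trainer wiring are omitted:

```python
# Hypothetical reconstruction of the training configuration above;
# only the hyperparameters listed in this card are grounded.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Llama3.1-8b-instruct-SFT-2024-09-04",  # placeholder path
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch",         # Adam with betas=(0.9, 0.999), epsilon=1e-8 by default
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=1.5,
)
```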

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.4731        | 0.0586 | 1000  | 1.3818          |
| 1.3159        | 0.1171 | 2000  | 1.2533          |
| 1.2472        | 0.1757 | 3000  | 1.1814          |
| 1.2053        | 0.2342 | 4000  | 1.1296          |
| 1.1650        | 0.2928 | 5000  | 1.0964          |
| 1.1603        | 0.3514 | 6000  | 1.0608          |
| 1.1854        | 0.4099 | 7000  | 1.0294          |
| 1.0564        | 0.4685 | 8000  | 1.0074          |
| 0.9583        | 0.5271 | 9000  | 0.9807          |
| 1.0542        | 0.5856 | 10000 | 0.9559          |
| 0.9881        | 0.6442 | 11000 | 0.9371          |
| 0.9607        | 0.7027 | 12000 | 0.9125          |
| 1.0272        | 0.7613 | 13000 | 0.8907          |
| 0.9374        | 0.8199 | 14000 | 0.8739          |
| 0.9506        | 0.8784 | 15000 | 0.8549          |
| 0.8963        | 0.9370 | 16000 | 0.8389          |
| 0.8529        | 0.9955 | 17000 | 0.8225          |
| 0.6032        | 1.0541 | 18000 | 0.8162          |
| 0.5758        | 1.1127 | 19000 | 0.8079          |
| 0.6367        | 1.1712 | 20000 | 0.7976          |
| 0.5814        | 1.2298 | 21000 | 0.7917          |
| 0.5761        | 1.2884 | 22000 | 0.7873          |
| 0.6180        | 1.3469 | 23000 | 0.7840          |
| 0.5374        | 1.4055 | 24000 | 0.7826          |
| 0.6227        | 1.4640 | 25000 | 0.7816          |

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.2
  • Pytorch 2.0.1+cu118
  • Datasets 2.21.0
  • Tokenizers 0.19.1
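
Since this repository ships a PEFT adapter rather than full model weights, the adapter must be loaded on top of the base model. A minimal usage sketch, assuming GPU inference with the `peft` and `transformers` versions listed above; the dtype, device settings, and prompt are illustrative:

```python
# Load the base Llama 3.1 8B Instruct model and apply this PEFT adapter.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"
adapter_id = "ccibeekeoc42/Llama3.1-8b-instruct-SFT-2024-09-04"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Format the prompt with the instruct model's chat template.
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```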
