---
license: llama2
base_model: codellama/CodeLlama-7b-hf
tags:
  - generated_from_trainer
model-index:
  - name: sql-code-llama
    results: []
library_name: peft
---

# sql-code-llama

This model is a fine-tuned version of [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.4577
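
Since `library_name` is `peft`, the checkpoint is an adapter applied on top of the base model rather than full model weights. Below is a minimal inference sketch, assuming a hypothetical adapter repo id `Liu-Xiang/sql-code-llama` (not confirmed by this card), a CUDA device, and `bitsandbytes` installed for 8-bit loading:

```python
# Minimal inference sketch. Assumptions: the adapter repo id below is
# hypothetical, and a CUDA GPU is available for 8-bit loading.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "codellama/CodeLlama-7b-hf"
adapter_id = "Liu-Xiang/sql-code-llama"  # assumption, not confirmed by this card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
# Attach the PEFT adapter weights on top of the quantized base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

prompt = "-- SQL query returning the ten most recent orders:\nSELECT"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```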

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

The following `bitsandbytes` quantization config was used during training (a `BitsAndBytesConfig` sketch follows the list):

- quant_method: bitsandbytes
- _load_in_8bit: True
- _load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
- bnb_4bit_quant_storage: uint8
- load_in_4bit: False
- load_in_8bit: True
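
For reproduction, these fields map onto a `transformers` `BitsAndBytesConfig`. A minimal sketch, assuming the underscore-prefixed `_load_in_*` entries are just the serialized form of the public `load_in_8bit`/`load_in_4bit` flags:

```python
# Sketch: the quantization config above, rebuilt as a BitsAndBytesConfig.
# The 4-bit fields are left at their defaults; only 8-bit loading is active.
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    load_in_4bit=False,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
    bnb_4bit_quant_storage=torch.uint8,
)
```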

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):

- learning_rate: 0.0003
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 400
- mixed_precision_training: Native AMP
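
These settings correspond to the following `transformers.TrainingArguments`. A sketch assuming a single GPU, so 32 × 4 gradient accumulation gives the effective batch size of 128; `output_dir` is a placeholder:

```python
# Sketch: TrainingArguments mirroring the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="sql-code-llama",    # placeholder output path
    learning_rate=3e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=4,  # 32 * 4 = 128 effective train batch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=400,                  # training_steps above
    fp16=True,                      # "Native AMP" mixed precision
)
```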

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.1953        | 0.0465 | 20   | 2.0335          |
| 1.1292        | 0.0931 | 40   | 0.8342          |
| 0.8133        | 0.1396 | 60   | 0.6552          |
| 0.5873        | 0.1862 | 80   | 0.5861          |
| 0.4095        | 0.2327 | 100  | 0.5589          |
| 0.5731        | 0.2792 | 120  | 0.5159          |
| 0.4221        | 0.3258 | 140  | 0.5039          |
| 0.6365        | 0.3723 | 160  | 0.5159          |
| 0.4779        | 0.4188 | 180  | 0.4867          |
| 0.3584        | 0.4654 | 200  | 0.5007          |
| 0.5325        | 0.5119 | 220  | 0.4802          |
| 0.3998        | 0.5585 | 240  | 0.4767          |
| 0.5952        | 0.6050 | 260  | 0.4777          |
| 0.4649        | 0.6515 | 280  | 0.4671          |
| 0.3394        | 0.6981 | 300  | 0.4752          |
| 0.5084        | 0.7446 | 320  | 0.4669          |
| 0.3934        | 0.7912 | 340  | 0.4613          |
| 0.5762        | 0.8377 | 360  | 0.4617          |
| 0.4563        | 0.8842 | 380  | 0.4586          |
| 0.345         | 0.9308 | 400  | 0.4577          |

### Framework versions

- PEFT 0.6.0.dev0
- Transformers 4.44.0.dev0
- Pytorch 2.2.2+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1