
phi-2-coedit

phi-2-coedit is a fine-tuned version of microsoft/phi-2. The training dataset is not documented here, although the model name suggests a CoEdIT-style text-editing corpus. The model achieves the following results on the evaluation set (a sketch of how such metrics are computed follows the list):

  • Loss: 0.7388
  • ROUGE-1: 0.5206
  • ROUGE-2: 0.4123
  • ROUGE-L: 0.4979
  • ROUGE-Lsum: 0.5032
  • SacreBLEU: 28.1346
  • Memory used: 81917.5
  • CUDA allocated: 10795.7861
  • CUDA reserved: 74746.0
  • RAM usage: 24042.6719
  • Exact match (EM): 0.0
  • Generation length: 120.6545
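The metric names above correspond to the Hugging Face `evaluate` implementations. As a minimal illustration (using toy strings, not the actual evaluation set, which is undocumented), scores of this kind can be computed like so:

```python
import evaluate  # pip install evaluate rouge_score sacrebleu

rouge = evaluate.load("rouge")          # yields rouge1, rouge2, rougeL, rougeLsum
sacrebleu = evaluate.load("sacrebleu")  # yields a corpus-level "score"

# Toy prediction/reference pair for illustration only
predictions = ["Grammar is corrected in these sentence."]
references = ["Grammar is corrected in these sentences."]

print(rouge.compute(predictions=predictions, references=references))
# SacreBLEU expects one list of references per prediction
print(sacrebleu.compute(predictions=predictions,
                        references=[[r] for r in references])["score"])
```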

Model description

phi-2-coedit is a 2.78B-parameter causal language model with F32 weights stored in Safetensors format. Further details are pending.

Intended uses & limitations

More information needed
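Pending details from the author, a plausible use, given the base model and the CoEdIT-flavored name, is instruction-guided text editing. Below is a minimal loading-and-generation sketch; the prompt template is an assumption, not a documented format:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "iliazlobin/phi-2-coedit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

# CoEdIT-style instruction prompt (assumed; check the training data for the real template)
prompt = "Fix the grammar: When I grow up, I start to understand what he said is quite right."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```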

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 35
  • eval_batch_size: 35
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 140
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1
  • num_epochs: 2
  • mixed_precision_training: Native AMP
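These settings map directly onto transformers TrainingArguments. A hedged reconstruction (output_dir is a placeholder; the Trainer, model, and data wiring are omitted; fp16 is assumed for "Native AMP"):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="phi-2-coedit",        # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=35,
    per_device_eval_batch_size=35,
    seed=42,
    gradient_accumulation_steps=4,    # effective train batch size: 35 * 4 = 140
    lr_scheduler_type="linear",
    warmup_steps=1,
    num_train_epochs=2,
    fp16=True,                        # "Native AMP" mixed precision (bf16 also possible)
    adam_beta1=0.9,                   # Adam betas/epsilon as listed (Trainer defaults)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```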

Training results

| Training Loss | Epoch | Step | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | SacreBLEU | Memory Used | CUDA Allocated | CUDA Reserved | RAM Usage | EM | Gen Len |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5716 | 0.22 | 100 | 0.7558 | 0.5041 | 0.3927 | 0.4809 | 0.4853 | 26.9798 | 81917.5 | 10795.811 | 74738.0 | 22888.4102 | 0.0 | 120.3347 |
| 0.5407 | 0.44 | 200 | 0.7404 | 0.5241 | 0.4171 | 0.5013 | 0.5068 | 27.6806 | 81917.5 | 10795.814 | 74738.0 | 23733.9805 | 0.0 | 120.8277 |
| 0.5324 | 0.66 | 300 | 0.7230 | 0.5176 | 0.4093 | 0.4947 | 0.5002 | 27.5145 | 81917.5 | 10795.8184 | 74738.0 | 23831.1484 | 0.0 | 120.576 |
| 0.5107 | 0.88 | 400 | 0.7161 | 0.5256 | 0.4167 | 0.5042 | 0.5092 | 28.1274 | 81917.5 | 10795.7935 | 74738.0 | 23891.7891 | 0.0 | 120.5225 |
| 0.4374 | 1.1 | 500 | 0.7495 | 0.5237 | 0.414 | 0.501 | 0.5059 | 28.0405 | 81917.5 | 10795.7861 | 74746.0 | 23922.043 | 0.0 | 120.3181 |
| 0.3515 | 1.32 | 600 | 0.7418 | 0.5216 | 0.4133 | 0.499 | 0.5049 | 28.0528 | 81917.5 | 10795.7832 | 74746.0 | 23973.8164 | 0.0 | 120.6453 |
| 0.3449 | 1.54 | 700 | 0.7386 | 0.5242 | 0.4163 | 0.5016 | 0.5075 | 28.3145 | 81917.5 | 10795.8066 | 74746.0 | 23950.1016 | 0.0 | 120.5367 |
| 0.3375 | 1.76 | 800 | 0.7354 | 0.5194 | 0.4124 | 0.4973 | 0.5025 | 28.0252 | 81917.5 | 10795.814 | 74746.0 | 23931.0 | 0.0 | 120.6476 |
| 0.3373 | 1.98 | 900 | 0.7388 | 0.5206 | 0.4123 | 0.4979 | 0.5032 | 28.1346 | 81917.5 | 10795.7861 | 74746.0 | 24042.6719 | 0.0 | 120.6545 |

Framework versions

  • Transformers 4.39.3
  • PyTorch 2.2.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2
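
To reproduce this environment, the versions above can be pinned directly, e.g. `pip install transformers==4.39.3 torch==2.2.2 datasets==2.18.0 tokenizers==0.15.2`.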
