MedMobile / README.md
KrithikV's picture
Update README.md
e44beec verified
|
raw
history blame
8.47 kB
metadata
license: mit
base_model: microsoft/Phi-3-mini-4k-instruct
tags:
  - trl
  - sft
  - generated_from_trainer
model-index:
  - name: MedMobile
    results: []

MedMobile

Manuscript: https://arxiv.org/abs/2410.09019

This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct on the UltraMedical dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7358

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3

Training results

Training Loss Epoch Step Validation Loss
0.8656 0.0225 200 0.7711
0.7615 0.0451 400 0.7521
0.748 0.0676 600 0.7457
0.7465 0.0902 800 0.7428
0.7468 0.1127 1000 0.7419
0.7434 0.1352 1200 0.7429
0.7467 0.1578 1400 0.7451
0.7508 0.1803 1600 0.7469
0.7505 0.2029 1800 0.7503
0.7541 0.2254 2000 0.7531
0.7559 0.2479 2200 0.7576
0.7592 0.2705 2400 0.7599
0.7729 0.2930 2600 0.7635
0.772 0.3156 2800 0.7645
0.7707 0.3381 3000 0.7628
0.7616 0.3606 3200 0.7614
0.7632 0.3832 3400 0.7590
0.7613 0.4057 3600 0.7574
0.7581 0.4283 3800 0.7558
0.7583 0.4508 4000 0.7539
0.7509 0.4733 4200 0.7518
0.7559 0.4959 4400 0.7506
0.7523 0.5184 4600 0.7491
0.7461 0.5410 4800 0.7469
0.7504 0.5635 5000 0.7464
0.7486 0.5860 5200 0.7449
0.7454 0.6086 5400 0.7436
0.7451 0.6311 5600 0.7427
0.7431 0.6537 5800 0.7412
0.7438 0.6762 6000 0.7402
0.7471 0.6987 6200 0.7390
0.7416 0.7213 6400 0.7378
0.7345 0.7438 6600 0.7364
0.7437 0.7663 6800 0.7349
0.7431 0.7889 7000 0.7349
0.737 0.8114 7200 0.7339
0.7358 0.8340 7400 0.7333
0.7336 0.8565 7600 0.7320
0.7327 0.8790 7800 0.7310
0.7288 0.9016 8000 0.7303
0.7326 0.9241 8200 0.7295
0.7354 0.9467 8400 0.7287
0.731 0.9692 8600 0.7278
0.7317 0.9917 8800 0.7272
0.6809 1.0143 9000 0.7359
0.6548 1.0368 9200 0.7341
0.6463 1.0594 9400 0.7353
0.6516 1.0819 9600 0.7357
0.6544 1.1044 9800 0.7345
0.6558 1.1270 10000 0.7342
0.6532 1.1495 10200 0.7331
0.653 1.1721 10400 0.7328
0.6583 1.1946 10600 0.7323
0.6537 1.2171 10800 0.7326
0.6622 1.2397 11000 0.7318
0.6596 1.2622 11200 0.7315
0.6522 1.2848 11400 0.7304
0.6517 1.3073 11600 0.7300
0.657 1.3298 11800 0.7296
0.6554 1.3524 12000 0.7286
0.6545 1.3749 12200 0.7287
0.6556 1.3975 12400 0.7283
0.655 1.4200 12600 0.7294
0.6489 1.4425 12800 0.7285
0.6539 1.4651 13000 0.7269
0.654 1.4876 13200 0.7273
0.6556 1.5102 13400 0.7273
0.6529 1.5327 13600 0.7271
0.6504 1.5552 13800 0.7264
0.6498 1.5778 14000 0.7256
0.6517 1.6003 14200 0.7255
0.656 1.6229 14400 0.7252
0.6471 1.6454 14600 0.7242
0.6485 1.6679 14800 0.7243
0.6545 1.6905 15000 0.7242
0.6527 1.7130 15200 0.7238
0.6504 1.7356 15400 0.7236
0.6492 1.7581 15600 0.7229
0.6529 1.7806 15800 0.7232
0.6507 1.8032 16000 0.7226
0.653 1.8257 16200 0.7229
0.6461 1.8483 16400 0.7223
0.6453 1.8708 16600 0.7221
0.6534 1.8933 16800 0.7219
0.6455 1.9159 17000 0.7220
0.6485 1.9384 17200 0.7212
0.6536 1.9610 17400 0.7214
0.6444 1.9835 17600 0.7211
0.6346 2.0060 17800 0.7356
0.5929 2.0286 18000 0.7368
0.5951 2.0511 18200 0.7371
0.6013 2.0736 18400 0.7374
0.6004 2.0962 18600 0.7375
0.5991 2.1187 18800 0.7375
0.5971 2.1413 19000 0.7369
0.597 2.1638 19200 0.7380
0.5951 2.1863 19400 0.7370
0.5916 2.2089 19600 0.7370
0.5992 2.2314 19800 0.7372
0.6011 2.2540 20000 0.7364
0.6003 2.2765 20200 0.7370
0.6003 2.2990 20400 0.7370
0.5985 2.3216 20600 0.7370
0.5988 2.3441 20800 0.7367
0.5959 2.3667 21000 0.7370
0.6019 2.3892 21200 0.7370
0.5977 2.4117 21400 0.7367
0.602 2.4343 21600 0.7368
0.5958 2.4568 21800 0.7368
0.5969 2.4794 22000 0.7360
0.6025 2.5019 22200 0.7362
0.5942 2.5244 22400 0.7361
0.6006 2.5470 22600 0.7361
0.5952 2.5695 22800 0.7366
0.6007 2.5921 23000 0.7363
0.6003 2.6146 23200 0.7363
0.6006 2.6371 23400 0.7359
0.6014 2.6597 23600 0.7360
0.6008 2.6822 23800 0.7356
0.6005 2.7048 24000 0.7357
0.5958 2.7273 24200 0.7356
0.5977 2.7498 24400 0.7358
0.6 2.7724 24600 0.7358
0.5978 2.7949 24800 0.7362
0.6018 2.8175 25000 0.7359
0.6079 2.8400 25200 0.7359
0.6036 2.8625 25400 0.7359
0.5985 2.8851 25600 0.7359
0.6019 2.9076 25800 0.7359
0.5994 2.9302 26000 0.7358
0.6027 2.9527 26200 0.7358
0.6014 2.9752 26400 0.7358
0.5957 2.9978 26600 0.7358

Framework versions

  • Transformers 4.43.3
  • Pytorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1