iliazlobin/t5-large-coedit

t5-large-coedit

This model is a fine-tuned version of google-t5/t5-large on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.5679
Rouge1: 0.6412
Rouge2: 0.5082
Rougel: 0.6068
Rougelsum: 0.6066
Sacreblue: 25.9478
Memory Used: 4111.5
Cuda Allocated: 2814.4805
Cuda Reserved: 2816.0
Ram Usage: 3545.0898
Em: 0.0333
Gen Len: 17.2363

Model description

More information needed

Intended uses & limitations

More information needed
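Until this section is filled in, the following is a minimal sketch of how the checkpoint could be loaded for text-to-text generation with Transformers. The repository id comes from this card. The instruction-style prompt format and the `build_prompt` and `edit_text` helper names are assumptions for illustration only, since the card does not document the expected input format.

```python
# Assumption: instruction-style prompts ("<instruction>: <text>") are a guess
# based on the model name; the card does not state the expected input format.
MODEL_ID = "iliazlobin/t5-large-coedit"  # repository id from this model card


def build_prompt(instruction: str, text: str) -> str:
    """Join an editing instruction and the input text into one prompt string."""
    return f"{instruction.strip()}: {text.strip()}"


def edit_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Run the fine-tuned T5 checkpoint on one prompt and return the decoded text."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the full-precision weights on first use):
# print(edit_text(build_prompt("Fix grammatical errors in this sentence",
#                              "She go to school yesterday.")))
```

Loading the tokenizer and model inside `edit_text` keeps the sketch short; for repeated calls it would be cheaper to load them once and reuse them.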

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 50
eval_batch_size: 50
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 200
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1
num_epochs: 1
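
The values above could be expressed as Transformers training arguments roughly as follows. This is a sketch, not the author's training script: the mapping of `train_batch_size` to `per_device_train_batch_size` assumes a single device (consistent with 50 × 4 = 200), and `make_training_args` and `effective_batch_size` are hypothetical helper names.

```python
# Hyperparameters copied from the list above, renamed to TrainingArguments fields.
# Assumption: a single device, so the card's train_batch_size is the per-device size.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 50,
    "per_device_eval_batch_size": 50,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 1,
    "num_train_epochs": 1,
}


def effective_batch_size(per_device: int, accumulation: int, num_devices: int = 1) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * accumulation * num_devices


def make_training_args(output_dir: str):
    """Build Seq2SeqTrainingArguments from HYPERPARAMS (hypothetical helper)."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

With these values, `effective_batch_size(50, 4)` gives 200, which matches `total_train_batch_size`.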

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Sacreblue	Memory Used	Cuda Allocated	Cuda Reserved	Ram Usage	Em	Gen Len
3.898	0.16	50	0.7311	0.3939	0.3011	0.3707	0.3708	10.1387	4111.5	2814.4805	2816.0	3545.0898	0.0014	13.4078
0.5752	0.31	100	0.6169	0.6336	0.4988	0.5994	0.5993	25.1341	4111.5	2814.4805	2816.0	3545.0898	0.0169	17.2158
0.5095	0.47	150	0.5912	0.6369	0.5033	0.6026	0.6026	25.5313	4111.5	2814.4805	2816.0	3545.0898	0.0256	17.2322
0.4836	0.63	200	0.5777	0.6398	0.5061	0.6053	0.6052	25.7757	4111.5	2814.4805	2816.0	3545.0898	0.0297	17.235
0.4634	0.78	250	0.5709	0.6411	0.5077	0.6067	0.6066	25.9025	4111.5	2814.4805	2816.0	3545.0898	0.0315	17.2362
0.4568	0.94	300	0.5679	0.6412	0.5082	0.6068	0.6066	25.9478	4111.5	2814.4805	2816.0	3545.0898	0.0333	17.2363
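
The Epoch and Step columns allow a rough estimate of the run's length, which the card does not state directly. The sketch below divides the last logged step by its epoch fraction; because the epoch values are rounded to two decimals, the result is only approximate, and the implied training-set size is an inference rather than a reported figure.

```python
# (epoch, step) pairs copied from the training results table.
LOGGED = [(0.16, 50), (0.31, 100), (0.47, 150), (0.63, 200), (0.78, 250), (0.94, 300)]
TOTAL_TRAIN_BATCH_SIZE = 200  # total_train_batch_size from the hyperparameters


def estimate_steps_per_epoch(rows):
    """Estimate optimizer steps per epoch from the last logged row."""
    epoch, step = rows[-1]
    return step / epoch


def estimate_train_examples(rows, batch_size):
    """Estimate the number of training examples seen in one epoch."""
    return estimate_steps_per_epoch(rows) * batch_size


steps = estimate_steps_per_epoch(LOGGED)
examples = estimate_train_examples(LOGGED, TOTAL_TRAIN_BATCH_SIZE)
print(f"about {steps:.0f} steps per epoch, about {examples:.0f} examples")
```

Under these assumptions the single epoch is on the order of 320 optimizer steps, or roughly 64,000 training examples.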

Framework versions

Transformers 4.39.3
Pytorch 2.2.2
Datasets 2.18.0
Tokenizers 0.15.2

Safetensors

Model size: 738M params
Tensor type: F32
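
As a sanity check, the weight footprint implied by 738M params stored as F32 can be computed directly. The card does not state the units of the "Cuda Allocated" metric; the sketch below assumes MiB, under which the reported 2814.4805 is close to the size of the weights alone. `weights_mib` is a hypothetical helper name.

```python
# Bytes per parameter for common tensor types.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weights_mib(num_params: int, tensor_type: str = "F32") -> float:
    """Memory needed to hold the raw weights, in MiB (2**20 bytes)."""
    return num_params * BYTES_PER_PARAM[tensor_type] / 2**20


# 738M is the rounded parameter count shown on the card.
print(f"{weights_mib(738_000_000):.1f} MiB")
```

The small gap between this estimate and the reported value is consistent with the parameter count being rounded to 738M.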

Model tree for iliazlobin/t5-large-coedit

Base model: google-t5/t5-large
Finetuned (68): this model
