Title Generation for Persian using Transformers
Model Details
Model Description:
This model is a fine-tuned version of mt5-small, trained on a custom Persian dataset for the task of title generation. It was trained for 4 epochs on 25,000 rows of Persian text using an NVIDIA P100 GPU.
Intended Use: The model generates titles for Persian text. Typical applications include writing headlines for news articles and producing short titles for other Persian text inputs.
Model Architecture:
- Model Type: Transformer-based sequence-to-sequence (text-to-text) generation
- Language: Persian (fa)
- Base Model: mt5-small
Training Data
Dataset: The model was fine-tuned on a custom Persian dataset curated for title generation. The dataset contains 25,000 Persian texts, each paired with its title.
Data Preprocessing:
- Text normalization and cleaning were performed to ensure consistency.
- Tokenization was done using the mT5 tokenizer.
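The exact normalization rules used for this dataset are not documented here. As an illustration only, the sketch below shows one common kind of Persian text cleanup: mapping Arabic yeh and kaf to their Persian forms and collapsing whitespace. The function name `normalize_persian` and the character mapping are assumptions, not the author's actual pipeline.

```python
import re
import unicodedata

# Hypothetical mapping (an assumption, not taken from the model card):
# Arabic code points that are often replaced by their Persian equivalents.
_CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06A9",  # Arabic kaf -> Persian keheh
})


def normalize_persian(text: str) -> str:
    """Apply a minimal, illustrative normalization to Persian text."""
    # Canonical Unicode composition so equivalent sequences compare equal.
    text = unicodedata.normalize("NFC", text)
    # Unify Arabic and Persian variants of yeh and kaf.
    text = text.translate(_CHAR_MAP)
    # Collapse runs of whitespace (spaces, tabs, newlines) into one space.
    # The zero-width non-joiner (U+200C) is not whitespace, so it is kept.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


if __name__ == "__main__":
    print(normalize_persian("  \u0639\u0644\u064A   \u0643\u062A\u0627\u0628 \n"))
```

A function like this would typically be applied to both the texts and their titles before tokenization, so that the model sees one consistent spelling of each character.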
Training Procedure
Training Configuration:
- Number of Epochs: 4
- Batch Size: 8
- Learning Rate: 1e-5
- Optimizer: AdamW
Training Environment:
- Hardware: NVIDIA P100 GPU
- Training Time: Approximately 4 hours
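As a rough sanity check on the numbers above, the following sketch derives the number of optimizer steps and the average throughput. It assumes that all 25,000 rows were used for training (no held-out split), that there was no gradient accumulation, and that training ran on a single GPU; none of these details are stated in this card.

```python
import math

# Values taken from the training configuration above.
NUM_ROWS = 25_000
BATCH_SIZE = 8
NUM_EPOCHS = 4
TRAINING_HOURS = 4  # "approximately 4 hours"

# Assumption: every row is a training example and each batch is one step.
steps_per_epoch = math.ceil(NUM_ROWS / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS
steps_per_second = total_steps / (TRAINING_HOURS * 3600)

print(f"steps per epoch: {steps_per_epoch}")    # 3125
print(f"total steps: {total_steps}")            # 12500
print(f"steps per second: {steps_per_second:.2f}")  # 0.87
```

Under these assumptions the run amounts to about 12,500 steps at a little under one step per second.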
How To Use
You can use this model with the transformers library as follows:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("NLPclass/mt5-title-generation")
model = AutoModelForSeq2SeqLM.from_pretrained("NLPclass/mt5-title-generation")

# Example text in Persian
input_text = "به گزارش ایمنا، در دیدار سوپر جام فوتبال روسیه زنیت سنپترزبورگ قهرمان رقابتهای لیگ و جام حذفی این کشور در حضور عدهای معدود از تماشاگران به دیدار لوکوموتیو مسکو نایب قهرمان لیگ روسیه رفت"

# Tokenize the input, truncating anything longer than 512 tokens
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)

# Generate with beam search; **inputs passes both input_ids and attention_mask
outputs = model.generate(**inputs, max_length=50, num_beams=5, early_stopping=True)

# Decode the generated title
generated_title = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_title)

# Alternatively, use a pipeline. mT5 is a sequence-to-sequence model,
# so the task is "text2text-generation" rather than "text-generation".
title_generation_pipeline = pipeline("text2text-generation", model="NLPclass/mt5-title-generation")
result = title_generation_pipeline(input_text, max_length=50, num_beams=5, early_stopping=True)

# The pipeline returns a list of dicts; the title is under "generated_text"
print(result[0]["generated_text"])
Citation
@misc{NLPclass,
author = {NLPclass},
title = {Title Generation for Persian using Transformers},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/NLPclass/mt5-title-generation}},
}