Title Generation for Persian using Transformers

Model Details

Model Description: This model is a fine-tuned version of mt5-small for Persian title generation. It was trained for 4 epochs on a custom dataset of 25,000 Persian texts paired with their titles, using an NVIDIA P100 GPU.

Intended Use: The model is intended for generating titles for Persian text. It can be used in applications such as summarizing articles, generating headlines, or creating titles for various text inputs.

Model Architecture:

Model Type: Transformer-based sequence-to-sequence (encoder-decoder) text generation
Language: Persian (fa)
Base Model: mt5-small

Training Data

Dataset: The model was fine-tuned on a custom Persian dataset specifically curated for the task of title generation. The dataset includes 25,000 rows of Persian texts along with their corresponding titles.

Data Preprocessing:

Text normalization and cleaning were performed to ensure consistency.
Tokenization was done using the mT5 tokenizer.
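
The exact normalization rules used for this model are not documented here. As a rough illustration only, a typical Persian cleaning step might look like the sketch below. The character mappings (Arabic yeh and kaf to their Persian forms, tatweel removal) and the function name `normalize_persian` are assumptions, not the author's actual preprocessing code.

```python
import re
import unicodedata

# Assumed mappings: Arabic code points that often appear in Persian text,
# replaced by their Persian equivalents. Not taken from the model's pipeline.
_CHAR_MAP = str.maketrans({
    "\u064a": "\u06cc",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06a9",  # Arabic kaf -> Persian kaf
    "\u0640": "",        # tatweel (kashida) is dropped
})


def normalize_persian(text: str) -> str:
    """Hypothetical cleaning step: unify characters and collapse whitespace."""
    text = unicodedata.normalize("NFC", text)  # canonical composition
    text = text.translate(_CHAR_MAP)           # unify yeh/kaf, drop tatweel
    text = re.sub(r"\s+", " ", text)           # collapse runs of whitespace
    return text.strip()


# Example: Arabic-style kaf and yeh plus irregular spacing
print(normalize_persian("  \u0643\u062a\u0627\u0628   \u0639\u0644\u064a \n"))
```

The zero-width non-joiner (U+200C), which Persian uses inside words, is not whitespace and is left untouched by this sketch. The normalized text would then be passed to the mT5 tokenizer.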

Training Procedure

Training Configuration:

Number of Epochs: 4
Batch Size: 8
Learning Rate: 1e-5
Optimizer: AdamW

Training Environment:

Hardware: NVIDIA P100 GPU
Training Time: Approximately 4 hours
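
To put these numbers in perspective, the sketch below collects the reported settings in one place and derives the number of optimizer steps. It assumes that all 25,000 rows were used for training and that no gradient accumulation was applied. Neither is stated in this card, and the `TrainingConfig` class is purely illustrative.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Illustrative container for the values reported in this card."""
    num_rows: int = 25_000        # assumption: every row is a training example
    num_epochs: int = 4
    batch_size: int = 8
    learning_rate: float = 1e-5
    optimizer: str = "AdamW"
    training_hours: float = 4.0   # "approximately 4 hours"

    @property
    def steps_per_epoch(self) -> int:
        # One optimizer step per batch (no gradient accumulation assumed)
        return math.ceil(self.num_rows / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.num_epochs

    @property
    def seconds_per_step(self) -> float:
        return self.training_hours * 3600 / self.total_steps


config = TrainingConfig()
print(config.steps_per_epoch)             # 3125
print(config.total_steps)                 # 12500
print(round(config.seconds_per_step, 2))  # 1.15
```

Under these assumptions, training amounts to about 12,500 steps at roughly 1.15 seconds per step on the P100.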

How To Use

You can use this model with the transformers library as follows:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("NLPclass/mt5-title-generation")
model = AutoModelForSeq2SeqLM.from_pretrained("NLPclass/mt5-title-generation")

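# Example text in Persian: a news sentence about the Russian Super Cup match
# between Zenit Saint Petersburg and Lokomotiv Moscow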
input_text = "به گزارش ایمنا، در دیدار سوپر جام فوتبال روسیه زنیت سن‌پترزبورگ قهرمان رقابتهای لیگ و جام حذفی این کشور در حضور عده‌ای معدود از تماشاگران به دیدار لوکوموتیو مسکو نایب قهرمان لیگ روسیه رفت"
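# Tokenize, truncating inputs longer than 512 tokens
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
# Unpack the inputs so the attention mask is passed along with the input IDs
outputs = model.generate(**inputs, max_length=50, num_beams=5, early_stopping=True)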

# Decode the generated title
generated_title = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_title)


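# Alternatively, create a pipeline. mT5 is a sequence-to-sequence model, so the task
# is "text2text-generation"; "text-generation" expects a causal language model.
title_generation_pipeline = pipeline("text2text-generation", model="NLPclass/mt5-title-generation")
result = title_generation_pipeline(input_text, max_length=50, num_beams=5, early_stopping=True)
# The pipeline returns a list of dicts; the title is stored under "generated_text"
print(result[0]["generated_text"])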

Citation

@misc{NLPclass,
  author = {NLPclass},
  title = {Title Generation for Persian using Transformers},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/NLPclass/mt5-title-generation}},
}