metadata
language: fa
license: mit
pipeline_tag: text-classification
SentenceFormalityClassifier
This model classifies Persian sentences as formal or informal. It was fine-tuned on the Mohavere Dataset (Takalli, Vahideh; Kalantari, Fateme; Shamsfard, Mehrnoush. Developing an Informal-Formal Persian Corpus, 2022), starting from the pretrained model persian-t5-formality-transfer.
Evaluation Metrics
INFORMAL: Precision: 0.99 Recall: 0.99 F1-Score: 0.99
FORMAL: Precision: 0.99 Recall: 1.00 F1-Score: 0.99
Accuracy: 0.99
Macro Avg: Precision: 0.99 Recall: 0.99 F1-Score: 0.99
Weighted Avg: Precision: 0.99 Recall: 0.99 F1-Score: 0.99
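As a quick sanity check on how these figures relate to each other, F1 is the harmonic mean of precision and recall. The sketch below recomputes the per-class and macro F1 from the rounded precision and recall values reported above. The helper name `f1_score` is illustrative only, and because the inputs are already rounded to two decimals, the recomputed values are approximate.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded per-class (precision, recall) values copied from the figures above.
reported = {
    "INFORMAL": (0.99, 0.99),
    "FORMAL": (0.99, 1.0),
}

# Per-class F1, then the unweighted (macro) mean across the two classes.
per_class_f1 = {label: f1_score(p, r) for label, (p, r) in reported.items()}
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

for label, f1 in per_class_f1.items():
    print(f"{label}: F1 = {f1:.2f}")
print(f"Macro F1 = {macro_f1:.2f}")
```

Rounded to two decimals, the recomputed values agree with the reported F1 scores.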
Usage
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch
labels = ["INFORMAL", "FORMAL"]
model = AutoModelForSequenceClassification.from_pretrained('parsi-ai-nlpclass/sentence_formality_classifier')
tokenizer = AutoTokenizer.from_pretrained('parsi-ai-nlpclass/sentence_formality_classifier')
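def test_model(text):
    """Return the predicted formality label ("INFORMAL" or "FORMAL") for a sentence."""
    inputs = tokenizer(text, return_tensors='pt')
    # Inference only, so skip gradient tracking.
    with torch.no_grad():
        outputs = model(**inputs)
    # The index of the highest logit selects the label.
    predicted_label = labels[int(torch.argmax(outputs.logits))]
    return predicted_label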
# Test the model
text1 = "من فقط میخواستم بگویم که چقدر قدردان هستم."
print("Original:", text1)
print("Predicted Label:", test_model(text1))
# output: FORMAL
text2 = "آرزویش است او را یک رستوران ببرم."
print("\nOriginal:", text2)
print("Predicted Label:", test_model(text2))
# output: FORMAL
text3 = "گل منو اذیت نکنید"
print("\nOriginal:", text3)
print("Predicted Label:", test_model(text3))
# output: INFORMAL
text4 = "من این دوربین رو خالم برام کادو خرید"
print("\nOriginal:", text4)
print("Predicted Label:", test_model(text4))
# output: INFORMAL
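The snippet above returns only the top label. If a confidence score is also useful, the logits can be passed through a softmax first. The following is a minimal sketch in plain Python, assuming the same label order as above (index 0 = INFORMAL, index 1 = FORMAL). The function name `label_with_confidence` and the example logits are made up for illustration; in practice the logits would come from something like `outputs.logits[0].tolist()`.

```python
import math

# Same label order as in the usage snippet (assumed: index 0 = INFORMAL).
LABELS = ["INFORMAL", "FORMAL"]

def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_with_confidence(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits for a single sentence, not real model output.
example_logits = [-1.2, 2.3]
label, confidence = label_with_confidence(example_logits)
print(label, round(confidence, 3))
```

A low probability for the winning class can be used to flag borderline sentences for manual review, though any threshold would need to be chosen on held-out data.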