language: fa
license: mit
pipeline_tag: text-classification

SentenceFormalityClassifier

This model classifies Persian text by formality, labeling each sentence as INFORMAL or FORMAL. It was fine-tuned on the Mohavere Dataset (Takalli, Vahideh; Kalantari, Fateme; Shamsfard, Mehrnoush, Developing an Informal-Formal Persian Corpus, 2022), starting from the pretrained model persian-t5-formality-transfer.

Evaluation Metrics

INFORMAL: Precision: 0.99 Recall: 0.99 F1-Score: 0.99

FORMAL: Precision: 0.99 Recall: 1.0 F1-Score: 0.99

Accuracy: 0.99

Macro Avg: Precision: 0.99 Recall: 0.99 F1-Score: 0.99

Weighted Avg: Precision: 0.99 Recall: 0.99 F1-Score: 0.99
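
As a quick sanity check on the figures above, the per-class F1 scores can be recomputed from the reported precision and recall. This is only a sketch: the inputs are the rounded values listed above, so the results can be compared to two decimal places at best, and the helper name `f1_score` is introduced here for illustration and is not part of the model's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rounded values taken from the metrics listed above.
informal_f1 = f1_score(0.99, 0.99)
formal_f1 = f1_score(0.99, 1.0)

# Macro average: unweighted mean of the per-class scores.
macro_f1 = (informal_f1 + formal_f1) / 2

print(round(informal_f1, 2), round(formal_f1, 2), round(macro_f1, 2))
```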

Usage


from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

labels = ["INFORMAL", "FORMAL"]

model = AutoModelForSequenceClassification.from_pretrained('parsi-ai-nlpclass/sentence_formality_classifier')
tokenizer = AutoTokenizer.from_pretrained('parsi-ai-nlpclass/sentence_formality_classifier')

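def test_model(text):
    # Tokenize one sentence and return the predicted formality label.
    inputs = tokenizer(text, return_tensors='pt')
    with torch.no_grad():  # inference only, so skip gradient tracking
        outputs = model(**inputs)
    predicted_label = labels[int(torch.argmax(outputs.logits))]
    return predicted_label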

# Test the model
text1 = "من فقط می‌خواستم بگویم که چقدر قدردان هستم."
print("Original:", text1)
print("Predicted Label:", test_model(text1))

# output: FORMAL

text2 = "آرزویش است او را یک رستوران ببرم."
print("\nOriginal:", text2)
print("Predicted Label:", test_model(text2))

# output: FORMAL

text3 = "گل منو اذیت نکنید"
print("\nOriginal:", text3)
print("Predicted Label:", test_model(text3))

# output: INFORMAL

text4 = "من این دوربین رو خالم برام کادو خرید"
print("\nOriginal:", text4)
print("Predicted Label:", test_model(text4))

# output: INFORMAL
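
The usage code above returns only the top label. If a confidence score is also wanted, the two logits can be passed through a softmax. The sketch below works on a plain Python list, which could be obtained with something like `outputs.logits[0].tolist()`. The helper name `logits_to_prediction` and the example logits are made up for illustration, and the label order is assumed to match the `labels` list in the usage code.

```python
import math

# Assumed to match the `labels` order in the usage code.
LABELS = ["INFORMAL", "FORMAL"]

def logits_to_prediction(logits):
    """Convert a list of raw logits into (label, probability)."""
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the highest probability selects the label.
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits, not real model output.
label, prob = logits_to_prediction([-1.2, 2.3])
print(label, round(prob, 3))
```

A probability near 0.5 suggests the sentence sits close to the decision boundary, which may be worth flagging for manual review.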