Model Description
This model has been fine-tuned using dbmdz/bert-base-turkish-128k-uncased model.
This model created for detecting gibberish sentences like "adssnfjnfjn" . It is a simple binary classification project that gives sentence is gibberish or real.
Usage
from transformers import AutoModelForSequenceClassification, AutoTokenizer
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = AutoModelForSequenceClassification.from_pretrained("TURKCELL/gibberish-detection-model-tr")
tokenizer = AutoTokenizer.from_pretrained("TURKCELL/gibberish-detection-model-tr", do_lower_case=True, use_fast=True)
model.to(device)
def get_result_for_one_sample(model, tokenizer, device, sample):
d = {
1: 'gibberish',
0: 'real'
}
test_sample = tokenizer([sample], padding=True, truncation=True, max_length=256, return_tensors='pt').to(device)
# test_sample
output = model(**test_sample)
y_pred = np.argmax(output.logits.detach().to('cpu').numpy(), axis=1)
return d[y_pred[0]]
sentence = "nabeer rdahdaajdajdnjnjf"
result = get_result_for_one_sample(model, tokenizer, device, sentence)
print(result)
- Downloads last month
- 569
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.