julien-c HF staff committed on
Commit
0c6b2ba
1 Parent(s): 6930a53

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/gurkan08/bert-turkish-text-classification/README.md

Files changed (1)
  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ language: tr
+ ---
+ # Turkish News Text Classification
+
+ A Turkish text classification model obtained by fine-tuning the Turkish BERT model (dbmdz/bert-base-turkish-cased).
+
+ # Dataset
+
+ The dataset consists of 11 classes and was collected from https://www.trthaber.com/. The model was trained on the 6 most distinctive classes.
+
+ The dataset can be accessed at https://github.com/gurkan08/datasets/tree/master/trt_11_category.
+
+ label_dict = {
+     'LABEL_0': 'ekonomi',
+     'LABEL_1': 'spor',
+     'LABEL_2': 'saglik',
+     'LABEL_3': 'kultur_sanat',
+     'LABEL_4': 'bilim_teknoloji',
+     'LABEL_5': 'egitim'
+ }
+
+ 70% of the data were used for training and 30% for testing.
+
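The model card does not say how the 70/30 split was drawn. As a minimal sketch, assuming a simple shuffled split, it could look like the following. The function name `split_70_30`, the seed, and the toy samples are hypothetical and are not taken from the original training code.

```python
import random


def split_70_30(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data: (text, label) pairs standing in for news articles.
toy_samples = [(f"haber {i}", "spor" if i % 2 else "ekonomi") for i in range(10)]
train, test = split_70_30(toy_samples)
print(len(train), len(test))
```

A plain shuffle does not guarantee that each class keeps the same proportion in both parts. A stratified split would be the usual choice if the classes are imbalanced.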
+ train f1-weighted score = 97%
+
+ test f1-weighted score = 94%
+
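For readers unfamiliar with the metric, a weighted F1 score is the mean of the per-class F1 scores, with each class weighted by its number of true samples. The sketch below illustrates that definition in plain Python. The function name `f1_weighted` and the toy labels are hypothetical, and the model card does not state which tool produced the reported scores.

```python
from collections import Counter


def f1_weighted(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true samples per class
    total = 0.0
    for label, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = sum(1 for p in y_pred if p == label)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += f1 * n_true  # weight each class by its support
    return total / len(y_true)


# Hypothetical toy labels, not taken from the TRT dataset.
y_true = ["spor", "spor", "saglik", "ekonomi"]
y_pred = ["spor", "saglik", "saglik", "ekonomi"]
score = f1_weighted(y_true, y_pred)
print(score)
```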
+ # Usage
+
+ from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
+
+ # Load the fine-tuned tokenizer and classifier
+ tokenizer = AutoTokenizer.from_pretrained("gurkan08/bert-turkish-text-classification")
+ model = AutoModelForSequenceClassification.from_pretrained("gurkan08/bert-turkish-text-classification")
+
+ # "sentiment-analysis" is the sequence-classification pipeline; here it predicts news categories
+ nlp = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
+
+ text = ["Süper Lig'in 6. haftasında Sivasspor ile Çaykur Rizespor karşı karşıya geldi...",
+         "Son 24 saatte 69 kişi Kovid-19 nedeniyle yaşamını yitirdi, 1573 kişi iyileşti"]
+
+ out = nlp(text)
+
+ # Map the model's generic labels to category names
+ label_dict = {
+     'LABEL_0': 'ekonomi',
+     'LABEL_1': 'spor',
+     'LABEL_2': 'saglik',
+     'LABEL_3': 'kultur_sanat',
+     'LABEL_4': 'bilim_teknoloji',
+     'LABEL_5': 'egitim'
+ }
+
+ results = []
+ for result in out:
+     result['label'] = label_dict[result['label']]
+     results.append(result)
+ print(results)
+
+ # > [{'label': 'spor', 'score': 0.9992026090621948}, {'label': 'saglik', 'score': 0.9972177147865295}]
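The loop in the usage example rewrites each prediction dict in place. If the raw pipeline output is still needed afterwards, a non-mutating variant may be preferable. The sketch below is one way to do that. The helper name `to_category_names` and the sample `raw` predictions are hypothetical, and the helper needs no model download because it only works on dicts shaped like the pipeline output.

```python
LABEL_DICT = {
    "LABEL_0": "ekonomi",
    "LABEL_1": "spor",
    "LABEL_2": "saglik",
    "LABEL_3": "kultur_sanat",
    "LABEL_4": "bilim_teknoloji",
    "LABEL_5": "egitim",
}


def to_category_names(predictions, label_dict=LABEL_DICT):
    """Return new dicts with raw labels replaced by category names.

    Labels missing from `label_dict` are passed through unchanged.
    """
    return [
        {**pred, "label": label_dict.get(pred["label"], pred["label"])}
        for pred in predictions
    ]


# Hypothetical predictions shaped like the pipeline output.
raw = [
    {"label": "LABEL_1", "score": 0.99},
    {"label": "LABEL_2", "score": 0.98},
]
named = to_category_names(raw)
print(named)
```

Using `dict.get` with a fallback avoids a `KeyError` if the model ever returns a label that is not in the mapping.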