Update README.md
README.md
CHANGED
@@ -42,11 +42,11 @@ pip install transformers
 ```
 
 ```python
-from transformers import AutoModelForSequenceClassification,
+from transformers import AutoModelForSequenceClassification, XLMRobertaTokenizer
 import torch
 
 # Load tokenizer and model
-tokenizer =
+tokenizer = XLMRobertaTokenizer.from_pretrained("LocalDoc/language_detection")
 model = AutoModelForSequenceClassification.from_pretrained("LocalDoc/language_detection")
 
 # Prepare text
@@ -67,6 +67,35 @@ predicted_label = labels[predicted_class_index]
 print(f"Predicted Language: {predicted_label}")
 ```
 
+## Language Label Information
+
+The model outputs an integer label for each prediction. The table below maps each label to its language code and language name:
+
+| Label | Language Code | Language Name |
+|-------|---------------|---------------|
+| 0 | az | Azerbaijani |
+| 1 | ar | Arabic |
+| 2 | bg | Bulgarian |
+| 3 | de | German |
+| 4 | el | Greek |
+| 5 | en | English |
+| 6 | es | Spanish |
+| 7 | fr | French |
+| 8 | hi | Hindi |
+| 9 | it | Italian |
+| 10 | ja | Japanese |
+| 11 | nl | Dutch |
+| 12 | pl | Polish |
+| 13 | pt | Portuguese |
+| 14 | ru | Russian |
+| 15 | sw | Swahili |
+| 16 | th | Thai |
+| 17 | tr | Turkish |
+| 18 | ur | Urdu |
+| 19 | vi | Vietnamese |
+| 20 | zh | Chinese |
+
+Use this mapping to decode the model's predicted label into a language code or name before further processing or analysis.
 
 
 Training Performance
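
For reference, the label table added in this commit could be applied roughly as follows. This is a minimal sketch, not code from the README: the names `LABELS`, `decode_label`, and `decode_logits` are hypothetical, and it assumes the predicted class index is the argmax over one row of the model's logits. It uses plain Python lists so that it runs without downloading the model.

```python
from typing import Sequence, Tuple

# (language code, language name) indexed by label, copied from the README table.
LABELS = [
    ("az", "Azerbaijani"), ("ar", "Arabic"), ("bg", "Bulgarian"),
    ("de", "German"), ("el", "Greek"), ("en", "English"),
    ("es", "Spanish"), ("fr", "French"), ("hi", "Hindi"),
    ("it", "Italian"), ("ja", "Japanese"), ("nl", "Dutch"),
    ("pl", "Polish"), ("pt", "Portuguese"), ("ru", "Russian"),
    ("sw", "Swahili"), ("th", "Thai"), ("tr", "Turkish"),
    ("ur", "Urdu"), ("vi", "Vietnamese"), ("zh", "Chinese"),
]


def decode_label(index: int) -> Tuple[str, str]:
    """Return (language_code, language_name) for a predicted label index."""
    if not 0 <= index < len(LABELS):
        raise ValueError(f"label index {index} is outside 0..{len(LABELS) - 1}")
    return LABELS[index]


def decode_logits(logits: Sequence[float]) -> Tuple[str, str]:
    """Pick the highest-scoring class from one row of scores and decode it."""
    if len(logits) != len(LABELS):
        raise ValueError("expected one score per label")
    index = max(range(len(logits)), key=lambda i: logits[i])
    return decode_label(index)


if __name__ == "__main__":
    # Hypothetical scores standing in for one row of model logits.
    scores = [0.0] * len(LABELS)
    scores[5] = 3.2
    code, name = decode_logits(scores)
    print(f"Predicted Language: {name} ({code})")
```

With real model output, the row passed to `decode_logits` would presumably come from something like `outputs.logits[0].tolist()`; that step is left out here because the diff does not show the README lines between tokenization and the final `print`.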