---
language: bn
tags:
- collaborative
- bengali
- NER
license: apache-2.0
datasets: xtreme
metrics:
- Loss
- Accuracy
- Precision
- Recall
---
# sahajBERT Named Entity Recognition
## Model description
[sahajBERT](https://huggingface.co/neuropark/sahajBERT) fine-tuned for NER using the Bengali split of [WikiANN](https://huggingface.co/datasets/wikiann).
Named Entities predicted by the model:
| Label id | Label |
|:--------:|:----:|
|0 |O|
|1 |B-PER|
|2 |I-PER|
|3 |B-ORG|
|4 |I-ORG|
|5 |B-LOC|
|6 |I-LOC|
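If you need this mapping programmatically, it can be written as a plain dictionary (a sketch; the authoritative mapping is stored in the model's `config.id2label`):

```python
# Label-id mapping from the table above (hypothetical convenience dict;
# prefer model.config.id2label once the model is loaded).
id2label = {
    0: "O",
    1: "B-PER",
    2: "I-PER",
    3: "B-ORG",
    4: "I-ORG",
    5: "B-LOC",
    6: "I-LOC",
}
label2id = {label: idx for idx, label in id2label.items()}
```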
## Intended uses & limitations
#### How to use
You can use this model directly with a pipeline for token classification (NER):
```python
from transformers import AlbertForTokenClassification, TokenClassificationPipeline, PreTrainedTokenizerFast
# Initialize tokenizer
tokenizer = PreTrainedTokenizerFast.from_pretrained("neuropark/sahajBERT-NER")
# Initialize model
model = AlbertForTokenClassification.from_pretrained("neuropark/sahajBERT-NER")
# Initialize pipeline
pipeline = TokenClassificationPipeline(tokenizer=tokenizer, model=model)
raw_text = "এই ইউনিয়নে ৩ টি মৌজা ও ১০ টি গ্রাম আছে ।"  # "This union has 3 mouzas and 10 villages." -- change me
output = pipeline(raw_text)
```
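Each element of `output` is a dictionary describing one predicted token; a minimal way to inspect the entities (field names as returned by `TokenClassificationPipeline` without aggregation):

```python
# Each prediction carries the token text, its entity label, a confidence
# score, and character offsets ("start"/"end") into raw_text.
for entity in output:
    print(entity["word"], entity["entity"], round(entity["score"], 3))
```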
#### Limitations and bias
<!-- Provide examples of latent issues and potential remediations. -->
WIP
## Training data
The model was initialized with the pre-trained weights of [sahajBERT](https://huggingface.co/neuropark/sahajBERT) at step 19519 and fine-tuned on the Bengali split of [WikiANN](https://huggingface.co/datasets/wikiann).
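For reference, the Bengali split can be loaded with the `datasets` library (a sketch; `"bn"` is the WikiANN config name for Bengali, and the exact preprocessing used for fine-tuning is not documented here):

```python
from datasets import load_dataset

# Load the Bengali portion of WikiANN; each example has "tokens" and "ner_tags".
wikiann_bn = load_dataset("wikiann", "bn")
example = wikiann_bn["train"][0]
print(example["tokens"], example["ner_tags"])
```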
## Training procedure
Coming soon!
## Eval results
- Loss: 0.11714419722557068
- Accuracy: 0.9772286821705426
- Precision: 0.9585365853658536
- Recall: 0.9651277013752456
- F1: 0.9618208516886931
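For context, token-classification metrics of this kind are typically computed with `seqeval`; a minimal sketch (the exact evaluation script for this model is not published here):

```python
import evaluate

# seqeval scores entity spans from BIO-tagged sequences; predictions and
# references below are placeholder examples, not model outputs.
seqeval = evaluate.load("seqeval")
predictions = [["B-PER", "I-PER", "O", "B-LOC"]]
references = [["B-PER", "I-PER", "O", "B-LOC"]]
print(seqeval.compute(predictions=predictions, references=references))
```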
### BibTeX entry and citation info
Coming soon!
<!-- ```bibtex
@inproceedings{...,
year={2020}
}
``` -->