---
library_name: transformers
license: cc-by-nc-4.0
language:
- az
pipeline_tag: token-classification
tags:
- NER
- Named Entity Recognition
widget:
- text: >-
    İyunun 11-i saat 20:55 radələrində Oğuz rayonu Tayıflı, Şirvanlı, Xalxal
    kəndlərinə diametri 10 mm olan dolu düşüb.
datasets:
- LocalDoc/azerbaijani-ner-dataset
---

# Azerbaijani Named Entity Recognition (NER) Model

This repository contains the code and model for Named Entity Recognition (NER) in the Azerbaijani language. The model is based on the XLM-RoBERTa architecture and fine-tuned for token classification on a custom dataset.

## Model Description

The model recognizes the following entity types:

- LABEL_0: **O**: Outside any named entity
- LABEL_1: **PERSON**: Names of individuals
- LABEL_2: **LOCATION**: Geographical locations, both man-made and natural
- LABEL_3: **ORGANISATION**: Names of companies, institutions
- LABEL_4: **DATE**: Dates or periods
- LABEL_5: **TIME**: Times of the day
- LABEL_6: **MONEY**: Monetary values
- LABEL_7: **PERCENTAGE**: Percentage values
- LABEL_8: **FACILITY**: Buildings, airports, etc.
- LABEL_9: **PRODUCT**: Products and goods
- LABEL_10: **EVENT**: Events and occurrences
- LABEL_11: **ART**: Artworks, titles of books, songs
- LABEL_12: **LAW**: Legal documents
- LABEL_13: **LANGUAGE**: Languages
- LABEL_14: **GPE**: Countries, cities, states
- LABEL_15: **NORP**: Nationalities or religious or political groups
- LABEL_16: **ORDINAL**: Ordinal numbers
- LABEL_17: **CARDINAL**: Cardinal numbers
- LABEL_18: **DISEASE**: Diseases and medical conditions
- LABEL_19: **CONTACT**: Contact information, e.g., phone numbers, emails
- LABEL_20: **ADAGE**: Proverbs, sayings
- LABEL_21: **QUANTITY**: Measurements and quantities
- LABEL_22: **MISCELLANEOUS**: Miscellaneous entities
- LABEL_23: **POSITION**: Professional or social positions
- LABEL_24: **PROJECT**: Names of projects or programs
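Since the label ids in the list above are consecutive, the mapping can also be kept as a plain Python list indexed by id. The following is a minimal sketch; `ENTITY_TYPES` and `decode_label` are illustrative names, not part of the model or of `transformers`.

```python
# Entity names in label-id order, copied from the list above.
ENTITY_TYPES = [
    "O", "PERSON", "LOCATION", "ORGANISATION", "DATE", "TIME", "MONEY",
    "PERCENTAGE", "FACILITY", "PRODUCT", "EVENT", "ART", "LAW", "LANGUAGE",
    "GPE", "NORP", "ORDINAL", "CARDINAL", "DISEASE", "CONTACT", "ADAGE",
    "QUANTITY", "MISCELLANEOUS", "POSITION", "PROJECT",
]


def decode_label(label: str) -> str:
    """Translate a raw label such as 'LABEL_14' into its entity name."""
    prefix, _, index = label.rpartition("_")
    # Reject anything that is not of the form LABEL_<id> with a known id.
    if prefix != "LABEL" or not index.isdecimal() or int(index) >= len(ENTITY_TYPES):
        raise ValueError(f"Unexpected label: {label!r}")
    return ENTITY_TYPES[int(index)]


print(decode_label("LABEL_14"))  # GPE
```

Raising on unknown labels, rather than returning a default, makes a mismatch between this list and the model's outputs visible early.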

## Installation

Install the required libraries with `pip`:

```bash
pip install transformers
pip install torch  # PyTorch backend used by the model classes in the example below
pip install datasets
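```

## Usage

The following example loads the model and tokenizer, runs the NER pipeline on a sample sentence, and maps the generic `LABEL_n` outputs to entity names:

```python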
from transformers import pipeline, XLMRobertaTokenizerFast, XLMRobertaForTokenClassification

# Load the model and tokenizer
tokenizer = XLMRobertaTokenizerFast.from_pretrained("LocalDoc/ner_azerbaijan")
model = XLMRobertaForTokenClassification.from_pretrained("LocalDoc/ner_azerbaijan")

# Create NER pipeline
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

# Example text
example = "Komitədən bildirilib ki, sovet dövründə Azərbaycanda cəmi 17 məscid fəaliyyət göstərirdisə, dövlət müstəqilliyinin bərpasından sonra ölkədə 814 məscid tikilib."

# Perform NER
ner_results = nlp(example)

# Mapping of label indices to their descriptions
label_mapping = {
    0: "O",
    1: "PERSON",
    2: "LOCATION",
    3: "ORGANISATION",
    4: "DATE",
    5: "TIME",
    6: "MONEY",
    7: "PERCENTAGE",
    8: "FACILITY",
    9: "PRODUCT",
    10: "EVENT",
    11: "ART",
    12: "LAW",
    13: "LANGUAGE",
    14: "GPE",
    15: "NORP",
    16: "ORDINAL",
    17: "CARDINAL",
    18: "DISEASE",
    19: "CONTACT",
    20: "ADAGE",
    21: "QUANTITY",
    22: "MISCELLANEOUS",
    23: "POSITION",
    24: "PROJECT"
}

# Print results with mapped entity types
for result in ner_results:
    entity_group = result['entity_group']
    entity_description = label_mapping[int(entity_group.split('_')[-1])]
    print({
        'entity_group': entity_description,
        'score': result['score'],
        'word': result['word'],
        'start': result['start'],
        'end': result['end']
    })
```
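The printed results may include spans tagged `LABEL_0`, the "outside any entity" class. As far as we can tell, the pipeline's default `ignore_labels` setting targets a label literally named `O`, so it would not skip `LABEL_0`. The sketch below filters such spans and low-confidence predictions, then groups the remaining text by label. The helper names and the `min_score` cutoff are assumptions made for illustration, and the sample list is hand-written in the shape of the pipeline output rather than taken from the model.

```python
from collections import defaultdict

OUTSIDE_LABEL = "LABEL_0"  # the "O" class in the label list


def keep_entities(results, min_score=0.5):
    """Drop non-entity spans and spans scoring below min_score (a hypothetical cutoff)."""
    return [
        r for r in results
        if r["entity_group"] != OUTSIDE_LABEL and r["score"] >= min_score
    ]


def group_by_label(results):
    """Collect the matched text of each span under its raw label."""
    grouped = defaultdict(list)
    for r in results:
        grouped[r["entity_group"]].append(r["word"])
    return dict(grouped)


# Hand-written sample in the shape of the pipeline output; labels and scores are invented.
sample = [
    {"entity_group": "LABEL_0", "score": 0.99, "word": "Komitədən bildirilib ki,"},
    {"entity_group": "LABEL_14", "score": 0.97, "word": "Azərbaycanda"},
    {"entity_group": "LABEL_17", "score": 0.91, "word": "17"},
    {"entity_group": "LABEL_17", "score": 0.42, "word": "814"},
]

print(group_by_label(keep_entities(sample)))
# {'LABEL_14': ['Azərbaycanda'], 'LABEL_17': ['17']}
```

With real output, pass `ner_results` in place of `sample`; a suitable threshold depends on the application and is worth tuning on your own data.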

## License

This model is licensed under the CC BY-NC-ND 4.0 license. Its main terms are:

- **Attribution**: You must give appropriate credit, provide a link to the license, and indicate if changes were made.
- **Non-Commercial**: You may not use the material for commercial purposes.
- **No Derivatives**: If you remix, transform, or build upon the material, you may not distribute the modified material.

For more information, please refer to the [CC BY-NC-ND 4.0 license](https://creativecommons.org/licenses/by-nc-nd/4.0/).

## Contact

For more information, questions, or issues, please contact LocalDoc at v.resad.89@gmail.com.