---
language: en
license: apache-2.0
datasets:
- conll2003
model-index:
- name: elastic/distilbert-base-cased-finetuned-conll03-english
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: conll2003
      type: conll2003
      config: conll2003
      split: validation
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9834432212868665
      verified: true
    - name: Precision
      type: precision
      value: 0.9857564461012737
      verified: true
    - name: Recall
      type: recall
      value: 0.9882123948925569
      verified: true
    - name: F1
      type: f1
      value: 0.9869828926905132
      verified: true
    - name: loss
      type: loss
      value: 0.07748260349035263
      verified: true
---

[DistilBERT base cased](https://huggingface.co/distilbert-base-cased), fine-tuned for NER using the [conll03 english dataset](https://huggingface.co/datasets/conll2003). Note that this model is case-sensitive: "english" is treated differently from "English". For the case-insensitive version, please use [elastic/distilbert-base-uncased-finetuned-conll03-english](https://huggingface.co/elastic/distilbert-base-uncased-finetuned-conll03-english).
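
As a rough illustration of how the model might be used for inference, the sketch below wraps the `transformers` token-classification pipeline. The helper name `tag_entities` is hypothetical and not part of this repository; the snippet assumes `transformers` and a backend such as PyTorch are installed and that the model can be downloaded from the Hub.

```python
MODEL_ID = "elastic/distilbert-base-cased-finetuned-conll03-english"


def tag_entities(text):
    """Run NER on `text` and return the pipeline's per-token predictions.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    # "ner" is the pipeline task name for token classification.
    ner = pipeline("ner", model=MODEL_ID, tokenizer=MODEL_ID)
    return ner(text)
```

Calling `tag_entities("My name is Sarah and I live in London")` should return a list of dictionaries, one per tagged token, with keys such as `word`, `entity`, and `score`. Because the model is cased, lowercasing the input first may change the predictions.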

## Versions

- Transformers version: 4.3.1
- Datasets version: 1.3.0

## Training

```
$ run_ner.py \
  --model_name_or_path distilbert-base-cased \
  --label_all_tokens True \
  --return_entity_level_metrics True \
  --dataset_name conll2003 \
  --output_dir /tmp/distilbert-base-cased-finetuned-conll03-english \
  --do_train \
  --do_eval
```

After training, we update the model's labels to match the NER-specific labels from the
[conll2003](https://raw.githubusercontent.com/huggingface/datasets/1.3.0/datasets/conll2003/dataset_infos.json) dataset.
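
The exact procedure used for this label update is not documented here. One plausible way to do it, sketched below, is to rewrite the `id2label` and `label2id` maps in the model's `config.json`. The helper `with_ner_labels` and the placeholder starting config are assumptions made for illustration; the label order is the one defined by the `ner_tags` feature of the conll2003 dataset.

```python
import json

# Label order of the `ner_tags` feature in the conll2003 dataset.
CONLL2003_NER_LABELS = [
    "O",
    "B-PER", "I-PER",
    "B-ORG", "I-ORG",
    "B-LOC", "I-LOC",
    "B-MISC", "I-MISC",
]


def with_ner_labels(config, labels=CONLL2003_NER_LABELS):
    """Return a copy of a config dict with NER label maps filled in.

    Hypothetical helper, not part of the original training script.
    """
    updated = dict(config)
    # JSON object keys are strings, so id2label is keyed by "0", "1", ...
    updated["id2label"] = {str(i): label for i, label in enumerate(labels)}
    updated["label2id"] = {label: i for i, label in enumerate(labels)}
    return updated


# Assumed starting point: a config with generic placeholder label names.
generic = {
    "id2label": {str(i): f"LABEL_{i}" for i in range(9)},
    "label2id": {f"LABEL_{i}": i for i in range(9)},
}

patched = with_ner_labels(generic)
print(json.dumps(patched["id2label"], indent=2))
```

In practice the dict would be loaded from and written back to the `config.json` in the output directory, so that predictions are reported as `B-PER`, `I-LOC`, and so on rather than as generic label names.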