Add limitation due to RoBERTa
README.md
CHANGED
````diff
@@ -11,7 +11,7 @@ pipeline_tag: token-classification
 widget:
 - text: >-
 Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic
-to Paris.
+to Paris .
 example_title: Amelia Earhart
 model-index:
 - name: >-
@@ -71,4 +71,18 @@ model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-xlm-roberta-large
 entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
 ```
 
+### Limitations
+
+**Warning**: This model works best when punctuation is separated from the prior words, so
+```python
+# ✅
+model.predict("He plays J. Robert Oppenheimer , an American theoretical physicist .")
+# ❌
+model.predict("He plays J. Robert Oppenheimer, an American theoretical physicist.")
+
+# You can also supply a list of words directly: ✅
+model.predict(["He", "plays", "J.", "Robert", "Oppenheimer", ",", "an", "American", "theoretical", "physicist", "."])
+```
+The same may be beneficial for some languages, such as splitting `"l'ocean Atlantique"` into `"l' ocean Atlantique"`.
+
 See the [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) repository for documentation and additional information on this library.
````
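The Limitations text added in this change asks callers to separate punctuation from the preceding word before calling `model.predict`. A minimal sketch of how that could be automated is shown here. The helper `split_punctuation` and its regular expression are hypothetical and not part of SpanMarker. The pattern is deliberately naive: it keeps single-letter initials such as `J.` together, but splits abbreviations, apostrophes, and hyphens in ways that may not match the tokenization the model was trained on.

```python
import re

# Hypothetical helper (not part of SpanMarker): a naive regex pre-tokenizer.
_TOKEN_PATTERN = re.compile(
    r"\b[A-Z]\.(?=\s|$)"  # keep single-letter initials such as "J." whole
    r"|\w+"               # runs of letters, digits, or underscores
    r"|[^\w\s]"           # every other non-space character is its own token
)


def split_punctuation(text: str) -> list[str]:
    """Split text into words and standalone punctuation marks."""
    return _TOKEN_PATTERN.findall(text)


words = split_punctuation(
    "He plays J. Robert Oppenheimer, an American theoretical physicist."
)
print(words)
print(" ".join(words))

# Either form can then be passed to the model loaded earlier in the README:
# entities = model.predict(words)
```

For this sentence the helper produces the same word list as the README example, and joining the words with spaces gives the space-separated form marked ✅. For other languages or heavier use, a language-aware tokenizer is likely a better choice than this regular expression.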