tomaarsen HF staff committed on
Commit
c4cd982
1 Parent(s): 85dde0f

Add limitation due to RoBERTa

Files changed (1)
  1. README.md +15 -1
README.md CHANGED
@@ -11,7 +11,7 @@ pipeline_tag: token-classification
11
  widget:
12
  - text: >-
13
  Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic
14
- to Paris.
15
  example_title: Amelia Earhart
16
  model-index:
17
  - name: >-
@@ -71,4 +71,18 @@ model = SpanMarkerModel.from_pretrained("tomaarsen/span-marker-xlm-roberta-large
71
  entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
72
  ```
73
 
74
  See the [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) repository for documentation and additional information on this library.
 
11
  widget:
12
  - text: >-
13
  Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic
14
+ to Paris .
15
  example_title: Amelia Earhart
16
  model-index:
17
  - name: >-
 
71
  entities = model.predict("Amelia Earhart flew her single engine Lockheed Vega 5B across the Atlantic to Paris.")
72
  ```
73
 
74
+ ### Limitations
75
+
76
+ **Warning**: This model works best when punctuation is separated from the prior words, so
77
+ ```python
78
+ # ✅
79
+ model.predict("He plays J. Robert Oppenheimer , an American theoretical physicist .")
80
+ # ❌
81
+ model.predict("He plays J. Robert Oppenheimer, an American theoretical physicist.")
82
+
83
+ # You can also supply a list of words directly: ✅
84
+ model.predict(["He", "plays", "J.", "Robert", "Oppenheimer", ",", "an", "American", "theoretical", "physicist", "."])
85
+ ```
86
+ The same may be beneficial for some languages, such as splitting `"l'ocean Atlantique"` into `"l' ocean Atlantique"`.
87
+
88
  See the [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) repository for documentation and additional information on this library.
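
The limitation added in this commit asks callers to separate punctuation from the preceding word, or to pass a list of words. A minimal sketch of a pre-processing step that could produce such a word list is shown below. `separate_punctuation` is a hypothetical helper and is not part of SpanMarker; it assumes whitespace-delimited English text, treats a single capital letter followed by a period (such as `J.`) as an initial to keep intact, and does not handle the French elision case mentioned above.

```python
import re

# Hypothetical helper, NOT part of the SpanMarker library: a rough heuristic only.
# Matches a token that ends in one or more common punctuation characters.
_TRAILING_PUNCT = re.compile(r"^(.*?)([,.;:!?]+)$")
# Matches a single-letter initial such as "J." (assumed to stay in one piece).
_INITIAL = re.compile(r"^[A-Z]\.$")


def separate_punctuation(text: str) -> list[str]:
    """Split text on whitespace and detach trailing punctuation from each word."""
    words = []
    for token in text.split():
        match = _TRAILING_PUNCT.match(token)
        # Only split when there is a word part before the punctuation
        # and the token is not a single-letter initial.
        if match and match.group(1) and not _INITIAL.match(token):
            words.append(match.group(1))
            words.append(match.group(2))
        else:
            words.append(token)
    return words


words = separate_punctuation(
    "He plays J. Robert Oppenheimer, an American theoretical physicist."
)
print(words)
```

The resulting list could then be passed as `model.predict(words)`, in the same way as the list-of-words call shown in the diff. Abbreviations other than single-letter initials (for example `e.g.`) would still be split by this heuristic, so a proper tokenizer may be a better fit for real data.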