1-800-BAD-CODE
committed on
Commit
•
1d4fc79
1
Parent(s):
affeb69
Update README.md
README.md
CHANGED
@@ -733,3 +733,84 @@ seg test report:
```
</details>
```
</details>

# Acronyms, abbreviations, and bi-capitalized words

This section briefly demonstrates the model's behavior when presented with the following:

1. Acronyms: "NATO"
2. Fake acronyms: "NHTG" in place of "NATO"
3. Ambiguous terms that could be an acronym or a proper noun: "Tuny"
4. Bi-capitalized words: "McDavid"
5. Initialisms: "p.m."

749 |
+
<details open>
|
750 |
+
|
751 |
+
<summary>Acronyms, etc. inputs</summary>
|
752 |
+
|
753 |
+
```python
|
754 |
+
from typing import List
|
755 |
+
|
756 |
+
from punctuators.models import PunctCapSegModelONNX
|
757 |
+
|
758 |
+
m: PunctCapSegModelONNX = PunctCapSegModelONNX.from_pretrained(
|
759 |
+
"1-800-BAD-CODE/xlm-roberta_punctuation_fullstop_truecase"
|
760 |
+
)
|
761 |
+
|
762 |
+
input_texts = [
|
763 |
+
"the us is a nato member as a nato member the country enjoys security guarantees notably article 5",
|
764 |
+
"the us is a nhtg member as a nhtg member the country enjoys security guarantees notably article 5",
|
765 |
+
"the us is a tuny member as a tuny member the country enjoys security guarantees notably article 5",
|
766 |
+
"connor andrew mcdavid is a canadian professional ice hockey centre and captain of the edmonton oilers of the national hockey league the oilers selected him first overall in the 2015 nhl entry draft mcdavid spent his childhood playing ice hockey against older children",
|
767 |
+
"please rsvp for the party asap preferably before 8 pm tonight",
|
768 |
+
]
|
769 |
+
|
770 |
+
results: List[List[str]] = m.infer(
|
771 |
+
texts=input_texts, apply_sbd=True,
|
772 |
+
)
|
773 |
+
for input_text, output_texts in zip(input_texts, results):
|
774 |
+
print(f"Input: {input_text}")
|
775 |
+
print(f"Outputs:")
|
776 |
+
for text in output_texts:
|
777 |
+
print(f"\t{text}")
|
778 |
+
print()
|
779 |
+
|
780 |
+
```
|
781 |
+
|
782 |
+
</details>

<details open>

<summary>Expected output</summary>

```text
Input: the us is a nato member as a nato member the country enjoys security guarantees notably article 5
Outputs:
    The U.S. is a NATO member.
    As a NATO member, the country enjoys security guarantees, notably Article 5.

Input: the us is a nhtg member as a nhtg member the country enjoys security guarantees notably article 5
Outputs:
    The U.S. is a NHTG member.
    As a NHTG member, the country enjoys security guarantees, notably Article 5.

Input: the us is a tuny member as a tuny member the country enjoys security guarantees notably article 5
Outputs:
    The U.S. is a Tuny member.
    As a Tuny member, the country enjoys security guarantees, notably Article 5.

Input: connor andrew mcdavid is a canadian professional ice hockey centre and captain of the edmonton oilers of the national hockey league the oilers selected him first overall in the 2015 nhl entry draft mcdavid spent his childhood playing ice hockey against older children
Outputs:
    Connor Andrew McDavid is a Canadian professional ice hockey centre and captain of the Edmonton Oilers of the National Hockey League.
    The Oilers selected him first overall in the 2015 NHL entry draft.
    McDavid spent his childhood playing ice hockey against older children.

Input: please rsvp for the party asap preferably before 8 pm tonight
Outputs:
    Please RSVP for the party ASAP, preferably before 8 p.m. tonight.
```

</details>
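
The casing behavior shown above can also be checked programmatically. The following is a minimal sketch, assuming the outputs are lists of sentences like those returned by `m.infer` with `apply_sbd=True`. `find_casings` is a hypothetical helper written for this illustration and is not part of `punctuators`; the sample sentences are copied from the expected output, so the snippet runs without loading the model.

```python
import re
from typing import List


def find_casings(sentences: List[str], term: str) -> List[str]:
    """Return every whole-word occurrence of `term`, with the casing found in the text."""
    # Hypothetical helper: match the term case-insensitively on word boundaries
    pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
    return [match.group(0) for s in sentences for match in pattern.finditer(s)]


# Sentences copied from the expected output above
nato_outputs = [
    "The U.S. is a NATO member.",
    "As a NATO member, the country enjoys security guarantees, notably Article 5.",
]
mcdavid_outputs = [
    "Connor Andrew McDavid is a Canadian professional ice hockey centre and captain of the Edmonton Oilers of the National Hockey League.",
    "The Oilers selected him first overall in the 2015 NHL entry draft.",
    "McDavid spent his childhood playing ice hockey against older children.",
]

print(find_casings(nato_outputs, "nato"))        # ['NATO', 'NATO']
print(find_casings(mcdavid_outputs, "mcdavid"))  # ['McDavid', 'McDavid']
```

The helper only handles terms without internal punctuation, so a term such as "p.m." would need a different pattern.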