96abhishekarora committed on
Commit
5e651fe
1 Parent(s): a0a938a

Update README.md

  1. README.md +9 -1
README.md CHANGED
@@ -9,4 +9,12 @@ tags:
9
  - Wikipedia
10
  - newspaper
11
  - news
12
- ---
 
 
 
 
 
 
 
 
 
+ ---
13
+
14
+ This model was contrastively trained for entity coreference on a dataset constructed from mentions of the same entity. It requires input text in which entities have already been detected via NER, and it focuses specifically on Person [PER] tags.
15
+
16
+ We start with a base S-BERT MPNet bi-encoder model (18). This is contrastively trained on 179 million pairs taken from mentions of entities on Wikipedia, where positives are mentions of the same individual. Hard negatives are mined from individuals that appear on the same disambiguation pages. Embeddings from the tuned coreference resolution model are then clustered using Hierarchical Agglomerative Clustering.
17
+
18
+ More information about its training (and use) can be found in the associated code [repo](https://github.com/dell-research-harvard/newswire/tree/main) and [paper](https://arxiv.org/pdf/2406.09490).
19
+
20
+
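
The clustering step described in the README can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the average-linkage rule, the cosine distance, the `threshold` value, and the toy three-dimensional vectors are all assumptions made for illustration. In practice the embeddings would come from the tuned bi-encoder, and a library implementation of agglomerative clustering would likely be used instead.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative_cluster(embeddings, threshold):
    """Average-linkage agglomerative clustering (illustrative sketch).

    Starts with one cluster per embedding and repeatedly merges the two
    closest clusters until the closest pair is farther apart than
    `threshold`. Returns one integer cluster label per embedding.
    """
    clusters = [[i] for i in range(len(embeddings))]

    def linkage(a, b):
        # Average pairwise distance between members of clusters a and b.
        total = sum(
            cosine_distance(embeddings[i], embeddings[j]) for i in a for j in b
        )
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > threshold:
            break  # no remaining pair is close enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical mention embeddings: two pairs of near-duplicates and one outlier.
toy_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.0, 1.0],
]
labels = agglomerative_cluster(toy_embeddings, threshold=0.2)
print(labels)
```

Mentions that share a label are treated as referring to the same person. A distance threshold, rather than a fixed number of clusters, is used here because the number of distinct individuals in a corpus is usually not known in advance; the value 0.2 is arbitrary and would need tuning on real embeddings.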