---
license: cc-by-nc-4.0
---

KEPTLongformer, pretrained with [contrastive learning](https://arxiv.org/pdf/2210.03304.pdf). The model is first initialized from RoBERTa-base-PM-M3-Voc-distill from [bio-lm](https://github.com/facebookresearch/bio-lm/blob/main/README.md), and then pretrained with Hierarchical Self-Alignment Pretraining (HSAP) using the UMLS knowledge graph. HSAP consists of three tasks: (a) Hierarchy, (b) Synonym, and (c) Abbreviation. For more details, see Section 3.3 of the [paper](https://arxiv.org/pdf/2210.03304.pdf).

See [here](https://github.com/whaleloops/KEPT/tree/rerank300) for how to use this model for automatic ICD coding, which yields the following results:

| Metric | Score |
| ------------- | ------------- |
| rec_micro | 0.5844294992252652 |
| rec_macro | 0.12471916602840005 |
| rec_at_8 | 0.4138093882408751 |
| rec_at_75 | 0.8581874197033126 |
| rec_at_50 | 0.8109877644497351 |
| rec_at_5 | 0.2923155353947738 |
| rec_at_15 | 0.586890060777621 |
| prec_micro | 0.6537291416981642 |
| prec_macro | 0.1382069689951297 |
| prec_at_8 | 0.7835112692763938 |
| prec_at_75 | 0.20033214709371291 |
| prec_at_50 | 0.2810260972716489 |
| prec_at_5 | 0.8551008303677343 |
| prec_at_15 | 0.6288256227758008 |
| f1_micro | 0.6171399726721254 |
| f1_macro | 0.13111711325953157 |
| f1_at_8 | 0.54158310388029 |
| f1_at_75 | 0.324835806140454 |
| f1_at_50 | 0.4174099512237087 |
| f1_at_5 | 0.4356905906241822 |
| f1_at_15 | 0.6071345676658747 |
| auc_micro | 0.9653561390964384 |
| auc_macro | 0.8572490224880879 |
| acc_micro | 0.4462779749767132 |
| acc_macro | 0.09732882850157536 |
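
As a quick sanity check, the checkpoint can be loaded with the Hugging Face `transformers` library. This is a minimal sketch, assuming the checkpoint is published on the Hub: the `model_id` below is a placeholder for this model card's actual repository ID, and the example only runs a plain forward pass to obtain contextual embeddings. For the full auto ICD coding pipeline (prompting and reranking), follow the KEPT repository linked above.

```python
# Minimal sketch: load the encoder and run a forward pass.
# NOTE: model_id is a placeholder; substitute this model card's actual Hub ID.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "whaleloops/keptlongformer"  # placeholder Hub repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)
model.eval()

# Encode a short clinical note snippet.
note = "Patient admitted with acute exacerbation of congestive heart failure."
inputs = tokenizer(note, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```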