---
language: "en"
tags:
- bert
- medical
- clinical
- assertion
- negation
- text-classification
widget:
- text: "Patient denies [entity] SOB [entity]."
---
# Clinical Assertion / Negation Classification BERT
## Model description
The Clinical Assertion and Negation Classification BERT is introduced in the paper [Assertion Detection in Clinical Notes: Medical Language Models to the Rescue?](https://aclanthology.org/2021.nlpmc-1.5/). The model helps structure information in clinical patient letters by classifying medical conditions mentioned in the letter as PRESENT, ABSENT, or POSSIBLE.
The model is based on the [ClinicalBERT - Bio + Discharge Summary BERT Model](https://huggingface.co/emilyalsentzer/Bio_Discharge_Summary_BERT) by Alsentzer et al. and fine-tuned on assertion data from the [2010 i2b2 challenge](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168320/).
#### How to use the model
You can load the model via the transformers library:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("bvanaken/clinical-assertion-negation-bert")
model = AutoModelForSequenceClassification.from_pretrained("bvanaken/clinical-assertion-negation-bert")
```
The model expects input in the form of spans/sentences with one marked entity to classify as `PRESENT (0)`, `ABSENT (1)`, or `POSSIBLE (2)`. The entity in question is marked by wrapping it in the special token `[entity]`.
Example input and inference:
```python
import numpy as np

# Sentence with the target entity wrapped in [entity] tokens
text = "The patient recovered during the night and now denies any [entity] shortness of breath [entity]."

tokenized_input = tokenizer(text, return_tensors="pt")
output = model(**tokenized_input)

# The index of the highest logit is the predicted class
predicted_label = np.argmax(output.logits.detach().numpy())  # 1 == ABSENT
```
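To turn the raw logits into a readable label with a confidence score, a small softmax-based sketch like the one below can be used. The `label_names` list simply mirrors the PRESENT (0), ABSENT (1), POSSIBLE (2) order described above; it is an illustrative assumption, not a field of the released model configuration.

```python
import torch

# Assumed label order, following the PRESENT(0)/ABSENT(1)/POSSIBLE(2) mapping above
label_names = ["PRESENT", "ABSENT", "POSSIBLE"]

# Convert logits to probabilities and pick the most likely class
probs = torch.softmax(output.logits, dim=-1)[0]
pred_id = int(torch.argmax(probs))
print(f"{label_names[pred_id]} (p={probs[pred_id]:.2f})")  # prints the predicted label and its probability
```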
### Cite
When working with the model, please cite our paper as follows:
```bibtex
@inproceedings{van-aken-2021-assertion,
title = "Assertion Detection in Clinical Notes: Medical Language Models to the Rescue?",
author = "van Aken, Betty and
Trajanovska, Ivana and
Siu, Amy and
Mayrdorfer, Manuel and
Budde, Klemens and
Loeser, Alexander",
booktitle = "Proceedings of the Second Workshop on Natural Language Processing for Medical Conversations",
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.nlpmc-1.5",
doi = "10.18653/v1/2021.nlpmc-1.5"
}
```