---
language: de
tags:
- negation
- speculation
- cross-lingual
- bert
- clinical/medical
- text-classification
extra_gated_prompt: >-
  You agree to not use the model to conduct experiments that cause harm to human subjects, i.e. attempting to misuse clinical data or re-identify any sensitive data.
extra_gated_fields:
  Company: text
  Country: text
  Name: text
  Email: text
  I agree to use this model for non-commercial use ONLY: checkbox
pipeline_tag: text-classification
---

# FactualMedBERT-DE: Clinical Factuality Detection BERT Model for German

## Model description

FactualMedBERT-DE is the first pre-trained language model to address the factuality/assertion detection problem in German clinical texts (primarily discharge summaries). It is introduced in the paper [Factuality Detection using Machine Translation - a Use Case for German Clinical Text](https://arxiv.org/abs/2308.08827).

The model classifies tagged medical conditions by their factuality value, assigning one of three labels: `Affirmed`, `Negated`, or `Possible`.

It was initialized from the [smanjil/German-MedBERT](https://huggingface.co/smanjil/German-MedBERT) German language model and trained on a translated subset of the data from [the 2010 i2b2/VA assertion challenge](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168320/).

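
Because the model classifies *tagged* conditions, each input has to mark its target. As the usage example in this card shows, the target is wrapped in `[unused1]` on both sides. The following is a minimal sketch of a helper that builds such an input from a character span. The function name `tag_target` and the whitespace handling are assumptions made for illustration; they are not part of the model or the paper.

```python
MARKER = "[unused1]"  # special token used in this card's usage example


def tag_target(sentence: str, start: int, end: int, marker: str = MARKER) -> str:
    """Wrap sentence[start:end] in the marker token, separated by single spaces.

    Hypothetical helper: the spacing mirrors the card's example input,
    but the exact preprocessing used for training is not documented here.
    """
    if not 0 <= start < end <= len(sentence):
        raise ValueError("span must satisfy 0 <= start < end <= len(sentence)")
    before = sentence[:start].rstrip()
    target = sentence[start:end].strip()
    after = sentence[end:].lstrip()
    # Drop empty pieces so no leading or trailing spaces are produced.
    return " ".join(part for part in (before, marker, target, marker, after) if part)


sentence = "Der Patient hat vielleicht Fieber"
start = sentence.index("Fieber")
print(tag_target(sentence, start, start + len("Fieber")))
# Der Patient hat vielleicht [unused1] Fieber [unused1]
```

For a sentence that mentions several conditions, one tagged copy per condition can be built this way and each copy classified separately.
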
## How to use the model

- You might need to authenticate and log in before you can download the model (see more [here](https://huggingface.co/docs/huggingface_hub/quick-start)).

- Get the model using the transformers library:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained("binsumait/factual-med-bert-de")
model = AutoModelForSequenceClassification.from_pretrained("binsumait/factual-med-bert-de")
```

- Predict an instance by wrapping the factuality target (ideally a medical condition) in the `[unused1]` special token:

```python
from transformers import TextClassificationPipeline

# The target "Fieber" is enclosed by [unused1] markers on both sides
instance = "Der Patient hat vielleicht [unused1] Fieber [unused1]"
factuality_pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
print(factuality_pipeline(instance))
```

which should output:

`[{'label': 'possible', 'score': 0.9744388461112976}]`

## Cite

If you use our model, please cite our paper as follows:

```bibtex
@inproceedings{bin_sumait_2023,
  title={Factuality Detection using Machine Translation - a Use Case for German Clinical Text},
  author={Bin Sumait, Mohammed and Gabryszak, Aleksandra and Hennig, Leonhard and Roller, Roland},
  booktitle={Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)},
  year={2023}
}
```