language: en
tags:
- bert
- medical
- clinical
thumbnail: https://core.app.datexis.com/static/paper.png
CORe Model - BioBERT + Clinical Outcome Pre-Training
Model description
The CORe (Clinical Outcome Representations) model is introduced in the paper "Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration". It is based on BioBERT and further pre-trained on clinical notes, disease descriptions and medical articles with a specialised Clinical Outcome Pre-Training objective.
How to use CORe
You can load the model via the transformers library:
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bvanaken/CORe-clinical-outcome-biobert-v1")
model = AutoModel.from_pretrained("bvanaken/CORe-clinical-outcome-biobert-v1")
```
From there, you can fine-tune it on clinical tasks that benefit from patient outcome knowledge.
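As a rough illustration of what a fine-tuning setup could look like, the sketch below loads CORe with a sequence classification head from the transformers library. The helper names (`build_label_maps`, `load_outcome_classifier`, `encode_notes`) and the example label set are hypothetical and not part of the original release. The classification head is randomly initialised and needs task-specific training before its predictions mean anything.

```python
MODEL_ID = "bvanaken/CORe-clinical-outcome-biobert-v1"


def build_label_maps(labels):
    """Return (id2label, label2id) dictionaries for a list of class names."""
    id2label = {i: name for i, name in enumerate(labels)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def load_outcome_classifier(labels):
    """Load CORe with a freshly initialised classification head.

    Only the encoder weights are pre-trained; the head is random until fine-tuned.
    """
    # Imported lazily so build_label_maps works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = build_label_maps(labels)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_ID,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


def encode_notes(tokenizer, notes, max_length=512):
    """Tokenize admission notes, truncating to BERT's 512-token input limit."""
    return tokenizer(
        notes,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
```

With an assumed binary label set, `tokenizer, model = load_outcome_classifier(["survived", "deceased"])` would download the weights, and `model(**encode_notes(tokenizer, notes))` would return one logit per label for each note. Any standard training loop or the transformers `Trainer` can then be used for the actual fine-tuning.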
Pre-Training Data
The model is based on BioBERT, which was pre-trained on PubMed data. The Clinical Outcome Pre-Training included discharge summaries from the MIMIC-III training set (specified here), medical transcriptions from MTSamples and clinical notes from the i2b2 challenges 2006-2012. It further included ~10k case reports from PubMed Central (PMC), disease articles from Wikipedia and article sections from the MedQuAD dataset extracted from NIH websites.
More Information
For all the details about CORe and contact info, please visit CORe.app.datexis.com.
Cite
@inproceedings{vanaken21,
author = {Betty van Aken and
Jens-Michalis Papaioannou and
Manuel Mayrdorfer and
Klemens Budde and
Felix A. Gers and
Alexander Löser},
title = {Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration},
booktitle = {Proceedings of the 16th Conference of the European Chapter of the
Association for Computational Linguistics: Main Volume, {EACL} 2021,
Online, April 19 - 23, 2021},
publisher = {Association for Computational Linguistics},
year = {2021},
}