metadata
language:
  - it
tags:
  - Biomedical Language Modeling
widget:
  - text: >-
      L'asma allergica è una patologia dell'[MASK] respiratorio causata dalla
      presenza di allergeni responsabili dell'infiammazione dell'albero
      bronchiale.
    example_title: Example 1
  - text: >-
      Il pancreas produce diversi [MASK] molto importanti tra i quali l'insulina
      e il glucagone.
    example_title: Example 2
  - text: >-
      Il GABA è un amminoacido ed è il principale neurotrasmettitore inibitorio
      del [MASK].
    example_title: Example 3
datasets:
  - IVN-RIN/BioBERT_Italian

🤗 + 📚🩺🇮🇹 + 📖🧑‍⚕️ = MedBIT

From this repository you can download the MedBIT (Medical Bert for ITalian) checkpoint.
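As a minimal sketch of how the checkpoint might be queried, the snippet below wraps the `fill-mask` pipeline from the `transformers` library and runs it on one of the widget sentences. The Hub id `IVN-RIN/medBIT` is an assumption inferred from this repository's organization and name, and the helper names `load_fill_mask`, `top_tokens`, and `demo` are hypothetical, introduced only for illustration.

```python
from typing import Callable, Dict, List

# Assumption: the checkpoint is published on the Hugging Face Hub under this id.
MODEL_ID = "IVN-RIN/medBIT"


def load_fill_mask(model_id: str = MODEL_ID):
    """Build a fill-mask pipeline; the checkpoint is downloaded on first use."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline
    return pipeline("fill-mask", model=model_id)


def top_tokens(fill_mask: Callable[..., List[Dict]], text: str, k: int = 5) -> List[str]:
    """Return the k most likely fillers for the single [MASK] token in `text`."""
    if text.count("[MASK]") != 1:
        raise ValueError("expected exactly one [MASK] token")
    predictions = fill_mask(text, top_k=k)
    # Each prediction is a dict; `token_str` holds the proposed filler.
    return [p["token_str"].strip() for p in predictions]


def demo() -> None:
    """Query the model with one of the widget sentences from this card."""
    fill_mask = load_fill_mask()
    sentence = (
        "Il pancreas produce diversi [MASK] molto importanti "
        "tra i quali l'insulina e il glucagone."
    )
    print(top_tokens(fill_mask, sentence))
```

Calling `demo()` requires `transformers` with a backend such as PyTorch, plus network access for the first download. `top_tokens` accepts any callable with the same interface, so it can also be exercised with a stub.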

MedBIT is built on top of BioBIT and further pretrained on a corpus of medical textbooks, either written directly by Italian authors or translated by professional human translators, that are used in the formal education and specialized training of medical doctors. The corpus amounts to 100 MB of data. These comprehensive collections of medical concepts can influence how biomedical knowledge is encoded in language models, with the advantage of being natively available in Italian rather than being translated.

Check the full paper for further details, and feel free to contact us if you have any questions!