YiDuo1999 committed on
Commit
deb94f9
1 Parent(s): 0e81185

Update README.md

  1. README.md +38 -1
README.md CHANGED
@@ -43,4 +43,41 @@ the type of answer is :
  ```
  HIV, or Human Immunodeficiency Virus, is a retrovirus that primarily infects cells of the human immune system, particularly CD4+ T cells, which are crucial to the body's ability to fight off infection. HIV infection can lead to AIDS, or Acquired Immune Deficiency Syndrome, a condition that causes severe damage to the immune system and makes individuals more susceptible to life-threatening infections. HIV
  is transmitted through sexual contact, sharing needles, or through mother-to-child transmission during pregnancy.
- ```
  ```
  HIV, or Human Immunodeficiency Virus, is a retrovirus that primarily infects cells of the human immune system, particularly CD4+ T cells, which are crucial to the body's ability to fight off infection. HIV infection can lead to AIDS, or Acquired Immune Deficiency Syndrome, a condition that causes severe damage to the immune system and makes individuals more susceptible to life-threatening infections. HIV
  is transmitted through sexual contact, sharing needles, or through mother-to-child transmission during pregnancy.
+ ```
+
+ ## 🏆 Evaluation
+ For question-answering tasks, we have
+
+ | Model | MMLU-Medical | PubMedQA | MedMCQA | MedQA-4-Option | Avg |
+ |:--------------------------------|:--------------|:----------|:---------|:----------------|:------|
+ | Mistral-7B-instruct | 55.8 | 17.8 | 40.2 | 41.1 | 37.5 |
+ | Zephyr-7B-instruct-β | 63.3 | 46.0 | 43.0 | 48.5 | 48.7 |
+ | PMC-Llama-7B | 59.7 | 59.2 | 57.6 | 49.2 | 53.6 |
+ | Medalpaca-13B | 55.2 | 50.4 | 21.2 | 20.2 | 36.7 |
+ | AlpaCare-13B | 60.2 | 53.8 | 38.5 | 30.4 | 45.7 |
+ | BioMedGPT-LM 7B | 52.0 | 58.6 | 34.9 | 39.3 | 46.2 |
+ | Me-Llama-13B | - | 70.0 | 44.9 | 42.7 | - |
+ | Llama-3-8B instruct | 82.0 | 74.6 | 57.1 | 60.3 | 68.5 |
+ | JSL-Med-Sft-Llama-3-8B | 83.0 | 75.4 | 57.5 | 74.8 | 72.7 |
+ | GPT-3.5-turbo-1106 | 74.0 | 72.6 | 34.9 | 39.3 | 60.6 |
+ | GPT-4 | 85.5 | 69.2 | 69.5 | 83.9 | 77.0 |
+ | Llama-3-physician-8B instruct (ours) | 80.0 | 76.0 | 80.2 | 60.3 | 74.1 |
+
+ For medical classification, relation extraction, natural language inference, and summarization tasks, we have
+
+
+ | Task type | Classification | Relation extraction | Natural Language Inference | Summarization |
+ |:--------------------------------|:----------------|:----------------------|:----------------------------|:---------------|
+ | Datasets | HOC | DDI-2013 | BioNLI | MIMIC-CXR |
+ | Mistral-7B-instruct | 35.8 | 14.1 | 16.7 | 12.5 |
+ | Zephyr-7B-instruct-β | 26.1 | 19.4 | 19.9 | 10.5 |
+ | PMC-Llama-7B | 18.4 | 14.7 | 15.9 | 13.9 |
+ | Medalpaca-13B | 24.6 | 5.8 | 16.4 | 1.0 |
+ | AlpaCare-13B | 26.7 | 11.0 | 17.0 | 13.4 |
+ | BioMedGPT-LM 7B | 23.4 | 15.5 | 17.9 | 6.2 |
+ | Me-Llama-13B | 33.5 | 21.4 | 19.5 | 40.0 |
+ | JSL-Med-Sft-Llama-3-8B | 25.6 | 19.7 | 16.6 | 13.8 |
+ | Llama-3-8B instruct | 31.0 | 15.1 | 18.8 | 10.3 |
+ | GPT-3.5-turbo-1106 | 54.5 | 21.6 | 31.7 | 13.5 |
+ | GPT-4 | 60.2 | 29.2 | 57.8 | 15.2 |
+ | Llama-3-physician-8B instruct (ours) | 78.9 | 33.6 | 76.2 | 37.7 |
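
The README does not say how the Avg column of the question-answering table is computed. A minimal sketch, assuming it is the unweighted mean of the four benchmark scores rounded to one decimal, is shown below; the names `qa_scores` and `unweighted_average` are illustrative and not part of the repository. Under that assumption the sketch reproduces the listed Avg for the four rows it includes.

```python
# Scores copied from the question-answering table, in column order:
# MMLU-Medical, PubMedQA, MedMCQA, MedQA-4-Option.
qa_scores = {
    "Llama-3-8B instruct": [82.0, 74.6, 57.1, 60.3],
    "JSL-Med-Sft-Llama-3-8B": [83.0, 75.4, 57.5, 74.8],
    "GPT-4": [85.5, 69.2, 69.5, 83.9],
    "Llama-3-physician-8B instruct (ours)": [80.0, 76.0, 80.2, 60.3],
}


def unweighted_average(scores):
    """Assumed definition of Avg: plain mean over the benchmarks, one decimal."""
    return round(sum(scores) / len(scores), 1)


# Compute the assumed Avg for each model and print it next to the model name.
averages = {model: unweighted_average(s) for model, s in qa_scores.items()}
for model, avg in averages.items():
    print(f"{model}: {avg}")
```

A weighted average (for example, by test-set size) would give different numbers, so the equal-weight choice here is only one plausible reading of the column.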