maximegmd commited on
Commit
a131cd2
1 Parent(s): 006a6d5

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +4 -2
README.md CHANGED
@@ -171,13 +171,15 @@ If you use Internist.ai 7b, please cite us:
171
  author = {Griot, Maxime and Hemptinne, Coralie and Vanderdonckt, Jean and Yuksel, Demet},
172
  title = "{Impact of high-quality, mixed-domain data on the performance of medical language models}",
173
  journal = {Journal of the American Medical Informatics Association},
174
- pages = {ocae120},
 
 
175
  year = {2024},
176
  month = {05},
177
  abstract = "{To optimize the training strategy of large language models for medical applications, focusing on creating clinically relevant systems that efficiently integrate into healthcare settings, while ensuring high standards of accuracy and reliability.We curated a comprehensive collection of high-quality, domain-specific data and used it to train several models, each with different subsets of this data. These models were rigorously evaluated against standard medical benchmarks, such as the USMLE, to measure their performance. Furthermore, for a thorough effectiveness assessment, they were compared with other state-of-the-art medical models of comparable size.The models trained with a mix of high-quality, domain-specific, and general data showed superior performance over those trained on larger, less clinically relevant datasets (P \\< .001). Our 7-billion-parameter model Med5 scores 60.5\\% on MedQA, outperforming the previous best of 49.3\\% from comparable models, and becomes the first of its size to achieve a passing score on the USMLE. Additionally, this model retained its proficiency in general domain tasks, comparable to state-of-the-art general domain models of similar size.Our findings underscore the importance of integrating high-quality, domain-specific data in training large language models for medical purposes. The balanced approach between specialized and general data significantly enhances the model’s clinical relevance and performance.This study sets a new standard in medical language models, proving that a strategically trained, smaller model can outperform larger ones in clinical relevance and general proficiency, highlighting the importance of data quality and expert curation in generative artificial intelligence for healthcare applications.}",
178
  issn = {1527-974X},
179
  doi = {10.1093/jamia/ocae120},
180
  url = {https://doi.org/10.1093/jamia/ocae120},
181
- eprint = {https://academic.oup.com/jamia/advance-article-pdf/doi/10.1093/jamia/ocae120/57845903/ocae120.pdf},
182
  }
183
  ```
 
171
  author = {Griot, Maxime and Hemptinne, Coralie and Vanderdonckt, Jean and Yuksel, Demet},
172
  title = "{Impact of high-quality, mixed-domain data on the performance of medical language models}",
173
  journal = {Journal of the American Medical Informatics Association},
174
+ volume = {31},
175
+ number = {9},
176
+ pages = {1875-1883},
177
  year = {2024},
178
  month = {05},
179
  abstract = "{To optimize the training strategy of large language models for medical applications, focusing on creating clinically relevant systems that efficiently integrate into healthcare settings, while ensuring high standards of accuracy and reliability.We curated a comprehensive collection of high-quality, domain-specific data and used it to train several models, each with different subsets of this data. These models were rigorously evaluated against standard medical benchmarks, such as the USMLE, to measure their performance. Furthermore, for a thorough effectiveness assessment, they were compared with other state-of-the-art medical models of comparable size.The models trained with a mix of high-quality, domain-specific, and general data showed superior performance over those trained on larger, less clinically relevant datasets (P \\< .001). Our 7-billion-parameter model Med5 scores 60.5\\% on MedQA, outperforming the previous best of 49.3\\% from comparable models, and becomes the first of its size to achieve a passing score on the USMLE. Additionally, this model retained its proficiency in general domain tasks, comparable to state-of-the-art general domain models of similar size.Our findings underscore the importance of integrating high-quality, domain-specific data in training large language models for medical purposes. The balanced approach between specialized and general data significantly enhances the model’s clinical relevance and performance.This study sets a new standard in medical language models, proving that a strategically trained, smaller model can outperform larger ones in clinical relevance and general proficiency, highlighting the importance of data quality and expert curation in generative artificial intelligence for healthcare applications.}",
180
  issn = {1527-974X},
181
  doi = {10.1093/jamia/ocae120},
182
  url = {https://doi.org/10.1093/jamia/ocae120},
183
+ eprint = {https://academic.oup.com/jamia/article-pdf/31/9/1875/58868289/ocae120.pdf},
184
  }
185
  ```