chansurgeplus
committed on
Commit 3a676b9
1 Parent(s): 85e289b
Fixed a minor typo.
README.md
CHANGED
@@ -20,7 +20,7 @@ The OpenBezoar-HH-RLHF-SFT is an LLM that has been further instruction fine tune
 
 ### Model Description
 
-OpenBezoar-SFT is an LLM that is built upon the OpenLLaMA 3B v2 architecture. Primary purpose of performing SFT on [OpenBezoar-SFT](https://huggingface.co/SurgeGlobal/OpenBezoar-SFT) is to minimize the distribution shift before applying Direct Preference Optimization (DPO) for human preferences alignment. For more information please refer to our paper.
+OpenBezoar-HH-RLHF-SFT is an LLM that is built upon the OpenLLaMA 3B v2 architecture. Primary purpose of performing SFT on [OpenBezoar-SFT](https://huggingface.co/SurgeGlobal/OpenBezoar-SFT) is to minimize the distribution shift before applying Direct Preference Optimization (DPO) for human preferences alignment. For more information please refer to our paper.
 
 ### Model Sources
 