Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ language:

# TL;DR

![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550c4f27bbfce1878f5f280/vrQl8D8FV3vqUJYbPgsiG.png)

Janus is trained on [Multifaceted Collection](), a preference dataset containing 192k unique system messages for aligning LLMs to diverse human preferences, using [Mistral-7B-v0.2](https://huggingface.co/mistral-community/Mistral-7B-v0.2) as its base model. Janus not only excels at generating personalized responses that cater to various human preferences but is also adept at producing responses that are generally preferred for being helpful and harmless.

# Model Details

@@ -30,11 +30,21 @@

- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Related Models:** [Janus-66k-7B](), [Janus-DPO-7B](), [Janus-ORPO-7B](), [Janus-RM-7B]()
- **Training Data:** [Multifaceted Collection]()
- **Resources for more information:**
  - [Research paper]()
  - [GitHub Repo](https://github.com/kaistAI/Janus)

# Usage

Janus is generalized over a wide range of system messages, allowing users to control the model's response by supplying the desired system message. The input prompt format is as follows:

```
[INST]{system_message}\n{instruction}[/INST]
```

Additionally, an example of inference code applying this format is shown below.
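The following is a minimal sketch using Hugging Face `transformers`; the repository ID (`kaist-ai/janus-7b`), the example system message, and the generation settings are illustrative assumptions rather than values taken from this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "kaist-ai/janus-7b"  # assumed repository ID
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# The system message describes the preference the response should reflect.
system_message = (
    "You are a friendly assistant who explains concepts with concrete, everyday examples."
)
instruction = "Explain why the sky is blue."

# Build the prompt in the documented format: [INST]{system_message}\n{instruction}[/INST]
prompt = f"[INST]{system_message}\n{instruction}[/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```

Changing only `system_message` while keeping the instruction fixed is how responses are steered toward different preferences.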

# Training Details

## Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06

@@ -51,9 +61,18 @@

- lr_scheduler_warmup_steps: 10
- num_epochs: 4

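For reference, a sketch of how the values listed above could be expressed with `transformers.TrainingArguments`; hyperparameters not shown on this card are omitted, and `output_dir` and `bf16` are assumptions.

```python
from transformers import TrainingArguments

# Only the hyperparameters listed above; everything else keeps library defaults.
training_args = TrainingArguments(
    output_dir="janus-7b-sft",  # assumed output path
    learning_rate=5e-06,
    warmup_steps=10,
    num_train_epochs=4,
    bf16=True,  # assumed precision setting
)
```
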
## Framework versions

- Transformers 4.40.0.dev0
- Pytorch 2.2.2
- Datasets 2.18.0
- Tokenizers 0.15.0

# Citation

If you find this model helpful, please consider citing our paper!

**BibTeX:**

```bibtex
```