Monero committed
Commit: f25d922
Parent(s): 6d4cd76

Update README.md

Files changed (1): README.md +175 -6
@@ -2,14 +2,185 @@
  license: other
  datasets:
  - ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
  tags:
  - uncensored
  ---
- WizardLM 30b + SuperCOT + Guacano
-
- This is WizardLM trained with a subset of the dataset - responses that contained alignment / moralizing were removed. The intent is to train a WizardLM that doesn't have alignment built-in, so that alignment (of any sort) can be added separately, for example with an RLHF LoRA.
-
- Shout out to the open source AI/ML community, and everyone who helped me out.

  Note:
  An uncensored model has no guardrails.
@@ -17,6 +188,4 @@ You are responsible for anything you do with the model, just as you are responsi
  Publishing anything this model generates is the same as publishing it yourself.
  You are responsible for the content you publish, and you cannot blame the model any more than you can blame the knife, gun, lighter, or car for what you do with it.
-
- https://huggingface.co/kaiokendev/SuperCOT-LoRA
- https://huggingface.co/timdettmers/guanaco-33b
- https://huggingface.co/ehartford/WizardLM-30B-Uncensored/tree/main
  license: other
  datasets:
  - ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
+ - kaiokendev/SuperCOT-dataset
+ - neulab/conala
+ - yahma/alpaca-cleaned
+ - QingyiSi/Alpaca-CoT
+ - timdettmers/guanaco-33b
+ - JosephusCheung/GuanacoDataset
  tags:
  - uncensored
  ---
+ <center><h1><b>WizardLM 30b + SuperCOT + Guanaco</b></h1></center>
+
+ <html>
+ <head>
+ <style>
+ table {
+   border: 1px solid #b3adad;
+   border-collapse: collapse;
+   padding: 5px;
+ }
+ table th {
+   border: 1px solid #b3adad;
+   padding: 5px;
+   background: #f0f0f0;
+   color: #313030;
+ }
+ table td {
+   border: 1px solid #b3adad;
+   text-align: center;
+   padding: 5px;
+   background: #ffffff;
+   color: #313030;
+ }
+ </style>
+ </head>
+ <body>
+ <table>
+   <thead>
+     <tr>
+       <th>Model:</th>
+       <th>Wikitext2</th>
+       <th>Ptb-New</th>
+       <th>C4-New</th>
+     </tr>
+   </thead>
+   <tbody>
+     <tr>
+       <td>WizardLM-30B-Uncensored-Guanaco-SuperCOT-30b</td>
+       <td></td>
+       <td></td>
+       <td></td>
+     </tr>
+   </tbody>
+ </table>
+ </body>
+ </html>

+ ### Guanaco SuperCOT
+ Guanaco SuperCOT is trained with the aim of making LLaMA follow prompts for LangChain better, by infusing chain-of-thought datasets, code explanations and instructions, snippets, logical deductions, and Alpaca GPT-4 prompts. Guanaco itself is an advanced instruction-following language model built on Meta's LLaMA 33B model. Expanding upon the initial 52K dataset from the Alpaca model, an additional 534,530 entries have been incorporated, covering English, Simplified Chinese, Traditional Chinese (Taiwan), Traditional Chinese (Hong Kong), Japanese, German, and various linguistic and grammatical tasks. This wealth of data enables Guanaco to perform exceptionally well in multilingual environments.
+
+ It uses a mixture of the following datasets:
+
+ [https://huggingface.co/datasets/QingyiSi/Alpaca-CoT](https://huggingface.co/datasets/QingyiSi/Alpaca-CoT)
+ - Chain of thought QED
+ - Chain of thought Aqua
+ - CodeAlpaca
+
+ [https://huggingface.co/datasets/neulab/conala](https://huggingface.co/datasets/neulab/conala)
+ - Code snippets
+
+ [https://huggingface.co/datasets/yahma/alpaca-cleaned](https://huggingface.co/datasets/yahma/alpaca-cleaned)
+ - Alpaca GPT-4
+
+ [https://huggingface.co/datasets/JosephusCheung/GuanacoDataset](https://huggingface.co/datasets/JosephusCheung/GuanacoDataset)
+ - Guanaco
+
+ [https://huggingface.co/timdettmers/guanaco-33b](https://huggingface.co/timdettmers/guanaco-33b)
+ - Guanaco 33B LoRA
+
+ [https://huggingface.co/kaiokendev/SuperCOT-LoRA](https://huggingface.co/kaiokendev/SuperCOT-LoRA)
+ - Super Chain-of-Thought LoRA
+
+ [https://huggingface.co/ehartford/WizardLM-30B-Uncensored/](https://huggingface.co/ehartford/WizardLM-30B-Uncensored/)
+ - WizardLM 30B Uncensored
+
+ 1\. Prompting
+ -------------------------
+
+ You should prompt the LoRA the same way you would prompt Alpaca or Alpacino.
+ The new format is designed to be similar to ChatGPT, allowing for better integration with the Alpaca format and enhancing the overall user experience.
+
+ The instruction block is used as few-shot context to support diverse inputs and responses, making it easier for the model to understand and provide accurate responses to user queries.
+
+ The format is as follows:
+ ```
+ ### Instruction:
+ User: History User Input
+ Assistant: History Assistant Answer
+ ### Input:
+ System: Knowledge
+ User: New User Input
+ ### Response:
+ New Assistant Answer
+ ```
+
+ This structured format allows for easier tracking of the conversation history and maintaining context throughout a multi-turn dialogue.
+ ```
+ Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+
+ ### Instruction:
+ <instruction>
+
+ ### Input:
+ <any additional context. Remove this if it's not necessary>
+
+ ### Response:
+ <make sure to leave a single new-line here for optimal results>
+ ```
+
+ Remember that with lower parameter sizes, the structure of the prompt becomes more important. The same prompt worded differently can give wildly different answers. Consider appending one of the following suffixes to improve output quality:
+
+ - "Think through this step by step"
+ - "Let's think about this logically"
+ - "Explain your reasoning"
+ - "Provide details to support your answer"
+ - "Compare and contrast your answer with alternatives"
+
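The prompt layouts above can be assembled programmatically. A minimal sketch in Python (the function name, the default suffix choice, and the example strings are illustrative, not part of the model's API):

```python
def build_prompt(instruction, context=None, suffix="Think through this step by step"):
    """Assemble an Alpaca-style prompt, appending one of the suggested suffixes."""
    parts = [
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. Write a response that appropriately "
        "completes the request.",
        "",
        "### Instruction:",
        f"{instruction} {suffix}",
    ]
    if context:
        # Omit the Input section entirely when there is no extra context.
        parts += ["", "### Input:", context]
    # A single trailing newline after the Response header gives optimal results.
    parts += ["", "### Response:", ""]
    return "\n".join(parts)

prompt = build_prompt("Summarize the advantages of chain-of-thought prompting.")
```

Dropping the `### Input:` section when there is no context matches the "Remove this if it's not necessary" hint in the template above.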
+ 2\. Role-playing support:
+ -------------------------
+
+ Guanaco now offers advanced role-playing support, similar to Character.AI, in English, Simplified Chinese, Traditional Chinese, Japanese, and German, making it more versatile for users from different linguistic backgrounds.
+
+ Users can instruct the model to assume specific roles, historical figures, or fictional characters, as well as personalities based on their input. This allows for more engaging and immersive conversations.
+
+ The model can use various sources of information to provide knowledge and context for the character's background and behavior, such as encyclopedic entries, first-person narrations, or a list of personality traits.
+
+ The model will consistently output responses in the format "Character Name: Reply" to maintain the chosen role throughout the conversation, enhancing the user's experience.
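Because replies follow the fixed "Character Name: Reply" shape, they are straightforward to parse downstream. A sketch (the helper name and the sample line are invented for illustration):

```python
import re

def split_roleplay_reply(line):
    """Split a 'Character Name: Reply' line into a (name, reply) pair.

    Returns None if the line does not match the expected shape.
    """
    match = re.match(r"^([^:]+):\s*(.+)$", line.strip())
    return (match.group(1), match.group(2)) if match else None

print(split_roleplay_reply("Sherlock Holmes: The game is afoot."))
# → ('Sherlock Holmes', 'The game is afoot.')
```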
+
+ 3\. Continuation of responses for ongoing topics:
+ -------------------------------------------------
+
+ The Guanaco model can now continue answering questions or discussing topics upon the user's request, making it more adaptable and better suited for extended conversations.
+
+ The contextual structure consisting of System, Assistant, and User roles allows the model to engage in multi-turn dialogues, maintain context-aware conversations, and provide more coherent responses.
+
+ The model can now accommodate role specification and character settings, providing a more immersive and tailored conversational experience based on the user's preferences.
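In practice, continuation means carrying earlier turns inside the `### Instruction:` block of the format shown earlier. A minimal sketch (function and variable names are assumptions, not a published API):

```python
def render_multi_turn(history, knowledge, user_input):
    """Render a multi-turn prompt in the System/User/Assistant layout.

    `history` is a list of (user, assistant) string pairs from earlier turns.
    """
    lines = ["### Instruction:"]
    for user, assistant in history:
        lines += [f"User: {user}", f"Assistant: {assistant}"]
    lines += [
        "### Input:",
        f"System: {knowledge}",
        f"User: {user_input}",
        "### Response:",
        "",  # leave a blank line for the new assistant answer
    ]
    return "\n".join(lines)

prompt = render_multi_turn(
    [("Who wrote Hamlet?", "William Shakespeare.")],
    "The user is discussing English literature.",
    "When was it written?",
)
```

Each new exchange is appended to `history`, so the model sees the whole conversation on every call.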
+
+ It is important to remember that Guanaco is a 33B-parameter model, and any knowledge-based content should be considered potentially inaccurate. We strongly recommend providing verifiable sources, such as Wikipedia, for knowledge-based answers. In the absence of sources, it is crucial to inform users of this limitation to prevent the dissemination of false information and to maintain transparency.
+
+ ### Citations
+ Alpaca-CoT datasets
+ ```
+ @misc{alpaca-cot,
+   author = {Qingyi Si and Zheng Lin},
+   school = {Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China},
+   title = {Alpaca-CoT: An Instruction Fine-Tuning Platform with Instruction Data Collection and Unified Large Language Models Interface},
+   year = {2023},
+   publisher = {GitHub},
+   journal = {GitHub repository},
+   howpublished = {\url{https://github.com/PhoebusSi/alpaca-CoT}},
+ }
+ ```
+ Stanford Alpaca
+ ```
+ @misc{alpaca,
+   author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
+   title = {Stanford Alpaca: An Instruction-following LLaMA model},
+   year = {2023},
+   publisher = {GitHub},
+   journal = {GitHub repository},
+   howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
+ }
+ ```
+ Google FLAN
+ ```
+ @inproceedings{weifinetuned,
+   title = {Finetuned Language Models are Zero-Shot Learners},
+   author = {Wei, Jason and Bosma, Maarten and Zhao, Vincent and Guu, Kelvin and Yu, Adams Wei and Lester, Brian and Du, Nan and Dai, Andrew M and Le, Quoc V},
+   booktitle = {International Conference on Learning Representations}
+ }
+ ```

  Note:
  An uncensored model has no guardrails.
  Publishing anything this model generates is the same as publishing it yourself.
  You are responsible for the content you publish, and you cannot blame the model any more than you can blame the knife, gun, lighter, or car for what you do with it.