John David Pressman committed
Commit 972c0d6
Parent: 42c9409

Add BigVAE

README.md CHANGED
@@ -1,3 +1,996 @@
  ---
- license: apache-2.0
+ library_name: peft
  ---
+ ## Training procedure
+
+ The following `bitsandbytes` quantization config was used during training:
+ - quant_method: bitsandbytes
+ - load_in_8bit: False
+ - load_in_4bit: True
+ - llm_int8_threshold: 6.0
+ - llm_int8_skip_modules: None
+ - llm_int8_enable_fp32_cpu_offload: False
+ - llm_int8_has_fp16_weight: False
+ - bnb_4bit_quant_type: nf4
+ - bnb_4bit_use_double_quant: True
+ - bnb_4bit_compute_dtype: bfloat16
+
+ ### Framework versions
+
+ - PEFT 0.4.0
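
The list above is the serialized form of a `transformers.BitsAndBytesConfig`. A minimal sketch of reconstructing it to load the quantized base model (the model id comes from `adapter_config.json` below; everything else follows the list):

```python
# Sketch: rebuilding the quantization setup described above.
# Assumes recent `transformers` and `bitsandbytes` are installed.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    llm_int8_threshold=6.0,
    llm_int8_has_fp16_weight=False,
)

# Base model named in adapter_config.json below.
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
```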
adapter_config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "auto_mapping": {
+     "base_model_class": "MistralForCausalLM",
+     "parent_library": "transformers.models.mistral.modeling_mistral"
+   },
+   "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "lora_alpha": 8,
+   "lora_dropout": 0.0,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 32,
+   "revision": null,
+   "target_modules": [
+     "self_attn.q_proj",
+     "self_attn.k_proj",
+     "self_attn.v_proj",
+     "self_attn.o_proj",
+     "mlp.gate_proj",
+     "mlp.up_proj",
+     "mlp.down_proj"
+   ],
+   "task_type": null
+ }
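
This is a rank-32 LoRA config over all attention and MLP projections; the `decoder/`, `encoder/`, and `router/` folders below carry identical configs for their own adapters. A sketch of attaching one of them with PEFT 0.4.x (the repo id is a placeholder, and `subfolder` forwarding is assumed):

```python
# Sketch: attaching one of the LoRA adapters with PEFT 0.4.x.
from peft import PeftModel

model = PeftModel.from_pretrained(
    base_model,               # quantized Mistral-7B from the sketch above
    "jdpressman/BigVAE",      # hypothetical repo id -- substitute this repo's actual id
    subfolder="encoder",      # or "decoder" / "router", per the folders below
)
```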
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fab1e759e64f68cd2edbeffb4415502436a8bffa49ff8fd8c18ca09ea9abaa90
+ size 335604696
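
The `.safetensors` entries in this commit are git-lfs pointers; only the oid and size live in git, and the real weights are resolved at download time. One way to fetch them, assuming the `huggingface_hub` client and the same placeholder repo id as above:

```python
# Sketch: resolving an LFS-backed weight file via huggingface_hub.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="jdpressman/BigVAE",          # hypothetical repo id
    filename="adapter_model.safetensors",
)
```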
decoder/adapter_config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "auto_mapping": {
+     "base_model_class": "MistralForCausalLM",
+     "parent_library": "transformers.models.mistral.modeling_mistral"
+   },
+   "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "lora_alpha": 8,
+   "lora_dropout": 0.0,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 32,
+   "revision": null,
+   "target_modules": [
+     "self_attn.q_proj",
+     "self_attn.k_proj",
+     "self_attn.v_proj",
+     "self_attn.o_proj",
+     "mlp.gate_proj",
+     "mlp.up_proj",
+     "mlp.down_proj"
+   ],
+   "task_type": null
+ }
decoder/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6785f9a28432a24080ad0e52fbea1da31cfdfb2bbcd6e8f6b7b566d78ce1c115
+ size 335604696
encoder/adapter_config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "auto_mapping": {
+     "base_model_class": "MistralForCausalLM",
+     "parent_library": "transformers.models.mistral.modeling_mistral"
+   },
+   "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "lora_alpha": 8,
+   "lora_dropout": 0.0,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 32,
+   "revision": null,
+   "target_modules": [
+     "self_attn.q_proj",
+     "self_attn.k_proj",
+     "self_attn.v_proj",
+     "self_attn.o_proj",
+     "mlp.gate_proj",
+     "mlp.up_proj",
+     "mlp.down_proj"
+   ],
+   "task_type": null
+ }
encoder/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:43d6af5b35550f8895ef79f245499860a8647fe75a314b0a567fab2e789ff4f8
+ size 335604696
router/adapter_config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "auto_mapping": {
+     "base_model_class": "MistralForCausalLM",
+     "parent_library": "transformers.models.mistral.modeling_mistral"
+   },
+   "base_model_name_or_path": "mistralai/Mistral-7B-v0.1",
+   "bias": "none",
+   "fan_in_fan_out": false,
+   "inference_mode": true,
+   "init_lora_weights": true,
+   "layers_pattern": null,
+   "layers_to_transform": null,
+   "lora_alpha": 8,
+   "lora_dropout": 0.0,
+   "modules_to_save": null,
+   "peft_type": "LORA",
+   "r": 32,
+   "revision": null,
+   "target_modules": [
+     "self_attn.q_proj",
+     "self_attn.k_proj",
+     "self_attn.v_proj",
+     "self_attn.o_proj",
+     "mlp.gate_proj",
+     "mlp.up_proj",
+     "mlp.down_proj"
+   ],
+   "task_type": null
+ }
router/adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06a5a4886a7f3f5dc2824aab08b7c0cee45aac0326c78dae0cbe677cb820d10e
+ size 335604696
state.json ADDED
@@ -0,0 +1 @@
+ {"step": 75144, "last_kl_weight": 0.01}
vae.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:11c6772eccb8a163298d608bbde7526209b1de4aad907e021c3704c72220cd1e
+ size 25202116
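
The VAE head itself is small (about 25 MB) next to the ~335 MB adapters. A sketch of inspecting its tensors, assuming the `safetensors` package and a local copy of the file:

```python
# Sketch: loading the VAE head weights from safetensors.
from safetensors.torch import load_file

vae_state = load_file("vae.safetensors")
for name, tensor in vae_state.items():
    print(name, tuple(tensor.shape))
```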