---
license: other
---

# Overview

This is a fine-tuned 13B-parameter LLaMA model, trained entirely on synthetic data generated with https://github.com/jondurbin/airoboros

### Eval (gpt4 judging)

![chart](meta-chart.png)

| model | raw score | gpt-3.5 adjusted score |
| --- | --- | --- |
| __airoboros-13b__ | __17947__ | __98.087__ |
| gpt35 | 18297 | 100.0 |
| gpt4-x-alpasta-30b | 15612 | 85.33 |
| manticore-13b | 15856 | 86.66 |
| vicuna-13b-1.1 | 16306 | 89.12 |
| wizard-vicuna-13b-uncensored | 16287 | 89.01 |
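
The adjusted column appears to be each model's raw score expressed as a percentage of the gpt35 raw score. The formula is not stated here, so treat it as an assumption, although it does reproduce the figures in the table. A minimal sketch, where `RAW_SCORES` and `adjusted_score` are illustrative names rather than anything from the airoboros code:

```python
# Raw scores copied from the summary table.
RAW_SCORES = {
    "airoboros-13b": 17947,
    "gpt35": 18297,
    "gpt4-x-alpasta-30b": 15612,
    "manticore-13b": 15856,
    "vicuna-13b-1.1": 16306,
    "wizard-vicuna-13b-uncensored": 16287,
}


def adjusted_score(raw: int, baseline: int = RAW_SCORES["gpt35"]) -> float:
    """Express a raw score as a percentage of the baseline raw score.

    Assumed formula: 100 * raw / baseline, with gpt35 as the baseline.
    """
    return 100.0 * raw / baseline


if __name__ == "__main__":
    for model, raw in RAW_SCORES.items():
        print(f"{model}: {adjusted_score(raw):.3f}")
```

Normalizing against gpt35 makes the models easier to compare at a glance, but it inherits any bias in the GPT-4 judging of the baseline itself.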

<details>
<summary>individual question scores, with shareGPT links (200 prompts generated by gpt-4)</summary>

| question | airoboros-13b | gpt35 | gpt4-x-alpasta-30b | manticore-13b | vicuna-13b-1.1 | wizard-vicuna-13b-uncensored | link |
|-----------:|----------------:|--------:|---------------------:|----------------:|-----------------:|-------------------------------:|:---------------------------------------|
| 1 | 80 | 95 | 70 | 90 | 85 | 60 | [eval](https://sharegpt.com/c/PIbRQD3) |
| 2 | 20 | 95 | 40 | 30 | 90 | 80 | [eval](https://sharegpt.com/c/fSzwzzd) |
| 3 | 100 | 100 | 100 | 95 | 95 | 100 | [eval](https://sharegpt.com/c/AXMzZiO) |
| 4 | 90 | 100 | 85 | 60 | 95 | 100 | [eval](https://sharegpt.com/c/7obzJm2) |
| 5 | 95 | 90 | 80 | 85 | 95 | 75 | [eval](https://sharegpt.com/c/cRpj6M1) |
| 6 | 100 | 95 | 90 | 95 | 98 | 92 | [eval](https://sharegpt.com/c/p0by1T7) |
| 7 | 50 | 100 | 80 | 95 | 60 | 55 | [eval](https://sharegpt.com/c/rowNlKx) |
| 8 | 70 | 90 | 80 | 60 | 85 | 40 | [eval](https://sharegpt.com/c/I4POj4I) |
| 9 | 100 | 95 | 50 | 85 | 40 | 60 | [eval](https://sharegpt.com/c/gUAeiRp) |
| 10 | 85 | 60 | 55 | 65 | 50 | 70 | [eval](https://sharegpt.com/c/Lgw4QQL) |
| 11 | 95 | 100 | 85 | 90 | 60 | 75 | [eval](https://sharegpt.com/c/X9tDYft) |
| 12 | 100 | 95 | 70 | 80 | 50 | 85 | [eval](https://sharegpt.com/c/9V2ElkH) |
| 13 | 100 | 95 | 80 | 70 | 60 | 90 | [eval](https://sharegpt.com/c/D5xg6qt) |
| 14 | 95 | 100 | 70 | 85 | 90 | 90 | [eval](https://sharegpt.com/c/lQnSfDs) |
| 15 | 80 | 95 | 90 | 60 | 30 | 85 | [eval](https://sharegpt.com/c/1hpHGNc) |
| 16 | 60 | 95 | 0 | 75 | 50 | 40 | [eval](https://sharegpt.com/c/an6TqE4) |
| 17 | 100 | 95 | 90 | 98 | 95 | 95 | [eval](https://sharegpt.com/c/7vr6n3F) |
| 18 | 60 | 85 | 40 | 50 | 20 | 0 | [eval](https://sharegpt.com/c/TOkMkgE) |
| 19 | 100 | 90 | 85 | 95 | 95 | 80 | [eval](https://sharegpt.com/c/Qu7ak0r) |
| 20 | 100 | 95 | 100 | 95 | 90 | 95 | [eval](https://sharegpt.com/c/hMD4gPo) |
| 21 | 95 | 90 | 96 | 80 | 92 | 88 | [eval](https://sharegpt.com/c/HTlicNh) |
| 22 | 95 | 92 | 90 | 93 | 89 | 91 | [eval](https://sharegpt.com/c/MjxHpAf) |
| 23 | 95 | 93 | 90 | 94 | 96 | 92 | [eval](https://sharegpt.com/c/4RvxOR9) |
| 24 | 95 | 90 | 93 | 88 | 92 | 85 | [eval](https://sharegpt.com/c/PcAIU9r) |
| 25 | 95 | 90 | 85 | 96 | 88 | 92 | [eval](https://sharegpt.com/c/MMqul3q) |
| 26 | 95 | 95 | 90 | 93 | 92 | 91 | [eval](https://sharegpt.com/c/YQsLyzJ) |
| 27 | 95 | 98 | 80 | 97 | 99 | 96 | [eval](https://sharegpt.com/c/UDhSTMq) |
| 28 | 95 | 93 | 90 | 87 | 92 | 89 | [eval](https://sharegpt.com/c/4gCfdCV) |
| 29 | 90 | 85 | 95 | 80 | 92 | 75 | [eval](https://sharegpt.com/c/bkQs4SP) |
| 30 | 90 | 85 | 95 | 93 | 80 | 92 | [eval](https://sharegpt.com/c/LeLCEEt) |
| 31 | 95 | 92 | 90 | 91 | 93 | 89 | [eval](https://sharegpt.com/c/DFxNzVu) |
| 32 | 100 | 95 | 90 | 85 | 80 | 95 | [eval](https://sharegpt.com/c/gnVzNML) |
| 33 | 95 | 97 | 93 | 92 | 96 | 94 | [eval](https://sharegpt.com/c/y7pxMIy) |
| 34 | 95 | 93 | 94 | 90 | 88 | 92 | [eval](https://sharegpt.com/c/5UeCvTY) |
| 35 | 90 | 95 | 98 | 85 | 96 | 92 | [eval](https://sharegpt.com/c/T4oL9I5) |
| 36 | 90 | 88 | 85 | 80 | 82 | 84 | [eval](https://sharegpt.com/c/HnGyTAG) |
| 37 | 90 | 95 | 85 | 87 | 92 | 88 | [eval](https://sharegpt.com/c/ZbRMBNj) |
| 38 | 95 | 97 | 96 | 90 | 93 | 92 | [eval](https://sharegpt.com/c/iTmFJqd) |
| 39 | 95 | 93 | 92 | 90 | 89 | 91 | [eval](https://sharegpt.com/c/VuPifET) |
| 40 | 90 | 95 | 93 | 92 | 94 | 91 | [eval](https://sharegpt.com/c/AvFAH1x) |
| 41 | 90 | 85 | 95 | 80 | 88 | 75 | [eval](https://sharegpt.com/c/4ealKGN) |
| 42 | 85 | 90 | 95 | 88 | 92 | 80 | [eval](https://sharegpt.com/c/bE1b2vX) |
| 43 | 90 | 95 | 92 | 85 | 80 | 87 | [eval](https://sharegpt.com/c/I3nMPBC) |
| 44 | 85 | 90 | 95 | 80 | 88 | 75 | [eval](https://sharegpt.com/c/as7r3bW) |
| 45 | 85 | 80 | 75 | 90 | 70 | 82 | [eval](https://sharegpt.com/c/qYceaUa) |
| 46 | 90 | 85 | 95 | 92 | 93 | 80 | [eval](https://sharegpt.com/c/g4FXchU) |
| 47 | 90 | 95 | 75 | 85 | 80 | 70 | [eval](https://sharegpt.com/c/6kGLvL5) |
| 48 | 85 | 90 | 80 | 88 | 82 | 83 | [eval](https://sharegpt.com/c/SRozqaF) |
| 49 | 85 | 90 | 95 | 92 | 88 | 80 | [eval](https://sharegpt.com/c/GoKydf6) |
| 50 | 85 | 90 | 80 | 75 | 95 | 88 | [eval](https://sharegpt.com/c/37aXkHQ) |
| 51 | 85 | 90 | 80 | 88 | 84 | 92 | [eval](https://sharegpt.com/c/nVuUaTj) |
| 52 | 80 | 90 | 75 | 85 | 70 | 95 | [eval](https://sharegpt.com/c/TkAQKLC) |
| 53 | 90 | 88 | 85 | 80 | 92 | 83 | [eval](https://sharegpt.com/c/55cO2y0) |
| 54 | 85 | 75 | 90 | 80 | 78 | 88 | [eval](https://sharegpt.com/c/tXtq5lT) |
| 55 | 85 | 90 | 80 | 82 | 75 | 88 | [eval](https://sharegpt.com/c/TfMjeJQ) |
| 56 | 90 | 85 | 40 | 95 | 80 | 88 | [eval](https://sharegpt.com/c/2jQ6K2S) |
| 57 | 85 | 95 | 90 | 75 | 88 | 80 | [eval](https://sharegpt.com/c/aQtr2ca) |
| 58 | 85 | 95 | 90 | 92 | 89 | 88 | [eval](https://sharegpt.com/c/tbWLyZ7) |
| 59 | 80 | 85 | 75 | 60 | 90 | 70 | [eval](https://sharegpt.com/c/moHC7i2) |
| 60 | 85 | 90 | 87 | 80 | 88 | 75 | [eval](https://sharegpt.com/c/GK6GShh) |
| 61 | 85 | 80 | 75 | 50 | 90 | 80 | [eval](https://sharegpt.com/c/ugcW4qG) |
| 62 | 95 | 80 | 90 | 85 | 75 | 82 | [eval](https://sharegpt.com/c/WL8iq6F) |
| 63 | 85 | 90 | 80 | 70 | 95 | 88 | [eval](https://sharegpt.com/c/TZJKnvS) |
| 64 | 90 | 95 | 70 | 85 | 80 | 75 | [eval](https://sharegpt.com/c/beNOKb5) |
| 65 | 90 | 85 | 70 | 75 | 80 | 60 | [eval](https://sharegpt.com/c/o2oRCF5) |
| 66 | 95 | 90 | 70 | 50 | 85 | 80 | [eval](https://sharegpt.com/c/TNjbK6D) |
| 67 | 80 | 85 | 40 | 60 | 90 | 95 | [eval](https://sharegpt.com/c/rJvszWJ) |
| 68 | 75 | 60 | 80 | 55 | 70 | 85 | [eval](https://sharegpt.com/c/HJwRkro) |
| 69 | 90 | 85 | 60 | 50 | 80 | 95 | [eval](https://sharegpt.com/c/AeFoSDK) |
| 70 | 45 | 85 | 60 | 20 | 65 | 75 | [eval](https://sharegpt.com/c/KA1cgOl) |
| 71 | 85 | 90 | 30 | 60 | 80 | 70 | [eval](https://sharegpt.com/c/RTy8n0y) |
| 72 | 90 | 95 | 80 | 40 | 85 | 70 | [eval](https://sharegpt.com/c/PJMJoXh) |
| 73 | 85 | 90 | 70 | 75 | 80 | 95 | [eval](https://sharegpt.com/c/Ib3jzyC) |
| 74 | 90 | 70 | 50 | 20 | 60 | 40 | [eval](https://sharegpt.com/c/oMmqqtX) |
| 75 | 90 | 95 | 75 | 60 | 85 | 80 | [eval](https://sharegpt.com/c/qRNhNTw) |
| 76 | 85 | 80 | 60 | 70 | 65 | 75 | [eval](https://sharegpt.com/c/3MAHQIy) |
| 77 | 90 | 85 | 80 | 75 | 82 | 70 | [eval](https://sharegpt.com/c/0Emc5HS) |
| 78 | 90 | 95 | 80 | 70 | 85 | 75 | [eval](https://sharegpt.com/c/UqAxRWF) |
| 79 | 85 | 75 | 30 | 80 | 90 | 70 | [eval](https://sharegpt.com/c/eywxGAw) |
| 80 | 85 | 90 | 50 | 70 | 80 | 60 | [eval](https://sharegpt.com/c/A2KSEWP) |
| 81 | 100 | 95 | 98 | 99 | 97 | 96 | [eval](https://sharegpt.com/c/C8rebQf) |
| 82 | 95 | 90 | 92 | 93 | 91 | 89 | [eval](https://sharegpt.com/c/cd9HF4V) |
| 83 | 95 | 92 | 90 | 85 | 88 | 91 | [eval](https://sharegpt.com/c/LHkjvQJ) |
| 84 | 100 | 95 | 98 | 97 | 96 | 99 | [eval](https://sharegpt.com/c/o5PdoyZ) |
| 85 | 100 | 100 | 100 | 90 | 100 | 95 | [eval](https://sharegpt.com/c/rh8pZVg) |
| 86 | 100 | 95 | 98 | 97 | 94 | 99 | [eval](https://sharegpt.com/c/T5DYL83) |
| 87 | 95 | 90 | 92 | 93 | 94 | 91 | [eval](https://sharegpt.com/c/G5Osg3X) |
| 88 | 100 | 95 | 98 | 90 | 96 | 95 | [eval](https://sharegpt.com/c/9ZqI03V) |
| 89 | 95 | 96 | 92 | 90 | 89 | 93 | [eval](https://sharegpt.com/c/4tFfwZU) |
| 90 | 100 | 95 | 93 | 90 | 92 | 88 | [eval](https://sharegpt.com/c/mG1JqPH) |
| 91 | 100 | 100 | 98 | 97 | 99 | 100 | [eval](https://sharegpt.com/c/VDdtgCu) |
| 92 | 95 | 90 | 92 | 85 | 93 | 94 | [eval](https://sharegpt.com/c/uKtGkvg) |
| 93 | 95 | 93 | 90 | 92 | 96 | 91 | [eval](https://sharegpt.com/c/9B92N6P) |
| 94 | 95 | 96 | 92 | 90 | 93 | 91 | [eval](https://sharegpt.com/c/GeIFfOu) |
| 95 | 95 | 90 | 92 | 93 | 91 | 89 | [eval](https://sharegpt.com/c/gn3E9nN) |
| 96 | 100 | 98 | 95 | 97 | 96 | 99 | [eval](https://sharegpt.com/c/Erxa46H) |
| 97 | 90 | 95 | 85 | 88 | 92 | 87 | [eval](https://sharegpt.com/c/oRHVOvK) |
| 98 | 95 | 93 | 90 | 92 | 89 | 88 | [eval](https://sharegpt.com/c/ghtKLUX) |
| 99 | 100 | 95 | 97 | 90 | 96 | 94 | [eval](https://sharegpt.com/c/ZL4KjqP) |
| 100 | 95 | 93 | 90 | 92 | 94 | 91 | [eval](https://sharegpt.com/c/YOnqIQa) |
| 101 | 95 | 92 | 90 | 93 | 94 | 88 | [eval](https://sharegpt.com/c/3BKwKho) |
| 102 | 95 | 92 | 60 | 97 | 90 | 96 | [eval](https://sharegpt.com/c/U1i31bn) |
| 103 | 95 | 90 | 92 | 93 | 91 | 89 | [eval](https://sharegpt.com/c/etfRoAE) |
| 104 | 95 | 90 | 97 | 92 | 91 | 93 | [eval](https://sharegpt.com/c/B0OpVxR) |
| 105 | 90 | 95 | 93 | 85 | 92 | 91 | [eval](https://sharegpt.com/c/MBgGJ5A) |
| 106 | 95 | 90 | 40 | 92 | 93 | 85 | [eval](https://sharegpt.com/c/eQKTYO7) |
| 107 | 100 | 100 | 95 | 90 | 95 | 90 | [eval](https://sharegpt.com/c/szKWCBt) |
| 108 | 90 | 95 | 96 | 98 | 93 | 92 | [eval](https://sharegpt.com/c/8ZhUcAv) |
| 109 | 90 | 95 | 92 | 89 | 93 | 94 | [eval](https://sharegpt.com/c/VQWdy99) |
| 110 | 100 | 95 | 100 | 98 | 96 | 99 | [eval](https://sharegpt.com/c/g1DHUSM) |
| 111 | 100 | 100 | 95 | 90 | 100 | 90 | [eval](https://sharegpt.com/c/uYgfJC3) |
| 112 | 90 | 85 | 88 | 92 | 87 | 91 | [eval](https://sharegpt.com/c/crk8BH3) |
| 113 | 95 | 97 | 90 | 92 | 93 | 94 | [eval](https://sharegpt.com/c/95F9afQ) |
| 114 | 90 | 95 | 85 | 88 | 92 | 89 | [eval](https://sharegpt.com/c/otioHUo) |
| 115 | 95 | 93 | 90 | 92 | 94 | 91 | [eval](https://sharegpt.com/c/KSiL9F6) |
| 116 | 90 | 95 | 85 | 80 | 88 | 82 | [eval](https://sharegpt.com/c/GmGq3b3) |
| 117 | 95 | 90 | 60 | 85 | 93 | 70 | [eval](https://sharegpt.com/c/VOhklyz) |
| 118 | 95 | 92 | 94 | 93 | 96 | 90 | [eval](https://sharegpt.com/c/wqy8m6k) |
| 119 | 95 | 90 | 85 | 93 | 87 | 92 | [eval](https://sharegpt.com/c/iWKrIuS) |
| 120 | 95 | 96 | 93 | 90 | 97 | 92 | [eval](https://sharegpt.com/c/o1h3w8N) |
| 121 | 100 | 0 | 0 | 100 | 0 | 0 | [eval](https://sharegpt.com/c/3UH9eed) |
| 122 | 60 | 100 | 0 | 80 | 0 | 0 | [eval](https://sharegpt.com/c/44g0FAh) |
| 123 | 0 | 100 | 60 | 0 | 0 | 90 | [eval](https://sharegpt.com/c/PaQlcrU) |
| 124 | 100 | 100 | 0 | 100 | 100 | 100 | [eval](https://sharegpt.com/c/51icV4o) |
| 125 | 100 | 100 | 100 | 100 | 95 | 100 | [eval](https://sharegpt.com/c/1VnbGAR) |
| 126 | 100 | 100 | 100 | 50 | 90 | 100 | [eval](https://sharegpt.com/c/EYGBrgw) |
| 127 | 100 | 100 | 100 | 100 | 95 | 90 | [eval](https://sharegpt.com/c/EGRduOt) |
| 128 | 100 | 100 | 100 | 95 | 0 | 100 | [eval](https://sharegpt.com/c/O3JJfnK) |
| 129 | 50 | 95 | 20 | 10 | 30 | 85 | [eval](https://sharegpt.com/c/2roVtAu) |
| 130 | 100 | 100 | 60 | 20 | 30 | 40 | [eval](https://sharegpt.com/c/sphFpfx) |
| 131 | 100 | 0 | 0 | 0 | 0 | 100 | [eval](https://sharegpt.com/c/OeWGKBo) |
| 132 | 0 | 100 | 60 | 0 | 0 | 80 | [eval](https://sharegpt.com/c/TOUsuFA) |
| 133 | 50 | 100 | 20 | 90 | 0 | 10 | [eval](https://sharegpt.com/c/Y3P6DCu) |
| 134 | 100 | 100 | 100 | 100 | 100 | 100 | [eval](https://sharegpt.com/c/hkbdeiM) |
| 135 | 100 | 100 | 100 | 100 | 100 | 100 | [eval](https://sharegpt.com/c/eubbaVC) |
| 136 | 40 | 100 | 95 | 0 | 100 | 40 | [eval](https://sharegpt.com/c/QWiF49v) |
| 137 | 100 | 100 | 100 | 100 | 80 | 100 | [eval](https://sharegpt.com/c/dKTapBu) |
| 138 | 100 | 100 | 100 | 0 | 90 | 40 | [eval](https://sharegpt.com/c/P8NGwFZ) |
| 139 | 0 | 100 | 100 | 50 | 70 | 20 | [eval](https://sharegpt.com/c/v96BtBL) |
| 140 | 100 | 100 | 50 | 90 | 0 | 95 | [eval](https://sharegpt.com/c/YRlzj1t) |
| 141 | 100 | 95 | 90 | 85 | 98 | 80 | [eval](https://sharegpt.com/c/76VX3eB) |
| 142 | 95 | 98 | 90 | 92 | 96 | 89 | [eval](https://sharegpt.com/c/JK1uNef) |
| 143 | 90 | 95 | 75 | 85 | 80 | 82 | [eval](https://sharegpt.com/c/ku6CKmx) |
| 144 | 95 | 98 | 50 | 92 | 96 | 94 | [eval](https://sharegpt.com/c/0iAFuKW) |
| 145 | 95 | 90 | 0 | 93 | 92 | 94 | [eval](https://sharegpt.com/c/6uGnKio) |
| 146 | 95 | 90 | 85 | 92 | 80 | 88 | [eval](https://sharegpt.com/c/lfpRBw8) |
| 147 | 95 | 93 | 75 | 85 | 90 | 92 | [eval](https://sharegpt.com/c/mKu70jb) |
| 148 | 90 | 95 | 88 | 85 | 92 | 89 | [eval](https://sharegpt.com/c/GkYzJHO) |
| 149 | 100 | 100 | 100 | 95 | 97 | 98 | [eval](https://sharegpt.com/c/mly2k0z) |
| 150 | 85 | 40 | 30 | 95 | 90 | 88 | [eval](https://sharegpt.com/c/5td2ob0) |
| 151 | 90 | 95 | 92 | 85 | 88 | 93 | [eval](https://sharegpt.com/c/0ISpWfy) |
| 152 | 95 | 96 | 92 | 90 | 89 | 93 | [eval](https://sharegpt.com/c/kdUDUn7) |
| 153 | 90 | 95 | 85 | 80 | 92 | 88 | [eval](https://sharegpt.com/c/fjMNYr2) |
| 154 | 95 | 98 | 65 | 90 | 85 | 93 | [eval](https://sharegpt.com/c/6xBIf2Q) |
| 155 | 95 | 92 | 96 | 97 | 90 | 89 | [eval](https://sharegpt.com/c/B9GY8Ln) |
| 156 | 95 | 90 | 92 | 91 | 89 | 93 | [eval](https://sharegpt.com/c/vn1FPU4) |
| 157 | 95 | 90 | 80 | 75 | 95 | 90 | [eval](https://sharegpt.com/c/YurEMYg) |
| 158 | 92 | 40 | 30 | 95 | 90 | 93 | [eval](https://sharegpt.com/c/D19Qeui) |
| 159 | 90 | 92 | 85 | 88 | 89 | 87 | [eval](https://sharegpt.com/c/5QRFfrt) |
| 160 | 95 | 80 | 90 | 92 | 91 | 88 | [eval](https://sharegpt.com/c/pYWPRi4) |
| 161 | 95 | 93 | 92 | 90 | 91 | 94 | [eval](https://sharegpt.com/c/wPRTntL) |
| 162 | 100 | 98 | 95 | 90 | 92 | 96 | [eval](https://sharegpt.com/c/F6PLYKE) |
| 163 | 95 | 92 | 80 | 85 | 90 | 93 | [eval](https://sharegpt.com/c/WeJnMGv) |
| 164 | 95 | 98 | 90 | 88 | 97 | 96 | [eval](https://sharegpt.com/c/zNKL49e) |
| 165 | 90 | 95 | 85 | 88 | 86 | 92 | [eval](https://sharegpt.com/c/kIKmA1b) |
| 166 | 100 | 100 | 100 | 100 | 100 | 100 | [eval](https://sharegpt.com/c/1btWd4O) |
| 167 | 90 | 95 | 85 | 96 | 92 | 88 | [eval](https://sharegpt.com/c/s9sf1Lp) |
| 168 | 100 | 98 | 95 | 99 | 97 | 96 | [eval](https://sharegpt.com/c/RWzv8py) |
| 169 | 95 | 92 | 70 | 90 | 93 | 89 | [eval](https://sharegpt.com/c/bYF7FqA) |
| 170 | 95 | 90 | 88 | 92 | 94 | 93 | [eval](https://sharegpt.com/c/SuUqjMj) |
| 171 | 95 | 90 | 93 | 92 | 85 | 94 | [eval](https://sharegpt.com/c/r0aRdYY) |
| 172 | 95 | 93 | 90 | 87 | 92 | 91 | [eval](https://sharegpt.com/c/VuMfkkd) |
| 173 | 95 | 93 | 90 | 96 | 92 | 91 | [eval](https://sharegpt.com/c/rhm6fa4) |
| 174 | 95 | 97 | 85 | 96 | 98 | 90 | [eval](https://sharegpt.com/c/DwXnyqG) |
| 175 | 95 | 92 | 90 | 85 | 93 | 94 | [eval](https://sharegpt.com/c/0ScdkGS) |
| 176 | 95 | 96 | 92 | 90 | 97 | 93 | [eval](https://sharegpt.com/c/6yIoCDU) |
| 177 | 95 | 93 | 96 | 94 | 90 | 92 | [eval](https://sharegpt.com/c/VubEvp9) |
| 178 | 95 | 94 | 93 | 92 | 90 | 89 | [eval](https://sharegpt.com/c/RHzmZWG) |
| 179 | 90 | 85 | 95 | 80 | 87 | 75 | [eval](https://sharegpt.com/c/IMiP9Zm) |
| 180 | 95 | 94 | 92 | 93 | 90 | 96 | [eval](https://sharegpt.com/c/bft4PIL) |
| 181 | 95 | 100 | 90 | 95 | 95 | 95 | [eval](https://sharegpt.com/c/iHXB34b) |
| 182 | 100 | 95 | 85 | 100 | 0 | 90 | [eval](https://sharegpt.com/c/vCGn9R7) |
| 183 | 100 | 95 | 90 | 95 | 100 | 95 | [eval](https://sharegpt.com/c/be8crZL) |
| 184 | 95 | 90 | 60 | 95 | 85 | 80 | [eval](https://sharegpt.com/c/33elmDz) |
| 185 | 100 | 95 | 90 | 98 | 97 | 99 | [eval](https://sharegpt.com/c/RWD3Zx7) |
| 186 | 95 | 90 | 85 | 95 | 80 | 92 | [eval](https://sharegpt.com/c/GiwBvM7) |
| 187 | 100 | 95 | 100 | 98 | 100 | 90 | [eval](https://sharegpt.com/c/hX2pYxk) |
| 188 | 100 | 95 | 80 | 85 | 90 | 85 | [eval](https://sharegpt.com/c/MfxdGd7) |
| 189 | 100 | 90 | 95 | 85 | 95 | 100 | [eval](https://sharegpt.com/c/28hQjmS) |
| 190 | 95 | 90 | 85 | 80 | 88 | 92 | [eval](https://sharegpt.com/c/fzy5EPe) |
| 191 | 100 | 100 | 0 | 0 | 100 | 0 | [eval](https://sharegpt.com/c/vwxPjbR) |
| 192 | 100 | 100 | 100 | 50 | 100 | 75 | [eval](https://sharegpt.com/c/FAYfFWy) |
| 193 | 100 | 100 | 0 | 0 | 100 | 0 | [eval](https://sharegpt.com/c/SoudGsQ) |
| 194 | 0 | 100 | 0 | 0 | 0 | 0 | [eval](https://sharegpt.com/c/mkwEgVn) |
| 195 | 100 | 100 | 50 | 0 | 0 | 0 | [eval](https://sharegpt.com/c/q8MQEsz) |
| 196 | 100 | 100 | 100 | 100 | 100 | 95 | [eval](https://sharegpt.com/c/tzHpsKh) |
| 197 | 100 | 100 | 50 | 0 | 0 | 0 | [eval](https://sharegpt.com/c/3ugYBtJ) |
| 198 | 100 | 100 | 0 | 0 | 100 | 0 | [eval](https://sharegpt.com/c/I6KfOJT) |
| 199 | 90 | 85 | 80 | 95 | 70 | 75 | [eval](https://sharegpt.com/c/enaV1CK) |
| 200 | 100 | 100 | 0 | 0 | 0 | 0 | [eval](https://sharegpt.com/c/JBk7oSh) |

</details>

### Training data

I used a jailbreak prompt to generate the synthetic instructions, which resulted in some training data that other models would likely censor, such as how-to prompts about synthesizing drugs, making homemade flamethrowers, etc. Mind you, this is all generated by ChatGPT, not me. My goal was simply to test some of the capabilities of ChatGPT when unfiltered (as much as possible), not to intentionally produce any harmful or dangerous content.

The jailbreak prompt I used is the default prompt in the python code when using the `--uncensored` flag: https://github.com/jondurbin/airoboros/blob/main/airoboros/self_instruct.py#L39

I also did a few passes of manual cleanup to remove some bad prompts, but mostly I left the data as-is. Initially, the model was fairly bad at math/extrapolation, closed question-answering (heavy hallucination), and coding, so I did one more fine-tuning pass with additional synthetic instructions aimed at those types of problems.

Both the initial instructions and the final-pass fine-tuning instructions will be published soon.

### Fine-tuning method

I used the excellent [FastChat](https://github.com/lm-sys/FastChat) module, running with:

```
# Activate the Python environment used for training
source /workspace/venv/bin/activate

# Disable NCCL peer-to-peer transport between GPUs
export NCCL_P2P_DISABLE=1
export NCCL_P2P_LEVEL=LOC

# Launch 8 training processes, one per GPU
torchrun --nproc_per_node=8 --master_port=20001 /workspace/FastChat/fastchat/train/train_mem.py \
  --model_name_or_path /workspace/llama-13b \
  --data_path /workspace/as_conversations.json \
  --bf16 True \
  --output_dir /workspace/airoboros-uncensored-13b \
  --num_train_epochs 3 \
  --per_device_train_batch_size 20 \
  --per_device_eval_batch_size 20 \
  --gradient_accumulation_steps 2 \
  --evaluation_strategy "steps" \
  --eval_steps 500 \
  --save_strategy "steps" \
  --save_steps 500 \
  --save_total_limit 10 \
  --learning_rate 2e-5 \
  --weight_decay 0. \
  --warmup_ratio 0.04 \
  --lr_scheduler_type "cosine" \
  --logging_steps 1 \
  --fsdp "full_shard auto_wrap offload" \
  --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
  --tf32 True \
  --model_max_length 2048 \
  --gradient_checkpointing True \
  --lazy_preprocess True
```

This ran on 8x NVIDIA A100 80GB GPUs for about 40 hours.
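
As a rough sanity check on the settings above, the number of samples consumed per optimizer step can be worked out from the command-line flags. This is a small sketch that assumes a single node with one process per GPU; `effective_batch_size` is an illustrative helper, not part of FastChat:

```python
# Values taken from the torchrun command in this section.
NUM_GPUS = 8                      # --nproc_per_node
PER_DEVICE_TRAIN_BATCH_SIZE = 20  # --per_device_train_batch_size
GRADIENT_ACCUMULATION_STEPS = 2   # --gradient_accumulation_steps


def effective_batch_size(num_gpus: int, per_device: int, accum_steps: int) -> int:
    """Samples contributing to one optimizer step under data-parallel training."""
    return num_gpus * per_device * accum_steps


if __name__ == "__main__":
    total = effective_batch_size(
        NUM_GPUS, PER_DEVICE_TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS
    )
    print(total)  # 8 * 20 * 2 = 320
```

To reproduce the run on fewer GPUs, raising the accumulation steps so that this product stays the same is one common way to keep the optimization behavior comparable.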

![train/loss](IMG_0128.jpeg)

![eval/loss](IMG_0130.jpeg)

### Prompt format

The prompt should be 1:1 compatible with the FastChat/vicuna format, e.g.:

With a preamble:
```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions.

USER: [prompt]
</s>

ASSISTANT:
```

Or just:
```
USER: [prompt]
</s>

ASSISTANT:
```
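
The two templates can be assembled programmatically. The sketch below is illustrative only: `build_prompt`, `PREAMBLE`, and `EOS` are hypothetical names, the line breaks mirror the templates in this section as closely as can be inferred, and it assumes the end-of-sequence marker is the LLaMA tokenizer's `</s>` string.

```python
# Preamble text copied from the "With a preamble" template.
PREAMBLE = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

# Assumption: the end-of-sequence marker is the LLaMA tokenizer's "</s>" string.
EOS = "</s>"


def build_prompt(user_message: str, with_preamble: bool = True) -> str:
    """Build a single-turn prompt in the layout shown in the templates."""
    parts = []
    if with_preamble:
        # Preamble, then a blank line before the user turn.
        parts.append(PREAMBLE + "\n\n")
    # User turn, EOS marker on its own line, blank line, then the assistant cue.
    parts.append(f"USER: {user_message}\n{EOS}\n\nASSISTANT:")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("What is the capital of France?"))
```

The model's reply is whatever it generates after the trailing `ASSISTANT:` cue, so generation code would typically strip the prompt from the decoded output.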

### License
The model is licensed under the LLaMA model license, and the dataset is subject to OpenAI's terms of use because it was generated with ChatGPT. Everything else is free.