ShieldX committed on
Commit
c78423d
1 Parent(s): 8c4cf8f

Update README.md

  1. README.md +182 -157
README.md CHANGED
@@ -12,199 +12,224 @@ language:
12
  metrics:
13
  - accuracy
14
  pipeline_tag: text-generation
 
15
  ---
16
 
17
- # Model Card for Model ID
18
 
19
- <!-- Provide a quick summary of what the model is/does. -->
 
 
20
 
 
 
 
 
 
 
 
 
 
 
21
 
 
22
 
23
- ## Model Details
24
 
25
- ### Model Description
26
 
27
- <!-- Provide a longer summary of what this model is. -->
28
 
29
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
30
 
31
- - **Developed by:** [More Information Needed]
32
- - **Funded by [optional]:** [More Information Needed]
33
- - **Shared by [optional]:** [More Information Needed]
34
- - **Model type:** [More Information Needed]
35
- - **Language(s) (NLP):** [More Information Needed]
36
- - **License:** [More Information Needed]
37
- - **Finetuned from model [optional]:** [More Information Needed]
38
 
39
- ### Model Sources [optional]
40
 
41
- <!-- Provide the basic links for the model. -->
42
 
43
- - **Repository:** [More Information Needed]
44
- - **Paper [optional]:** [More Information Needed]
45
- - **Demo [optional]:** [More Information Needed]
46
 
47
- ## Uses
48
 
49
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
50
 
51
- ### Direct Use
52
 
53
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
 
 
54
 
55
- [More Information Needed]
56
 
57
- ### Downstream Use [optional]
58
 
59
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 
 
60
 
61
- [More Information Needed]
62
 
63
- ### Out-of-Scope Use
64
 
65
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
66
 
67
- [More Information Needed]
68
 
69
- ## Bias, Risks, and Limitations
70
 
71
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
72
 
73
- [More Information Needed]
74
-
75
- ### Recommendations
76
-
77
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
78
-
79
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
80
-
81
- ## How to Get Started with the Model
82
-
83
- Use the code below to get started with the model.
84
-
85
- [More Information Needed]
86
-
87
- ## Training Details
88
-
89
- ### Training Data
90
-
91
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
92
-
93
- [More Information Needed]
94
-
95
- ### Training Procedure
96
-
97
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
98
-
99
- #### Preprocessing [optional]
100
-
101
- [More Information Needed]
102
-
103
-
104
- #### Training Hyperparameters
105
-
106
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
107
-
108
- #### Speeds, Sizes, Times [optional]
109
-
110
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
111
-
112
- [More Information Needed]
113
-
114
- ## Evaluation
115
-
116
- <!-- This section describes the evaluation protocols and provides the results. -->
117
-
118
- ### Testing Data, Factors & Metrics
119
-
120
- #### Testing Data
121
-
122
- <!-- This should link to a Dataset Card if possible. -->
123
-
124
- [More Information Needed]
125
-
126
- #### Factors
127
-
128
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
129
-
130
- [More Information Needed]
131
-
132
- #### Metrics
133
-
134
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
135
-
136
- [More Information Needed]
137
-
138
- ### Results
139
-
140
- [More Information Needed]
141
-
142
- #### Summary
143
-
144
-
145
-
146
- ## Model Examination [optional]
147
-
148
- <!-- Relevant interpretability work for the model goes here -->
149
-
150
- [More Information Needed]
151
-
152
- ## Environmental Impact
153
-
154
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
155
 
156
  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
157
 
158
- - **Hardware Type:** [More Information Needed]
159
- - **Hours used:** [More Information Needed]
160
- - **Cloud Provider:** [More Information Needed]
161
- - **Compute Region:** [More Information Needed]
162
- - **Carbon Emitted:** [More Information Needed]
163
-
164
- ## Technical Specifications [optional]
165
-
166
- ### Model Architecture and Objective
167
-
168
- [More Information Needed]
169
-
170
- ### Compute Infrastructure
171
-
172
- [More Information Needed]
173
-
174
- #### Hardware
175
-
176
- [More Information Needed]
177
-
178
- #### Software
179
-
180
- [More Information Needed]
181
-
182
- ## Citation [optional]
183
 
184
  <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
185
 
186
  **BibTeX:**
187
 
188
- [More Information Needed]
189
-
190
- **APA:**
191
-
192
- [More Information Needed]
193
-
194
- ## Glossary [optional]
195
-
196
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
197
-
198
- [More Information Needed]
199
-
200
- ## More Information [optional]
201
-
202
- [More Information Needed]
203
 
204
- ## Model Card Authors [optional]
205
 
206
- [More Information Needed]
207
 
208
- ## Model Card Contact
209
 
210
- [More Information Needed]
 
12
  metrics:
13
  - accuracy
14
  pipeline_tag: text-generation
15
+ base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
16
  ---
17
 
18
+ # Uploaded model
19
 
20
+ - **Developed by:** ShieldX
21
+ - **License:** apache-2.0
22
+ - **Finetuned from model:** TinyLlama/TinyLlama-1.1B-Chat-v1.0
23
 
24
+ <style>
25
+ img{
26
+ width: 40vw;
27
+ height: auto;
28
+ margin: 0 auto;
29
+ display: flex;
30
+ align-items: center;
31
+ justify-content: center;
32
+ }
33
+ </style>
34
 
35
+ # ShieldX/manovyadh-1.1B-v1
36
 
37
+ Introducing ManoVyadh, a fine-tuned version of TinyLlama 1.1B Chat trained on a mental health counselling dataset.
38
 
 
39
 
40
+ <img class="custom-image" src="manovyadh.png" alt="ManoVyadh">
41
 
 
42
 
43
+ # Model Details
 
 
 
 
 
 
44
 
45
+ ## Model Description
46
 
47
+ ManoVyadh is an LLM for mental health counselling.
48
 
49
+ # Uses
 
 
50
 
51
+ ## Direct Use
52
 
53
+ - As a base model for further fine-tuning
54
+ - For fun
55
 
 
56
 
57
+ ## Downstream Use
58
+
59
+ - Can be deployed behind an API
60
+ - Can be used to build a web app or mobile app for demo purposes
61
 
 
62
 
63
+ ## Out-of-Scope Use
64
 
65
+ - Not intended for production use
66
+ - Must not be used for real-life health or medical purposes
67
+ - Should not be used to generate text for research or academic purposes
68
 
 
69
 
70
+ # Bias, Risks, and Limitations
71
 
72
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
73
 
74
+ # Training Details
75
 
76
+ # Model Examination
77
 
78
+ We plan to fine-tune this model further on a larger dataset to see how it performs.
79
 
80
+ # Environmental Impact
81
 
82
  Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
83
 
84
+ - **Hardware Type:** 1 X Tesla T4
85
+ - **Hours used:** 0.48
86
+ - **Cloud Provider:** Google Colab
87
+ - **Compute Region:** India
88
+
89
+ # Technical Specifications
90
+
91
+ ## Model Architecture and Objective
92
+
93
+ Fine-tuned from the TinyLlama 1.1B Chat model.
94
+
95
+ ### Hardware
96
+
97
+ 1 X Tesla T4
98
+
99
+ # Training
100
+
101
+ This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) on the [ShieldX/manovyadh-3.5k](https://huggingface.co/datasets/ShieldX/manovyadh-3.5k) dataset.
102
+ It achieves the following results on the evaluation set:
103
+ - Loss: 1.8587
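
A loss value alone can be hard to interpret, so the sketch below converts the reported evaluation loss into perplexity. It assumes the loss is the mean per-token cross-entropy in nats, which is what the Transformers `Trainer` normally reports for causal language models; the card itself does not state this. The variable names are illustrative only.

```python
import math

# Validation losses copied from the model card.
# Assumption: each is a mean per-token cross-entropy measured in nats.
final_eval_loss = 1.8587    # step 400
initial_eval_loss = 2.5428  # step 5, first row of the results table

# Perplexity is the exponential of the mean cross-entropy.
final_perplexity = math.exp(final_eval_loss)
initial_perplexity = math.exp(initial_eval_loss)

print(f"perplexity at step 5:   {initial_perplexity:.2f}")
print(f"perplexity at step 400: {final_perplexity:.2f}")
```

Under that assumption, the final perplexity comes out at roughly 6.4, about half of its value at the first evaluation step.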
104
+
105
+ ## Training procedure
106
+
107
+ ### Training hyperparameters
108
+
109
+ The following hyperparameters were used during training:
110
+ - learning_rate: 2.5e-05
111
+ - train_batch_size: 2
112
+ - eval_batch_size: 8
113
+ - seed: 42
114
+ - gradient_accumulation_steps: 4
115
+ - total_train_batch_size: 8
116
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
117
+ - lr_scheduler_type: linear
118
+ - training_steps: 400
119
+ - mixed_precision_training: Native AMP
120
+
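
Two of the hyperparameters above are derived rather than free: `total_train_batch_size` follows from the per-device batch size and gradient accumulation, and the learning rate at any step follows from the linear scheduler. The sketch below reproduces both. It assumes a single device (the card lists one Tesla T4) and zero warmup steps, since no warmup is listed; the function names are hypothetical helpers, not part of any library.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Number of samples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=2.5e-5, total_steps=400, warmup_steps=0):
    """Learning rate under linear decay to zero.

    Assumption: warmup_steps=0, because the card does not list a warmup.
    """
    if warmup_steps and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# train_batch_size=2 and gradient_accumulation_steps=4 on one GPU.
print(effective_batch_size(2, 4))

# Learning rate at the start, midpoint, and end of the 400 training steps.
for step in (0, 200, 400):
    print(step, linear_lr(step))
```

With the listed values, the effective batch size is 2 x 4 = 8, matching `total_train_batch_size`, and the learning rate halves by step 200 before reaching zero at step 400.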
121
+ ### Training results
122
+
123
+ | Training Loss | Epoch | Step | Validation Loss |
124
+ |:-------------:|:-----:|:----:|:---------------:|
125
+ | 2.5894 | 0.01 | 5 | 2.5428 |
126
+ | 2.5283 | 0.02 | 10 | 2.5240 |
127
+ | 2.5013 | 0.03 | 15 | 2.5033 |
128
+ | 2.378 | 0.05 | 20 | 2.4770 |
129
+ | 2.3735 | 0.06 | 25 | 2.4544 |
130
+ | 2.3894 | 0.07 | 30 | 2.4335 |
131
+ | 2.403 | 0.08 | 35 | 2.4098 |
132
+ | 2.3719 | 0.09 | 40 | 2.3846 |
133
+ | 2.3691 | 0.1 | 45 | 2.3649 |
134
+ | 2.3088 | 0.12 | 50 | 2.3405 |
135
+ | 2.3384 | 0.13 | 55 | 2.3182 |
136
+ | 2.2577 | 0.14 | 60 | 2.2926 |
137
+ | 2.245 | 0.15 | 65 | 2.2702 |
138
+ | 2.1389 | 0.16 | 70 | 2.2457 |
139
+ | 2.1482 | 0.17 | 75 | 2.2176 |
140
+ | 2.1567 | 0.18 | 80 | 2.1887 |
141
+ | 2.1533 | 0.2 | 85 | 2.1616 |
142
+ | 2.0629 | 0.21 | 90 | 2.1318 |
143
+ | 2.1068 | 0.22 | 95 | 2.0995 |
144
+ | 2.0196 | 0.23 | 100 | 2.0740 |
145
+ | 2.062 | 0.24 | 105 | 2.0461 |
146
+ | 1.9436 | 0.25 | 110 | 2.0203 |
147
+ | 1.9348 | 0.26 | 115 | 1.9975 |
148
+ | 1.8803 | 0.28 | 120 | 1.9747 |
149
+ | 1.9108 | 0.29 | 125 | 1.9607 |
150
+ | 1.7826 | 0.3 | 130 | 1.9506 |
151
+ | 1.906 | 0.31 | 135 | 1.9374 |
152
+ | 1.8745 | 0.32 | 140 | 1.9300 |
153
+ | 1.8634 | 0.33 | 145 | 1.9232 |
154
+ | 1.8561 | 0.35 | 150 | 1.9183 |
155
+ | 1.8371 | 0.36 | 155 | 1.9147 |
156
+ | 1.8006 | 0.37 | 160 | 1.9106 |
157
+ | 1.8941 | 0.38 | 165 | 1.9069 |
158
+ | 1.8456 | 0.39 | 170 | 1.9048 |
159
+ | 1.8525 | 0.4 | 175 | 1.9014 |
160
+ | 1.8475 | 0.41 | 180 | 1.8998 |
161
+ | 1.8255 | 0.43 | 185 | 1.8962 |
162
+ | 1.9358 | 0.44 | 190 | 1.8948 |
163
+ | 1.758 | 0.45 | 195 | 1.8935 |
164
+ | 1.7859 | 0.46 | 200 | 1.8910 |
165
+ | 1.8412 | 0.47 | 205 | 1.8893 |
166
+ | 1.835 | 0.48 | 210 | 1.8875 |
167
+ | 1.8739 | 0.49 | 215 | 1.8860 |
168
+ | 1.9397 | 0.51 | 220 | 1.8843 |
169
+ | 1.8187 | 0.52 | 225 | 1.8816 |
170
+ | 1.8174 | 0.53 | 230 | 1.8807 |
171
+ | 1.8 | 0.54 | 235 | 1.8794 |
172
+ | 1.7736 | 0.55 | 240 | 1.8772 |
173
+ | 1.7429 | 0.56 | 245 | 1.8778 |
174
+ | 1.8024 | 0.58 | 250 | 1.8742 |
175
+ | 1.8431 | 0.59 | 255 | 1.8731 |
176
+ | 1.7692 | 0.6 | 260 | 1.8706 |
177
+ | 1.8084 | 0.61 | 265 | 1.8698 |
178
+ | 1.7602 | 0.62 | 270 | 1.8705 |
179
+ | 1.7751 | 0.63 | 275 | 1.8681 |
180
+ | 1.7403 | 0.64 | 280 | 1.8672 |
181
+ | 1.8078 | 0.66 | 285 | 1.8648 |
182
+ | 1.8464 | 0.67 | 290 | 1.8648 |
183
+ | 1.7853 | 0.68 | 295 | 1.8651 |
184
+ | 1.8546 | 0.69 | 300 | 1.8643 |
185
+ | 1.8319 | 0.7 | 305 | 1.8633 |
186
+ | 1.7908 | 0.71 | 310 | 1.8614 |
187
+ | 1.738 | 0.72 | 315 | 1.8625 |
188
+ | 1.8868 | 0.74 | 320 | 1.8630 |
189
+ | 1.7744 | 0.75 | 325 | 1.8621 |
190
+ | 1.8292 | 0.76 | 330 | 1.8609 |
191
+ | 1.7905 | 0.77 | 335 | 1.8623 |
192
+ | 1.7652 | 0.78 | 340 | 1.8610 |
193
+ | 1.8371 | 0.79 | 345 | 1.8611 |
194
+ | 1.7024 | 0.81 | 350 | 1.8593 |
195
+ | 1.7328 | 0.82 | 355 | 1.8593 |
196
+ | 1.7376 | 0.83 | 360 | 1.8606 |
197
+ | 1.747 | 0.84 | 365 | 1.8601 |
198
+ | 1.7777 | 0.85 | 370 | 1.8602 |
199
+ | 1.8701 | 0.86 | 375 | 1.8598 |
200
+ | 1.7165 | 0.87 | 380 | 1.8579 |
201
+ | 1.779 | 0.89 | 385 | 1.8588 |
202
+ | 1.8536 | 0.9 | 390 | 1.8583 |
203
+ | 1.7263 | 0.91 | 395 | 1.8582 |
204
+ | 1.7983 | 0.92 | 400 | 1.8587 |
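
The Epoch and Step columns can be cross-checked against the batch size. The sketch below estimates how many samples the run processed and what training split size that implies. It assumes the Epoch column is samples seen divided by the size of the training split, and that the logged epoch is rounded to two decimals; neither is stated in the card, and the variable names are illustrative.

```python
# Values taken from the hyperparameter list and the last table row.
total_steps = 400
total_train_batch_size = 8
epoch_at_last_step = 0.92  # assumed to be rounded to two decimals

# Samples processed over the whole run.
samples_seen = total_steps * total_train_batch_size

# Implied training split size, with bounds from the rounding of the epoch.
estimated_train_size = samples_seen / epoch_at_last_step
lower_bound = samples_seen / (epoch_at_last_step + 0.005)
upper_bound = samples_seen / (epoch_at_last_step - 0.005)

print(f"samples seen: {samples_seen}")
print(f"estimated training split size: {estimated_train_size:.0f}")
print(f"plausible range: {lower_bound:.0f} to {upper_bound:.0f}")
```

The run processed 3,200 samples, which implies a training split of roughly 3.5k examples. That agrees with the dataset name `manovyadh-3.5k` and shows the run stopped slightly short of one full epoch.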
205
+
206
+
207
+ ### Framework versions
208
+ - PEFT 0.7.1
209
+ - Transformers 4.37.1
210
+ - Pytorch 2.1.0+cu121
211
+ - Datasets 2.16.1
212
+ - Tokenizers 0.15.1
213
+
214
+ # Citation
215
 
216
  <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
217
 
218
  **BibTeX:**
219
 
220
+ ```
221
+ @misc{ShieldX/manovyadh-1.1B-v1,
222
+ url={https://huggingface.co/ShieldX/manovyadh-1.1B-v1},
223
+ title={ManoVyadh},
224
+ author={Rohan Shaw},
225
+ year={2024}, month={Jan}
226
+ }
227
+ ```
 
 
 
 
 
 
 
228
 
229
+ # Model Card Authors
230
 
231
+ ShieldX (a.k.a. Rohan Shaw)
232
 
233
+ # Model Card Contact
234
 
235
+ email : [email protected]