feat(readme): updated readme, increased max-length in config

Files changed (2)

README.md +34 -11
config.json +6 -0

README.md CHANGED

@@ -1,33 +1,51 @@
 ---
-license: mit
 base_model: igorktech/rugpt3-joker-150k
 tags:
 - generated_from_trainer
-datasets:
-- baneks
 model-index:
-- name: wit
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # wit
-This model is a fine-tuned version of [igorktech/rugpt3-joker-150k](https://huggingface.co/igorktech/rugpt3-joker-150k) on the baneks dataset.
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
@@ -47,7 +65,9 @@ The following hyperparameters were used during training:
 ### Training results
 ### Framework versions
@@ -55,3 +75,6 @@ The following hyperparameters were used during training:
 - Pytorch 2.1.0
 - Datasets 2.12.0
 - Tokenizers 0.14.1

 ---
+language:
+- ru
+- en
+license: apache-2.0
 base_model: igorktech/rugpt3-joker-150k
 tags:
+- not-for-all-audiences
+- art
+- humour
+- jokes
 - generated_from_trainer
 model-index:
+- name: zeio/wit
   results: []
+datasets:
+- zeio/baneks
+metrics:
+- loss
+widget:
+- text: 'Купил мужик шляпу'
+  example_title: hat
+- text: 'Пришла бабка к врачу'
+  example_title: doctor
+- text: 'Нашел мужик подкову'
+  example_title: horseshoe
 ---
+<p align="center">
+    <img src="https://i.ibb.co/zP7j7ng/wit-logo.png"/>
+</p>
 # wit
+This model is a fine-tuned version of [igorktech/rugpt3-joker-150k][base] on the [baneks][dataset] dataset for 1 epoch. It reached a training loss of `2.0391`.
+Model evaluation has not been performed.
 ## Model description
+The model is a fine-tuned variant of the [igorktech/rugpt3-joker-150k][base] architecture with a causal language modeling head.
 ## Intended uses & limitations
+The model is intended for studying the ability of language models to generate jokes.
 ## Training and evaluation data
+The model was trained on a list of anecdotes collected from several VK communities (see the [baneks][dataset] dataset for more details).
 ## Training procedure
 ### Training results
+| Train Loss | Epoch |
+|:----------:|:-----:|
+| 2.0391     | 10    |
 ### Framework versions
 - Pytorch 2.1.0
 - Datasets 2.12.0
 - Tokenizers 0.14.1
+[base]: https://huggingface.co/igorktech/rugpt3-joker-150k
+[dataset]: https://huggingface.co/datasets/zeio/baneks

config.json CHANGED

@@ -1,5 +1,11 @@
 {
   "_name_or_path": "igorktech/rugpt3-joker-150k",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"

 {
   "_name_or_path": "igorktech/rugpt3-joker-150k",
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 192
+    }
+  },
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"