---
language:
- ru
- en
license: apache-2.0
base_model: Dmitriy007/rugpt2_gen_news
tags:
- not-for-all-audiences
- art
- humour
- jokes
- generated_from_trainer
model-index:
- name: zeio/clown
  results: []
datasets:
- zeio/baneks
metrics:
- loss
widget:
- text: 'Купил мужик шляпу'
  example_title: hat
- text: 'Пришла бабка к врачу'
  example_title: doctor
- text: 'Нашел мужик подкову'
  example_title: horseshoe
---

<p align="center">
    <img src="https://i.ibb.co/QbQZ8Gz/clown-logo.png"/>
</p>

# clown

This model is a fine-tuned version of [Dmitriy007/rugpt2_gen_news][base], trained on the [baneks][dataset] dataset for 10 epochs. It reached a training loss of `2.0760`.
Model evaluation has not been performed.

## Model description

The model is a fine-tuned variant of the [Dmitriy007/rugpt2_gen_news][base] architecture with a causal language modeling head.

## Intended uses & limitations

The model is intended for studying how well language models can generate jokes.
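For readers who want to try the model, the following is a minimal sketch of text generation with the `transformers` pipeline. It assumes the checkpoint is published under the repository id `zeio/clown` (taken from the `model-index` entry in the card metadata). The helper name `generate_joke` and the sampling settings (`top_p`, `max_new_tokens`) are illustrative choices, not values documented by the author.

```python
MODEL_ID = "zeio/clown"  # assumption: repository id matches the model-index name


def generate_joke(prompt: str, max_new_tokens: int = 60, seed: int = 42) -> str:
    """Continue a Russian prompt with the fine-tuned model (hypothetical helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline, set_seed

    set_seed(seed)  # make sampling reproducible
    generator = pipeline("text-generation", model=MODEL_ID)
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # sampling settings are illustrative, not from the card
        top_p=0.95,
    )
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate_joke("Купил мужик шляпу")` would continue one of the widget prompts. The output is sampled text and may be offensive, as the `not-for-all-audiences` tag indicates.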

## Training and evaluation data

The model was trained on a collection of anecdotes gathered from several VK communities (see the [baneks][dataset] dataset for more details).
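The training data can probably be inspected with the `datasets` library. The sketch below assumes the dataset id `zeio/baneks` from the card metadata. The card does not say which configuration or split was used for training, so the helper `load_baneks` (a hypothetical name) simply falls back to the first configuration reported by the Hub.

```python
DATASET_ID = "zeio/baneks"  # dataset id from the card metadata


def load_baneks(config_name=None, split=None):
    """Load the baneks dataset; the configuration choice is an assumption."""
    # Imported lazily so this module can be loaded without datasets installed.
    from datasets import get_dataset_config_names, load_dataset

    if config_name is None:
        # The card does not name a configuration, so take the first one listed.
        config_name = get_dataset_config_names(DATASET_ID)[0]
    return load_dataset(DATASET_ID, config_name, split=split)
```

Passing an explicit `config_name` is safer once the configuration actually used for training is known.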

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 10
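The hyperparameters listed above can be sketched as a configuration for the `transformers` `Trainer`. The snippet maps them onto `TrainingArguments` keyword names, checks that the total batch size of 64 follows from 8 samples per step times 8 accumulation steps (assuming a single device), and approximates the shape of a cosine schedule with linear warmup. The output directory and the `total_steps` argument are hypothetical, since the card does not report the number of optimizer steps.

```python
import math

# Values copied from the hyperparameter list; keys follow TrainingArguments naming.
HYPERPARAMETERS = {
    "learning_rate": 5e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 8,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_steps": 1000,
    "num_train_epochs": 10,
}


def effective_batch_size(params, num_devices=1):
    """Samples per optimizer step; a single device is an assumption."""
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


def lr_at_step(step, total_steps, params=HYPERPARAMETERS):
    """Linear warmup to the peak rate, then cosine decay to zero."""
    peak = params["learning_rate"]
    warmup = params["warmup_steps"]
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


def build_training_arguments(output_dir="clown-checkpoints"):
    """Build TrainingArguments; output_dir is a hypothetical path."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

With these values `effective_batch_size(HYPERPARAMETERS)` returns 64, matching `total_train_batch_size`, and the learning rate reaches its peak of 0.0005 at step 1000 before decaying.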

### Training results

| Train Loss | Epoch |
|:----------:|:-----:|
| 2.0760     | 10    |
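To make the reported loss easier to interpret, it can be converted to perplexity. This sketch assumes the loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training with `transformers` but is not stated in the card. Because it is a training loss, the result says nothing about held-out data.

```python
import math

TRAIN_LOSS = 2.0760  # final training loss from the table above


def perplexity(mean_cross_entropy_nats):
    """Perplexity is the exponential of the mean cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy_nats)


train_perplexity = perplexity(TRAIN_LOSS)
print(f"{train_perplexity:.2f}")  # prints 7.97
```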

### Framework versions

- Transformers 4.34.0
- PyTorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.14.1

[base]: https://huggingface.co/Dmitriy007/rugpt2_gen_news
[dataset]: https://huggingface.co/datasets/zeio/baneks