qgyd2021/chinese_chitchat

Text Generation

Generated from Trainer

text-generation-inference

Inference Endpoints


qgyd2021 committed on Nov 16, 2023

Commit 2b195f6 • 1 Parent(s): e3a6d76

Model save

Files changed (2)

README.md +56 -3
pytorch_model.bin +1 -1

README.md CHANGED

@@ -12,8 +12,61 @@ should probably proofread and complete it, then remove this comment. -->
 # chinese_chitchat
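-This model is fine-tuned from [uer/gpt2-chinese-cluecorpussmall](https://huggingface.co/uer/gpt2-chinese-cluecorpussmall) on the [xiaohuangji](https://huggingface.co/datasets/qgyd2021/chinese_chitchat/viewer/xiaohuangji) subset of the [qgyd2021/chinese_chitchat](https://huggingface.co/datasets/qgyd2021/chinese_chitchat) dataset.
-Many samples in this dataset (xiaohuangji) have answers that are unrelated to their questions, so the data is noisy; despite the 450,000 samples, the results do not seem very good.
-It was trained in two runs, the first for 26,000 steps and the second for 8,000 steps, roughly 10 epochs in total.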
+This model is a fine-tuned version of [qgyd2021/chinese_chitchat](https://huggingface.co/qgyd2021/chinese_chitchat) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 2.1314
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0002
+- train_batch_size: 16
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 128
+- total_eval_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 10000
+- num_epochs: 40.0
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss |
+|:-------------:|:-----:|:-----:|:---------------:|
+| 1.5203        | 0.29  | 1000  | 2.0882          |
+| 1.4243        | 0.58  | 2000  | 2.1525          |
+| 1.3502        | 0.86  | 3000  | 2.1544          |
+| 1.5332        | 1.15  | 4000  | 2.0826          |
+| 1.5208        | 1.44  | 5000  | 2.0789          |
+| 1.5521        | 1.73  | 6000  | 2.0613          |
+| 1.5634        | 2.02  | 7000  | 2.1124          |
+| 1.5067        | 2.3   | 8000  | 2.1014          |
+| 1.5573        | 2.59  | 9000  | 2.0972          |
+| 1.5949        | 2.88  | 10000 | 2.0907          |
+| 1.5491        | 3.17  | 11000 | 2.1314          |
+### Framework versions
+- Transformers 4.33.0
+- Pytorch 2.0.0
+- Datasets 2.1.0
+- Tokenizers 0.13.3
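
The `total_train_batch_size` reported in the new README follows from the per-device batch size, the device count, and the gradient accumulation steps. The minimal sketch below reproduces that arithmetic. As an assumption, it also estimates the training-set size from the last row of the results table; because the Epoch column is rounded, that estimate is only approximate. The variable names are illustrative and are not Trainer API calls.

```python
# Values copied from the hyperparameter list in the README diff.
train_batch_size = 16             # per-device batch size
num_devices = 2
gradient_accumulation_steps = 4

# Samples consumed per optimizer step across all devices.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)
print(total_train_batch_size)  # 128, matching the reported value

# Rough estimate (assumption: the Epoch column is rounded to two decimals).
# The last table row reports step 11000 at epoch 3.17.
steps_per_epoch = 11000 / 3.17
approx_train_samples = steps_per_epoch * total_train_batch_size
print(round(steps_per_epoch), round(approx_train_samples))
```

The estimate comes out at roughly 3,470 steps per epoch and about 444,000 training samples, which is in line with the roughly 450,000 samples mentioned in the removed README text.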

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a784aa34ddf249484a1c8ec63289232467021f4d43bc7f023e6aeeacb616b6bd
+oid sha256:05886975fd5d351ea65cf7d0426cdd19bb82dc0f36dde5657e50c0f306debe0f
 size 408322909
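
The `pytorch_model.bin` entry in the diff is a Git LFS pointer, not the weights themselves: it records a spec version, a SHA-256 object ID, and a byte size. The size is identical before and after this commit, so only the digest distinguishes the two versions. The sketch below shows one way a downloaded file might be checked against the new pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS or Hugging Face tooling.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and SHA-256 digest in the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so a ~400 MB file is not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer contents from this commit.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:05886975fd5d351ea65cf7d0426cdd19bb82dc0f36dde5657e50c0f306debe0f
size 408322909
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the weights, `matches_pointer(Path("pytorch_model.bin"), pointer)` would report whether that file corresponds to this commit; the local path is an assumption about where the file was saved.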