seonghyeonye committed
Commit c7a3ebd
1 Parent(s): d633172

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -34,7 +34,7 @@ We also provide a quick [Jupyter Notebook](https://github.com/seonghyeonye/Flipp
 **Note: the model was trained with bfloat16 activations. As such, we highly discourage running inference with fp16.**
 
 # Training procedure
-FLIPPED models are based on [T5](https://huggingface.co/google/t5-v1_1-xl), a Transformer-based encoder-decoder language model pre-trained with a masked language modeling-style objective on [C4](https://huggingface.co/datasets/c4).
+FLIPPED models are based on [T5](https://huggingface.co/google/t5-v1_1-xxl), a Transformer-based encoder-decoder language model pre-trained with a masked language modeling-style objective on [C4](https://huggingface.co/datasets/c4).
 At a high level, the input text along with the output label is fed to the encoder, and the instruction text is produced by the decoder. The model is fine-tuned to autoregressively generate the target. We also feed the input text along with a wrong input, adding an unlikelihood loss so that the model does not produce the proper instruction in that case.
 Training details:
 - Fine-tuning steps: 5'000
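
For readers of the diff, here is a minimal sketch of the objective the hunk describes: a likelihood loss for generating the instruction from the correct (input, label) pair, plus an unlikelihood term that discourages generating that same instruction from a wrong input. This is an illustration under stated assumptions, not the authors' training code: the `flipped_losses` helper, the plain-text concatenation of input and label, and the unweighted sum of the two losses are all hypothetical; only the base checkpoint and the bfloat16 note come from the README itself.

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Per the note above, the model was trained with bfloat16 activations,
# so we load the base checkpoint in bfloat16 rather than fp16.
checkpoint = "google/t5-v1_1-xxl"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16
)

def flipped_losses(input_text, wrong_input_text, label, instruction):
    # Encoder input: the task input concatenated with its output label
    # (the concatenation format here is an assumption).
    correct = tokenizer(f"{input_text} {label}", return_tensors="pt")
    wrong = tokenizer(f"{wrong_input_text} {label}", return_tensors="pt")
    # Decoder target: the instruction text, generated autoregressively.
    target = tokenizer(instruction, return_tensors="pt").input_ids

    # Standard likelihood loss: produce the instruction from the correct pair.
    lm_loss = model(**correct, labels=target).loss

    # Unlikelihood loss: penalize the probability the model still assigns
    # to the proper instruction when the encoder sees the wrong input.
    logits = model(**wrong, labels=target).logits
    token_logp = (
        torch.log_softmax(logits.float(), dim=-1)
        .gather(-1, target.unsqueeze(-1))
        .squeeze(-1)
    )
    unlikelihood_loss = -torch.log1p(-token_logp.exp() + 1e-7).mean()

    # Unweighted sum is an assumption; the paper may weight the two terms.
    return lm_loss + unlikelihood_loss
```

The fix in this commit only touches the base-model link (t5-v1_1-xl to t5-v1_1-xxl); the training procedure text, including the 5'000 fine-tuning steps, is unchanged.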