seonghyeonye committed
Commit c7a3ebd
1 Parent(s): d633172

Update README.md

Files changed (1): README.md +1 -1
README.md CHANGED
@@ -34,7 +34,7 @@ We also provide a quick [Jupyter Notebook](https://github.com/seonghyeonye/Flipp
 **Note: the model was trained with bfloat16 activations. As such, we highly discourage running inference with fp16.**
 
 # Training procedure
-FLIPPED models are based on [T5](https://huggingface.co/google/t5-v1_1-xl), a Transformer-based encoder-decoder language model pre-trained with a masked language modeling-style objective on [C4](https://huggingface.co/datasets/c4).
+FLIPPED models are based on [T5](https://huggingface.co/google/t5-v1_1-xxl), a Transformer-based encoder-decoder language model pre-trained with a masked language modeling-style objective on [C4](https://huggingface.co/datasets/c4).
 At a high level, the input text along with the output label is fed to the encoder, and the instruction text is produced by the decoder. The model is fine-tuned to autoregressively generate the target. We also feed the input text along with a wrong input, adding an unlikelihood loss so that the model does not produce the proper instruction in that case.
 Training details:
 - Fine-tuning steps: 5'000
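
For readers of the diff, here is a minimal sketch of the objective the hunk describes: a likelihood loss for generating the instruction from the correct (input, label) pair, plus an unlikelihood term that discourages generating that same instruction from a wrong input. This is an illustration under stated assumptions, not the authors' training code: the `flipped_losses` helper, the plain-text concatenation of input and label, and the unweighted sum of the two losses are all hypothetical; only the base checkpoint and the bfloat16 note come from the README itself.

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Per the note above, the model was trained with bfloat16 activations,
# so we load the base checkpoint in bfloat16 rather than fp16.
checkpoint = "google/t5-v1_1-xxl"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16
)

def flipped_losses(input_text, wrong_input_text, label, instruction):
    # Encoder input: the task input concatenated with its output label
    # (the concatenation format here is an assumption).
    correct = tokenizer(f"{input_text} {label}", return_tensors="pt")
    wrong = tokenizer(f"{wrong_input_text} {label}", return_tensors="pt")
    # Decoder target: the instruction text, generated autoregressively.
    target = tokenizer(instruction, return_tensors="pt").input_ids

    # Standard likelihood loss: produce the instruction from the correct pair.
    lm_loss = model(**correct, labels=target).loss

    # Unlikelihood loss: penalize the probability the model still assigns
    # to the proper instruction when the encoder sees the wrong input.
    logits = model(**wrong, labels=target).logits
    token_logp = (
        torch.log_softmax(logits.float(), dim=-1)
        .gather(-1, target.unsqueeze(-1))
        .squeeze(-1)
    )
    unlikelihood_loss = -torch.log1p(-token_logp.exp() + 1e-7).mean()

    # Unweighted sum is an assumption; the paper may weight the two terms.
    return lm_loss + unlikelihood_loss
```

The fix in this commit only touches the base-model link (t5-v1_1-xl to t5-v1_1-xxl); the training procedure text, including the 5'000 fine-tuning steps, is unchanged.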