Update README.md
> NOTE: this is still a work-in-progress (WIP) and not completed/converged by any means, but sharing to maybe save some time for others :)
## About
- a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) fine-tuned on `kmfoda/booksum` for about 20 epochs
- max input lengths during training varied between 8192 and 16384 tokens, depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final two epochs**
- **An important caveat** (and part of why this is WIP) is that the dataset was filtered to only contain summaries of 1024 **characters** or shorter, rather than 1024 tokens, so the reference summaries are much shorter than intended. Other checkpoints I post will have this fixed.
- the tl;dr of this is that if you use this checkpoint for inference, it will produce short summaries of the input text (perhaps shorter than you wanted); see the usage sketch below.
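
As a rough usage sketch (not from the original model card): loading the checkpoint with 🤗 `transformers` and generating a summary. The repo id below is the base checkpoint linked above, used only as a placeholder; swap in this checkpoint's own repo id, and treat the generation settings as illustrative rather than tuned values.

```python
# Minimal sketch: long-document summarization with a LongT5 checkpoint.
# NOTE: the repo id below is the *base* model linked above, used as a placeholder;
# replace it with this checkpoint's own Hub repo id.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "Stancld/longt5-tglobal-large-16384-pubmed-3k_steps"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

long_text = "..."  # the (long) document to summarize

# 16384 matches the max input length used for the final two epochs of training
inputs = tokenizer(long_text, max_length=16384, truncation=True, return_tensors="pt")

with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        max_new_tokens=256,  # summaries tend to come out short (see the caveat above)
        num_beams=4,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```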
## Comparisons
- compare to [pszemraj/led-large-book-summary](https://huggingface.co/pszemraj/led-large-book-summary). _Note: the LED-large model's training data was filtered correctly to summaries of up to 1024 tokens, which is why its summaries are longer._
- I kept the inference API example the same across checkpoints for reference.