Update README.md
> NOTE: this is still a work-in-progress (WIP) and not completed/converged by any means, but sharing to maybe save some time for others :)
## About
- a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) fine-tuned on `kmfoda/booksum` for about 20 epochs
- max input lengths during training varied between 8192 and 16384 tokens, depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final two epochs**
- **An important caveat** (and part of why this is WIP) is that the dataset was filtered to only contain summaries of 1024 **characters** or shorter, rather than 1024 tokens, so the reference summaries are much shorter than intended. Other checkpoints I post will have this fixed.
- the tl;dr of this is that if you use this checkpoint for inference, it will produce short summaries of the input text (perhaps shorter than you wanted); see the usage sketch below.
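
As a rough usage sketch (not from the original model card): loading the checkpoint with 🤗 `transformers` and generating a summary. The repo id below is the base checkpoint linked above, used only as a placeholder; swap in this checkpoint's own repo id, and treat the generation settings as illustrative rather than tuned values.

```python
# Minimal sketch: long-document summarization with a LongT5 checkpoint.
# NOTE: the repo id below is the *base* model linked above, used as a placeholder;
# replace it with this checkpoint's own Hub repo id.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "Stancld/longt5-tglobal-large-16384-pubmed-3k_steps"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

long_text = "..."  # the (long) document to summarize

# 16384 matches the max input length used for the final two epochs of training
inputs = tokenizer(long_text, max_length=16384, truncation=True, return_tensors="pt")

with torch.no_grad():
    summary_ids = model.generate(
        **inputs,
        max_new_tokens=256,  # summaries tend to come out short (see the caveat above)
        num_beams=4,
    )

print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```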
## Comparisons
- compare to [pszemraj/led-large-book-summary](https://huggingface.co/pszemraj/led-large-book-summary). _Note: the LED-large model's training data was filtered correctly to summaries of up to 1024 tokens, which is why its summaries are longer._
- I kept the inference API example the same across checkpoints for reference.