pszemraj committed
Commit 368c55a
1 Parent(s): 258c454

Update README.md

Files changed (1): README.md (+7 −1)
README.md CHANGED
@@ -69,9 +69,15 @@ inference:
 
 > NOTE: this is still a work-in-progress (WIP) and not completed/converged by any means, but sharing to maybe save some time for others :)
 
+## About
+
 - a checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) trained on `kmfoda/booksum` for about 20 epochs
 - max input lengths during training vary between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final two epochs**
-- An important caveat (and part of why this is WIP) is that the dataset was filtered to only contain summaries of 1024 **characters** or shorter instead of tokens. Further checkpoints I post will have this fixed.
+- **An important caveat** (and part of why this is WIP) is that the dataset was filtered to only contain summaries of 1024 **characters** or shorter instead of tokens. Other checkpoints I post will have this fixed.
 - the tl;dr of this is that if you use this checkpoint for inference, it will produce short summaries of the input text (perhaps shorter than you wanted).
 
+## Comparisons
+
+- compare to [pszemraj/led-large-book-summary](https://huggingface.co/pszemraj/led-large-book-summary). _Note: the LED large model was filtered correctly to include summaries of up to 1024 tokens, which is why they are longer._
+- I kept the inference API checkpoints the same for reference.
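
For context on the caveat in the diff above: a minimal sketch of what token-based (rather than character-based) filtering of the target summaries could look like. The tokenizer repo id is taken from the base checkpoint named in the card; `keep_example` is a hypothetical helper, not part of any published training script.

```python
# Sketch only: filter target summaries by *token* count rather than character count.
# Assumes the tokenizer of the base checkpoint; `keep_example` is a hypothetical helper.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "Stancld/longt5-tglobal-large-16384-pubmed-3k_steps"
)

def keep_example(summary: str, max_tokens: int = 1024) -> bool:
    # The WIP checkpoint was filtered on len(summary) <= 1024 *characters*;
    # the intended filter counts tokens instead.
    return len(tokenizer(summary).input_ids) <= max_tokens
```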
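And a minimal inference sketch illustrating the tl;dr (summaries from this checkpoint come out short). The repo id below is a placeholder for this model card's checkpoint, and the generation settings are illustrative assumptions, not tuned values.

```python
# Minimal sketch, not a tuned setup. Replace MODEL_ID with this model card's
# actual repo id; generation settings here are illustrative assumptions.
from transformers import pipeline

MODEL_ID = "pszemraj/<this-checkpoint>"  # placeholder, not a real repo id

summarizer = pipeline("summarization", model=MODEL_ID)

long_text = open("chapter.txt").read()  # any long document, up to ~16384 tokens
result = summarizer(long_text, max_length=512, min_length=8, no_repeat_ngram_size=3)
print(result[0]["summary_text"])  # expect a fairly short summary (see caveat above)
```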