---
tags:
- summarization
- summary
- booksum
- long-document
- long-form
license: apache-2.0
datasets:
- kmfoda/booksum
metrics:
- rouge
inference: false
model-index:
- name: pszemraj/long-t5-tglobal-large-pubmed-3k-booksum-16384-WIP
  results:
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: kmfoda/booksum
      type: kmfoda/booksum
      config: kmfoda--booksum
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 35.9969
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 5.9272
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 16.0136
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 32.941
      verified: true
    - name: loss
      type: loss
      value: 2.9339466094970703
      verified: true
    - name: gen_len
      type: gen_len
      value: 283.7198
      verified: true
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: samsum
      type: samsum
      config: samsum
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 26.2412
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 5.9791
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 18.7467
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 22.5566
      verified: true
    - name: loss
      type: loss
      value: 2.877626895904541
      verified: true
    - name: gen_len
      type: gen_len
      value: 47.6532
      verified: true
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: xsum
      type: xsum
      config: default
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 19.3209
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 2.7978
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 12.5816
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 15.0239
      verified: true
    - name: loss
      type: loss
      value: 4.483709335327148
      verified: true
    - name: gen_len
      type: gen_len
      value: 82.729
      verified: true
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: billsum
      type: billsum
      config: default
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 36.5688
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 12.5849
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 22.2461
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 30.6507
      verified: true
    - name: loss
      type: loss
      value: 2.6456267833709717
      verified: true
    - name: gen_len
      type: gen_len
      value: 139.0398
      verified: true
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: launch/gov_report
      type: launch/gov_report
      config: plain_text
      split: test
    metrics:
    - name: ROUGE-1
      type: rouge
      value: 37.0248
      verified: true
    - name: ROUGE-2
      type: rouge
      value: 9.0446
      verified: true
    - name: ROUGE-L
      type: rouge
      value: 18.0521
      verified: true
    - name: ROUGE-LSUM
      type: rouge
      value: 33.4723
      verified: true
    - name: loss
      type: loss
      value: 3.381495237350464
      verified: true
    - name: gen_len
      type: gen_len
      value: 211.2066
      verified: true
---

# long-t5-tglobal-large-pubmed-3k-booksum-16384-WIP

> NOTE: this checkpoint is still a work in progress (WIP) and has not converged by any means, but I'm sharing it in case it saves others some time :)

## Updates

_As I update this WIP checkpoint, I will post a note here._

- July 26, 2022: added two more epochs of training; metrics are starting to be _almost_ as good as the more-tuned `base` variant.
- July 8, 2022: added a checkpoint with ~4 epochs of training on an A100, equating to approximately 350 steps at an effective batch size of 128.
- July 4, 2022: added a checkpoint with six additional epochs of training, with the dataset summary outputs filtered to 1024 **tokens**, resolving the prior issue of short summaries.
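
The July 4 note mentions filtering the dataset's summary outputs to 1024 tokens. A minimal sketch of that kind of filter is below. It assumes that "filtered" means dropping examples whose reference summary exceeds the limit (truncating them is another possible reading), that the summary column is named `summary_text`, and it uses a whitespace word count as a stand-in for a real tokenizer. The helper names are hypothetical and not taken from the actual training code.

```python
from typing import Callable, Dict, Iterable, List

MAX_SUMMARY_TOKENS = 1024  # limit quoted in the update note above


def whitespace_token_count(text: str) -> int:
    """Stand-in token counter (assumption): swap in a real tokenizer for accurate counts."""
    return len(text.split())


def filter_by_summary_length(
    examples: Iterable[Dict[str, str]],
    count_tokens: Callable[[str], int] = whitespace_token_count,
    max_tokens: int = MAX_SUMMARY_TOKENS,
    summary_key: str = "summary_text",  # assumed column name
) -> List[Dict[str, str]]:
    """Keep only the examples whose summary has at most `max_tokens` tokens."""
    return [ex for ex in examples if count_tokens(ex[summary_key]) <= max_tokens]


# Tiny illustrative dataset: one short summary, one overly long summary.
toy_examples = [
    {"chapter": "some chapter text", "summary_text": "a short summary"},
    {"chapter": "another chapter", "summary_text": "word " * 2000},
]
kept = filter_by_summary_length(toy_examples)
```

To count tokens the way the model sees them, `count_tokens` could instead wrap the checkpoint's tokenizer, for example `lambda s: len(tokenizer(s).input_ids)`.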

## About

- A checkpoint of [Stancld/longt5-tglobal-large-16384-pubmed-3k_steps](https://huggingface.co/Stancld/longt5-tglobal-large-16384-pubmed-3k_steps) fine-tuned on `kmfoda/booksum` for about 26 epochs.
- The max input length during training varied between 8192 and 16384 tokens depending on GPU availability. This checkpoint was **trained with 16384 tokens as the max input length for the final 10+ epochs**.
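
Since the hosted inference API is disabled, the checkpoint has to be run locally. The following is a rough sketch of how that might look with `transformers`, truncating inputs at the 16384-token limit mentioned above. The decoding settings (`num_beams`, `no_repeat_ngram_size`, and so on) are illustrative assumptions, not values tuned or recommended for this checkpoint, and the helper names are hypothetical.

```python
MODEL_ID = "pszemraj/long-t5-tglobal-large-pubmed-3k-booksum-16384-WIP"
MAX_INPUT_TOKENS = 16384  # max input length used for the final training epochs


def generation_kwargs(max_summary_tokens: int = 512) -> dict:
    """Illustrative decoding settings (assumed values, not tuned for this model)."""
    return {
        "max_length": max_summary_tokens,
        "min_length": 8,
        "num_beams": 4,
        "no_repeat_ngram_size": 3,
        "early_stopping": True,
    }


def summarize(text: str, max_summary_tokens: int = 512) -> str:
    """Summarize `text` with the checkpoint; downloads the weights on first use."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate anything beyond the longest input length seen during training.
    inputs = tokenizer(
        text,
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
        return_tensors="pt",
    )
    output_ids = model.generate(**inputs, **generation_kwargs(max_summary_tokens))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(long_text)` returns the generated summary as a string. Long inputs are memory-hungry at this model size, so a GPU is likely needed for anything close to the full 16384-token window.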

## Comparisons

- Compare to [pszemraj/led-large-book-summary](https://huggingface.co/pszemraj/led-large-book-summary).
  - **The inference API has been disabled because it's too compute-intensive :/**