Introduction
A led-base-16384 model fine-tuned to summarize ArXiv papers. The input is the full text of a paper (abstract plus body); the output is a summary of the paper.
The base model is AllenAI's Longformer Encoder-Decoder (LED). As described in Longformer: The Long-Document Transformer by Iz Beltagy, Matthew E. Peters, and Arman Cohan, led-base-16384 was initialized from bart-base, since both models share the same architecture. To process up to 16K tokens, bart-base's position embedding matrix was simply copied 16 times.
ROUGE-2
Type | Score |
---|---|
precision | 0.1839148953011932 |
recall | 0.14904707945189774 |
fmeasure | 0.1580026685776864 |
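ROUGE-2 measures bigram overlap between a generated summary and the reference abstract. A self-contained sketch of the precision/recall/F-measure computation (plain whitespace tokenization, no stemming; the official rouge-score package adds preprocessing on top of this):

```python
from collections import Counter

def rouge2(reference: str, candidate: str) -> dict:
    """Compute ROUGE-2 precision, recall, and F-measure for one pair.

    Simplified sketch: lowercased whitespace tokens, no stemming.
    """
    def bigrams(text: str) -> Counter:
        toks = text.lower().split()
        return Counter(zip(toks, toks[1:]))

    ref, cand = bigrams(reference), bigrams(candidate)
    overlap = sum((ref & cand).values())  # clipped bigram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    fmeasure = (
        2 * precision * recall / (precision + recall) if overlap else 0.0
    )
    return {"precision": precision, "recall": recall, "fmeasure": fmeasure}

# Toy example: 3 of 5 candidate bigrams match the reference.
scores = rouge2("the cat sat on the mat", "the cat sat on a mat")
print(scores)  # precision = recall = fmeasure = 0.6
```

Note that corpus-level scores like those in the table are averaged per example, so the reported fmeasure need not equal the harmonic mean of the aggregate precision and recall.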
Evaluation results
All metrics below were verified on the ccdv/arxiv-summarization test set.
- ROUGE-1: 37.325
- ROUGE-2: 10.895
- ROUGE-L: 20.387
- ROUGE-LSUM: 33.301
- loss: 3.182
- gen_len: 145.590