Hungarian Abstractive Summarization BART model
For further models, scripts and details, see our repository or our demo site.
- BART base model (see the Results table):
  - Pretrained on Webcorpus 2.0
  - Fine-tuned on the NOL corpus (nol.hu)
  - Segments: 397,343
Limitations
- Input text must be tokenized beforehand (tokenizer: HuSpaCy)
- max_source_length = 512
- max_target_length = 256
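
The limits above could be applied at inference time roughly as follows. This is a minimal sketch, not the authors' script: it assumes the checkpoint loads through the generic `AutoTokenizer` and `AutoModelForSeq2SeqLM` classes of the `transformers` library, the `summarize` helper and its `model_id` argument are hypothetical names introduced here, and the caller is expected to pass text that has already been tokenized with HuSpaCy.

```python
# Limits taken from the "Limitations" list of this model card.
MAX_SOURCE_LENGTH = 512
MAX_TARGET_LENGTH = 256


def summarize(tokenized_text: str, model_id: str) -> str:
    """Summarize one pre-tokenized Hungarian document.

    `model_id` is a placeholder for the Hub ID or local path of this
    checkpoint; it is not specified in this card.
    """
    # Imported lazily so the constants can be reused without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate the source to the maximum length the model was fine-tuned with.
    inputs = tokenizer(
        tokenized_text,
        max_length=MAX_SOURCE_LENGTH,
        truncation=True,
        return_tensors="pt",
    )

    # Cap the generated summary at the fine-tuning target length.
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=MAX_TARGET_LENGTH,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Decoding settings such as beam size are left at the library defaults here; the values used to produce the reported scores are not stated in this card.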
Results
Model | HI | NOL |
---|---|---|
BART-base-512 | 30.18/13.86/22.92 | 46.48/32.40/39.45 |
BART-base-1024 | 31.86/14.59/23.79 | 47.01/32.91/39.97 |
Citation
If you use this model, please cite the following paper:
@inproceedings {yang-bart,
title = {{BARTerezzünk! - Messze, messze, messze a világtól, - BART kísérleti modellek magyar nyelvre}},
booktitle = {XVIII. Magyar Számítógépes Nyelvészeti Konferencia},
year = {2022},
publisher = {Szegedi Tudományegyetem, Informatikai Intézet},
address = {Szeged, Magyarország},
author = {Yang, Zijian Győző},
pages = {15--29}
}