base_model: KB/bert-base-swedish-cased
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: news_category_classification
results: []
News Category Classification for IPTC NewsCodes
This model is a fine-tuned version of KB/bert-base-swedish-cased on a private dataset.
It was trained on a limited set of English, Swedish and Norwegian news titles to classify news content into 16 categories as specified by the IPTC NewsCodes.
The training dataset is heavily imbalanced across categories; it has been lightly augmented to stabilize training.
Model description
The model is intended to categorize Norwegian, Swedish and English news content into the 16 specified categories, but it is a test model for demonstration purposes. It needs more data in several categories to be fully useful, but it is expected to outperform Claude Haiku and GPT-3.5 on this use case.
Intended uses & limitations
Use it to categorize news texts. Assign a category only when the predicted score for that label is at least 0.60 (60%); below that threshold, treat the prediction as uncertain.
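The 60% rule can be applied with a small helper. The following is a minimal sketch, not the author's code: `pick_category` and `classify` are hypothetical names, the `model_id` argument is a placeholder for wherever this model is hosted, and it assumes the scores are softmax probabilities over the 16 labels.

```python
from typing import Dict, Optional

THRESHOLD = 0.60  # minimum score recommended in this card


def pick_category(scores: Dict[str, float], threshold: float = THRESHOLD) -> Optional[str]:
    """Return the top label if its score reaches the threshold, else None (uncertain)."""
    label, score = max(scores.items(), key=lambda item: item[1])
    return label if score >= threshold else None


def classify(text: str, model_id: str, threshold: float = THRESHOLD) -> Optional[str]:
    """Score one text with the fine-tuned model and apply the threshold.

    model_id is an assumption: pass the hub ID or local path of this model.
    """
    # Imported lazily so pick_category can be used without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    probs = torch.softmax(logits, dim=-1).tolist()
    scores = {model.config.id2label[i]: p for i, p in enumerate(probs)}
    return pick_category(scores, threshold)
```

A return value of `None` means no label reached the threshold, so the text should be left uncategorized or routed to manual review.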
Test examples
Input: Mann siktet for drapsforsøk på Slovakias statsministeren
Output: crime, law and justice
Input: Tre døde i kioskbrann i Tyskland
Output: disaster, accident, and emergency incident
Input: Kultfilm får Netflix-oppfølger. Kultfilmen «Happy Gilmore» fra 1996 får en oppfølger på Netflix. Det røper strømmetjenesten selv på X, tidligere Twitter. –Happy Gilmore er tilbake!
Output: arts, culture, entertainment and media
Performance
It achieves the following results on the evaluation set:
- Loss: 0.8030
- Accuracy: 0.7431
- F1: 0.7474
- Precision: 0.7695
- Recall: 0.7431
Accuracy for each label on the evaluation set:
- Arts, culture, entertainment and media: 0.6842
- Conflict, war and peace: 0.7351
- Crime, law and justice: 0.8918
- Disaster, accident, and emergency incident: 0.8699
- Economy, business, and finance: 0.6893
- Environment: 0.4483
- Health: 0.7222
- Human interest: 0.3182
- Labour: 0.5
- Lifestyle and leisure: 0.5556
- Politics: 0.7909
- Science and technology: 0.4583
- Society: 0.3538
- Sport: 0.9615
- Weather: 1.0
- Religion: 0.0
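The card does not state how the per-label figures are computed. One plausible reading, assumed here, is that each value is the share of evaluation examples with that true label that the model predicted correctly (per-class recall). The sketch below illustrates that reading on made-up labels; the function names and the toy data are illustrative only.

```python
from collections import Counter
from typing import Dict, Sequence


def per_label_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """For each true label, the fraction of its examples that were predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / count for label, count in totals.items()}


def overall_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of all examples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (not from the evaluation set): a label with one example can only score 0.0 or 1.0.
y_true = ["sport", "sport", "religion", "politics", "politics"]
y_pred = ["sport", "sport", "politics", "politics", "society"]
scores = per_label_accuracy(y_true, y_pred)
overall = overall_accuracy(y_true, y_pred)
```

Under this reading, labels with very few evaluation examples can only take coarse values, which may be why some labels above sit at exactly 0.0, 0.5 or 1.0.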
Training and evaluation data
Trained with the Hugging Face Trainer, using a learning rate of 2e-05 and a batch size of 16 for 3 epochs.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 3
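The hyperparameters above map onto `transformers.TrainingArguments` roughly as sketched below. This is a reconstruction under assumptions, not the original training script: it assumes a single device (so 16 x 2 accumulation steps gives the total batch size of 32), the `output_dir` is a placeholder, and the total step count of 2064 is an estimate derived from the training log (200 steps correspond to about 0.2907 epochs, so about 688 steps per epoch).

```python
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 3,
}

# Assumption: about 688 optimizer steps per epoch x 3 epochs, estimated from the log.
TOTAL_STEPS_ESTIMATE = 2064


def effective_batch_size(hparams: dict, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return (
        hparams["per_device_train_batch_size"]
        * hparams["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step: int, total_steps: int = TOTAL_STEPS_ESTIMATE, hparams: dict = HPARAMS) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    warmup = hparams["warmup_steps"]
    if step < warmup:
        factor = step / warmup
    else:
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup))
    return hparams["learning_rate"] * factor


def build_training_arguments(output_dir: str = "out"):
    """Build TrainingArguments from HPARAMS; output_dir is a placeholder."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(output_dir=output_dir, **HPARAMS)
```

With these settings the warmup covers roughly the first quarter of training (500 of an estimated 2064 steps), so the peak learning rate of 2e-05 is reached partway through the first epoch.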
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall | Accuracy Label Arts, culture, entertainment and media | Accuracy Label Conflict, war and peace | Accuracy Label Crime, law and justice | Accuracy Label Disaster, accident, and emergency incident | Accuracy Label Economy, business, and finance | Accuracy Label Environment | Accuracy Label Health | Accuracy Label Human interest | Accuracy Label Labour | Accuracy Label Lifestyle and leisure | Accuracy Label Politics | Accuracy Label Religion | Accuracy Label Science and technology | Accuracy Label Society | Accuracy Label Sport | Accuracy Label Weather |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1.9761 | 0.2907 | 200 | 1.4046 | 0.6462 | 0.6164 | 0.6057 | 0.6462 | 0.3158 | 0.8315 | 0.7629 | 0.7055 | 0.5437 | 0.0 | 0.5 | 0.0 | 0.0 | 0.3333 | 0.4843 | 0.0 | 0.0833 | 0.0 | 0.9615 | 0.0 |
1.2153 | 0.5814 | 400 | 1.0225 | 0.6894 | 0.6868 | 0.7652 | 0.6894 | 0.7895 | 0.6554 | 0.8196 | 0.8562 | 0.6408 | 0.2414 | 0.8333 | 0.1364 | 0.0 | 0.6667 | 0.8467 | 0.0 | 0.375 | 0.0154 | 0.9615 | 1.0 |
0.954 | 0.8721 | 600 | 0.8858 | 0.7231 | 0.7138 | 0.7309 | 0.7231 | 0.7368 | 0.7795 | 0.8918 | 0.8699 | 0.6214 | 0.3448 | 0.8889 | 0.1818 | 1.0 | 0.5556 | 0.6899 | 0.0 | 0.25 | 0.0462 | 0.9615 | 1.0 |
0.6662 | 1.1628 | 800 | 0.9381 | 0.6881 | 0.7009 | 0.7618 | 0.6881 | 0.7895 | 0.6126 | 0.8454 | 0.8630 | 0.6505 | 0.4483 | 0.7222 | 0.2273 | 1.0 | 0.4444 | 0.8293 | 0.0 | 0.5417 | 0.2308 | 0.9615 | 1.0 |
0.5554 | 1.4535 | 1000 | 0.8791 | 0.7025 | 0.7124 | 0.7628 | 0.7025 | 0.7368 | 0.6478 | 0.9021 | 0.8562 | 0.6602 | 0.3103 | 0.7778 | 0.3636 | 0.5 | 0.5556 | 0.8084 | 0.0 | 0.5 | 0.1846 | 0.9615 | 1.0 |
0.4396 | 1.7442 | 1200 | 0.8275 | 0.7175 | 0.7280 | 0.7686 | 0.7175 | 0.7895 | 0.6631 | 0.8196 | 0.8836 | 0.6893 | 0.3793 | 0.8333 | 0.4091 | 0.5 | 0.5556 | 0.8362 | 0.0 | 0.4167 | 0.3692 | 0.9615 | 1.0 |
0.383 | 2.0349 | 1400 | 0.7929 | 0.745 | 0.7501 | 0.7653 | 0.745 | 0.6842 | 0.7841 | 0.8866 | 0.8767 | 0.7087 | 0.4483 | 0.7778 | 0.4091 | 0.5 | 0.5556 | 0.6899 | 0.0 | 0.4167 | 0.2923 | 0.9615 | 0.0 |
0.3418 | 2.3256 | 1600 | 0.8042 | 0.7438 | 0.7440 | 0.7686 | 0.7438 | 0.7895 | 0.7351 | 0.9072 | 0.8493 | 0.7864 | 0.4483 | 0.7778 | 0.3182 | 0.5 | 0.5556 | 0.7909 | 0.0 | 0.4167 | 0.1846 | 0.9615 | 0.0 |
0.248 | 2.6163 | 1800 | 0.8387 | 0.7275 | 0.7325 | 0.7610 | 0.7275 | 0.6842 | 0.6891 | 0.8814 | 0.8699 | 0.7573 | 0.4138 | 0.8333 | 0.4091 | 0.5 | 0.5556 | 0.8014 | 0.0 | 0.4167 | 0.2769 | 0.9615 | 0.0 |
0.2525 | 2.9070 | 2000 | 0.8137 | 0.735 | 0.7413 | 0.7697 | 0.735 | 0.6842 | 0.7106 | 0.8763 | 0.8699 | 0.6796 | 0.4483 | 0.7222 | 0.3636 | 0.5 | 0.5556 | 0.8153 | 0.0 | 0.4583 | 0.3385 | 0.9615 | 0.0 |
Framework versions
- Transformers 4.40.2
- PyTorch 2.2.1+cu121
- Datasets 2.19.1
- Tokenizers 0.19.1