File size: 4,409 Bytes
c50d2bd fd68a18 c50d2bd fd68a18 c50d2bd fd68a18 c50d2bd 7b8061a c50d2bd 7b8061a c50d2bd 7b8061a c50d2bd fd68a18 6e9a314 fd68a18 c50d2bd fd68a18 c50d2bd fd68a18 c50d2bd fd68a18 c50d2bd 30f6631 fd68a18 c50d2bd fd68a18 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 |
---
library_name: transformers
tags:
- text summarization
license: apache-2.0
language:
- en
metrics:
- rouge
pipeline_tag: text2text-generation
---
# Model Card for Post-Disaster Digital Help Desk Summarization Model
<!-- Provide a quick summary of what the model is/does. -->
This model is designed to summarize digital help desk conversations in post-disaster scenarios, specifically tailored for non-profit organizations providing aid. It is based on the BART model, fine-tuned using parameter-efficient methods like LoRa adapters.
## Model Details
### Model Description
This is a parameter efficient finetuned model based on the fine-tuning of the BART model. the methodology used is the LoRa adapter. this model focuses on automated text summarization of digital helpdesk conversations in post-disaster assistance scenarios in order to improve the efficiency and quality of the information gathered to provide timely and effective support to the affected people.
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
The model is designed to summarize digital help desk conversations for nonprofit organizations in post-disaster assistance scenarios, helping digital help desk staff to quickly extract key information and reduce the time it takes to manually write high-quality summaries.
## Bias, Risks, and Limitations
Generated summaries may contain certain errors, such as the inclusion of sensitive information, and require manual secondary correction to ensure accuracy and privacy protection.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
# install package
!pip install transformers[torch] -U
!pip install -q -U peft
import os
import torch
from google.colab import drive
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from huggingface_hub import notebook_login
# login to hugging_face
notebook_login() # use model on GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
# load base model
model_name = "knkarthick/MEETING_SUMMARY"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# load trained adapter
adapter_id = "Joaaaane/510_ABW_LoRaAdapter_PostDisasterConv"
model.load_adapter(adapter_id) # set the model to evaluation mode
model.eval()
input_text = """
PA: Hello, I need urgent housing help as a refugee from Ukraine. Can you assist?
agent: Hello, thank you for reaching out to the Red Cross. We’re here to help with housing.
agent: Have you registered with the local authorities yet?
PA: Yes, but they mentioned delays, and we need something soon. It's urgent.
agent: We have temporary shelters available. How many are with you, and are there any special needs?
PA: It's just me and my elderly mother; we need accessible housing.
agent: We can arrange for accessible temporary shelter. I’ll expedite your request and aim to place you within a few days.
agent: I'll also connect you with a Ukrainian-speaking volunteer to help with your paperwork and make your mother more comfortable.
PA: Thank you so much. This help means a lot to us right now.
agent: You're welcome! Expect a call from our volunteer by tomorrow. We’ll make sure you both are settled quickly.
PA: Thanks again. Looking forward to resolving this soon.
"""
# tokenized inputs
inputs = tokenizer(input_text, return_tensors="pt", max_length=1024, truncation=True).to(device)
# generate summary tokens
outputs = model.generate(inputs['input_ids'], max_length=62, num_beams=5, early_stopping=True)
# decode tokens
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Summary:", summary)
```
## Training Details
### Training Data
Data provided by 510, an initiative of the Netherlands Red Cross (all confidential data has been masked).
### Testing Data
Data provided by 510, an initiative of the Netherlands Red Cross (all confidential data has been masked).
### Metrics
ROUGE Score
### Results
| Metric | Before LoRA | After LoRA |
|--------------------|-------------|------------|
| **ROUGE 1** | 22.50 | 28.30 |
| **ROUGE 2** | 4.96 | 8.64 |
| **ROUGE L** | 17.24 | 22.50 |
## Citation
Base model: https://huggingface.co/knkarthick/MEETING_SUMMARY |