---
library_name: transformers
tags:
- text summarization
license: apache-2.0
language:
- en
metrics:
- rouge
pipeline_tag: text2text-generation
---

# Model Card for Post-Disaster Digital Help Desk Summarization Model

<!-- Provide a quick summary of what the model is/does. -->
This model summarizes digital help desk conversations in post-disaster scenarios, tailored to non-profit organizations providing aid. It is based on the BART model, fine-tuned with a parameter-efficient LoRA adapter.

## Model Details

### Model Description
This model is a parameter-efficient fine-tune of the BART model using a LoRA adapter. It focuses on automated summarization of digital help desk conversations in post-disaster assistance scenarios, improving the efficiency and quality of the information gathered so that affected people receive timely and effective support.
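
As a rough illustration of the methodology, the sketch below shows how a LoRA adapter can be attached to a BART-style seq2seq model with the `peft` library. The hyperparameters (`r`, `lora_alpha`, `lora_dropout`, `target_modules`) are illustrative assumptions, not the values used to train this adapter.

```python
# Minimal sketch of a LoRA setup with peft.
# NOTE: r, lora_alpha, lora_dropout, and target_modules are illustrative
# assumptions, not the hyperparameters used to train this adapter.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSeq2SeqLM.from_pretrained("knkarthick/MEETING_SUMMARY")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,      # seq2seq summarization
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor for the update
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # BART attention projections
)

# Only the small adapter matrices are trainable; BART's weights stay frozen.
peft_model = get_peft_model(base, lora_config)
peft_model.print_trainable_parameters()
```
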
## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
The model is designed to summarize digital help desk conversations for nonprofit organizations in post-disaster assistance scenarios, helping digital help desk staff to quickly extract key information and reduce the time it takes to manually write high-quality summaries.

## Bias, Risks, and Limitations
Generated summaries may contain errors, including the inadvertent inclusion of sensitive personal information, and require manual review and correction to ensure accuracy and privacy protection.
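
One way to support that manual review is a lightweight screening pass that flags summaries which appear to contain contact details. The sketch below is purely illustrative; the regex patterns and the `flag_sensitive` helper are assumptions, not part of this model or its pipeline.

```python
import re

# Hypothetical screening helper: flags summaries that look like they
# contain contact details so a human reviews them before release.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def flag_sensitive(summary: str) -> bool:
    """Return True if the summary should be routed to manual review."""
    return bool(EMAIL_RE.search(summary) or PHONE_RE.search(summary))

print(flag_sensitive("PA can be reached at +31 6 1234 5678."))  # True
```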

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# install packages (peft is needed so transformers can load the LoRA adapter)
!pip install transformers[torch] -U
!pip install -q -U peft

import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from huggingface_hub import notebook_login

# log in to the Hugging Face Hub
notebook_login()

# run the model on GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"

# load the base model and tokenizer
model_name = "knkarthick/MEETING_SUMMARY"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# load the trained LoRA adapter and switch to evaluation mode
adapter_id = "Joaaaane/510_ABW_LoRaAdapter_PostDisasterConv"
model.load_adapter(adapter_id)
model.eval()
input_text = """
PA: Hello, I need urgent housing help as a refugee from Ukraine. Can you assist?
agent: Hello, thank you for reaching out to the Red Cross. We’re here to help with housing.
agent: Have you registered with the local authorities yet?
PA: Yes, but they mentioned delays, and we need something soon. It's urgent.
agent: We have temporary shelters available. How many are with you, and are there any special needs?
PA: It's just me and my elderly mother; we need accessible housing.
agent: We can arrange for accessible temporary shelter. I’ll expedite your request and aim to place you within a few days.
agent: I'll also connect you with a Ukrainian-speaking volunteer to help with your paperwork and make your mother more comfortable.
PA: Thank you so much. This help means a lot to us right now.
agent: You're welcome! Expect a call from our volunteer by tomorrow. We’ll make sure you both are settled quickly.
PA: Thanks again. Looking forward to resolving this soon.
"""

# tokenize the conversation (truncated to the model's 1024-token limit)
inputs = tokenizer(input_text, return_tensors="pt", max_length=1024, truncation=True).to(device)

# generate summary tokens with beam search
outputs = model.generate(inputs['input_ids'], max_length=62, num_beams=5, early_stopping=True)

# decode the generated tokens back into text
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)

print("Generated Summary:", summary)
```

## Training Details

### Training Data
Data provided by 510, an initiative of the Netherlands Red Cross (all confidential data has been masked).

### Testing Data
Data provided by 510, an initiative of the Netherlands Red Cross (all confidential data has been masked).

### Metrics
ROUGE scores (ROUGE-1, ROUGE-2, and ROUGE-L) on the held-out test set.
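
For reference, ROUGE scores of this kind can be computed with the `evaluate` library, as sketched below. The prediction/reference pair here is a placeholder, since the actual test data is confidential.

```python
# Sketch of computing ROUGE with the `evaluate` library.
# Requires: pip install evaluate rouge_score
# The prediction/reference pair below is a placeholder, not real test data.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["The PA requests accessible temporary housing for two people."],
    references=["A refugee and her elderly mother need accessible shelter soon."],
)
print(scores)  # rouge1, rouge2, rougeL, rougeLsum F-measures
```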

### Results

| Metric      | Before LoRA | After LoRA |
|-------------|-------------|------------|
| **ROUGE-1** | 22.50       | 28.30      |
| **ROUGE-2** | 4.96        | 8.64       |
| **ROUGE-L** | 17.24       | 22.50      |

## Citation

Base model: [knkarthick/MEETING_SUMMARY](https://huggingface.co/knkarthick/MEETING_SUMMARY)