# Model Card for krishnagarg09/stance-detection-semeval2016

## Model Description
The goal is to identify the stance (AGAINST, NONE, FAVOR) of a user towards a given target.

Sample:

```
Input: Lord, You are my Hope! In You I will always trust.
Target: Atheism
Stance: AGAINST
```

The model is trained on the SemEval2016-Task6 stance detection dataset, which is available at https://huggingface.co/datasets/krishnagarg09/SemEval2016Task6.

See https://aclanthology.org/S16-1003/ for more details about the dataset.

- **Developed by:** Krishna Garg
- **Shared by:** Krishna Garg
- **Model type:** Language model
- **Language(s) (NLP):** en
- **License:** mit
- **Resources for more information:**
    - [Associated Paper](https://aclanthology.org/S16-1003/)

## Direct Use
```
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from datasets import load_dataset

# load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("krishnagarg09/stance-detection-semeval2016")
model = AutoModelForSequenceClassification.from_pretrained("krishnagarg09/stance-detection-semeval2016")

# load dataset
dataset = load_dataset("krishnagarg09/SemEval2016Task6")

# prepare input: a small batch of test tweets (encoding the whole split in one pass is memory-heavy)
text = dataset['test'][:8]['Tweet']
encoded_input = tokenizer(text, return_tensors='pt', add_special_tokens=True, max_length=128, padding=True, truncation=True)

# forward pass without gradient tracking (inference only)
with torch.no_grad():
    output = model(**encoded_input)
```
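The forward pass returns raw logits, one row per tweet. A minimal sketch of turning those into stance labels follows. The `ID2LABEL` order here is an assumption for illustration, not taken from the model's configuration, and `logits_to_stances` is a hypothetical helper name.

```python
import math

# Assumed label order, for illustration only; the real mapping should be
# read from model.config.id2label.
ID2LABEL = {0: "AGAINST", 1: "FAVOR", 2: "NONE"}


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_stances(logits, id2label=ID2LABEL):
    """Return a (label, probability) pair for each row of logits."""
    results = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((id2label[best], probs[best]))
    return results
```

With the snippet above, this could be called as `logits_to_stances(output.logits.tolist(), model.config.id2label)`, so that the label names come from the checkpoint rather than from the assumed mapping.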

## Dataset
The dataset is available at https://huggingface.co/datasets/krishnagarg09/SemEval2016Task6.
```
from datasets import load_dataset

dataset = load_dataset("krishnagarg09/SemEval2016Task6")
```

## Training Details
- **Optimizer:** Adam
- **Learning rate:** 2e-5
- **Loss:** cross-entropy
- **Epochs:** 5 (best weights selected on the validation set)
- **Batch size:** 32
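Selecting the best weights on validation can be sketched as a loop that keeps a copy of the state from the best-scoring epoch. This is a framework-agnostic illustration, not the author's training script: `fit_with_best_checkpoint`, `train_one_epoch`, and `evaluate` are hypothetical names, and the validation metric used for selection is not stated in this card.

```python
import copy

# Hyperparameters as listed in this card.
HPARAMS = {
    "optimizer": "Adam",
    "lr": 2e-5,
    "loss": "crossentropy",
    "epochs": 5,
    "batch_size": 32,
}


def fit_with_best_checkpoint(state, train_one_epoch, evaluate, epochs=HPARAMS["epochs"]):
    """Train for `epochs` epochs and return the state with the best validation score.

    train_one_epoch(state, epoch) -> new state   (hypothetical callable)
    evaluate(state) -> validation score, higher is better   (hypothetical callable)
    """
    best_score = float("-inf")
    best_state = copy.deepcopy(state)
    for epoch in range(epochs):
        state = train_one_epoch(state, epoch)
        score = evaluate(state)
        if score > best_score:
            # Snapshot the weights from the best epoch seen so far.
            best_score, best_state = score, copy.deepcopy(state)
    return best_state, best_score
```

In a real run, `state` would hold the model weights and `train_one_epoch` would iterate over batches of 32 with Adam at a learning rate of 2e-5.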

### Preprocessing
Text is lowercased and `#semst` tags are removed. URLs, emojis, and reserved words (`p.OPT.URL`, `p.OPT.EMOJI`, `p.OPT.RESERVED`) are removed using the `tweet-preprocessor` package, and the text is normalized using `emnlp_dict.txt` and `noslang_data.json`.
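The order of these steps can be illustrated with a standard-library sketch. It is an approximation only: the regular expressions stand in for the `tweet-preprocessor` package rather than reproducing its behavior, the `SLANG` entries are hypothetical placeholders for the contents of `emnlp_dict.txt` and `noslang_data.json`, and treating `rt` and `fav` as the reserved words is an assumption.

```python
import re

# Hypothetical stand-in for entries from emnlp_dict.txt / noslang_data.json.
SLANG = {"u": "you", "ur": "your", "plz": "please"}

# Rough regex approximations of the URL, emoji, and reserved-word options.
SEMST_RE = re.compile(r"#semst\b")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
RESERVED_RE = re.compile(r"\b(?:rt|fav)\b")  # applied after lowercasing


def preprocess(tweet, slang=SLANG):
    """Lowercase, strip #semst, URLs, emojis, and reserved words, then normalize tokens."""
    text = tweet.lower()
    text = SEMST_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    text = RESERVED_RE.sub(" ", text)
    # Token-level normalization via dictionary lookup; also collapses whitespace.
    tokens = [slang.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)
```

For example, `preprocess("RT Plz read this http://t.co/abc #SemST")` yields `"please read this"` under these assumptions.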

## Evaluation
Evaluation scores for stance detection are reported for only two of the three labels, FAVOR and AGAINST.

```
Precision: 62.69
Recall: 69.43
F1: 65.56
```
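A minimal sketch of scoring restricted to FAVOR and AGAINST is shown here. It assumes per-label precision, recall, and F1 are macro-averaged over the two labels, as in the SemEval-2016 Task 6 metric. This card does not state how the numbers above were averaged, so treat the averaging as an assumption. The function names are hypothetical.

```python
def prf_for_label(gold, pred, label):
    """Precision, recall, and F1 for a single label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def stance_scores(gold, pred, labels=("FAVOR", "AGAINST")):
    """Macro-average (precision, recall, F1) over the given labels only.

    NONE examples stay in the data, so a NONE tweet predicted as FAVOR
    still counts as a false positive for FAVOR.
    """
    per_label = [prf_for_label(gold, pred, label) for label in labels]
    n = len(per_label)
    return tuple(sum(scores[i] for scores in per_label) / n for i in range(3))
```

Here `gold` and `pred` are equal-length lists of label strings, for instance the dataset's gold stances and the labels derived from the model's logits.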

## Hardware
Nvidia RTX A5000 24GB

## Model Card Contact
[email protected]