# Model Card for krishnagarg09/stance-detection-semeval2016

## Model Description

The model identifies the stance (AGAINST, NONE, or FAVOR) that the author of a text takes toward a given target.

Sample:

```
Input: Lord, You are my Hope! In You I will always trust.
Target: Atheism
Stance: AGAINST
```

The model is trained on the SemEval-2016 Task 6 stance detection dataset, which is available at https://huggingface.co/datasets/krishnagarg09/SemEval2016Task6.

See https://aclanthology.org/S16-1003/ for more details about the dataset.

- **Developed by:** Krishna Garg
- **Shared by:** Krishna Garg
- **Model type:** Language model
- **Language(s) (NLP):** en
- **License:** mit
- **Resources for more information:**
  - [Associated Paper](https://aclanthology.org/S16-1003/)

## Direct Use

```
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from datasets import load_dataset

# load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("krishnagarg09/stance-detection-semeval2016")
model = AutoModelForSequenceClassification.from_pretrained("krishnagarg09/stance-detection-semeval2016")
model.eval()  # disable dropout for inference

# load dataset
dataset = load_dataset("krishnagarg09/SemEval2016Task6")

# prepare input (all tweets of the test split, encoded as one batch)
text = list(dataset['test']['Tweet'])
encoded_input = tokenizer(text, return_tensors='pt', add_special_tokens=True, max_length=128, padding=True, truncation=True)

# forward pass without gradient tracking
with torch.no_grad():
    output = model(**encoded_input)

# index of the highest-scoring class for each tweet
predictions = output.logits.argmax(dim=-1)
```
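
The forward pass returns raw logits, and encoding the full test split as a single batch may use a lot of memory. The following is a minimal sketch of two helpers, `batched` and `logits_to_labels` (both hypothetical names, not part of `transformers`), that split the input into smaller chunks and map each row of logits to a label name. It assumes that `model.config.id2label` holds the label names, which may be generic (`LABEL_0`, ...) if the checkpoint does not define them.

```python
def batched(items, batch_size=32):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def logits_to_labels(logits, id2label):
    """Map each row of class scores to the label with the highest score."""
    labels = []
    for row in logits:
        best = max(range(len(row)), key=row.__getitem__)  # argmax over the row
        labels.append(id2label[best])
    return labels


# Possible use with the objects from the snippet above (not executed here):
#
# predictions = []
# for chunk in batched(text, batch_size=32):
#     enc = tokenizer(chunk, return_tensors='pt', max_length=128,
#                     padding=True, truncation=True)
#     with torch.no_grad():
#         logits = model(**enc).logits
#     predictions += logits_to_labels(logits.tolist(), model.config.id2label)
```

The chunk size of 32 mirrors the training batch size listed in this card, but any value that fits in memory should work.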

## Dataset

The dataset is available at https://huggingface.co/datasets/krishnagarg09/SemEval2016Task6 and can be loaded with the `datasets` library:

```
from datasets import load_dataset

dataset = load_dataset("krishnagarg09/SemEval2016Task6")
```

## Training Details

- **Optimizer:** Adam
- **Learning rate:** 2e-5
- **Loss:** cross-entropy
- **Epochs:** 5 (best weights selected on the validation set)
- **Batch size:** 32
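
The settings above, together with the "best weights on validation" rule, can be sketched as follows. This is an illustrative outline, not the author's training script: `fit_with_best_checkpoint`, `train_one_epoch`, and `evaluate` are hypothetical names, and it is assumed that a higher validation score is better.

```python
import copy

# Hyperparameters as listed in this card.
HPARAMS = {
    "optimizer": "Adam",
    "lr": 2e-5,
    "loss": "cross-entropy",
    "epochs": 5,
    "batch_size": 32,
}


def fit_with_best_checkpoint(state, train_one_epoch, evaluate, epochs=HPARAMS["epochs"]):
    """Train for `epochs` epochs and return the state with the best validation score.

    `train_one_epoch(state)` returns the updated state and `evaluate(state)`
    returns a validation score; both are placeholders supplied by the caller.
    """
    best_score = float("-inf")
    best_state = copy.deepcopy(state)
    for _ in range(epochs):
        state = train_one_epoch(state)
        score = evaluate(state)
        if score > best_score:  # keep a copy of the best weights seen so far
            best_score = score
            best_state = copy.deepcopy(state)
    return best_state, best_score
```

In a PyTorch setup, `state` would typically be `model.state_dict()`, the optimizer could be created with `torch.optim.Adam(model.parameters(), lr=2e-5)`, and the loss with `torch.nn.CrossEntropyLoss()`.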

### Preprocessing

Text is lowercased and `#semst` tags are removed. URLs, emojis, and reserved words (`p.OPT.URL`, `p.OPT.EMOJI`, `p.OPT.RESERVED`) are removed with the `tweet-preprocessor` package, and the text is normalized using `emnlp_dict.txt` and `noslang_data.json`.
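
A rough sketch of these steps in plain Python follows. The regular expressions are simplified stand-ins for what `tweet-preprocessor` removes with `p.OPT.URL`, `p.OPT.EMOJI`, and `p.OPT.RESERVED`; the two-entry `NORMALIZATION` dictionary is a hypothetical placeholder for the contents of `emnlp_dict.txt` and `noslang_data.json`; and the order of the steps is assumed from the description above.

```python
import re

# Hypothetical placeholder entries; the real normalization files are not reproduced here.
NORMALIZATION = {"u": "you", "plz": "please"}

# Simplified stand-ins for the tweet-preprocessor options named in this card.
URL_RE = re.compile(r"https?://\S+")
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
RESERVED_RE = re.compile(r"\b(?:rt|fav)\b")  # applied after lowercasing


def preprocess(tweet, normalization=NORMALIZATION):
    """Lowercase, strip #semst, remove URLs/emojis/reserved words, normalize tokens."""
    text = tweet.lower()
    text = re.sub(r"#semst\b", "", text)
    text = URL_RE.sub("", text)
    text = EMOJI_RE.sub("", text)
    text = RESERVED_RE.sub("", text)
    # Replace each whitespace-separated token that has a normalized form.
    tokens = [normalization.get(token, token) for token in text.split()]
    return " ".join(tokens)
```

With the actual package, the three regex steps would be replaced by `p.set_options(p.OPT.URL, p.OPT.EMOJI, p.OPT.RESERVED)` followed by `p.clean(text)`, after `import preprocessor as p`.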

## Evaluation

Evaluation for stance detection is computed over only two of the three labels, FAVOR and AGAINST.

```
Precision: 62.69
Recall: 69.43
F1: 65.56
```
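
One way to compute precision, recall, and F1 restricted to FAVOR and AGAINST is sketched below. Averaging the per-label scores (macro-averaging, as in the official SemEval-2016 Task 6 F1) is an assumption; the card does not state how the figures above were aggregated. The function names are hypothetical.

```python
def prf_for_label(gold, pred, label):
    """Precision, recall, and F1 for a single label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def stance_scores(gold, pred, labels=("FAVOR", "AGAINST")):
    """Average precision, recall, and F1 over the given labels only."""
    per_label = [prf_for_label(gold, pred, label) for label in labels]
    count = len(per_label)
    return tuple(sum(scores[i] for scores in per_label) / count for i in range(3))
```

NONE is left out of the average, but it still affects the result: a NONE tweet predicted as FAVOR counts as a false positive for FAVOR.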

## Hardware

Nvidia RTX A5000 24GB

## Model Card Contact

[email protected]